How to Parse Tables in OneNote Files Using Python
This guide shows how to extract and process table data from Microsoft OneNote .one files in Python using Document.GetChildNodes() to walk the Table → TableRow → TableCell hierarchy. Aspose.Note FOSS for Python provides programmatic access to all cell content, column metadata, and table tags.
Benefits
- Structured access: row and column counts, individual cell content, column widths
- No spreadsheet app required: extract table data from OneNote on any platform
- Free and open-source: MIT license, no API key
Step-by-Step Guide
Step 1: Install Aspose.Note FOSS for Python
Install the Aspose.Note FOSS package from PyPI using the pip command below:
pip install aspose-note
Step 2: Load the .one File
Instantiate Document with the path to a .one file to load its full content tree into memory:
from aspose.note import Document
doc = Document("MyNotes.one")
print(f"Pages: {len(list(doc))}")
Step 3: Find All Tables
Use GetChildNodes(Table) to retrieve every table from the entire document recursively:
from aspose.note import Document, Table
doc = Document("MyNotes.one")
tables = doc.GetChildNodes(Table)
print(f"Found {len(tables)} table(s)")
Step 4: Read Row and Cell Values
Iterate TableRow and TableCell nodes. Each cell contains RichText nodes whose .Text property gives the plain-text content:
from aspose.note import Document, Table, TableRow, TableCell, RichText
doc = Document("MyNotes.one")
for t, table in enumerate(doc.GetChildNodes(Table), start=1):
    print(f"\nTable {t}: {len(table.Columns)} column(s)")
    for r, row in enumerate(table.GetChildNodes(TableRow), start=1):
        cell_values = []
        for cell in row.GetChildNodes(TableCell):
            text = " ".join(rt.Text for rt in cell.GetChildNodes(RichText)).strip()
            cell_values.append(text)
        print(f" Row {r}: {cell_values}")
Step 5: Read Column Widths
Read each column's width from the table.Columns collection and the border visibility from table.IsBordersVisible:
from aspose.note import Document, Table
doc = Document("MyNotes.one")
for i, table in enumerate(doc.GetChildNodes(Table), start=1):
    print(f"Table {i} column widths (pts): {[col.Width for col in table.Columns]}")
    print(f"Borders visible: {table.IsBordersVisible}")
Step 6: Export to CSV
Export all table data to a CSV file by writing each row's cell text with csv.writer from the standard csv module:
import csv, io
from aspose.note import Document, Table, TableRow, TableCell, RichText
doc = Document("MyNotes.one")
buf = io.StringIO()
writer = csv.writer(buf)
for table in doc.GetChildNodes(Table):
    for row in table.GetChildNodes(TableRow):
        values = [
            " ".join(rt.Text for rt in cell.GetChildNodes(RichText)).strip()
            for cell in row.GetChildNodes(TableCell)
        ]
        writer.writerow(values)
    writer.writerow([])  # blank row between tables
with open("tables.csv", "w", encoding="utf-8", newline="") as f:
    f.write(buf.getvalue())
print("Saved tables.csv")
Common Issues and Fixes
Tables appear empty
Cause: The cells contain Image nodes rather than RichText nodes.
Check: Use the following code to count RichText and Image nodes per cell and diagnose why table cells appear empty:
from aspose.note import Document, Table, TableRow, TableCell, RichText, Image
doc = Document("MyNotes.one")
for table in doc.GetChildNodes(Table):
    for row in table.GetChildNodes(TableRow):
        for cell in row.GetChildNodes(TableCell):
            texts = cell.GetChildNodes(RichText)
            images = cell.GetChildNodes(Image)
            print(f" Cell: {len(texts)} text(s), {len(images)} image(s)")
Column count doesn’t match Columns
table.Columns reflects the column metadata stored in the file. The actual number of cells per row may differ if rows have merged cells (the file format stores this at the binary level; the public API does not expose a merge flag).
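If downstream code expects rectangular data, one option is to pad short rows up to the column count reported by table.Columns and note which rows were irregular. The sketch below works on plain lists of strings, such as the cell values collected in Step 4. The name pad_rows and its parameters are hypothetical and not part of the Aspose.Note API, and padding on the right is an assumption, since the position of a merged cell is not known.

```python
def pad_rows(rows, column_count, fill=""):
    """Pad rows of cell text to a common width.

    rows: list of lists of strings (one inner list per table row).
    column_count: expected width, for example len(table.Columns).
    Returns (padded_rows, irregular), where irregular holds the 0-based
    indexes of rows whose length differed from column_count.
    """
    padded = []
    irregular = []
    for index, row in enumerate(rows):
        if len(row) != column_count:
            irregular.append(index)
        # Pad on the right (an assumption); longer rows are kept unchanged.
        padded.append(list(row) + [fill] * max(0, column_count - len(row)))
    return padded, irregular


# Hypothetical sample data standing in for extracted cell values.
rows = [["Name", "Qty", "Price"], ["Apples", "3"], ["Pears", "2", "0.40"]]
padded, irregular = pad_rows(rows, column_count=3)
print(padded)
print(irregular)
```

Rows listed in irregular are worth checking by hand against the original notebook, because the padding only makes the data rectangular and does not recover which cells were merged.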
ImportError: No module named 'aspose'
pip install aspose-note
pip show aspose-note # confirm it is installed in the active environment
Frequently Asked Questions
Can I edit table data and save it back? No. Writing back to .one format is not supported. Changes made in-memory (e.g. via RichText.Replace()) cannot be persisted to the source file.
Are merged cells detected? The CompositeNode API does not expose merge metadata. Each TableCell is treated as a separate cell regardless of visual merging.
Can I count how many rows a table has? Yes: len(table.GetChildNodes(TableRow)).
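Since edits cannot be written back to the .one file, a common workaround is to transform the extracted values in plain Python and store the result in a separate format. The following is a minimal sketch under the assumption that the first row of a table holds column headers. The function rows_to_records and the sample data are hypothetical, and the input is a list of rows of cell text like the one built in Step 4.

```python
import json


def rows_to_records(rows):
    """Convert rows of cell text into dicts keyed by the first row.

    Assumes the first row holds unique column headers. Missing cells
    become empty strings; cells beyond the header width are dropped.
    """
    if not rows:
        return []
    header, *body = rows
    records = []
    for row in body:
        padded = list(row) + [""] * (len(header) - len(row))
        records.append(dict(zip(header, padded)))
    return records


# Hypothetical sample data standing in for extracted cell values.
rows = [["Name", "Qty"], ["Apples", "3"], ["Pears"]]
records = rows_to_records(rows)
# Serialize to JSON text; the source .one file is never modified.
print(json.dumps(records, ensure_ascii=False, indent=2))
```

The JSON text can then be written to any new file. If a table has duplicate header names, later columns overwrite earlier ones in each record, so such tables are better kept as lists of rows.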