How to Parse Tables in OneNote Files Using Python

Microsoft OneNote lets users embed structured data tables directly in pages. Aspose.Note FOSS for Python exposes every table through a Table → TableRow → TableCell hierarchy, giving you programmatic access to all cell content, column metadata, and table tags.

Benefits

  1. Structured access: row and column counts, individual cell content, column widths
  2. No spreadsheet app required: extract table data from OneNote on any platform
  3. Free and open-source: MIT license, no API key

Step-by-Step Guide

Step 1: Install Aspose.Note FOSS for Python

pip install aspose-note

Step 2: Load the .one File

from aspose.note import Document

doc = Document("MyNotes.one")
print(f"Pages: {doc.Count()}")

Step 3: Find All Tables

Use GetChildNodes(Table) to retrieve every table from the entire document recursively:

from aspose.note import Document, Table

doc = Document("MyNotes.one")
tables = doc.GetChildNodes(Table)
print(f"Found {len(tables)} table(s)")

Step 4: Read Row and Cell Values

Iterate TableRow and TableCell nodes. Each cell contains RichText nodes whose .Text property gives the plain-text content:

from aspose.note import Document, Table, TableRow, TableCell, RichText

doc = Document("MyNotes.one")

for t, table in enumerate(doc.GetChildNodes(Table), start=1):
    print(f"\nTable {t}: {len(table.ColumnWidths)} column(s)")
    for r, row in enumerate(table.GetChildNodes(TableRow), start=1):
        cell_values = []
        for cell in row.GetChildNodes(TableCell):
            text = " ".join(rt.Text for rt in cell.GetChildNodes(RichText)).strip()
            cell_values.append(text)
        print(f"  Row {r}: {cell_values}")
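Once each row is a plain list of strings, as cell_values is in the loop above, the first row can often serve as a header. The following is a minimal sketch under that assumption; rows_to_records is a hypothetical helper written for illustration, not part of Aspose.Note, and it works on ordinary Python lists so it can be tried without a .one file:

```python
def rows_to_records(rows):
    """Turn a list of rows (lists of strings) into dicts keyed by the first row."""
    if not rows:
        return []
    header, *body = rows
    records = []
    for row in body:
        # Pad short rows so every header gets a value; zip() drops extra cells.
        padded = row + [""] * (len(header) - len(row))
        records.append(dict(zip(header, padded)))
    return records


# Illustrative data only; in practice, collect cell_values for each row of one table.
sample = [["Name", "Qty"], ["Apples", "3"], ["Pears"]]
print(rows_to_records(sample))
```

If a table has no header row, keeping the raw list-of-lists form is probably the safer choice.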

Step 5: Read Column Widths

from aspose.note import Document, Table

doc = Document("MyNotes.one")
for i, table in enumerate(doc.GetChildNodes(Table), start=1):
    print(f"Table {i} column widths (pts): {table.ColumnWidths}")
    print(f"Borders visible: {table.BordersVisible}")
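Absolute widths are often less useful than proportions, for example when rebuilding the table in HTML. The sketch below assumes ColumnWidths behaves like a sequence of numbers in points, as the output above suggests; width_fractions is a hypothetical helper and takes a plain list so it runs on its own:

```python
def width_fractions(widths):
    """Return each column's share of the total table width (0.0 to 1.0)."""
    total = sum(widths)
    if total == 0:
        # Avoid dividing by zero when no width information is available.
        return [0.0 for _ in widths]
    return [w / total for w in widths]


# Illustrative values; in practice pass list(table.ColumnWidths).
print(width_fractions([72.0, 144.0, 72.0]))
```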

Step 6: Export to CSV

import csv
import io
from aspose.note import Document, Table, TableRow, TableCell, RichText

doc = Document("MyNotes.one")
buf = io.StringIO()
writer = csv.writer(buf)

for table in doc.GetChildNodes(Table):
    for row in table.GetChildNodes(TableRow):
        values = [
            " ".join(rt.Text for rt in cell.GetChildNodes(RichText)).strip()
            for cell in row.GetChildNodes(TableCell)
        ]
        writer.writerow(values)
    writer.writerow([])   # blank row between tables

with open("tables.csv", "w", encoding="utf-8", newline="") as f:
    f.write(buf.getvalue())
print("Saved tables.csv")
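Cell text taken from OneNote may itself contain commas or quotation marks. Python's csv module quotes such values automatically, which is why the export above uses csv.writer rather than joining strings by hand. A small sketch with made-up data shows the effect; rows_to_csv_text is a hypothetical helper name:

```python
import csv
import io


def rows_to_csv_text(rows):
    """Serialize rows (lists of strings) to CSV text held in memory."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerows(rows)
    return buf.getvalue()


# Illustrative rows; the second row contains a comma and embedded quotes.
text = rows_to_csv_text([["Item", "Note"], ["Bolt, M6", 'He said "ok"']])
print(text)
```

Reading the text back with csv.reader should return the original values unchanged.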

Common Issues and Fixes

Tables appear empty

Likely cause: The cells contain Image nodes rather than RichText nodes, so joining the .Text values produces empty strings.

Check:

from aspose.note import Document, Table, TableRow, TableCell, RichText, Image

doc = Document("MyNotes.one")
for table in doc.GetChildNodes(Table):
    for row in table.GetChildNodes(TableRow):
        for cell in row.GetChildNodes(TableCell):
            texts = cell.GetChildNodes(RichText)
            images = cell.GetChildNodes(Image)
            print(f"  Cell: {len(texts)} text(s), {len(images)} image(s)")

Column count doesn’t match ColumnWidths

ColumnWidths reflects the column metadata stored in the file. The number of TableCell nodes in a given row may differ from it when cells are merged: the file format records merges at the binary level, but the public API does not expose a merge flag.
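One way to spot affected rows is to compare each row's cell count with the expected column count. The sketch below assumes the expected count comes from len(table.ColumnWidths) and that rows have already been extracted into lists of strings; find_ragged_rows is a hypothetical helper, and a mismatch only hints at merged cells rather than proving it:

```python
def find_ragged_rows(rows, expected_columns):
    """Return (row_number, cell_count) for rows whose cell count differs."""
    return [
        (number, len(row))
        for number, row in enumerate(rows, start=1)
        if len(row) != expected_columns
    ]


# Illustrative data: the second row has two cells in a three-column table.
rows = [["A", "B", "C"], ["merged", "C2"], ["x", "y", "z"]]
print(find_ragged_rows(rows, expected_columns=3))
```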

ImportError: No module named 'aspose'

pip install aspose-note
pip show aspose-note  # confirm it is installed in the active environment
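If the package shows as installed but the import still fails, the script may be running under a different interpreter than the one pip installed into. A minimal check using only the standard library; is_importable is a hypothetical helper name:

```python
import importlib.util
import sys


def is_importable(module_name):
    """Return True if this interpreter can locate the top-level module."""
    return importlib.util.find_spec(module_name) is not None


# Shows which Python is running and whether it can see the package.
print(f"Interpreter: {sys.executable}")
print(f"aspose importable: {is_importable('aspose')}")
```

Running python -m pip install aspose-note with that same interpreter ensures the package lands in the environment the script actually uses.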

Frequently Asked Questions

Can I edit table data and save it back?

No. Writing back to .one format is not supported. Changes made in memory (e.g. via RichText.Replace()) cannot be persisted to the source file.

Are merged cells detected?

No. The CompositeNode API does not expose merge metadata. Each TableCell is treated as a separate cell regardless of visual merging.

Can I count how many rows a table has?

Yes: len(table.GetChildNodes(TableRow)).

