How to Parse Tables in OneNote Files Using Python

Microsoft OneNote lets users embed structured data tables directly in pages. Aspose.Note FOSS for Python exposes every table through a Table → TableRow → TableCell hierarchy, giving you programmatic access to all cell content, column metadata, and table tags.

Benefits

  1. Structured access — row and column counts, individual cell content, column widths
  2. No spreadsheet app required — extract table data from OneNote on any platform
  3. Free and open-source — MIT license, no API key

Step-by-Step Guide
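
The Table → TableRow → TableCell traversal described in the introduction can be sketched as follows. This is a minimal illustration, not the library's documented recipe: the helper names table_to_rows and dump_tables are hypothetical, and the sketch assumes each RichText node exposes its plain text through a .Text attribute. The node classes are passed in as arguments so the helper itself has no hard dependency on the package.

```python
from typing import List


def table_to_rows(table, row_type, cell_type, text_type) -> List[List[str]]:
    """Flatten one table into a list of rows, each a list of cell strings.

    With the library, row_type, cell_type and text_type would be
    TableRow, TableCell and RichText.
    """
    rows = []
    for row in table.GetChildNodes(row_type):
        cells = []
        for cell in row.GetChildNodes(cell_type):
            # Assumption: a RichText node exposes its plain text as `.Text`.
            parts = [getattr(node, "Text", "") or ""
                     for node in cell.GetChildNodes(text_type)]
            # Join the non-empty text fragments of the cell with spaces.
            cells.append(" ".join(p.strip() for p in parts if p.strip()))
        rows.append(cells)
    return rows


def dump_tables(path: str) -> None:
    """Print every table in a .one file (requires aspose-note installed)."""
    from aspose.note import Document, Table, TableRow, TableCell, RichText

    doc = Document(path)
    for index, table in enumerate(doc.GetChildNodes(Table), start=1):
        print(f"Table {index}:")
        for cells in table_to_rows(table, TableRow, TableCell, RichText):
            print("  " + " | ".join(cells))
```

Calling dump_tables("MyNotes.one") would print one line per row. Cells that hold only images come back as empty strings, which is the situation covered under "Tables appear empty" below.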


Common Issues and Fixes

Tables appear empty

Cause: The cells contain Image nodes rather than RichText nodes.

Check:

from aspose.note import Document, Table, TableRow, TableCell, RichText, Image

doc = Document("MyNotes.one")
for table in doc.GetChildNodes(Table):
    for row in table.GetChildNodes(TableRow):
        for cell in row.GetChildNodes(TableCell):
            # Count text and image nodes separately: a cell with 0 texts
            # but 1 or more images holds a picture, not an empty value.
            texts = cell.GetChildNodes(RichText)
            images = cell.GetChildNodes(Image)
            print(f"  Cell: {len(texts)} text(s), {len(images)} image(s)")

Column count doesn’t match ColumnWidths

ColumnWidths reflects the metadata stored in the file. The actual number of cells per row may differ if rows have merged cells (the file format stores this at the binary level; the public API does not expose a merge flag).
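
Because no merge flag is available, one way to spot affected rows is to compare each row's cell count with the number of ColumnWidths entries. The sketch below is only an illustration: the function name is hypothetical, and it assumes ColumnWidths is a sized sequence available on the Table node. A mismatch suggests merged cells but does not prove them.

```python
def rows_with_unexpected_cell_count(table, row_type, cell_type):
    """Return (row_index, cell_count) for each row whose cell count
    differs from the number of entries in table.ColumnWidths."""
    # Assumption: ColumnWidths is a sized sequence on the Table node.
    expected = len(table.ColumnWidths)
    mismatches = []
    for index, row in enumerate(table.GetChildNodes(row_type)):
        count = len(row.GetChildNodes(cell_type))
        if count != expected:
            mismatches.append((index, count))
    return mismatches
```

With the library, the call would be rows_with_unexpected_cell_count(table, TableRow, TableCell); an empty list means every row has as many cells as the column metadata declares.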

ModuleNotFoundError: No module named 'aspose'

pip install aspose-note
pip show aspose-note  # confirm it is installed in the active environment
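
If the package is installed but the error persists, the script may be running under a different interpreter than the one pip installed into. The following sketch uses only the standard library to show which interpreter is active and whether a module can be found from it; the helper names module_available and report are hypothetical.

```python
import importlib.util
import sys


def module_available(name: str) -> bool:
    """Return True if `name` can be located by the current interpreter."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec raises when a parent package of a dotted name is
        # missing, or when a loaded module has no usable spec.
        return False


def report(name: str = "aspose.note") -> None:
    """Print the active interpreter and whether `name` is importable."""
    print("Interpreter:", sys.executable)
    print(f"{name} importable:", module_available(name))
```

Calling report() from the same environment that runs the failing script shows whether the interpreter path matches the one pip reports.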

Frequently Asked Questions

Can I edit table data and save it back? No. Writing back to .one format is not supported. Changes made in-memory (e.g. via RichText.Replace()) cannot be persisted to the source file.

Are merged cells detected? No. The CompositeNode API does not expose merge metadata, so each TableCell is treated as a separate cell regardless of visual merging.

Can I count how many rows a table has? Yes: len(table.GetChildNodes(TableRow)).

