How to Build a Light Document Model in Python
The light document model (LDM) is Aspose.Words FOSS’s Python-native representation
of a Word document. This guide shows how to load a DOCX file and convert it to an
LDM using DocumentReader and its to_light_document() method from
LdmBuilderMixin.
Prerequisites
| Requirement | Detail |
|---|---|
| Python | 3.9 or later |
| Package | aspose-words-foss (MIT-licensed) |
| Input | A .docx file |
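
Before installing, it can help to confirm what the current environment already provides. The sketch below checks the two rows of the table that can be verified from code: the interpreter version and whether a distribution named aspose-words-foss is installed. The helper name `check_prerequisites` and its parameters are illustrative and not part of the package; the code uses only the standard library.

```python
import sys
from importlib import metadata

MIN_PYTHON = (3, 9)                 # minimum version from the table above
DISTRIBUTION = "aspose-words-foss"  # distribution name from the table above


def check_prerequisites(min_python=MIN_PYTHON, distribution=DISTRIBUTION):
    """Return a list of human-readable problems (hypothetical helper).

    An empty list means the interpreter is new enough and the
    distribution is installed in the current environment.
    """
    problems = []

    current = tuple(sys.version_info[:2])
    if current < tuple(min_python):
        problems.append(
            f"Python {min_python[0]}.{min_python[1]} or later is required, "
            f"found {current[0]}.{current[1]}"
        )

    try:
        # Raises PackageNotFoundError when the distribution is absent.
        metadata.version(distribution)
    except metadata.PackageNotFoundError:
        problems.append(f"{distribution} is not installed")

    return problems


for problem in check_prerequisites():
    print("Missing prerequisite:", problem)
```

An empty result means both checks passed; otherwise each entry names one prerequisite that still needs attention.
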

```bash
pip install aspose-words-foss
```

Step 1 — Import DocumentReader
```python
from aspose.words_foss import DocumentReader
```

DocumentReader inherits from LdmBuilderMixin, which provides the
to_light_document() method.
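
To illustrate how a mixin supplies a method to the class that inherits from it, here is a minimal, self-contained sketch of the pattern. `SketchBuilderMixin` and `SketchReader` are hypothetical stand-ins and do not reflect how DocumentReader or LdmBuilderMixin are implemented internally; only the idea that the conversion method lives on the mixin is taken from the description above.

```python
class SketchBuilderMixin:
    """Hypothetical mixin: contributes a conversion method to a host class.

    It expects the host class to provide a `_raw_paragraphs` attribute.
    """

    def to_light_document(self):
        # Named after the real method for readability only; this stand-in
        # just wraps the host's paragraphs in a plain dictionary.
        return {"paragraphs": list(self._raw_paragraphs)}


class SketchReader(SketchBuilderMixin):
    """Hypothetical reader: owns loading, inherits conversion from the mixin."""

    def __init__(self):
        self._raw_paragraphs = []

    def load_lines(self, lines):
        # Keep non-empty lines, stripped of surrounding whitespace.
        self._raw_paragraphs = [line.strip() for line in lines if line.strip()]


reader = SketchReader()
reader.load_lines(["First paragraph\n", "\n", "Second paragraph\n"])

doc = reader.to_light_document()  # method resolved on the mixin
print(doc["paragraphs"])          # ['First paragraph', 'Second paragraph']

# The method resolution order shows where the method comes from.
print([cls.__name__ for cls in SketchReader.__mro__])
# ['SketchReader', 'SketchBuilderMixin', 'object']
```

The reader class never defines the conversion method itself, which is why importing the reader alone is enough to call it.
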
Step 2 — Load the DOCX File
Use load_file() for a file path, load_stream() for a file-like object, or
load_bytes() for raw bytes:
```python
reader = DocumentReader()
reader.load_file("data/my_document.docx")
```

Step 3 — Build the LDM
Call to_light_document() to convert the loaded DOCX into an LDM Document:
```python
doc = reader.to_light_document()
print("Page count:", doc.page_count)
print("Sections:", len(doc.sections))
print("Paragraphs:", len(doc.all_paragraphs))
```

Step 4 — Inspect the Document
The LDM Document exposes the full document tree. Common entry points:
```python
# Read full text
print(doc.text[:300])

# Iterate sections
for section in doc.sections:
    print("Section paragraphs:", len(section.paragraphs))

# Find headings (up to H2)
for heading in doc.headings(max_level=2):
    print("Heading:", heading.text)
```

Step 5 — Write Back to DOCX (Optional)
After inspecting or modifying the LDM, use LdmDocxWriter to produce a DOCX file:
```python
from aspose.words_foss import LdmDocxWriter

writer = LdmDocxWriter()
writer.write(doc, "output/result.docx")
```

Next Steps
- Writing DOCX Files with LdmDocxWriter — full write API
- Working with Document Sections — navigate sections and paragraphs