How to Build a Light Document Model in Python

The light document model (LDM) is Aspose.Words FOSS’s Python-native representation of a Word document. This guide shows how to load a DOCX file and convert it to an LDM using DocumentReader and its to_light_document() method from LdmBuilderMixin.


Prerequisites

Requirement    Detail
Python         3.9 or later
Package        aspose-words-foss (MIT-licensed)
Input          A .docx file

Install the package with pip:

pip install aspose-words-foss

Step 1 — Import DocumentReader

from aspose.words_foss import DocumentReader

DocumentReader inherits from LdmBuilderMixin, which provides the to_light_document() method.


Step 2 — Load the DOCX File

Use load_file() for a file path, load_stream() for a file-like object, or load_bytes() for raw bytes:

reader = DocumentReader()
reader.load_file("data/my_document.docx")
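The three loaders named in this step can be wrapped in one small dispatcher. The following is a minimal sketch, not part of the library: load_any is a hypothetical helper, and it assumes that load_bytes() accepts a bytes object and load_stream() accepts a binary file-like object, as the description suggests. The exact signatures are not confirmed here.

```python
import os


def load_any(reader, source):
    """Feed `source` to the matching loader on `reader` (hypothetical helper).

    Assumes the reader exposes load_file(), load_stream() and load_bytes()
    as described in this step; the accepted argument types are an assumption.
    """
    if isinstance(source, (str, os.PathLike)):
        # A path on disk: pass it on as a plain string.
        reader.load_file(os.fspath(source))
    elif isinstance(source, (bytes, bytearray)):
        # Raw DOCX bytes, for example read from a database or a network response.
        reader.load_bytes(bytes(source))
    elif hasattr(source, "read"):
        # Any file-like object opened in binary mode.
        reader.load_stream(source)
    else:
        raise TypeError(f"Unsupported source type: {type(source).__name__}")
    return reader
```

Called as load_any(DocumentReader(), "data/my_document.docx"), it would behave like the load_file() call in this step; passing an open binary file or an io.BytesIO object would route through load_stream() instead.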

Step 3 — Build the LDM

Call to_light_document() to convert the loaded DOCX into an LDM Document:

doc = reader.to_light_document()
print("Page count:", doc.page_count)
print("Sections:", len(doc.sections))
print("Paragraphs:", len(doc.all_paragraphs))
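When many files are processed, it can help to gather these counts in one place rather than printing them individually. The sketch below uses only the attributes shown in this step (page_count, sections, all_paragraphs); summarize and format_summary are hypothetical helper names, not library functions.

```python
def summarize(doc):
    """Collect the basic counts of an LDM Document into a dict.

    Relies only on the attributes used in this step:
    page_count, sections and all_paragraphs.
    """
    return {
        "pages": doc.page_count,
        "sections": len(doc.sections),
        "paragraphs": len(doc.all_paragraphs),
    }


def format_summary(stats):
    """Render the dict from summarize() as one log-friendly line."""
    return ", ".join(f"{name}={value}" for name, value in stats.items())
```

For the doc built in this step, print(format_summary(summarize(doc))) would emit the three counts on a single line.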

Step 4 — Inspect the Document

The LDM Document exposes the full document tree. Common entry points:

# Read full text
print(doc.text[:300])

# Iterate sections
for section in doc.sections:
    print("Section paragraphs:", len(section.paragraphs))

# Find headings (up to H2)
for heading in doc.headings(max_level=2):
    print("Heading:", heading.text)
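The entry points above combine naturally into a quick outline of a document. The following sketch is an illustration only: heading_titles and preview are hypothetical helpers, and they assume that doc.text and heading.text are plain Python strings, which the snippets above suggest but do not state.

```python
def heading_titles(doc, max_level=2):
    """Return the stripped text of each heading up to `max_level`, skipping blanks."""
    titles = []
    for heading in doc.headings(max_level=max_level):
        text = heading.text.strip()
        if text:
            titles.append(text)
    return titles


def preview(doc, limit=300):
    """Return the first `limit` characters of the document text on a single line."""
    # Slice first, then collapse newlines and repeated spaces into single spaces.
    return " ".join(doc.text[:limit].split())
```

Slicing before collapsing whitespace keeps the preview cheap on large documents, at the cost of a result that may be slightly shorter than limit.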

Step 5 — Write Back to DOCX (Optional)

After inspecting or modifying the LDM, use LdmDocxWriter to produce a DOCX file:

from aspose.words_foss import LdmDocxWriter

writer = LdmDocxWriter()
writer.write(doc, "output/result.docx")
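The example path points into an output/ folder, and this guide does not say whether the writer creates missing folders. A cautious sketch, assuming only the write(doc, path) call shown above, creates the folder first; write_ldm is a hypothetical helper name.

```python
from pathlib import Path


def write_ldm(writer, doc, out_path):
    """Create the target folder if needed, then write the LDM to `out_path`.

    `writer` is expected to offer write(doc, path) as shown in this step.
    """
    target = Path(out_path)
    # The writer is not assumed to create missing folders itself.
    target.parent.mkdir(parents=True, exist_ok=True)
    writer.write(doc, str(target))
    return target
```

With the objects from this step, write_ldm(LdmDocxWriter(), doc, "output/result.docx") would replace the direct writer.write() call.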
