How to Work with Core Document Management in Python
This guide shows how to load Word documents and convert them to PDF, Markdown, and plain text using Aspose.Words FOSS for Python.
Prerequisites
Install the library:
pip install "aspose-words-foss>=26.4.0"
Requires Python 3.10 or later.
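The Python version requirement can be checked from a script before installing or importing the library. The following is a minimal sketch using only the standard library; `python_is_supported` and `MIN_PYTHON` are hypothetical names introduced here for illustration and are not part of Aspose.Words FOSS.

```python
import sys

# Minimum interpreter version stated in this guide.
MIN_PYTHON = (3, 10)


def python_is_supported(version_info=sys.version_info):
    """Return True if the (major, minor) version meets the stated minimum."""
    return tuple(version_info[:2]) >= MIN_PYTHON


if not python_is_supported():
    print("Python 3.10 or later is required; found", sys.version.split()[0])
```

The function takes the version tuple as a parameter so that it can be exercised with arbitrary versions, not only the running interpreter.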
Load a Document
Create a Document object by passing a file path. Supported input formats include DOCX, DOC, RTF, TXT, and Markdown.
import aspose.words_foss as aw
doc = aw.Document("input.docx")
Convert to PDF
Call save() with SaveFormat.PDF:
import aspose.words_foss as aw
doc = aw.Document("input.docx")
doc.save("output.pdf", aw.SaveFormat.PDF)
Convert to Markdown
Use SaveFormat.MARKDOWN to export to Markdown:
import aspose.words_foss as aw
doc = aw.Document("input.docx") # or .doc, .rtf, .txt, .md
doc.save("output.md", aw.SaveFormat.MARKDOWN)
Extract Text
Use Document.get_text() to extract all text content:
import aspose.words_foss as aw
doc = aw.Document("input.docx")
text = doc.get_text()
Summary
| Task | Method |
|---|---|
| Load a document | Document("path") |
| Export to PDF | doc.save("out.pdf", SaveFormat.PDF) |
| Export to Markdown | doc.save("out.md", SaveFormat.MARKDOWN) |
| Export to plain text | doc.save("out.txt", SaveFormat.TEXT) |
| Extract text | doc.get_text() |
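
The export rows in the table can be combined into one helper that picks the save format from the output file extension. This is a sketch under stated assumptions, not a definitive implementation: `FORMAT_BY_EXTENSION`, `save_format_name`, and `convert` are hypothetical names introduced here, and the `SaveFormat` members used are only the three that appear in the table above.

```python
from pathlib import Path

# Output extension -> SaveFormat member name, taken from the summary table.
FORMAT_BY_EXTENSION = {
    ".pdf": "PDF",
    ".md": "MARKDOWN",
    ".txt": "TEXT",
}


def save_format_name(output_path):
    """Return the SaveFormat member name implied by the output extension."""
    suffix = Path(output_path).suffix.lower()
    try:
        return FORMAT_BY_EXTENSION[suffix]
    except KeyError:
        raise ValueError(f"Unsupported output extension: {suffix!r}") from None


def convert(input_path, output_path):
    """Load input_path and save it in the format implied by output_path."""
    # Imported lazily so the mapping above works without the library installed.
    import aspose.words_foss as aw

    doc = aw.Document(str(input_path))
    save_format = getattr(aw.SaveFormat, save_format_name(output_path))
    doc.save(str(output_path), save_format)
```

With the library installed, `convert("input.docx", "output.pdf")` would load the document and write a PDF, and an output path with an unlisted extension raises `ValueError` before any file is opened.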