Features
Aspose.Words FOSS for Python is a document conversion and text extraction library for Word documents. The entire public API is built around the Document class.
Supported Input Formats
The Document class loads files in the following formats and detects the format automatically from the file extension:
| Format | Extensions |
|---|---|
| Word Document | .docx, .doc |
| Rich Text | .rtf |
| Plain Text | .txt |
| Markdown | .md |
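The extension-to-format mapping in the table can be pictured with a small stand-alone sketch. It uses only the standard library and does not call Aspose.Words. The `INPUT_FORMATS` mapping and the `detect_input_format` helper are hypothetical names that mirror the table, and the case-insensitive match is an assumption rather than documented library behavior.

```python
from pathlib import Path

# Hypothetical mapping that mirrors the table above; not part of the library.
INPUT_FORMATS = {
    ".docx": "Word Document",
    ".doc": "Word Document",
    ".rtf": "Rich Text",
    ".txt": "Plain Text",
    ".md": "Markdown",
}


def detect_input_format(path):
    """Return the format name for a path, or None if the extension is not listed.

    Assumption: extensions are compared case-insensitively.
    """
    return INPUT_FORMATS.get(Path(path).suffix.lower())
```

A check like this can be useful before handing a path to Document, for example to skip unsupported files in a batch job.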
Output Formats and Save Options
Call Document.save() with a SaveFormat constant for quick conversion, or pass a save-options object for additional configuration.
| Output | SaveFormat Constant | Save Options Class |
|---|---|---|
| PDF | SaveFormat.PDF | PdfSaveOptions |
| Markdown | SaveFormat.MARKDOWN | MarkdownSaveOptions — control underline formatting export |
| Plain Text | SaveFormat.TEXT | — |
Note:
PdfSaveOptions properties (compliance, JPEG quality, font embedding, etc.) are defined for API forward-compatibility but are not yet consumed by the PDF writer. Setting them currently has no effect on output. For MarkdownSaveOptions, only export_underline_formatting is currently applied.
For code examples and save-options configuration details, see Core Management.
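As a rough illustration of the two ways to call Document.save() described in this section, the sketch below wraps the choice in a hypothetical `convert` helper. It is duck-typed and imports nothing from the library: `document` is assumed to be an already-loaded Document, and the argument order (output path first, then the format constant or options object) is an assumption, not something this page confirms.

```python
def convert(document, out_path, save_format=None, save_options=None):
    """Save an already-loaded document to out_path (hypothetical helper).

    Assumption: document.save(path, second_arg) accepts either a SaveFormat
    constant or a save-options object as its second argument.
    """
    if save_options is not None:
        # A save-options object takes precedence when both are supplied.
        document.save(out_path, save_options)
    elif save_format is not None:
        # Quick conversion using only a format constant.
        document.save(out_path, save_format)
    else:
        raise ValueError("pass either save_format or save_options")
    return out_path
```

Under these assumptions, a quick conversion would pass something like SaveFormat.MARKDOWN as `save_format`, while a configured export would pass a MarkdownSaveOptions instance as `save_options`.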
Text Extraction
Document.get_text() returns the full plain-text content of any loaded document without writing to disk.
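Because the extracted text is available in memory, it can be processed directly. The sketch below is a minimal example under stated assumptions: `count_words` is a hypothetical helper, `document` is any already-loaded Document, and get_text() is assumed to take no arguments and return a Python string.

```python
def count_words(document):
    """Count whitespace-separated words in a document's extracted text.

    Assumption: document.get_text() takes no arguments and returns a str.
    """
    text = document.get_text()
    # str.split() with no arguments splits on any run of whitespace.
    return len(text.split())
```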
Image Support
Documents with embedded images can be converted to all supported output formats. The conversion pipeline preserves image content through the export process.
API Summary
| Class / Method | Role |
|---|---|
| Document | Load documents, convert formats, extract text |
| SaveFormat | Output format constants (PDF, MARKDOWN, TEXT) |
| PdfSaveOptions | PDF export configuration |
| MarkdownSaveOptions | Markdown export configuration |