Working with the Note Document
This guide explains how to load, traverse, and export Microsoft OneNote .one files in Python using the Document class from Aspose.Note FOSS. The Document class is the root of the document object model (DOM), and every read, traversal, or export operation starts with a Document instance that represents a single .one section file.
Loading a Document
Load from a file path
Pass the path to a .one file directly to the Document constructor:
from aspose.note import Document
doc = Document("MyNotes.one")
Load from a binary stream
Use a file-like object when the data comes from a network response, cloud storage blob, or in-memory buffer:
from pathlib import Path
from aspose.note import Document
with Path("MyNotes.one").open("rb") as f:
    doc = Document(f)
Create an empty document
Construct a Document with no arguments to build a new OneNote document programmatically:
from aspose.note import Document, Page
doc = Document()
doc.AppendChildLast(Page())
Load with options
Use LoadOptions to set optional parameters such as history loading at construction time:
from aspose.note import Document, LoadOptions
opts = LoadOptions()
opts.LoadHistory = True # include page revision history in the DOM
doc = Document("MyNotes.one", opts)
| LoadOptions property | Type | Default | Description |
|---|---|---|---|
| LoadHistory | bool | False | When True, page history revisions are included in the DOM and accessible via Document.GetPageHistory |
| DocumentPassword | str or None | None | Reserved for API compatibility. Encrypted documents are not supported and raise IncorrectPasswordException |
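When the .one data is already in memory (for example, the body of a network response), the stream form and LoadOptions can be combined. The following is a minimal sketch, not part of the library: `load_from_bytes` is a hypothetical helper, and the document class is passed in as a parameter so the snippet runs without the package installed. It relies only on the `Document(path_or_stream, options?)` constructor forms described in this guide.

```python
import io


def load_from_bytes(data, document_cls, options=None):
    """Wrap raw bytes in a seekable buffer and hand it to the constructor.

    `document_cls` is expected to be aspose.note.Document; `options` is an
    optional LoadOptions instance. Both are supplied by the caller.
    """
    stream = io.BytesIO(data)
    if options is None:
        return document_cls(stream)
    return document_cls(stream, options)


# Usage sketch (requires aspose-note to be installed):
#   from aspose.note import Document, LoadOptions
#   opts = LoadOptions()
#   opts.LoadHistory = True
#   doc = load_from_bytes(raw_bytes, Document, opts)
```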
Document Format Detection
The FileFormat property identifies which OneNote format variant the file uses:
from aspose.note import Document, FileFormat
doc = Document("MyNotes.one")
if doc.FileFormat == FileFormat.OneNote2010:
    print("OneNote 2010 format")
elif doc.FileFormat == FileFormat.OneNoteOnline:
    print("OneNote Online format")
elif doc.FileFormat == FileFormat.OneNote2007:
    print("OneNote 2007 format")
else:
    print("Format:", doc.FileFormat)
| FileFormat value | Description |
|---|---|
| OneNote2010 | Standard .one file written by OneNote 2010 and later |
| OneNoteOnline | .one file written by OneNote Online / Microsoft 365 |
| OneNote2007 | Legacy .one format from OneNote 2007 |
| Unknown | Format could not be identified |
Iterating Pages
Pages are the direct children of Document. Use a for loop for simple iteration:
from aspose.note import Document
doc = Document("MyNotes.one")
for page in doc:
    title_text = (
        page.Title.TitleText.Text
        if page.Title and page.Title.TitleText
        else "(untitled)"
    )
    author = page.Author or "(unknown)"
    level = page.Level or 1
    indent = " " * (level - 1)
    print(f"{indent}{title_text} [by {author}]")
Use GetChildNodes(Page) to get a typed list of all pages at once:
from aspose.note import Document, Page
doc = Document("MyNotes.one")
pages = doc.GetChildNodes(Page)
print(f"Total pages: {len(pages)}")
Page metadata
| Property | Type | Description |
|---|---|---|
| Title | Title or None | Title block containing text, date, and time sub-nodes |
| Author | str or None | Author name string |
| CreationTime | datetime or None | When the page was created |
| LastModifiedTime | datetime or None | Most recent modification time |
| Level | int or None | Sub-page indent level (1 = top-level, 2 = sub-page, etc.) |
| IsConflictPage | bool | True if the page is a sync conflict copy |
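The metadata properties above are enough to build a simple indented outline of a section. The sketch below is illustrative only: `page_outline` is a hypothetical helper that reads the Title, Level, and IsConflictPage properties from the table and treats a missing Level as 1, as the iteration example does.

```python
def page_outline(pages):
    """Return one indented line per page, skipping sync conflict copies."""
    lines = []
    for page in pages:
        if page.IsConflictPage:
            continue  # conflict copies duplicate real pages
        title = page.Title
        text = title.TitleText.Text if title and title.TitleText else "(untitled)"
        level = page.Level or 1  # Level may be None; treat that as top-level
        lines.append("  " * (level - 1) + text)
    return lines


# Usage sketch (requires aspose-note to be installed):
#   from aspose.note import Document
#   print("\n".join(page_outline(Document("MyNotes.one"))))
```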
Traversing the Document Tree
Document inherits from CompositeNode, giving it full tree traversal methods:
from aspose.note import Document, RichText
doc = Document("MyNotes.one")
# Extract all plain text from the entire document
all_text = [rt.Text for rt in doc.GetChildNodes(RichText) if rt.Text]
print("\n".join(all_text))
The DOM hierarchy:
Document
└── Page (0..n)
    ├── Title
    │   ├── TitleText (RichText)
    │   ├── TitleDate (RichText)
    │   └── TitleTime (RichText)
    └── Outline (0..n)
        └── OutlineElement (0..n)
            ├── RichText
            ├── Image
            ├── Table
            │   └── TableRow → TableCell → RichText / Image
            └── AttachedFile
Key traversal methods inherited from CompositeNode:
| Method | Description |
|---|---|
| GetChildNodes(NodeType) | Return a typed list of all descendants matching the given type |
| AppendChildLast(node) | Add a child node at the end of the children list |
| AppendChildFirst(node) | Add a child node at the start |
| InsertChild(index, node) | Insert a child at a specific index |
| RemoveChild(node) | Remove a child node |
| GetEnumerator() | Iterate direct children |
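Because GetChildNodes returns all descendants of a given type, the same call can be scoped to one page at a time to keep text grouped by page. This sketch assumes that Page, like Document, exposes GetChildNodes (the hierarchy above suggests it, but this guide does not state it explicitly). `collect_page_text` is a hypothetical helper, and the node types are passed in as parameters so the snippet does not import the library itself.

```python
def collect_page_text(doc, page_type, text_type):
    """Map each page's position in the document to its non-empty text runs."""
    result = {}
    for index, page in enumerate(doc.GetChildNodes(page_type)):
        # Assumption: Page is a composite node with its own GetChildNodes.
        runs = [rt.Text for rt in page.GetChildNodes(text_type) if rt.Text]
        result[index] = runs
    return result


# Usage sketch (requires aspose-note to be installed):
#   from aspose.note import Document, Page, RichText
#   by_page = collect_page_text(Document("MyNotes.one"), Page, RichText)
```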
Page History
When loaded with LoadHistory=True, Document.GetPageHistory returns all saved revisions
of a page, ordered from oldest to most recent:
from aspose.note import Document, LoadOptions, Page
opts = LoadOptions()
opts.LoadHistory = True  # must be set before the document is loaded
doc = Document("MyNotes.one", opts)
# Get revisions for the first page
pages = doc.GetChildNodes(Page)
if pages:
    history = doc.GetPageHistory(pages[0])
    print(f"Revision count: {history.Count}")
    for rev in history:
        print(f"  Modified: {rev.LastModifiedTime}")
Note: GetPageHistory returns an empty PageHistory object when LoadHistory is False (the default). Always set opts.LoadHistory = True before loading if you need revision data.
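Once a history object is available, the revision timestamps can be summarized without further library calls. The following sketch uses only iteration and the LastModifiedTime property shown in this guide. `revision_span` is a hypothetical helper, and it tolerates revisions whose timestamp is None.

```python
def revision_span(history):
    """Return (oldest, newest) modification times, or None if none are set."""
    times = [
        rev.LastModifiedTime
        for rev in history
        if rev.LastModifiedTime is not None
    ]
    if not times:
        return None  # empty history, e.g. LoadHistory was left at False
    return min(times), max(times)


# Usage sketch (requires aspose-note and LoadHistory = True):
#   span = revision_span(doc.GetPageHistory(pages[0]))
#   if span:
#       print(f"Edited between {span[0]} and {span[1]}")
```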
Saving and Exporting
Use Document.Save to export to PDF. PDF is the only supported export format.
Save to a file
Call Document.Save() with a file path and SaveFormat.Pdf to write the export directly to disk:
from aspose.note import Document, SaveFormat
doc = Document("MyNotes.one")
doc.Save("output.pdf", SaveFormat.Pdf)
Save to an in-memory stream
Pass a writable io.BytesIO buffer instead of a path to capture the PDF bytes in memory:
import io
from aspose.note import Document, SaveFormat
doc = Document("MyNotes.one")
buf = io.BytesIO()
doc.Save(buf, SaveFormat.Pdf)
pdf_bytes = buf.getvalue()
Save with options
Use PdfSaveOptions from aspose.note.saving to control which pages are included in the exported PDF output:
import io
from aspose.note import Document
from aspose.note.saving import PdfSaveOptions
doc = Document("MyNotes.one")
# Export pages 2 and 3 only (PageIndex is zero-based, so 1 is the second page)
opts = PdfSaveOptions(PageIndex=1, PageCount=2)
buf = io.BytesIO()
doc.Save(buf, opts)
pdf_bytes = buf.getvalue()
Important: PDF export requires the optional ReportLab dependency. Install it with pip install "aspose-note[pdf]".
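Since PageIndex is zero-based while people usually talk about pages starting at 1, an off-by-one error is easy to make when choosing a page range. The sketch below converts a 1-based inclusive range into the PageIndex and PageCount keyword arguments used in the example above. `page_window` is a hypothetical helper, not part of the library.

```python
def page_window(first_page, last_page):
    """Convert a 1-based inclusive page range to PageIndex/PageCount values."""
    if first_page < 1 or last_page < first_page:
        raise ValueError("expected 1 <= first_page <= last_page")
    return {
        "PageIndex": first_page - 1,              # zero-based start
        "PageCount": last_page - first_page + 1,  # inclusive range length
    }


# Usage sketch (requires aspose-note to be installed):
#   from aspose.note.saving import PdfSaveOptions
#   opts = PdfSaveOptions(**page_window(2, 3))  # same as PageIndex=1, PageCount=2
```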
Detecting Layout Changes
Call DetectLayoutChanges after modifying the document to recalculate layout metrics
before saving:
from aspose.note import Document, Page
doc = Document()
page = Page()
doc.AppendChildLast(page)
# ... add content to page ...
doc.DetectLayoutChanges()
doc.Save("output.pdf")
Tips and Best Practices
- Load once, query many times: Document loads the entire .one file into memory on construction. Hold one Document instance and call GetChildNodes repeatedly rather than re-loading the file for each query.
- Check IsConflictPage: OneNote sync can produce conflict pages. Filter with [p for p in doc if not p.IsConflictPage] before processing content.
- Use LoadHistory=False (default) for speed: Page history adds significant memory and parse overhead. Only set LoadHistory=True when you explicitly need revision data.
- Stream for large files: Pass an open file object inside a with block instead of a path string when reading large .one files, so the file handle is closed as soon as parsing completes.
- Call DetectLayoutChanges before Save: When constructing a document programmatically and appending nodes, call DetectLayoutChanges() to ensure layout coordinates are consistent before exporting to PDF.
Common Issues
| Issue | Cause | Fix |
|---|---|---|
| IncorrectPasswordException on load | File is encrypted; encrypted documents are not supported | Obtain an unencrypted copy of the file |
| FileCorruptedException on load | File is truncated or has an unsupported internal format | Verify the file is a valid .one section (not a .onetoc2 table-of-contents) |
| UnsupportedSaveFormatException on save | Attempted to save to a format other than PDF | Only PDF format is supported for export |
| Empty GetPageHistory result | LoadOptions.LoadHistory was not set to True before loading | Re-load the document with opts.LoadHistory = True |
| ModuleNotFoundError: reportlab | PDF export attempted without the PDF extra | Run pip install "aspose-note[pdf]" |
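The missing-ReportLab case in the table can be caught before any export work starts. This is a minimal sketch using only the standard library: `module_available` and `require_pdf_support` are hypothetical helper names, and the check assumes the PDF extra installs a top-level module named reportlab, as the error message in the table indicates.

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


def require_pdf_support():
    """Fail early with an actionable message when ReportLab is missing."""
    if not module_available("reportlab"):
        raise RuntimeError(
            'PDF export needs ReportLab; run: pip install "aspose-note[pdf]"'
        )


# Usage sketch: call require_pdf_support() once at startup, before doc.Save(...).
```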
FAQ
What OneNote formats can Aspose.Note read?
All three OneNote section formats: OneNote2007, OneNote2010, and OneNoteOnline.
Use doc.FileFormat to identify which variant was loaded.
Can I write back to a .one file after reading it?
Aspose.Note for Python is a read-and-export library. Use Document.Save to write PDF output. The OneNote binary format is read-only in this FOSS edition.
How do I read a specific page by title?
Iterate pages with a generator expression and compare the title text property to your target string:
from aspose.note import Document
doc = Document("MyNotes.one")
target = next(
    (p for p in doc
     if p.Title and p.Title.TitleText and p.Title.TitleText.Text == "My Page"),
    None,
)
How do I count sub-pages separately from top-level pages?
Use page.Level: level 1 is top-level, level 2 is a sub-page, and so on.
from aspose.note import Document
doc = Document("MyNotes.one")
top_level = [p for p in doc if (p.Level or 1) == 1]
sub_pages = [p for p in doc if (p.Level or 1) > 1]
print(f"Top-level: {len(top_level)}, Sub-pages: {len(sub_pages)}")
Does Aspose.Note support .onetoc2 table-of-contents files?
No. Only .one section files are supported. Loading a .onetoc2 notebook file raises FileCorruptedException.
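When scanning a notebook folder, filtering by file extension before loading avoids triggering that exception for table-of-contents files. The sketch below uses only pathlib. `is_section_file` is a hypothetical helper and checks the extension only, not the file contents.

```python
from pathlib import Path


def is_section_file(path):
    """Return True for .one section files, False for anything else."""
    return Path(path).suffix.lower() == ".one"


# Usage sketch: collect only loadable section files from a notebook folder.
#   sections = [p for p in Path("MyNotebook").iterdir() if is_section_file(p)]
```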
API Reference Summary
| Class / Method | Description |
|---|---|
| Document(path_or_stream, options?) | Load a .one file or create an empty document |
| Document.FileFormat | Detected format variant (FileFormat enum) |
| Document.GetChildNodes(NodeType) | Retrieve all descendant nodes of a given type |
| Document.GetPageHistory(page) | Get revision history for a page (requires LoadHistory=True) |
| Document.Save(target, format_or_options) | Export to PDF by file path or stream |
| Document.DetectLayoutChanges() | Recalculate layout metrics after programmatic edits |
| Document.Accept(visitor) | Walk the DOM using a DocumentVisitor subclass |
| LoadOptions.LoadHistory | Set to True to include page revisions in the DOM |
| LoadOptions.DocumentPassword | Reserved; encrypted files raise IncorrectPasswordException |
| FileFormat | Enum: OneNote2010, OneNoteOnline, OneNote2007, Unknown |
| SaveFormat.Pdf | The only supported export format; raises ValueError for other formats |
| PageHistory.Count | Number of page revisions available |
| Page.Level | Sub-page depth (1 = top-level) |
| Page.IsConflictPage | True for sync conflict copies |
See Also
- Text Extraction: extracting plain text and formatted runs
- PDF Export: export options, page selection, and stream output
- Images and Attachments: extracting embedded images and attached files
- Tables: parsing table structure and cell content
- API Reference: Document
- Aspose.Note — Enterprise Documentation
- Aspose.Note for Python — Enterprise Documentation