Working with the Note Document
The Document class is the root of the Aspose.Note for Python document object model (DOM).
Every read, traversal, or export operation starts with a Document instance that represents
a single .one section file.
Loading a Document
Load from a file path
Pass the path to a .one file directly to the Document constructor:
from aspose.note import Document
doc = Document("MyNotes.one")
Load from a binary stream
Use a file-like object when the data comes from a network response, cloud storage blob, or in-memory buffer:
from pathlib import Path
from aspose.note import Document
with Path("MyNotes.one").open("rb") as f:
    doc = Document(f)
Create an empty document
Construct a Document with no arguments to build a new OneNote document programmatically:
from aspose.note import Document, Page
doc = Document()
doc.AppendChildLast(Page())
Load with options
Use LoadOptions to set optional parameters at load time:
from aspose.note import Document, LoadOptions
opts = LoadOptions()
opts.LoadHistory = True # include page revision history in the DOM
doc = Document("MyNotes.one", opts)
| LoadOptions property | Type | Default | Description |
|---|---|---|---|
| LoadHistory | bool | False | When True, page history revisions are included in the DOM and accessible via Document.GetPageHistory |
| DocumentPassword | str \| None | None | Reserved for API compatibility. Encrypted documents are not supported and raise IncorrectPasswordException |
Document Format Detection
The FileFormat property identifies which OneNote format variant the file uses:
from aspose.note import Document, FileFormat
doc = Document("MyNotes.one")
if doc.FileFormat == FileFormat.OneNote2010:
    print("OneNote 2010 format")
elif doc.FileFormat == FileFormat.OneNoteOnline:
    print("OneNote Online format")
elif doc.FileFormat == FileFormat.OneNote2007:
    print("OneNote 2007 format")
else:
    print("Format:", doc.FileFormat)
| FileFormat value | Description |
|---|---|
| OneNote2010 | Standard .one file written by OneNote 2010 and later |
| OneNoteOnline | .one file written by OneNote Online / Microsoft 365 |
| OneNote2007 | Legacy .one format from OneNote 2007 |
| Unknown | Format could not be identified |
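As a rough sketch, the if/elif chain above can be collapsed into a lookup keyed by the enum member name. `FormatStandIn` is a hypothetical stand-in that mirrors the member names in the table so the snippet runs without the library installed. Whether the real `FileFormat` members expose a `.name` attribute like standard Python enums is an assumption, so the helper falls back to `str()`.

```python
from enum import Enum


# Hypothetical stand-in mirroring the FileFormat member names listed above.
class FormatStandIn(Enum):
    OneNote2010 = 1
    OneNoteOnline = 2
    OneNote2007 = 3
    Unknown = 4


DESCRIPTIONS = {
    "OneNote2010": "Standard .one file (OneNote 2010 and later)",
    "OneNoteOnline": ".one file written by OneNote Online / Microsoft 365",
    "OneNote2007": "Legacy OneNote 2007 format",
}


def describe_format(fmt) -> str:
    # Assumption: enum-like objects expose .name; otherwise use str(fmt).
    name = getattr(fmt, "name", str(fmt))
    return DESCRIPTIONS.get(name, "Format could not be identified")


print(describe_format(FormatStandIn.OneNote2007))
```

With the real library, the same helper would presumably be called as `describe_format(doc.FileFormat)`.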
Iterating Pages
Pages are the direct children of Document. Use a for loop for simple iteration:
from aspose.note import Document
doc = Document("MyNotes.one")
for page in doc:
    title_text = (
        page.Title.TitleText.Text
        if page.Title and page.Title.TitleText
        else "(untitled)"
    )
    author = page.Author or "(unknown)"
    level = page.Level or 1
    indent = " " * (level - 1)
    print(f"{indent}{title_text} [by {author}]")
Use GetChildNodes(Page) to get a typed list of all pages at once:
from aspose.note import Document, Page
doc = Document("MyNotes.one")
pages = doc.GetChildNodes(Page)
print(f"Total pages: {len(pages)}")
Page metadata
| Property | Type | Description |
|---|---|---|
| Title | Title \| None | Title block containing text, date, and time sub-nodes |
| Author | str \| None | Author name string |
| CreationTime | datetime \| None | When the page was created |
| LastModifiedTime | datetime \| None | Most recent modification time |
| Level | int \| None | Sub-page indent level (1 = top-level, 2 = sub-page, etc.) |
| IsConflictPage | bool | True if the page is a sync conflict copy |
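Because most of these properties can be None, it may help to normalize them in one place. The following is a minimal sketch, not part of the library: `page_summary` is a hypothetical helper, and the `SimpleNamespace` object is a stand-in for a real Page so the snippet runs on its own.

```python
from datetime import datetime
from types import SimpleNamespace


def page_summary(page) -> dict:
    """Collect page metadata, substituting defaults for None values."""
    has_title = page.Title and page.Title.TitleText
    modified = page.LastModifiedTime
    return {
        "title": page.Title.TitleText.Text if has_title else "(untitled)",
        "author": page.Author or "(unknown)",
        "level": page.Level or 1,
        "modified": modified.isoformat() if modified else None,
        "conflict": bool(page.IsConflictPage),
    }


# Stand-in object with the attribute names from the table above.
sample = SimpleNamespace(
    Title=SimpleNamespace(TitleText=SimpleNamespace(Text="Meeting notes")),
    Author=None,
    CreationTime=None,
    LastModifiedTime=datetime(2024, 1, 15, 9, 30),
    Level=2,
    IsConflictPage=False,
)
print(page_summary(sample))
```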
Traversing the Document Tree
Document inherits from CompositeNode, giving it full tree traversal methods:
from aspose.note import Document, RichText
doc = Document("MyNotes.one")
# Extract all plain text from the entire document
all_text = [rt.Text for rt in doc.GetChildNodes(RichText) if rt.Text]
print("\n".join(all_text))
The DOM hierarchy:
Document
└── Page (0..n)
    ├── Title
    │   ├── TitleText (RichText)
    │   ├── TitleDate (RichText)
    │   └── TitleTime (RichText)
    └── Outline (0..n)
        └── OutlineElement (0..n)
            ├── RichText
            ├── Image
            ├── Table
            │   └── TableRow → TableCell → RichText / Image
            └── AttachedFile
Key traversal methods inherited from CompositeNode:
| Method | Description |
|---|---|
| GetChildNodes(NodeType) | Return a typed list of all descendants matching the given type |
| AppendChildLast(node) | Add a child node at the end of the children list |
| AppendChildFirst(node) | Add a child node at the start |
| InsertChild(index, node) | Insert a child at a specific index |
| RemoveChild(node) | Remove a child node |
| GetEnumerator() | Iterate direct children |
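To see the hierarchy of a particular file, a generic depth-first walk can print each node's type with its depth. This is a sketch under one assumption: that every composite node, not only Document, can be iterated over its direct children in a for loop. The `Fake*` classes are hypothetical stand-ins so the snippet runs without the library.

```python
def walk(node, depth=0):
    """Yield (depth, type name) for a node and all descendants, depth-first."""
    yield depth, type(node).__name__
    # Assumption: composite nodes are iterable; leaf nodes are not.
    try:
        children = iter(node)
    except TypeError:
        return
    for child in children:
        yield from walk(child, depth + 1)


# Hypothetical stand-ins: list subclasses act as composite nodes.
class FakeDocument(list): pass
class FakePage(list): pass
class FakeOutline(list): pass
class FakeRichText: pass


tree = FakeDocument([FakePage([FakeOutline([FakeRichText()])]), FakePage()])
for depth, name in walk(tree):
    print("  " * depth + name)
```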
Page History
When loaded with LoadHistory=True, Document.GetPageHistory returns all saved revisions
of a page, ordered from oldest to most recent:
from aspose.note import Document, LoadOptions, Page
opts = LoadOptions()
opts.LoadHistory = True  # must be set before the document is loaded
doc = Document("MyNotes.one", opts)
# Get revisions for the first page
pages = doc.GetChildNodes(Page)
if pages:
    history = doc.GetPageHistory(pages[0])
    print(f"Revision count: {history.Count}")
    for rev in history:
        print(f"  Modified: {rev.LastModifiedTime}")
Note:
GetPageHistory returns an empty PageHistory object when LoadHistory is False (the default). Always set opts.LoadHistory = True before loading if you need revision data.
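Once revisions are available, a small helper can summarize how long a page has been edited. This is an illustrative sketch only: `revision_span` is a hypothetical helper, and the list of `SimpleNamespace` objects stands in for a PageHistory, relying only on the iteration and LastModifiedTime access shown in the example above.

```python
from datetime import datetime
from types import SimpleNamespace


def revision_span(history):
    """Return (oldest, newest) LastModifiedTime, or None if there is none."""
    times = [
        rev.LastModifiedTime
        for rev in history
        if rev.LastModifiedTime is not None
    ]
    if not times:
        return None
    return min(times), max(times)


# Stand-in revisions; a real PageHistory would be iterated the same way.
fake_history = [
    SimpleNamespace(LastModifiedTime=datetime(2024, 1, 1)),
    SimpleNamespace(LastModifiedTime=None),
    SimpleNamespace(LastModifiedTime=datetime(2024, 3, 1)),
]
span = revision_span(fake_history)
if span:
    print(f"Edited over {(span[1] - span[0]).days} days")
```

Returning None for an empty history also covers the case where the document was loaded without LoadHistory.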
Saving and Exporting
Use Document.Save to export to PDF. PDF is the only supported export format.
Save to a file
from aspose.note import Document, SaveFormat
doc = Document("MyNotes.one")
doc.Save("output.pdf", SaveFormat.Pdf)
Save to an in-memory stream
import io
from aspose.note import Document, SaveFormat
doc = Document("MyNotes.one")
buf = io.BytesIO()
doc.Save(buf, SaveFormat.Pdf)
pdf_bytes = buf.getvalue()
Save with options
Use PdfSaveOptions (from aspose.note.saving) to control page selection:
import io
from aspose.note import Document
from aspose.note.saving import PdfSaveOptions
doc = Document("MyNotes.one")
# Export pages 2 and 3 only (zero-based index)
opts = PdfSaveOptions(PageIndex=1, PageCount=2)
buf = io.BytesIO()
doc.Save(buf, opts)
pdf_bytes = buf.getvalue()
Important: PDF export requires the optional ReportLab dependency. Install it with pip install "aspose-note[pdf]".
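To split a long section into several smaller PDFs, the PageIndex and PageCount values can be computed up front. The helper below is a hypothetical sketch in plain Python. It only produces the zero-based (index, count) pairs, and each pair would then be passed to PdfSaveOptions as in the example above.

```python
def page_batches(total_pages: int, batch_size: int):
    """Return (page_index, page_count) pairs covering every page once."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [
        (start, min(batch_size, total_pages - start))
        for start in range(0, total_pages, batch_size)
    ]


# Seven pages in batches of three: the last batch holds the remainder.
print(page_batches(7, 3))  # [(0, 3), (3, 3), (6, 1)]
```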
Detecting Layout Changes
Call DetectLayoutChanges after modifying the document to recalculate layout metrics
before saving:
from aspose.note import Document, Page
doc = Document()
page = Page()
doc.AppendChildLast(page)
# ... add content to page ...
doc.DetectLayoutChanges()
doc.Save("output.pdf")
Tips and Best Practices
- Load once, query many times: Document loads the entire .one file into memory on construction. Hold one Document instance and call GetChildNodes repeatedly rather than re-loading the file for each query.
- Check IsConflictPage: OneNote sync can produce conflict pages. Filter with [p for p in doc if not p.IsConflictPage] before processing content.
- Use LoadHistory=False (default) for speed: Page history adds significant memory and parse overhead. Only set LoadHistory=True when you explicitly need revision data.
- Stream for large files: Pass an open file object instead of a path string when reading large .one files to avoid holding the file open after the parse completes.
- Call DetectLayoutChanges before Save: When constructing a document programmatically and appending nodes, call DetectLayoutChanges() to ensure layout coordinates are consistent before exporting to PDF.
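One way to apply the "load once" tip is to cache loaded documents by path. This is a minimal sketch using functools.lru_cache from the standard library. `make_cached_loader` and `fake_load` are hypothetical names, and with the real library the Document constructor would presumably be passed as `load`.

```python
from functools import lru_cache


def make_cached_loader(load, maxsize=8):
    """Wrap a loader so repeated requests for one path reuse one object."""
    @lru_cache(maxsize=maxsize)
    def cached(path: str):
        return load(path)
    return cached


# Hypothetical stand-in loader that records how often it is called.
calls = []


def fake_load(path):
    calls.append(path)
    return {"path": path}


get_doc = make_cached_loader(fake_load)
first = get_doc("MyNotes.one")
second = get_doc("MyNotes.one")
print(first is second, len(calls))
```

The cache keeps each loaded object in memory and does not notice when the file changes on disk, so it suits read-only batch jobs better than long-running services.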
Common Issues
| Issue | Cause | Fix |
|---|---|---|
| IncorrectPasswordException on load | File is encrypted; encrypted documents are not supported | Obtain an unencrypted copy of the file |
| FileCorruptedException on load | File is truncated or has an unsupported internal format | Verify the file is a valid .one section (not a .onetoc2 table-of-contents) |
| UnsupportedSaveFormatException on save | Attempted to save to a format other than PDF | Only SaveFormat.Pdf is supported for export |
| Empty GetPageHistory result | LoadOptions.LoadHistory was not set to True before loading | Re-load the document with opts.LoadHistory = True |
| ModuleNotFoundError: reportlab | PDF export attempted without the PDF extra | Run pip install "aspose-note[pdf]" |
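The missing-ReportLab case in the last row can be detected before any export is attempted. The check below is a small sketch using only the standard library. `pdf_export_available` is a hypothetical helper name, and it only tests whether the reportlab package is importable, not whether the export itself will succeed.

```python
import importlib.util


def pdf_export_available() -> bool:
    """True if the reportlab package can be found in this environment."""
    return importlib.util.find_spec("reportlab") is not None


if not pdf_export_available():
    print('PDF export unavailable; run: pip install "aspose-note[pdf]"')
```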
FAQ
What OneNote formats can Aspose.Note read?
All three OneNote section formats: OneNote2007, OneNote2010, and OneNoteOnline.
Use doc.FileFormat to identify which variant was loaded.
Can I write back to a .one file after reading it?
No. Writing back to .one is not implemented. Export to PDF with Document.Save is the
only supported output operation.
How do I read a specific page by title?
Iterate pages and compare page.Title.TitleText.Text to your target string:
from aspose.note import Document
doc = Document("MyNotes.one")
target = next(
(p for p in doc
if p.Title and p.Title.TitleText and p.Title.TitleText.Text == "My Page"),
None,
)
How do I count sub-pages separately from top-level pages?
Use page.Level: level 1 is top-level, level 2 is a sub-page, and so on.
from aspose.note import Document
doc = Document("MyNotes.one")
top_level = [p for p in doc if (p.Level or 1) == 1]
sub_pages = [p for p in doc if (p.Level or 1) > 1]
print(f"Top-level: {len(top_level)}, Sub-pages: {len(sub_pages)}")
Does Aspose.Note support .onetoc2 table-of-contents files?
No. Only .one section files are supported. .onetoc2 notebooks will raise
FileCorruptedException.
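When processing a whole notebook folder, unsupported files can be filtered out by extension before loading. This is a rough sketch with a hypothetical helper name. It inspects only the file name, not the contents, so a renamed or damaged file can still fail at load time.

```python
from pathlib import Path


def is_section_file(path) -> bool:
    """True for .one section files; False for .onetoc2 and anything else."""
    return Path(path).suffix.lower() == ".one"


candidates = ["Work.one", "Open Notebook.onetoc2", "ARCHIVE.ONE", "notes.txt"]
print([name for name in candidates if is_section_file(name)])
```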
API Reference Summary
| Class / Method | Description |
|---|---|
| Document(path_or_stream, options?) | Load a .one file or create an empty document |
| Document.FileFormat | Detected format variant (FileFormat enum) |
| Document.GetChildNodes(NodeType) | Retrieve all descendant nodes of a given type |
| Document.GetPageHistory(page) | Get revision history for a page (requires LoadHistory=True) |
| Document.Save(target, format_or_options) | Export to PDF by file path or stream |
| Document.DetectLayoutChanges() | Recalculate layout metrics after programmatic edits |
| Document.Accept(visitor) | Walk the DOM using a DocumentVisitor subclass |
| LoadOptions.LoadHistory | Set to True to include page revisions in the DOM |
| LoadOptions.DocumentPassword | Reserved; encrypted files raise IncorrectPasswordException |
| FileFormat | Enum: OneNote2010, OneNoteOnline, OneNote2007, Unknown |
| SaveFormat.Pdf | The only supported export format |
| PageHistory.Count | Number of page revisions available |
| Page.Level | Sub-page depth (1 = top-level) |
| Page.IsConflictPage | True for sync conflict copies |
See Also
- Text Extraction: extracting plain text and formatted runs
- PDF Export: export options, page selection, and stream output
- Images and Attachments: extracting embedded images and attached files
- Tables: parsing table structure and cell content
- API Reference: Document