Working with the Note Document

This guide explains how to load, traverse, and export Microsoft OneNote .one files in Python using the Document class from Aspose.Note FOSS. The Document class is the root of the document object model (DOM), and every read, traversal, or export operation starts with a Document instance that represents a single .one section file.


Loading a Document

Load from a file path

Pass the path to a .one file directly to the Document constructor:

from aspose.note import Document

doc = Document("MyNotes.one")

Load from a binary stream

Use a file-like object when the data comes from a network response, cloud storage blob, or in-memory buffer:

from pathlib import Path
from aspose.note import Document

with Path("MyNotes.one").open("rb") as f:
    doc = Document(f)
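
When the data is already in memory, for example after downloading it from storage, it can be wrapped in io.BytesIO before being handed to Document. The sketch below is illustrative only: load_from_bytes and its document_factory parameter are hypothetical names, and it assumes Document accepts any binary file-like object, as the example above suggests. A stand-in factory is used so the sketch runs without the library installed.

```python
import io
from typing import Callable, TypeVar

T = TypeVar("T")


def load_from_bytes(data: bytes, document_factory: Callable[[io.BytesIO], T]) -> T:
    """Wrap raw bytes in a seekable binary stream and hand it to the factory."""
    if not data:
        raise ValueError("no data to load")
    return document_factory(io.BytesIO(data))


# Stand-in factory so the sketch runs without the library installed;
# in real use, pass aspose.note.Document as document_factory.
loaded = load_from_bytes(b"example-bytes", lambda stream: stream.read())
print(loaded)
```

Rejecting empty input up front gives a clearer error than whatever the parser would raise on a zero-length stream.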

Create an empty document

Construct a Document with no arguments to build a new OneNote document programmatically:

from aspose.note import Document, Page

doc = Document()
doc.AppendChildLast(Page())

Load with options

Use LoadOptions to set optional parameters such as history loading at construction time:

from aspose.note import Document, LoadOptions

opts = LoadOptions()
opts.LoadHistory = True   # include page revision history in the DOM
doc = Document("MyNotes.one", opts)

LoadOptions property  Type        Default  Description
--------------------  ----------  -------  -----------
LoadHistory           bool        False    When True, page history revisions are included in the DOM and accessible via Document.GetPageHistory
DocumentPassword      str | None  None     Reserved for API compatibility; encrypted documents are not supported and raise IncorrectPasswordException

Document Format Detection

The FileFormat property identifies which OneNote format variant the file uses:

from aspose.note import Document, FileFormat

doc = Document("MyNotes.one")

if doc.FileFormat == FileFormat.OneNote2010:
    print("OneNote 2010 format")
elif doc.FileFormat == FileFormat.OneNoteOnline:
    print("OneNote Online format")
elif doc.FileFormat == FileFormat.OneNote2007:
    print("OneNote 2007 format")
else:
    print("Format:", doc.FileFormat)

FileFormat value  Description
----------------  -----------
OneNote2010       Standard .one file written by OneNote 2010 and later
OneNoteOnline     .one file written by OneNote Online / Microsoft 365
OneNote2007       Legacy .one format from OneNote 2007
Unknown           Format could not be identified
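
The if/elif chain above can also be written as a lookup table, which is easier to extend. This is a sketch under assumptions: it assumes FileFormat members expose a name attribute as standard Python enums do, and StandInFileFormat with its numeric values is a hypothetical stand-in so the example runs without the library. describe_format and FORMAT_LABELS are illustrative names.

```python
from enum import Enum


class StandInFileFormat(Enum):
    """Hypothetical stand-in for aspose.note.FileFormat; the values are made up."""
    Unknown = 0
    OneNote2007 = 1
    OneNote2010 = 2
    OneNoteOnline = 3


FORMAT_LABELS = {
    "OneNote2010": "OneNote 2010 format",
    "OneNoteOnline": "OneNote Online format",
    "OneNote2007": "OneNote 2007 format",
}


def describe_format(fmt) -> str:
    """Return a label for a format member, falling back for unlisted ones."""
    name = getattr(fmt, "name", str(fmt))
    return FORMAT_LABELS.get(name, f"Format: {name}")


print(describe_format(StandInFileFormat.OneNote2010))
print(describe_format(StandInFileFormat.Unknown))
```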

Iterating Pages

Pages are the direct children of Document. Use a for loop for simple iteration:

from aspose.note import Document

doc = Document("MyNotes.one")

for page in doc:
    title_text = (
        page.Title.TitleText.Text
        if page.Title and page.Title.TitleText
        else "(untitled)"
    )
    author = page.Author or "(unknown)"
    level = page.Level or 1
    indent = "  " * (level - 1)
    print(f"{indent}{title_text}  [by {author}]")

Use GetChildNodes(Page) to get a typed list of all pages at once:

from aspose.note import Document, Page

doc = Document("MyNotes.one")
pages = doc.GetChildNodes(Page)
print(f"Total pages: {len(pages)}")

Page metadata

Property          Type             Description
----------------  ---------------  -----------
Title             Title | None     Title block containing text, date, and time sub-nodes
Author            str | None       Author name string
CreationTime      datetime | None  When the page was created
LastModifiedTime  datetime | None  Most recent modification time
Level             int | None       Sub-page indent level (1 = top-level, 2 = sub-page, etc.)
IsConflictPage    bool             True if the page is a sync conflict copy
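
Because most of these properties can be None, it may help to normalize them in one place before reporting or serializing. The following is a minimal sketch: page_summary is a hypothetical helper, and the SimpleNamespace object is a stand-in that only mimics the attribute names listed in the table, so the example runs without the library.

```python
from datetime import datetime
from types import SimpleNamespace


def page_summary(page) -> dict:
    """Collect page metadata into a plain dict, substituting defaults for None."""
    has_title = page.Title and page.Title.TitleText
    modified = page.LastModifiedTime
    return {
        "title": page.Title.TitleText.Text if has_title else "(untitled)",
        "author": page.Author or "(unknown)",
        "level": page.Level or 1,
        "modified": modified.isoformat() if modified else None,
        "conflict": bool(page.IsConflictPage),
    }


# Stand-in page object; a real Page from aspose.note would be passed instead.
sample = SimpleNamespace(
    Title=SimpleNamespace(TitleText=SimpleNamespace(Text="Meeting notes")),
    Author=None,
    CreationTime=None,
    LastModifiedTime=datetime(2024, 1, 15, 9, 30),
    Level=2,
    IsConflictPage=False,
)
summary = page_summary(sample)
print(summary)
```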

Traversing the Document Tree

Document inherits from CompositeNode, giving it full tree traversal methods:

from aspose.note import Document, RichText

doc = Document("MyNotes.one")

# Extract all plain text from the entire document
all_text = [rt.Text for rt in doc.GetChildNodes(RichText) if rt.Text]
print("\n".join(all_text))

The DOM hierarchy:

Document
  └── Page (0..n)
        ├── Title
        │     ├── TitleText (RichText)
        │     ├── TitleDate (RichText)
        │     └── TitleTime (RichText)
        └── Outline (0..n)
              └── OutlineElement (0..n)
                    ├── RichText
                    ├── Image
                    ├── Table
                    │     └── TableRow → TableCell → RichText / Image
                    └── AttachedFile
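
The hierarchy above can be explored with a small recursive walk that relies only on iterating each node's direct children. This is a sketch, not the library's own traversal API: walk is a hypothetical helper, the classes below are stand-ins that borrow the DOM names so the example runs without the library, and it assumes leaf nodes are not iterable.

```python
class _Composite:
    """Hypothetical stand-in for a composite node: iterating yields direct children."""

    def __init__(self, *children):
        self._children = list(children)

    def __iter__(self):
        return iter(self._children)


class Document(_Composite): ...
class Page(_Composite): ...
class Outline(_Composite): ...
class OutlineElement(_Composite): ...
class RichText: ...  # leaf stand-in: not iterable


def walk(node, depth=0):
    """Yield (depth, type name) for a node and all of its descendants, depth-first."""
    yield depth, type(node).__name__
    try:
        children = iter(node)
    except TypeError:
        return  # leaf node, nothing to descend into
    for child in children:
        yield from walk(child, depth + 1)


tree = Document(Page(Outline(OutlineElement(RichText(), RichText()))))
outline = list(walk(tree))
for depth, name in outline:
    print("  " * depth + name)
```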

Key traversal methods inherited from CompositeNode:

Method                    Description
------------------------  -----------
GetChildNodes(NodeType)   Return a typed list of all descendants matching the given type
AppendChildLast(node)     Add a child node at the end of the children list
AppendChildFirst(node)    Add a child node at the start
InsertChild(index, node)  Insert a child at a specific index
RemoveChild(node)         Remove a child node
GetEnumerator()           Iterate direct children

Page History

When loaded with LoadHistory=True, Document.GetPageHistory returns all saved revisions of a page, ordered from oldest to most recent:

from aspose.note import Document, LoadOptions, Page

opts = LoadOptions()
opts.LoadHistory = True   # must be set before the document is loaded
doc = Document("MyNotes.one", opts)

# Get revisions for the first page, if the section has any pages
pages = doc.GetChildNodes(Page)
if pages:
    history = doc.GetPageHistory(pages[0])
    print(f"Revision count: {history.Count}")
    for rev in history:
        print(f"  Modified: {rev.LastModifiedTime}")

Note: GetPageHistory returns an empty PageHistory object when LoadHistory is False (the default). Always set opts.LoadHistory = True before loading if you need revision data.
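
Once the revisions are available, picking one out is ordinary Python. The sketch below selects the most recently modified revision without depending on the order of the history. latest_revision is a hypothetical helper, and the SimpleNamespace objects are stand-ins that only mimic the LastModifiedTime attribute used above.

```python
from datetime import datetime
from types import SimpleNamespace


def latest_revision(history):
    """Return the revision with the newest LastModifiedTime, or None if none is dated."""
    dated = [rev for rev in history if rev.LastModifiedTime is not None]
    return max(dated, key=lambda rev: rev.LastModifiedTime, default=None)


# Stand-in revisions; a real PageHistory would be passed instead.
sample_history = [
    SimpleNamespace(LastModifiedTime=datetime(2024, 3, 1, 10, 0)),
    SimpleNamespace(LastModifiedTime=None),
    SimpleNamespace(LastModifiedTime=datetime(2024, 3, 5, 16, 45)),
]
newest = latest_revision(sample_history)
print(newest.LastModifiedTime)
```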


Saving and Exporting

Use Document.Save to export to PDF. PDF is the only supported export format.

Save to a file

Call Document.Save() with a file path and SaveFormat.Pdf to write the export directly to disk:

from aspose.note import Document, SaveFormat

doc = Document("MyNotes.one")
doc.Save("output.pdf", SaveFormat.Pdf)

Save to an in-memory stream

Pass a writable io.BytesIO buffer instead of a path to capture the PDF bytes in memory:

import io
from aspose.note import Document, SaveFormat

doc = Document("MyNotes.one")
buf = io.BytesIO()
doc.Save(buf, SaveFormat.Pdf)
pdf_bytes = buf.getvalue()
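
Before passing the captured bytes on, for example in an HTTP response, a cheap sanity check is to confirm they start with the PDF header marker %PDF-. A minimal sketch follows; looks_like_pdf is a hypothetical helper, and the buffer is filled with placeholder bytes so the example runs without the library.

```python
import io

PDF_MAGIC = b"%PDF-"


def looks_like_pdf(data: bytes) -> bool:
    """Return True if the bytes begin with the PDF header marker."""
    return data.startswith(PDF_MAGIC)


# Stand-in buffer; in real use it would be filled by doc.Save(buf, SaveFormat.Pdf).
buf = io.BytesIO()
buf.write(b"%PDF-1.4\n% placeholder body\n")
pdf_bytes = buf.getvalue()
print(looks_like_pdf(pdf_bytes))
```

This only checks the header, so it catches an empty or obviously wrong buffer but does not validate the PDF structure.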

Save with options

Use PdfSaveOptions from aspose.note.saving to control which pages are included in the exported PDF output:

import io
from aspose.note import Document
from aspose.note.saving import PdfSaveOptions

doc = Document("MyNotes.one")

# Export pages 2 and 3 only (zero-based index)
opts = PdfSaveOptions(PageIndex=1, PageCount=2)
buf = io.BytesIO()
doc.Save(buf, opts)
pdf_bytes = buf.getvalue()
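
Since PageIndex is zero-based while people usually think in one-based page numbers, the conversion is easy to get off by one. The hypothetical helper below turns an inclusive one-based range into the index and count pair used in the example above; it does not call the library itself.

```python
from typing import Tuple


def to_index_and_count(first_page: int, last_page: int) -> Tuple[int, int]:
    """Convert an inclusive 1-based page range to (zero-based index, page count)."""
    if first_page < 1:
        raise ValueError("first_page must be 1 or greater")
    if last_page < first_page:
        raise ValueError("last_page must not be smaller than first_page")
    return first_page - 1, last_page - first_page + 1


# Pages 2 and 3 map to the values used in the example above.
page_index, page_count = to_index_and_count(2, 3)
print(page_index, page_count)
```

The resulting values would then be passed as PdfSaveOptions(PageIndex=page_index, PageCount=page_count).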

Important: PDF export requires the optional ReportLab dependency. Install it with pip install "aspose-note[pdf]".


Detecting Layout Changes

Call DetectLayoutChanges after modifying the document to recalculate layout metrics before saving:

from aspose.note import Document, Page, SaveFormat

doc = Document()
page = Page()
doc.AppendChildLast(page)

# ... add content to page ...

doc.DetectLayoutChanges()
doc.Save("output.pdf", SaveFormat.Pdf)

Tips and Best Practices

  • Load once, query many times: Document loads the entire .one file into memory on construction. Hold one Document instance and call GetChildNodes repeatedly rather than re-loading the file for each query.
  • Check IsConflictPage: OneNote sync can produce conflict pages. Filter with [p for p in doc if not p.IsConflictPage] before processing content.
  • Use LoadHistory=False (default) for speed: Page history adds significant memory and parse overhead. Only set LoadHistory=True when you explicitly need revision data.
  • Close streams promptly: When loading from a file object, open it in a with block so the handle is released as soon as the parse completes. Document has already read the whole file into memory by then.
  • Call DetectLayoutChanges before Save: When constructing a document programmatically and appending nodes, call DetectLayoutChanges() to ensure layout coordinates are consistent before exporting to PDF.

Common Issues

Issue                                   Cause                                                       Fix
--------------------------------------  ----------------------------------------------------------  ---
IncorrectPasswordException on load      File is encrypted; encrypted documents are not supported    Obtain an unencrypted copy of the file
FileCorruptedException on load          File is truncated or has an unsupported internal format     Verify the file is a valid .one section (not a .onetoc2 table-of-contents)
UnsupportedSaveFormatException on save  Attempted to save to a format other than PDF                Only PDF format is supported for export
Empty GetPageHistory result             LoadOptions.LoadHistory was not set to True before loading  Re-load the document with opts.LoadHistory = True
ModuleNotFoundError: reportlab          PDF export attempted without the PDF extra                  Run pip install "aspose-note[pdf]"
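
The missing-dependency case in the last row can be detected before any export is attempted. The sketch below uses the standard library's importlib.util.find_spec and assumes the PDF extra installs a module importable as reportlab, which is what the error message suggests. pdf_export_available is a hypothetical helper name.

```python
import importlib.util


def pdf_export_available() -> bool:
    """Return True if the reportlab package can be imported in this environment."""
    return importlib.util.find_spec("reportlab") is not None


if pdf_export_available():
    print("PDF export dependency found")
else:
    print('PDF export dependency missing; run: pip install "aspose-note[pdf]"')
```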

FAQ

What OneNote formats can Aspose.Note read?

All three OneNote section formats: OneNote2007, OneNote2010, and OneNoteOnline. Use doc.FileFormat to identify which variant was loaded.

Can I write back to a .one file after reading it?

No. Aspose.Note FOSS for Python is a read-and-export library: it reads .one files and writes PDF output through Document.Save. Writing the OneNote binary format is not supported in this FOSS edition.

How do I read a specific page by title?

Iterate pages with a generator expression and compare the title text property to your target string:

from aspose.note import Document

doc = Document("MyNotes.one")
target = next(
    (p for p in doc
     if p.Title and p.Title.TitleText and p.Title.TitleText.Text == "My Page"),
    None,
)

How do I count sub-pages separately from top-level pages?

Use page.Level: level 1 is top-level, level 2 is a sub-page, and so on.

from aspose.note import Document

doc = Document("MyNotes.one")
top_level = [p for p in doc if (p.Level or 1) == 1]
sub_pages  = [p for p in doc if (p.Level or 1) > 1]
print(f"Top-level: {len(top_level)}, Sub-pages: {len(sub_pages)}")
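
For a breakdown by every level rather than just top-level versus sub-pages, collections.Counter works well. A minimal sketch: pages_per_level is a hypothetical helper, and the SimpleNamespace objects are stand-ins that only carry a Level attribute so the example runs without the library.

```python
from collections import Counter
from types import SimpleNamespace


def pages_per_level(pages) -> Counter:
    """Count pages by indent level, treating a missing Level as top-level (1)."""
    return Counter((page.Level or 1) for page in pages)


# Stand-in pages; in real use, pass the Document (or a list of its pages).
sample_pages = [SimpleNamespace(Level=lv) for lv in (1, 2, 2, None, 3)]
counts = pages_per_level(sample_pages)
for level in sorted(counts):
    print(f"Level {level}: {counts[level]}")
```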

Does Aspose.Note support .onetoc2 table-of-contents files?

No. Only .one section files are supported. Loading a .onetoc2 file raises FileCorruptedException.


API Reference Summary

Class / Method                            Description
----------------------------------------  -----------
Document(path_or_stream, options?)        Load a .one file or create an empty document
Document.FileFormat                       Detected format variant (FileFormat enum)
Document.GetChildNodes(NodeType)          Retrieve all descendant nodes of a given type
Document.GetPageHistory(page)             Get revision history for a page (requires LoadHistory=True)
Document.Save(target, format_or_options)  Export to PDF by file path or stream
Document.DetectLayoutChanges()            Recalculate layout metrics after programmatic edits
Document.Accept(visitor)                  Walk the DOM using a DocumentVisitor subclass
LoadOptions.LoadHistory                   Set to True to include page revisions in the DOM
LoadOptions.DocumentPassword              Reserved; encrypted files raise IncorrectPasswordException
FileFormat                                Enum: OneNote2010, OneNoteOnline, OneNote2007, Unknown
SaveFormat.Pdf                            The only supported export format; raises ValueError for other formats
PageHistory.Count                         Number of page revisions available
Page.Level                                Sub-page depth (1 = top-level)
Page.IsConflictPage                       True for sync conflict copies

See Also