Developer Guide

Aspose.Note FOSS for Python is a free, open-source library for reading Microsoft OneNote .one section files without any dependency on Microsoft Office. It provides a clean public API under the aspose.note package, modeled on the Aspose.Note for .NET interface. The library is suited to document automation, content indexing, data extraction pipelines, and archival workflows.

This developer guide covers the complete public API available in version 26.3.1, with runnable code examples for each major feature.

Loading a document

Load a .one file from a file path or a binary stream. The Document class is the entry point for all operations.

Load from a file path

from aspose.note import Document

doc = Document("MyNotes.one")

Load from a binary stream

Useful when reading from cloud storage, HTTP responses, or in-memory buffers:

from pathlib import Path
from aspose.note import Document

with Path("MyNotes.one").open("rb") as f:
    doc = Document(f)
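
When the section's bytes are already in memory (for example, an HTTP response body), wrapping them in io.BytesIO gives a binary stream of the kind described above. The sketch below assumes Document accepts any seekable binary file-like object, which the guide implies but does not state outright. The helper name load_from_bytes and its document_cls parameter are hypothetical; the class is passed in so the helper carries no import-time dependency on the library.

```python
import io
from typing import Any, Callable


def load_from_bytes(data: bytes, document_cls: Callable[[io.BytesIO], Any]) -> Any:
    """Wrap raw bytes in a seekable binary stream and pass it to document_cls.

    document_cls is expected to be aspose.note.Document (an assumption of
    this sketch); any callable that takes a binary stream works.
    """
    if not data:
        raise ValueError("no data to load")
    return document_cls(io.BytesIO(data))
```

With the library installed, a call would look like load_from_bytes(payload, Document), where payload holds the bytes of a .one file.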

Load options

Use LoadOptions to set optional parameters at load time:

from aspose.note import Document, LoadOptions

opts = LoadOptions()
opts.LoadHistory = True   # Include page history in the DOM
doc = Document("MyNotes.one", opts)

Note: DocumentPassword exists on LoadOptions for API compatibility, but encrypted documents are not supported. Attempting to load an encrypted file raises IncorrectPasswordException.
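
In a batch job, one encrypted section should not abort the whole run. A minimal sketch of skipping such files follows. load_all is a hypothetical helper, and the loader and exception type are passed as arguments because this guide does not say which module exports IncorrectPasswordException.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple, Type


def load_all(
    paths: Iterable[str],
    loader: Callable[[str], Any],
    skip_on: Type[BaseException],
) -> Tuple[Dict[str, Any], List[str]]:
    """Load each path with loader, collecting the paths that raise skip_on.

    Returns (loaded, skipped): loaded maps path -> document, skipped lists
    the paths whose load raised skip_on. Any other exception propagates.
    """
    loaded: Dict[str, Any] = {}
    skipped: List[str] = []
    for path in paths:
        try:
            loaded[path] = loader(path)
        except skip_on:
            skipped.append(path)
    return loaded, skipped
```

A real call would pass Document as loader and IncorrectPasswordException as skip_on, once the import location of the exception is confirmed.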

Document structure (DOM)

The OneNote document model is a tree:

Document
  └── Page (0..n)
        ├── Title
        │     ├── TitleText (RichText)
        │     ├── TitleDate (RichText)
        │     └── TitleTime (RichText)
        └── Outline (0..n)
              └── OutlineElement (0..n)
                    ├── RichText
                    ├── Image
                    ├── Table
                    │     └── TableRow
                    │           └── TableCell
                    │                 └── RichText / Image
                    └── AttachedFile

Each node exposes ParentNode and a Document property that leads back to the root. Composite nodes support child iteration, FirstChild, LastChild, AppendChildLast, InsertChild, RemoveChild, and GetChildNodes(Type).
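
Child iteration is enough to print the shape of a loaded document. The sketch below walks any node recursively and returns one indented line per node. dump_tree is a hypothetical helper, and it assumes that leaf nodes are simply not iterable, which this guide does not confirm.

```python
from typing import Any, List


def dump_tree(node: Any, depth: int = 0) -> List[str]:
    """Return one indented line per node, labelled with the node's class name."""
    lines = ["  " * depth + type(node).__name__]
    try:
        children = list(node)  # composite nodes support child iteration
    except TypeError:
        children = []          # assumption: leaf nodes are not iterable
    for child in children:
        lines.extend(dump_tree(child, depth + 1))
    return lines
```

Calling print("\n".join(dump_tree(doc))) on a loaded document should produce an outline resembling the tree diagram above.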

Iterating over pages

Pages are direct children of Document. Iterate over the document directly or use GetChildNodes:

from aspose.note import Document, Page

doc = Document("MyNotes.one")

for page in doc:
    title = page.Title.TitleText.Text if page.Title and page.Title.TitleText else "(untitled)"
    author = page.Author or "(unknown)"
    print(f"  {title}  [by {author}]")

Page metadata:

Property	Type
`Title`	`Title | None`
`Author`	`str | None`
`CreationTime`	`datetime | None`
`LastModifiedTime`	`datetime | None`
`Level`	`int | None`
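
Because every metadata property may be None, code that indexes or exports pages needs to guard each access. A minimal sketch of flattening a page into a plain dict follows. page_summary and its key names are hypothetical; only the property names come from the table above.

```python
from typing import Any, Dict


def page_summary(page: Any) -> Dict[str, Any]:
    """Flatten a page's metadata into a plain dict, tolerating None values."""
    title = page.Title
    title_text = title.TitleText.Text if title and title.TitleText else None
    created = page.CreationTime
    modified = page.LastModifiedTime
    return {
        "title": title_text or "(untitled)",
        "author": page.Author or "(unknown)",
        # datetime values are rendered as ISO 8601 strings for easy serialisation
        "created": created.isoformat() if created else None,
        "modified": modified.isoformat() if modified else None,
        "level": page.Level,
    }
```

A list such as [page_summary(p) for p in doc] can then be passed straight to json.dumps.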

Text extraction

Extract all plain text

from aspose.note import Document, RichText

doc = Document("MyNotes.one")
all_text = [rt.Text for rt in doc.GetChildNodes(RichText) if rt.Text]
print("\n".join(all_text))

Inspect formatting runs

Each RichText contains a list of TextRun segments. Each run has its own TextStyle:

from aspose.note import Document, RichText

doc = Document("FormattedNotes.one")
for rt in doc.GetChildNodes(RichText):
    for run in rt.TextRuns:
        style = run.Style
        flags = []
        if style.IsBold: flags.append("bold")
        if style.IsItalic: flags.append("italic")
        if style.IsHyperlink: flags.append(f"link={style.HyperlinkAddress}")
        print(f"{run.Text!r:40s} [{', '.join(flags)}]")
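
The same style flags can drive a simple conversion to Markdown. The sketch below is one possible mapping, not a feature of the library. runs_to_markdown is a hypothetical helper that relies only on the Text, Style, IsBold, IsItalic, IsHyperlink, and HyperlinkAddress names shown above.

```python
from typing import Any, Iterable


def runs_to_markdown(runs: Iterable[Any]) -> str:
    """Render text runs as Markdown using their bold, italic, and link flags."""
    parts = []
    for run in runs:
        text, style = run.Text, run.Style
        if not text:
            continue  # skip empty runs so no stray markers are emitted
        if style.IsBold:
            text = f"**{text}**"
        if style.IsItalic:
            text = f"*{text}*"
        if style.IsHyperlink and style.HyperlinkAddress:
            text = f"[{text}]({style.HyperlinkAddress})"
        parts.append(text)
    return "".join(parts)
```

Calling runs_to_markdown(rt.TextRuns) for each RichText yields one Markdown string per text block. The sketch does not escape Markdown special characters that occur in the source text.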

Extract hyperlinks

from aspose.note import Document, RichText

doc = Document("MyNotes.one")
for rt in doc.GetChildNodes(RichText):
    for run in rt.TextRuns:
        if run.Style.IsHyperlink and run.Style.HyperlinkAddress:
            print(run.Text, "->", run.Style.HyperlinkAddress)

Image extraction

from aspose.note import Document, Image

doc = Document("MyNotes.one")
for i, img in enumerate(doc.GetChildNodes(Image), start=1):
    name = img.FileName or f"image_{i}.bin"
    with open(name, "wb") as f:
        f.write(img.Bytes)
    print(f"Saved {name}  ({img.Width}x{img.Height})")

Image properties: FileName, Bytes, Width, Height, AlternativeTextTitle, AlternativeTextDescription, HyperlinkUrl, Tags.
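
FileName comes from the document itself, so it may repeat across images or, in principle, contain directory components. A hedged sketch of deriving a safe, unique output name before writing follows. safe_output_name and its parameters are hypothetical and not part of the library.

```python
from pathlib import PurePath
from typing import Optional, Set


def safe_output_name(file_name: Optional[str], index: int, used: Set[str],
                     fallback_stem: str = "image") -> str:
    """Return a bare file name that is not yet in used, and record it there."""
    # Treat backslashes as separators too, then keep only the last component.
    candidate = PurePath((file_name or "").replace("\\", "/")).name
    if candidate in ("", ".", ".."):
        candidate = f"{fallback_stem}_{index}.bin"
    path = PurePath(candidate)
    unique, n = candidate, 1
    while unique in used:
        # Append a counter before the extension until the name is free.
        unique = f"{path.stem}_{n}{path.suffix}"
        n += 1
    used.add(unique)
    return unique
```

In the loop above, name = safe_output_name(img.FileName, i, used) with used = set() created before the loop would replace the img.FileName or ... expression. Passing fallback_stem="attachment" gives the same treatment to attached files.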

Table parsing

from aspose.note import Document, Table, TableRow, TableCell, RichText

doc = Document("MyNotes.one")
for table in doc.GetChildNodes(Table):
    print("Column widths:", [col.Width for col in table.Columns])
    for r, row in enumerate(table.GetChildNodes(TableRow), start=1):
        cells = row.GetChildNodes(TableCell)
        row_text = [
            " ".join(rt.Text for rt in cell.GetChildNodes(RichText)).strip()
            for cell in cells
        ]
        print(f"Row {r}:", row_text)
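
Once each row is a list of cell strings, as row_text is above, the standard csv module can serialise the table. The sketch below is independent of the library. rows_to_csv is a hypothetical helper name.

```python
import csv
import io
from typing import Iterable, Sequence


def rows_to_csv(rows: Iterable[Sequence[str]]) -> str:
    """Serialise rows of cell text as CSV, quoting cells only where needed."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()
```

Collecting each row_text into a list inside the loop and passing that list to rows_to_csv gives one CSV string per table.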

Attached files

from aspose.note import Document, AttachedFile

doc = Document("NotesWithAttachments.one")
for i, af in enumerate(doc.GetChildNodes(AttachedFile), start=1):
    name = af.FileName or f"attachment_{i}.bin"
    with open(name, "wb") as f:
        f.write(af.Bytes)
    print(f"Saved: {name}")

Tags and numbered lists

Inspect NoteTag items

from aspose.note import Document, RichText, Image

doc = Document("TaggedNotes.one")
for rt in doc.GetChildNodes(RichText):
    for tag in rt.Tags:
        print(f"RichText tag: {tag.Label} icon={tag.Icon}")
for img in doc.GetChildNodes(Image):
    for tag in img.Tags:
        print(f"Image tag: {tag.Label}")
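
For a quick overview of how a notebook section is tagged, the labels can be tallied instead of printed one by one. A minimal sketch follows. count_tag_labels is a hypothetical helper, and the "(no label)" placeholder assumes Label may be empty or None, which this guide does not specify.

```python
from collections import Counter
from typing import Any, Iterable


def count_tag_labels(nodes: Iterable[Any]) -> Counter:
    """Count how often each tag label occurs across the given nodes."""
    counts: Counter = Counter()
    for node in nodes:
        for tag in node.Tags:
            # Assumption: a tag without a label is grouped under a placeholder.
            counts[tag.Label or "(no label)"] += 1
    return counts
```

Passing doc.GetChildNodes(RichText) as nodes and calling most_common() on the result lists the labels from most to least frequent.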

Inspect numbered lists

from aspose.note import Document, OutlineElement

doc = Document("NumberedNotes.one")
for oe in doc.GetChildNodes(OutlineElement):
    nl = oe.NumberList
    if nl:
        print(f"format={nl.Format!r}")

DocumentVisitor pattern

Use DocumentVisitor to implement a visitor that traverses the entire document tree:

from aspose.note import Document, DocumentVisitor, Page, RichText, Image

class ContentCounter(DocumentVisitor):
    def __init__(self):
        self.pages = 0
        self.texts = 0
        self.images = 0

    def VisitPageStart(self, page: Page) -> None:
        self.pages += 1

    def VisitRichTextStart(self, rt: RichText) -> None:
        self.texts += 1

    def VisitImageStart(self, img: Image) -> None:
        self.images += 1

doc = Document("MyNotes.one")
counter = ContentCounter()
doc.Accept(counter)
print(f"Pages: {counter.pages}, Texts: {counter.texts}, Images: {counter.images}")

PDF export

PDF export requires the optional ReportLab dependency. Install it with:

pip install "aspose-note[pdf]"

Basic PDF export

from aspose.note import Document, SaveFormat

doc = Document("MyNotes.one")
doc.Save("output.pdf", SaveFormat.Pdf)

PDF export with options

import io
from aspose.note import Document, SaveFormat
from aspose.note.saving import PdfSaveOptions

doc = Document("MyNotes.one")

# With save options
opts = PdfSaveOptions()
doc.Save("output.pdf", opts)

# Save to an in-memory stream
buf = io.BytesIO()
doc.Save(buf, PdfSaveOptions())
pdf_bytes = buf.getvalue()
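
Before uploading or caching the in-memory result, a cheap sanity check is to confirm that the buffer starts with the standard PDF header marker. The sketch below is independent of the library. looks_like_pdf is a hypothetical helper name.

```python
def looks_like_pdf(data: bytes) -> bool:
    """Return True if data begins with the PDF header marker '%PDF-'."""
    return data.startswith(b"%PDF-")
```

Checking looks_like_pdf(pdf_bytes) catches an empty or truncated buffer early. It does not validate the rest of the file.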

Note: the PdfSaveOptions.PageIndex and PageCount fields exist but are not passed to the PDF exporter in version 26.3.1. The entire document is always exported.

Current limitations

Area	Status
Reading `.one` files	Fully supported
PDF export (via ReportLab)	Supported
Writing to `.one`	Not implemented
Encrypted documents	Not supported (raises `IncorrectPasswordException`)
HTML / image / ONE save formats	Declared for API compatibility; raise `UnsupportedSaveFormatException`

Available guides

Features overview: complete feature list with evidence
Getting started: prerequisites, installation, and first steps
Installation: pip install and optional dependencies