Images and Attached Files — Aspose.Note FOSS for Python

Aspose.Note FOSS for Python gives direct access to both embedded images and attached files stored inside OneNote .one section files. This guide shows how to extract image bytes, inspect image metadata, and save attached file data to disk using the Image and AttachedFile node types.


Extracting Images

Every image in a OneNote document is represented by an Image node. Use GetChildNodes(Image) to retrieve all images from the document recursively:

from aspose.note import Document, Image

doc = Document("MyNotes.one")
for i, img in enumerate(doc.GetChildNodes(Image), start=1):
    filename = img.FileName or f"image_{i}.bin"
    with open(filename, "wb") as f:
        f.write(img.Bytes)
    print(f"Saved: {filename}  ({img.Width} x {img.Height} pts)")
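The loop above writes into the current working directory, so two images that share a FileName would overwrite each other. A minimal sketch of one way to avoid that, assuming a separate output folder is acceptable; `unique_path` is a hypothetical helper written for this guide, not part of Aspose.Note:

```python
from pathlib import Path


def unique_path(directory: Path, name: str) -> Path:
    """Return a path inside `directory` that does not exist yet.

    Hypothetical helper (not part of Aspose.Note): appends _1, _2, ...
    before the extension until the name is free.
    """
    directory.mkdir(parents=True, exist_ok=True)
    candidate = directory / name
    stem, suffix = candidate.stem, candidate.suffix
    counter = 1
    while candidate.exists():
        candidate = directory / f"{stem}_{counter}{suffix}"
        counter += 1
    return candidate


# Intended use inside the extraction loop (img is an Image node):
#     target = unique_path(Path("extracted"), img.FileName or f"image_{i}.bin")
#     target.write_bytes(img.Bytes)
```

The helper only resolves name collisions; names that contain path separators or other unsafe characters still need sanitizing first.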

Image Properties

| Property | Type | Description |
|---|---|---|
| FileName | str \| None | Original filename stored in the OneNote file (may be None) |
| Bytes | bytes | Raw image data (format matches the original, e.g. PNG, JPEG) |
| Width | float \| None | Rendered width in points |
| Height | float \| None | Rendered height in points |
| AlternativeTextTitle | str \| None | Alt text title for accessibility |
| AlternativeTextDescription | str \| None | Alt text description |
| HyperlinkUrl | str \| None | URL if the image is a clickable hyperlink |
| Tags | list[NoteTag] | OneNote tags attached to this image |
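Because Width and Height are given in points and may be None, code that logs or rescales them needs a guard. The sketch below assumes the usual definition of a point (1/72 inch) and a target resolution of 96 DPI; both function names are hypothetical and not part of the library:

```python
from typing import Optional


def points_to_pixels(points: Optional[float], dpi: float = 96.0) -> Optional[int]:
    """Convert a length in points (1/72 inch) to whole pixels at `dpi`."""
    if points is None:
        return None
    return round(points * dpi / 72.0)


def describe_size(width: Optional[float], height: Optional[float]) -> str:
    """Format a size for logging, tolerating missing dimensions."""
    if width is None or height is None:
        return "size unknown"
    return f"{width:g} x {height:g} pt"


# Intended use with an Image node:
#     print(describe_size(img.Width, img.Height))
#     px_width = points_to_pixels(img.Width)
```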

Save Images with Safe Filenames

When FileName is None or contains characters that are unsafe for the filesystem, sanitize the name before opening a file handle:

import re
from aspose.note import Document, Image

def safe_name(name: str | None, fallback: str) -> str:
    if not name:
        return fallback
    return re.sub(r'[<>:"/\\|?*]', "_", name)

doc = Document("MyNotes.one")
for i, img in enumerate(doc.GetChildNodes(Image), start=1):
    name = safe_name(img.FileName, f"image_{i}.bin")
    with open(name, "wb") as f:
        f.write(img.Bytes)
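The regular expression above covers the characters Windows rejects, but a stored name could also carry directory components, control characters, trailing dots, or a reserved device name such as CON. A stricter variant might look like the following sketch; `strict_safe_name` is a hypothetical helper, and unlike `safe_name` it drops directory components instead of replacing the separators:

```python
import re
from typing import Optional

# Device names that Windows reserves regardless of extension.
_RESERVED = {
    "CON", "PRN", "AUX", "NUL",
    *(f"COM{n}" for n in range(1, 10)),
    *(f"LPT{n}" for n in range(1, 10)),
}


def strict_safe_name(name: Optional[str], fallback: str) -> str:
    """Hypothetical stricter sanitizer (not part of Aspose.Note)."""
    if not name:
        return fallback
    # Keep only the last path component, whichever separator was used.
    base = re.split(r"[\\/]", name)[-1]
    # Replace characters invalid on Windows plus ASCII control characters.
    base = re.sub(r'[<>:"|?*\x00-\x1f]', "_", base)
    # Windows drops trailing dots and spaces, so remove them up front.
    base = base.rstrip(". ")
    if not base:
        return fallback
    # Prefix reserved device names so they become ordinary filenames.
    if base.split(".", 1)[0].upper() in _RESERVED:
        base = "_" + base
    return base
```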

Extract Images Per Page

Iterate pages with GetChildNodes(Page) and call GetChildNodes(Image) on each page to scope image extraction to individual pages:

from aspose.note import Document, Page, Image

doc = Document("MyNotes.one")
for page_num, page in enumerate(doc.GetChildNodes(Page), start=1):
    images = page.GetChildNodes(Image)
    print(f"Page {page_num}: {len(images)} image(s)")
    for i, img in enumerate(images, start=1):
        filename = f"page{page_num}_img{i}.bin"
        with open(filename, "wb") as f:
            f.write(img.Bytes)
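For larger notebooks it can be tidier to give each page its own folder. The sketch below is one possible layout, not a library feature: `export_images_by_page` is a hypothetical function that relies only on the GetChildNodes method and Bytes property shown above, and the folder and file naming scheme is an assumption.

```python
from pathlib import Path
from typing import Any, Dict, Iterable


def export_images_by_page(pages: Iterable[Any], image_type: Any, out_dir: Path) -> Dict[int, int]:
    """Write each page's images into out_dir/page_NNN/ and return image counts.

    `pages` is any iterable of objects exposing GetChildNodes(image_type).
    """
    counts: Dict[int, int] = {}
    for page_num, page in enumerate(pages, start=1):
        images = list(page.GetChildNodes(image_type))
        counts[page_num] = len(images)
        if not images:
            continue  # do not create empty folders
        page_dir = out_dir / f"page_{page_num:03d}"
        page_dir.mkdir(parents=True, exist_ok=True)
        for i, img in enumerate(images, start=1):
            (page_dir / f"img_{i:03d}.bin").write_bytes(img.Bytes)
    return counts


# Intended use with the library:
#     export_images_by_page(doc.GetChildNodes(Page), Image, Path("out"))
```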

Inspect Image Alt Text and Hyperlinks

Read the AlternativeTextTitle, AlternativeTextDescription, and HyperlinkUrl properties on each Image node to access accessibility metadata and link targets:

from aspose.note import Document, Image

doc = Document("MyNotes.one")
for img in doc.GetChildNodes(Image):
    if img.AlternativeTextTitle:
        print("Alt title:", img.AlternativeTextTitle)
    if img.AlternativeTextDescription:
        print("Alt desc:", img.AlternativeTextDescription)
    if img.HyperlinkUrl:
        print("Linked to:", img.HyperlinkUrl)
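When the goal is an accessibility audit rather than console output, the same properties can be collected into records and serialized. This is a sketch under the assumption that a JSON report is wanted; `image_metadata` and the record keys are hypothetical names chosen for illustration:

```python
import json
from typing import Any, Dict, Iterable, List


def image_metadata(images: Iterable[Any]) -> List[Dict[str, Any]]:
    """Collect alt text and link data from objects shaped like Image nodes."""
    records = []
    for index, img in enumerate(images, start=1):
        records.append({
            "index": index,
            "file_name": img.FileName,
            "alt_title": img.AlternativeTextTitle,
            "alt_description": img.AlternativeTextDescription,
            "hyperlink": img.HyperlinkUrl,
            # True when either alt text field is present and non-empty.
            "has_alt_text": bool(img.AlternativeTextTitle or img.AlternativeTextDescription),
        })
    return records


# Intended use with the library:
#     report = json.dumps(image_metadata(doc.GetChildNodes(Image)), indent=2)
```

Filtering the records on `has_alt_text` gives a quick list of images that still need a description.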

Detect Image File Format from Bytes

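Python’s imghdr module can identify an image format from its leading bytes, but it was deprecated in Python 3.11 and removed in Python 3.13. Comparing the first few bytes against known signatures works on every version and needs no extra imports: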

from aspose.note import Document, Image

doc = Document("MyNotes.one")
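for i, img in enumerate(doc.GetChildNodes(Image), start=1):
    b = img.Bytes
    # Compare the leading "magic" bytes against known signatures.
    if b[:4] == b'\x89PNG':
        fmt = "png"
    elif b[:2] == b'\xff\xd8':
        fmt = "jpeg"
    elif b[:4] == b'GIF8':
        fmt = "gif"
    elif b[:2] == b'BM':
        fmt = "bmp"
    else:
        fmt = "bin"
    # Number the fallback name so unnamed images do not overwrite each other.
    stem = (img.FileName or f"image_{i}").rsplit(".", 1)[0]
    name = f"{stem}.{fmt}"
    with open(name, "wb") as f:
        f.write(b)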
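The same check can be written as a lookup table, which makes it easier to add formats. The sketch below uses the full published signatures for PNG, JPEG, GIF, BMP, TIFF, and WebP; whether every one of these formats can occur inside a .one file is not something this guide confirms, and `sniff_format` is a hypothetical helper name:

```python
# (signature, format) pairs checked against the start of the data.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"BM", "bmp"),
    (b"II*\x00", "tiff"),  # little-endian TIFF
    (b"MM\x00*", "tiff"),  # big-endian TIFF
]


def sniff_format(data: bytes) -> str:
    """Return a short format name for `data`, or "bin" if unrecognized."""
    for magic, fmt in SIGNATURES:
        if data.startswith(magic):
            return fmt
    # WebP is a RIFF container: "RIFF", four size bytes, then "WEBP".
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    return "bin"


# Intended use with an Image node:
#     fmt = sniff_format(img.Bytes)
```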

Extracting Attached Files

Embedded file attachments are stored as AttachedFile nodes. Use GetChildNodes(AttachedFile) to retrieve all attachments from the document and write each file’s raw bytes to disk:

from aspose.note import Document, AttachedFile

doc = Document("NotesWithAttachments.one")
for i, af in enumerate(doc.GetChildNodes(AttachedFile), start=1):
    filename = af.FileName or f"attachment_{i}.bin"
    with open(filename, "wb") as f:
        f.write(af.Bytes)
    print(f"Saved attachment: {filename}  ({len(af.Bytes):,} bytes)")
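If the same file was attached more than once, the loop above writes it more than once. One way to skip repeats is to key attachments by a content hash. This sketch assumes (name, data) pairs have already been read from the nodes; `unique_by_content` is a hypothetical helper and not part of Aspose.Note:

```python
import hashlib
from typing import Dict, Iterable, Optional, Tuple


def unique_by_content(
    items: Iterable[Tuple[Optional[str], bytes]],
) -> Dict[str, Tuple[Optional[str], bytes]]:
    """Keep the first (name, data) pair seen for each distinct SHA-256 digest."""
    seen: Dict[str, Tuple[Optional[str], bytes]] = {}
    for name, data in items:
        digest = hashlib.sha256(data).hexdigest()
        seen.setdefault(digest, (name, data))
    return seen


# Intended use with the library:
#     pairs = ((af.FileName, af.Bytes) for af in doc.GetChildNodes(AttachedFile))
#     for digest, (name, data) in unique_by_content(pairs).items():
#         ...
```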

AttachedFile Properties

| Property | Type | Description |
|---|---|---|
| FileName | str \| None | Original filename stored in the OneNote file |
| Bytes | bytes | Raw file content |
| Tags | list[NoteTag] | OneNote tags attached to this node |
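The properties listed here include no content type, so when one is needed (for example, to serve an attachment over HTTP) it has to be inferred. A minimal sketch using the standard library's mimetypes module, which guesses from the filename extension only; `guess_content_type` is a hypothetical helper:

```python
import mimetypes
from typing import Optional


def guess_content_type(file_name: Optional[str]) -> str:
    """Guess a MIME type from the extension, defaulting to a generic binary type."""
    if not file_name:
        return "application/octet-stream"
    content_type, _encoding = mimetypes.guess_type(file_name)
    return content_type or "application/octet-stream"


# Intended use with an AttachedFile node:
#     content_type = guess_content_type(af.FileName)
```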

Inspect Tags on Images and Attachments

Both Image and AttachedFile nodes support OneNote tags. Iterate the .Tags list on each node to read the Label and completion timestamp for every tag applied in the source document:

from aspose.note import Document, Image, AttachedFile

doc = Document("MyNotes.one")

for img in doc.GetChildNodes(Image):
    for tag in img.Tags:
        print(f"Image tag: {tag.Label}  completedTime={tag.CompletedTime}")

for af in doc.GetChildNodes(AttachedFile):
    for tag in af.Tags:
        print(f"Attachment tag: {tag.Label}  completedTime={tag.CompletedTime}")
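To summarize tags instead of printing each one, the labels can be tallied with collections.Counter. The sketch relies only on the Tags and Label attributes shown above; `count_tag_labels` is a hypothetical helper, and the "(no label)" fallback is an assumption in case a label is empty or None:

```python
from collections import Counter
from typing import Any, Iterable


def count_tag_labels(nodes: Iterable[Any]) -> Counter:
    """Count tag labels across objects that expose a Tags list."""
    counts: Counter = Counter()
    for node in nodes:
        for tag in node.Tags:
            # Assumption: Label may be empty or None for some tags.
            counts[tag.Label or "(no label)"] += 1
    return counts


# Intended use with the library:
#     nodes = list(doc.GetChildNodes(Image)) + list(doc.GetChildNodes(AttachedFile))
#     for label, n in count_tag_labels(nodes).most_common():
#         print(label, n)
```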

Summary: Image vs AttachedFile

| Feature | Image | AttachedFile |
|---|---|---|
| FileName | Yes (original image name) | Yes (original file name) |
| Bytes | Raw image bytes | Raw file bytes |
| Width / Height | Yes (rendered dimensions) | No |
| AlternativeTextTitle / AlternativeTextDescription | Yes | No |
| HyperlinkUrl | Yes | No |
| Tags | Yes | Yes |
| Replace(node) method | Yes | No |

Tips

  • Always guard against None filenames. img.FileName is None when the file had no name in the source document.
  • img.Bytes is never None and is always the raw binary content; it can be zero bytes for placeholder images.
  • Use Page.GetChildNodes(Image) instead of Document.GetChildNodes(Image) to scope extraction to a single page.
  • The Image.Replace(image) method replaces the content of one image node with another in memory only; saving the result back to .one is not supported.
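Since Bytes can be empty for placeholder images, an export loop may want to skip those nodes so that no zero-length files end up on disk. A minimal sketch; `write_if_nonempty` is a hypothetical helper, not a library function:

```python
from pathlib import Path


def write_if_nonempty(path: Path, data: bytes) -> bool:
    """Write `data` to `path` unless it is empty; return True if a file was written."""
    if not data:
        return False
    path.write_bytes(data)
    return True


# Intended use with an Image node:
#     if not write_if_nonempty(Path(name), img.Bytes):
#         print("Skipped empty placeholder:", name)
```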
