Images and Attached Files - Aspose.Note FOSS for Python

Aspose.Note FOSS for Python gives direct access to both the embedded images and the attached files stored inside OneNote .one section files. This content is exposed through the Image and AttachedFile node types.


Extracting Images

Every image in a OneNote document is represented by an Image node. Use GetChildNodes(Image) to retrieve all images in the document recursively:

from aspose.note import Document, Image

doc = Document("MyNotes.one")
for i, img in enumerate(doc.GetChildNodes(Image), start=1):
    filename = img.FileName or f"image_{i}.bin"
    with open(filename, "wb") as f:
        f.write(img.Bytes)
    print(f"Saved: {filename}  ({img.Width} x {img.Height} pts)")

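Image Properties

Property                    Type           Description
FileName                    str | None     Original image file name; None when the source has no name
Bytes                       bytes          Raw image data (same format as the original, e.g. PNG, JPEG)
Width                       float | None   Rendered width
Height                      float | None   Rendered height
AlternativeTextTitle        str | None     Alt text title
AlternativeTextDescription  str | None     Alt text description
HyperlinkUrl                str | None     URL the image links to
Tags                        list[NoteTag]  OneNote tags attached to this image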

Saving Images with Safe File Names

FileName may be None or may contain characters that are unsafe for the file system, so sanitize it before saving:

import re
from aspose.note import Document, Image

def safe_name(name: str | None, fallback: str) -> str:
    if not name:
        return fallback
    return re.sub(r'[<>:"/\\|?*]', "_", name)

doc = Document("MyNotes.one")
for i, img in enumerate(doc.GetChildNodes(Image), start=1):
    name = safe_name(img.FileName, f"image_{i}.bin")
    with open(name, "wb") as f:
        f.write(img.Bytes)
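
Sanitizing does not prevent two images from sharing a name, in which case the later file would silently overwrite the earlier one. Here is a minimal sketch of one way to avoid that, using only the standard library. unique_path is a hypothetical helper, and the _2, _3 suffix scheme is an assumption of this example, not something Aspose.Note prescribes.

```python
from pathlib import Path


def unique_path(directory: Path, name: str) -> Path:
    """Return a path in `directory` that does not exist yet.

    Hypothetical helper: if `name` is taken, append _2, _3, ... before the
    extension until a free name is found.
    """
    candidate = directory / name
    stem, suffix = candidate.stem, candidate.suffix
    counter = 2
    while candidate.exists():
        candidate = directory / f"{stem}_{counter}{suffix}"
        counter += 1
    return candidate
```

In the loop above, open(unique_path(Path("."), name), "wb") would then take the place of open(name, "wb").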

Extracting Images by Page

from aspose.note import Document, Page, Image

doc = Document("MyNotes.one")
for page_num, page in enumerate(doc.GetChildNodes(Page), start=1):
    images = page.GetChildNodes(Image)
    print(f"Page {page_num}: {len(images)} image(s)")
    for i, img in enumerate(images, start=1):
        filename = f"page{page_num}_img{i}.bin"
        with open(filename, "wb") as f:
            f.write(img.Bytes)

Inspecting Image Alt Text and Hyperlinks

from aspose.note import Document, Image

doc = Document("MyNotes.one")
for img in doc.GetChildNodes(Image):
    if img.AlternativeTextTitle:
        print("Alt title:", img.AlternativeTextTitle)
    if img.AlternativeTextDescription:
        print("Alt desc:", img.AlternativeTextDescription)
    if img.HyperlinkUrl:
        print("Linked to:", img.HyperlinkUrl)
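
The same properties can be gathered into a manifest that travels with the extracted files. This is a hedged sketch: image_metadata is a hypothetical helper that reads the property names used above via getattr, so it can be exercised here with a stand-in object instead of a real Image node.

```python
import json
from types import SimpleNamespace

FIELDS = ("FileName", "AlternativeTextTitle",
          "AlternativeTextDescription", "HyperlinkUrl")


def image_metadata(img) -> dict:
    """Collect the optional text properties of an image-like object.

    Missing, None, or empty values are dropped so the manifest stays compact.
    """
    values = {name: getattr(img, name, None) for name in FIELDS}
    return {name: value for name, value in values.items() if value}


# Stand-in for an Image node, used only to show the call.
sample = SimpleNamespace(FileName="logo.png", AlternativeTextTitle="Logo",
                         AlternativeTextDescription=None,
                         HyperlinkUrl="https://example.com")
print(json.dumps(image_metadata(sample), indent=2))
```

With a real document, [image_metadata(img) for img in doc.GetChildNodes(Image)] would yield one entry per image.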

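Detecting the Image File Format from Bytes

Python's imghdr module (Python ≤ 3.12; it was removed in Python 3.13) or the struct module can identify an image format from its leading bytes. The example below compares those bytes against known signatures directly: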

from aspose.note import Document, Image

doc = Document("MyNotes.one")
for i, img in enumerate(doc.GetChildNodes(Image), start=1):
    b = img.Bytes
    # Compare the leading bytes against well-known file signatures.
    if b[:4] == b'\x89PNG':
        fmt = "png"
    elif b[:2] == b'\xff\xd8':
        fmt = "jpeg"
    elif b[:4] == b'GIF8':
        fmt = "gif"
    elif b[:2] == b'BM':
        fmt = "bmp"
    else:
        fmt = "bin"
    # Number unnamed images so they do not overwrite one another.
    name = (img.FileName or f"image_{i}.{fmt}").rsplit(".", 1)[0] + f".{fmt}"
    with open(name, "wb") as f:
        f.write(b)
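
Because imghdr is unavailable from Python 3.13 onward, the signature checks can also be kept in a small table-driven function. This is a minimal sketch; sniff_image_format is a hypothetical name, and the signature list covers only a few common formats.

```python
# (prefix, format) pairs for a few common image signatures.
SIGNATURES = (
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"BM", "bmp"),
    (b"II*\x00", "tiff"),   # little-endian TIFF
    (b"MM\x00*", "tiff"),   # big-endian TIFF
)


def sniff_image_format(data: bytes, default: str = "bin") -> str:
    """Guess an extension from the leading bytes; fall back to `default`."""
    for prefix, fmt in SIGNATURES:
        if data.startswith(prefix):
            return fmt
    # WebP is a RIFF container with "WEBP" at byte offset 8.
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    return default
```

Calling sniff_image_format(img.Bytes) would then replace the if/elif chain, and new formats can be added by extending the table.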

Extracting Attached Files

Embedded file attachments are stored as AttachedFile nodes:

from aspose.note import Document, AttachedFile

doc = Document("NotesWithAttachments.one")
for i, af in enumerate(doc.GetChildNodes(AttachedFile), start=1):
    filename = af.FileName or f"attachment_{i}.bin"
    with open(filename, "wb") as f:
        f.write(af.Bytes)
    print(f"Saved attachment: {filename}  ({len(af.Bytes):,} bytes)")

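AttachedFile Properties

Property  Type           Description
FileName  str | None     Original file name; None when the source has no name
Bytes     bytes          Raw file content
Tags      list[NoteTag]  OneNote tags attached to this node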

Inspecting Tags on Images and Attachments

Both Image and AttachedFile nodes support OneNote tags:

from aspose.note import Document, Image, AttachedFile

doc = Document("MyNotes.one")

for img in doc.GetChildNodes(Image):
    for tag in img.Tags:
        print(f"Image tag: {tag.Label}  completedTime={tag.CompletedTime}")

for af in doc.GetChildNodes(AttachedFile):
    for tag in af.Tags:
        print(f"Attachment tag: {tag.Label}  completedTime={tag.CompletedTime}")

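Summary: Image vs. AttachedFile

Feature                           Image                      AttachedFile
FileName                          Yes (original image name)  Yes (original file name)
Bytes                             Raw image bytes            Raw file bytes
Width / Height                    Yes (rendered dimensions)  Not listed
AlternativeTextTitle/Description  Yes                        Not listed
HyperlinkUrl                      Yes                        Not listed
Tags                              Yes                        Yes
Replace(node) method              Yes                        Not listed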

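Tips

  • Always guard against a None file name. img.FileName is None when the file has no name in the source document.
  • img.Bytes is never None and always holds the raw binary content; for placeholder images it may be zero bytes long.
  • Use Page.GetChildNodes(Image) instead of Document.GetChildNodes(Image) to limit extraction to a single page.
  • The Image.Replace(image) method replaces the content of one image node with another in memory; saving back to .one is not supported.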
