Images and Attached Files - Aspose.Note FOSS for Python
Aspose.Note FOSS for Python gives direct access to both embedded images and attached files stored inside OneNote `.one` section files. This content is exposed through the `Image` and `AttachedFile` node types.
Extracting images
Every image in a OneNote document is represented by an `Image` node. Use `GetChildNodes(Image)` to recursively retrieve all images in the document:
from aspose.note import Document, Image

doc = Document("MyNotes.one")
for i, img in enumerate(doc.GetChildNodes(Image), start=1):
    # FileName may be None, so fall back to a generated name.
    filename = img.FileName or f"image_{i}.bin"
    with open(filename, "wb") as f:
        f.write(img.Bytes)
    print(f"Saved: {filename} ({img.Width} x {img.Height} pts)")

Image properties
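| Property | Type | Description |
|---|---|---|
| FileName | `str \| None` | Original image file name; `None` when the source document stores no name |
| Bytes | `bytes` | Raw image data in its original format (for example PNG or JPEG) |
| Width | `float \| None` | Rendered width in points |
| Height | `float \| None` | Rendered height in points |
| AlternativeTextTitle | `str \| None` | Alternative text title |
| AlternativeTextDescription | `str \| None` | Alternative text description |
| HyperlinkUrl | `str \| None` | URL the image links to |
| Tags | `list[NoteTag]` | OneNote tags attached to this image |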
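
The properties listed above can be gathered into a plain record for logging or indexing. The following is a minimal sketch, not part of the library: `describe_image` is a hypothetical helper that reads only the attributes named in the table and treats every optional one as possibly `None`. It is duck-typed, so the stand-in object in the usage lines takes the place of a real `Image` node.

```python
from types import SimpleNamespace


def describe_image(img) -> dict:
    """Collect the documented Image properties into a plain dict.

    Hypothetical helper: it reads only the attributes listed in the
    properties table and tolerates None for the optional ones.
    """
    width, height = img.Width, img.Height
    has_size = width is not None and height is not None
    return {
        "file_name": img.FileName,  # may be None
        "size_bytes": len(img.Bytes),
        # Report dimensions only when both values are present.
        "dimensions": (width, height) if has_size else None,
        "alt_title": img.AlternativeTextTitle,
        "alt_description": img.AlternativeTextDescription,
        "hyperlink": img.HyperlinkUrl,
        "tag_count": len(img.Tags),
    }


# Stand-in object for illustration; with the library you would pass
# each node returned by doc.GetChildNodes(Image) instead.
sample = SimpleNamespace(
    FileName=None,
    Bytes=b"\x89PNG\r\n\x1a\n",
    Width=120.0,
    Height=None,
    AlternativeTextTitle="Logo",
    AlternativeTextDescription=None,
    HyperlinkUrl=None,
    Tags=[],
)
info = describe_image(sample)
print(info)
```

Returning a dict keeps the record easy to serialize, for example with `json.dumps`, without holding on to the image bytes themselves.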
Saving images with safe file names
When `FileName` is `None` or contains characters that are unsafe for the file system, sanitize it before saving:
import re
from aspose.note import Document, Image

def safe_name(name: str | None, fallback: str) -> str:
    """Return a file-system-safe name, or the fallback if name is empty."""
    if not name:
        return fallback
    # Replace characters that are not allowed in Windows file names.
    return re.sub(r'[<>:"/\\|?*]', "_", name)

doc = Document("MyNotes.one")
for i, img in enumerate(doc.GetChildNodes(Image), start=1):
    name = safe_name(img.FileName, f"image_{i}.bin")
    with open(name, "wb") as f:
        f.write(img.Bytes)

Extracting images per page
from aspose.note import Document, Page, Image

doc = Document("MyNotes.one")
for page_num, page in enumerate(doc.GetChildNodes(Page), start=1):
    # Search only within the current page.
    images = page.GetChildNodes(Image)
    print(f"Page {page_num}: {len(images)} image(s)")
    for i, img in enumerate(images, start=1):
        filename = f"page{page_num}_img{i}.bin"
        with open(filename, "wb") as f:
            f.write(img.Bytes)

Checking image alt text and hyperlinks
from aspose.note import Document, Image

doc = Document("MyNotes.one")
for img in doc.GetChildNodes(Image):
    # Each of these properties may be None, so print only what is set.
    if img.AlternativeTextTitle:
        print("Alt title:", img.AlternativeTextTitle)
    if img.AlternativeTextDescription:
        print("Alt desc:", img.AlternativeTextDescription)
    if img.HyperlinkUrl:
        print("Linked to:", img.HyperlinkUrl)

Detecting the image file format from bytes
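Python's `imghdr` module (Python ≤ 3.12 only; it was removed in 3.13) or the `struct` module can detect an image format from its leading bytes. The example below needs neither and compares the leading bytes against known signatures directly: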
from aspose.note import Document, Image

doc = Document("MyNotes.one")
for i, img in enumerate(doc.GetChildNodes(Image), start=1):
    b = img.Bytes
    # Compare the leading bytes against known file signatures.
    if b[:4] == b'\x89PNG':
        fmt = "png"
    elif b[:2] == b'\xff\xd8':
        fmt = "jpeg"
    elif b[:4] == b'GIF8':
        fmt = "gif"
    elif b[:2] == b'BM':
        fmt = "bmp"
    else:
        fmt = "bin"
    # Number unnamed images so they do not overwrite each other.
    name = (img.FileName or f"image_{i}.{fmt}").rsplit(".", 1)[0] + f".{fmt}"
    with open(name, "wb") as f:
        f.write(b)

Extracting attached files
Embedded file attachments are stored as `AttachedFile` nodes:
from aspose.note import Document, AttachedFile

doc = Document("NotesWithAttachments.one")
for i, af in enumerate(doc.GetChildNodes(AttachedFile), start=1):
    # FileName may be None, so fall back to a generated name.
    filename = af.FileName or f"attachment_{i}.bin"
    with open(filename, "wb") as f:
        f.write(af.Bytes)
    print(f"Saved attachment: {filename} ({len(af.Bytes):,} bytes)")

AttachedFile properties
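| Property | Type | Description |
|---|---|---|
| FileName | `str \| None` | Original file name; `None` when the source document stores no name |
| Bytes | `bytes` | Raw file content |
| Tags | `list[NoteTag]` | OneNote tags attached to this node |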
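
Writing each attachment under its `FileName` overwrites earlier files when two attachments share a name. The sketch below shows one way to avoid that, assuming a numeric suffix is an acceptable naming scheme. `unique_path` and `save_attachments` are hypothetical helpers, not library functions, and the stand-in nodes expose only the `FileName` and `Bytes` attributes listed in the table.

```python
import tempfile
from pathlib import Path
from types import SimpleNamespace


def unique_path(directory: Path, name: str) -> Path:
    """Return a path in `directory` that does not exist yet.

    If `name` is already taken, append _1, _2, ... before the extension.
    """
    candidate = directory / name
    stem, suffix = candidate.stem, candidate.suffix
    counter = 1
    while candidate.exists():
        candidate = directory / f"{stem}_{counter}{suffix}"
        counter += 1
    return candidate


def save_attachments(nodes, out_dir: Path) -> list[Path]:
    """Write each node's Bytes into out_dir and return the paths written."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for i, node in enumerate(nodes, start=1):
        # Keep only the final path component of FileName, if there is one.
        raw = Path(node.FileName).name if node.FileName else ""
        name = raw or f"attachment_{i}.bin"
        target = unique_path(out_dir, name)
        target.write_bytes(node.Bytes)
        written.append(target)
    return written


# Stand-in nodes for illustration; with the library you would pass
# doc.GetChildNodes(AttachedFile) instead.
nodes = [
    SimpleNamespace(FileName="report.pdf", Bytes=b"first"),
    SimpleNamespace(FileName="report.pdf", Bytes=b"second"),
    SimpleNamespace(FileName=None, Bytes=b""),
]
with tempfile.TemporaryDirectory() as tmp:
    saved = [p.name for p in save_attachments(nodes, Path(tmp))]
print(saved)  # ['report.pdf', 'report_1.pdf', 'attachment_3.bin']
```

Writing into a dedicated output directory also keeps extracted files apart from the script's working directory.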
Checking tags on images and attachments
Both `Image` and `AttachedFile` nodes support OneNote tags:
from aspose.note import Document, Image, AttachedFile

doc = Document("MyNotes.one")
# Tags on images
for img in doc.GetChildNodes(Image):
    for tag in img.Tags:
        print(f"Image tag: {tag.Label} completedTime={tag.CompletedTime}")
# Tags on attached files
for af in doc.GetChildNodes(AttachedFile):
    for tag in af.Tags:
        print(f"Attachment tag: {tag.Label} completedTime={tag.CompletedTime}")

Summary: Image vs AttachedFile
| Feature | Image | AttachedFile |
|---|---|---|
| FileName | Yes (original image name) | Yes (original file name) |
| Bytes | Raw image bytes | Raw file bytes |
| Width / Height | Yes (rendered size) | No |
| AlternativeTextTitle / AlternativeTextDescription | Yes | No |
| HyperlinkUrl | Yes | No |
| Tags | Yes | Yes |
| Replace(node) method | Yes | No |
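Tips
- Always guard against `None` file names. `img.FileName` is `None` when the file has no name in the source document.
- `img.Bytes` is never `None` and always holds the raw binary content; for placeholder images it may be zero bytes long.
- Use `Page.GetChildNodes(Image)` instead of `Document.GetChildNodes(Image)` to limit extraction to a single page.
- The `Image.Replace(image)` method replaces the content of one image node with another in memory; saving back to `.one` is not supported.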
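
The guards against missing names and zero-byte payloads can be combined with signature-based format detection in one naming helper. This is a sketch under stated assumptions rather than library behavior: `sniff_extension` and `export_name` are hypothetical names, and the signature table covers only PNG, JPEG, GIF, and BMP.

```python
from pathlib import Path

# Leading byte signatures of common image formats.
SIGNATURES = (
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpg"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"BM", "bmp"),
)


def sniff_extension(data: bytes, default: str = "bin") -> str:
    """Guess a file extension from the leading bytes of `data`."""
    for magic, ext in SIGNATURES:
        if data.startswith(magic):
            return ext
    return default


def export_name(file_name, data: bytes, index: int):
    """Return a file name for the payload, or None if there is nothing to write."""
    if not data:
        return None  # zero-byte placeholder image: skip it
    # Fall back to a numbered stem when the node carries no name.
    stem = Path(file_name).stem if file_name else f"image_{index}"
    return f"{stem}.{sniff_extension(data)}"


# Byte strings below are illustrative headers, not complete image files.
name_png = export_name(None, b"\x89PNG\r\n\x1a\n" + b"\x00" * 8, 1)
name_jpg = export_name("scan.dat", b"\xff\xd8\xff\xe0", 2)
name_empty = export_name("blank.png", b"", 3)
print(name_png, name_jpg, name_empty)  # image_1.png scan.jpg None
```

In an extraction loop, a `None` result signals that the node can be skipped; otherwise the returned name can be passed through a sanitizer before opening the file.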