When I add PDF metadata using Python, the correct info is shown whether I print it in the IDE, open the file in Adobe Reader, or read it with a script run from Directory Opus 13. Only Directory Opus 13 itself shows garbled text. There seems to be something wrong with the encoding. Could you please help me check what is wrong? Thanks!
Which version of Opus are you using?
How do we reproduce the problem you're seeing?
Can you zip and attach an example file with the issue?
I tested recent versions, including 13.6.8. The problem is still there.
Here's my Python code; just change the file path and you can see the problem.
# -*- coding: utf-8 -*-
import io
import sys

import PyPDF2
from PyPDF2.generic import NameObject, TextStringObject


def get_pdf_metadata(file_path):
    """Return the PDF's document information (None if the file has none)."""
    with open(file_path, 'rb') as file:
        reader = PyPDF2.PdfReader(file)
        metadata = reader.metadata
    return metadata


def add_category_to_pdf(file_path, category):
    """Rewrite the PDF in place with /Subject set to the given category."""
    # Start from the existing metadata (if any) and set /Subject.
    metadata = dict(get_pdf_metadata(file_path) or {})
    metadata[NameObject('/Subject')] = TextStringObject(category)

    # Read the whole file into memory first, so the reader does not depend
    # on the file that is overwritten below.
    with open(file_path, 'rb') as file:
        reader = PyPDF2.PdfReader(io.BytesIO(file.read()))

    writer = PyPDF2.PdfWriter()
    for page in reader.pages:
        writer.add_page(page)
    writer.add_metadata(metadata)

    with open(file_path, "wb") as output_file:
        writer.write(output_file)


file_path = r'pdf address.pdf'  # change this to the path of your PDF
category = "测试"
add_category_to_pdf(file_path, category)
print(f"Added the category info: {category}")
category_bytes = category.encode('utf-8')
print(f"Category in bytes: {category_bytes}")
print(f"The encoding of the category string is: {sys.getdefaultencoding()}")
Thank you!
Could you zip and attach a PDF file with metadata that isn't displayed correctly by Opus?
new PDF Document.zip (1.1 KB)
Here are the code and a sample PDF whose metadata is displayed as garbled text.
If my code were wrong, the PDF's properties would not display correctly, but they do. Only Directory Opus shows the wrong metadata. By the way, after the metadata is added by the code, every PDF reader I tried reads it correctly, except Directory Opus.
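One way to compare what the readers display against what is actually stored is to look at the raw bytes of the /Subject entry in the file. The following is only a minimal sketch using the standard library. The function name `find_raw_subject` and the sample bytes are made up for illustration, and it assumes the Info dictionary is stored uncompressed and unencrypted, with /Subject written as a bracketed literal string that has no unescaped nested brackets.

```python
import re
from typing import Optional


def find_raw_subject(pdf_bytes: bytes) -> Optional[bytes]:
    """Return the undecoded bytes between the brackets of the first /Subject (...) entry.

    Assumption: the entry is a literal string in an uncompressed, unencrypted
    dictionary, without unescaped nested brackets inside it.
    """
    match = re.search(rb"/Subject\s*\(((?:\\.|[^\\()])*)\)", pdf_bytes, re.DOTALL)
    return match.group(1) if match else None


# Hand-written stand-in for open(path, "rb").read(); not taken from a real file.
sample = b"<< /Producer (PyPDF2) /Subject (\xfe\xffmK\\213\xd5) >>"

raw_subject = find_raw_subject(sample)
print(raw_subject)           # the bytes exactly as stored, escapes included
print(raw_subject.hex(" "))  # the same bytes as hex, one pair per byte
```

For a real file, replace `sample` with the bytes read from the PDF. If nothing is found, the metadata is probably stored in a compressed or encrypted form that this simple search cannot see.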
I think your encoding is invalid.
Your Subject field looks like this:
Within the brackets:
Two bytes, UTF-16 BOM: 0xfe 0xff
Two bytes, character 测 encoded as UTF-16: 0x6d 0x4b
But after that comes the string \213Õ encoded as ASCII, not UTF-16.
Even if the intention was to use the string \213 to indicate the second character by its octal value, you've put ASCII characters into a UTF-16 field, next to UTF-16 characters.
You seem to be mixing two encoding types within the same field, which doesn't make sense to me (unless the PDF format allows that somehow).
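For comparison, here is a minimal sketch of how those bytes decode if the \213 escape is resolved to a single byte before any UTF-16 decoding is applied. PDF literal strings do define \ddd as an octal escape for one byte, so a parser working in that order would see 0x8b 0xd5 after the first character, which is 试 in UTF-16. The function names are made up for illustration, only octal escapes are handled (not \n, \( and so on), and the input is the byte sequence described above, not bytes read from the attached file.

```python
import re


def unescape_pdf_octal(raw: bytes) -> bytes:
    """Replace each \\ddd octal escape (1 to 3 octal digits) with the byte it denotes.

    Sketch only: other escapes defined for PDF literal strings are left untouched.
    """
    return re.sub(
        rb"\\([0-7]{1,3})",
        lambda m: bytes([int(m.group(1), 8) & 0xFF]),
        raw,
    )


def decode_pdf_text_string(raw: bytes) -> str:
    """Resolve octal escapes first, then decode as UTF-16BE if a BOM is present."""
    data = unescape_pdf_octal(raw)
    if data.startswith(b"\xfe\xff"):
        return data[2:].decode("utf-16-be")
    # Assumption: approximate the non-UTF-16 case with Latin-1 for this sketch.
    return data.decode("latin-1")


# Bytes between the brackets as described above: BOM, "mK", the text \213, then 0xd5.
subject_raw = b"\xfe\xffmK\\213\xd5"

subject_bytes = unescape_pdf_octal(subject_raw)
subject_text = decode_pdf_text_string(subject_raw)
print(subject_bytes.hex(" "))
print(subject_text)
```

Read this way, the field holds a BOM followed by two UTF-16 code units, and the mix of raw bytes and an escape sequence is just how the bytes were written into the file.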