Hi,
I think this may be a Prince issue, but I don't have the low-level skills to prove it. (I'm using Prince 12.5.1.)
I have an HTML file with non-Latin characters (e.g. ☙ – ❧) in the title element of the head. After running the file through Prince, they show up in the Title field of the PDF metadata, but they seem to be in the wrong encoding: after processing the PDF with Ghostscript [1], the title is mangled.
Some web searching suggests that the title needs to be encoded as either PDFDocEncoding or UTF-16BE with a byte order mark (page 158 of the PDF Reference, version 1.7). [2]
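To make the encoding point concrete, here is a minimal Python sketch of what I understand the UTF-16BE-with-BOM form to look like for a title such as the one above. The function names are my own, the title is written with escapes on the assumption that the separators are plain spaces, and the Latin-1 fallback is only a rough stand-in for PDFDocEncoding (the two differ for some byte values). It is not what Prince or Ghostscript actually do internally.

```python
import codecs

def encode_pdf_text_string(text: str) -> bytes:
    """Encode text as UTF-16BE preceded by the byte order mark FE FF."""
    return codecs.BOM_UTF16_BE + text.encode("utf-16-be")

def decode_pdf_text_string(raw: bytes) -> str:
    """Decode bytes the way a reader following the rule above might."""
    if raw.startswith(codecs.BOM_UTF16_BE):
        # BOM present: the rest is UTF-16BE.
        return raw[len(codecs.BOM_UTF16_BE):].decode("utf-16-be")
    # No BOM: treated as PDFDocEncoding. Latin-1 is used here only as an
    # approximation (assumption), since Python has no PDFDocEncoding codec.
    return raw.decode("latin-1")

# U+2619, space, U+2013 (en dash), space, U+2767
title = "\u2619 \u2013 \u2767"

encoded = encode_pdf_text_string(title)
print(encoded.hex())  # feff26190020201300202767

# Round trip works when the BOM is present.
print(decode_pdf_text_string(encoded) == title)  # True

# The same title stored as raw UTF-8 (no BOM) does not survive.
misread = decode_pdf_text_string(title.encode("utf-8"))
print(misread == title)  # False
```

If the title bytes in test.pdf do not start with FE FF, that might explain why a tool that follows the rule strictly shows something different, though I haven't confirmed that this is what happens here.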
Files from a reduced test case are attached.
optimised.pdf was generated from test.pdf with the command:
gs -sDEVICE=pdfwrite -dNOPAUSE -dQUIET -dBATCH -sOutputFile=optimised.pdf test.pdf
Any ideas?
Cheers,
Dave.
[1] Using gs to linearize, per https://www.princexml.com/doc/prince-output/#pdf-compression
[2] https://stackoverflow.com/questions/9188189/wrong-encode-when-update-pdf-meta-data-using-ghostscript-and-pdfmark