You cannot do this in Prince yet, although we hope to add a feature for this in the future. In the meantime it may be possible to achieve with a third-party PDF processing tool.
Hmmm... This may be a showstopper for us using Prince. With the volume of PDFs we send to print, we cannot settle for a half-and-half solution. We need a real solution.
Perhaps you could provide a bit more information about the metadata so that I can take a look. Is it specified as just a chunk of text that should be embedded as is, or does it have varying IDs, etc.?
Our specific need is what you see in my original post. There are two variables here: one is the invoiceMetadata and the other is the documentRef. Both are simple strings. The rest is static as far as I am concerned. I am no PDF expert, but I suspect that the length needs to be calculated, and I have no idea what the purpose of the "id" attribute is. But the whole chunk is a standard XMP metadata thingie, so the interwebs should have some info on that.
I've attached a PDF file that includes the XMP metadata packet. Does this work for you?
One issue we need to resolve is how to combine this metadata with any other XMP metadata required by specific PDF profiles like PDF/A or PDF/X. I don't think we can just blindly concatenate the two XML chunks, so we will either need to parse the file and merge them ourselves or embed them in two separate places.
As I suggested to Howcome in an email, one solution would be to imitate some other PDF libraries that let you build the XMP metadata programmatically.
Perhaps something like this?
var ns = PDF.registerXmpNamespace(namespacetag, namespaceuri);
ns.setValue(tag, value);
To build my example from the original post, the html would then contain something like this:
We are using a rich template engine with data binding, so it would probably be easiest to include this as data-bound elements in the template, which means that having this as JavaScript in the document would be the way to go. That way we don't have to send along the data model when we render the PDF.
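To make the proposal concrete, here is a rough sketch of how such a builder could assemble the XMP packet. Everything in it is hypothetical: createXmpBuilder, escapeXml, and toXml are made-up names for illustration, and registerXmpNamespace/setValue only mirror the suggestion above. None of this is an existing Prince API.

```javascript
// Hypothetical sketch of the proposed builder; not an existing Prince API.
function escapeXml(text) {
  // Escape the characters that are unsafe in XML text and attribute values.
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

function createXmpBuilder() {
  const namespaces = [];
  return {
    // Register a namespace and get back a handle for setting its values.
    registerXmpNamespace(prefix, uri) {
      const ns = { prefix, uri, values: [] };
      namespaces.push(ns);
      return {
        setValue(tag, value) {
          ns.values.push([tag, value]);
        },
      };
    },
    // Serialize one rdf:Description per registered namespace.
    toXml() {
      const descriptions = namespaces.map((ns) => {
        const props = ns.values
          .map(([tag, value]) =>
            `<${ns.prefix}:${tag}>${escapeXml(value)}</${ns.prefix}:${tag}>`)
          .join("");
        return `<rdf:Description rdf:about="" xmlns:${ns.prefix}="${escapeXml(ns.uri)}">${props}</rdf:Description>`;
      });
      return '<x:xmpmeta xmlns:x="adobe:ns:meta/">' +
        '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">' +
        descriptions.join("") +
        "</rdf:RDF></x:xmpmeta>";
    },
  };
}

// Usage with the two variable fields from the original post.
const builder = createXmpBuilder();
const sbank = builder.registerXmpNamespace("sbank", "http://schemas.somebank.no/2015/11/");
sbank.setValue("invoiceMetadata", "PAGE1;55555555;SomeBank;");
sbank.setValue("documentRef", "DREF-00123456789");
const xmpXml = builder.toXml();
```

With data binding, only the two setValue arguments would need to come from the template; the rest stays static.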
The latest build of Prince has a new --pdf-xmp option which can be used to include additional XMP metadata in the PDF. Currently it is taken from an external file, but it is also possible to specify it in JavaScript as a data URL string via the PDF.xmp property. Since encoding the XMP as a data URL isn't very convenient we plan to add a simpler interface for this in future.
It's an XMP file, so basically the <x:xmpmeta> element and its contents (the xpacket processing instructions are ignored as Prince generates those itself when it produces the PDF file).
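As a rough sketch of the data URL route mentioned above, the XMP string can be percent-encoded with encodeURIComponent and prefixed with a data: scheme. The helper name xmpToDataUrl and the "application/xml" media type are assumptions made for illustration; check which media type your Prince build expects.

```javascript
// Sketch: wrap an XMP string in a data URL suitable for PDF.xmp.
// xmpToDataUrl is a hypothetical helper; the media type is an assumption.
function xmpToDataUrl(xmp, mediaType = "application/xml") {
  return "data:" + mediaType + "," + encodeURIComponent(xmp);
}

const xmp =
  '<x:xmpmeta xmlns:x="adobe:ns:meta/">' +
  '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">' +
  '<rdf:Description rdf:about="" xmlns:sbank="http://schemas.somebank.no/2015/11/">' +
  "<sbank:documentRef>DREF-00123456789</sbank:documentRef>" +
  "</rdf:Description>" +
  "</rdf:RDF>" +
  "</x:xmpmeta>";

const xmpDataUrl = xmpToDataUrl(xmp);

// The PDF object is only defined when the script runs inside Prince,
// so guard the assignment to keep the snippet runnable elsewhere.
if (typeof PDF !== "undefined") {
  PDF.xmp = xmpDataUrl;
}
```

Note that PDF.xmp is assigned as a property here, as described above, rather than called as a function.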
It took a while before I got a chance to test this.
I am trying it now, and I am unable to get it to work. Can you please check whether there is anything obvious that I am missing? I have included the entire script tag from my HTML template here.
<script>
PDF.embedFonts(true);
PDF.subsetFonts(true);
//PDF.artificialFonts (boolean)
PDF.compress(false);
PDF.encrypt(false);
//PDF.userPassword, ownerPassword (string, can be null)
PDF.allowPrint(true);
PDF.allowModify(false);
PDF.allowCopy(true);
PDF.allowAnnotate(false);
//PDF.keyBits (40 | 128)
//PDF.script (string, can be null)
//PDF.openAction (eg. "print")
//PDF.pageLayout (single-page | one-column | two-column[-left/right])
//PDF.pageMode (auto | show-bookmarks | fullscreen | show-attachments)
//PDF.printScaling (auto | none)
//PDF.profile (string, can be null)
//PDF.outputIntent (URL string, can be null)
@{
var xmp = @"<x:xmpmeta xmlns:x=""adobe:ns:meta/"">
<rdf:RDF xmlns:rdf=""http://www.w3.org/1999/02/22-rdf-syntax-ns#"">
<rdf:Description rdf:about="""" xmlns:sbank=""http://schemas.somebank.no/2015/11/"">
<sbank:invoiceMetadata>PAGE1;55555555;SomeBank;Morten Middlename Lastname;444444;5555555;3333;N;;;1980-01-01;P;</sbank:invoiceMetadata>
<sbank:documentRef>DREF-00123456789</sbank:documentRef>
</rdf:Description>
<rdf:Description rdf:about="""" xmlns:xmp=""http://ns.adobe.com/xap/1.0/"" />
</rdf:RDF>
</x:xmpmeta>";
}
PDF.xmp = @xmp;
//PDF.xmp(@xmp);
</script>
Sorry, that is Razor syntax. Everything inside the @{ } is C# code, and @xmp simply reads the C# variable.
I tweaked the XML a little to remove line breaks, to make sure that wasn't the issue, but it still does not work. I have included the rendered HTML template here (script part only) so that you can see exactly what goes into Prince.
<script>
PDF.embedFonts(true);
PDF.subsetFonts(true);
//PDF.artificialFonts (boolean)
PDF.compress(false);
PDF.encrypt(false);
//PDF.userPassword, ownerPassword (string, can be null)
PDF.allowPrint(true);
PDF.allowModify(false);
PDF.allowCopy(true);
PDF.allowAnnotate(false);
//PDF.keyBits (40 | 128)
//PDF.script (string, can be null)
//PDF.openAction (eg. "print")
//PDF.pageLayout (single-page | one-column | two-column[-left/right])
//PDF.pageMode (auto | show-bookmarks | fullscreen | show-attachments)
//PDF.printScaling (auto | none)
//PDF.profile (string, can be null)
//PDF.outputIntent (URL string, can be null)
PDF.xmp('<x:xmpmeta xmlns:x="adobe:ns:meta/"><rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"><rdf:Description rdf:about="" xmlns:sbank="http://schemas.somebank.no/2015/11/"><sbank:invoiceMetadata>PAGE1;55555555;SomeBank;Morten Middlename Lastname;444444;5555555;3333;N;;;1980-01-01;P;</sbank:invoiceMetadata><sbank:documentRef>DREF-00123456789</sbank:documentRef></rdf:Description><rdf:Description rdf:about="" xmlns:xmp="http://ns.adobe.com/xap/1.0/" /></rdf:RDF></x:xmpmeta>');
</script>
Adding the --javascript option did the trick. Now for the $64,000 question: how do we enable JavaScript when we use Prince from C#?
Here is our code:
var events = new PdfGenerationEvents();
var prn = new Prince(@"Prince/bin/prince.exe", events);
lock (outputStream)
{
    prn.ConvertMemoryStream(html, outputStream);
}
I have a reply from our printer saying that they need the metadata to look exactly as it does in our current production setup, which includes line breaks in the correct places. But it looks like the \r\n characters are being stripped away somewhere, possibly by the encodeURIComponent function. Is there a way to send in metadata and keep the line breaks?
Oh now that's just nasty, this is RDF/XML and whitespace characters between the tags are semantically irrelevant and should be completely ignored!
How are they processing the metadata and why does it need linebreaks in specific places? This sounds like their software is not conforming to the XMP specification at all.
I agree, it sounds like they are doing some half-assed string parsing. But it's what I have to work with right now. If there is no way to keep the linebreaks in your solution as it is, I will work with them to see what they can do on their end.
Currently Prince parses the XML so that it can check correctness and merge in any additional metadata properties needed for the PDF profile (e.g. PDF/A or PDF/X, which require additional XMP).
Copying the text through verbatim is not compatible with this approach.
We could add newlines after the elements, but I would be very nervous about maintaining this in the future. What if a different printer requires \n and chokes on \r\n, or there is some other weirdness?
If everyone sticks to the spec it should guarantee compatibility with the widest range of vendors.
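For anyone debugging the same thing: encodeURIComponent percent-encodes CR and LF (as %0D%0A) rather than dropping them, so the line breaks are most likely lost when the XML is parsed and re-serialized, as described above. A quick check, using a made-up sample string:

```javascript
// Sample XML fragment with Windows-style line breaks between the tags.
const withBreaks = "<a>\r\n<b>1</b>\r\n</a>";

// Encode as it would be for a data URL, then decode again.
const encoded = encodeURIComponent(withBreaks);
const decoded = decodeURIComponent(encoded);

// The CR LF pairs are carried through as %0D%0A and restored on decoding.
const lineBreaksEncoded = encoded.includes("%0D%0A");
const lineBreaksSurvive = decoded === withBreaks;

console.log(lineBreaksEncoded, lineBreaksSurvive);
```

So the encoding step itself is lossless; any whitespace changes happen after the data URL has been decoded.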