I am using Prince 11 with Vranda font on Win 10.
I have an HTML page containing some Bengali text. It renders properly in Firefox 53.0 (32-bit) and Chrome 58.0.3029.81 (64-bit). When converted to a PDF using Prince, some of the characters do not display properly.
To display properly in the browsers, I have used ZWNJ (U+200C) characters to control the combining of certain Bengali characters, in particular U+09B0 (RA) and U+09AE (MA) when followed by U+09CD (virama), U+09C7 (vowel sign E), and U+09C1 (vowel sign U).
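For anyone trying to reproduce this, a minimal Python sketch along the following lines could list every ZWNJ in a piece of text together with the code points on either side of it, which makes it easier to compare the sequences with the ones discussed in the references. The function names and the sample string are hypothetical; the sample is not taken from BengaliTest.html.

```python
import unicodedata

ZWNJ = "\u200C"  # ZERO WIDTH NON-JOINER


def describe(ch):
    """Format one character as 'U+XXXX NAME'."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"


def find_zwnj_contexts(text):
    """Return (index, previous, next) for every ZWNJ found in text.

    previous/next are None when the ZWNJ is at the start/end of the text.
    """
    results = []
    for i, ch in enumerate(text):
        if ch == ZWNJ:
            prev_ch = describe(text[i - 1]) if i > 0 else None
            next_ch = describe(text[i + 1]) if i + 1 < len(text) else None
            results.append((i, prev_ch, next_ch))
    return results


# Hypothetical sample: RA, ZWNJ, vowel sign U (not copied from the attached page).
sample = "\u09B0\u200C\u09C1"
for index, before, after in find_zwnj_contexts(sample):
    print(index, before, after)
```

Running the same function over the text of the HTML page would show exactly which letter and sign pairs are separated by a ZWNJ, and therefore which sequences Prince receives.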
I am following the guidelines (as best I understand them) found in:
1. http://unicode.org/review/pr-9.pdf
2. https://en.wikipedia.org/wiki/Zero-width_non-joiner
3. http://www.unicode.org/L2/L2006/06053-zwj-bengali-lig.pdf
4. http://unicode.org/review/pr-30.pdf
5. https://en.wikipedia.org/wiki/Bengali_alphabet
Prince does not seem to handle the ZWNJ character in the same manner as the browsers.
Following some suggestions on the Forum, I have added:
prince-text-replace: '\200C' '\200B'
to the CSS file (substituting ZWSP for ZWNJ).
This fixes most of the display issues. However, one character still does not display properly.
Reference (3) in particular discusses the ligature and non-ligature forms of combined characters. I suspect the troublesome character is one of these combinations whose "normal form" is the non-ligature one.
I have attached:
1. BengaliTest.html - the sample HTML page
2. TroublesomeText.png - shows which characters are not displayed properly in the PDF
3. PrintIt.css - the CSS file
4. BengaliTest-NoTextReplace.pdf - resulting PDF without the text replace
5. BengaliTest-TextReplace.pdf - resulting PDF with the text replace
TroublesomeText.png is a screenshot of the browser output, showing the proper display. The outlined characters do not display properly in the PDFs.