Hello!
I have a problem with special characters in PDF files. I looked through the other threads about "no glyphs" problems, but I did not find an answer to mine. I have tried this in both version 7.1 and version 6, and the same thing happens in both.
I have this html source:
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
<style type="text/css">
.times {
font-size: 12px;
font-weight: normal;
font-family: Timew New Roman;
}
.arial {
font-size: 12px;
font-weight: normal;
font-family: Arial;
}
.verdana {
font-size: 12px;
font-weight: normal;
font-family: Verdana;
}
</style>
</head>
<body>
<h1>Gamma characters:</h1>
<div class="times">
Latin capital letter gamma: Ɣ<br />
Latin small letter gamma: ɣ<br />
Greek small letter gamma: γ<br />
</div>
<br />
<div class="arial">
Latin capital letter gamma: Ɣ<br />
Latin small letter gamma: ɣ<br />
Greek small letter gamma: γ<br />
</div>
<br />
<div class="verdana">
Latin capital letter gamma: Ɣ<br />
Latin small letter gamma: ɣ<br />
Greek small letter gamma: γ<br />
</div>
<br />
</body>
</html>
I run this manually on the Linux command line like this:
/usr/local/lib/prince/bin/prince encoding.html -o encoding.pdf --log=encoding.log
The result is the attached pdf file, and the log says:
Wed Feb 22 15:02:24 2012: ---- begin
Wed Feb 22 15:02:24 2012: warning: no glyphs for character U+0194, fallback to '?'
Wed Feb 22 15:02:24 2012: warning: no glyphs for character U+0263, fallback to '?'
Wed Feb 22 15:02:24 2012: warning: no glyphs for character U+0194, fallback to '?'
Wed Feb 22 15:02:24 2012: warning: no glyphs for character U+0263, fallback to '?'
Wed Feb 22 15:02:24 2012: warning: no glyphs for character U+0194, fallback to '?'
Wed Feb 22 15:02:24 2012: warning: no glyphs for character U+0263, fallback to '?'
Wed Feb 22 15:02:24 2012: ---- end
The three characters are, as the file says, three different versions of the gamma character. I copied them out of Windows charmap (Times New Roman font).
Since the meta tag is set to use UTF-8, I would expect the PDF to work with all three characters, but only the Greek small gamma works. The interesting thing is that when I set the character set in Windows charmap to "Unicode", it shows all three gamma characters, but with "Windows: Greek" only the Greek small gamma is there, just as in the PDF output (see attached images from charmap).
How do I know whether Prince actually uses UTF-8 for the PDF generation? I have also tried putting an XML declaration (<?xml version="1.0" encoding="UTF-8"?>) above the doctype in the HTML instead of the meta tag, with the same result.
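For what it's worth, here is a minimal sketch for cross-checking the code points in the log against the characters I pasted. It is plain Python, is not part of my Prince setup, and the `describe` helper is just a name I made up. If the names it prints for U+0194 and U+0263 match the labels in the HTML, that would suggest the file was decoded as UTF-8 and the warnings are about missing glyphs in the fonts rather than about the encoding. That is my reading, not something I have confirmed.

```python
import unicodedata

# The three gamma characters from the HTML source, written as escapes so the
# script itself does not depend on how this file is saved.
chars = ["\u0194", "\u0263", "\u03b3"]

def describe(ch):
    """Return (code point, Unicode name, UTF-8 bytes as hex) for one character."""
    return (
        "U+{:04X}".format(ord(ch)),
        unicodedata.name(ch),
        ch.encode("utf-8").hex(),
    )

if __name__ == "__main__":
    for ch in chars:
        print(" | ".join(describe(ch)))
```

The two code points in the warnings (U+0194 and U+0263) are the two Latin gammas, and the Greek small gamma (U+03B3) never appears in the log.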