I've read the docs and searched the forums, but I'm still not entirely sure how charsets work.
My scripts use ISO-8859-15 (single byte), so I've set $unicode=FALSE and $charset='ISO-8859-15' in the TCPDF constructor. I've written this test code:
$pdf->writeHTML('<p>Character 151: ' . chr(151) . ' (dash in cp1252, control char in ISO-8859-1 and ISO-8859-15)</p>');
$pdf->writeHTML('<p>Character 128: ' . chr(128) . ' (euro in cp1252, control char in ISO-8859-1 and ISO-8859-15)</p>');
$pdf->writeHTML('<p>Character 164: ' . chr(164) . ' (currency sign in cp1252 and ISO-8859-1, euro in ISO-8859-15)</p>');
$pdf->writeHTML('<p>Entity &amp;euro;: &euro;</p>');
$pdf->writeHTML('<p>Entity &amp;#8364;: &#8364;</p>');
$pdf->writeHTML('<p>Entity &amp;#x20AC; &#x20AC;</p>');
I've run it with times, helvetica and courier and I get the same (inconsistent) result:
Character 151: — (dash in cp1252, control char in ISO-8859-1 and ISO-8859-15)
Character 128: € (euro in cp1252, control char in ISO-8859-1 and ISO-8859-15)
Character 164: ¤ (currency sign in cp1252 and ISO-8859-1, euro in ISO-8859-15)
Entity &euro;: ¤
Entity &#8364;: €
Entity &#x20AC; €
That is, the raw bytes are rendered as cp1252 (dash, plain euro, currency sign) even though I asked for ISO-8859-15, while the named entity &euro; comes out as a currency sign and the numeric entities come out as a euro.
The code in example_006.php sets $unicode=TRUE and no $charset, and it works fine on my system, but it only uses 7-bit ASCII data. If I do the same in my code and use iconv() to convert my ISO-8859-15 strings to UTF-8, I still get inconsistent results.
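For reference, here is a minimal sketch of how the three test bytes decode under each of the charsets mentioned above. It uses iconv() to convert a single byte to UTF-8 and prints the result as hex. The helper name utf8Hex() is made up for this illustration and is not part of TCPDF. It also assumes the local iconv implementation accepts the charset names 'CP1252', 'ISO-8859-1' and 'ISO-8859-15'.

```php
<?php
// Hypothetical helper (not part of TCPDF): decode one byte under the given
// charset and return the resulting UTF-8 bytes as a hex string.
function utf8Hex(int $byte, string $charset): string
{
    $utf8 = iconv($charset, 'UTF-8', chr($byte));
    if ($utf8 === false) {
        throw new RuntimeException("Cannot decode byte $byte as $charset");
    }
    return bin2hex($utf8);
}

// Print how each test byte decodes under each charset.
foreach ([151, 128, 164] as $byte) {
    foreach (['CP1252', 'ISO-8859-1', 'ISO-8859-15'] as $charset) {
        printf("%3d as %-11s -> %s\n", $byte, $charset, utf8Hex($byte, $charset));
    }
}
```

In this sketch, byte 164 yields e282ac (U+20AC, the euro sign) only when decoded as ISO-8859-15. Decoded as cp1252 or ISO-8859-1 it yields c2a4 (U+00A4, the currency sign).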
My questions:
Is it a bug?
Is the issue font related? Do I need to rebuild the included fonts?
Will these characters look the same on all computers?
Apologies if I seem too picky, but I'll be generating PDF files from user input.