Hi,
I encountered an issue when parsing HTML code with writeHTML.
Example to reproduce it:
require_once 'tcpdf/tcpdf.php';
$pdf = new TCPDF(PDF_PAGE_ORIENTATION, PDF_UNIT, PDF_PAGE_FORMAT, true, 'UTF-8', false);
$pdf->AddPage();
$pdf->SetFont('freeserif', '', 12);
$content = '<div>Test: Р 123</div>';
$pdf->writeHTML($content);
$pdf->Write(0, $content);
$pdf->Output('test.pdf', 'F');
This works fine and outputs two lines (HTML and plain text) to the PDF.
But if I change $content to:
$content = '<div>Test: Р</div>';
then TCPDF outputs only the second (plain text) line.
It looks like the problem is the last character "Р" (Unicode: U+0420, UTF-8: D0A0, CYRILLIC CAPITAL LETTER ER) before the closing </div>.
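The code point and byte values quoted above can be checked with plain PHP, without TCPDF. This is only a minimal sketch: it assumes PHP 7.2+ with the mbstring extension, and the variable names are illustrative.

```php
<?php
// U+0420 CYRILLIC CAPITAL LETTER ER, written as an escape so the result
// does not depend on how this source file itself is saved (PHP 7.0+).
$er = "\u{0420}";

$utf8Hex   = bin2hex($er);          // hex dump of the raw UTF-8 bytes
$codePoint = mb_ord($er, 'UTF-8');  // code point as a decimal integer
$codeHex   = dechex($codePoint);    // the same code point in hexadecimal

echo "UTF-8 bytes:      $utf8Hex\n";    // d0a0
echo "Code point (dec): $codePoint\n";  // 1056
echo "Code point (hex): $codeHex\n";    // 420

// Confirms the string is well-formed UTF-8.
var_dump(mb_check_encoding($er, 'UTF-8')); // bool(true)
```

Running bin2hex() on the real $content is a quick way to confirm that the character before </div> is actually stored as D0 A0.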
The following code seems to work fine with the freesans font:
$content = '
$content = '
However, the freeserif font seems to have some problems rendering in Firefox when subsetting is turned on, so you can turn it off for now:
$pdf->SetFont('freeserif', '', 12, '', false);
I'll investigate if there is a font issue or not.
Unfortunately it does not help.
Check the code below:
Screenshot from Adobe Reader 11 (TCPDF v.6.0.058) attached. No Firefox used.
I am unable to reproduce your problem.
Maybe your letter is not properly encoded?
Try to use the HTML entity for the letter instead:
&#420;
According to http://htmlentities.net, the "Р" letter is &#1056;, not &#420;. It is originally a normal capital Cyrillic letter "Р" (pronunciation: ER). Other Cyrillic letters placed before </div> are output normally. But "Р" fails.
$pdf->writeHTML('<div>Test A: &#1056;</div>'); // Outputs: Test A: Р
$pdf->writeHTML('<div>Test B: Р</div>'); // No output
Very sad.
Seems that you are trying to use the wrong character:
http://www.fileformat.info/info/unicode/char/1056/index.htm
instead of:
http://www.fileformat.info/info/unicode/char/420/index.htm
420 DEC = 1A4 HEX
Did you try it yourself?
Outputs: Test A: Ƥ
HTML entity is not Unicode.
See http://www.fileformat.info/info/unicode/char/420/index.htm, section HTML Entity (decimal) = &#1056;
Then:
Outputs: Test A: Р (the desired result)
But:
Outputs nothing.
Look at the bug hack below:
Maybe it helps somebody to save time.
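Since the thread reports that the numeric entity renders while the raw character does not, another possible workaround is to convert non-ASCII characters to decimal entities before calling writeHTML. The sketch below is an illustrative alternative, not the replace-based hack mentioned above. It assumes the mbstring extension is available, and the helper name toNumericEntities is hypothetical (it is not part of TCPDF).

```php
<?php
// Hypothetical helper (not part of TCPDF): replace every non-ASCII
// character in the input with its decimal numeric entity.
function toNumericEntities(string $html): string
{
    // Map format: [first code point, last code point, offset, mask].
    // This range covers everything above 7-bit ASCII.
    $map = [0x80, 0x10FFFF, 0, 0x1FFFFF];
    return mb_encode_numericentity($html, $map, 'UTF-8');
}

$content = "<div>Test: \u{0420}</div>";
echo toNumericEntities($content), "\n"; // <div>Test: &#1056;</div>
```

The converted string could then be passed on as $pdf->writeHTML(toNumericEntities($content)). Whether this avoids the problem in a given TCPDF version has to be verified by testing.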
Last edit: Alexander Vasilyev 2014-02-14
Just corrected the replace code to support any closing tag:
Please do NOT reply to a closed ticket; use the Help forum instead.
Your document seems to be badly encoded, since the HTML entities are accepted.
&#1056; and &#420; are equivalent since one is in decimal and the other in hexadecimal representation.
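The decimal and hexadecimal entity forms discussed in this thread can be compared directly with PHP's html_entity_decode(). A minimal sketch, independent of TCPDF; the variable names are illustrative:

```php
<?php
$flags = ENT_QUOTES | ENT_HTML5;

// Decimal 1056 and hexadecimal 0x420 name the same code point, U+0420.
$decimal = html_entity_decode('&#1056;', $flags, 'UTF-8');
$hex     = html_entity_decode('&#x420;', $flags, 'UTF-8');

// Without the "x" prefix the number is read as decimal 420, i.e. U+01A4.
$other   = html_entity_decode('&#420;', $flags, 'UTF-8');

var_dump($decimal === $hex);   // bool(true)
echo bin2hex($decimal), "\n";  // d0a0 (U+0420, CYRILLIC CAPITAL LETTER ER)
echo bin2hex($other), "\n";    // c6a4 (U+01A4, LATIN CAPITAL LETTER P WITH HOOK)
```

In HTML, a numeric entity is hexadecimal only when it is written with the x prefix (&#x420;). Written as &#420;, it is decimal and refers to a different character.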