Problems with text rendering on some PDF documents

Help
Ruslan
2013-11-07
2013-11-07
  • Ruslan
    2013-11-07

    Hi, first of all, thanks for the lib, it's quite nice.

    Recently, though, I came across some PDF documents on which text isn't drawn properly. When I write English text with the "Arial" font on some PDF documents, the text is not displayed correctly: the Latin characters are replaced with unreadable characters.
    Here is one such document on which the problem can be reproduced: http://www.planetpdf.com/forumarchive/testMod.pdf

    Code to reproduce the problem:

    public class DrawTextOnDocument {
        public static void main(String[] args) throws Exception {
            PDDocument doc = null;
            try {
                String inputFileName = "testMod.pdf";
                String outputFileName = "testMod_result.pdf";
    
                FileLocator inlocator = new FileLocator(inputFileName);
                doc = PDDocument.createFromLocator(inlocator);
    
                IFontFactory factory = FontOutlet.get().lookupFontFactory(doc);
                FontQuery query = new FontQuery("Arial", PDFontStyle.REGULAR);
                PDFont font = factory.getFont(query);
                float fontSize = 20;
    
                PDPage page = doc.getPageTree().getFirstPage();
                while (page != null)
                {
                    CSCreator creator = CSCreator.createFromProvider(page);
                    creator.textSetFont(null, font, fontSize);
                    creator.textLineMoveTo(100, 700);
                    creator.textShow("Hello, World!");
                    creator.close();
    
                    page = page.getNextPage();
                }
    
                FileLocator outlocator = new FileLocator(outputFileName);
                doc.save(outlocator, null);
            } finally {
                if (doc != null) {                
                    doc.close();                
                }
            }
        }
    }
    

    Can I somehow use the "Arial" font to draw text on this PDF? Is there anything else I need to do to get this font to render properly?

    Thanks in advance
    Ruslan

     
    Last edit: Ruslan 2013-11-07
  • Elfi Heck
    2013-11-07

    The font "Arial" in the existing PDF is defined as a Type0 font with an Identity-H map and a Type2 CID font as its descendant font. This means that the character codes in the PDF's contents must be the actual glyph indices in the font file. To get these indices you have to use a character map contained in the font file itself, but CSCreator does not do this and instead writes the characters' Unicode code points into the PDF contents. Since a font file's glyph indices are almost never the same as the corresponding characters' Unicode code points, you get the wrong glyphs.
    It should be possible to implement a mapping to handle this, but I don't think we will do that. You can have a go at it if you want to dig deeper into font handling.
    As a quick workaround you could use another font instead of the one from the document. The built-in fonts are quite easy to access. See the examples.
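
The mismatch described above can be illustrated with a small, library-independent sketch. It does not use jPod. The class `IdentityHEncodingSketch`, its methods, and the two glyph indices in `sampleCmap()` are hypothetical and chosen only for illustration; a real implementation would read the mapping from the font file's own character map. The sketch also assumes two-byte codes, as Identity-H uses, and characters in the Basic Multilingual Plane.

```java
import java.io.ByteArrayOutputStream;
import java.util.HashMap;
import java.util.Map;

public class IdentityHEncodingSketch {

    // Hypothetical excerpt of a font's character map: Unicode code point -> glyph index.
    // The values are illustrative only, not read from any real font file.
    static Map<Integer, Integer> sampleCmap() {
        Map<Integer, Integer> cmap = new HashMap<>();
        cmap.put((int) 'H', 43);
        cmap.put((int) 'i', 76);
        return cmap;
    }

    // Writes each Unicode code point as a two-byte code. With an Identity-H
    // font these codes are interpreted as glyph indices, so the wrong glyphs appear.
    static byte[] encodeCodePoints(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        text.codePoints().forEach(cp -> {
            out.write((cp >> 8) & 0xFF);
            out.write(cp & 0xFF);
        });
        return out.toByteArray();
    }

    // Translates each code point to its glyph index first, then writes the
    // index as a two-byte code. Unmapped characters fall back to glyph 0 (.notdef).
    static byte[] encodeGlyphIndices(String text, Map<Integer, Integer> cmap) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        text.codePoints().forEach(cp -> {
            int glyphIndex = cmap.getOrDefault(cp, 0);
            out.write((glyphIndex >> 8) & 0xFF);
            out.write(glyphIndex & 0xFF);
        });
        return out.toByteArray();
    }

    // Formats bytes as an upper-case hex string, similar to a PDF hex string body.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Code points written directly: U+0048, U+0069
        System.out.println(toHex(encodeCodePoints("Hi")));                 // 00480069
        // Glyph indices from the hypothetical map: 43, 76
        System.out.println(toHex(encodeGlyphIndices("Hi", sampleCmap()))); // 002B004C
    }
}
```

The two outputs differ for the same text. That difference is why a viewer that treats the codes as glyph indices shows unrelated glyphs when code points are written unchanged.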