CSCharacterParser - in rare cases glyphAscent and glyphDescent return 0

Help
Peter M
2014-01-10
2014-02-03
  • Peter M

    Peter M - 2014-01-10

    Hi,

    I use CSCharacterParser to highlight words in a custom PDF viewer I'm working on. I found that in rare cases glyphAscent and glyphDescent return 0. To fix/hack this I use the following in "basicTextShowGlyphs":

    @Override
    protected void basicTextShowGlyphs(PDGlyphs glyphs, float advance)
            throws CSException {
        AffineTransform tx;
        tx = (AffineTransform) getDeviceTransform().clone();
        tx.concatenate(textState.globalTransform);
        lastStartX = tx.getTranslateX();
        lastStartY = tx.getTranslateY();
        // get the transformed character bounding box
        double glyphAscent = glyphs.getAscent();
        double glyphDescent = glyphs.getDescent();
        double ascent = (textState.fontSize * glyphAscent) / THOUSAND;
        double descent = (textState.fontSize * glyphDescent) / THOUSAND;
    
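        // hack begin: some fonts report 0 for both ascent and descent,
        // so fall back to 3/4 and 1/4 of the font size
        if (ascent == 0.0 && descent == 0.0) {
            ascent = (textState.fontSize * 3) / 4;
            descent = (textState.fontSize * 1) / 4;
        }
        // hack end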
    
        if (descent > 0) {
            descent = -descent;
        }
        double[] pts = new double[] { 0, descent, advance, ascent };
        tx.deltaTransform(pts, 0, pts, 0, 2);
        //
        float x = (float) lastStartX;
        float y = (float) (lastStartY + pts[1]);
        float width = (float) pts[2];
        float height = (float) (pts[3] - pts[1]);
        if (width < 0) {
            x += width;
            width = -width;
        }
        if (height < 0) {
            y += height;
            height = -height;
        }
        Rectangle2D charRect = new Rectangle2D.Float(x, y, width, height);
        if (getBounds() == null || getBounds().intersects(charRect)) {
            onCharacterFound(glyphs, charRect);
        }
        // advance text matrix and store position for reference
        super.basicTextShowGlyphs(glyphs, advance);
        tx = (AffineTransform) getDeviceTransform().clone();
        tx.concatenate(textState.globalTransform);
        lastStopX = tx.getTranslateX();
        lastStopY = tx.getTranslateY();
    }
    

    Please let me know if you have a better suggestion.

    Best, Peter

     
  • mtraut

    mtraut - 2014-01-17

    This sounds reasonable as a workaround, but it may be better to get to the root of the problem. Could you send us a document in which this possibly wrong ascent/descent occurs? If a workaround is absolutely required, computing a default may be better implemented in PDFontDescriptor itself.
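
    To illustrate what such a default computation might look like when kept in one place, here is a minimal standalone sketch. `FallbackMetrics` and `resolve` are hypothetical names and not part of the jPod API; the 3/4 and 1/4 defaults are simply the ones from the hack above, and the metrics are assumed to be in 1/1000 em units.

    ```java
    /** Hypothetical helper (not part of jPod): resolves ascent/descent with a default. */
    final class FallbackMetrics {
        static final double THOUSAND = 1000.0;

        /**
         * Returns {ascent, descent} scaled by the font size, with descent <= 0.
         * glyphAscent and glyphDescent are assumed to be in 1/1000 em units.
         */
        static double[] resolve(double fontSize, double glyphAscent, double glyphDescent) {
            double ascent = (fontSize * glyphAscent) / THOUSAND;
            double descent = (fontSize * glyphDescent) / THOUSAND;
            // Assumed default: when the font reports 0 for both values,
            // use 3/4 of the font size above and 1/4 below the baseline.
            if (ascent == 0.0 && descent == 0.0) {
                ascent = fontSize * 0.75;
                descent = fontSize * 0.25;
            }
            // Normalize the sign so the descent always points below the baseline.
            if (descent > 0) {
                descent = -descent;
            }
            return new double[] { ascent, descent };
        }
    }
    ```

    With this, `FallbackMetrics.resolve(12, 0, 0)` yields an ascent of 9 and a descent of -3, while non-zero metrics pass through unchanged apart from scaling.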

    ciao, Michael

     
  • mtraut

    mtraut - 2014-02-03

    Well, this doc has about 130 pages. You do not want me to scan through this, do you? A look at the first pages shows the use of three (standard) fonts, each of which correctly evaluates to some non-zero ascent / descent.

    Maybe you could narrow this down with a code snippet acting on some specific part...

     
  • Peter M

    Peter M - 2014-02-03

    Here is a single page for which all characters return 0 from getAscent() and getDescent():

    https://drive.google.com/file/d/0BxXDYY2kfIQ0c3VNeElNUlBIY1E/edit?usp=sharing

    I'm sorry, my understanding of working with fonts in a PDF is very limited.

    Best, Peter

     
  • mtraut

    mtraut - 2014-02-03

    This page references two fonts, both of them with ascent and descent explicitly set to 0. So the creator gets what he deserves :-) There is nothing you can work around here, except on the creator side.

    If I remember correctly, the font descriptor should not even be included, as it is derived from one of the base fonts. And if it is embedded, it should at least copy the correct values.
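
    For a viewer that still has to cope with such files, one possible approach is to look up the metrics by base font name whenever the descriptor reports 0 for both values. The sketch below is only an illustration: `Standard14Metrics` and `lookup` are hypothetical names, not jPod API, and the Ascender/Descender numbers are the ones I recall from the Adobe AFM files for three of the standard fonts, so please verify them against the AFM files before relying on them.

    ```java
    import java.util.Map;

    /** Hypothetical lookup (not part of jPod) of AFM ascender/descender values. */
    final class Standard14Metrics {
        // {Ascender, Descender} in 1/1000 em units, as recalled from the AFM files.
        private static final Map<String, int[]> AFM = Map.of(
                "Helvetica", new int[] { 718, -207 },
                "Times-Roman", new int[] { 683, -217 },
                "Courier", new int[] { 629, -157 });

        /** Returns {ascender, descender} for a known base font name, or null otherwise. */
        static int[] lookup(String baseFontName) {
            if (baseFontName == null) {
                return null;
            }
            int[] metrics = AFM.get(baseFontName);
            // Hand out a copy so callers cannot modify the shared table.
            return metrics == null ? null : metrics.clone();
        }
    }
    ```

    A caller would try `Standard14Metrics.lookup(name)` first and only fall back to a size-based guess when it returns null.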

     
