Rectangle returned from text search

Help
Steven M
2009-05-13
2013-05-28
  • Steven M
    2009-05-13

    First, an overview of what I'm trying to do. I'm searching for a string in a PDF file and getting back the list of hits. I take one of these hits and get the rectangle that covers it. I then want to create a link with that rectangle and place it on top of the found word. The issue is that in the rectangle returned from the search hit, the X values look fine but the Y values are huge negative numbers.
    For example, this is one of the objects returned from the ArrayList after it is cast to a CSTextSearchHit object:

    rec java.awt.geom.Rectangle2D$Float[x=85.03938,y=-8159.078,w=31.891998,h=14.0]

    As you can see, the Y value is a large negative number; it should be between 0 and 600 rather than -8159. The Y values also always seem to fall between -8000 and -8500.
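A quick way to confirm that a hit rectangle is off the page is to test it against the page bounds. The sketch below uses only java.awt.geom and the rectangle values quoted above. The class name, the helper name and the page size (842 x 595, A4 landscape) are assumptions for illustration; they are not taken from the document under test.

```java
import java.awt.geom.Rectangle2D;

class HitBoundsCheck {
    // Assumed page size in PDF user-space units (A4 landscape); hypothetical value.
    static final Rectangle2D PAGE = new Rectangle2D.Float(0f, 0f, 842f, 595f);

    /** Returns true if the hit rectangle lies entirely inside the assumed page bounds. */
    static boolean isOnPage(Rectangle2D hit) {
        return PAGE.contains(hit);
    }

    public static void main(String[] args) {
        // Rectangle as printed by the search hit in the post.
        Rectangle2D hit = new Rectangle2D.Float(85.03938f, -8159.078f, 31.891998f, 14.0f);
        System.out.println("hit on page: " + isOnPage(hit));
    }
}
```

With these values the check fails because the Y coordinate is far below the bottom edge of the page, while the X range alone would fit.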

    Is it possible to get some insight into what is causing this?
    Thanks,
    Steven

     
    • mtraut
      2009-05-13

      We had some problems with nested text transformations earlier, but these should have been fixed with the December release.

      Maybe you have found another issue. Could you provide a test document and a short description of the test for me?

       
      • Steven M
        2009-05-14

        OK, here is a modified version of the watermark example to which I added the search and the annotations. This could either be a bug, or I'm using the package incorrectly.

        It takes a search word (hard-coded to "duck" here) and, for the first instance found on each page, adds an annotation that is meant to cover the word.
        This works if you can get the right offset for the Y axis, in the case of documents where the pages are laid out vertically.

        I think the issue is that the corners are calculated over the full document rather than the page, but that is strange, because a fixed Y offset (7750 in my example) makes all the annotations match correctly. This would explain why the X values are fine: the document I tested is laid out vertically, so the X values cannot span multiple pages.

        If you want the PDF file I use, give me your email address and I'll send it to you.

        -------------------------------------------------------------------------------------------------------------------
            protected void markPages()
            {
                PDPage page = getDoc().getPageTree().getFirstPage();
                while (page != null)
                {
                    markPage(page);
                    page = page.getNextPage();
                }
            }

            protected void markPage(PDPage page)
            {
                CSTextSearcher searcher = new CSTextSearcher();
                searcher.setSearchString("duck");
                CSDeviceBasedInterpreter interpreter = new CSDeviceBasedInterpreter(null, searcher);
                interpreter.process(page.getContentStream(), page.getResources());

                ArrayList dire = new ArrayList();
                dire = searcher.getHits();
                CSTextSearchHit hit = new CSTextSearchHit();

                CSContent content = CSContent.createNew();
                CSCreator creator = CSCreator.createFromContent(content, page);
                creator.saveState();

                if (dire.size() != 0)
                {
                    System.out.println("Doing this");
                    hit = (CSTextSearchHit) dire.get(0);
                    System.out.println("This is rec " + hit.getRect());

                    PDLinkAnnotation annot = (PDLinkAnnotation) PDLinkAnnotation.META.createNew();
                    page.addAnnotation(annot);

                    CDSRectangle rect = new CDSRectangle(hit.getRect());

                    // This is used when I offset the Y in the duck.pdf to get the annotations to match
                    //rect.setLowerLeftY(Math.abs(rect.getLowerLeftY() + 7750));
                    //rect.setUpperRightY(Math.abs(rect.getUpperRightY() + 7750));

                    // This prints out the corners of the rectangle to the console so they can be seen
                    System.out.println("Lower Left X " + rect.getLowerLeftX());
                    System.out.println("Lower Left Y " + rect.getLowerLeftY());
                    System.out.println("Upper Right X " + rect.getUpperRightX());
                    System.out.println("Upper Right Y " + rect.getUpperRightY());

                    annot.setRectangle(rect);

                    PDBorderStyle bs = (PDBorderStyle) PDBorderStyle.META.createNew();
                    bs.setWidth(4);
                    bs.setStyle(COSName.create("S"));
                    annot.setBorderStyle(bs);
                    annot.setColor(new float[] { 0.5f, 0.5f, 0.5f });
                    annot.setHighlightingMode(COSName.create("O"));

                    PDAction action = PDActionJavaScript.createNew("app.alert('hello')");
                    annot.setAction(action);
                }

                creator.restoreState();
                creator.close();
                page.cosAddContents(content.createStream());
            }
        -----------------------------------------------------------------------------------------------------------------------
        Thanks,
        Steven

         
    • mtraut
      2009-05-14

      Please upload a reference document via a support request in the tracker. This will make things easier for our support staff.

      In the meantime I will have a look at your code.

       
    • Steven M
      2009-05-14

      I've uploaded the PDF file I've been using. I have the same issue with every PDF I test, though.

       
      • mtraut
        2009-05-14

        Hi,

        There is a bug in the computation of the global text transformation used for searching. My apologies.

        A fix will be included in the next release. Until then, you can fix this issue yourself by changing the methods:

        CSDeviceAdapter.textLineMove
        CSDeviceAdapter.textLineNew
        CSDeviceAdapter.textSetTransform

        In each of these you will find the transformation being computed in the wrong order. The correct code is:

                ...
                // restart global transformation
                textState.globalTransform.setTransform(graphicsState.transform);
                textState.globalTransform.concatenate(textState.transform);

        With this change the text is found at the correct position.
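To see why the order of concatenation matters, here is a minimal standalone sketch using java.awt.geom.AffineTransform, on the assumption that the transforms in the snippet above behave like AffineTransform (setTransform and concatenate are AffineTransform methods). The class name, the helper name and the matrix values are hypothetical and are not taken from the library or from the test document.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Point2D;

class TransformOrderDemo {
    /** Starts from the first transform and concatenates the second, as in the corrected code. */
    static AffineTransform global(AffineTransform first, AffineTransform second) {
        AffineTransform result = new AffineTransform();
        result.setTransform(first);
        result.concatenate(second);
        return result;
    }

    public static void main(String[] args) {
        // Assumed graphics-state transform: flip the Y axis on a page 595 units high.
        AffineTransform graphics = new AffineTransform(1, 0, 0, -1, 0, 595);
        // Assumed text transform: move the text origin to (85, 100).
        AffineTransform text = AffineTransform.getTranslateInstance(85, 100);
        Point2D origin = new Point2D.Double(0, 0);

        // Order used in the fix: graphics state first, then the text transform.
        System.out.println(global(graphics, text).transform(origin, null));
        // Swapped order, for comparison only.
        System.out.println(global(text, graphics).transform(origin, null));
    }
}
```

With these assumed matrices the text origin maps to (85, 495) in the first case and to (85, 695) in the swapped case, so the same two transforms place the text at different Y positions depending on the order in which they are combined.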

         
        • Steven M
          2009-05-14

          Yes, working now.
          Thanks again for the quick reply.
          Steven

           
    • mtraut
      2009-05-14

      By the way, some remarks on your snippet.

      You don't need to create and add a content stream to do what you are doing. Page content and annotations are in no way related in PDF.

      So, the lines

          CSContent content = CSContent.createNew();
          CSCreator creator = CSCreator.createFromContent(content, page);
          creator.saveState();

      and

          creator.restoreState();
          creator.close();
          page.cosAddContents(content.createStream());

      are not relevant to your example.