PDF content search

phil
2015-09-16
2015-09-27
  • phil

    phil - 2015-09-16

    Thanks for docFetcher.

    Feature request:
    Would you be so kind as to integrate true PDF content search? The best tool I know for this is https://pdfgrep.org/

    I have pdf examples where docFetcher 1.1.16 (and other tools such as grepWin) will not find text content, while pdfgrep does.

    P.S. There is an outdated pdfgrep version for Windows at http://soft.rubypdf.com/software/pdfgrep-windows-version

     

    Last edit: phil 2015-09-16
  • Nam-Quang Tran

    Nam-Quang Tran - 2015-09-16

    What exactly is "true pdf content search"? If you're talking about searching for words in the contents of PDF files, DocFetcher already does that.

     
  • phil

    phil - 2015-09-22

    Only for some PDFs; that is what came out of all the discussions I had with PDF creator and pdfgrep developers.
    How can I send you a sample PDF where docFetcher won't perform the content text search?

     
  • Nam-Quang Tran

    Nam-Quang Tran - 2015-09-22

    DocFetcher's PDF content extraction relies on a third-party component called Apache PDFBox, so if it doesn't work on certain files for some reason, there isn't much I could do about it anyway.

    In any case, my (reversed) address is: users.sourceforge.net <- qforce@

     

    Last edit: Nam-Quang Tran 2015-09-22
  • phil

    phil - 2015-09-27

    Hmm, sorry, it seems I built the index incorrectly. docFetcher does indeed find text within searchable PDFs (contrary to grepWin). Still, you should look at the new --warn-empty option described at https://pdfgrep.org/news.html
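
For illustration only, the idea behind a "warn on empty" check can be sketched in a few lines of Python. This is not how pdfgrep or DocFetcher implement it. It assumes the text of each page has already been extracted by some other tool, and the names `has_searchable_text` and `warn_empty` are hypothetical.

```python
import sys
from typing import Dict, List


def has_searchable_text(pages: List[str]) -> bool:
    """Return True if at least one page holds a non-whitespace character."""
    return any(page.strip() for page in pages)


def warn_empty(documents: Dict[str, List[str]], stream=sys.stderr) -> List[str]:
    """Warn about every document with no extracted text and return their names.

    `documents` maps a file name to the list of per-page strings that an
    extraction step (assumed, not shown here) produced for that file.
    """
    empty = [name for name, pages in documents.items()
             if not has_searchable_text(pages)]
    for name in empty:
        print(f"warning: {name}: no searchable text", file=stream)
    return empty
```

A file flagged this way is typically a scanned, image-only PDF, which a plain text search cannot match until it has been run through OCR.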

     
