
#94 Search for many words NEAR each other

Status: wont-fix
Owner: nobody
Labels: None
Priority: 1
Updated: 2014-05-26
Created: 2014-05-18
Creator: sensor66
Private: No

Hi, great system you have here.
I come from doing literature research with FolioView (now by RocketSoftware). It no longer works on anything after Windows XP, and Rocket can't get the database to work under later Windows versions. I am sure more people have this problem.

We can convert our database to PDF, e.g. 286 PDF files totalling 213 MB. What we then need is a way to search for, say, up to 6 words NEAR each other. FolioView did that "within paragraphs".

Right now in DocFetcher, for two words this would look like:

"word1 word2"~50 OR "word2 word1"~50

But of course for 6 words this would become very complicated.

Is there a way to do this, or could it be programmed as a special search function?

Discussion

  • sensor66

    sensor66 - 2014-05-18

    One way to do this would be to divide all text into paragraphs, i.e. each paragraph is a record, and then search for the desired words within the paragraphs.

     

    Last edit: sensor66 2014-05-18
  • Nam-Quang Tran

    Nam-Quang Tran - 2014-05-18

    Hi,

    Implementing something like this within DocFetcher is out of the question, as it would require rewriting huge swaths of core functionality.

    Instead, you could write a script to split the PDF files into smaller text files, where each text file contains one paragraph from a PDF file. Then you could search the text files with DocFetcher.

    If this isn't enough, then you'll probably need new software written for this specific purpose.

    Best regards
    q:-) <= Quang

     
  • Nam-Quang Tran

    Nam-Quang Tran - 2014-05-26
    • status: open --> wont-fix
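
The paragraph-splitting workaround suggested in the thread could be sketched roughly as follows. This is only an illustration under some assumptions: the PDFs have already been converted to plain text by an external tool, paragraphs in that text are separated by blank lines (often not true for PDF extractions without extra cleanup), and the function names `split_paragraphs`, `write_paragraph_files` and `paragraphs_with_all_words` are made up for this example and are not part of DocFetcher.

```python
import re
from pathlib import Path


def split_paragraphs(text):
    """Split plain text into paragraphs at blank lines (assumed separator)."""
    blocks = re.split(r"\n\s*\n", text)
    # Collapse line breaks and runs of spaces inside each paragraph.
    return [" ".join(block.split()) for block in blocks if block.strip()]


def write_paragraph_files(text, out_dir, stem):
    """Write one file per paragraph: <stem>_0001.txt, <stem>_0002.txt, ..."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, para in enumerate(split_paragraphs(text), start=1):
        path = out / f"{stem}_{i:04d}.txt"
        path.write_text(para, encoding="utf-8")
        paths.append(path)
    return paths


def paragraphs_with_all_words(paragraphs, words):
    """Return the paragraphs containing every word, in any order.

    Matching is case-insensitive and on whole words only.
    """
    wanted = {word.lower() for word in words}
    hits = []
    for para in paragraphs:
        tokens = set(re.findall(r"\w+", para.lower()))
        if wanted <= tokens:
            hits.append(para)
    return hits
```

With one paragraph per file, the resulting text files could be indexed like any other documents, and a "within paragraph" search for six words would presumably reduce to a plain AND query over those words instead of a set of proximity phrases. `paragraphs_with_all_words` shows the same check done directly in the script, without an index.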
     
