DocumentGrep Overview

Name of the program:
Further development of my project (PDF)GrepGui led to the new name DocumentGrep. Because it can now search text in PDF, DOC, DOCX, ODT, TXT, RTF and HTML files, it was time to rename the project.

Description:
This is a GUI for the command line tools grep, pdfgrep, pdftotext, unrtf, odt2txt, antiword, docx2txt, html2text and libreoffice. DocumentGrep searches text in multiple file types. You can use regular expressions for the search (https://en.wikipedia.org/wiki/Regular_expression). This GUI and the command line tools work without indexing. Either the document is converted into text and processed by the RegExpr library of Andrey V. Sorokin, or the search is handled by the command line tool itself (like pdfgrep).
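
As a rough illustration of the text-based path described above (convert first, then search with a regular expression), here is a minimal Python sketch. It is not DocumentGrep's own code: Python's re module stands in for the RegExpr library, and the function names grep_text, pdf_to_text and search_pdf_as_text are hypothetical. It assumes pdftotext is installed and on the PATH.

```python
import re
import subprocess


def grep_text(text, pattern):
    """Return (line_number, line) pairs for every line matching the regex."""
    regex = re.compile(pattern)
    return [(number, line)
            for number, line in enumerate(text.splitlines(), start=1)
            if regex.search(line)]


def pdf_to_text(path):
    """Convert a PDF to plain text; 'pdftotext FILE -' writes to stdout."""
    result = subprocess.run(["pdftotext", path, "-"],
                            capture_output=True, text=True, check=True)
    return result.stdout


def search_pdf_as_text(path, pattern):
    """Text-based search: only line numbers are known, not PDF pages."""
    return grep_text(pdf_to_text(path), pattern)
```

Because the search runs on the converted text, the result carries a line number rather than a page, which is why the text-based search can only show line numbers.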

Performance:
This GUI works well when searching several hundred documents, depending on the speed of your system and the length of the documents. LibreOffice is also used to convert documents to text, but only as the last option. LibreOffice is not designed for fast conversion, so using it for this purpose takes a very long time. If you want to search text in doc (antiword), docx (docx2txt), odt (odt2txt) or rtf (unrtf) files, please install the additional packages with "sudo apt-get install grep pdfgrep pdftotext unrtf odt2txt antiword docx2txt html2text". You can check whether the additional packages are installed by clicking on info - about in the menu. If the packages are installed, Libre/Open Office won't be used for these file types anymore.
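
The "dedicated tool first, LibreOffice only as the last option" rule can be sketched as follows. This is a hedged illustration, not the program's actual logic: the mapping is taken from the tool names listed above, the executable name "libreoffice" is an assumption (some systems call it "soffice"), and pick_converter is a hypothetical helper.

```python
import shutil

# Dedicated converter per file extension, as named in the paragraph above.
PREFERRED = {
    ".doc": "antiword",
    ".docx": "docx2txt",
    ".odt": "odt2txt",
    ".rtf": "unrtf",
}
FALLBACK = "libreoffice"  # assumed executable name; slow, so tried last


def pick_converter(extension, which=shutil.which):
    """Return the tool to use for an extension, or None if none is available."""
    ext = extension.lower()
    if ext not in PREFERRED:
        return None
    for tool in (PREFERRED[ext], FALLBACK):
        if which(tool):  # shutil.which returns None when the tool is missing
            return tool
    return None
```

For example, pick_converter(".docx") returns "docx2txt" when that tool is on the PATH and falls back to "libreoffice" only when it is not.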

Results/Viewers:
It is recommended to use pdfgrep instead of pdftotext, because the search result can then be opened in the PDF viewer on the correct page with the text marked (if the PDF viewer supports this). If you use the text-based search, you will only see the line number. You can choose your favorite text editor or PDF viewer in the options; all other documents are opened with the standard applications set by your desktop environment.

Config Dir:
[homedir]/.config/DocumentGrep

Available Languages:
- English
- German

You can add your own language by editing the file [homedir]/.config/DocumentGrep/language.set
Copy the entries of an existing language to the end of the file (please don't leave empty lines in the file).
Change all numbers at the beginning of the copied text to a new unique number and replace the values (after the '=') with the text in your language.
If you have made your own translation, please send it to stephan.stein@online.de. I will place it on the homepage so everyone can download it.

PDF Viewers:
To open a search result in an external document viewer by double-clicking it, a viewer has to be installed and you have to enter the name of the viewer in the options (menu/options).
At the first start, the program tries to find an installed document viewer. The following viewers are checked:
Viewer            Example for options (not all tested)
Okular (KDE)      okular -p $PAGE $FILE
Evince (GNOME)    evince -p $PAGE -l $SEARCH $FILE
Atril (MATE)      atril -p $PAGE -l $SEARCH $FILE
XPDF              xpdf $FILE $PAGE
Ghostview         gv -page=$PAGE $FILE
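
How such an option string might be turned into a command can be sketched like this. It is only an assumption about how the $PAGE, $FILE and $SEARCH placeholders are meant to be filled in, not the program's actual code, and build_viewer_command is a hypothetical helper.

```python
import re
import shlex

# The three placeholders used in the example option strings above.
PLACEHOLDER = re.compile(r"\$(PAGE|FILE|SEARCH)")


def build_viewer_command(template, file, page=1, search=""):
    """Expand the placeholders in a viewer template into an argument list."""
    values = {"PAGE": str(page), "FILE": file, "SEARCH": search}
    # Split first, substitute second: a file name containing spaces
    # then stays a single argument.
    return [PLACEHOLDER.sub(lambda match: values[match.group(1)], token)
            for token in shlex.split(template)]
```

The resulting list can be passed to subprocess.Popen without going through a shell. For instance, build_viewer_command("gv -page=$PAGE $FILE", "a.pdf", page=12) yields ["gv", "-page=12", "a.pdf"].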

Source: README, updated 2026-01-13