CRGREP searches for matching text in databases, various document formats, archives, and other difficult-to-access resources.
It is a command-line tool that matches names and content in database tables, plain files, MS Office documents, PDFs, archives, MP3 audio, image metadata, scanned documents, Maven dependencies, and web resources.
CRGREP searches resources nested within other resources in any combination and to any depth, for example text within a document within a zip archive.
Here you will find binary downloads and discussion (https://sourceforge.net/p/crgrep/discussion/). ...
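The nested search described above can be illustrated with a minimal Python sketch. This is not CRGREP's code or its command-line interface; the function name `search_nested` and the slash-separated path format are assumptions made for illustration, and only zip archives are handled.

```python
import io
import zipfile


def search_nested(data: bytes, needle: str, path: str = "") -> list[str]:
    """Return the paths of all entries whose text contains `needle`.

    Zip archives are opened in memory and searched recursively, so a match
    inside an archive inside an archive is reported with its full path.
    """
    buffer = io.BytesIO(data)
    if zipfile.is_zipfile(buffer):
        hits: list[str] = []
        with zipfile.ZipFile(buffer) as archive:
            for name in archive.namelist():
                if name.endswith("/"):
                    continue  # directory entry, nothing to read
                child_path = f"{path}/{name}" if path else name
                # Recurse: the entry may itself be another archive.
                hits.extend(search_nested(archive.read(name), needle, child_path))
        return hits
    # Not an archive: treat the bytes as text and look for the needle.
    text = data.decode("utf-8", errors="ignore")
    return [path] if needle in text else []
```

Calling `search_nested(Path("outer.zip").read_bytes(), "needle", "outer.zip")` would list every entry at any depth whose text contains the word. Other container and document formats would each need their own reader.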
An open source search engine with a RESTful API and crawlers
...Using the web user interface, the crawlers (web, file, database, etc.), and the client libraries (REST API, Ruby, Rails, Node.js, PHP, Perl), you can quickly and easily integrate advanced full-text search capabilities into your application: full-text search with basic semantics, join queries, boolean queries, facets and filters, document (PDF, Office, etc.) indexing, web scraping, etc. OpenSearchServer runs on Windows and Linux/Unix/BSD.
This project provides a command-line Java application that uses LibreOffice in headless mode to convert a document to PDF. The source document must be in a file format that LibreOffice can open.
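A headless conversion of this kind can be sketched in a few lines of Python. This is not the project's Java code; it assumes the LibreOffice `soffice` executable is on the PATH, and the helper names `build_convert_command` and `convert_to_pdf` are hypothetical.

```python
import subprocess
from pathlib import Path


def build_convert_command(source: Path, out_dir: Path, soffice: str = "soffice") -> list[str]:
    """Build the LibreOffice command line for a headless PDF conversion."""
    return [
        soffice,
        "--headless",            # run without a user interface
        "--convert-to", "pdf",   # target format
        "--outdir", str(out_dir),
        str(source),
    ]


def convert_to_pdf(source: Path, out_dir: Path) -> Path:
    """Run the conversion and return the expected path of the PDF."""
    out_dir.mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError if LibreOffice exits with an error.
    subprocess.run(build_convert_command(source, out_dir), check=True)
    return out_dir / (source.stem + ".pdf")
```

Keeping command construction separate from execution makes the command easy to inspect or log before LibreOffice is started.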
The purpose is to render almost all mails (body and attachments) into one or more PDFs. The focus is not on a "sexy" rendition but on producing a rendition at all. Mails are read over IMAP or from a directory, rendered, and saved as PDFs in an output directory.
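The reading step for the directory case might look like the following Python sketch, which uses only the standard `email` package. It is an illustration, not the project's implementation: the `.eml` file extension, the function names `load_mails` and `parts_to_render`, and the "body" label are assumptions, and the actual PDF rendering is left out.

```python
from email import policy
from email.message import EmailMessage
from email.parser import BytesParser
from pathlib import Path


def load_mails(directory: Path) -> list[EmailMessage]:
    """Parse every .eml file in a directory (the extension is an assumption)."""
    parser = BytesParser(policy=policy.default)
    return [parser.parsebytes(p.read_bytes()) for p in sorted(directory.glob("*.eml"))]


def parts_to_render(message: EmailMessage) -> list[tuple[str, bytes]]:
    """Collect the body and each attachment as (name, content) pairs."""
    parts: list[tuple[str, bytes]] = []
    # Prefer the plain-text body, falling back to HTML.
    body = message.get_body(preferencelist=("plain", "html"))
    if body is not None:
        parts.append(("body", body.get_content().encode("utf-8")))
    for attachment in message.iter_attachments():
        name = attachment.get_filename() or "attachment"
        payload = attachment.get_payload(decode=True) or b""
        parts.append((name, payload))
    return parts
```

Each returned pair would then be handed to whatever renderer produces the PDF for that content type.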