Detexter is an app designed to extract text from PDF files.
Extract Helium (formerly Carbon) backups
Analyze files to get their real format. Retrieve corrupted ones.
A command-line tool to extract data from XML files
Picks up text from a web page using an HTML template.
Open source Extract, Transform, Load (ETL) engine written in Java
Java-based heavy-duty utility to process large delimited text files
Graphical utility to inspect EBML files (WebM, Matroska), in Java
NAS-KONV delivers XSLT scripts to extract data out of ALKIS NAS files
XPath HTML parser
Simple extraction of XML nodes to CSV