Parallel text aligner designed to generate translation memories (TMX files) from two files tagged with any kind of XML-based markup. The application uses the tag structure and the text block length to perform the alignment.
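As a rough illustration of how tag structure and text block length can drive an alignment, here is a minimal Python sketch. It is not the project's actual algorithm: the function names `tokenize` and `align`, the `max_ratio` threshold, and the use of `difflib.SequenceMatcher` as the sequence-matching step are all assumptions made for this example.

```python
import re
from difflib import SequenceMatcher

# A tag is anything between "<" and ">"; a text block is anything between tags.
TOKEN_RE = re.compile(r"<[^>]+>|[^<]+")


def tokenize(doc):
    """Split markup into ("tag", name) and ("text", content) items."""
    items = []
    for match in TOKEN_RE.finditer(doc):
        piece = match.group(0)
        if piece.startswith("<"):
            # Keep the tag name only (closing tags keep their "/"), drop attributes.
            parts = piece.strip("<>").split()
            items.append(("tag", parts[0] if parts else ""))
        else:
            stripped = piece.strip()
            if stripped:  # ignore whitespace-only blocks
                items.append(("text", stripped))
    return items


def align(src, tgt, max_ratio=2.0):
    """Pair text blocks that sit at matching positions in the tag structure.

    max_ratio is a hypothetical threshold: a pair is rejected when the longer
    block is more than max_ratio times the length of the shorter one.
    """
    a, b = tokenize(src), tokenize(tgt)
    # Compare structure only: every text block is reduced to one placeholder.
    keys_a = [value if kind == "tag" else "#TEXT" for kind, value in a]
    keys_b = [value if kind == "tag" else "#TEXT" for kind, value in b]
    matcher = SequenceMatcher(None, keys_a, keys_b, autojunk=False)

    pairs = []
    for block in matcher.get_matching_blocks():
        for offset in range(block.size):
            kind_a, text_a = a[block.a + offset]
            kind_b, text_b = b[block.b + offset]
            if kind_a == "text" and kind_b == "text":
                longer = max(len(text_a), len(text_b))
                shorter = min(len(text_a), len(text_b))
                if longer <= max_ratio * shorter:
                    pairs.append((text_a, text_b))
    return pairs


if __name__ == "__main__":
    english = "<html><body><h1>Hello world</h1><p>This is a test.</p></body></html>"
    spanish = "<html><body><h1>Hola mundo</h1><p>Esto es una prueba.</p></body></html>"
    for source_text, target_text in align(english, spanish):
        print(source_text, "|", target_text)
```

Each resulting pair could then be written out as one translation unit of a TMX file. A real aligner would need a more tolerant structure comparison and a better length model than a single ratio.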
**CODE MOVED TO GITHUB: https://github.com/bitextor**
Bitextor is an application created to generate translation memories using multilingual websites as a corpus source. It downloads an entire website and applies a set of heuristics (based mainly on HTML tag structure and text block length) to find bitexts.
The Textract Project consists of C++ source code to extract text from a growing assortment of file formats. Output is indexing-ready. The Textract Project is intended as a foundation to support research-quality search engines.
Visual xsltproc is a tool that helps you write XSLT files and debug them to find errors. It edits XML and generates XML output, with syntax highlighting and line numbers for XML. If the result is XSL-FO, it generates a PDF using Apache FOP (Java). Built on Qt 4.2.