The Textract Project is C++ source code for extracting text from a growing assortment of file formats. Its output is ready for indexing, and the project is intended as a foundation for research-quality search engines.

Categories

HTML/XHTML

License

GNU General Public License version 2.0 (GPLv2)

Additional Project Details

Operating Systems

Windows

Intended Audience

Developers

Programming Language

C++

Database Environment

Flat-file

Related Categories

C++ HTML XHTML

Registered

2008-11-13