DataExtract is a program that scans files of many types (text, PDF, Word, Excel, etc.) and extracts structured patterns from them, such as email addresses and phone numbers.
Features
- Reads Plain Text From Most Of The Major File Types - PDF, DOC, DOCX etc.
- Processes Extracted Text Looking For Specific Data Items Like Email Addresses.
- Define Your Own Text Patterns To Search For.
- Or Select From A Large Number Of Existing Library Patterns.
- Define Words Or Phrases Of Interest To Search For.
- Add Your Own Sets Of Data Items For Extraction.
- Screen Colours Configurable.
- Six Different Ways To See Extracted Data.
- Comprehensive Help.
- Extract Data From Single Files, Multiple Files Or Whole Folder Structures.
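
The pattern-based extraction described in the features above can be illustrated with a minimal sketch. This is not DataExtract's own code or pattern library: the names `PATTERNS` and `extract_items` and both regular expressions are hypothetical, chosen only to show how named patterns might be applied to text that has already been read from a file.

```python
import re

# Hypothetical pattern library. These names and regexes are illustrative
# assumptions, not DataExtract's actual built-in patterns.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"\+?\d[\d\s().-]{6,}\d"),
}


def extract_items(text, patterns=PATTERNS):
    """Return a dict mapping each pattern name to the list of matches in text."""
    return {name: rx.findall(text) for name, rx in patterns.items()}


if __name__ == "__main__":
    sample = "Contact jane.doe@example.com or call +44 20 7946 0958."
    for name, matches in extract_items(sample).items():
        print(name, matches)
```

A user-defined pattern would, under the same assumptions, simply be another entry in the dictionary passed to `extract_items`, for example `{"order": re.compile(r"[A-Z]{2}-\d{4}")}`. The simple regexes here are deliberately loose and would need tightening for real data.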
License
Apache License V2.0