It would be great to integrate Apache Tika into Rivulet ES.
Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries. Supported formats include:
* HyperText Markup Language
* XML and derived formats
* Microsoft Office document formats
* OpenDocument Format
* Portable Document Format
* Electronic Publication Format
* Rich Text Format
* Compression and packaging formats
* Text formats
* Audio formats
* Image formats
* Video formats
* Java class files and archives
* The mbox format
Thanks and regards,
Dimce Iliev