Web-as-corpus tools in Java.
* Simple crawler (with integration for Nutch and Heritrix)
* HTML cleaner to remove boilerplate
* Language recognition
* Corpus builder
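
As a rough illustration of how the cleaning and corpus-building steps listed above might fit together, here is a minimal sketch in plain Java. It is not the project's actual API: the class `CorpusSketch`, the methods `stripMarkup` and `buildCorpus`, and the `minWords` threshold are all hypothetical names chosen for this example. The regex-based tag stripping is a simplification, and a real boilerplate remover would likely use an HTML parser and content heuristics instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch; none of these names come from the project itself.
class CorpusSketch {
    // Matches whole <script>...</script> and <style>...</style> blocks.
    private static final Pattern SCRIPT_OR_STYLE =
            Pattern.compile("(?is)<(script|style)\\b.*?</\\1\\s*>");
    // Matches any remaining tag.
    private static final Pattern TAG = Pattern.compile("(?s)<[^>]*>");

    // Removes script/style blocks and tags, then collapses whitespace.
    static String stripMarkup(String html) {
        String noScripts = SCRIPT_OR_STYLE.matcher(html).replaceAll(" ");
        String noTags = TAG.matcher(noScripts).replaceAll(" ");
        return noTags.replaceAll("\\s+", " ").trim();
    }

    // Cleans each page and keeps only texts with at least minWords words
    // (an assumed, very crude stand-in for boilerplate filtering).
    static List<String> buildCorpus(List<String> pages, int minWords) {
        List<String> corpus = new ArrayList<>();
        for (String page : pages) {
            String text = stripMarkup(page);
            if (!text.isEmpty() && text.split(" ").length >= minWords) {
                corpus.add(text);
            }
        }
        return corpus;
    }

    public static void main(String[] args) {
        List<String> pages = Arrays.asList(
                "<html><script>var x=1;</script><p>Hello <b>world</b></p></html>",
                "<p>Menu</p>");
        // With minWords = 2, the one-word "Menu" page is dropped.
        for (String doc : buildCorpus(pages, 2)) {
            System.out.println(doc); // prints "Hello world"
        }
    }
}
```

Crawling and language recognition are left out of the sketch; in a fuller pipeline the crawler would supply the `pages` list, and a language filter would run between cleaning and corpus assembly.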
Eureka is software for information processing. It can be used by scientists, students, journalists, or writers to organize their work. Eureka can work with multiple sources of information: Web pages, indexed HTML content, and book notes.