Deploy in 115+ regions with the modern database for every enterprise.
MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
Start Free
Our Free Plans just got better! | Auth0
With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.
You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
This project aims to implement in java the following textmining techniques: Text Language Detection, Keywords and keyphrases extraction, Text Classification, Text Clustering, Single or multiple documents Summarization, Plagiarism Detection.
This library offers an API to useful tools that can be utilized for any text mining project. More details will be mentioned in the project's future documentation
DM LiWeCool is a tool for preprocessing light-weight CSV data files as Weka-compatible. It includes merging different header lines into one, editing values (encoding, categorizing, etc) and saving data as ARFF or XRFF (Weka native). It is Java-based.
Contextor is a light-weight simple-to-use Java based library to help developers and researchers working with the general concept of a resource; as examples, resources can be text resources, web resources, images and videos.
Wikipedia Concept Association Map (WCAM) is new approach for textual knowledge representation and understanding. All concepts and associations are stored in a graph database for better performance and easy distribution.
Moara is a biological textmining tool and consists of a Java library and some auxiliary MySQL databases for gene/protein training and extraction of mentions and its further normalization and disambiguation.
This project is a compilation of tools/libraries to help with tasks related to Text Analytics mainly in Java. These tools range from simple wrappers to sophisticated mining tasks that can improve the productivity of researchers and engineers.
OpenDMAP (Open Source Direct Memory Access Parser) is a natural language processing (text mining) application: a semantic parser for information extraction.
The Fiber project seeks to create a modular open source text mining tool that provides a contextual foundation for analysis in the dissemination of large quantities of text data.
@Note2 is now available in www.anote-project.org
@Note is a Biomedical Text Mining workbench that integrates current Biomedical Text Mining (BioTM) methods and provides biologists with intuitive tools capable of supporting their bibliographic searches and further literature curation.
hypKNOWsys aims at developing a Java-based workbench for knowledge discovery and knowledge management. Currently, hypKNOWsys has released two intermediate tools: DIAsDEM Workbench (textmining for semantic tagging) and WUMprep (Web mining pre-processing)
JDMF is a data mining framework written in Java. Main features include: simplicity, flexibility, many algorithms to choose from, many formats of input (e.g. XML, CSV, JDBC, Java beans) and output data (e.g. XML, plain text info, charts).
This project intends to create an indexing search engine, for knowledge management. The primary object is to apply an information retrieval core. And implement a knowledge data discovery theory such as data mining algorithm, text mining.
The aim of TextToOnto is to support developers in the ontology construction process by
applying text mining techniques. For this purpose it builds on KAON
(http://kaon.semanticweb.org)
txtkit is a visual text mining tool for exploring large amounts of multilingual texts. It's an multiuser-application which mainly focuses on the process of reading and reasoning as series of decisions and events.
webExtractor is a Java application that is used for extracting specific content from web based HTML, XML, CSV, and free form text. The extracted data can be used for data gathering and mining purposes.
GraphSpider is a pattern matcher which searches parsed text in phrase-structure tree or dependency graph format for syntactic structures matching a set of patterns in MPL, a regexp-like pattern language. Applications: information extraction, text mining.
reputron is a knowledge extraction engine platform that covers all aspect of text mining, relevance, indexing and querying on a corpus of text documents.