This project aims to build a suite of Natural Language Processing tools. Modules will include corpus indexing and access tools, a part-of-speech tagger, tokenisers, text classification software, etc.
A multi-agent architecture for building interactive dramas. It uses Jason's BDI engine: the Jason agent-oriented programming language handles drama management and is used to author character behaviors.
This is a Java-based project for complex event extraction from text and co-reference resolution. Currently the code can read BioNLP shared task format (http://2011.bionlp-st.org/) and i2b2 Natural Language Processing for Clinical Data shared task format (https://www.i2b2.org/NLP/DataSets/Main.php). Event extraction includes finding events and the parameters for an event in a text.
The Wikipedia Miner toolkit provides simplified access to Wikipedia. This open encyclopedia represents a vast, constantly evolving multilingual database of concepts and semantic relations, making it a promising resource for NLP and related research.
This application illustrates natural language processing using tagged grammars and statistical classification. Outputs are shown with the EMMA specification of the W3C. A viewer is provided to allow for more user-friendly viewing of EMMA results.
Web application for making user-friendly requests on large XML databases.
Tools to XML-ize large bodies of semi-formal texts (like floras).
Computer-assisted specimen identification.
Uses natural language processing and 2D/3D image analysis and generation.
Maximum entropy is a powerful method for constructing statistical models of classification tasks, such as part of speech tagging in Natural Language Processing. Several example applications using maxent can be found in the OpenNLP Tools Library.
Reconcile is an open source research platform for coreference resolution. It combines a large number of open source NLP components and provides extension points for researchers to plug in additional features and techniques.
Java suffix array library for phrase discovery. Inspired initially by the classic paper of Yamamoto & Church, with newer ideas from Abouelhoda et al. and Kim et al. Adapted for large alphabets so that words can be tokenized as alphabet characters.
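To illustrate the idea of treating whole words as alphabet symbols, here is a minimal Python sketch (not taken from the library, which is written in Java). It builds a suffix array over a list of word tokens with a naive sort and counts the phrases of a given length that occur more than once. The function names `build_suffix_array` and `repeated_phrases` are hypothetical, and the naive construction is only suitable for small inputs.

```python
def build_suffix_array(tokens):
    """Return suffix start positions sorted by the token sequence they begin.

    Naive construction (sorting slices); fine for a small illustration only.
    """
    return sorted(range(len(tokens)), key=lambda i: tokens[i:])


def repeated_phrases(tokens, n):
    """Count n-word phrases that occur at least twice.

    Suffixes sharing the same first n tokens sit next to each other in the
    suffix array, so each adjacent pair with an equal n-token prefix adds
    one more occurrence of that phrase.
    """
    sa = build_suffix_array(tokens)
    counts = {}
    for a, b in zip(sa, sa[1:]):
        pa = tuple(tokens[a:a + n])
        pb = tuple(tokens[b:b + n])
        if len(pa) == n and pa == pb:
            counts[pa] = counts.get(pa, 1) + 1
    return counts


if __name__ == "__main__":
    words = "the cat sat on the mat and the cat sat down".split()
    print(repeated_phrases(words, 2))
```

Because the symbols are words rather than characters, the same adjacency property of suffix arrays yields repeated word sequences directly.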
A language modeling toolkit written in Java for natural language processing applications. It can handle character-by-character modeling of unknown words, language model combination, comparison, and evaluation, as well as a number of smoothing techniques.
This project contains implementations of algorithms to integrate the output of different NLP tools (part of speech taggers, morphologies, parsers, etc.) in order to obtain more accurate, more robust and more fine-grained linguistic analyses.
Note that the code is outdated, but left here for documentation purposes. Its functionality may be reimplemented within the NLP2RDF project (http://code.google.com/p/nlp2rdf).
D.U.C.K (Determine segmentation of Unknown words by using Context Knowledge) is an NLP tool that aims to find the correct segmentation for unknown words in written Hebrew. Statistics from different scopes will be used to determine the segmentation.
Sanchay is a collection of tools and APIs for language researchers. It has some implementations of NLP algorithms, some flexible APIs, several user-friendly annotation interfaces, and the Sanchay Query Language for language resources.
This project is a compilation of tools and libraries, mainly in Java, to help with tasks related to text analytics. These tools range from simple wrappers to sophisticated mining tasks that can improve the productivity of researchers and engineers.
OpenDMAP (Open Source Direct Memory Access Parser) is a natural language processing (text mining) application: a semantic parser for information extraction.
NLP4J library is a toolset written in Java for Natural Language Processing. This version is oriented to Document Classification and uses Naive Bayes, TF-IDF, etc. There are also pre-processing tools.
Facilitates running data mining and natural language processing experiments on weblogs, such as classification, clustering and rating. As part of these experiments, it is possible to apply Latent Semantic Analysis.
NLPTools-ES is a Spanish plugin for GATE (General Architecture for Text Engineering). It includes a tokenizer, sentence splitter, gazetteer, and POS tagger.
QuickAI (pronounced "quickeye", or just "Quick" for short) is a return to the fundamental goals of creating an artificial intelligence. The priorities are to implement core models of knowledge and knowing, a reasoning engine, and a simple interface.
TBLTools is a set of GATE processing resources that implements the Fast Transformation Based Learning Algorithm. You can train it to learn rules for NLP tasks such as Named Entity Recognition and Shallow parsing.
JWNL is a Java API for accessing the WordNet relational dictionary. WordNet is widely used for developing NLP applications, and a Java API such as JWNL will allow developers to more easily use Java for building NLP applications.