This is a Java-based project for complex event extraction from text and for co-reference resolution. Currently, the code can read the BioNLP shared task format (http://2011.bionlp-st.org/) and the i2b2 Natural Language Processing for Clinical Data shared task format (https://www.i2b2.org/NLP/DataSets/Main.php). Event extraction consists of finding the events in a text and the parameters of each event.
The method is based on SVM but other ML algorithms can be adopted. The method details are explained in the...
HanNanum is a Korean morphological analyzer and POS tagger. The new Java version adopts a plug-in, component-based architecture for flexible use. It provides workflows for morphological analysis, POS tagging, noun extraction, etc.
Contact:
kschoi@kaist.ac.kr
hjjeong@world.kaist.ac.kr
jWords is a Java port of WORDS, a free Latin-to-English dictionary program written in Ada by William Whitaker. In addition, the dictionary will be translated into German.
The program creates OWL ontology files that describe relationships between entities. The relationships are based on definitions found by searching Wikipedia articles for specific lexico-syntactic patterns.
ELIA (Eyegaze Language Integration Analysis) supports the analysis of eye-tracking data for studies in language processing. ELIA eases early analysis of data to enable iterative development of experiments in response to spoken language.
The Simple Semantic Classifier classifies short chunks of natural language text into broad semantic classes that correspond to the OBO ontologies provided as input.
CORPSE (CORPus SEarch) is a powerful search engine written in Java. The aim is to provide an efficient implementation of a word level inverted index search with various cool functions that can be used on very large corpora.
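As a rough illustration of what a word-level inverted index does, here is a minimal Java sketch. It is not CORPSE's code: the class name `InvertedIndexSketch`, its methods, and the simple tokenization rule are all assumptions made for this example, and everything is kept in memory, which a real engine for very large corpora would not do.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; not taken from the CORPSE sources.
public class InvertedIndexSketch {
    // Maps each lower-cased word to the sorted set of document IDs containing it.
    private final Map<String, TreeSet<Integer>> postings = new HashMap<>();
    private final List<String> documents = new ArrayList<>();

    // Adds a document and returns its ID (0, 1, 2, ... in insertion order).
    public int add(String text) {
        int id = documents.size();
        documents.add(text);
        // Assumed tokenization: split on anything that is not a letter or digit.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                postings.computeIfAbsent(token, k -> new TreeSet<>()).add(id);
            }
        }
        return id;
    }

    // Returns the IDs of all documents containing the given word.
    public Set<Integer> search(String word) {
        TreeSet<Integer> hits = postings.get(word.toLowerCase(Locale.ROOT));
        return hits == null ? new TreeSet<>() : new TreeSet<>(hits);
    }

    // Returns the IDs of documents containing every one of the given words.
    public Set<Integer> searchAll(String... words) {
        TreeSet<Integer> result = null;
        for (String word : words) {
            Set<Integer> hits = search(word);
            if (result == null) {
                result = new TreeSet<>(hits);
            } else {
                result.retainAll(hits);
            }
        }
        return result == null ? new TreeSet<>() : result;
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add("The quick brown fox");
        index.add("A lazy brown dog");
        System.out.println(index.search("brown"));
        System.out.println(index.searchAll("brown", "dog"));
    }
}
```

Looking up a word is a single map access regardless of corpus size, which is the main reason inverted indexes are used for corpus search; an engine for very large corpora would additionally keep the postings on disk and usually record word positions.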
Java program to create a (potentially multilingual) glossary of the unique words in any given Lojban text.
Note that the SourceForge page for this project has been superseded by the Bitbucket repository: https://bitbucket.org/pretoriusjf/vlastezba/overview
Any further updates will be made there.
WordNetLMF converts WordNet (http://wordnet.princeton.edu/) lexicographer files into KYOTO-LMF, the LMF dialect used in the KYOTO project (http://www.kyoto-project.eu/).
This is a fast C implementation of Arturo Camacho's SWIPE' pitch extraction algorithm. See the project homepage for more about the advantages of the SWIPE' algorithm. swipe-1.0.tar.gz contains the current source, which should compile quite neatly.
Affisix is a program for the automatic recognition of prefixes. It takes a large number of words and, according to the user's settings, tries to determine which segments of these words are prefixes.
Genie is a highly sophisticated cognitive child-machine. Genie at its core is an artificial intelligence project, focusing on creating a new form of life.
The Varro toolkit is a system for identifying frequently recurring unordered subtrees in semi-structured data. It is intended mainly for linguistics but also has applications in semi-structured data mining.
The Scheme Natural Language Toolkit (S-NLTK) is a Scheme R6RS library for language and text processing, and various tasks related to symbolic and statistical analysis of language data.
Core program and associated utilities for building a machine translation system using the Example-Based paradigm, where previously-translated text is used to infer new translations of previously-unseen text.
Sanchay is a collection of tools and APIs for language researchers. It includes implementations of some NLP algorithms, flexible APIs, several user-friendly annotation interfaces, and the Sanchay Query Language for language resources.