Deploy in 115+ regions with the modern database for every enterprise.
MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
Start Free
Gemini 3 and 200+ AI Models on One Platform
Access Google's best plus Claude, Llama, and Gemma. Fine-tune and deploy from one console.
Build generative AI apps with Vertex AI. Switch between models without switching platforms.
JBoost is a simple, robust system for classification. JBoost contains implementations of several boosting algorithms in an alternating decision tree framework. In addition, JBoost provides extensible software for adding more learning algorithms.
Java Micro Benchmark - control tasks required to determine the comparative performance characteristics of the computer system on different platforms. Can be used to determine the effect of different software on the performance of a computer system.
HanNanum is a Korean Morphological Analyzer and POS Tagger. A plug-in component-based architecture is adapted to the new Java version for flexible use. You can find the work flow for morphological analysis, POS tagging, noun extraction, etc.
Contact:
kschoi@kaist.ac.kr
hjjeong@world.kaist.ac.kr
The Wikipedia Miner toolkit provides simplified access to Wikipedia. This open encyclopedia represents a vast, constantly evolving multilingual database of concepts and semantic relations; a promising resource for nlp and related research.
Our generous forever free tier includes the full platform, including the AI Assistant, for 3 users with 10k metrics, 50GB logs, and 50GB traces.
Built on open standards like Prometheus and OpenTelemetry, Grafana Cloud includes Kubernetes Monitoring, Application Observability, Incident Response, plus the AI-powered Grafana Assistant. Get started with our generous free tier today.
ELIA(Eyegaze Language Integration Analysis) supports the analysis of eye-tracking data for studies in language processing. ELIA eases early analysis of data to enable iterative development of experiments in response to spoken language.
MediaWikiRevisionsExtractor extracts the history of a particular wiki page, computes the modifications made between each revisions and finally, stores the whole set of modifications in a file.
Parsers for biological data based on scanner generators like Flex (C), Re2c(C), Jflex (Java) and Ifickle (Tcl). This scanner generators are providing easier maintainance, development and higher speed than hand written scanners. Scanner output is SQL.
Enrich and query corpora in the TEI-XML vocabulary. CorpusReader manage very large corpora and corpora containing milestone annotation. It provides tools for enriching corpora with output of linguistic parsers, and for extracting quantitative information
Cougar Squared is a new Java library for machine learning and data mining research, supporting research needs of the community. It is written by researchers for researchers. It extends the WEKA and YALE machine learning frameworks.
Data mining tool for sequences (e.g. trajectories on a map, visited web pages, etc.) that creates a succinct description of the sequences, given a taxonomy (e.g. regions and sub-regions in the map, categories and sub-categories of pages, etc.).
Ex-Crawler is divided into 3 subprojects (Crawler Daemon, distributed gui Client, (web) search engine) which together provide a flexible and powerful search engine supporting distributed computing. More informations: http://ex-crawler.sourceforge.net
With the "xix" library, GATE functionality is available in XQuery (via an MXQuery extension). OpenCalais invocation is supported, too. -- Source code at http://sgv-jenkins-01.ethz.ch/job/xixlib/ws/-- "Show project details" for instruction
D.U.C.K (Determine segmentation of Unknown words by using Context Knowledge)is an NLP tool, which aims to find the correct segmentation for unknown words in written Hebrew. Statistics from different scopes will be used to determine the segmentation.
SPASE Model is a collection of tools for working with the structured data model information. Tools can convert the relational version of the data model into various expressions, including XSD, XMI and PDF documentation.
MonteCarlo portfolio simulation - it can be used as stand-alone command line application - it takes simple XML file needed data as entry and creates simple XML file with output, also this stuff have JNI and ISAPI interface.
OpenDMAP (Open Source Direct Memory Access Parser) is a natural language processing (text mining) application: a semantic parser for information extraction.
* Java classes for parsing text, conversion to XML or to evaluate in Java. The parser is textual-script-controlled with a syntax near Backus Naur Format, named ZBNF. * Some routines for conversion: C-Header or Java to XMI, XML-Documentation generation,
Maximally flat (maxflat) digital filter design in Java, with arbitrary numbers of poles and zeros. "Maximally flat" means that the magnitude frequency response has the maximum number of vanishing derivatives at 0 and pi.
OpenGGD aims to be a solution that centralizes the GPS information of a vehicle fleet, acting as an interface among different GPS devices or their control programs and different GIS programs
JColorGrid is a Java application for graphical 2D data exploration and rendering of colorgrids, a generalization of the popular heat-map representation. The software exists as a command-line application, a graphical viewer, and an API for development.
MACOW is a formal and scalable mandatory access implementation suitable on open worlds such as the provided on Semantic Web, Autonomic Computing and Coaltions and Federations scenarios. It is able to access control on distributed systems.
TBLTools is a set of GATE processing resources that implements the Fast Transformation Based Learning Algorithm. You can train it to learn rules for NLP tasks such as Named Entity Recognition and Shallow parsing.
T-Rex (Trainable Relation Extraction) is a highly configurable machine learning-based Information Extraction from Text framework, which includes tools for document classification, entity extraction and relation extraction.