MediaWikiRevisionsExtractor extracts the history of a particular wiki page, computes the modifications made between successive revisions, and stores the whole set of modifications in a file.
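To illustrate the idea of computing the modifications between revisions and storing them in a file, here is a minimal Python sketch. It is not the project's code: the function names `revision_diffs` and `save_diffs`, the unified-diff output format, and the assumption that revisions arrive as a list of plain-text strings (oldest first) are all hypothetical choices made for this example.

```python
import difflib


def revision_diffs(revisions):
    """Return one unified diff (a list of lines) per consecutive pair of revisions.

    `revisions` is assumed to be a list of page texts ordered oldest to newest.
    """
    diffs = []
    for old, new in zip(revisions, revisions[1:]):
        diff = list(
            difflib.unified_diff(
                old.splitlines(),
                new.splitlines(),
                fromfile="previous",
                tofile="current",
                lineterm="",
            )
        )
        diffs.append(diff)
    return diffs


def save_diffs(diffs, path):
    """Write all diffs to a single text file, one labeled section per revision pair."""
    with open(path, "w", encoding="utf-8") as fh:
        for i, diff in enumerate(diffs, start=1):
            fh.write(f"# revision {i} -> {i + 1}\n")
            fh.write("\n".join(diff) + "\n")
```

Calling `save_diffs(revision_diffs(texts), "modifications.txt")` would write one section per pair of revisions. An identical pair produces an empty diff.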
KNIME (http://www.knime.org) nodes for sequence bioinformatics. Sequime is an Eclipse plug-in for the KNIME data mining platform that provides additional nodes for reading, processing, and visualizing sequence information.
Parsers for biological data based on scanner generators such as Flex (C), re2c (C), JFlex (Java), and Ifickle (Tcl). These scanner generators offer easier maintenance and development, and higher speed, than hand-written scanners. The scanner output is SQL.
Enrich and query corpora in the TEI-XML vocabulary. CorpusReader manages very large corpora and corpora containing milestone annotations. It provides tools for enriching corpora with the output of linguistic parsers and for extracting quantitative information.
Cougar Squared is a new Java library for machine learning and data mining research, supporting research needs of the community. It is written by researchers for researchers. It extends the WEKA and YALE machine learning frameworks.
The system searches for synonyms (and related words) in Wikipedia. WikIDF generates an index database of Wikipedia (for Russian, English, and German). This project is continued as "wikokit" at code.google.com.
Feating constructs a classification ensemble comprising a set of local models. It is effective at reducing the error of both stable and unstable learners, including SVM. For details see the paper at http://dx.doi.org/10.1007/s10994-010-5224-5.
The NITE XML Toolkit supports the creation, analysis, and browsing of annotated multimodal, text, or spoken language corpora, and represents both timing and rich linguistic structure. It contains libraries for developers and some end user tools.
Executable program that measures sizes and other properties of colonies arrayed in a grid format (intended for 768, 384, or 96 colonies on agar plates) from JPEG images.
Siafu simulates individual agents and their context, from home to city-wide scenarios. As a developer, you use the API to write your simulation for the purposes of data-set generation, test or visualization, optionally hooking it to your own application.
Data mining tool for sequences (e.g. trajectories on a map, visited web pages, etc.) that creates a succinct description of the sequences, given a taxonomy (e.g. regions and sub-regions in the map, categories and sub-categories of pages, etc.).
A regexp testing tool that applies a group of regexps to huge arrays of data (millions of items or so) in order to investigate what the group can find or search-and-replace.
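As a rough illustration of applying a group of regexps to a large data set, the following Python sketch counts how many records each pattern matches and applies an ordered list of search/replace rules. It is an assumption-laden example, not the tool's implementation: `count_matches`, `apply_replacements`, and the dictionary/list inputs are hypothetical names and shapes chosen here.

```python
import re
from collections import Counter


def count_matches(patterns, records):
    """Count, per named pattern, how many records contain at least one match.

    `patterns` maps a label to a regex string; `records` is any iterable of
    strings, so it can be a generator streaming millions of lines.
    """
    compiled = {name: re.compile(p) for name, p in patterns.items()}
    hits = Counter()
    for record in records:
        for name, rx in compiled.items():
            if rx.search(record):
                hits[name] += 1
    return hits


def apply_replacements(rules, records):
    """Lazily apply (pattern, replacement) rules, in order, to each record."""
    compiled = [(re.compile(p), repl) for p, repl in rules]
    for record in records:
        for rx, repl in compiled:
            record = rx.sub(repl, record)
        yield record
```

Both functions compile each pattern once and iterate over the records a single time, which keeps memory use flat when the input is read line by line from a file.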
library for capturing, storing and visualizing timeseries data
JTimeSeries has moved to GitHub.
Please go to https://github.com/JTimeSeries/jtimeseries
The SourceForge copy has not been maintained since Sep 2012
A Java library to assist with capturing and storing timeseries data and metrics. It provides facilities to publish timeseries data across a network, a lightweight server to persist series data, and client user interface components for real-time visualization.
OpenSHORE is an XML-based Semantic Document Repository (SDR) with a freely definable meta model that builds up a semantic network from sections and relations in documents. The acronym SHORE means Semantic Hypertext Object Repository.
Ontea - Pattern-based Semantic Annotation Platform. Ontea finds or creates semantic metadata from text or documents using pattern-based approaches. The implementation currently includes regular expression (regex) patterns.
JGraph is the most powerful, lightweight, feature-rich, and thoroughly documented open-source graph component available for Java. See the project homepage at www.jgraph.com for information and downloads.
A lyrical analysis and classification tool focused specifically on rhyming style in rap lyrics. Functions include phonetic transcription, rhyme visualization, and rapper classification.
Contextor is a lightweight, simple-to-use, Java-based library to help developers and researchers working with the general concept of a resource; for example, resources can be text resources, web resources, images, and videos.
Example-based Modeling (EMO) is a tool for creating data models, with examples, through a web interface. You interactively create a web-accessible database of models and samples for those models. A white paper describes the underlying assumptions.