Deploy in 115+ regions with the modern database for every enterprise.
MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
Start Free
Custom VMs From 1 to 96 vCPUs With 99.95% Uptime
General-purpose, compute-optimized, or GPU/TPU-accelerated. Built to your exact specs.
Live migration and automatic failover keep workloads online through maintenance. One free e2-micro VM every month.
A general purpose source code indexer and cross-referencer that provides web-based browsing of source code with links to the definition and usage of any identifier. Supports multiple languages. Up-to-date information in http://lxr.sourceforge.net
Search engine and data mining applications and ClueWeb datasets.
The Lemur Project develops search engines, browser toolbars, text analysis tools, and data resources that support research and development of information retrieval and text mining software, including the Indri search engine in C++, the Galago search engine research framework in Java, the RankLib learning to rank library, ClueWeb09 and ClueWeb12 datasets and the Sifaka data mining application.
XMLTV (http://xmltv.org/) is for grabbing TV listings primarily from websites. It has a grabber for Danish Television that grabs from http://tv.tv2.dk, but here we maintain serveral others. You can find documentation on http://niels.dybdahl.dk/xmltvdk
Full-stack observability with actually useful AI | Grafana Cloud
Our generous forever free tier includes the full platform, including the AI Assistant, for 3 users with 10k metrics, 50GB logs, and 50GB traces.
Built on open standards like Prometheus and OpenTelemetry, Grafana Cloud includes Kubernetes Monitoring, Application Observability, Incident Response, plus the AI-powered Grafana Assistant. Get started with our generous free tier today.
Framework for text mining, data integration and dataanalysis. Keywords: ontology and graph alignment, relation mining, warehouse, semantic database integration, bioinformatics, systems biology, microarray, Java.
Framework (scripts, configuration, code) to build free and public services around travel and leisure data. That project makes an extensive use of already existing data sources such as Geonames and dbPedia, and adds some glue around those (eg, links).
Graph-based Extraction and Summarization - a generic graph-based summarization framework. Basic functionality is provided - third-party modules can be plugged in.
High performance distributed in-memory key/value store
Infinispan is an open source, Java based data grid platform. ***IMPORTANT*** Starting with Infinispan 5.0.0.FINAL, Infinispan releases are no longer hosted in Sourceforge. They can now be located in www.jboss.org/infinispan/downloads
Spider that recollects data from MySpace Social Network.
At now, it is only designed to extract information from native american people because it is used for a social science study in the UNAM (Universidad Nacional Autónoma de México).
Webstats Solr is an attempt to make Apache Access log easier to Data Mine. By adding a powerful Search Engine (SOLR) as a Backend and using Java Script and HTML and maybe PHP I hope to out date AWStats.
Pypes is a framework which allows users to break complex data processing logic down into a series of smaller less complex tasks. These tasks, referred to as components, can then be connected so that the output of one becomes the input to another.
Nucular Archiving System for creating full text indices for fielded data. Python API, web, and command line interfaces. Fast. Very light weight. Concurrent read/writes with no possible locking issues. No server process. Proximity. Facets. Funny name.
This is a Python script to parse your irssi logs and input them into a MySQL database which you can then use to search and display your logs on the web. It incrementally updates the database from the logs and is ideally run as a cronjob often.
Visualization of the contact network and user data from the popular business network XING.com. The web-based software can be used by every registered user from XING.
Build your music portal easily. With this PHPNuke module you will be able to publish music information: Artists data, albums, songs, audio samples, chart lists, themes and more. You can install it with no technical knowledge.
Irudiko is a library written in C++ for generating Locality Sensitive Hashing sketches from any textual and web document. Mainly designed to work with HTML pages, it has also an optimization support for English or Italian documents.
This project intends to create an indexing search engine, for knowledge management. The primary object is to apply an information retrieval core. And implement a knowledge data discovery theory such as data mining algorithm, text mining.
A configurable knowledge management framework. It works out of the box, but it's meant mainly as a framework to build complex information retrieval and analysis systems. The 3 major components: Crawler, Analyzer and Indexer can also be used separately.
phpTrafMon is a set of scripts written in php. It shows in an attractive and user-friendly way the traffic in a local network and a share in it of every user. phpTrafMon requires MySQL, crontab and a popular IPFM program.
Image2DocInfo has been made to quickly tag digital pictures. A GUI allows you to set attributes for an image, and then store them in XML files. Those files follow the Dublin Core naming scheme and are stored in the same directories than the pictures.