Deploy in 115+ regions with the modern database for every enterprise.
MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
Start Free
Go From AI Idea to AI App Fast
One platform to build, fine-tune, and deploy ML models. No MLOps team required.
Access Gemini 3 and 200+ models. Build chatbots, agents, or custom models with built-in monitoring and scaling.
ResCarta Toolkit offers an open source solution to creating, storing, viewing, and searching digital collections. Applications in the toolkit let users create and edit metadata, convert data to open standard ResCarta format, index and host collections.
Search engine and data mining applications and ClueWeb datasets.
The Lemur Project develops search engines, browser toolbars, text analysis tools, and data resources that support research and development of information retrieval and text mining software, including the Indri search engine in C++, the Galago search engine research framework in Java, the RankLib learning to rank library, ClueWeb09 and ClueWeb12 datasets and the Sifaka data mining application.
panFMP is a generic framework suitable for harvested XML metadata that is searchable through Apache Lucene without any additional RDBMS. Fields can be defined by XPath allowing for full text queries on all types of fields including numerical ranges.
The code was moved to Github: https://github.com/pangaea-data-publisher/panfmp
Project moved to GitHub!
https://github.com/carrot2/carrot2
Carrot2 is an Open Source Search Results Clustering Engine. It can automatically organize small collections of documents, e.g. search results, into thematic categories. Carrot2 integrates very well with both Open Source and proprietary search engines.
Digital Learning Sciences (DLS) is a mission-centered, not-for-profit organization dedicated to improving learning through the use of digital content and tools.
IDRA (InDexing and Retrieving Automatically) is a tool which allows indexing a wide range of text (TXT, DOC, PDF) and image annotations files (XML), query-based searching, visualizing an index, saving it for re-usability, evaluation, etc.
With PubSearch you can search for publications of an author in more publication databases at one time. PubSearch can crawl also "cited by" publications transitively for you! You can export publications to citation formats. You can add your own format templates and publication database definitions.
See the link below for more details.
Access competitive interest rates on your digital assets.
Generate interest, borrow against your crypto, and trade a range of cryptocurrencies — all in one platform.
Geographic restrictions, eligibility, and terms apply.
Hier geht es um einen Webbrowser zum Kooperativen Surfen im Web. Dazu kann man sich an einem Server anmelden und Gruppen bilden zum Surfen. Alle Mitglieder einer solchen Gruppe sehen dann, wo die anderen gerade unterwegs sind und können sich gegenseitig ü
This project aims to build a suite of Natural Language Processing tools. Modules will include corpus indexing and access tools, a part-of-speech tagger, tokenisers, text classification software, etc.
A system to perform analysis of large documents for the purpose of cataloging similar documents. Similarity is based upon contextual analysis of these documents done by identifying common words and proper nouns.
XML documents To Generated dynamic web application supporting CRUD actions. Credits to Ministry of Culture and Communication, France; UNESCO; Ecole Nationale des Chartes, France; PASS-TECH, France.
Common Library is a learning content and document management system that enables dynamic outcome-based learning object alignment through semantic analysis & granular addressability. For implementation questions contact http://www.bluefletch.com.
Kneobase is an enterprise search engine, based upon the Lucene search engine and the Spring framework. It allows to perform full-text search across many different content sources. It is highly adaptable out-of-the-box and has a pluggable architecture.
=DOES NOT WORK ANYMORE AS DSA HAS PUT CAPTCHA= DSA Practical Driving Test Monitor helps you find any available practical driving test slot within specified date range. Runs on Linux/Mac/Windows and automates your manual task of finding the test slot.
Ex-Crawler is divided into 3 subprojects (Crawler Daemon, distributed gui Client, (web) search engine) which together provide a flexible and powerful search engine supporting distributed computing. More informations: http://ex-crawler.sourceforge.net
AIS - Associative Indexing Service, an application for storing bookmarks, memos, indexing of big (lifetime) archives for fast future access to the data by (personalized) keywords. In other words - it is an extension of human associative memory :)
The Semantic Web implementation using native xml database as backend storage. A SPARQL java compiler to XQuery using Jena. There are XQuery scripts for native xml database Sedna(http://modis.ispras.ru/sedna/).
The Cornell Web Lab Collaboration Server is a suite of tools and services for GUI-based extraction, analysis and sharing of archived web data. See http://weblab.infosci.cornell.edu/ and http://www.cs.cornell.edu/~weigel for details about the project.
iVia is an Internet subject portal or virtual library system. As a hybrid expert and machine built collection creation and management system, resources can be crawled and metadata and selected full-text can be automatically generated/extracted.
OAI4J is a Java library that implements a client API for the OAI-PMH standard specification from the Open Archives Initiative. It also has support for the upcoming OAI-ORE specification.
Contineo is a Web-based Document Management System (DMS). Features: Folder organization, document Versioning, Bulk import, import from mailbox. NOTE: this project has been DISMISSED in favor of LogicalDOC http://sourceforge.net/projects/logicaldoc