The purpose of this project is to implement a generic search engine for object-oriented domain models.
This project is based on published work by the author and is intended to become the author's degree thesis project.
Memomics Forge is a meta-project for software that utilizes the Memomics Semantic Service.
Memomics Semantic Service provides semantic data that can be embedded in applications via web services.
A PHP library/framework for the development of websites. The main features are: database independence, template-driven content, theme-able content generation, integrated WML generation, user content management, Lucene server integration.
Web-as-corpus tools in Java.
* Simple Crawler (and also integration with Nutch and Heritrix)
* HTML cleaner to remove boilerplate code
* Language recognition
* Corpus builder
This project aims to implement an interface for Yahoo's open search service, BOSS.
We will then develop several new search features based on this API.
This is an ***old archive*** of tools developed for facilitating the use of Creative Commons licenses and metadata. --- For the most up to date representation of any of the projects listed here, please see: http://creativecommons.org/project/Developer.
The WhereIsNow Web Service Client Library project is a Java library used to query the WhereIsNow web services. You can freely embed it in your code to develop new clients and integrate the WhereIsNow features into your own applications.
Jbox is a Java full-text search engine framework. It is not a complete application, but rather a code library and API that can easily be used for constructing a search engine.
The Cornell Web Lab Collaboration Server is a suite of tools and services for GUI-based extraction, analysis and sharing of archived web data. See http://weblab.infosci.cornell.edu/ and http://www.cs.cornell.edu/~weigel for details about the project.
High-performance faceted/parametric search implementation that handles various types of semi-structured data. Written in Java. * We have moved to Google Code: http://code.google.com/p/browse-engine. This page is to be deprecated.
The Java-Sitemapper is a Java API for building sitemap files to improve search indexing on Google, Yahoo!, MSN, and Ask.com. This project strives to implement the latest in search technology for use on the Java platform.
Contineo is a web-based Document Management System (DMS). Features: folder organization, document versioning, bulk import, import from mailbox. NOTE: this project has been discontinued in favor of LogicalDOC http://sourceforge.net/projects/logicaldoc
FlixFinder: a TiVo and Netflix marriage. Automatically finds and schedules upcoming movies in cable/satellite listings based on your Netflix queue. Now a Greasemonkey script. (The original project is deprecated since the TV listings are no longer available.)
This project aims to extract keywords from documents, either as files or on the Internet. It applies a sophisticated keyword-ranking algorithm to extract the most relevant keywords for a document and can also find similar documents in
This project tries to find geographical locations (expressed as GPS positions) for specific queries (e.g. 'art') in specific environments (e.g. 'Maastricht') by analyzing web pages.
Info Hub is an open source, web-based information repository and search engine. It supports browsing and keyword search over documents and external links. It is well suited to managing project-related data.
A Java library that parses the latest freely available RDF files from DMOZ (Open Directory Project) and inserts them into any JDBC-compliant relational database (e.g. MySQL and PostgreSQL, with others such as Oracle, MS Access, and SQLite to come).
The search aggregator allows users to initiate searches across multiple applications and receive aggregated results. This project is based on Lucene, is written in Java, exposes web and plugin interfaces, and supports the OpenSearch and JSON standards.
The Retrieval Component Integrator Project (RECOIN) intends to provide an extensible framework of Java classes to build a meta-search and information retrieval (IR) system based on heterogeneous IR components as part of a modular retrieval process. The so
JAMP provides several functions to index and manage your media files on resources such as storage systems or DVDs. The user interface is web-based and written entirely in Java.
OpenSiteSearch is the new open source version of OCLC's original Java-based web application for building Z39.50 portals (i.e. virtual union catalogues). This project is specifically aimed at the library community.
The development of this project has ended. Please take a look at Constellio Enterprise Search. Constellio is based on Apache Solr, Apache Tika, and Google Search Appliance connectors. http://www.constellio.com