MuSE-CIR is a Multigram-based Search Engine and Collaborative Information Retrieval system. Written in Java/JSP, it supports any JDBC-connectable database (thoroughly tested only with OracleXE, and somewhat with MySQL), with JSP running on Apache Tomcat 5.5.
Other spiders have a limited link depth, follow links in a non-randomized order, or are combined with heavy indexing machines. This spider has no link depth limit and randomly picks the next URL to check for new URLs.
nxs crawler is a program to crawl the internet. The program generates random IP addresses and attempts to connect to the hosts. If a host answers, the result is saved in an XML file. After that, the crawler disconnects... Additionally you can
JeCARS (Java Extendable Contents And Rights System) is a RESTful webservice which delivers pluggable output formats, e.g. Atom feeds or HTML.
Third party applications can be plugged in.
A JCR (JSR-170) repository (Jackrabbit) is used for storage.
A Semantic Web implementation using a native XML database as backend storage. It includes a SPARQL-to-XQuery compiler written in Java using Jena, and XQuery scripts for the native XML database Sedna (http://modis.ispras.ru/sedna/).
Java/Swish-e bridge. This application is built around a simple API and a Web container to provide access to the search facility (via web services) and to management/indexing (web app).
The Javen library is a framework for developing C++ applications simply, with an API similar to the Java library. The Hawk search engine is a software platform that makes it easier for mid-sized companies or end users to build vertical search products.
Contineo is a Web-based Document Management System (DMS). Features: folder organization, document versioning, bulk import, import from mailbox. NOTE: this project has been discontinued in favor of LogicalDOC http://sourceforge.net/projects/logicaldoc
Open Source Semantic Web Search Engine Software: If two machines anywhere on the web can agree on the same definition of a digital service or digital good, then machine-to-machine transactions can use this lingua franca to transact on the user's behalf.
OpenMKS is a search & navigational tool for large multimedia collections. With pluggable functionality and a core subsystem supporting the z39.50 ZING Community SRW search & retrieval specification, it can be run either as a Servlet or as a Web Service.
The Retrieval Component Integrator Project (RECOIN) intends to provide an extensible framework of Java classes to build a meta-search and information retrieval (IR) system based on heterogeneous IR components as part of a modular retrieval process. The so
Glue is a WSMO compliant discovery engine that aims at developing an efficient system for the management of semantically described Web Services and their discovery.
QuickWCM is a Web Content Manager (WCM) with a very easy-to-use web-based interface, seamless security model, integrated search engine and more. QuickWCM runs on a JSR-170 repository and is easy to extend with JSR-168 portlets.
LJ loader: a Java program to extract postings and comments from http://www.livejournal.com (blog) into a DB and view/classify/process them. Reusable components: a Perl-like but efficient Web page scraper, a tree analyzer, and a concurrent scheduler.
The goal of the project is to guide developers in designing Web applications that use various open source frameworks, such as Spring and Hibernate, to build a scalable, efficient and reliable Web application.
Spidertron is a multithreaded web crawling API for web sites of moderate size (hundreds of thousands of pages) that allows you to focus not on the crawling but on processing the information retrieved.
Catalogo is a system for cataloguing resources on a web site. It allows semantic search of information on an intranet using metadata, RDF and ontology concepts. It provides a Catalog server (Java web applications) and a Catalog client (Firefox plug-in).
XQEngine is a Java component for searching collections of XML documents that uses an XQuery front end. The engine has a straightforward API that allows it to be easily embedded in end user applications. Requires some basic Java programming skills.
The Jorne project develops software and open standards for linking Lojban text with WWW and Semantic Web metadata (e.g. RDF/N3, RSS, XML). Lojban is an artificial spoken and written language based on predicate logic.
A Java API (JAR library, with an example web GUI) for content management. Simple but powerful and based on the Apache Lucene project, it is meant to be embedded in projects requiring content management.
DVDWeb is a Web Service which provides organization/search/lookup services through a JAX-RPC API. The search can be done against the built-in DB (the user's private list of DVDs according to UPC codes) or against other Internet sites such as IMDb or Yahoo.