New to Google Cloud? Get $300 in credits to explore Compute Engine, BigQuery, Cloud Run, Gemini Enterprise Agent Platform, and more.
Start your next project with $300 in free Google Cloud credit. Spin up VMs, run containers, query petabytes in BigQuery, or build agents with Gemini Enterprise Agent Platform. Once your credits are used, keep building with 20+ always-free tier products including Compute Engine, Cloud Storage, GKE, and Cloud Run functions. No commitment required—just sign up and start building.
Claim $300 Free
Build Agents and Models on One Platform
Everything you need to build production-ready agents and models. Access 200+ Google and third-party AI models and tools.
Gemini Enterprise Agent Platform is Google Cloud's comprehensive platform for developers to build, scale, govern, and optimize agents and models. Choose from Google's most advanced models and third-party models like Anthropic's Claude Model Family.
Nutch is an open source search engine.WebSphere Information Integrator Content Edition(IICE) is an IBM product that used to integrate enterprise content management systems.Nutch-IICE is a plugin for Nutch and an enterprise content search solution.
DOSE: a distributed platform for semantic elaboration that provides semantic services such as automatic annotation of web resources at the document substructure level, semantic search facilities, semantic annotation storage and retrieval.
Unlimited organizations, 3 enterprise SSO connections, role-based access control, and pro MFA included. Dev and prod tenants out of the box.
Auth0's B2B Essentials plan gives you everything you need to ship secure multi-tenant apps. Unlimited orgs, enterprise SSO, RBAC, audit log streaming, and higher auth and API limits included. Add on M2M tokens, enterprise MFA, or additional SSO connections as you scale.
geoLucene is an extension of Lucene that allows to effectively index and search documents that contain locational information (longitude/latitude). It uses R-tree as a spacial index. See http://www.gossamer-threads.com/lists/lucene/java-dev/53378.
Sharehound is a network file systems indexer and searcher written in Java. Currently supports SMB file shares (i.e. MS Windows-based shares) and FTP resources. Web UI is used for search and crawl monitoring. RSS feed is provided for search results.
Compliant and Reliable File Transfers Backed by Top Security Certifications
Cerberus FTP Server delivers SOC 2 Type II certified security and FIPS 140-2 validated encryption.
Stop relying on non-certified, legacy file transfer tools that creak under the weight of modern security demands. Get full audit trails, advanced access controls and more supported by an award-winning team of experts. Start your free 25-day trial today.
Analyze and visualization of the social structuring from "lastfm.de" which contained user data, friendslist, groups, group-members and musical neighbours.
SearchSaver is a web search tool, that enables you to search multiple search engines simultaneously and export selected results to XML (RSS, Atom) or PDF files. It presents the search results in a tabbed interface, as well as tree-style explorer view.
The BigBlogZoo is a semantic news aggregator. More than 50,000 preloaded categorized feeds can be browsed, crawled, XSLT transformed, reaggregated and aggregated. By crowdsourcing we hope to create the world's largest collection of classified newsfeeds.
Open Source Semantic Web Search Engine Software: If two machines anywhere on the web can agree on the same definition of a digital service or digital good, then machine to machine transactions can use this lingua franca to transact on the users behalf.
OpenMKS is a search & navigational tool for large multimedia collections. With pluggable functionality and a core subsystem supporting the z39.50 ZING Community SRW search & retrieval specification, it can be run either as a Servlet or as a Web Service.
The Retrieval Component Integrator Project (RECOIN) intends to provide an extensible framework of Java classes to build a meta-search and information retrieval (IR) system based on heterogenous IR components as part of a modular retrieval process. The so
Piscator is a small SQL/XML search engine. Once an XML feed is loaded, it can be queried using plain SQL. The setup is almost identical to the DB2 side tables approach.
OpenSiteSearch is the new Open Source version of OCLC's original java-based web application for building Z39.50 portals (i.e. virtual union catalogues). This project is specifically aimed at the library community.
The development of this project has ended. Please take a look to Constellio Enterprise Search. Constellio is based on Apache Solr, Apache Tika, and google search appliance connectors. http://www.constellio.com
WebNews Crawler is a specific web crawler (spider, fetcher) designed to acquire and clean news articles from RSS and HTML pages. It can do a site specific extraction to extract the actual news content only, filtering out the advertising and other cruft.
The Informa library provides a convenient Java API for handling news channels and metadata about them. Different syntax formats (RSS 0.91, 1.0, 2.0 and Atom 0.3, 1.0) for feeds are supported. Also support for channel information descriptions (OPML) avail
Crawl-By-Example runs a crawl, which classifies the processed pages by subjects and finds the best pages according to examples provided by the operator. Crawl-By-Example is a plugin to the Heritrix crawler, and was done as a part of GSoC06 program.
This forum software is a Java based discussion forum, that uses JDBC to store data in a database. This discussion forum is available in different languages and has features for easy integration into a site and easy administration of forum.