Memomics Forge is a meta-project for software that utilizes the Memomics Semantic Service.
Memomics Semantic Service provides semantic data which can be embedded in applications via webservices.
NewsRack is a tool/service that attempts to automate news monitoring. Based on user-specified definitions and rules, NewsRack will enable automated downloading, classification, filing, and long-term archiving of news.
GHIRL is the Graph-based Heterogeneous Information Representation Language: a Java library for representing, querying, and navigating graph- or network-based data structures.
JavaPub is a one-click-install BibTeX publications portal based on a simple Java codebase. It features a drag-and-drop uploader module for BibTeX files and a module that generates the HTML index and entry pages for publication listings.
Web-as-corpus tools in Java.
* Simple Crawler (and also integration with Nutch and Heritrix)
* HTML cleaner to remove boilerplate code
* Language recognition
* Corpus builder
Other spiders have a limited link depth, do not follow links in random order, or are combined with heavy indexing machinery. This spider has no link depth limit and picks the next URL at random; each fetched page is then checked for new URLs.
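The random selection described above can be sketched as a small URL frontier. This is an illustration only, not the project's code: the class name `RandomFrontier`, its methods, and the in-memory link map in `main` are all hypothetical, and no network access is performed.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Random;
import java.util.Set;

/** Hypothetical frontier that hands out pending URLs in random order, with no depth limit. */
class RandomFrontier {
    private final List<String> pending = new ArrayList<>();
    private final Set<String> seen = new HashSet<>();
    private final Random random;

    RandomFrontier(Random random) {
        this.random = random;
    }

    /** Queues a URL unless it was already seen; returns true if it was new. */
    boolean add(String url) {
        if (!seen.add(url)) {
            return false;
        }
        pending.add(url);
        return true;
    }

    boolean isEmpty() {
        return pending.isEmpty();
    }

    /** Removes and returns a uniformly random pending URL. */
    String next() {
        if (pending.isEmpty()) {
            throw new NoSuchElementException("frontier is empty");
        }
        int i = random.nextInt(pending.size());
        int last = pending.size() - 1;
        String url = pending.get(i);
        // Move the last element into the freed slot so removal stays O(1).
        pending.set(i, pending.get(last));
        pending.remove(last);
        return url;
    }

    public static void main(String[] args) {
        // Assumed stand-in for fetching a page and extracting its links.
        Map<String, List<String>> links = Map.of(
                "http://a.example/", List.of("http://b.example/", "http://c.example/"),
                "http://b.example/", List.of("http://a.example/"),
                "http://c.example/", List.of("http://b.example/"));

        RandomFrontier frontier = new RandomFrontier(new Random());
        frontier.add("http://a.example/");
        while (!frontier.isEmpty()) {
            String url = frontier.next();
            System.out.println("visiting " + url);
            for (String found : links.getOrDefault(url, List.of())) {
                frontier.add(found); // already-seen URLs are ignored
            }
        }
    }
}
```

Tracking seen URLs in a set keeps the crawl from looping even though nothing bounds the depth.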
nxs crawler is a program to crawl the internet. The program generates random IP addresses and attempts to connect to the hosts. If a host answers, the result is saved in an XML file. After that the crawler disconnects... Additionally you can
This project aims to implement an interface for BOSS, the open Yahoo search service.
We will then develop multiple new search features based on this API.
A Semantic Web implementation using a native XML database as backend storage. It includes a Java compiler from SPARQL to XQuery, built using Jena. XQuery scripts are provided for the native XML database Sedna (http://modis.ispras.ru/sedna/).
Java/Swish-e bridge. This application is built around a simple API and a Web container to provide access to the search facility (via web services) and to management/indexing (web app).
The Java-Sitemapper is a Java API for building sitemap files to improve search indexing on Google, Yahoo!, MSN, and Ask.com. This project strives to implement the latest in search technology for use on the Java platform.
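For readers unfamiliar with sitemap files, here is a minimal sketch of what such a file contains, following the public sitemaps.org 0.9 schema. It does not use or reproduce the Java-Sitemapper API; the class `SitemapSketch` and its methods are hypothetical names chosen for illustration.

```java
import java.util.List;

/** Hypothetical helper that renders a minimal sitemap document as a string. */
class SitemapSketch {

    /** Escapes the five characters the sitemap protocol requires to be entity-encoded. */
    static String escape(String s) {
        return s.replace("&", "&amp;") // must run first so later entities are not re-escaped
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&apos;");
    }

    /** Builds a urlset with one url/loc entry per input URL. */
    static String build(List<String> urls) {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        sb.append("<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">\n");
        for (String url : urls) {
            sb.append("  <url><loc>").append(escape(url)).append("</loc></url>\n");
        }
        sb.append("</urlset>\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(build(List.of(
                "http://www.example.com/",
                "http://www.example.com/catalog?item=12&desc=vacation")));
    }
}
```

Optional elements such as `lastmod`, `changefreq`, and `priority` are omitted here to keep the sketch short.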
Retriever is a simple crawler packed as a Java library that allows developers to collect and manipulate documents reachable by a variety of protocols (e.g. http, smb). You'll easily crawl documents shared in a LAN, on the Web, and many other sources.
Simple Porn Downloader is a tiny, all-Java application that uses a list of keywords and starting URLs to crawl web pages, branching out to search for specific media extensions, which are downloaded and presented in an HTML page.
A command-line application written in Java for automating downloads and filtering the contents of downloaded files. jDownloader uses a simple script file to configure the downloading and filtering processes.
A Java library that wraps the Google Search Appliance's search protocol XML API. The XML API is publicly available at: http://code.google.com/gsa_apis/xml_reference.html The homepage and tutorial for this project are at: http://gsa-japi.sf.net
JxtASK is a P2P system that aims to search, download, and share academic content hosted on websites that join the JxtASK community. Joining is simple: site admins must generate (even automatically) an XML catalog that describes the files.
Google() meets the Matrix. Red Piranha combines Lucene (Searching Ability), XML-RDF (ability to learn), Tomcat (for P2P Power) and Spring (Ease of use) to not only let you find anything, anywhere, but to actually understand what you are looking for.
SearchSaver is a web search tool that enables you to search multiple search engines simultaneously and export selected results to XML (RSS, Atom) or PDF files. It presents the search results in a tabbed interface, as well as a tree-style explorer view.
SearchSite is intended to support out-of-the-box search for small to medium websites, bridging the gap between simple PHP/Perl scripts at one extreme and something like Nutch, which is intended to deal with millions of pages, at the other.
Open Source Semantic Web Search Engine Software: If two machines anywhere on the web can agree on the same definition of a digital service or digital good, then machine-to-machine transactions can use this lingua franca to transact on the user's behalf.
OpenMKS is a search & navigational tool for large multimedia collections. With pluggable functionality and a core subsystem supporting the z39.50 ZING Community SRW search & retrieval specification, it can be run either as a Servlet or as a Web Service.
The Retrieval Component Integrator Project (RECOIN) intends to provide an extensible framework of Java classes to build a meta-search and information retrieval (IR) system based on heterogeneous IR components as part of a modular retrieval process. The so
Piscator is a small SQL/XML search engine. Once an XML feed is loaded, it can be queried using plain SQL. The setup is almost identical to the DB2 side tables approach.