With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.
You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
Try free now
AI-powered service management for IT and enterprise teams
Enterprise-grade ITSM, for every business
Give your IT, operations, and business teams the ability to deliver exceptional services—without the complexity. Maximize operational efficiency with refreshingly simple, AI-powered Freshservice.
Decima is a database that was designed to support time-series data mining. It consists of PostgreSQL custom type definition, implementation of GiST index for that type and snowflake database schema.
The Java Sitemap Parser can parse a website's Sitemap (http://www.sitemaps.org/). This is useful for web crawlers that want to discover URLs from a website that is using the Sitemap Protocol.
This project has been incorporated into crawler-commons (https://github.com/crawler-commons/crawler-commons) and is no longer being maintained.
Retriever is a simple crawler packed as a Java library that allows developers to collect and manipulate documents reachable by a variety of protocols (e.g. http, smb). You'll easily crawl documents shared in a LAN, on the Web, and many other sources.
Turn Your Content into Interactive Magic - For Free
From Canva to Slides, Desmos to YouTube, Lumio works with the tech tools you are already using.
Transform anything you share into an engaging digital experience - for free. Instantly convert your PDFs, slides, and files into dynamic, interactive sessions with built-in collaboration tools, activities, and real-time assessment. From teaching to training to team building, make every presentation unforgettable. Used by millions for education, business, and professional development.
The DeDuplicator is an add-on module (plug-in) for the webcrawler Heritrix. It offers a means to reduce the amount of duplicate data collected in a series of snapshot crawls.
LogCrawler is an ANT task for automatic testing of web applications. Using a HTTP crawler it visits all pages of a website and checks the server logfiles for errors. Use it as a "smoketest" with your CI system like CruiseControl.
WebNews Crawler is a specific webcrawler (spider, fetcher) designed to acquire and clean news articles from RSS and HTML pages. It can do a site specific extraction to extract the actual news content only, filtering out the advertising and other cruft.
Crawl-By-Example runs a crawl, which classifies the processed pages by subjects and finds the best pages according to examples provided by the operator. Crawl-By-Example is a plugin to the Heritrix crawler, and was done as a part of GSoC06 program.
Focus on your application, and leave the database to us
Fully managed, cost-effective relational database service for PostgreSQL, MySQL, and SQL Server. Try Enterprise Plus edition for a 99.99% availability SLA and category-leading performance.
J-Obey is a Java Library/package, which allows people writing their own crawlers to have a stable Robots.txt parser, if you are writing a webcrawler of some sort you can use J-Obey to take out the hassle of writing a Robots.txt parser/intrepreter.
A configurable knowledge management framework. It works out of the box, but it's meant mainly as a framework to build complex information retrieval and analysis systems. The 3 major components: Crawler, Analyzer and Indexer can also be used separately.
SmartCrawler is a java-based fully configurable, multi-threaded and extensible crawler, which is able to fetch and analyze the contents of a web site by using dinamically pluggable filters
WebLoupe is a java-based tool for analysis, interactive visualization (sitemap), and exploration of the information architecture and specific properties of local or publicly accessible websites. Based on web spider (or webcrawler) technology.
Pödznsatch is a open and distributed hypergoogle of love. It is a semantic web application for social networking, word-of-mouth analysis and profiling. The Pödznsatch architecture includes a bot crawler, an inference engine and a query interface.
WebSPHINX is a webcrawler (robot, spider) Java class library, originally developed by Robert Miller of Carnegie Mellon University. Multithreaded, tollerant HTML parsing, URL filtering and page classification, pattern matching, mirroring, and more.
Spider is webcrawler written in the Java.Based on an Regular expression string the spider parses the internet for web pages matching this string and stores it in an MYSQL database.
studiMaps is a web based application for visualization and analysis of social networks. It consists of two software components: a web-crawler for getting data and the web based application for visualization.