Ruya is a Python-based, breadth-first, level-based, delayed, event-based crawler for English and Japanese websites. It is aimed solely at developers who want crawling functionality and crawl control in their projects through an API.
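As a rough illustration of what a breadth-first, level-limited, delayed, event-driven crawl looks like, here is a minimal sketch. It is not Ruya's API: the function name `crawl_breadth_first`, the `get_links` callback (which stands in for fetching a page and extracting its links), and the `on_visit` hook are all hypothetical names chosen for this example.

```python
import time
from collections import deque
from typing import Callable, Iterable


def crawl_breadth_first(
    start_url: str,
    get_links: Callable[[str], Iterable[str]],
    max_level: int = 2,
    delay_seconds: float = 0.0,
    on_visit: Callable[[str, int], None] = lambda url, level: None,
) -> list[tuple[str, int]]:
    """Visit pages level by level, starting from start_url at level 0.

    get_links is a hypothetical callback that returns the links found on a
    page; a real crawler would fetch and parse the page here.
    """
    visited = {start_url}
    queue = deque([(start_url, 0)])
    order = []
    while queue:
        url, level = queue.popleft()
        on_visit(url, level)  # event hook fired once per visited page
        order.append((url, level))
        if level == max_level:
            continue  # do not expand links beyond the level limit
        for link in get_links(url):
            if link not in visited:
                visited.add(link)
                queue.append((link, level + 1))
        if delay_seconds:
            time.sleep(delay_seconds)  # pause after expanding a page
    return order
```

Passing the link source in as a callback keeps the traversal logic separate from networking, so the same loop can be exercised against an in-memory dictionary of pages before it is pointed at a real site.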
Aracnis is a Java-based framework for building distributed web spiders. These spiders can be used for a variety of tasks, such as screen scraping and link integrity checking.
Auto Proxy Filter Test (APFT) automates the testing of safe and unsafe URLs against a content filtering proxy (such as Dansguardian) and helps prevent regressions. APFT is useful to people who are designing filter rules.
JLinkCheck is an Ant task written in Java for checking links in websites. Rather than checking a single page, it crawls a whole site like a spider and generates a report in XML and (X)HTML. JReptator will be its successor, with many more features.
Toke is a web mining toolkit for Java for exploring, indexing, and searching the web. Toke lets you crawl public or private websites to create web statistics, Pajek web graphs, Lucene indexes, and word frequency files for data clustering.
This project provides a system tray application that monitors the status of a project which uses a DART dashboard. Status is displayed by color-coded icons, and message dialogs alert the user when the build status changes.
InSite is a website management tool written in Perl. It checks link integrity and does some basic content monitoring of your site's files directly on the local disk, which gives it a huge speed advantage over similar tools.
404SEF is a component for Mambo CMS (4.5.x right now, 4.6.x soon) that provides human-readable URLs. It works with Apache and IIS, returns a proper 404 status code for missing content, logs 404 errors, and supports user-defined custom redirection via special shortcuts.
An automatic link management program with three functions: listing the links in the database in HTML format, adding links to the database from a browser, and optionally checking for bad links (via a cron job). This eliminates the need for the "Report bad link" feature found on too many websites.
Like social bookmarking, it allows users to share their bookmarks online. Like a wiki, it lets anyone freely edit links. Export and import bookmarks with your browser. Many other features: RSS and Atom feeds, URL check, popular categories, XLIink, ...
A content management system that allows web developers to create and organize a collection of URLs (a.k.a. a link farm) using a searchable labeling system.
Bugkilla is a set of Java tools for functional testing of J2EE web applications. Specification and execution of tests will be automated for the web front end and the business logic layer. One goal is to integrate with existing frameworks and tools.
Memephage is an automated web log (blog). It passively gathers and summarizes links from various places. Currently: IRC, social MUDs, e-mail, and web browsers. Uses the POE multitasking and networking framework for Perl.
Checks links in HTML files. Checks almost any tag attributes known to contain references to other resources. Supports multithreading. Written in Java. Frontends for Console and AWT available, Swing in development.
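To make the idea of checking "tag attributes known to contain references" concrete, here is a minimal Python sketch using the standard library's `html.parser`. It is not this project's code (which is written in Java); the `REFERENCE_ATTRIBUTES` set is an assumed, non-exhaustive list, and `ReferenceCollector` and `collect_references` are hypothetical names for this example.

```python
from html.parser import HTMLParser

# Assumed, non-exhaustive set of attributes that commonly hold references.
REFERENCE_ATTRIBUTES = {"href", "src", "action", "background", "cite", "longdesc"}


class ReferenceCollector(HTMLParser):
    """Collects (tag, attribute, value) triples for reference-bearing attributes."""

    def __init__(self):
        super().__init__()
        self.references = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        for name, value in attrs:
            if name in REFERENCE_ATTRIBUTES and value:
                self.references.append((tag, name, value))


def collect_references(html_text):
    """Return every reference found in html_text, in document order."""
    parser = ReferenceCollector()
    parser.feed(html_text)
    parser.close()
    return parser.references
```

Each collected value would then be resolved against the page's base URL and tested for reachability; since those checks are independent of one another, they could be spread across worker threads.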