Deploy pre-built tools that crawl websites, extract structured data, and feed your applications. Reliable web data without maintaining scrapers.
Automate web data collection with cloud tools that handle anti-bot measures, browser rendering, and data transformation out of the box. Extract content from any website, push to vector databases for RAG workflows, or pipe directly into your apps via API. Schedule runs, set up webhooks, and connect to your existing stack. Free tier available, then scale as you need to.
Explore 10,000+ tools
Lightspeed golf course management software
Lightspeed Golf is all-in-one golf course management software to help courses simplify operations, drive revenue and deliver amazing golf experiences.
From tee sheet management, point of sale and payment processing to marketing, automation, reporting and more—Lightspeed is built for the pro shop, restaurant, back office, beverage cart and beyond.
Auto Proxy Filter Test (APFT) automates the testing of safe and unsafe URLs against a content filtering proxy (such as Dansguardian) and helps prevent regressions. APFT is useful to people who are designing filter rules.
Brain Net. A Distributed Search Engine Network. Aims to be a scalable distributed search engine allowing dynamic connection of nodes. Performance depends on bandwidth.
Linx URL filter is a console filter used to retrieve and filter the text contents of web pages. Linx URL filter uses plugins to extract information like tags, links, mail addresses and scripts
Martus Solutions provides seamless budgeting, reporting, and forecasting tools that integrate with accounting systems for real-time financial insights
Martus' collaborative and easy-to-use budgeting and reporting platform will save you hundreds of hours each year. It's designed to make the entire budgeting process easier and create unlimited financial transparency.
Yngvi allows Apache Webmasters to determine filesystem locations that are accessible through the server -- including those which have not been explicitly published. It's good for finding inadvertent exposures or unintended configuration inheritances.
rss2imap is a tool which gets RSS (RDF Site Summary) from web and delivers to the IMAP server as an e-mail message. it enables you to use IMAP supported MUA as a RSS reader, and to unify mail check and site update check with ONE client.
JLinkCheck is an Ant Task written in Java for checking links in websites. It is not just checking one single page, but crawling a whole site like a spider, generating a report in XML and (X)HTML. JReptator will be its succesor with many more features
Toke is a webmining toolkit for web exploring, indexing and searching for Java. Toke allows to you crawl public or private web sites, in order to create web estatistics, web Pajek graphs, Lucene indexs and word frequency files for data clustering.
Landlords, multi-family homes, manufactured home communities, single family homes, associations, commercial properties and mixed portfolios.
Rent Manager is award-winning property management software built for residential, commercial, and short-term-stay portfolios of any size. The program’s fully customizable features include a double-entry accounting system, maintenance management/scheduling, marketing integration, mobile applications, more than 450 insightful reports, and an API that integrates with the best PropTech providers on the market.
Sperowider Website Archiving Suite is a set of Java applications, the primary purpose of which is to spider dynamic websites, and to create static distributable archives with a full text search index usable by an associated Java applet.
This project provides a system tray application that monitors the status of a project which uses a DART dashboard. Status is displayed by color-coded icons, and message dialogs alert the user when the build status changes.
InSite is a Web site management tool written in perl. It checks link integrity and does some basic content monitoring of your site's files directly on the local disk, which gives it a huge speed advantage over similar tools.
Robust featureful multi-threaded CLI web spider using apache commons httpclient v3.0 written in java. ASpider downloads any files matching your given mime-types from a website. Tries to reg.exp. match emails by default, logging all results using log4j.
404SEF is a component for Mambo CMS (4.5.x right now, 4.6.x soon) to provide Human Readable URLs. Works with apache and IIS. Provides proper 404 status code for missing content, logs 404 errors, and user-defined custom redirection via special shortcuts
Automatic link management program. Has three functions: List links in database in html format, add links to database using browser and optionaly check for bad links (by cron job). This eliminates the need for the "Report bad link" on too many web sites
Like social boomarking, allows users to share their bookmarks online. Like wiki, anyone can freely edit links. Export / Import boomarks with your browser. Many other features: RSS and Atom feeds, URL check, popular categories, XLIink, ...
The main function of this script is to shorten long website-URL's -- converting long URL's into easy-to-remember, short ones. [htaccess, MOD_REWRITE, XHTML 1.0 strict, CSS 1, JS 1.2, PHP 5X, MySQL 4X]
A content management system which allows web developers to create and organize a collection of URLs (a.k.a. - a link farm) using a searchable labeling system.
Programmable web client utilising HttpUnit with input & output files in XML. Eccles includes the ability to create a GUI to control/monitor the processing, and can be used for website testing as well as automating web transactions.
Bugkilla is a set of java tools for the functional test of J2EE Web Applications.
Specification and execution of tests will be automated for web front end and business logic layer.
One goal is to integrate with existing frameworks and tools.