Open-source, LLM-friendly web crawler and scraper
A scalable web crawler framework for Java
A web scraping and browser automation library for Node.js
Declarative web scraping
A lighter, faster browser kernel based on Blink for integrating HTML UI into apps
ACHE is a web crawler for domain-specific search
A powerful spider (web crawler) system in Python
Open source web crawler for Java
Perl Web Scraping Project
Open source Search Engine and Enterprise Search