Turn entire websites into LLM-ready markdown or structured data
AI-ready web crawler that extracts and structures website content
dude uncomplicated data extraction: A simple framework
Declarative web scraping
Python & command-line tool to gather text on the Web
Open source web scraping system for automated data collection tasks
The undetected self-hosted browser automation platform
Python tool for crawling and extracting structured data from news sites
Python library for scraping and analyzing online news articles easily
Fast CLI web crawler for discovering endpoints in modern web apps
Cross-platform GUI tool for downloading videos from Bilibili sites
A fast, high-level web crawling and web scraping framework
Agentic browser; privacy-first alternative to ChatGPT Atlas
A browser testing and web crawling library for PHP and Symfony
Desktop tool for collecting and exporting Xiaohongshu post data
A Headless, Utility-First, and Zero-Runtime UI Component Library
High-performance Rust web crawler and scraper for large-scale data
The headless Chrome/Chromium driver on top of Puppeteer
A scalable web crawler framework for Java
Lightweight Ruby DSL for scraping structured data from web pages
Lightweight .NET framework for fast web crawling and data scraping
An adaptive web scraping framework
A powerful web scraper powered by LLMs | OpenAI, Gemini & Ollama
This is a public repository containing scrapers
Collection of Python web scraping scripts for data extraction tasks