Java library for working with real-world HTML
A fast, high-level web crawling and web scraping framework
CLI tool to save complete web pages as a single self-contained HTML file
Fast CLI tool for cloning entire websites for offline browsing
ML-based HTML scraper that learns extraction rules from examples
JavaScript + BeautifulSoup = JSSoup
Open source web crawler for Java
Lightweight Java web crawler framework with jQuery-style extraction
Compile your mobile web pages into mobile apps via build.phonegap.com