scraping free download

14 projects for "scraping" with 2 filters applied:

Filters: Libraries, ChromeOS

1

X-Crawl

Flexible Node.js AI-assisted crawler library

A high-performance web crawling and scraping framework for Node.js, designed for large-scale data extraction.

Downloads: 0 This Week

Last Update: 2025-04-06
2

Python-Spider

Python3 web crawler practice

...As part of the author’s public learning-path repositories, python-spider likely includes examples of HTTP requests, HTML parsing, maybe concurrency or scheduling to crawl multiple pages, and techniques to handle common web-scraping issues. For people wanting to get hands-on with building scrapers, collecting data, or learning how to navigate web programming in Python, this repository acts as a didactic reference or starting point. Because it’s published publicly under an open license, users are free to fork and adapt the code.

Downloads: 0 This Week

Last Update: 2025-12-08
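
The HTML-parsing step mentioned in the description can be sketched with the Python standard library alone. This is a minimal illustration, not code from the python-spider repository; the `LinkCollector` class, the `extract_links` helper, and the sample page are all hypothetical names and data introduced here.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect absolute URLs from every <a href="..."> in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Parse an HTML string and return the links it contains, in document order."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical page content; a real crawler would fetch this over HTTP first.
SAMPLE_HTML = (
    '<html><body><a href="/page/2">Next</a> '
    '<a href="https://example.org/about">About</a></body></html>'
)
links = extract_links(SAMPLE_HTML, "https://example.com/page/1")
print(links)
```

Feeding the returned links back into a queue is the usual way such a parser becomes a multi-page crawler.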
3

Parsera

Lightweight library for scraping web-sites with LLMs

Scrape data from any website with only a link and column descriptions. Parsera is a tool designed to scrape web content, specifically handling poorly structured or messy websites.

Downloads: 1 This Week

Last Update: 2025-10-08
4

Symfony DomCrawler

Eases DOM navigation for HTML and XML documents

Symfony DomCrawler is a PHP component that provides powerful tools for navigating and extracting data from HTML and XML documents. It allows developers to parse, filter, and manipulate web pages using CSS selectors and XPath expressions. DomCrawler is widely used for web scraping, testing, and processing structured content, and integrates well with other Symfony components like BrowserKit.

Downloads: 0 This Week

Last Update: 3 days ago
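
To show what XPath-based extraction looks like in practice, here is a small sketch in Python using the standard library's `xml.etree.ElementTree`, which supports a limited XPath subset. It is not DomCrawler's PHP API; the markup and variable names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed markup; real-world HTML usually needs a lenient parser.
DOC = """
<html>
  <body>
    <div class="post"><h2>First post</h2><a href="/posts/1">Read</a></div>
    <div class="post"><h2>Second post</h2><a href="/posts/2">Read</a></div>
    <div class="sidebar"><a href="/about">About</a></div>
  </body>
</html>
"""

root = ET.fromstring(DOC)

# XPath-style query: every <div> whose class attribute is exactly "post".
posts = root.findall(".//div[@class='post']")

# For each match, read the heading text and the link target.
rows = [(post.findtext("h2"), post.find("a").get("href")) for post in posts]
print(rows)
```

The sidebar link is excluded because the predicate filters on the `class` attribute before any child lookup happens.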
5

Jikan REST

The REST API for Jikan

Jikan REST is an unofficial RESTful API for MyAnimeList.net, providing access to anime, manga, and user data by scraping the website. It allows developers to integrate MyAnimeList data into their applications without relying on the official API.

Downloads: 0 This Week

Last Update: 2025-04-25
6

Article Extractor

Extracts the main article from a given URL with Node.js

A Node.js library for extracting main content from web articles, removing unnecessary clutter like ads and navigation elements.

Downloads: 0 This Week

Last Update: 2026-05-03
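
The idea of keeping article text while dropping clutter can be sketched as follows. This is a deliberately simple heuristic written in Python, not the library's actual algorithm: the set of tags treated as clutter, the `MainTextExtractor` class, and the sample page are all assumptions.

```python
from html.parser import HTMLParser

# Tags treated as clutter in this sketch (an assumption, not the library's rule set).
CLUTTER_TAGS = {"nav", "aside", "header", "footer", "script", "style"}


class MainTextExtractor(HTMLParser):
    """Collect the text of <p> elements that are not nested inside clutter tags."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0        # greater than 0 while inside a clutter element
        self.in_paragraph = False
        self.paragraphs = []

    def handle_starttag(self, tag, attrs):
        if tag in CLUTTER_TAGS:
            self.skip_depth += 1
        elif tag == "p" and self.skip_depth == 0:
            self.in_paragraph = True
            self.paragraphs.append("")

    def handle_endtag(self, tag):
        if tag in CLUTTER_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1
        elif tag == "p":
            self.in_paragraph = False

    def handle_data(self, data):
        if self.in_paragraph and self.skip_depth == 0:
            self.paragraphs[-1] += data


def extract_main_text(html):
    """Return the non-empty paragraphs found outside clutter elements."""
    parser = MainTextExtractor()
    parser.feed(html)
    return [p.strip() for p in parser.paragraphs if p.strip()]


# Hypothetical page with navigation and an ad-like sidebar around the article.
PAGE = (
    "<html><body>"
    "<nav><p>Home | Login</p></nav>"
    "<article><h1>Title</h1><p>First paragraph.</p>"
    "<p>Second <b>bold</b> one.</p></article>"
    "<aside><p>Buy now!</p></aside>"
    "</body></html>"
)
paragraphs = extract_main_text(PAGE)
print(paragraphs)
```

Production extractors typically score blocks by text density and link ratio rather than relying on tag names alone.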
7

Helium

Lighter web automation with Python

...It replaces verbose boilerplate code with natural language-like API calls such as click("Login") or write("hello", into="Name"). Helium manages browser setup, waits, and teardown, enabling quick development of scripts for testing, scraping, or task automation without requiring deep Selenium knowledge.

Downloads: 0 This Week

Last Update: 2026-05-09
8

Spatie Crawler

An easy to use, powerful crawler implemented in PHP

Spatie Crawler is a PHP library that allows developers to crawl websites and extract information efficiently. It can be used for web scraping, link checking, or automated testing of web pages. The library is simple to use and supports customizable crawling strategies, including controlling crawl depth and handling redirects. It’s suitable for building crawlers that navigate large or dynamically generated websites.

Downloads: 0 This Week

Last Update: 2026-05-18
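
Crawl-depth control, mentioned in the description, usually amounts to a breadth-first traversal that stops expanding links past a limit. The sketch below shows that idea in Python over an in-memory link graph; it does not use Spatie Crawler's PHP API, and the `crawl` function and `LINKS` data are hypothetical.

```python
from collections import deque

# Hypothetical site structure: page -> pages it links to.
LINKS = {
    "/": ["/a", "/b"],
    "/a": ["/a/1", "/"],
    "/b": ["/b/1"],
    "/a/1": ["/a/1/x"],
    "/b/1": [],
    "/a/1/x": [],
}


def crawl(start, max_depth, get_links):
    """Visit pages breadth-first, never following links beyond max_depth."""
    seen = {start}
    queue = deque([(start, 0)])
    order = []
    while queue:
        url, depth = queue.popleft()
        order.append((url, depth))
        if depth == max_depth:
            continue  # Depth limit reached: record the page but do not expand it.
        for nxt in get_links(url):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return order


visited = crawl("/", 2, lambda url: LINKS.get(url, []))
print(visited)
```

With a limit of 2, the page `/a/1/x` at depth 3 is never reached, and the `seen` set keeps the back-link from `/a` to `/` from causing a loop.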
9

reCAPTCHA

PHP client library for reCAPTCHA, a free service

...The ecosystem supports mobile and enterprise variants, but the repo focuses on common web integrations and best practices for verifying the token securely. Deployed correctly, reCAPTCHA reduces credential stuffing, bot sign-ups, and scraping without degrading the experience for typical users.

Downloads: 2 This Week

Last Update: 2026-04-27
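
Server-side token verification can be sketched as below. The example is in Python rather than PHP and does not use this client library; it posts the `secret` and `response` fields to the `siteverify` endpoint and checks the `success` and `hostname` fields of the JSON reply. The `verify_token` function, its `send` parameter, and the fake transport are hypothetical, added so the sketch runs without network access.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

VERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"


def _post(url, body):
    """Default transport: POST the form body and return the response text."""
    with urlopen(Request(url, data=body), timeout=5) as resp:
        return resp.read().decode()


def verify_token(secret, token, expected_hostname, send=None):
    """Return True only if the service accepts the token for the expected hostname."""
    if not token:
        return False  # Nothing to verify; treat as a failed challenge.
    body = urlencode({"secret": secret, "response": token}).encode()
    result = json.loads((send or _post)(VERIFY_URL, body))
    # Check the hostname too, so a token solved on another site is rejected.
    return bool(result.get("success")) and result.get("hostname") == expected_hostname


def fake_send(url, body):
    """Hypothetical stand-in for the network call, used only for this demo."""
    return json.dumps({"success": True, "hostname": "example.com"})


ok = verify_token("server-secret", "token-from-form", "example.com", send=fake_send)
print(ok)
```

The secret key stays on the server in this flow; only the token produced in the browser travels with the form submission.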
10

User Agents

A JavaScript library for generating random user agents with data

User Agents is a JavaScript library that generates realistic and up-to-date user agent strings and browser fingerprints based on real-world usage data. The library is designed to help developers simulate authentic browser traffic patterns, which is particularly useful in web scraping, testing, and automation scenarios. Unlike simpler random user agent generators, it uses frequency-weighted datasets to ensure that generated values reflect how browsers are actually used in the wild. The dataset is updated automatically on a daily basis, ensuring that generated user agents remain current and relevant over time. ...

Downloads: 0 This Week

Last Update: 17 hours ago
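
Frequency-weighted selection, as opposed to uniform random choice, can be sketched in a few lines. This Python example is not the library's JavaScript API; the user agent strings and their weights are invented placeholders, and `pick_user_agent` is a hypothetical helper.

```python
import random
from collections import Counter

# Hypothetical usage shares; a real dataset would come from observed traffic.
USER_AGENTS = [
    ("ExampleBrowser/1.0 (desktop)", 0.6),
    ("ExampleBrowser/1.0 (mobile)", 0.3),
    ("OtherBrowser/2.0 (desktop)", 0.1),
]


def pick_user_agent(rng):
    """Draw one user agent, with probability proportional to its weight."""
    strings = [ua for ua, _ in USER_AGENTS]
    weights = [weight for _, weight in USER_AGENTS]
    return rng.choices(strings, weights=weights, k=1)[0]


# A seeded generator keeps the demo reproducible.
rng = random.Random(0)
counts = Counter(pick_user_agent(rng) for _ in range(10_000))
for ua, n in counts.most_common():
    print(f"{n:5d}  {ua}")
```

Over many draws the observed counts approach the configured shares, which is what distinguishes this approach from picking uniformly from a list.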
11

prometheus-net

.NET library to instrument your code with Prometheus metrics

This is a .NET library for instrumenting your applications and exporting metrics to Prometheus.

Downloads: 0 This Week

Last Update: 2024-11-15
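
This entry matches the "scraping" search because Prometheus collects metrics by scraping a text endpoint that the instrumented application exposes. The sketch below renders a counter in the Prometheus text exposition format using plain Python; it is not the prometheus-net API, and the `Counter` class and metric name are hypothetical.

```python
class Counter:
    """A minimal monotonically increasing metric with optional labels."""

    def __init__(self, name, help_text):
        self.name = name
        self.help_text = help_text
        self._values = {}  # sorted label tuple -> float

    def inc(self, amount=1.0, **labels):
        if amount < 0:
            raise ValueError("counters can only increase")
        key = tuple(sorted(labels.items()))
        self._values[key] = self._values.get(key, 0.0) + amount

    def render(self):
        """Return the metric in the text format a Prometheus server scrapes."""
        lines = [
            f"# HELP {self.name} {self.help_text}",
            f"# TYPE {self.name} counter",
        ]
        for key, value in sorted(self._values.items()):
            label_str = ",".join(f'{k}="{v}"' for k, v in key)
            suffix = f"{{{label_str}}}" if label_str else ""
            lines.append(f"{self.name}{suffix} {value}")
        return "\n".join(lines) + "\n"


requests_total = Counter("http_requests_total", "Total HTTP requests.")
requests_total.inc(method="GET")
requests_total.inc(method="GET")
requests_total.inc(method="POST")
output = requests_total.render()
print(output, end="")
```

An application would serve this text over HTTP, conventionally at a `/metrics` path, for the Prometheus server to poll.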
12

RobotsDisallowed

A curated list of the most common and most interesting robots.txt

RobotsDisallowed is a public collection of the Disallow entries harvested from the robots.txt files of popular websites. It ranks the disallowed paths by how often they appear and highlights the more interesting ones, so the lists can supplement content discovery during web security assessments. The project is aimed at security testers, researchers, and tool builders who want to see which paths site owners most often ask crawlers to avoid.

Downloads: 0 This Week

Last Update: 2025-10-28
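
Any catalog of this kind starts from the Disallow rules in robots.txt files. The sketch below shows one way to pull those rules out of raw robots.txt text and count how often each path appears. It is a Python illustration with invented sample files, not the project's own tooling; `disallowed_paths` and `SAMPLES` are hypothetical.

```python
from collections import Counter


def disallowed_paths(robots_txt):
    """Return the non-empty Disallow values from a robots.txt document."""
    paths = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # Drop comments and padding.
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        # An empty Disallow means "allow everything", so it is skipped.
        if field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


# Hypothetical robots.txt files keyed by placeholder hostnames.
SAMPLES = {
    "site-a.example": "User-agent: *\nDisallow: /admin/\nDisallow: /tmp/\n",
    "site-b.example": (
        "User-agent: *\nDisallow: /admin/  # private\nDisallow:\nAllow: /public/\n"
    ),
}

counts = Counter(
    path for text in SAMPLES.values() for path in disallowed_paths(text)
)
print(counts.most_common())
```

Grouping the same rules by their preceding `User-agent` line would show which crawlers each rule targets.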
13

Enlive

Selector-based templating and transformation system for Clojure

Enlive is a Clojure library for HTML templating, transformation, and scraping, supporting composable manipulation of HTML/XML in a functional style. It allows selecting, transforming, and generating HTML fragments using CSS selectors, and supports server-side template composition, dynamic pages, and content rewriting. By default selector-transformation pairs are run sequentially. When you know that several transformations are independent, you can now specify (as an optimization) to process them in lockstep. ...

Downloads: 0 This Week

Last Update: 2025-09-24
14

Node Crawler

Web Crawler/Spider for NodeJS + server-side jQuery

Most powerful, popular and production crawling/scraping package for Node, happy hacking.

Downloads: 0 This Week

Last Update: 2023-09-20