data scrape free download

Showing 29 open source projects for "data scrape"

1

Linkedin Scraper

A library that scrapes LinkedIn for user data

Linkedin Scraper is a library that scrapes LinkedIn for user data. Versions 2.0.0 and earlier are called linkedin_user_scraper and can be installed via pip3 install --user linkedin_user_scraper. Because LinkedIn now blocks people from viewing certain profiles without signing in first, setting scrape=False stops the profile from being scraped automatically, although Chrome still opens the LinkedIn page.

Downloads: 0 This Week

Last Update: 2026-01-27
See Project
2

Parsera

Lightweight library for scraping websites with LLMs

Scrape data from any website with only a link and column descriptions. Parsera is a tool designed to scrape web content, specifically handling poorly structured or messy websites.

Downloads: 0 This Week

Last Update: 2025-10-08
See Project
3

Automa

A chrome extension for automating your browser by connecting blocks

...Dozens of workflows have been shared by Automa users, which you can add and customize. Auto-fill forms, do a repetitive task, take a screenshot, or scrape website data; the choice is yours. You can even schedule when the automation will execute! Browse the Automa marketplace, where you can share and download workflows with others.

Downloads: 27 This Week

Last Update: 2025-08-11
See Project
4

rvest

Simple web scraping for R

rvest helps you scrape (or harvest) data from web pages. It is designed to work with magrittr to make it easy to express common web scraping tasks, inspired by libraries like Beautiful Soup and RoboBrowser. If you’re scraping multiple pages, I highly recommend using rvest in concert with polite. The polite package ensures that you’re respecting the robots.txt file and not hammering the site with too many requests.

Downloads: 0 This Week

Last Update: 2025-08-29
See Project
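The rvest entry above mentions respecting robots.txt and limiting request rates. The idea is language-independent, so here is a minimal Python sketch using only the standard library. It is not the polite package's API; the robots.txt content, the `demo-bot` user agent, and the helper names are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real scraper would download it from the site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a parser that can answer permission queries."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed_urls(parser: RobotFileParser, user_agent: str, urls: list) -> list:
    """Keep only the URLs that robots.txt permits for this user agent."""
    return [url for url in urls if parser.can_fetch(user_agent, url)]


def crawl_delay(parser: RobotFileParser, user_agent: str, default: float = 1.0) -> float:
    """Return the Crawl-delay for this agent, or a default if none is declared."""
    delay = parser.crawl_delay(user_agent)
    return float(delay) if delay is not None else default


parser = build_parser(ROBOTS_TXT)
candidates = [
    "https://example.com/public/a",
    "https://example.com/private/b",
]
print(allowed_urls(parser, "demo-bot", candidates))
print(crawl_delay(parser, "demo-bot"))
```

A scraper built on this would sleep for the returned delay between requests and skip any URL that is filtered out.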
5

dude uncomplicated data extraction

dude uncomplicated data extraction: A simple framework

Dude is a very simple framework for writing web scrapers using Python decorators. The design, inspired by Flask, aims to make it easy to build a web scraper in just a few lines of code. Dude has an easy-to-learn syntax. Dude is currently in Pre-Alpha, so expect breaking changes. You can run your scraper from the terminal by supplying URLs, the output filename of your choice, and the paths to your Python scripts to the dude scrape command.

Downloads: 0 This Week

Last Update: 2024-03-02
See Project
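The Dude entry above describes a Flask-inspired, decorator-based style of writing scrapers. The sketch below illustrates that general pattern with the Python standard library only. It is not Dude's API: the `select` decorator, the `run` function, and the tag-name matching are all hypothetical simplifications.

```python
from html.parser import HTMLParser

# Registry mapping a tag name to the handler functions registered for it.
_HANDLERS = {}


def select(tag):
    """Register the decorated function as a handler for text inside <tag> elements."""
    def decorator(func):
        _HANDLERS.setdefault(tag, []).append(func)
        return func
    return decorator


class _Runner(HTMLParser):
    """Walks the HTML and calls registered handlers for matching tags."""

    def __init__(self):
        super().__init__()
        self._stack = []   # currently open tags, innermost last
        self.results = []

    def handle_starttag(self, tag, attrs):
        self._stack.append(tag)

    def handle_endtag(self, tag):
        if self._stack and self._stack[-1] == tag:
            self._stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text or not self._stack:
            return
        for func in _HANDLERS.get(self._stack[-1], []):
            self.results.append(func(text))


def run(html):
    """Feed an HTML string through every registered handler and collect the results."""
    runner = _Runner()
    runner.feed(html)
    runner.close()
    return runner.results


@select("h1")
def page_title(text):
    return {"title": text}


@select("a")
def link_text(text):
    return {"link": text}


print(run("<h1>Hello</h1><p>x <a href='/a'>First</a></p>"))
```

The sketch does not handle void elements such as `<br>` or real CSS selectors; it only shows how decorators can register extraction rules that a runner applies later.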
6

Scrapy

A fast, high-level web crawling and web scraping framework

Scrapy is a fast, open source, high-level framework for crawling websites and extracting structured data from these websites. Portable and written in Python, it can run on Windows, Linux, macOS and BSD. Scrapy is powerful, fast and simple, and also easily extensible. Simply write the rules to extract the data, and add new functionality if you wish without having to touch the core. Scrapy does the rest, and can be used in a number of applications. It can be used for data mining, monitoring...

Downloads: 20 This Week

Last Update: 4 days ago
See Project
7

Ferret

Declarative web scraping

A web scraping system aiming to simplify data extraction from the web. Ferret has a declarative query language that makes it easy to focus on the data that you need to get. Ferret has the ability to scrape JS-rendered pages, handle all page events, and emulate user interactions. Ferret was designed as a library from the ground up, so it can be easily embedded into any Go application. Ferret helps you to focus on the data you need using an easy-to-learn declarative language. Ferret uses Chrome/Chromium via the Chrome DevTools Protocol to handle dynamically rendered web pages. Ferret is extremely extensible, and creating custom functions and types is super easy. Ferret allows users to focus on the data. ...

Downloads: 0 This Week

Last Update: 2025-05-07
See Project
8

jsoup

Java library for working with real-world HTML

jsoup is a Java library for working with real-world HTML. It provides a very convenient API for fetching URLs and extracting and manipulating data, using the best of HTML5 DOM methods and CSS selectors. jsoup implements the WHATWG HTML5 specification, and parses HTML to the same DOM as modern browsers do. jsoup is designed to deal with all varieties of HTML found in the wild; from pristine and validating, to invalid tag-soup; jsoup will create a sensible parse tree. The parser will make...

Downloads: 3 This Week

Last Update: 2026-01-01
See Project
9

kimuraframework

AI-first Ruby framework for building fast, flexible web scraping spide

Kimurai is an open source web scraping framework written in Ruby that simplifies the process of building automated data extraction tools. It provides a clean domain-specific language that allows developers to define scraping logic and data schemas with minimal boilerplate code. Kimurai can use AI-assisted extraction to identify where data resides in HTML pages, automatically generating selectors that are cached for future use so subsequent scraping runs operate with pure Ruby performance....

Downloads: 0 This Week

Last Update: 2026-03-15
See Project
10

DenchClaw

Fully Managed OpenClaw Framework for all knowledge work ever

...One of its most distinctive capabilities is its ability to use the user’s existing browser session, enabling it to log into services, scrape data, and perform actions like outreach or research as if it were the user.

Downloads: 3 This Week

Last Update: 3 hours ago
See Project
11

Roach

The complete web scraping toolkit for PHP

Roach is a complete web scraping toolkit for PHP. It is a shameless clone heavily inspired by the popular Scrapy package for Python. Roach allows us to define spiders that crawl and scrape web documents. But wait, there’s more. Roach isn’t just a simple crawler, but includes an entire pipeline to clean, persist and otherwise process extracted data as well. It’s your all-in-one resource for web scraping in PHP. Roach doesn’t depend on a specific framework. Instead, you can use the core package on its own or install one of the framework-specific adapters. ...

Downloads: 0 This Week

Last Update: 2025-03-21
See Project
12

Firecrawl MCP Server

Adds powerful web scraping and search to Cursor and Claude

firecrawl-mcp-server is the official MCP integration for Firecrawl that brings high-recall web scraping, crawling, and search into IDEs and agent runtimes. It exposes tools for single-page scrape, multi-URL batch jobs, site discovery, and search enrichment, returning cleaned, structured content suitable for downstream LLM reasoning. The server is designed to run with Firecrawl’s hosted API or self-hosted deployments, making it flexible for enterprise data-governance requirements. Built-in behaviors include JavaScript rendering, automatic retries, and streamable HTTP so long pages and large crawls can flow incrementally into agents. ...

Downloads: 0 This Week

Last Update: 2025-10-08
See Project
13

ai-scrapper

🚀 Discover AI Web Scraper! 🚀 Tired of copying and pasting data from websites? I developed a desktop application with Electron and Gemini AI to extract structured data easily and efficiently! 🤖✨

1 Review

Downloads: 1 This Week

Last Update: 2025-05-31
See Project
14

Catbird Linux

Linux for content creation, web scraping, coding, and data analysis.

Catbird Linux is a USB-pluggable live Linux operating system built for media creation, web scraping, and software coding. It is the daily driver you want for retrieving data, making videos or podcasts, and making software tools to automate repetitive tasks. It is ready for work in the Python, Lua, and Go languages, with numerous packages for web scraping or downloading data via API calls. Using Catbird Linux, it is possible to accomplish in-depth stock market analysis, track weather...

Downloads: 45 This Week

Last Update: 2025-08-29
See Project
15

URS (Universal Reddit Scraper)

A comprehensive Reddit scraping command-line tool written in Python

Universal Reddit Scraper, a comprehensive Reddit scraping command-line tool written in Python. Whether you are using URS for enterprise or personal use, I am very interested in hearing about your use case and how it has helped you achieve a goal. This is a comprehensive Reddit scraping tool that integrates multiple features. All files except for those generated by the wordcloud tool are exported to JSON by default. Wordcloud files are exported to PNG by default. All exported files are saved...

Downloads: 0 This Week

Last Update: 2023-05-08
See Project
16

AutoScraper

A Smart, Automatic, Fast and Lightweight Web Scraper for Python

This project is made for automatic web scraping to make scraping easy. It gets a URL or the HTML content of a web page and a list of sample data that we want to scrape from that page. This data can be text, URL or any HTML tag value of that page. It learns the scraping rules and returns similar elements. Then you can use this learned object with new URLs to get similar content or the exact same element of those new pages.

Downloads: 0 This Week

Last Update: 2023-04-12
See Project
17

Ansible Role: prometheus

Deploy Prometheus monitoring system

Ansible-Prometheus is an Ansible role for automating the deployment and configuration of Prometheus monitoring systems.

Downloads: 0 This Week

Last Update: 2024-11-22
See Project
18

ruia

Async Python framework for fast and flexible web scraping spiders

Ruia is an asynchronous web scraping micro-framework built for Python that focuses on simplicity, speed, and flexibility when creating web crawlers. Ruia is powered by Python’s asyncio library along with aiohttp, enabling developers to perform concurrent network requests efficiently and scrape data from websites with minimal overhead. Ruia follows a “write less, run faster” philosophy, emphasizing concise code and streamlined spider development. It provides a structured approach to building scraping projects through components such as data items, spiders, middleware, and plugins. Developers can define structured fields to extract information from HTML content and process responses asynchronously to improve crawling performance. ...

Downloads: 1 This Week

Last Update: 2026-03-11
See Project
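The ruia entry above describes concurrent requests built on asyncio. The following is a minimal sketch of that idea using only the standard library. It is not ruia's API: `fake_fetch` is a hypothetical stand-in for a real HTTP call (for example through aiohttp), and `crawl` and `max_concurrency` are illustrative names.

```python
import asyncio


async def fake_fetch(url, delay=0.01):
    """Stand-in for a network request; a real spider would use an HTTP client here."""
    await asyncio.sleep(delay)
    return f"<html><title>{url}</title></html>"


async def crawl(urls, max_concurrency=3):
    """Fetch all URLs concurrently, with at most max_concurrency requests in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(url):
        async with semaphore:
            html = await fake_fetch(url)
        return {"url": url, "length": len(html)}

    # gather returns results in the same order as the input URLs.
    return await asyncio.gather(*(worker(url) for url in urls))


def run_crawl(urls):
    """Synchronous entry point that starts an event loop and runs the crawl."""
    return asyncio.run(crawl(urls))


if __name__ == "__main__":
    for row in run_crawl(["/a", "/b", "/c", "/d"]):
        print(row)
```

The semaphore caps how many requests run at once, which keeps concurrency from overwhelming the target site.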
19

X-RAY

The next web scraper, see through the <html> noise

Supports strings, arrays, arrays of objects, and nested object structures. The schema is not tied to the structure of the page you're scraping, allowing you to pull the data in the structure of your choosing. The API is entirely composable, giving you great flexibility in how you scrape each page. Paginate through websites, scraping each page. X-ray also supports a request delay and a pagination limit. Scraped pages can be streamed to a file, so if there's an error on one page, you won't lose what you've already scraped. ...

Downloads: 0 This Week

Last Update: 2021-10-05
See Project
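The X-RAY entry above mentions pagination with a page limit, a request delay, and streaming results to a file so earlier pages survive a later failure. Here is a minimal Python sketch of that combination. It is not X-ray's JavaScript API: the in-memory `PAGES` site, the `paginate` function, and its parameters are hypothetical.

```python
import io
import json
import time

# Hypothetical in-memory site: each page has items and an optional link to the next page.
PAGES = {
    "/page/1": {"items": ["a", "b"], "next": "/page/2"},
    "/page/2": {"items": ["c"], "next": "/page/3"},
    "/page/3": {"items": ["d"], "next": None},
}


def paginate(start, out, limit=2, delay=0.0, fetch=PAGES.get):
    """Follow 'next' links from start, writing one JSON line per page to out.

    Each page is written and flushed as soon as it is scraped, so a failure
    on a later page does not lose the pages already written.
    """
    url, visited = start, 0
    while url is not None and visited < limit:
        page = fetch(url)
        if page is None:
            break
        out.write(json.dumps({"url": url, "items": page["items"]}) + "\n")
        out.flush()
        visited += 1
        url = page["next"]
        if url is not None and visited < limit:
            time.sleep(delay)  # wait between requests
    return visited


buffer = io.StringIO()  # a real run would pass an open file instead
pages_written = paginate("/page/1", buffer, limit=2)
print(pages_written)
print(buffer.getvalue())
```

Writing one JSON object per line means a partially completed run still leaves a valid, readable output file.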
20

lazynlp

Library to scrape and clean web pages to create massive datasets

LazyNLP is a lightweight tool for collecting and curating large-scale text datasets for machine learning and NLP applications with minimal manual effort.

Downloads: 0 This Week

Last Update: 2025-01-22
See Project
21

SEO MACROSCOPE

SEO Macroscope is a website scanning tool, to check your website

...Generate and export text and XML sitemaps from the crawled pages. Analyze redirect chains. Use custom filters to verify the presence/absence of tracking tags. Use CSS Selectors, XPath Queries, and Regular Expressions to scrape website data.

Downloads: 0 This Week

Last Update: 2023-04-12
See Project
22

django-dynamic-scraper

Creating Scrapy scrapers via the Django admin interface

...Since it simplifies things, DDS is not usable for all kinds of scrapers, but it is well suited to the relatively common case of regularly scraping a website with a list of updated items (e.g. news, events, etc.) and then digging into the detail page to scrape more information for each item. Django Dynamic Scraper tries to keep its data structure in the database as separate as possible from the models in your app, so it comes with its own Django model classes for defining scrapers, runtime information related to your scraper runs and classes.

Downloads: 0 This Week

Last Update: 2022-09-05
See Project
23

google-play-scraper

Node.js scraper to get data from Google Play

Node.js module to scrape application data from the Google Play store. Retrieves the full detail of an application. Retrieves a list of applications from one of the collections at Google Play. Retrieves a list of apps that result from searching for the given term. Returns the list of applications by the given developer name. Given a string, returns up to five suggestions to complete a search query term.

Downloads: 0 This Week

Last Update: 2022-03-22
See Project
24

Simple-Scrape

Simple-Scrape is a simple web-scraping library that allows for programmatic access to HTML code. No further techniques are needed and the library is very compact and thus easy to use.

Downloads: 0 This Week

Last Update: 2017-04-28
See Project
25

Amazon ASIN Check

Get Price and Rating of list of ASIN numbers.

This program uses a list of Amazon Standard Identification Numbers (ASINs) to scrape the price, rating, and sold-by data of each product. Input is a text file and output is a CSV file. Written in C# and developed by B.J Erasmus. This project was commissioned by HNS2014 on freelancer.com

Downloads: 0 This Week

Last Update: 2014-07-01
See Project