scrape free download - SourceForge

Showing 22 open source projects for "scrape"

1

SpotiScrape

Downloads: 0 This Week

Last Update: 2023-10-30
See Project
2

Scrapy

A fast, high-level web crawling and web scraping framework

Scrapy is a fast, open source, high-level framework for crawling websites and extracting structured data from these websites. Portable and written in Python, it can run on Windows, Linux, macOS and BSD. Scrapy is powerful, fast and simple, and also easily extensible. Simply write the rules to extract the data, and add new functionality if you wish without having to touch the core. Scrapy does the rest, and can be used in a number of applications. It can be used for data mining, monitoring...

Downloads: 45 This Week

Last Update: 2024-06-21
See Project
3

Automa

A chrome extension for automating your browser by connecting blocks

.... There are dozens of workflows shared by Automa users that you can add and customize. Auto-fill forms, do a repetitive task, take a screenshot, or scrape website data; the choice is yours. You can even schedule when the automation will execute! Browse the Automa marketplace, where you can share and download workflows with others.

Downloads: 16 This Week

Last Update: 2024-02-06
See Project
4

crawlee

A web scraping and browser automation library for Node.js

... that make your crawlers look human-like. It's not unblockable, but it will save you money in the long run. Crawlee is built by people who scrape for a living and use it every day to scrape millions of pages. Meet our community on Discord. We believe websites are best scraped in the language they're written in. Crawlee runs on Node.js and it's built in TypeScript to improve code completion in your IDE, even if you don't use TypeScript yourself.

Downloads: 5 This Week

Last Update: 2024-10-04
See Project
5

Linkedin Scraper

A library that scrapes LinkedIn for user data

Linkedin Scraper is a library that scrapes LinkedIn for user data. Versions 2.0.0 and earlier are called linkedin_user_scraper and can be installed via pip3 install --user linkedin_user_scraper. The reason is that LinkedIn has recently blocked people from viewing certain profiles without having previously signed in. So by setting scrape=False, it doesn't automatically scrape the profile, but Chrome will open the LinkedIn page anyway. You can log in and log out, and the cookie will stay...

Downloads: 1 This Week

Last Update: 2023-07-04
See Project
6

rvest

Simple web scraping for R

rvest helps you scrape (or harvest) data from web pages. It is designed to work with magrittr to make it easy to express common web scraping tasks, inspired by libraries like Beautiful Soup and RoboBrowser. If you’re scraping multiple pages, I highly recommend using rvest in concert with polite. The polite package ensures that you’re respecting robots.txt and not hammering the site with too many requests.

Downloads: 1 This Week

Last Update: 2024-02-12
See Project
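
The rvest entry above recommends respecting robots.txt and pacing requests. As a rough illustration of that idea (not of rvest or polite themselves, which are R packages), here is a minimal Python sketch using the standard library's urllib.robotparser. The robots.txt content, the user agent name "example-bot", the URLs, and the helper names build_parser and polite_plan are all assumptions made for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, supplied inline so the sketch runs offline.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""


def build_parser(robots_text: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


def polite_plan(parser, user_agent, urls, default_delay=1.0):
    """Return the URLs the user agent may fetch and the delay to wait between them."""
    allowed = [url for url in urls if parser.can_fetch(user_agent, url)]
    delay = parser.crawl_delay(user_agent)
    # Fall back to a conservative default when no Crawl-delay is declared.
    return allowed, float(delay) if delay is not None else default_delay


parser = build_parser(ROBOTS_TXT)
allowed, delay = polite_plan(
    parser,
    "example-bot",  # hypothetical user agent name
    [
        "https://example.com/index.html",
        "https://example.com/private/data.html",
    ],
)
print(allowed, delay)
```

A real crawler would sleep for the returned delay between requests, for example with time.sleep(delay), and would download robots.txt from the target site instead of using an inline string.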
7

Roach

The complete web scraping toolkit for PHP

Roach is a complete web scraping toolkit for PHP. It is a shameless clone heavily inspired by the popular Scrapy package for Python. Roach allows us to define spiders that crawl and scrape web documents. But wait, there’s more. Roach isn’t just a simple crawler, but includes an entire pipeline to clean, persist and otherwise process extracted data as well. It’s your all-in-one resource for web scraping in PHP. Roach doesn’t depend on a specific framework. Instead, you can use the core package...

Downloads: 2 This Week

Last Update: 2024-04-04
See Project
8

jsoup

Java library for working with real-world HTML

jsoup is a Java library for working with real-world HTML. It provides a very convenient API for fetching URLs and extracting and manipulating data, using the best of HTML5 DOM methods and CSS selectors. jsoup implements the WHATWG HTML5 specification, and parses HTML to the same DOM as modern browsers do. jsoup is designed to deal with all varieties of HTML found in the wild; from pristine and validating, to invalid tag-soup; jsoup will create a sensible parse tree. The parser will make...

Downloads: 2 This Week

Last Update: 2024-07-10
See Project
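
The jsoup entry above describes parsing imperfect, real-world HTML into a usable tree. jsoup itself is a Java library; the sketch below only illustrates the same general idea, tolerant parsing of malformed markup, with Python's standard html.parser module. The sample markup and the names LinkExtractor and extract_links are assumptions for the example, and this is not jsoup's API.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect href values from <a> tags, even in badly formed markup."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names before calling this.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(html: str) -> list:
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical tag soup: unclosed tags, uppercase names, an unquoted attribute.
MESSY_HTML = '<p>Unclosed paragraph <a href="/one">one<div><A HREF=two.html>two</a></p>'

print(extract_links(MESSY_HTML))
```

Unlike jsoup, html.parser emits events and does not build a DOM or support CSS selectors, so this shows only the tolerance to invalid input, not the full feature set described in the entry.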
9

JobFunnel

Scrape job websites into a single spreadsheet with no duplicates.

Scrape job websites into a single spreadsheet with no duplicates. Automated tool for scraping job postings into a .csv file. You can search for jobs with YAML configuration files or by passing command arguments. By performing regular scraping and reviewing, you can cut through the noise of even the busiest job markets. Run funnel with your settings YAML to populate your master CSV file with jobs from available providers. JobFunnel can be easily automated to run nightly with crontab. If you have...

Downloads: 0 This Week

Last Update: 2024-09-29
See Project
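
The JobFunnel entry above describes collecting scraped postings into one CSV file without duplicates. The following is a minimal sketch of that general pattern in Python, not JobFunnel's actual implementation. The field names (title, company, url), the sample rows, and the helper names merge_without_duplicates and to_csv are assumptions for illustration.

```python
import csv
import io

# Assumed identity of a posting: two rows with the same values here are duplicates.
KEY_FIELDS = ("title", "company", "url")


def merge_without_duplicates(existing_rows, new_rows, key_fields=KEY_FIELDS):
    """Append rows from new_rows whose key is not already present."""
    seen = {tuple(row[f] for f in key_fields) for row in existing_rows}
    merged = list(existing_rows)
    for row in new_rows:
        key = tuple(row[f] for f in key_fields)
        if key not in seen:
            seen.add(key)
            merged.append(row)
    return merged


def to_csv(rows, fieldnames=KEY_FIELDS):
    """Serialize rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical data: the master file so far, and the result of a new scrape.
master = [{"title": "Data Engineer", "company": "Acme", "url": "https://example.com/1"}]
scraped = [
    {"title": "Data Engineer", "company": "Acme", "url": "https://example.com/1"},
    {"title": "Web Developer", "company": "Globex", "url": "https://example.com/2"},
]

merged = merge_without_duplicates(master, scraped)
print(to_csv(merged))
```

Keying on a tuple of fields keeps the check simple; a production tool would likely also need fuzzy matching for postings that are reworded or listed on several sites.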
10

Soketi

Just another simple, fast, and resilient open-source WebSockets server

Ever dreamed about Serverless WebSockets? Soketi can be deployed to Cloudflare Workers. All around the world, closer to your users. Same Pusher protocol. Powered by Cloudflare's Durable Objects and KV, you can achieve great speeds at edge for your users.

Downloads: 0 This Week

Last Update: 2024-03-25
See Project
11

Ferret

Declarative web scraping

A web scraping system aiming to simplify data extraction from the web. Ferret has a declarative query language that makes it easy to focus on the data that you need to get. Ferret can scrape JS-rendered pages, handle all page events, and emulate user interactions. Ferret was designed as a library from the ground up, so it can be easily embedded into any Go application. Ferret helps you focus on the data you need using an easy-to-learn declarative language. Ferret uses Chrome...

Downloads: 0 This Week

Last Update: 2023-03-28
See Project
12

dude uncomplicated data extraction

dude uncomplicated data extraction: A simple framework

Dude is a very simple framework for writing web scrapers using Python decorators. The design, inspired by Flask, aims to let you build a web scraper in just a few lines of code. Dude has an easy-to-learn syntax. Dude is currently in Pre-Alpha, so expect breaking changes. You can run your scraper from the terminal by supplying URLs, the output filename of your choice, and the paths to your Python scripts to the dude scrape command.

Downloads: 0 This Week

Last Update: 2024-03-02
See Project
13

AutoScraper

A Smart, Automatic, Fast and Lightweight Web Scraper for Python

This project is made for automatic web scraping to make scraping easy. It gets a URL or the HTML content of a web page and a list of sample data that we want to scrape from that page. This data can be text, URL or any HTML tag value of that page. It learns the scraping rules and returns similar elements. Then you can use this learned object with new URLs to get similar content or the exact same element of those new pages.

Downloads: 2 This Week

Last Update: 2023-04-12
See Project
14

X-RAY

The next web scraper, see through the <html> noise

Supports strings, arrays, arrays of objects, and nested object structures. The schema is not tied to the structure of the page you're scraping, allowing you to pull the data in the structure of your choosing. The API is entirely composable, giving you great flexibility in how you scrape each page. Paginate through websites, scraping each page. X-ray also supports a request delay and a pagination limit. Scraped pages can be streamed to a file, so if there's an error on one page, you won't lose...

Downloads: 1 This Week

Last Update: 2021-10-05
See Project
15

SEO MACROSCOPE

SEO Macroscope is a website scanning tool, to check your website

... pages. Export reports to Excel and CSV formats. Generate and export text and XML sitemaps from the crawled pages. Analyze redirect chains. Use custom filters to verify the presence/absence of tracking tags. Use CSS Selectors, XPath Queries, and Regular Expressions to scrape website data.

Downloads: 0 This Week

Last Update: 2023-04-12
See Project
16

django-dynamic-scraper

Creating Scrapy scrapers via the Django admin interface

..., but it is well suited for the relatively common case of regularly scraping a website with a list of updated items (e.g. news, events, etc.) and then digging into the detail page to scrape more information for each item. Django Dynamic Scraper tries to keep its data structure in the database as separate as possible from the models in your app, so it comes with its own Django model classes for defining scrapers, runtime information related to your scraper runs and classes.

Downloads: 0 This Week

Last Update: 2022-09-05
See Project
17

Simple-Scrape

Simple-Scrape is a simple web-scraping library that allows for programmatic access to HTML code. No further techniques are needed and the library is very compact and thus easy to use.

Downloads: 0 This Week

Last Update: 2017-04-28
See Project
18

chrome-extensions-examples

All Chrome Extension examples collected into one repository

This is not an official mirror of the Chrome extension examples. Report any issues with the examples themselves to Google's issue trackers/forums. The Chrome Extensions examples did not exist as a Git repository, and browsing both the samples page and the VCViewer did not seem particularly handy. So, I decided to scrape the content into this repository for easier browsing and (possible) editing.

Downloads: 0 This Week

Last Update: 2022-05-11
See Project
19

Scra.php

Scrape anything!

The ultimate customisable YAML-ised Web Scraper for PHP

Downloads: 0 This Week

Last Update: 2014-01-20
See Project
20

PHP Nuke BitTorrent Module

PHP Nuke BitTorrent Module is a PHP Nuke module that allows webmasters to host a fully functional BitTorrent archive on their portals. Torrent upload, automatic scrape, comments, ratings, privacy rules, internal tracker and much more!

Downloads: 0 This Week

Last Update: 2013-03-22
See Project
21

dataflowkit

Golang framework for scraping data from web pages

Golang Web Scraper library for extracting data from web pages. Save results as CSV, JSON, XML

Downloads: 0 This Week

Last Update: 2018-03-09
See Project
22

Blackfire Player

Web Crawling, Web Testing, and Web Scraping application

Blackfire Player is a powerful Web Crawling, Web Testing, and Web Scraper application. It provides a nice DSL to crawl HTTP services, assert responses, and extract data from HTML/XML/JSON responses. Some Blackfire Player use cases: Crawl a website/API and check expectations -- aka Acceptance Tests; Scrape a website/API and extract values; Monitor a website; Test code with unit test integration (PHPUnit, Behat, Codeception, ...); Test code behavior from the outside thanks to the native...

Downloads: 0 This Week

Last Update: 2019-06-11
See Project