scrape free download - SourceForge

Showing 76 open source projects for "scrape"

1

SpotiScrape

Downloads: 0 This Week

Last Update: 2023-10-30
See Project
2

Scrapy

A fast, high-level web crawling and web scraping framework

Scrapy is a fast, open source, high-level framework for crawling websites and extracting structured data from these websites. Portable and written in Python, it can run on Windows, Linux, macOS and BSD. Scrapy is powerful, fast and simple, and also easily extensible. Simply write the rules to extract the data, and add new functionality if you wish without having to touch the core. Scrapy does the rest, and can be used in a number of applications. It can be used for data mining, monitoring...

Downloads: 34 This Week

Last Update: 2024-06-21
See Project
3

DocSearch

The easiest way to add search to your documentation

... with the interaction patterns of each OS. We scrape your documentation or technical blog, configure the Algolia application and send you the snippet you'll have to integrate. It's that simple. You don't need to configure any settings or even have an Algolia account. We take care of this for you! We'll send you a small snippet to integrate DocSearch to your website and an invite to your fully configured Algolia application.

Downloads: 4 This Week

Last Update: 2024-07-16
See Project
4

JMX Exporter

A process for exposing JMX Beans via HTTP for Prometheus consumption

JMX to Prometheus exporter: a collector that can configurably scrape and expose mBeans of a JMX target. This exporter is intended to be run as a Java agent, exposing an HTTP server and serving metrics of the local JVM. It can also be run as a standalone HTTP server and scrape remote JMX targets, but this has various disadvantages, such as being harder to configure and being unable to expose process metrics (e.g., memory and CPU usage). Running the exporter as a Java agent is strongly encouraged.

Downloads: 3 This Week

Last Update: 2024-05-31
See Project
5

rvest

Simple web scraping for R

rvest helps you scrape (or harvest) data from web pages. It is designed to work with magrittr to make it easy to express common web scraping tasks, inspired by libraries like Beautiful Soup and RoboBrowser. If you’re scraping multiple pages, I highly recommend using rvest in concert with polite. The polite package ensures that you’re respecting the robots.txt and not hammering the site with too many requests.

Downloads: 3 This Week

Last Update: 2024-02-12
See Project
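
The rvest entry above points to respecting robots.txt and pacing requests. As a rough illustration of that idea (not of rvest or polite themselves, which are R packages), here is a minimal Python sketch using the standard library's `urllib.robotparser`. The robots.txt content, the `my-scraper` user agent, the example.com URLs, and the helper names are all hypothetical and chosen only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the sketch runs offline.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed_urls(parser: RobotFileParser, user_agent: str, urls: list[str]) -> list[str]:
    """Keep only the URLs that robots.txt permits for this user agent."""
    return [url for url in urls if parser.can_fetch(user_agent, url)]


parser = build_parser(ROBOTS_TXT)
urls = [
    "https://example.com/docs/page1",    # hypothetical, allowed
    "https://example.com/private/data",  # hypothetical, disallowed
]
permitted = allowed_urls(parser, "my-scraper", urls)

# Crawl-delay (in seconds), if the site declares one; a polite scraper
# would sleep at least this long between requests.
delay = parser.crawl_delay("my-scraper")
```

In a real scraper the rules would come from the target site's own robots.txt (for example via `set_url` and `read`), and the delay would be applied with `time.sleep` between requests.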
6

Elasticsearch Exporter

Elasticsearch stats exporter for Prometheus

Prometheus exporter for various metrics about Elasticsearch, written in Go. The exporter fetches information from an Elasticsearch cluster on every scrape, so too short a scrape interval can impose load on ES master nodes, particularly if you run with --es.all and --es.indices. We suggest you measure how long fetching /_nodes/stats and /_all/_stats takes for your ES cluster to determine whether your scraping interval is too short. As a last resort, you can scrape this exporter...

Downloads: 2 This Week

Last Update: 2023-12-21
See Project
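
The Elasticsearch Exporter entry above suggests measuring how long /_nodes/stats and /_all/_stats take before settling on a scrape interval. One way to do that measurement might look like the Python sketch below. It is not part of the exporter; the helper names, the `http://localhost:9200` address in the usage note, and the rule "total fetch time at or above the interval means the interval is too short" are assumptions made for illustration.

```python
import time
import urllib.request
from typing import Callable, Iterable


def fetch(url: str, timeout: float = 30.0) -> bytes:
    """Fetch a URL and return the response body."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def time_fetch(fetcher: Callable[[str], object], url: str) -> float:
    """Return the wall-clock seconds a single fetch of `url` takes."""
    start = time.perf_counter()
    fetcher(url)
    return time.perf_counter() - start


def interval_too_short(durations: Iterable[float], scrape_interval_s: float) -> bool:
    """Assumed heuristic: the interval is too short if the stats fetches
    together take at least as long as the interval itself."""
    return sum(durations) >= scrape_interval_s
```

Against a real cluster, usage could be along the lines of `durations = [time_fetch(fetch, base + path) for path in ("/_nodes/stats", "/_all/_stats")]` with `base = "http://localhost:9200"` (an assumed address), followed by `interval_too_short(durations, 15.0)` for a 15-second interval. Repeating the measurement several times under normal load gives a more reliable picture than a single sample.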
7

jsoup

Java library for working with real-world HTML

jsoup is a Java library for working with real-world HTML. It provides a very convenient API for fetching URLs and extracting and manipulating data, using the best of HTML5 DOM methods and CSS selectors. jsoup implements the WHATWG HTML5 specification, and parses HTML to the same DOM as modern browsers do. jsoup is designed to deal with all varieties of HTML found in the wild; from pristine and validating, to invalid tag-soup; jsoup will create a sensible parse tree. The parser will make...

Downloads: 4 This Week

Last Update: 2024-07-10
See Project
8

Automa

A Chrome extension for automating your browser by connecting blocks

... There are dozens of workflows shared by Automa users, which you can add and customize. Auto-fill forms, do a repetitive task, take a screenshot, or scrape website data; the choice is yours. You can even schedule when the automation will execute! Browse the Automa marketplace, where you can share and download workflows with others.

Downloads: 2 This Week

Last Update: 2024-02-06
See Project
9

Artisan View

Manage your views in Laravel projects through artisan

This package adds a handful of view-related commands to Artisan in your Laravel project. Generate blade files that extend other views, scaffold out sections to add to those templates, and more. All from the command line we know and love.

Downloads: 2 This Week

Last Update: 2024-04-22
See Project
10

Linkedin Scraper

A library that scrapes LinkedIn for user data

Linkedin Scraper is a library that scrapes LinkedIn for user data. Versions 2.0.0 and earlier are called linkedin_user_scraper and can be installed via pip3 install --user linkedin_user_scraper. The reason is that LinkedIn has recently blocked people from viewing certain profiles without having previously signed in. So by setting scrape=False, it doesn't automatically scrape the profile, but Chrome will open the LinkedIn page anyway. You can log in and log out, and the cookie will stay...

Downloads: 1 This Week

Last Update: 2023-07-04
See Project
11

crawlee

A web scraping and browser automation library for Node.js

... that make your crawlers look human-like. It's not unblockable, but it will save you money in the long run. Crawlee is built by people who scrape for a living and use it every day to scrape millions of pages. Meet our community on Discord. We believe websites are best scraped in the language they're written in. Crawlee runs on Node.js and it's built in TypeScript to improve code completion in your IDE, even if you don't use TypeScript yourself.

Downloads: 0 This Week

Last Update: 2024-07-24
See Project
12

URS (Universal Reddit Scraper)

A comprehensive Reddit scraping command-line tool written in Python

Universal Reddit Scraper, a comprehensive Reddit scraping command-line tool written in Python. Whether you are using URS for enterprise or personal use, I am very interested in hearing about your use case and how it has helped you achieve a goal. This is a comprehensive Reddit scraping tool that integrates multiple features. All files except for those generated by the wordcloud tool are exported to JSON by default. Wordcloud files are exported to PNG by default. All exported files are saved...

Downloads: 0 This Week

Last Update: 2023-05-08
See Project
13

Prometheus Redis Metrics Exporter

Prometheus Exporter for Redis Metrics. Supports Redis 2.x, 3.x, 4.x, 5

... for the Redis instances then you can set the password via the --redis.password command line option of the exporter (this means you can currently only use one password across the instances you try to scrape this way; use several exporters if this is a problem). If your Redis instance requires authentication, there are several ways to supply a username (new in Redis 6.x with ACLs) and a password.

Downloads: 0 This Week

Last Update: 2024-07-17
See Project
14

Ferret

Declarative web scraping

A web scraping system aiming to simplify data extraction from the web. Ferret has a declarative query language that makes it easy to focus on the data that you need to get. Ferret can scrape JS-rendered pages, handle all page events, and emulate user interactions. Ferret was designed as a library from the ground up, so it can be easily embedded into any Go application. Ferret helps you focus on the data you need using an easy-to-learn declarative language. Ferret uses Chrome...

Downloads: 1 This Week

Last Update: 2023-03-28
See Project
15

Soketi

Just another simple, fast, and resilient open-source WebSockets server

Ever dreamed about Serverless WebSockets? Soketi can be deployed to Cloudflare Workers. All around the world, closer to your users. Same Pusher protocol. Powered by Cloudflare's Durable Objects and KV, you can achieve great speeds at edge for your users.

Downloads: 0 This Week

Last Update: 2024-03-25
See Project
16

Rod

A Devtools driver for web automation and scraping

Rod is a high-level driver for DevTools Protocol. It's widely used for web automation and scraping. Rod can automate most things in the browser that can be done manually. Chained context design, intuitive to timeout or cancel the long-running task. Auto-wait elements to be ready. Debugging friendly, auto input tracing, remote monitoring headless browser. Thread-safe for all operations. Automatically find or download browser. High-level helpers like WaitStable, WaitRequestIdle,...

Downloads: 0 This Week

Last Update: 2024-07-12
See Project
17

dude uncomplicated data extraction

dude uncomplicated data extraction: A simple framework

Dude is a very simple framework for writing web scrapers using Python decorators. The design, inspired by Flask, aims to make it easy to build a web scraper in just a few lines of code. Dude has an easy-to-learn syntax. Dude is currently in Pre-Alpha. Please expect breaking changes. You can run your scraper from the terminal/shell/command line by supplying URLs, the output filename of your choice, and the paths to your Python scripts to the dude scrape command.

Downloads: 0 This Week

Last Update: 2024-03-02
See Project
18

Roach

The complete web scraping toolkit for PHP

Roach is a complete web scraping toolkit for PHP. It is a shameless clone heavily inspired by the popular Scrapy package for Python. Roach allows us to define spiders that crawl and scrape web documents. But wait, there’s more. Roach isn’t just a simple crawler, but includes an entire pipeline to clean, persist and otherwise process extracted data as well. It’s your all-in-one resource for web scraping in PHP. Roach doesn’t depend on a specific framework. Instead, you can use the core package...

Downloads: 0 This Week

Last Update: 2024-04-04
See Project
19

AutoScraper

A Smart, Automatic, Fast and Lightweight Web Scraper for Python

This project is made for automatic web scraping to make scraping easy. It gets a URL or the HTML content of a web page and a list of sample data that we want to scrape from that page. This data can be text, URL or any HTML tag value of that page. It learns the scraping rules and returns similar elements. Then you can use this learned object with new URLs to get similar content or the exact same element of those new pages.

Downloads: 2 This Week

Last Update: 2023-04-12
See Project
20

Instagram Scraper

Scrapes an Instagram user's photos and videos

instagram-scraper is a command-line application written in Python that scrapes and downloads an Instagram user's photos and videos. Use responsibly. To scrape a private user's media you must be an approved follower. Providing a username and password is optional; if they are not supplied, the scraper runs as a guest. In this case all private users' media will be unavailable. All users' stories and high-resolution profile pictures will also be unavailable. By default, downloaded media will be placed...

Downloads: 3 This Week

Last Update: 2022-06-17
See Project
21

MedGui Reborn & MetroMed

MedGui Reborn is a frontend (GUI) for Mednafen multi emulator.

MedGui Reborn is a frontend (GUI) for the Mednafen multi emulator, written in Microsoft Visual Studio Community. MetroMed is an appendix to MedGui Reborn and offers a modern "metro" style GUI for Mednafen. The programs are the evolution of MedGui and include more features:

Downloads: 320 This Week

Last Update: 2024-05-11
See Project
22

htmLawed

PHP code to purify & filter HTML

The htmLawed PHP script makes HTML more secure and standards- & policy-compliant. The customizable HTML filter/purifier can balance tags, ensure proper nestings, neutralize XSS, restrict HTML, beautify code like Tidy, implement anti-spam measures, etc.

1 Review

Downloads: 79 This Week

Last Update: 2023-08-05
See Project
23

Email Scraper and Validator

This is a simple desktop application built with Python and Tkinter that allows users to scrape email addresses from websites and validate them using an external API. It also provides features to save the scraped emails to a database, and export the data to various file formats. 1. Enter a list of website URLs or emails in the input field. 2. Click the Scrape button to scrape email addresses from the provided websites. 3. Click the Validate button to validate the scraped email addresses. 4. Use...

Downloads: 0 This Week

Last Update: 2024-03-03
See Project
24

Overdrive Ebook Scraper

Perform OCR on an Overdrive Read ebook to convert it to plain text.

Downloads: 0 This Week

Last Update: 2024-01-30
See Project
25

scraper-with-chatgpt

It is a powerful data scraping tool that helps you extract information from various online sources. Easily collect data from Google SERP, Maps, Shopify, Zillow, and more. With a user-friendly interface, you can scrape and save data in JSON or Excel formats. Unlock insights from the web effortlessly with scrape-it.cloud API.

Downloads: 0 This Week

Last Update: 2023-08-28
See Project