web scraper free download

Showing 29 open source projects for "web scraper"

1

CyberScraper 2077

A powerful web scraper powered by LLMs | OpenAI, Gemini & Ollama

CyberScraper 2077 is not just another web scraping tool – it's a glimpse into the future of data extraction. Born from the neon-lit streets of a cyberpunk world, this AI-powered scraper uses OpenAI, Gemini, and local LLM models to slice through the web's defenses, extracting the data you need with unparalleled precision and style.

Downloads: 8 This Week

Last Update: 2024-09-10
2

Goutte

Goutte, a simple PHP Web Scraper

Goutte is a screen scraping and web crawling library for PHP. Goutte provides a nice API to crawl websites and extract data from the HTML/XML responses. Goutte depends on PHP 7.1+. Add fabpot/goutte as a require dependency in your composer.json file. Create a Goutte Client instance (which extends Symfony\Component\BrowserKit\HttpBrowser). Make requests with the request() method. The method returns a Crawler object (Symfony\Component\DomCrawler\Crawler). To use your own HTTP settings, you may...

Downloads: 3 This Week

Last Update: 2023-04-01
3

dude uncomplicated data extraction

dude uncomplicated data extraction: A simple framework

Dude is a very simple framework for writing web scrapers using Python decorators. The design, inspired by Flask, aims to make it easy to build a web scraper in just a few lines of code. Dude has an easy-to-learn syntax. Dude is currently in Pre-Alpha, so expect breaking changes. You can run your scraper from a terminal by supplying URLs, the output filename of your choice, and the paths to your Python scripts to the dude scrape command.
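
The Flask-style decorator pattern mentioned above can be sketched in plain Python. This is only an illustration of the general idea, not Dude's actual API: the names select, run, and the selector strings are hypothetical, and the matches dictionary stands in for real page parsing.

```python
from typing import Callable, Dict, List

# Hypothetical registry mapping a selector string to handler functions.
# It illustrates the decorator-registration pattern only; it is not Dude's API.
_handlers: Dict[str, List[Callable[[str], dict]]] = {}


def select(selector: str):
    """Register the decorated function as a handler for `selector`."""
    def decorator(func: Callable[[str], dict]) -> Callable[[str], dict]:
        _handlers.setdefault(selector, []).append(func)
        return func
    return decorator


@select("title")
def get_title(text: str) -> dict:
    return {"title": text.strip()}


@select("price")
def get_price(text: str) -> dict:
    return {"price": float(text.lstrip("$"))}


def run(matches: Dict[str, List[str]]) -> List[dict]:
    """Apply every registered handler to the texts matched by its selector.

    `matches` is an assumption standing in for a parsed page:
    it maps each selector to the list of texts it matched.
    """
    results = []
    for selector, texts in matches.items():
        for handler in _handlers.get(selector, []):
            for text in texts:
                results.append(handler(text))
    return results
```

Registering handlers at import time is what keeps a scraper script short: the script only declares what to extract, and a separate runner decides when and where to apply it.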

Downloads: 0 This Week

Last Update: 2024-03-02
4

html-metadata

HTML metadata scraper and parser for Node.js (supports Promises)

The aim of this library is to be a comprehensive source for extracting all HTML-embedded metadata. Currently, it supports Schema.org microdata using a third-party library, a native BEPress, Dublin Core, Highwire Press, JSON-LD, Open Graph, Twitter, EPrints, PRISM, and COinS implementation, and some general metadata that doesn't belong to a particular standard (for instance, the content of the title tag, or meta description tags). Planned is support for RDFa, AGLS, and other yet unheard-of...
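
As a rough illustration of what extracting HTML-embedded metadata involves, here is a minimal sketch that uses only Python's standard library. It is not the html-metadata library (which targets Node.js) and covers just three of the sources listed above: the title tag, the meta description, and Open Graph properties. The names MetadataParser and extract_metadata are hypothetical.

```python
from html.parser import HTMLParser


class MetadataParser(HTMLParser):
    """Collects the <title> text, the meta description, and og:* properties."""

    def __init__(self):
        super().__init__()
        self.metadata = {"title": None, "description": None, "open_graph": {}}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            content = attrs.get("content")
            prop = attrs.get("property") or ""
            if prop.startswith("og:"):
                # Store Open Graph values without the "og:" prefix.
                self.metadata["open_graph"][prop[3:]] = content
            elif attrs.get("name") == "description":
                self.metadata["description"] = content

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            # Title text may arrive in several chunks, so concatenate.
            self.metadata["title"] = (self.metadata["title"] or "") + data


def extract_metadata(html: str) -> dict:
    parser = MetadataParser()
    parser.feed(html)
    parser.close()
    return parser.metadata
```

A full implementation would add one extractor per standard (Dublin Core, JSON-LD, and so on) and merge their results; this sketch only shows the shape of a single pass over the document.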

Downloads: 0 This Week

Last Update: 2024-08-24
5

Ulixee Hero

The web browser built for scraping

Hero is the first modern headless browser designed specifically for scraping instead of just automated testing. Hero provides access to the W3C DOM specification without the need for Puppeteer's complicated evaluate callbacks and multi-context switching. We've recreated a fully compliant DOM directly in NodeJS, allowing you to bypass the headaches of previous scraper tools. The powerful Chrome engine sits under the hood, allowing for lightning-fast rendering. Emulators make it easy to disguise your...

Downloads: 0 This Week

Last Update: 18 hours ago
6

ScrapeGraphAI

Python scraper based on AI

Extracting content from websites and local documents using LLMs. ScrapeGraphAI is a web scraping Python library that uses LLMs and direct graph logic to create scraping pipelines for websites and local documents (XML, HTML, JSON, Markdown, etc.). Just say which information you want to extract and the library will do it for you.

Downloads: 0 This Week

Last Update: 2 days ago
7

crwlr

Library for Rapid (Web) Crawler and Scraper Development

This library provides a kind of framework along with many ready-to-use building blocks, called steps, that you can combine to build your own crawlers and scrapers. Before diving into the library, let's have a look at the terms crawling and scraping. For most real-world use cases, those two things go hand in hand, which is why this library helps with and combines both. A (web) crawler is a program that (down)loads documents and follows the links in them to load those documents as well. A crawler could...
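
The definition of a crawler given above (load a document, then follow its links and load those too) can be sketched as a breadth-first traversal. This is a generic Python illustration, not crwlr's API (crwlr is a PHP library). The crawl function, the LinkCollector class, and the fetch callback are hypothetical; fetch is assumed to return the HTML for a URL, or None when the page cannot be loaded.

```python
from collections import deque
from html.parser import HTMLParser
from typing import Callable, List, Optional
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def crawl(start_url: str,
          fetch: Callable[[str], Optional[str]],
          max_pages: int = 50) -> List[str]:
    """Load start_url, follow its links breadth-first, and return loaded URLs."""
    seen = {start_url}
    queue = deque([start_url])
    visited: List[str] = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:  # page could not be loaded; skip it
            continue
        visited.append(url)
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.links:
            absolute = urljoin(url, href)  # resolve relative links
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return visited
```

Passing fetch in as a parameter keeps the traversal logic separate from networking, so the same loop can be exercised against an in-memory dictionary of pages. A real crawler would also restrict itself to allowed hosts and respect robots.txt.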

Downloads: 0 This Week

Last Update: 2024-08-05
8

AutoScraper

A Smart, Automatic, Fast and Lightweight Web Scraper for Python

This project is made for automatic web scraping to make scraping easy. It gets a URL or the HTML content of a web page and a list of sample data that we want to scrape from that page. This data can be text, URL or any HTML tag value of that page. It learns the scraping rules and returns similar elements. Then you can use this learned object with new URLs to get similar content or the exact same element of those new pages.

Downloads: 0 This Week

Last Update: 2023-04-12
9

SecretAgent

The web scraper that's nearly impossible to block

SecretAgent is a headless browser that’s nearly impossible to detect. It achieves this by emulating real users. And it has powerful auto-replay functionality that lets you create and debug scripts in record-setting time.

Downloads: 1 This Week

Last Update: 2023-08-14
10

soup

Web Scraper in Go, similar to BeautifulSoup

soup is a small web scraper package for Go, with an interface highly similar to that of BeautifulSoup. Each node is represented by a struct with these fields: Pointer, containing the pointer to the current HTML node; NodeValue, containing the current HTML node's value, i.e. the tag name for an ElementNode, or the text in case of a TextNode; and Error, containing an error in a struct if one occurs, else nil. A detailed text explanation of the error can be accessed using the Error() function. A field...

Downloads: 0 This Week

Last Update: 2023-01-25
11

JobFunnel

Scrape job websites into a single spreadsheet with no duplicates.

... a job website you'd like to write a scraper for, you are welcome to implement it. Review the Base Scraper for implementation details. JobFunnel supports scraping jobs from the same job website across locales and domains. If you are interested in adding support, you may only need to define session headers and domain strings. Review the Base Scraper for further implementation details.

Downloads: 1 This Week

Last Update: 2023-04-10
12

django-dynamic-scraper

Creating Scrapy scrapers via the Django admin interface

Django Dynamic Scraper (DDS) is an app for Django built on top of the scraping framework Scrapy. While preserving many of the features of Scrapy, it lets you dynamically create and manage spiders via the Django admin interface. With Django Dynamic Scraper (DDS) you can define your Scrapy scrapers dynamically via the Django admin interface and save your scraped items in the database you defined for your Django project. Since it simplifies things, DDS is not usable for all kinds of scrapers...

Downloads: 0 This Week

Last Update: 2022-09-05
13

google-play-scraper

Node.js scraper to get data from Google Play

... to the one specified. Returns the list of permissions an app has access to. Retrieves a full list of categories present in the dropdown menu on Google Play. Since every library call performs one or more requests to a Google Play API or web page, it can sometimes be useful to cache the results to avoid requesting the same data twice.
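
The caching idea described above can be sketched with Python's functools.lru_cache. This is a generic illustration, not google-play-scraper's API (a Node.js library): get_app_details, _simulated_request, and request_log are hypothetical, and the request is simulated so that no network access is needed.

```python
from functools import lru_cache
from typing import List

# Records every simulated network request so cache hits can be observed.
request_log: List[str] = []


def _simulated_request(app_id: str) -> dict:
    """Stand-in for a real HTTP request; an assumption for illustration."""
    request_log.append(app_id)
    return {"appId": app_id, "title": f"App {app_id}"}


@lru_cache(maxsize=128)
def get_app_details(app_id: str) -> dict:
    """Return details for app_id, hitting the 'network' only on a cache miss."""
    # Note: the cached dict is shared between callers, so treat it as read-only.
    return _simulated_request(app_id)
```

Repeated calls with the same app_id return the cached object without a second request. The trade-off is staleness: cached results can go out of date, so a real cache would usually also expire entries after some time.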

Downloads: 0 This Week

Last Update: 2022-03-22
14

X-RAY

The next web scraper, see through the <html> noise

Supports strings, arrays, arrays of objects, and nested object structures. The schema is not tied to the structure of the page you're scraping, allowing you to pull the data in the structure of your choosing. The API is entirely composable, giving you great flexibility in how you scrape each page. Paginate through websites, scraping each page. X-ray also supports a request delay and a pagination limit. Scraped pages can be streamed to a file, so if there's an error on one page, you won't...
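
Pagination with a request delay and a page limit, as mentioned above, can be sketched as a generator. This is a generic Python illustration rather than X-ray's API (X-ray is a Node.js library). The paginate function and the fetch_page callback are hypothetical; fetch_page is assumed to return a tuple of the items on a page and the URL of the next page, or None when there is no next page.

```python
import time
from typing import Callable, Iterator, List, Optional, Tuple

Page = Tuple[List[str], Optional[str]]  # (items on the page, next page URL)


def paginate(start_url: str,
             fetch_page: Callable[[str], Page],
             limit: int = 3,
             delay: float = 1.0) -> Iterator[str]:
    """Yield items page by page, visiting at most `limit` pages.

    Sleeps `delay` seconds between consecutive page requests.
    """
    url: Optional[str] = start_url
    pages_done = 0
    while url is not None and pages_done < limit:
        if pages_done > 0 and delay > 0:
            time.sleep(delay)  # be polite between requests
        items, url = fetch_page(url)
        pages_done += 1
        yield from items
```

Because items are yielded as each page arrives, a caller can write them to a file incrementally, which keeps earlier pages if a later request fails.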

Downloads: 0 This Week

Last Update: 2021-10-05
15

WebExtractServer

WebExtractServer works with WebExtractLte for use with web browsers

Browse data fetched by WebExtractLte directly in your browser. Designed to be used with Webscraper (webscraper.io), a third-party web scraper tool available as a plugin for Chrome and Firefox.

Downloads: 0 This Week

Last Update: 2019-04-29
16

htmlparser

Products of the project: Java HTMLParser - VietSpider Web Data Extractor - Extractor VietSpider News. Click on "Show project details" to see more features of each product.

Downloads: 0 This Week

Last Update: 2015-06-24
17

Scra.php

Scrape anything!

The ultimate customisable YAML-ised Web Scraper for PHP

Downloads: 0 This Week

Last Update: 2014-01-20
18

MuhVieh - Filmverwaltung

A script for managing your personal film collection.

The script provides a film database. It also includes a breakdown by genre, user management, and an attractive presentation of the content.

Downloads: 0 This Week

Last Update: 2013-11-22
19

xWebScraper

This is an advanced web scraper with a user-friendly GUI that lets the user define rules, the web addresses to extract data from (once or periodically), and the target database field in which the data should be saved.

Downloads: 0 This Week

Last Update: 2014-07-13
20

AdWords Screen Scraper

This tool will fetch information from Google's Keyword Tool on behalf of a user, using PHP's cURL library. Unlike most scrapers, this one integrates the captcha verification so as to thwart spam requests. This tool would allow analysis of the data when complete.

1 Review

Downloads: 0 This Week

Last Update: 2013-04-09
21

WebScraper - Web Data Extraction

A simple-to-set-up web scraper written in Java. It uses a modified regex syntax that lets you quickly write complex patterns to parse data out of a website. It contains a GUI tool for testing your configuration scripts and is fully automated through the command line.

1 Review

Downloads: 0 This Week

Last Update: 2013-04-24
22

Stock Scraper

JSP tag library, Java class library, and .NET and Mono (C#, VB.NET, any .NET language) DLL to display stock quotes, stock charts, and other stock data. (Examples for VB.NET, C#, Java, and JSP tags are included.)

Downloads: 0 This Week

Last Update: 2013-03-07
23

LJLoader

LJ loader is a Java program that extracts postings and comments from http://www.livejournal.com (blog) into a database, where they can be viewed, classified, and processed. Reusable components: a Perl-like but efficient web page scraper, a tree analyzer, and a concurrent scheduler.

Downloads: 0 This Week

Last Update: 2013-03-22
24

News Scraper

Collection of web site scrapers that format web sites into RSS.

Downloads: 0 This Week

Last Update: 2013-03-07
25

hirudo

Hirudo is a Java Swing application for downloading web content. It functions as a screen-scraper, filename generator and download manager. All this and much more in an intuitive cross-platform user interface. Hirudo requires Java 1.4.

Downloads: 0 This Week

Last Update: 2013-03-22