Showing 15 open source projects for "website scraper"

  • 1
    JobFunnel

    Scrape job websites into a single spreadsheet with no duplicates.

    ...Run funnel with your settings YAML to populate your master CSV file with jobs from available providers. JobFunnel can easily be automated to run nightly with crontab. If there is a job website you'd like to write a scraper for, you are welcome to implement it; review the Base Scraper for implementation details. JobFunnel also supports scraping jobs from the same job website across locales and domains. To add support for a new locale or domain, you may only need to define session headers and domain strings; again, see the Base Scraper for details.
    Downloads: 0 This Week
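
    The "single spreadsheet with no duplicates" idea can be sketched as follows. This is only an illustration, not JobFunnel's actual logic: the function names and the key fields (title, company, url) are assumptions made for the example.

```python
import csv
import io


def merge_jobs(master_rows, new_rows, key_fields=("title", "company", "url")):
    """Return master_rows plus any new_rows whose key is not already present.

    The key fields are hypothetical; a real tool may identify duplicates differently.
    """
    def key_of(row):
        # Normalise case and whitespace so trivially different rows compare equal.
        return tuple(row[field].strip().lower() for field in key_fields)

    seen = {key_of(row) for row in master_rows}
    merged = list(master_rows)
    for row in new_rows:
        key = key_of(row)
        if key not in seen:
            seen.add(key)
            merged.append(row)
    return merged


def to_csv(rows, fieldnames):
    """Serialise rows (a list of dicts) to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

    Each nightly run would then merge freshly scraped rows into the existing master rows before writing the CSV back out.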
  • 2
    ScrapeGraphAI

    ScrapeGraphAI

    Python scraper based on AI

    ScrapeGraphAI is a Python web scraping library that uses LLMs and direct graph logic to create scraping pipelines for extracting content from websites and local documents (XML, HTML, JSON, Markdown, etc.). Just say which information you want to extract and the library will do it for you.
    Downloads: 11 This Week
  • 3
    Spider

    Spider

    High-performance Rust web crawler and scraper for large-scale data

    Spider is a high-performance web crawler and web scraping library written in Rust that enables developers to crawl and index websites efficiently. It focuses on speed, concurrency, and reliability by using asynchronous and multi-threaded processing to handle large volumes of web pages. It can rapidly crawl websites to collect links, retrieve page content, and extract structured information from HTML documents. Spider can operate concurrently across many pages, allowing it to gather large...
    Downloads: 6 This Week
  • 4
    crwlr

    crwlr

    Library for Rapid (Web) Crawler and Scraper Development

    This library provides a lightweight framework and many ready-to-use building blocks, called steps, for building your own crawlers and scrapers. Before diving into the library, let's look at the terms crawling and scraping. In most real-world use cases the two go hand in hand, which is why this library supports and combines both. A (web) crawler is a program that loads documents and follows the links in them to load those documents as well. A crawler...
    Downloads: 2 This Week
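
    The crawler definition above (load a document, follow its links, load those too) can be illustrated with a short sketch. It is written in Python with the standard library only and does not use or reflect the crwlr API; the names LinkCollector, crawl, and the fetch callback are assumptions for the example.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect absolute URLs from the href attributes of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page they were found on.
                self.links.append(urljoin(self.base_url, href))


def crawl(start_url, fetch, max_pages=10):
    """Breadth-first crawl. fetch(url) returns HTML text, or None if unavailable."""
    visited = []
    seen = {start_url}
    queue = deque([start_url])
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        visited.append(url)
        collector = LinkCollector(url)
        collector.feed(html)
        for link in collector.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited
```

    Passing the fetch function in as a parameter keeps the sketch testable without network access; a real crawler would also need politeness delays and robots.txt handling.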
  • 5
    Email Scraper and Validator
    This is a simple desktop application built with Python and Tkinter that allows users to scrape email addresses from websites and validate them using an external API. It also provides features to save the scraped emails to a database, and export the data to various file formats. 1. Enter a list of website URLs or emails in the input field. 2. Click the Scrape button to scrape email addresses from the provided websites. 3. Click the Validate button to validate the scraped email...
    Downloads: 1 This Week
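
    The scraping step described above can be sketched with a regular expression. This is a minimal illustration, not the application's own code; the pattern is deliberately simple and will not match every valid address, and the validation API mentioned in the description is not shown.

```python
import re

# Simplified pattern (an assumption for this sketch): local part, "@", domain, TLD.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(text):
    """Return unique addresses found in text, lowercased, in first-seen order."""
    found = []
    seen = set()
    for match in EMAIL_RE.findall(text):
        email = match.lower()
        if email not in seen:
            seen.add(email)
            found.append(email)
    return found
```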
  • 6
    django-dynamic-scraper

    django-dynamic-scraper

    Creating Scrapy scrapers via the Django admin interface

    ...Since it simplifies things, DDS is not usable for all kinds of scrapers, but it is well suited to the relatively common case of regularly scraping a website with a list of updated items (e.g. news, events, etc.) and then digging into the detail page to scrape more information for each item. Django Dynamic Scraper tries to keep its data structure in the database as separate as possible from the models in your app, so it comes with its own Django model classes for defining scrapers, runtime information related to your scraper runs and classes.
    Downloads: 0 This Week
  • 7
    JAWS - Just Another Web Scraper

    JAWS - Just Another Web Scraper

    A simple web scraper using regular expressions or HTML Agility Pack

    JAWS, or Just Another Web Scraper, is part of the data scraping software developed by SVbook, alongside JATI (Image to Text) and JAVT (Video to Text). JAWS offers an easy interface for scraping data from a website using regular expressions, text preprocessing, or HTML Agility Pack.
    Downloads: 0 This Week
  • 8

    IAD dispatch web scraper

    A very simple web scraper for taxi dispatch data.

    Introduction: The Dulles International Airport (IAD) near Washington, D.C. has a taxi service provided by the Washington Flyer. Taxi cabs are leased by drivers and rides are regulated using a queue system. Drivers enter a corral near the Arrival gate and wait for dispatchers to announce passengers. There is a website that displays useful information about the queue. The number of taxis waiting in queue, the wait time of the last vehicle out, and the number of taxis to exit the corral in...
    Downloads: 0 This Week
  • 9
    Products of the project: Java HTMLParser - VietSpider Web Data Extractor - Extractor VietSpider News. Click on "Show project details" to see more features of each product.
    Downloads: 0 This Week
  • 10

    python-web_excavator

    General data mining API: only write HTML parsing code.

    A general web scraper that uses the requests library to communicate with the website. Scraper() contains a parser object, which you can add parsing handles to. ParseHandle() holds the code that mines your data from an HTML source. Repo: https://github.com/crispycret/web_excavator
    Downloads: 0 This Week
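
    The "scraper holds a parser, parser holds handles" design can be sketched roughly as below. This is not the web_excavator API: SketchParser, SketchScraper, add_handle, and the injected fetch function are all hypothetical names chosen for the illustration, and the real project uses the requests library where this sketch takes a fetch callback.

```python
import re


class SketchParser:
    """Hypothetical parser that runs every registered handle over the HTML."""

    def __init__(self):
        self.handles = {}

    def add_handle(self, name, func):
        # func takes the HTML source as a string and returns the mined value.
        self.handles[name] = func

    def parse(self, html):
        return {name: func(html) for name, func in self.handles.items()}


class SketchScraper:
    """Hypothetical scraper: fetches a page, then delegates to its parser."""

    def __init__(self, fetch):
        self.fetch = fetch  # callable url -> HTML text (assumed, e.g. an HTTP GET)
        self.parser = SketchParser()

    def scrape(self, url):
        return self.parser.parse(self.fetch(url))


def page_title(html):
    """Example handle: return the <title> text, or None if absent."""
    match = re.search(r"<title>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)
    return match.group(1).strip() if match else None
```

    With this split, the only site-specific code a user writes is the handle functions, which is the point the project description makes.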
  • 11
    IP Proxy Scraper

    IP Proxy Scraper

    IP Proxy Scraper lets you extract multiple proxies

    This lightweight yet powerful application extracts IPs and ports from a list of specified websites. If you need multiple proxies, simply insert the desired website URLs, and with a single click your proxies are gathered and presented in the output window, ready to be copied and saved. IP Proxy Scraper is also available for Linux; check it out here: https://sourceforge.net/projects/ipproxyscraperlinux/
    Downloads: 4 This Week
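
    Extracting IPs and ports from page text can be sketched with a regular expression plus range checks. This is an illustration only, not the application's implementation; it assumes proxies appear in plain ip:port form and handles IPv4 only.

```python
import re

# Four dot-separated groups of 1-3 digits, a colon, then a 1-5 digit port.
PROXY_RE = re.compile(r"\b(\d{1,3}(?:\.\d{1,3}){3}):(\d{1,5})\b")


def extract_proxies(text):
    """Return unique, plausibly valid ip:port strings in first-seen order."""
    proxies = []
    for ip, port in PROXY_RE.findall(text):
        octets_ok = all(int(part) <= 255 for part in ip.split("."))
        port_ok = 1 <= int(port) <= 65535
        candidate = f"{ip}:{port}"
        if octets_ok and port_ok and candidate not in proxies:
            proxies.append(candidate)
    return proxies
```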
  • 12
    IP Proxy Scraper - Linux

    IP Proxy Scraper - Linux

    Extracts multiple proxies from a list of websites

    Lightweight, easy-to-use tool for extracting multiple proxies from a list of websites. IP Proxy Scraper is also available for Windows; check it out here: https://sourceforge.net/projects/ipproxyscraper/
    Downloads: 0 This Week
  • 13
    National Lottery Scraper is a tool to connect to South Africa's National Lottery website (http://www.nationallottery.co.za/), download and display Lotto, Lotto Plus, and PowerBall results.
    Downloads: 0 This Week
  • 14
    A simple-to-set-up web scraper written in Java. It uses a modified regex syntax that lets you quickly write complex patterns to parse data out of a website. It contains a GUI tool for testing your configuration scripts and can be fully automated through the command line.
    Downloads: 0 This Week
  • 15
    Blackfire Player

    Blackfire Player

    Web Crawling, Web Testing, and Web Scraping application

    Blackfire Player is a powerful Web Crawling, Web Testing, and Web Scraper application. It provides a nice DSL to crawl HTTP services, assert responses, and extract data from HTML/XML/JSON responses. Some Blackfire Player use cases: Crawl a website/API and check expectations -- aka Acceptance Tests; Scrape a website/API and extract values; Monitor a website; Test code with unit test integration (PHPUnit, Behat, Codeception, ...); Test code behavior from the outside thanks to the native Blackfire Profiler integration -- aka Unit Tests from the HTTP layer (tm). ...
    Downloads: 0 This Week