Showing 36 open source projects for "web scrape"

View related business solutions
  • Red Hat Ansible Automation Platform on Microsoft Azure Icon
    Red Hat Ansible Automation Platform on Microsoft Azure

    Red Hat Ansible Automation Platform on Azure allows you to quickly deploy, automate, and manage resources securely and at scale.

    Deploy Red Hat Ansible Automation Platform on Microsoft Azure for a strategic automation solution that allows you to orchestrate, govern and operationalize your Azure environment.
  • Manage your IT department more effectively Icon
    Manage your IT department more effectively

    Streamline your business from end to end with ConnectWise PSA

    ConnectWise PSA (formerly Manage) allows you to stop working in separate systems, and helps you build a more profitable business. No more duplicate data entries, inefficient employees, manual invoices, and the inability to accurately track client service issues. Get a behind the scenes look into the award-winning PSA that automates processes for each area of business: sales, help desk, support, finance, and HR.
  • 1
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    Scrapy

    Scrapy

    A fast, high-level web crawling and web scraping framework

    Scrapy is a fast, open source, high-level framework for crawling websites and extracting structured data from these websites. Portable and written in Python, it can run on Windows, Linux, macOS and BSD. Scrapy is powerful, fast and simple, and also easily extensible. Simply write the rules to extract the data, and add new functionality if you wish without having to touch the core. Scrapy does the rest, and can be used in a number of applications. It can be used for data mining, monitoring...
    Downloads: 23 This Week
    Last Update:
    See Project
  • 3
    rvest

    rvest

    Simple web scraping for R

    rvest helps you scrape (or harvest) data from web pages. It is designed to work with magrittr to make it easy to express common web scraping tasks, inspired by libraries like beautiful soup and RoboBrowser. If you’re scraping multiple pages, I highly recommend using rvest in concert with polite. The polite package ensures that you’re respecting the robots.txt and not hammering the site with too many requests.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    jsoup

    jsoup

    Java library for working with real-world HTML

    ... attempt to create a clean parse from the HTML you provide, regardless of whether the HTML is well-formed or not. You have HTML in a Java String, and you want to parse that HTML to get at its contents, or to make sure it's well formed, or to modify it. The String may have come from user input, a file, or from the web.
    Downloads: 2 This Week
    Last Update:
    See Project
  • Business Continuity Solutions | ConnectWise BCDR Icon
    Business Continuity Solutions | ConnectWise BCDR

    Build a foundation for data security and disaster recovery to fit your clients’ needs no matter the budget.

    Whether natural disaster, cyberattack, or plain-old human error, data can disappear in the blink of an eye. ConnectWise BCDR (formerly Recover) delivers reliable and secure backup and disaster recovery backed by powerful automation and a 24/7 NOC to get your clients back to work in minutes, not days.
  • 5
    crawlee

    crawlee

    A web scraping and browser automation library for Node.js

    Crawlee is a web scraping and browser automation library. It helps you build reliable crawlers. Fast. Crawlee won't fix broken selectors for you (yet), but it helps you build and maintain your crawlers faster. When a website adds JavaScript rendering, you don't have to rewrite everything, only switch to one of the browser crawlers. When you later find a great API to speed up your crawls, flip the switch back. It keeps your proxies healthy by rotating them smartly with good fingerprints...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    Ferret

    Ferret

    Declarative web scraping

    A web scraping system aiming to simplify data extraction from the web. ferret has a declarative query language that makes it easy to focus on the data that you need to get. ferret has the ability to scrape JS rendered pages, handle all page events, and emulate user interactions. the ferret was designed as a library from the ground up. it can be easily embedded into any Go application. ferret helps you to focus on the data you need using an easy-to-learn declarative language. ferret uses Chrome...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    Rod

    Rod

    A Devtools driver for web automation and scraping

    Rod is a high-level driver for DevTools Protocol. It's widely used for web automation and scraping. Rod can automate most things in the browser that can be done manually. Chained context design, intuitive to timeout or cancel the long-running task. Auto-wait elements to be ready. Debugging friendly, auto input tracing, remote monitoring headless browser. Thread-safe for all operations. Automatically find or download browser. High-level helpers like WaitStable, WaitRequestIdle, HijackRequests...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    Roach

    Roach

    The complete web scraping toolkit for PHP

    Roach is a complete web scraping toolkit for PHP. It is a shameless clone heavily inspired by the popular Scrapy package for Python. Roach allows us to define spiders that crawl and scrape web documents. But wait, there’s more. Roach isn’t just a simple crawler, but includes an entire pipeline to clean, persist and otherwise process extracted data as well. It’s your all-in-one resource for web scraping in PHP. Roach doesn’t depend on a specific framework. Instead, you can use the core package...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    dude uncomplicated data extraction

    dude uncomplicated data extraction

    dude uncomplicated data extraction: A simple framework

    Dude is a very simple framework for writing web scrapers using Python decorators. The design, inspired by Flask, was to easily build a web scraper in just a few lines of code. Dude has an easy-to-learn syntax. Dude is currently in Pre-Alpha. Please expect breaking changes. You can run your scraper from terminal/shell/command-line by supplying URLs, the output filename of your choice and the paths to your python scripts to dude scrape command.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Propelling Payments for Software Platforms Icon
    Propelling Payments for Software Platforms

    For SaaS businesses to monetize payments through its turnkey PayFac-as-a-Service solution.

    Exact Payments delivers easy-to-integrate embedded payment solutions enabling you to rapidly onboard merchants, instantly activate a variety of payment methods and accelerate your revenue — delivering an end-to-end payment processing platform for SaaS businesses.
  • 10
    AutoScraper

    AutoScraper

    A Smart, Automatic, Fast and Lightweight Web Scraper for Python

    This project is made for automatic web scraping to make scraping easy. It gets a URL or the HTML content of a web page and a list of sample data that we want to scrape from that page. This data can be text, URL or any HTML tag value of that page. It learns the scraping rules and returns similar elements. Then you can use this learned object with new URLs to get similar content or the exact same element of those new pages.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11

    htmLawed

    PHP code to purify & filter HTML

    The htmLawed PHP script makes HTML more secure and standards- & policy-compliant. The customizable HTML filter/purifier can balance tags, ensure proper nestings, neutralize XSS, restrict HTML, beautify code like Tidy, implement anti-spam measures, etc.
    Downloads: 104 This Week
    Last Update:
    See Project
  • 12
    scraper-with-chatgpt
    It is a powerful data scraping tool that helps you extract information from various online sources. Easily collect data from Google SERP, Maps, Shopify, Zillow, and more. With a user-friendly interface, you can scrape and save data in JSON or Excel formats. Unlock insights from the web effortlessly with scrape-it.cloud API.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 13
    Catbird Linux

    Catbird Linux

    Linux for content creation, web scraping, coding, and data analysis.

    Catbird Linux is an operating system built for media creation, web scraping, and software coding. It is the daily driver you want for retrieving data, making videos or podcasts, and making software tools to automate the repetitive tasks. It is ready for work in Python, Lua, and Go languages, with numerous packages for web scraping or downloading data via API calls. Using Catbird Linux, it is possible to accomplish in depth stock market analysis, track weather trends, follow social media...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 14
    TorrentPier

    TorrentPier

    🐂 Bull-powered BitTorrent tracker engine

    TorrentPier — bull-powered BitTorrent Public/Private tracker engine, written in php. High speed, simple modification, high load architecture. In addition, we have very helpful official support forum, where it's possible to get any support and download modifications for engine.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 15
    ConsoleWebScraper

    ConsoleWebScraper

    It allows you to input a URL and it will scrape the HTML content...

    ConsoleWebScraper is a simple console application that allows you to scrape web pages and save the results. Usage Open a command prompt as administrator. Navigate to the directory containing the utility. Run the utility as a command-line argument. For example: .\ConsoleWebScraper.exe When you start the application, you'll see a title and a menu guide. The menu guide will tell you what commands you can use. After the application has successfully completed its operation...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 16
    Till

    Till

    DataHen Till is a companion tool to your existing web scraper

    DataHen Till is a companion tool to your existing web scraper that instantly makes it scalable, maintainable, and more unblockable, with minimal code changes on your scraper. Integrates with any scraper in 5 minutes. Web scraping is usually easy to get started, especially on a small scale. However, as you try to scale it up, it gets exponentially difficult. Scraping 10,000 records can easily be done with simple web scraper scripts in any programming language, but as you try to scrape millions...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    JobFunnel

    JobFunnel

    Scrape job websites into a single spreadsheet with no duplicates.

    Scrape job websites into a single spreadsheet with no duplicates. Automated tool for scraping job postings into a .csv file. You can search for jobs with YAML configuration files or by passing command arguments. By performing regular scraping and reviewing, you can cut through the noise of even the busiest job markets. Run funnel with your settings YAML to populate your master CSV file with jobs from available providers. JobFunnel can be easily automated to run nightly with crontab. If you have...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 18
    X-RAY

    X-RAY

    The next web scraper, see through the <html> noise

    Supports strings, arrays, arrays of objects, and nested object structures. The schema is not tied to the structure of the page you're scraping, allowing you to pull the data in the structure of your choosing. The API is entirely composable, giving you great flexibility in how you scrape each page. Paginate through websites, scraping each page. X-ray also supports a request delay and a pagination limit. Scraped pages can be streamed to a file, so if there's an error on one page, you won't lose...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 19
    SEO MACROSCOPE

    SEO MACROSCOPE

    SEO Macroscope is a website scanning tool, to check your website

    ... pages. Export reports to Excel and CSV formats. Generate and export text and XML sitemaps from the crawled pages. Analyze redirect chains. Use custom filters to verify the presence/absence of tracking tags. Use CSS Selectors, XPath Queries, and Regular Expressions to scrape website data.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    django-dynamic-scraper

    django-dynamic-scraper

    Creating Scrapy scrapers via the Django admin interface

    ..., but it is well suited for the relatively common case of regularly scraping a website with a list of updated items (e.g. news, events, etc.) and then dig into the detail page to scrape some more infos for each item. Django Dynamic Scraper tries to keep its data structure in the database as separated as possible from the models in your app, so it comes with its own Django model classes for defining scrapers, runtime information related to your scraper runs and classes.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    google-play-scraper

    google-play-scraper

    Node.js scraper to get data from Google Play

    Node.js module to scrape application data from the Google Play store. Retrieves the full detail of an application. Retrieves a list of applications from one of the collections at Google Play. Retrieves a list of apps that results of searching by the given term. Returns the list of applications by the given developer name. Given a string returns up to five suggestions to complete a search query term. Retrieves a page of reviews for a specific application. Returns a list of similar apps...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    Simple-Scrape is a simple web-scraping library that allows for programmatic access to HTML code. No further techniques are needed and the library is very compact and thus easy to use.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    mp3scraper

    mp3scraper

    Scrape web pages for MP3 links and generate RSS files.

    Moved to GitHub. https://github.com/ggocek/mp3scraper
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    JAWS - Just Another Web Scraper

    JAWS - Just Another Web Scraper

    A simple Web Scraper using Regular Expression or Html Agility

    JAWS or Just Another Web Scraper, is part of the Data Scraping Softwares developed by SVbook, alongside JATI (Image to Text) and JAVT (Video to Text). JAWS offer easy interface to scrape data from the website using regular expression, text preprocessing, or HTML Agility Pack.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    metabase=universal data-storage for "things". with its dynamical structure, it's a strong back-end application. motivation:store any data in an "universal admin" (maybe scrape it from different sources),edit,categorize,reorganizes,process,publis
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • Next