Showing 12 open source projects for "scraping"

  • 1
    Happy DOM

    Happy DOM is a JavaScript implementation of a web browser

    Happy DOM is a JavaScript implementation of a web browser without its graphical user interface. It implements many web standards from the WHATWG DOM and HTML specifications. The goal of Happy DOM is to emulate enough of a web browser to be useful for testing, scraping websites, and server-side rendering. Happy DOM focuses heavily on performance and can be used as an alternative to JSDOM. It now supports Declarative Shadow DOM, which can be used for server-side rendering of web components. This package makes it possible to use Happy DOM with Jest.
    Downloads: 0 This Week
  • 2
    WebHarvest - web data extraction tool
    Web data extraction (web data mining, web scraping) tool. It leverages well-proven XML and text processing technologies to easily extract useful data from arbitrary web pages.
    Downloads: 8 This Week
  • 3
    TradingView Chart Data Extractor

    Extract price and indicator data from TradingView charts

    ...Too many indicators or too low a time resolution will increase the data points and potentially overload the free server. Avoid this by hosting/running the script on your local machine or scraping multiple times with fewer indicators and manually combining the CSV afterward. Simply append the URL of a chart/idea published on TradingView to the link below. This is not the URL of a security's chart, but the URL for a user-published chart.
    Downloads: 3 This Week
  • 4
    unfluff

    Automatically extract body content (and other cool stuff) from HTML

    unfluff is a Node.js library designed to automatically extract the main content from an HTML document — stripping away navigation bars, ads, footers and other boilerplate to leave you with the “body content”, metadata (title, author, date) and other useful fields. It’s a tool very much aimed at content-analysis, web scraping, building datasets, or repurposing article text for downstream processing (like machine-learning or summarization). The API is simple: you feed in raw HTML and it returns a structured object with the extracted text and other fields. It supports caching internal representations to speed up repeated extractions. While its language support is best for English, it is still widely used in web-content-processing pipelines. ...
    Downloads: 0 This Week
  • 5
    Python ADB

    Python ADB + Fastboot implementation

    ...Under the hood it speaks the ADB protocol directly and can connect via USB or over TCP, which is useful for lab setups and headless servers. Because it’s Python, you can compose device actions with your favorite testing, scraping, or data-collection libraries in one process. The project also includes utilities for robust connection handling and timeouts so flaky USB links don’t derail long runs. It’s well-suited to CI test farms, large-scale telemetry, and custom device control workflows.
    Downloads: 6 This Week
  • 6
    Simple-Scrape is a simple web-scraping library that provides programmatic access to HTML code. It requires no further techniques, and the library is very compact and thus easy to use.
    Downloads: 0 This Week
  • 7

    ScraperEdit for XBMC

    XML bindings and a GUI for creating and editing XBMC Scrapers

    This program is an editor for creating XBMC Scrapers. It is similar to ScraperEditor, another editor using ScraperXML, which runs under the .NET environment. This program runs under Sun/Oracle's Java Runtime. HELP WANTED! I am looking for someone to help me write documentation, such as a user's manual and online help. Also, if someone wants to help, translated language files are always welcome...
    Downloads: 0 This Week
  • 8

    SQLDOM

    HTML parser and DOM-related procedures for Microsoft T-SQL

    SQLDOM is an easy and robust way to parse HTML directly into SQL tables, manipulate DOM nodes in a jQuery-like manner, and render HTML from the SQL-based DOM. SQLDOM is written entirely in native T-SQL and uses only temporary database objects (tempdb). No changes to user databases are required.
    Downloads: 0 This Week
  • 9

    HTML DOM Parser

    HTML parser which can be used for screen-scraping applications

    htmldom parses an HTML file and provides methods for iterating over and searching the parse tree in a way similar to jQuery. To report bugs, please email me at bhimsen.pes@gmail.com
    Downloads: 0 This Week
  • 10
    datalus
    PHP web API designed to simplify object handling (loading, saving, querying, displaying, and editing), abstract the data from its display structure and layout, and allow the target data to be delivered in any supported format without special logic.
    Downloads: 0 This Week
  • 11
    A robust website scraping framework that uses XML, XPath, RegEx, and scripting to consume, parse, normalize, and traverse HTML based on a set of seed URLs. Scrape.NET is built using C#, TidyForNet (the P/Invoke-only version), and HTML Tidy.
    Downloads: 0 This Week
  • 12

    Xidel

    Xidel is a CLI webpage scraping tool supporting XPath/XQuery 3 and CSS

    Xidel is a command-line tool to download web pages and extract data from them. This data can be extracted using XPath/XQuery 3.0 (with compatibility modes for XPath 2.0 and XQuery 1.0), JSONiq, CSS 3 selectors, and custom pattern-matching templates that are like an annotated version of the processed page. It can download files over HTTP/S connections; follow redirections, links, or extracted values; and also process local files. The extracted values can then be exported as...
    Downloads: 0 This Week