Showing 14 open source projects for "scraping"

View related business solutions
  • Ship Agents Faster Icon
    Ship Agents Faster

    Transform your applications and workflows into powerful agentic systems at global scale.

    Gemini Enterprise Agent Platform lets you rapidly build, scale, govern and optimize production-ready agents grounded in your organization's data. The platform enables developers to build custom or pre-built agents for virtually any use case. New customers get $300 in free credits.
    Get Started Free
  • Full-stack observability with actually useful AI | Grafana Cloud Icon
    Full-stack observability with actually useful AI | Grafana Cloud

    Our generous forever free tier includes the full platform, including the AI Assistant, for 3 users with 10k metrics, 50GB logs, and 50GB traces.

    Built on open standards like Prometheus and OpenTelemetry, Grafana Cloud includes Kubernetes Monitoring, Application Observability, Incident Response, plus the AI-powered Grafana Assistant. Get started with our generous free tier today.
    Create free account
  • 1
    Colly

    Colly

    Elegant Scraper and Crawler Framework for Golang

    ...Clean API. Fast (>1k request/sec on a single core) Manages request delays and maximum concurrency per domain. Automatic cookie and session handling. Sync/async/parallel scraping. Distributed scraping. Caching, automatic encoding of non-unicode responses. Robots.txt support. Google App Engine support.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    Zendriver

    Zendriver

    A blazing fast, async-first, undetectable webscraping

    Zendriver is a modern Python web automation and scraping framework that leverages the Chrome DevTools Protocol to provide fast, asynchronous control over real browser instances. Unlike traditional tools that rely on Selenium or WebDriver, Zendriver communicates directly with the browser through CDP, enabling higher performance and more precise control over browser behavior. The framework is designed to be difficult to detect by anti-bot systems, making it suitable for advanced scraping and automation use cases where stealth is important. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 3
    Elasticsearch Exporter

    Elasticsearch Exporter

    Elasticsearch stats exporter for Prometheus

    ...The exporter fetches information from an Elasticsearch cluster on every scrape, therefore having a too short scrape interval can impose load on ES master nodes, particularly if you run with --es.all and --es.indices. We suggest you measure how long fetching /_nodes/stats and /_all/_stats takes for your ES cluster to determine whether your scraping interval is too short. As a last resort, you can scrape this exporter using a dedicated job with its own scraping interval. Commandline parameters start with a single - for versions less than 1.1.0rc1. Username and password can be passed either directly in the URI or through the ES_USERNAME and ES_PASSWORD environment variables. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    DrissionPage

    DrissionPage

    Python based web automation tool. Powerful and elegant

    DrissionPage is a Python-based automation framework that blends the capabilities of Selenium for browser automation with Requests-HTML for fast, headless web data extraction. It enables seamless switching between browser-controlled and headless HTTP sessions within the same interface. Ideal for web scraping, testing, and automation, DrissionPage is lightweight and highly efficient, offering more flexibility than standard Selenium or Requests usage alone.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Fully Managed MySQL, PostgreSQL, and SQL Server Icon
    Fully Managed MySQL, PostgreSQL, and SQL Server

    Automatic backups, patching, replication, and failover. Focus on your app, not your database.

    Cloud SQL handles your database ops end to end, so you can focus on your app.
    Try Free
  • 5
    WebHarvest - web data extraction tool
    Web data extraction (web data mining, web scraping) tool. It leverages well proved XML and text processing techologies in order to easely extract useful data from arbitrary web pages.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 6
    CEF Python

    CEF Python

    Python bindings for the Chromium Embedded Framework (CEF)

    Python bindings for the Chromium Embedded Framework (CEF). CEF Python is an open source project founded by Czarek Tomczak in 2012 to provide Python bindings for the Chromium Embedded Framework (CEF). The Chromium project focuses mainly on Google Chrome application development while CEF focuses on facilitating embedded browser use cases in third-party applications. Lots of applications use CEF control, there are more than 100 million CEF instances installed around the world. There are...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 7
    serverless-chrome

    serverless-chrome

    Run headless Chrome/Chromium on AWS Lambda

    ...Serverless Chrome takes care of building and bundling the Chrome binaries and making sure Chrome is running when your serverless function executes. In addition, this project also provides a few example services for common patterns (e.g. taking a screenshot of a page, printing to PDF, some scraping, etc.). Why? Because it's neat. It also opens up interesting possibilities for using the Chrome DevTools Protocol (and tools like Chromeless or Puppeteer) in serverless architectures and doing testing/CI, web-scraping, pre-rendering, etc. You must configure your AWS credentials either by defining AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environmental variables, or using an AWS profile.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    Twint

    Twint

    An advanced Twitter scraping & OSINT tool written in Python

    Twint is an advanced open-source Twitter scraping and OSINT tool written in Python that extracts tweets, user data, followers, likes, and more—without relying on Twitter’s API—making it highly useful for researchers, analysts, and hobbyists who want to bypass rate limits and access public Twitter data.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    django-dynamic-scraper

    django-dynamic-scraper

    Creating Scrapy scrapers via the Django admin interface

    Django Dynamic Scraper (DDS) is an app for Django build on top of the scraping framework Scrapy. While preserving many of the features of Scrapy it lets you dynamically create and manage spiders via the Django admin interface. With Django Dynamic Scraper (DDS) you can define your Scrapy scrapers dynamically via the Django admin interface and save your scraped items in the database you defined for your Django project.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • 10
    EsiGate
    This toolkit allows developpers to assemble several web applications into a single one. Pages are build using contents, resources and page templates coming from the original applications. The applications may use different technologies (Java, PHP...)
    Downloads: 3 This Week
    Last Update:
    See Project
  • 11
    WKZombie

    WKZombie

    WKZombie is a Swift framework for iOS/OSX to navigate within websites

    WKZombie is a Swift framework for iOS/OSX to navigate within websites and collect data without the need of a User Interface or API, also known as a Headless browser. It can be used to run automated tests/snapshots and manipulate websites using Javascript. WKZombie is an iOS/OSX web-browser without a graphical user interface. It was developed as an experiment in order to familiarize myself with using functional concepts written in Swift 4. It incorporates WebKit (WKWebView) for rendering and...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    A Java library that makes accessing webpages and internet content simple, including submitting forms, "clicking" links, screen scraping, and navigating structure in a Javascript-like fashion. Integrated with the Chrome javascript engine Rhino.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    Aracnis is a Java based framework for building distributed web spiders. These spiders can be used to accomplish a variety of tasks, for example, screen-scraping and link integrity checking.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14

    dataflowkit

    Golang framework for scraping data from web pages

    Golang Web Scraper library for extracting data from web pages. Save results as CSV, JSON, XML
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
MongoDB Logo MongoDB