Showing 48 open source projects for "api"

  • 1
    Python API for JMComic

    Python crawler and API for downloading JMComic albums and images

    JMComic-Crawler-Python is a Python library and crawler framework designed to programmatically access and download comic content from the JMComic platform. It provides a structured API that allows developers to retrieve albums, chapters, and images using simple Python code while handling the necessary network requests and data processing behind the scenes. It supports both web-based and mobile API interfaces, enabling flexible interaction with the platform depending on the available endpoints. Its architecture includes components for configuration management, download orchestration, and client communication, allowing users to automate the retrieval of manga chapters or entire albums. ...
    Downloads: 3 This Week
  • 2
    Grab Framework Project

    Web Scraping Framework

    ...With Grab you can build web scrapers of various complexity, from simple 5-line scripts to complex asynchronous website crawlers processing millions of web pages. Grab provides an API for performing network requests and for handling the received content, e.g. interacting with the DOM tree of the HTML document. The single request/response API lets you build a network request, perform it, and work with the received content; it is built on top of the urllib3 and lxml libraries. The Spider API is for building asynchronous web crawlers. ...
    Downloads: 1 This Week
  • 3
    BrowserBox

    Remote isolated browser API for security

    Remote isolated browser API for security, automation visibility and interactivity. Run on our cloud, or bring your own. Full-scope double reverse web proxy with a multi-tab, mobile-ready browser UI frontend. Co-browsing, advanced adaptive streaming, secure document viewing and more are available only in the Pro version. BrowserBox is a full-stack component for a web browser that runs on a remote server, with a UI you can embed on the web.
    Downloads: 7 This Week
  • 4
    Scweet

    Scrape tweets, profiles, followers and following from Twitter/X

    Scweet is a Python-based Twitter/X scraping library and CLI designed to collect tweets, profile timelines, followers, following lists, and user profile data without requiring the official Twitter/X API or a developer account. Instead of depending on deprecated unauthenticated scraping methods, it works by using X’s web GraphQL API together with authenticated browser cookies, which gives it a more current and practical approach for data extraction. The project supports a broad set of collection patterns, including searches by keyword, hashtag, user, date range, engagement thresholds, language, and location, making it useful for research, monitoring, and data gathering workflows. ...
    Downloads: 1 This Week
  • 5
    Firecrawl

    Turn entire websites into LLM-ready markdown or structured data

    Crawl and convert any website into LLM-ready markdown or structured data. Built by Mendable.ai and the Firecrawl community. Includes powerful scraping, crawling, and data extraction capabilities. Firecrawl is an API service that takes a URL, crawls it, and converts it into clean markdown or structured data. We crawl all accessible subpages and give you clean data for each. No sitemap is required.
    Downloads: 4 This Week
  • 6
    crawler

    Collection of JS reverse engineering examples for web scraping study

    ...Each directory in the project focuses on a specific target service or scenario, showing how browser network requests and JavaScript code can be studied to reproduce API calls programmatically. Many examples illustrate techniques such as debugging scripts, intercepting requests, analyzing encrypted parameters, and understanding authentication flows. crawler also explores common anti-scraping defenses and demonstrates how developers can examine them through debugging tools and reverse engineering techniques.
    Downloads: 8 This Week
  • 7
    Basketball Reference

    NBA Stats API via Basketball Reference

    ...This library was created for another Python project where I was trying to estimate an NBA player's productivity. A lot of sports-related APIs are expensive - luckily, Basketball Reference provides a free service which can be scraped and translated into a usable API.
    Downloads: 0 This Week
  • 8
    Crawl4AI

    Open-source LLM Friendly Web Crawler & Scraper

    Crawl4AI is a high-performance, AI‑ready web crawler tailored for LLM data ingestion and RAG pipelines. It supports adaptive crawling heuristics (stopping when enough info is gathered), structured markdown output, and high-speed parallel execution. Designed to operate at scale with optional Docker deployment and framework integrations.
    Downloads: 3 This Week
  • 9
    dxy-covid-19-crawler

    Realtime crawler for COVID-19 outbreak statistics from DXY data

    ...DXY-COVID-19-Crawler automatically crawls data at regular intervals, typically every minute, ensuring that newly published statistics are captured as quickly as possible. Retrieved data is stored in MongoDB and archived so that the entire progression of the outbreak can be traced over time. It also provided an API that allowed developers to easily access the collected data for building dashboards, visualizations, and other analytical tools.
    Downloads: 1 This Week
  • 10
    ScrapydWeb

    Web app for Scrapyd cluster management

    ...Add your Scrapyd servers in either string or tuple format; you can attach basic auth for accessing a Scrapyd server, as well as a string for grouping or labeling. You can select any number of Scrapyd servers by grouping and filtering, and then invoke Scrapyd's HTTP JSON API on the cluster with just a few clicks.
    Downloads: 0 This Week
  • 11
    rnet

    Python HTTP client with TLS and HTTP/2 fingerprint emulation support

    rnet is an ergonomic and modular Python HTTP client designed for developers who need advanced control over network requests and protocol behavior. It provides a flexible API for making HTTP requests while supporting both asynchronous and blocking workflows, allowing it to integrate easily into different Python applications and runtimes. rnet focuses on low-level protocol customization, giving users fine-grained control over TLS and HTTP/2 configuration in order to emulate specific browser behaviors. This includes support for TLS fingerprinting techniques such as JA3 and JA4 as well as detailed HTTP/2 settings, enabling more accurate simulation of real client network traffic. ...
    Downloads: 7 This Week
  • 12
    WebMagic

    A scalable web crawler framework for Java

    ...It can simplify the development of a specific crawler. WebMagic is a simple but scalable crawler framework on which you can easily develop a crawler. It has a simple, highly flexible core and a simple API for HTML extraction. It also supports customizing a crawler through POJO annotations, with no configuration needed. Other features include multithreading and distribution support. WebMagic is easy to integrate: add the dependencies to your pom.xml. WebMagic uses slf4j with the slf4j-log4j12 implementation. If you have customized your slf4j implementation, please exclude slf4j-log4j12. ...
    Downloads: 0 This Week
  • 13
    QueryList

    Progressive PHP web crawler framework with jQuery-like DOM parsing

    QueryList is an extensible PHP web scraping and crawling framework designed to extract and process data from web pages. It provides a simple and expressive API that allows developers to collect structured information from HTML documents using familiar DOM traversal techniques. It is built on top of phpQuery and uses CSS3 selectors similar to those found in jQuery, making it easy for developers to query and manipulate page elements during scraping tasks. QueryList supports common data extraction scenarios such as retrieving lists of titles, links, images, and other page elements from structured or semi-structured content. ...
    Downloads: 0 This Week
  • 14
    watercrawl

    AI-ready web crawler that extracts and structures website content

    ...WaterCrawl also offers real-time monitoring capabilities, allowing users to track crawling progress, performance metrics, and errors during large data collection jobs. Developers can integrate the tool into applications through a REST API and multiple client SDKs, enabling automated data pipelines and AI data preparation workflows.
    Downloads: 0 This Week
  • 15
    tumblr-crawler

    Python crawler to download photos and videos from Tumblr blogs

    ...Users can specify one or multiple blogs to crawl by editing a configuration file or by passing parameters through the command line. Once executed, the script fetches media from the Tumblr API and stores the downloaded files in folders named after each blog. tumblr-crawler avoids re-downloading files that have already been saved, making repeated runs safe and useful for recovering missing media. It also supports optional proxy configuration, which can help when access to Tumblr content requires routing requests through a proxy server. ...
    Downloads: 3 This Week
  • 16
    changedetection.io

    The best free open source website change detection and restock service

    ...Monitor and track PDF file changes, and know when a PDF file has text changes. Know when your favourite product is on sale, or other special deals are announced before anyone else. Detect and monitor changes in JSON API responses.
    Downloads: 0 This Week
  • 17
    crawlee

    A web scraping and browser automation library for Node.js

    ...Crawlee won't fix broken selectors for you (yet), but it helps you build and maintain your crawlers faster. When a website adds JavaScript rendering, you don't have to rewrite everything, only switch to one of the browser crawlers. When you later find a great API to speed up your crawls, flip the switch back. It keeps your proxies healthy by rotating them smartly with good fingerprints that make your crawlers look human-like. It's not unblockable, but it will save you money in the long run. Crawlee is built by people who scrape for a living and use it every day to scrape millions of pages. ...
    Downloads: 2 This Week
  • 18
    jsoup

    Java library for working with real-world HTML

    jsoup is a Java library for working with real-world HTML. It provides a very convenient API for fetching URLs and extracting and manipulating data, using the best of HTML5 DOM methods and CSS selectors. jsoup implements the WHATWG HTML5 specification, and parses HTML to the same DOM as modern browsers do. jsoup is designed to deal with all varieties of HTML found in the wild; from pristine and validating, to invalid tag-soup; jsoup will create a sensible parse tree.
    Downloads: 3 This Week
  • 19
    Scrapling

    An adaptive Web Scraping framework

    ...Its powerful spider system supports multi-session crawling, pause and resume functionality, and real-time streaming of scraped data. Scrapling combines high performance, memory efficiency, and extensive async support to deliver blazing-fast scraping workflows. With a developer-friendly API, CLI tools, MCP server integration for AI-assisted extraction, and Docker support, it offers a complete solution for modern web scrapers.
    Downloads: 2 This Week
  • 20
    MechanicalSoup

    A Python library for automating interaction with websites

    ...MechanicalSoup was created by M Hickford, who was a fond user of the Mechanize library. Unfortunately, Mechanize was incompatible with Python 3 until 2019 and its development stalled for several years. MechanicalSoup provides a similar API, built on Python giants Requests (for HTTP sessions) and BeautifulSoup (for document navigation). Since 2017 it has been actively maintained by a small team.
    Downloads: 0 This Week
  • 21
    Pydoll

    Async Python library for automating Chromium browsers without WebDriver

    ...Instead of using external drivers, it connects directly to the Chrome DevTools Protocol through WebSocket, allowing scripts to control browser behavior more efficiently and with fewer compatibility issues. It provides a high-level API that simplifies common browser automation tasks while still offering access to low-level protocol features for advanced control. Its architecture is built around asynchronous programming using Python’s asyncio framework, enabling concurrent automation of multiple tabs and browser contexts. Pydoll also includes tools for monitoring and intercepting network traffic, allowing developers to analyze or modify requests and responses during automation workflows. ...
    Downloads: 1 This Week
  • 22
    newspaper4k

    Python library for scraping and analyzing online news articles easily

    Newspaper4k is a Python library designed for extracting, processing, and analyzing news articles from websites. It is a continuation and active fork of the original newspaper3k library, which had stopped receiving updates, with the goal of keeping the ecosystem maintained while adding improvements and bug fixes. It provides developers with tools to automatically download web pages, extract the main article content, and collect associated metadata such as titles, authors, images, and...
    Downloads: 0 This Week
  • 23
    DotnetSpider

    Lightweight .NET framework for fast web crawling and data scraping

    DotnetSpider is a web crawling and data extraction framework built on the .NET Standard platform. It is designed to help developers create efficient and scalable crawlers for collecting structured data from websites. It provides a high-level API that simplifies the process of defining spiders, managing requests, and extracting content from web pages. Developers can create custom spiders by extending base classes and configuring pipelines that handle downloading, parsing, and storing collected data. DotnetSpider is modular, allowing different components such as request schedulers, downloaders, and storage systems to work together in a flexible workflow. ...
    Downloads: 0 This Week
  • 24
    dude uncomplicated data extraction

    dude uncomplicated data extraction: A simple framework

    Dude is a very simple framework for writing web scrapers using Python decorators. The design, inspired by Flask, aims to make it easy to build a web scraper in just a few lines of code. Dude has an easy-to-learn syntax. Dude is currently in Pre-Alpha; expect breaking changes. You can run your scraper from the terminal by supplying URLs, an output filename of your choice, and the paths to your Python scripts to the dude scrape command.
    Downloads: 0 This Week
  • 25
    Web Spider, Web Crawler, Email Extractor

    Free tool that extracts emails, phone numbers and custom text from the web using Java regex

    The Files section includes WebCrawlerMySQL.jar, which supports MySQL connections. A free web spider and crawler that extracts information from the web by parsing millions of pages. Data is stored in a Derby database and is not lost if the spider is force-closed. Features: free web spider, parser, extractor and crawler; extraction of emails, phone numbers and custom text from the web; export to Excel files; data saved to Derby and MySQL databases; written in Java, cross-platform. Also See Free email Sender :...
    Downloads: 4 This Week
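
The changedetection.io entry above mentions detecting changes in JSON API responses. As a rough illustration of that general idea (not changedetection.io's actual implementation), the sketch below fingerprints a decoded JSON payload so that two responses can be compared regardless of key order. The function names `fingerprint` and `has_changed` are hypothetical and chosen only for this example; it uses the Python standard library alone.

```python
import hashlib
import json


def fingerprint(payload):
    """Return a SHA-256 hex digest of a JSON-serializable value.

    Keys are sorted and whitespace is normalized so that two payloads
    with the same content but different key order hash identically.
    """
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def has_changed(previous_fingerprint, payload):
    """Compare a payload against a stored fingerprint.

    Returns a (changed, current_fingerprint) tuple so the caller can
    store the new fingerprint for the next check.
    """
    current = fingerprint(payload)
    return current != previous_fingerprint, current


if __name__ == "__main__":
    # Hypothetical payloads standing in for two successive API responses.
    first = {"product": "widget", "in_stock": False, "price": 19.99}
    second = {"price": 19.99, "product": "widget", "in_stock": True}

    baseline = fingerprint(first)
    changed, latest = has_changed(baseline, second)
    print("changed:", changed)
```

In practice the payloads would come from whatever HTTP client fetches the monitored endpoint; the sketch deliberately leaves the network side out and only covers the comparison step.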