Showing 413 open source projects for "python data analysis"

View related business solutions
  • $300 in Free Credit Towards Top Cloud Services Icon
    $300 in Free Credit Towards Top Cloud Services

    Build VMs, containers, AI, databases, storage—all in one place.

    Start your project in minutes. After credits run out, 20+ products include free monthly usage. Only pay when you're ready to scale.
    Get Started
  • Custom VMs From 1 to 96 vCPUs With 99.95% Uptime Icon
    Custom VMs From 1 to 96 vCPUs With 99.95% Uptime

    General-purpose, compute-optimized, or GPU/TPU-accelerated. Built to your exact specs.

    Live migration and automatic failover keep workloads online through maintenance. One free e2-micro VM every month.
    Try Free
  • 1
    Dataproc Templates

    Dataproc Templates

    Dataproc templates and pipelines for solving simple in-cloud data task

    Dataproc templates are designed to address various in-cloud data tasks, including data import/export/backup/restore and bulk API operations. These templates leverage the power of Google Cloud's Dataproc, supporting both Dataproc Serverless and Dataproc clusters. Google provides this collection of pre-implemented Dataproc templates as a reference and for easy customization.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 2
    douyin

    douyin

    Open source Douyin crawler for collecting and downloading public data

    DouyinCrawler is an open source data collection tool designed to gather publicly available information from the Douyin platform. It demonstrates how to build a Python-based web crawler combined with a graphical interface and command line functionality. It allows users to collect data from various types of Douyin content, including user profiles, videos, hashtags, and music pages.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 3
    CloudQuery

    CloudQuery

    The open-source cloud asset inventory powered by SQL

    ...Integrate CloudQuery with your current visualization, monitoring, and alerting such as Grafana. CloudQuery supports the TimescaleDB PostgreSQL extension, giving you full historical snapshots of your cloud asset inventory. Data analysis, security, auditing, and compliance. Leverage SQL to get visibility into your cloud infrastructure and SaaS applications. Build a cloud-asset inventory across any of our supported official or community providers.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 4
    Helium Browser

    Helium Browser

    Private, fast, and honest web browser

    Helium is a Chromium-based web browser designed to deliver privacy, speed, and simplicity by removing Google’s proprietary services, telemetry, and bloat. It’s built atop ungoogled-chromium, extending its philosophy with additional privacy features, design refinements, and user experience improvements aimed at transparency and control. Helium blocks ads and trackers by default through an integrated, unbiased uBlock Origin extension prepackaged as a native browser component. Its UI and...
    Downloads: 98 This Week
    Last Update:
    See Project
  • Gemini 3 and 200+ AI Models on One Platform Icon
    Gemini 3 and 200+ AI Models on One Platform

    Access Google's best plus Claude, Llama, and Gemma. Fine-tune and deploy from one console.

    Build generative AI apps with Vertex AI. Switch between models without switching platforms.
    Start Free
  • 5
    Mist Cloud Management Platform

    Mist Cloud Management Platform

    Mist is an open source, multicloud management platform

    Mist CE is an open-source multi-cloud management platform, offering unified control and monitoring for hybrid and multi-cloud environments.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 6
    SearXNG

    SearXNG

    Free internet metasearch engine which aggregates

    SearXNG is a free and open-source metasearch engine designed to aggregate results from multiple search engines while prioritizing user privacy and anonymity. Instead of maintaining its own index, it queries numerous external search providers and merges the results into a single interface, increasing coverage and diversity of information. One of its core principles is privacy, as it does not track users, store personal data, or create search profiles, making it a strong alternative to...
    Downloads: 19 This Week
    Last Update:
    See Project
  • 7
    SPX

    SPX

    A simple & straight-to-the-point PHP profiling extension

    ...Multi metrics capable: 22 are currently supported (various time & memory metrics, included files, objects in use, I/O...). Able to collect data without losing context. For example Xhprof (and potentially its forks) aggregates data per caller / callee pairs, which implies the loss of the full call stack and forbids timeline or Flamegraph based analysis.
    Downloads: 10 This Week
    Last Update:
    See Project
  • 8
    news-please

    news-please

    Python tool for crawling and extracting structured data from news site

    ...It combines several established technologies and libraries to perform web crawling and content extraction, enabling reliable processing across a wide range of news sources. Developers can use the software either as a standalone command line application or integrate it into their own Python applications through its library interface. Extracted article data can be stored in different formats and systems, including JSON files or database-backed storage solutions.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 9
    waitress

    waitress

    A WSGI server for Python 3

    Waitress is meant to be a production-quality pure-Python WSGI server with very acceptable performance. It has no dependencies except ones which live in the Python standard library. It runs on CPython on Unix and Windows under Python 3.7+. It is also known to run on PyPy 3 (python version 3.7+) on UNIX. It supports HTTP/1.0 and HTTP/1.1. Waitress now validates that chunked encoding extensions are valid, and don't contain invalid characters that are not allowed. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • Earn up to 16% annual interest with Nexo. Icon
    Earn up to 16% annual interest with Nexo.

    Access competitive interest rates on your digital assets.

    Generate interest, borrow against your crypto, and trade a range of cryptocurrencies — all in one platform. Geographic restrictions, eligibility, and terms apply.
    Get started with Nexo.
  • 10
    dude uncomplicated data extraction

    dude uncomplicated data extraction

    dude uncomplicated data extraction: A simple framework

    Dude is a very simple framework for writing web scrapers using Python decorators. The design, inspired by Flask, was to easily build a web scraper in just a few lines of code. Dude has an easy-to-learn syntax. Dude is currently in Pre-Alpha. Please expect breaking changes. You can run your scraper from terminal/shell/command-line by supplying URLs, the output filename of your choice and the paths to your python scripts to dude scrape command.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    FEAPDER

    FEAPDER

    Powerful Python crawler framework for scalable web scraping tasks

    feapder is a Python-based web crawling framework designed to simplify the process of building scalable and efficient web scrapers. It focuses on providing a developer-friendly environment that makes it easier to create, run, and manage crawlers for a variety of data collection tasks. It includes several built-in spider types, such as AirSpider, Spider, TaskSpider, and BatchSpider, which address different crawling scenarios ranging from lightweight scraping to distributed and batch-based jobs. feapder supports features such as breakpoint resume, allowing crawlers to continue from where they stopped without losing progress. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    Crawl4AI

    Crawl4AI

    Open-source LLM Friendly Web Crawler & Scraper

    Crawl4AI is a high-performance, AI‑ready web crawler tailored for LLM data ingestion and RAG pipelines. It supports adaptive crawling heuristics (stopping when enough info is gathered), structured markdown output, and high-speed parallel execution. Designed to operate at scale with optional Docker deployment and framework integrations.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 13
    Spider

    Spider

    High-performance Rust web crawler and scraper for large-scale data

    ...Spider can operate concurrently across many pages, allowing it to gather large datasets in a short period of time. Spider also provides mechanisms for subscribing to crawl events so developers can process page data such as URLs, status codes, or HTML content as it is discovered. It supports advanced capabilities such as headless browser rendering, background crawling tasks, and configurable rules that control crawl depth or ignored paths. These capabilities make the project suitable for building search indexers, data extraction pipelines, & SEO analysis tools.
    Downloads: 10 This Week
    Last Update:
    See Project
  • 14
    Linkedin Scraper

    Linkedin Scraper

    A library that scrapes Linkedin for user data

    Linkedin Scraper is a library that scrapes Linkedin for user data. Version 2.0.0 and before is called linkedin_user_scraper and can be installed via pip3 install --user linkedin_user_scraper. The reason is that LinkedIn has recently blocked people from viewing certain profiles without having previously signed in. So by setting scrape=False, it doesn't automatically scrape the profile, but Chrome will open the linkedin page anyways. You can login and logout, and the cookie will stay in the...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 15
    hosts

    hosts

    Consolidate and extend hosts files from several well-curated sources

    Consolidating and extending hosts files from several well-curated sources. You can optionally pick extensions to block pornography, social media, and other categories. The unified hosts file is optionally extensible. Extensions are used to include domains by category. Currently, we offer the following categories: fakenews, social, gambling, and porn. Extensions are optional, and can be combined in various ways with the base hosts file. The combined products are stored in the alternates...
    Downloads: 13 This Week
    Last Update:
    See Project
  • 16
    Snoop Project

    Snoop Project

    This is the most powerful software taking into account CIS location

    Snoop is an open data intelligence tool (OSINT world). Snoop Project is one of the most promising OSINT tools for finding nicknames. This is the most powerful software taking into account the CIS location. Is your life slideshow? Ask Snoop. Snoop project is developed without taking into account the opinions of the NSA and their friends, that is, it is available to the average user. Snoop is a research work (own database / closed bugbounty) in the field of searching and processing public data...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 17
    OpenWPM

    OpenWPM

    A web privacy measurement framework

    OpenWPM is a web privacy measurement framework that makes it easy to collect data for privacy studies on a scale of thousands to millions of websites. OpenWPM is built on top of Firefox, with automation provided by Selenium. It includes several hooks for data collection. Check out the instrumentation section below for more details. OpenWPM is tested on Ubuntu 18.04 via TravisCI and is commonly used via the docker container that this repo builds, which is also based on Ubuntu. Although we...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 18
    CloudEvents

    CloudEvents

    CloudEvents Specification

    Events are everywhere. However, event producers tend to describe events differently. The lack of a common way of describing events means developers must constantly re-learn how to consume events. This also limits the potential for libraries, tooling and infrastructure to aide the delivery of event data across environments, like SDKs, event routers or tracing systems. The portability and productivity we can achieve from event data is hindered overall. CloudEvents is a specification for...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    ScrapydWeb

    ScrapydWeb

    Web app for Scrapyd cluster management

    Web app for Scrapyd cluster management, with support for Scrapy log analysis & visualization. Make sure that Scrapyd has been installed and started on all of your hosts. Start ScrapydWeb via command scrapydweb. (a config file would be generated for customizing settings on the first startup.) Add your Scrapyd servers, both formats of string and tuple are supported, you can attach basic auth for accessing the Scrapyd server, as well as a string for grouping or labeling. You can select any...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 20
    UTMStack

    UTMStack

    Customizable SIEM and XDR powered by Real-Time correlation

    Welcome to the UTMStack open-source project! UTMStack is a unified threat management platform that merges SIEM (Security Information and Event Management) and XDR (Extended Detection and Response) technologies. Our unique approach allows real-time correlation of log data, threat intelligence, and malware activity patterns from multiple sources, enabling the identification and halting of complex threats that use stealthy techniques. UTMStack stands out in threat prevention by surpassing the...
    Downloads: 8 This Week
    Last Update:
    See Project
  • 21
    Algo VPN

    Algo VPN

    Set of Ansible scripts that simplifies the setup of a personal VPN

    Introducing Algo, a self-hosted personal VPN server designed for ease of deployment and security. Algo automatically deploys an on-demand VPN service in the cloud that is not shared with other users, relies on only modern protocols and ciphers, and includes only the minimal software you need. And it’s free. For anyone who is privacy conscious, travels for work frequently, or can’t afford a dedicated IT department, this one’s for you. Really, the paid-for services are just commercial...
    Downloads: 46 This Week
    Last Update:
    See Project
  • 22
    BrowserOS

    BrowserOS

    Agentic browser; privacy-first alternative to ChatGPT Atlas

    BrowserOS is an open-source, agentic web browser built on a Chromium base that integrates AI agents directly into the browsing experience. Rather than just doing standard browsing, it places AI intelligence at the core: you can connect your own API keys (for e.g., OpenAI, Anthropic, Google Gemini) or run local models (via e.g., Ollama) so that your browsing data and automation stay on your machine — privacy and control are emphasized throughout. The interface remains familiar to users of...
    Downloads: 36 This Week
    Last Update:
    See Project
  • 23
    Docker-ELK

    Docker-ELK

    The Elastic stack (ELK) powered by Docker and Compose

    A turnkey Docker Compose stack to spin up the ELK stack (Elasticsearch, Logstash, Kibana) for log collection, analysis, and visualization. Based on official Elastic images and enhanced with configuration defaults optimized for local development and testing.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 24
    PyFCM

    PyFCM

    Python client for FCM - Firebase Cloud Messaging

    Python client for FCM - Firebase Cloud Messaging (Android, iOS and Web) Firebase Cloud Messaging (FCM) is the new version of GCM. It inherits the reliable and scalable GCM infrastructure, plus new features. GCM users are strongly recommended to upgrade to FCM. Using FCM, you can notify a client app that new email or other data is available to sync.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    CyberScraper 2077

    CyberScraper 2077

    A Powerful web scraper powered by LLM | OpenAI, Gemini & Ollama

    CyberScraper 2077 is not just another web scraping tool – it's a glimpse into the future of data extraction. Born from the neon-lit streets of a cyberpunk world, this AI-powered scraper uses OpenAI, Gemini and LocalLLM Models to slice through the web's defenses, extracting the data you need with unparalleled precision and style.
    Downloads: 0 This Week
    Last Update:
    See Project
MongoDB Logo MongoDB