Showing 998 open source projects for "python data analysis"

View related business solutions
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • Full-stack observability with actually useful AI | Grafana Cloud Icon
    Full-stack observability with actually useful AI | Grafana Cloud

    Our generous forever free tier includes the full platform, including the AI Assistant, for 3 users with 10k metrics, 50GB logs, and 50GB traces.

    Built on open standards like Prometheus and OpenTelemetry, Grafana Cloud includes Kubernetes Monitoring, Application Observability, Incident Response, plus the AI-powered Grafana Assistant. Get started with our generous free tier today.
    Create free account
  • 1
    Metaflow

    Metaflow

    A framework for real-life data science

    Metaflow is a human-friendly Python library that helps scientists and engineers build and manage real-life data science projects. Metaflow was originally developed at Netflix to boost productivity of data scientists who work on a wide variety of projects from classical statistics to state-of-the-art deep learning.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 2
    JC

    JC

    CLI tool and python library

    ...The JC parsers can also be used as python modules. In this case, the output will be a python dictionary, or a list of dictionaries, instead of JSON. Two representations of the data are available. The default representation uses a strict schema per parser and converts known numbers to int/float JSON values. Certain known values of None are converted to JSON null, known boolean values are converted, and, in some cases, additional semantic context fields are added.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 3
    LangExtract

    LangExtract

    A Python library for extracting structured information

    LangExtract is a Python library developed by Google that leverages large language models (LLMs) to extract structured information from unstructured text—such as clinical notes, research papers, or literary works—based on user-defined instructions. It is designed to transform free-form text into reliable, schema-constrained data while maintaining traceability back to the source material.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 4
    Superduper

    Superduper

    Superduper: Integrate AI models and machine learning workflows

    Superduper is a Python-based framework for building end-2-end AI-data workflows and applications on your own data, integrating with major databases. It supports the latest technologies and techniques, including LLMs, vector-search, RAG, and multimodality as well as classical AI and ML paradigms. Developers may leverage Superduper by building compositional and declarative objects that out-source the details of deployment, orchestration versioning, and more to the Superduper engine. ...
    Downloads: 8 This Week
    Last Update:
    See Project
  • AI-generated apps that pass security review Icon
    AI-generated apps that pass security review

    Stop waiting on engineering. Build production-ready internal tools with AI—on your company data, in your cloud.

    Retool lets you generate dashboards, admin panels, and workflows directly on your data. Type something like “Build me a revenue dashboard on my Stripe data” and get a working app with security, permissions, and compliance built in from day one. Whether on our cloud or self-hosted, create the internal software your team needs without compromising enterprise standards or control.
    Try Retool free
  • 5
    dplyr

    dplyr

    dplyr: A grammar of data manipulation

    dplyr is an R package that provides a consistent and intuitive grammar for data manipulation, enabling users to filter, arrange, summarize, and transform data efficiently. Part of the tidyverse ecosystem, dplyr simplifies complex data operations through a clear and readable syntax, whether working with data frames, tibbles, or databases. It is widely used in data science and statistical analysis workflows.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 6
    Kapacitor

    Kapacitor

    Open source framework for processing, monitoring, and alerting

    Open source framework for processing, monitoring, and alerting on time series data. Kapacitor is a real-time data processing engine for monitoring and alerting, specifically designed to work with time-series data from InfluxDB.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 7
    ConcurrentSim.jl

    ConcurrentSim.jl

    Discrete event process oriented simulation framework written in Julia

    A discrete event process-oriented simulation framework written in Julia inspired by the Python library SimPy. One of the longest-lived Julia packages (originally under the name SimJulia).
    Downloads: 6 This Week
    Last Update:
    See Project
  • 8
    pywebview

    pywebview

    Build GUI for your Python program with JavaScript, HTML, and CSS

    pywebview is a lightweight cross-platform wrapper around a webview component that allows to display HTML content in its own native GUI window. It gives you power of web technologies in your desktop application, hiding the fact that GUI is browser based. You can use pywebview either with a lightweight web framework like Flask or Bottle or on its own with a two way bridge between Python and DOM. pywebview uses native GUI for creating a web component window: WinForms on Windows, Cocoa on macOS...
    Downloads: 10 This Week
    Last Update:
    See Project
  • 9
    Ethereum ETL

    Ethereum ETL

    Python scripts for ETL (extract, transform and load) jobs for Ethereum

    Python scripts for ETL (extract, transform and load) jobs for Ethereum blocks, transactions, ERC20 / ERC721 tokens, transfers, receipts, logs, contracts, internal transactions. Data is available in Google BigQuery. Ethereum ETL lets you convert blockchain data into convenient formats like CSVs and relational databases.
    Downloads: 2 This Week
    Last Update:
    See Project
  • $300 in Free Credit Towards Top Cloud Services Icon
    $300 in Free Credit Towards Top Cloud Services

    Build VMs, containers, AI, databases, storage—all in one place.

    Start your project in minutes. After credits run out, 20+ products include free monthly usage. Only pay when you're ready to scale.
    Get Started
  • 10
    Soufflé

    Soufflé

    Datalog variant for tool designers crafting analyses in Horn clauses

    Rapid prototyping for your analysis problems with logic; enabling deep design-space explorations; designed for large-scale static analysis; e.g., points-to analysis for Java, taint-analysis, and security checks. Futamura projections/partial evaluation for effective translation to parallel C++; optimized staged compilation; specialized data-structures for logical relations.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    ta4j

    ta4j

    A Java library for technical analysis

    Ta4j is an open-source Java library for technical analysis. It provides the basic components for the creation, evaluation, and execution of trading strategies. Ta4j is available on Maven Central. You can also download example code from the maven central repository. The wiki is the best place to start learning about ta4j. For more detailed questions, please use the issues tracker. We can calculate indicators over this bar series, in order to forecast the direction of prices through the study of past market data. ...
    Downloads: 23 This Week
    Last Update:
    See Project
  • 12
    Gretel Synthetics

    Gretel Synthetics

    Synthetic data generators for structured and unstructured text

    Unlock unlimited possibilities with synthetic data. Share, create, and augment data with cutting-edge generative AI. Generate unlimited data in minutes with synthetic data delivered as-a-service. Synthesize data that are as good or better than your original dataset, and maintain relationships and statistical insights. Customize privacy settings so that data is always safe while remaining useful for downstream workflows. Ensure data accuracy and privacy confidently with expert-grade reports....
    Downloads: 6 This Week
    Last Update:
    See Project
  • 13
    Dagster

    Dagster

    An orchestration platform for the development, production

    Dagster is an orchestration platform for the development, production, and observation of data assets. Dagster as a productivity platform: With Dagster, you can focus on running tasks, or you can identify the key assets you need to create using a declarative approach. Embrace CI/CD best practices from the get-go: build reusable components, spot data quality issues, and flag bugs early. Dagster as a robust orchestration engine: Put your pipelines into production with a robust...
    Downloads: 16 This Week
    Last Update:
    See Project
  • 14
    Behaviour Suite Reinforcement Learning

    Behaviour Suite Reinforcement Learning

    bsuite is a collection of carefully-designed experiments

    bsuite is a research framework developed by Google DeepMind that provides a comprehensive collection of experiments for evaluating the core capabilities of reinforcement learning (RL) agents. Its main goal is to identify, measure, and analyze fundamental aspects of learning efficiency and generalization in RL algorithms. The library enables researchers to benchmark their agents on standardized tasks, facilitating reproducible and transparent comparisons across different approaches. Each...
    Downloads: 8 This Week
    Last Update:
    See Project
  • 15
    APIPark

    APIPark

    APIPark is the #1 open-source AI Gateway and Developer Portal

    ...No matter which AI model you use, APIPark provides a one-stop integration solution. It unifies the management of all authentication information and tracks the costs of API calls. Standardize the request data format for all AI models. When switching AI models or modifying prompts, it won’t affect your app or microservices, simplifying your AI usage and reducing maintenance costs. You can quickly combine AI models and prompts into new APIs. For example, using OpenAI GPT-4 and custom prompts, you can create sentiment analysis APIs, translation APIs, or data analysis APIs. ...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 16

    Protocol Buffers

    Google's data interchange format

    Protocol Buffers are Google’s fast and simple, language- and platform-neutral, extensible mechanism for serializing structured data. It allows you to define how your data should be structured once, and then using a special generated source code, you can then easily write and read your structured data to and from a variety of data streams and using a variety of languages. Protocol Buffers currently supports a wide array of languages, including C++, Java, Python, Ruby, and many others with more to come.
    Downloads: 27 This Week
    Last Update:
    See Project
  • 17
    CTGAN

    CTGAN

    Conditional GAN for generating synthetic tabular data

    CTGAN is a collection of Deep Learning based synthetic data generators for single table data, which are able to learn from real data and generate synthetic data with high fidelity. If you're just getting started with synthetic data, we recommend installing the SDV library which provides user-friendly APIs for accessing CTGAN. The SDV library provides wrappers for preprocessing your data as well as additional usability features like constraints. When using the CTGAN library directly, you may...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 18
    bearer

    bearer

    Code security scanning tool (SAST) to discover security risks

    Welcome to the Bearer documentation. Bearer is a static application security testing (SAST) tool that scans your source code and analyzes your data flows to discover, filter and prioritize security risks and vulnerabilities leading to sensitive data exposures (PII, PHI, PD). We provides built-in rules against a common set of security risks and vulnerabilities, known as OWASP Top 10. Leakage of sensitive data through cookies, internal loggers, third-party logging services, and into analytics...
    Downloads: 13 This Week
    Last Update:
    See Project
  • 19
    Dash

    Dash

    Build beautiful web-based analytic apps, no JavaScript required

    Dash is a Python framework for building beautiful analytical web applications without any JavaScript. Built on top of Plotly.js, React and Flask, Dash easily achieves what an entire team of designers and engineers normally would. It ties modern UI controls and displays such as dropdown menus, sliders and graphs directly to your analytical Python code, and creates exceptional, interactive analytics apps. Dash apps are very lightweight, requiring only a limited number of lines of Python or...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 20
    Flow

    Flow

    A static type checker for JavaScript

    Flow is a static type checker for JavaScript. It was designed to help improve code quality and developer productivity. It does this through several smart capabilities. First, it identifies problems as you code, so you no longer have to waste time guessing and checking again and again. Second, it understands your code and makes its knowledge available, allowing you to build other smart tools on top of it. Third, it helps you refactor safely so you can focus on the changes you want to make and...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 21
    GDScript Toolkit

    GDScript Toolkit

    Independent set of GDScript tools - parser, linter and formatter

    Independent set of GDScript tools, parser, linter and formatter. This project provides a set of tools for daily work with GDScript. At the moment it provides a parser that produces a parse tree for debugging and educational purposes. A linter that performs a static analysis according to some predefined configuration. A formatter that formats the code according to some predefined rules. A code metrics calculator which calculates the cyclomatic complexity of functions and classes. To install...
    Downloads: 9 This Week
    Last Update:
    See Project
  • 22
    NVTX (NVIDIA Tools Extension Library)

    NVTX (NVIDIA Tools Extension Library)

    C-based Application Programming Interface (API)

    NVTX (NVIDIA Tools Extension) is a cross-platform API designed to annotate source code with rich metadata that can be consumed by developer profiling and debugging tools. It allows developers to insert markers, ranges, and events directly into their applications, providing contextual insight into how code executes on CPUs and GPUs. These annotations are visualized in tools such as NVIDIA Nsight Systems and Nsight Compute, enabling developers to identify performance bottlenecks, track...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 23
    Peroxide

    Peroxide

    Rust numeric library with high performance and friendly syntax

    Rust numeric library contains linear algebra, numerical analysis, statistics and machine learning tools with R, MATLAB, Python-like macros. Peroxide uses a 1D data structure to represent matrices, making it straightforward to integrate with BLAS (Basic Linear Algebra Subprograms). This means that Peroxide can guarantee excellent performance for linear algebraic computations by leveraging the optimized routines provided by BLAS.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 24
    gosec

    gosec

    Golang security checker

    ...Licensed under the Apache License, Version 2.0 (the "License"). You may not use this file except in compliance with the License. You can integrate third-party code analysis tools with GitHub code scanning by uploading data as SARIF files. The workflow shows an example of running the gosec as a step in a GitHub action workflow that outputs the results.sarif file. The workflow then uploads the results.sarif file to GitHub using the upload-serif action. Gosec can be configured to only run a subset of rules, to exclude certain file paths, and produce reports in different formats. ...
    Downloads: 13 This Week
    Last Update:
    See Project
  • 25
    RAG Anything

    RAG Anything

    RAG-Anything: All-in-One RAG Framework

    ...Traditional RAG systems are typically limited to text and cannot effectively work across heterogeneous document layouts, but RAG-Anything addresses this by modeling multimodal content in ways that preserve cross-modal relationships and semantic context, often treating content elements as interconnected knowledge entities rather than separate data silos. The system uses a multi-stage pipeline (e.g., document parsing, content analysis, knowledge graph construction, intelligent retrieval) so queries can navigate across modalities with deeper understanding and relevance.
    Downloads: 6 This Week
    Last Update:
    See Project
MongoDB Logo MongoDB