Showing 532 open source projects for "python source"

View related business solutions
  • Cut Data Warehouse Costs up to 54% with BigQuery Icon
    Cut Data Warehouse Costs up to 54% with BigQuery

    Migrate from Snowflake, Databricks, or Redshift with free migration tools. Exabyte scale without the Exabyte price.

    BigQuery delivers up to 54% lower TCO than cloud alternatives. Migrate from legacy or competing warehouses using free BigQuery Migration Service with automated SQL translation. Get serverless scale with no infrastructure to manage, compressed storage, and flexible pricing—pay per query or commit for deeper discounts. New customers get $300 in free credit.
    Try BigQuery Free
  • Build AI Apps with Gemini 3 on Vertex AI Icon
    Build AI Apps with Gemini 3 on Vertex AI

    Access Google’s most capable multimodal models. Train, test, and deploy AI with 200+ foundation models on one platform.

    Vertex AI gives developers access to Gemini 3—Google’s most advanced reasoning and coding model—plus 200+ foundation models including Claude, Llama, and Gemma. Build generative AI apps with Vertex AI Studio, customize with fine-tuning, and deploy to production with enterprise-grade MLOps. New customers get $300 in free credits.
    Try Vertex AI Free
  • 1
    FiftyOne

    FiftyOne

    The open-source tool for building high-quality datasets

    The open-source tool for building high-quality datasets and computer vision models. Nothing hinders the success of machine learning systems more than poor-quality data. And without the right tools, improving a model can be time-consuming and inefficient. FiftyOne supercharges your machine learning workflows by enabling you to visualize datasets and interpret models faster and more effectively. Improving data quality and understanding your model’s failure modes are the most impactful ways to...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 2
    DearPyGui

    DearPyGui

    Graphical User Interface Toolkit for Python with minimal dependencies

    Dear PyGui is an easy-to-use, dynamic, GPU-Accelerated, cross-platform graphical user interface toolkit(GUI) for Python. It is “built with” Dear ImGui. Features include traditional GUI elements such as buttons, radio buttons, menus, and various methods to create a functional layout. Additionally, DPG has an incredible assortment of dynamic plots, tables, drawings, debuggers, and multiple resource viewers. DPG is well suited for creating simple user interfaces as well as developing complex...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 3
    AWS SDK for pandas

    AWS SDK for pandas

    Easy integration with Athena, Glue, Redshift, Timestream, Neptune

    aws-sdk-pandas (formerly AWS Data Wrangler) bridges pandas with the AWS analytics stack so DataFrames flow seamlessly to and from cloud services. With a few lines of code, you can read from and write to Amazon S3 in Parquet/CSV/JSON/ORC, register tables in the AWS Glue Data Catalog, and query with Amazon Athena directly into pandas. The library abstracts efficient patterns like partitioning, compression, and vectorized I/O so you get performant data lake operations without hand-rolling...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    The PyPlot module for Julia

    The PyPlot module for Julia

    Plotting for Julia based on matplotlib.pyplot

    This module provides a Julia interface to the Matplotlib plotting library from Python, and specifically to the matplotlib.pyplot module. PyPlot uses the Julia PyCall package to call Matplotlib directly from Julia with little or no overhead (arrays are passed without making a copy). (See also PythonPlot.jl for a version of PyPlot.jl using the alternative PythonCall.jl package.) This package takes advantage of Julia's multimedia I/O API to display plots in any Julia graphical backend,...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Build on Google Cloud with $300 in Free Credit Icon
    Build on Google Cloud with $300 in Free Credit

    New to Google Cloud? Get $300 in free credit to explore Compute Engine, BigQuery, Cloud Run, Vertex AI, and 150+ other products.

    Start your next project with $300 in free Google Cloud credit. Spin up VMs, run containers, query exabytes in BigQuery, or build AI apps with Vertex AI and Gemini. Once your credits are used, keep building with 20+ products with free monthly usage, including Compute Engine, Cloud Storage, GKE, and Cloud Run functions. Sign up to start building right away.
    Start Free Trial
  • 5
    Apache Airflow Provider

    Apache Airflow Provider

    Great Expectations Airflow operator

    Due to apply_default decorator removal, this version of the provider requires Airflow 2.1.0+. If your Airflow version is 2.1.0, and you want to install this provider version, first upgrade Airflow to at least version 2.1.0. Otherwise, your Airflow package version will be upgraded automatically, and you will have to manually run airflow upgrade db to complete the migration. This operator currently works with the Great Expectations V3 Batch Request API only. If you would like to use the...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    VisPy

    VisPy

    Main repository for Vispy

    Vispy is an open-source, high-performance interactive visualization library in Python, designed for creating scientific visualizations and interactive plots. It leverages the power of modern Graphics Processing Units (GPUs) through OpenGL to render large datasets efficiently. Vispy supports a wide range of visualization types, including 2D plots, 3D visualizations, volume rendering, and more, making it suitable for scientific research, data analysis, and educational purposes.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    PySyft

    PySyft

    Data science on data without acquiring a copy

    Most software libraries let you compute over the information you own and see inside of machines you control. However, this means that you cannot compute on information without first obtaining (at least partial) ownership of that information. It also means that you cannot compute using machines without first obtaining control over those machines. This is very limiting to human collaboration and systematically drives the centralization of data, because you cannot work with a bunch of data...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 8
    PySR

    PySR

    High-Performance Symbolic Regression in Python and Julia

    PySR is an open-source tool for Symbolic Regression: a machine learning task where the goal is to find an interpretable symbolic expression that optimizes some objective. Over a period of several years, PySR has been engineered from the ground up to be (1) as high-performance as possible, (2) as configurable as possible, and (3) easy to use. PySR is developed alongside the Julia library SymbolicRegression.jl, which forms the powerful search engine of PySR. The details of these algorithms are...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    ClearML

    ClearML

    Streamline your ML workflow

    ClearML is an open source platform that automates and simplifies developing and managing machine learning solutions for thousands of data science teams all over the world. It is designed as an end-to-end MLOps suite allowing you to focus on developing your ML code & automation, while ClearML ensures your work is reproducible and scalable. The ClearML Python Package for integrating ClearML into your existing scripts by adding just two lines of code, and optionally extending your experiments and other workflows with ClearML powerful and versatile set of classes and methods. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Run Any Workload on Compute Engine VMs Icon
    Run Any Workload on Compute Engine VMs

    From dev environments to AI training, choose preset or custom VMs with 1–96 vCPUs and industry-leading 99.95% uptime SLA.

    Compute Engine delivers high-performance virtual machines for web apps, databases, containers, and AI workloads. Choose from general-purpose, compute-optimized, or GPU/TPU-accelerated machine types—or build custom VMs to match your exact specs. With live migration and automatic failover, your workloads stay online. New customers get $300 in free credits.
    Try Compute Engine
  • 10
    Metaflow

    Metaflow

    A framework for real-life data science

    Metaflow is a human-friendly Python library that helps scientists and engineers build and manage real-life data science projects. Metaflow was originally developed at Netflix to boost productivity of data scientists who work on a wide variety of projects from classical statistics to state-of-the-art deep learning.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    Population Shift Monitoring

    Population Shift Monitoring

    Monitor the stability of a Pandas or Spark dataframe

    popmon is a package that allows one to check the stability of a dataset. popmon works with both pandas and spark datasets. popmon creates histograms of features binned in time-slices, and compares the stability of the profiles and distributions of those histograms using statistical tests, both over time and with respect to a reference. It works with numerical, ordinal, categorical features, and the histograms can be higher-dimensional, e.g. it can also track correlations between any two...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    Timesketch

    Timesketch

    Collaborative forensic timeline analysis

    Timesketch is a collaborative forensic timeline analysis platform used to investigate security incidents by turning diverse evidence into a single, searchable chronology. Analysts ingest logs and artifacts from many sources—endpoints, servers, cloud services—and Timesketch normalizes them into events on a unified timeline. Powerful search, aggregations, and saved views help you pivot quickly, highlight anomalies, and preserve investigative steps for later review. The system supports tagging,...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 13
    Panda-Helper

    Panda-Helper

    Panda-Helper: Data profiling utility for Pandas DataFrames and Series

    Panda-Helper is a simple data-profiling utility for Pandas DataFrames and Series. Assess data quality and usefulness with minimal effort. Quickly perform initial data exploration, so you can move on to more in-depth analysis.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    Recap

    Recap

    Recap tracks and transform schemas across your whole application

    Recap is a schema language and multi-language toolkit to track and transform schemas across your whole application. Your data passes through web services, databases, message brokers, and object stores. Recap describes these schemas in a single language, regardless of which system your data passes through. Recap schemas can be defined in YAML, TOML, JSON, XML, or any other compatible language.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    Run Page

    Run Page

    Make your own running home page

    GitHub Actions manages automatic synchronization of runs and generation of new pages. Gatsby-generated static pages, fast. Support for Vercel (recommended) and GitHub Pages automated deployment. React Hooks. Mapbox for map display. Supports most sports apps such as nike strava. Automatically backup gpx data for easy backup and uploading to other software.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    AI Data Science Team

    AI Data Science Team

    An AI-powered data science team of agents

    AI Data Science Team is a Python library and agent ecosystem designed to accelerate and automate common data science workflows by modeling them as specialized AI “agents” that can be orchestrated to perform tasks like data cleaning, transformation, analysis, visualization, and machine learning. It provides a modular agent framework where each agent focuses on a step in the typical data science pipeline — for example, loading data from CSV/Excel files, cleaning and wrangling messy datasets,...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    nb-clean

    nb-clean

    Clean Jupyter notebooks of outputs, metadata, and empty cells

    nb-clean cleans Jupyter notebooks of cell execution counts, metadata, outputs, and (optionally) empty cells, preparing them for committing to version control. It provides both a Git filter and pre-commit hook to automatically clean notebooks before they're staged, and can also be used with other version control systems, as a command line tool, and as a Python library. It can determine if a notebook is clean or not, which can be used as a check in your continuous integration pipelines....
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    PyVista

    PyVista

    3D plotting and mesh analysis through a streamlined interface

    3D plotting and mesh analysis through a streamlined interface for the Visualization Toolkit (VTK). PyVista is a helper module for the Visualization Toolkit (VTK) that takes a different approach on interfacing with VTK through NumPy and direct array access. This package provides a Pythonic, well-documented interface exposing VTK’s powerful visualization backend to facilitate rapid prototyping, analysis, and visual integration of spatially referenced datasets. This module can be used for...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    classic.tplx

    classic.tplx

    A more accurate representation of jupyter notebooks

    A more accurate representation of Jupyter notebooks when converting to pdfs. This template was designed to make converted Jupyter notebooks look (almost) identical to the actual notebook. If something doesn't exist in the original notebook then it doesn't belong in the conversion. As of nbconvert 5.5.0, the majority of these improvements have been merged into nbconvert's default template. Version 3.x of this package will continue to support nbconvert 5.5.0 and lower, whereas in the future...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    NannyML

    NannyML

    Detecting silent model failure. NannyML estimates performance

    NannyML is an open-source python library that allows you to estimate post-deployment model performance (without access to targets), detect data drift, and intelligently link data drift alerts back to changes in model performance. Built for data scientists, NannyML has an easy-to-use interface, and interactive visualizations, is completely model-agnostic, and currently supports all tabular classification use cases.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    OpenCL.jl

    OpenCL.jl

    OpenCL Julia bindings

    Julia interface for the OpenCL parallel computation API. This package aims to be a complete solution for OpenCL programming in Julia, similar in scope to PyOpenCL for Python. It provides a high level API for OpenCL to make programing hardware accelerators, such as GPUs, FPGAs, and DSPs, as well as multicore CPUs much less onerous.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 22
    Encord Active

    Encord Active

    The toolkit to test, validate, and evaluate your models and surface

    Encord Active is an open-source toolkit to test, validate, and evaluate your models and surface, curate, and prioritize the most valuable data for labeling to supercharge model performance. Encord Active has been designed as a all-in-one open source toolkit for improving your data quality and model performance. Use the intuitive UI to explore your data or access all the functionalities programmatically. Discover errors, outliers, and edge-cases within your data - all in one open source...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    Pandas Profiling

    Pandas Profiling

    Create HTML profiling reports from pandas DataFrame objects

    pandas-profiling generates profile reports from a pandas DataFrame. The pandas df.describe() function is handy yet a little basic for exploratory data analysis. pandas-profiling extends pandas DataFrame with df.profile_report(), which automatically generates a standardized univariate and multivariate report for data understanding. High correlation warnings, based on different correlation metrics (Spearman, Pearson, Kendall, Cramér’s V, Phik). Most common categories (uppercase, lowercase,...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 24
    airda

    airda

    airda(Air Data Agent

    airda(Air Data Agent) is a multi-smart body for data analysis, capable of understanding data development and data analysis needs, understanding data, generating data-oriented queries, data visualization, machine learning and other tasks of SQL and Python codes.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    KubeRay

    KubeRay

    A toolkit to run Ray applications on Kubernetes

    KubeRay is a powerful, open-source Kubernetes operator that simplifies the deployment and management of Ray applications on Kubernetes. It offers several key components. KubeRay core: This is the official, fully-maintained component of KubeRay that provides three custom resource definitions, RayCluster, RayJob, and RayService. These resources are designed to help you run a wide range of workloads with ease.
    Downloads: 0 This Week
    Last Update:
    See Project
MongoDB Logo MongoDB