Showing 1214 open source projects for "python data analysis"

  • 1
    Granite TSFM

    Foundation Models for Time Series

    granite-tsfm collects public notebooks, utilities, and serving components for IBM’s Time Series Foundation Models (TSFM), giving practitioners a practical path from data prep to inference for forecasting and anomaly-detection use cases. The repository focuses on end-to-end workflows: loading data, building datasets, fine-tuning forecasters, running evaluations, and serving models. It documents the currently supported Python versions and points users to where the core TSFM models are hosted and how to wire up service components. ...
    Downloads: 3 This Week
  • 2
    Underthesea

    Underthesea - Vietnamese NLP Toolkit

    Underthesea is a Vietnamese NLP toolkit providing various text processing capabilities, including word segmentation, part-of-speech tagging, and named entity recognition.
    Downloads: 0 This Week
  • 3
    DeepLabCut

    Implementation of DeepLabCut

    ...This package is collaboratively developed by the Mathis Group & Mathis Lab at EPFL (releases prior to 2.1.9 were developed at Harvard University). The code is freely available and easy to install in a few clicks with Anaconda (and PyPI). DeepLabCut is an open-source Python package for animal pose estimation.
    Downloads: 7 This Week
  • 4
    Colab-MCP

    An MCP server for interacting with Google Colab

    ...This approach bridges the gap between local AI agents and remote high-performance compute environments, allowing users to offload heavy workloads such as machine learning training, data analysis, and dependency-heavy tasks to Colab’s GPU and TPU resources. By exposing Colab as an MCP server, the tool enables seamless integration with a wide range of AI assistants and agent frameworks, creating a standardized interface for tool use and execution.
    Downloads: 3 This Week
  • 5
    AutoResearchClaw

    Autonomous research from idea to paper. Chat an Idea. Get a Paper 🦞

    AutoResearchClaw is an open-source framework designed to automatically generate full academic research papers from a single idea or topic. Built in Python, it orchestrates a multi-stage research pipeline that gathers literature, formulates hypotheses, runs experiments, analyzes results, and writes the final paper. The system retrieves real academic references from sources such as arXiv and Semantic Scholar to ensure credible citations. It can automatically generate code for experiments, run...
    Downloads: 29 This Week
  • 6
    GitDiagram

    AI tool that converts GitHub repositories into interactive diagrams

    GitDiagram is an open source web application designed to help developers quickly understand the structure and architecture of GitHub repositories by automatically generating interactive diagrams. It analyzes repository metadata such as the file tree and project documentation to build a visual representation of how different components of a project relate to one another. It uses an AI-powered pipeline to interpret repository structure and transform that information into system design diagrams...
    Downloads: 3 This Week
  • 7
    Torch Pruning

    DepGraph: Towards Any Structural Pruning

    Torch-Pruning is an open-source toolkit designed to optimize deep neural networks by performing structural pruning directly within PyTorch models. The library focuses on reducing the size and computational cost of neural networks by removing redundant parameters and channels while maintaining model performance. It introduces a graph-based algorithm called DepGraph that automatically identifies dependencies between layers, allowing parameters to be pruned safely across complex architectures....
    Downloads: 5 This Week
  • 8
    OpenViking

    Context database designed specifically for AI Agents

    OpenViking is an open-source context database engineered for efficient indexing and retrieval of large amounts of unstructured or semi-structured context data used by AI applications. It’s primarily designed to serve as a high-performance, scalable backend for storing app context, embeddings, conversational histories, and other textual artifacts that need rapid lookup and semantic search, which makes it especially useful for systems like chatbots or memory-augmented agents. The project is...
    Downloads: 14 This Week
  • 9
    Eidos

    An extensible framework for Personal Data Management

    ...Unlike cloud-based knowledge tools, Eidos runs entirely on the user’s machine, ensuring privacy and high performance through local storage. The platform integrates large language models to enable AI-assisted features such as summarizing documents, translating content, and interacting with stored data conversationally. It also includes an extension system that allows developers to create custom tools, scripts, and workflows using programming languages such as TypeScript or Python.
    Downloads: 26 This Week
  • 10
    Browser Use

    Make websites accessible for AI agents

    Browser-Use is a framework that makes websites accessible for AI agents, enabling automated interactions and data extraction from web pages.
    Downloads: 6 This Week
  • 11
    SpeechRecognition

    Speech recognition module for Python

    ...The first software requirement is Python 2.6, 2.7, or Python 3.3+. This is required to use the library. PyAudio is required if and only if you want to use microphone input (Microphone). PyAudio version 0.2.11+ is required, as earlier versions have known memory management bugs when recording from microphones in certain situations. To hack on this library, first make sure you have all the requirements listed in the "Requirements" section.
    Downloads: 22 This Week
  • 12
    notebooklm-py

    Unofficial Python API and agentic skill for Google NotebookLM

    ...The project covers notebook management, source ingestion, conversational querying, research workflows, and sharing controls, while also enabling the generation of a wide range of study and media artifacts. These outputs include audio overviews, videos, slide decks, infographics, quizzes, flashcards, reports, data tables, and mind maps, with configurable formats and export options.
    Downloads: 15 This Week
  • 13
    Wan2.2

    Wan2.2: Open and Advanced Large-Scale Video Generative Model

    Wan2.2 is a major upgrade to the Wan series of open and advanced large-scale video generative models, incorporating cutting-edge innovations to boost video generation quality and efficiency. It introduces a Mixture-of-Experts (MoE) architecture that splits the denoising process across specialized expert models, increasing total model capacity without raising computational costs. Wan2.2 integrates meticulously curated cinematic aesthetic data, enabling precise control over lighting,...
    Downloads: 178 This Week
  • 14
    Transformer Debugger

    Tool for exploring and debugging transformer model behaviors

    Transformer Debugger (TDB) is a research tool developed by OpenAI’s Superalignment team to investigate and interpret the behaviors of small language models. It combines automated interpretability methods with sparse autoencoders, enabling researchers to analyze how specific neurons, attention heads, and latent features contribute to a model’s outputs. TDB allows users to intervene directly in the forward pass of a model and observe how such interventions change predictions, making it...
    Downloads: 1 This Week
  • 15
    Petastorm

    Petastorm library enables single machine or distributed training

    ...It can also be used from pure Python code. A dataset created using Petastorm is stored in Apache Parquet format. On top of a Parquet schema, petastorm also stores higher-level schema information that makes multidimensional arrays into a native part of a petastorm dataset. Petastorm supports extensible data codecs. These enable a user to use one of the standard data compressions (jpeg, png) or implement her own.
    Downloads: 0 This Week
  • 16
    MathModelAgent

    An Agent Designed for Mathematical Modeling

    MathModelAgent is an AI agent system designed specifically for assisting with mathematical modeling tasks and academic problem solving. The platform automates the process of analyzing mathematical problems, constructing models, generating code for simulations or computations, and producing a complete research-style report. The project uses a multi-agent architecture where different specialized agents handle tasks such as problem interpretation, modeling design, programming implementation,...
    Downloads: 1 This Week
  • 17
    MemU

    MemU is an open-source memory framework for AI companions

    MemU is an agentic memory layer for LLM applications, specifically designed for AI companions. Transform your memory into an intelligent file system that automatically organizes, connects, and evolves with your memories. Simple, fast, and reliable memory infrastructure for AI applications. Powerful tools and dedicated support to scale your AI applications with confidence. Full proprietary features, commercial usage rights, and white-labeling options for your enterprise needs. SSO/RBAC...
    Downloads: 19 This Week
  • 18
    pi-autoresearch

    Autonomous experiment loop extension for pi

    pi-autoresearch is an automation-oriented research assistant project that focuses on orchestrating iterative information gathering, analysis, and synthesis workflows with minimal human intervention. It is designed to simulate a continuous research loop where queries are generated, refined, and expanded based on previous outputs, enabling deeper exploration of complex topics. The system likely integrates with external data sources or APIs to retrieve information and process it into structured insights. ...
    Downloads: 5 This Week
  • 19
    uqlm

    Uncertainty Quantification for Language Models, a Python package

    UQLM is a Python library developed to detect hallucinations and quantify uncertainty in the outputs of large language models. The system implements a variety of uncertainty quantification techniques that assign confidence scores to model responses. These scores help developers determine how likely a generated answer is to contain errors or fabricated information.
    Downloads: 4 This Week
  • 20
    MineContext

    MineContext is your proactive context-aware AI partner

    MineContext is an open-source, proactive AI assistant designed to capture, understand, and leverage a user’s digital context in order to provide meaningful insights, summaries, and productivity support. The system continuously collects contextual data from sources such as screenshots and user activity, then processes and organizes this information into structured knowledge that can be reused later. Unlike traditional chat-based assistants, MineContext operates in the background and delivers...
    Downloads: 9 This Week
  • 21
    UMAP

    Uniform Manifold Approximation and Projection

    Uniform Manifold Approximation and Projection (UMAP) is a dimension reduction technique that can be used for visualization similarly to t-SNE, but also for general non-linear dimension reduction. It is possible to model the manifold with a fuzzy topological structure. The embedding is found by searching for a low-dimensional projection of the data that has the closest possible equivalent fuzzy topological structure. First of all, UMAP is fast. It can handle large datasets and high dimensional...
    Downloads: 6 This Week
  • 22
    SetFit

    Efficient few-shot learning with Sentence Transformers

    SetFit is an efficient and prompt-free framework for few-shot fine-tuning of Sentence Transformers. It achieves high accuracy with little labeled data - for instance, with only 8 labeled examples per class on the Customer Reviews sentiment dataset, SetFit is competitive with fine-tuning RoBERTa Large on the full training set of 3k examples.
    Downloads: 7 This Week
  • 23
    TaxHacker

    Self-hosted AI accounting app. LLM analyzer for receipts

    ...It integrates large language models to analyze these documents, extract relevant financial information, and categorize expenses or income based on configurable rules. Users can deploy the application on their own infrastructure, ensuring that financial data remains private and under their control rather than being processed by external services. The software provides tools for tracking income streams, monitoring expenses, and organizing financial records in a structured format. Because the system supports customizable prompts and categories, users can adapt the AI analysis to match their accounting workflows or tax requirements.
    Downloads: 8 This Week
  • 24
    SageMaker Training Toolkit

    Train machine learning models within Docker containers

    Train machine learning models within a Docker container using Amazon SageMaker. Amazon SageMaker is a fully managed service for data science and machine learning (ML) workflows. You can use Amazon SageMaker to simplify the process of building, training, and deploying ML models. To train a model, you can include your training script and dependencies in a Docker container that runs your training code. A container provides an effectively isolated environment, ensuring a consistent runtime and...
    Downloads: 5 This Week
  • 25
    High-Level Training Utilities Pytorch

    High-level training, data augmentation, and utilities for Pytorch

    Contains significant improvements, bug fixes, and additional support. Get it from the releases, or pull the master branch. This package provides a few things. A high-level module for Keras-like training with callbacks, constraints, and regularizers. Comprehensive data augmentation, transforms, sampling, and loading. Utility tensor and variable functions so you don't need numpy as often. Have any feature requests? Submit an issue! I'll make it happen. Specifically, any data augmentation, data...
    Downloads: 5 This Week