Showing 1057 open source projects for "python data analysis"

  • 1
    NVIDIA FLARE

    NVIDIA Federated Learning Application Runtime Environment

    NVIDIA FLARE is a domain-agnostic, open-source, extensible SDK that allows researchers and data scientists to adapt existing ML/DL workflows (PyTorch, TensorFlow, Scikit-learn, XGBoost, etc.) to a federated paradigm. It enables platform developers to build a secure, privacy-preserving offering for a distributed multi-party collaboration. NVIDIA FLARE is built on a componentized architecture that allows you to take federated...
    Downloads: 9 This Week
  • 2
    PaddleNLP

    Easy-to-use and powerful NLP library with Awesome model zoo

    PaddleNLP is a natural language processing development library for PaddlePaddle with three major features: an easy-to-use text API, application examples for multiple scenarios, and high-performance distributed training. It aims to improve PaddlePaddle developers' modeling efficiency in the text domain and to provide rich examples of NLP applications. Provide rich industry-level pre-task capabilities...
    Downloads: 3 This Week
  • 3
    MiniSom

    MiniSom is a minimalistic implementation of the Self Organizing Maps

    MiniSom is a minimalistic, NumPy-based implementation of Self-Organizing Maps (SOM). A SOM is a type of artificial neural network able to convert complex, nonlinear statistical relationships between high-dimensional data items into simple geometric relationships on a low-dimensional display. MiniSom is designed to allow researchers to easily build on top of it and to give students the ability to quickly grasp its details. The project initially aimed for a minimalistic implementation of...
    Downloads: 6 This Week
  • 4
    Nixtla TimeGPT

    TimeGPT-1: production ready pre-trained Time Series Foundation Model

    TimeGPT is a production-ready generative pretrained transformer for time series. It can produce accurate forecasts across domains such as retail, electricity, finance, and IoT with just a few lines of code. Whether you're a bank forecasting market trends or a startup predicting product demand, TimeGPT democratizes access to cutting-edge predictive insights, eliminating the need for a dedicated team of machine learning engineers. A generative model for time series. TimeGPT is capable of...
    Downloads: 6 This Week
  • 5
    bbox-visualizer

    Make drawing and labeling bounding boxes easy as cake

    Make drawing and labeling bounding boxes easy as cake. This package helps users draw bounding boxes around objects, without doing the clumsy math that you'd need to do for positioning the labels. It also has a few different types of visualizations you can use for labeling objects after identifying them. There are optional functions that can draw multiple bounding boxes and/or write multiple labels on the same image, but it is advisable to use the above functions in a loop in order to have...
    Downloads: 2 This Week
  • 6
    talos

    Hyperparameter Optimization for TensorFlow, Keras and PyTorch

    Talos radically changes the ordinary Keras, TensorFlow (tf.keras), and PyTorch workflow by fully automating hyperparameter tuning and model evaluation. Talos exposes Keras and TensorFlow (tf.keras) and PyTorch functionality entirely and there is no new syntax or templates to learn. Talos is made for data scientists and data engineers that want to remain in complete control of their TensorFlow (tf.keras) and PyTorch models, but are tired of mindless parameter hopping and confusing...
    Downloads: 2 This Week
  • 7
    LangChain

    ⚡ Building applications with LLMs through composability ⚡

    Large language models (LLMs) are emerging as a transformative technology, enabling developers to build applications that they previously could not. But using these LLMs in isolation is often not enough to create a truly powerful app - the real power comes when you can combine them with other sources of computation or knowledge. This library is aimed at assisting in the development of those types of applications.
    Downloads: 4 This Week
  • 8
    UpTrain

    Your open-source LLM evaluation toolkit

    Get scores for factual accuracy, context retrieval quality, guideline adherence, tonality, and many more. You can’t improve what you can’t measure. UpTrain continuously monitors your application's performance on multiple evaluation criteria and alerts you to any regressions, with automatic root cause analysis. UpTrain enables fast and robust experimentation across multiple prompts, model providers, and custom configurations, by calculating quantitative scores for direct comparison...
    Downloads: 6 This Week
  • 9
    JamAI Base

    The collaborative spreadsheet for AI

    JamAI Base is an open-source backend platform designed to simplify the development of retrieval-augmented generation systems and AI-driven applications. The platform integrates both a relational database and a vector database into a single embedded architecture, allowing developers to store structured data alongside semantic embeddings. It includes built-in orchestration for large language models, vector search, and reranking pipelines so that AI applications can retrieve relevant...
    Downloads: 8 This Week
  • 10
    NVIDIA Generative AI Examples

    Generative AI reference workflows

    NVIDIA GenerativeAIExamples is an open-source repository that provides practical reference implementations and example workflows for building generative AI applications using NVIDIA’s software ecosystem. The project is designed to help developers accelerate the development of AI applications by providing ready-to-run pipelines, notebooks, and tools that demonstrate how to integrate large language models into real-world systems. The repository includes examples covering topics such as...
    Downloads: 8 This Week
  • 11
    Flowly AI

    Flowly is 100x faster than OpenClaw

    Flowly is an open-source personal AI assistant that runs locally on your machine and connects to multiple communication platforms like Telegram, WhatsApp, Discord, and Slack. It acts as a centralized AI system that can perform tasks such as web browsing, file management, command execution, scheduling, and more—all while keeping your data private. Designed for flexibility, Flowly supports multiple AI providers and models through LiteLLM, allowing users to customize how their assistant...
    Downloads: 8 This Week
  • 12
    MedicalGPT

    MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training

    MedicalGPT trains large medical GPT models using the ChatGPT training pipeline, implementing secondary pre-training, supervised fine-tuning, reward modeling, and reinforcement learning.
    Downloads: 8 This Week
  • 13
    Orion

    A machine learning library for detecting anomalies in signals

    Orion is a machine-learning library built for unsupervised time series anomaly detection. Such signals are generated by a wide variety of systems; examples include telemetry data generated by satellites, signals from wind turbines, and even stock market price tickers. We built this to provide one place where users can find the latest and greatest in the machine learning and deep learning world, including our own innovations. Abstract away from the users the nitty-gritty about preprocessing,...
    Downloads: 8 This Week
  • 14
    Memobase

    Fast backend for long-term AI user memory via structured profiles

    Memobase is an open source backend system that enables long-term user memory functionality for AI applications by capturing and structuring information about users across interactions. Its design centers on creating user profiles and recording event timelines, allowing AI systems to remember, understand, and evolve in their behaviour toward individual users over time. Instead of relying purely on traditional embedding-based retrieval or RAG systems, Memobase uses profile and timeline...
    Downloads: 5 This Week
  • 15
    Cognita

    Open source RAG framework for building scalable modular AI apps

    Cognita is an open source framework designed to help developers build, organize, and deploy Retrieval-Augmented Generation (RAG) applications in a structured and production-ready way. It addresses the gap between quick experimentation in notebooks and the complexity of deploying scalable AI systems by introducing a modular and API-driven architecture. Cognita provides reusable components such as parsers, data loaders, embedders, retrievers, and query controllers, allowing teams to customize...
    Downloads: 3 This Week
  • 16
    docext

    An on-premises, OCR-free unstructured data extraction

    docext is a document intelligence toolkit that uses vision-language models to extract structured information from documents such as PDFs, forms, and scanned images. The system is designed to operate entirely on-premises, allowing organizations to process sensitive documents without relying on external cloud services. Unlike traditional document processing pipelines that rely heavily on optical character recognition, docext leverages multimodal AI models capable of understanding both visual...
    Downloads: 3 This Week
  • 17
    Triton Inference Server

    The Triton Inference Server provides an optimized cloud

    Triton Inference Server is open-source inference serving software that streamlines AI inferencing. Triton enables teams to deploy any AI model from multiple deep learning and machine learning frameworks, including TensorRT, TensorFlow, PyTorch, ONNX, OpenVINO, Python, RAPIDS FIL, and more. Triton supports inference across cloud, data center, edge, and embedded devices on NVIDIA GPUs, x86 and ARM CPUs, or AWS Inferentia. Triton delivers optimized performance for many query types, including real-time, batched, ensembles, and audio/video streaming. It provides a Backend API that allows adding custom backends and pre/post-processing operations. ...
    Downloads: 11 This Week
  • 18
    PyTorch Geometric

    Geometric deep learning extension library for PyTorch

    It consists of various methods for deep learning on graphs and other irregular structures, also known as geometric deep learning, from a variety of published papers. In addition, it consists of an easy-to-use mini-batch loader for many small and single giant graphs, a large number of common benchmark datasets (based on simple interfaces to create your own), and helpful transforms, both for learning on arbitrary graphs as well as on 3D meshes or point clouds. We have outsourced a lot of...
    Downloads: 1 This Week
  • 19
    AutoClip

    AI-powered video clipping and highlight generation

    AutoClip is an open-source, AI-powered video processing system designed to automate the extraction of “highlight” segments from full-length videos — ideal for creators who want to generate bite-sized clips, compilations, or highlight reels without manually sifting through hours of footage. The system supports downloading videos from major platforms (e.g. YouTube, Bilibili), or accepting local uploads, and then applies AI analysis to identify segments worth clipping based on content (e.g....
    Downloads: 17 This Week
  • 20
    Paperless-ngx

    A community-supported supercharged version of paperless

    Paperless-ngx is a community-supported open-source document management system that transforms your physical documents into a searchable online archive so you can keep, well, less paper.
    Downloads: 16 This Week
  • 21
    nanoGPT

    The simplest, fastest repository for training/finetuning models

    NanoGPT is a minimalistic yet powerful reimplementation of GPT-style transformers created by Andrej Karpathy for educational and research use. It distills the GPT architecture into a few hundred lines of Python code, making it far easier to understand than large, production-scale implementations. The repo is organized with a training pipeline (dataset preprocessing, model definition, optimizer, training loop) and inference script so you can train a small GPT on text datasets like Shakespeare...
    Downloads: 3 This Week
  • 22
    Continuous Claude v3

    Context management for Claude Code. Hooks maintain state via ledgers

    Continuous Claude v3 is a persistent, multi-agent development environment built around the Claude Code CLI that aims to overcome the limitations of standard LLM context windows. Rather than relying on a single session’s context, Continuous Claude uses mechanisms like ledgers, YAML handoffs, and a memory system to preserve and recall state across multiple sessions, ensuring that learned insights and plans are not lost when context compaction occurs. The project orchestrates many specialized...
    Downloads: 0 This Week
  • 23
    OmAgent

    Build multimodal language agents for fast prototype and production

    OmAgent is an open-source Python framework designed to simplify the development of multimodal language agents that can reason, plan, and interact with different types of data sources. The framework provides abstractions and infrastructure for building AI agents that operate on text, images, video, and audio while maintaining a relatively simple interface for developers.
    Downloads: 7 This Week
  • 24
    DeepSeek-V3.2-Exp

    An experimental version of DeepSeek model

    DeepSeek-V3.2-Exp is an experimental release of the DeepSeek model family, intended as a stepping stone toward the next generation architecture. The key innovation in this version is DeepSeek Sparse Attention (DSA), a sparse attention mechanism that aims to optimize training and inference efficiency in long-context settings without degrading output quality. According to the authors, they aligned the training setup of V3.2-Exp with V3.1-Terminus so that benchmark results remain largely...
    Downloads: 31 This Week
  • 25
    Deep Lake

    Data Lake for Deep Learning. Build, manage, and query datasets

    Deep Lake (formerly known as Activeloop Hub) is a data lake for deep learning applications. Our open-source dataset format is optimized for rapid streaming and querying of data while training models at scale, and it includes a simple API for creating, storing, and collaborating on AI datasets of any size. It can be deployed locally or in the cloud, and it enables you to store all of your data in one place, ranging from simple annotations to large videos. Deep Lake is used by Google, Waymo,...
    Downloads: 0 This Week