Showing 1057 open source projects for "python data analysis"

  • 1
    llmware

    Unified framework for building enterprise RAG pipelines

    llmware is an open source framework designed to simplify the creation of enterprise-grade applications powered by large language models. The platform focuses on building secure and private AI workflows that can run locally on laptops, edge devices, or self-hosted servers without relying exclusively on cloud APIs. It provides a unified interface for constructing retrieval-augmented generation pipelines, agent workflows, and document intelligence applications. One of the framework’s defining...
    Downloads: 3 This Week
  • 2
    Code2Prompt

    Convert codebases into structured prompts optimized for LLM analysis

    code2prompt is an open source command line tool designed to convert an entire codebase into a structured prompt that can be easily used with large language models. It analyzes a project directory, gathers relevant source files, and formats them into a single prompt that includes the source tree and code content. This approach helps developers quickly provide full project context to AI models without manually copying files or assembling prompts. code2prompt is built in Rust and focuses on...
    Downloads: 5 This Week
  • 3
    gTTS

    Python library and CLI tool to interface with Google Translate

    gTTS (Google Text-to-Speech) is a Python library and command-line tool that wraps the speech functionality of Google Translate. It lets you send text to the Google Translate TTS endpoint and receive spoken audio back as MP3 data, written to a file, a file-like object, or standard output. The library is designed to handle long texts, using a speech-specific sentence tokenizer that keeps intonation and punctuation natural while splitting requests into acceptable chunks. ...
    Downloads: 8 This Week
  • 4
    spaCy models

    Models for the spaCy Natural Language Processing (NLP) library

    spaCy is designed to help you do real work, to build real products, or gather real insights. The library respects your time, and tries to avoid wasting it. It's easy to install, and its API is simple and productive. spaCy excels at large-scale information extraction tasks. It's written from the ground up in carefully memory-managed Cython. If your application needs to process entire web dumps, spaCy is the library you want to be using. Since its release in 2015, spaCy has become an industry...
    Downloads: 10 This Week
  • 5
    Xiyan MCP Server

    A Model Context Protocol (MCP) server

    The XiYan MCP Server is a Model Context Protocol (MCP) server that enables natural language queries to databases, powered by XiYan-SQL, a state-of-the-art text-to-SQL model. It allows users to interact with databases using conversational language, simplifying data retrieval processes.
    Downloads: 8 This Week
  • 6
    PyCaret

    An open-source, low-code machine learning library in Python

    ...This makes experiments exponentially fast and efficient. PyCaret is essentially a Python wrapper around several machine learning libraries and frameworks such as scikit-learn, XGBoost, LightGBM, CatBoost, Optuna, Hyperopt, Ray, and a few more. The design and simplicity of PyCaret are inspired by the emerging role of citizen data scientists, a term first used by Gartner. Citizen data scientists are power users who can perform both simple and moderately sophisticated analytical tasks that would previously have required more technical expertise.
    Downloads: 0 This Week
  • 7
    PipesHub

    Workplace AI platform for enterprise search and workflow automation

    PipesHub AI is an open-source, enterprise-grade workplace AI platform designed to unify search, knowledge management, and workflow automation across distributed organizational systems. It connects to a wide range of enterprise tools such as Google Workspace, Slack, Jira, and Confluence, aggregating data into a centralized knowledge layer that can be queried using natural language. The platform uses knowledge graphs and ranking algorithms to provide context-rich answers along with traceable...
    Downloads: 3 This Week
  • 8
    Qwen2-Audio

    Repo of Qwen2-Audio chat & pretrained large audio language model

    Qwen2-Audio is a large audio-language model by Alibaba Cloud, part of the Qwen series. It is trained to accept various audio signal inputs (including speech, sounds, etc.) and perform both voice chat and audio analysis, producing textual responses. It supports two major modes: Voice Chat (interactive voice only input) and Audio Analysis (audio + text instructions), with both base and instruction-tuned models. It is evaluated on many benchmarks (speech recognition, translation, sound...
    Downloads: 0 This Week
  • 9
    pmdarima

    Statistical library designed to fill the void in Python's time series

    A statistical library designed to fill the void in Python's time series analysis capabilities, including the equivalent of R's auto.arima function.
    Downloads: 0 This Week
  • 10
    Griptape

    Python framework for AI workflows and pipelines with chain of thought

    The Griptape framework provides developers with the ability to create AI systems that operate across two dimensions: predictability and creativity. For predictability, Griptape enforces structures like sequential pipelines, DAG-based workflows, and long-term memory. To facilitate creativity, Griptape safely prompts LLMs with tools (keeping output data off prompt by using short-term memory), which connects them to external APIs and data stores. The framework allows developers to transition...
    Downloads: 6 This Week
  • 11
    WanGP

    AI video generator optimized for low-VRAM and older GPUs

    Wan2GP is an open source AI video generation toolkit designed to make modern generative models accessible on consumer-grade hardware with limited GPU memory. It acts as a unified interface for running multiple video, image, and audio generation models, including Wan-based models as well as other systems like Hunyuan Video, Flux, and Qwen. A key focus of the project is reducing VRAM requirements, enabling some workflows to run on as little as 6 GB while still supporting older Nvidia and...
    Downloads: 40 This Week
  • 12
    DeepWiki Open

    AI-Powered Wiki Generator for GitHub/GitLab/Bitbucket Repositories

    DeepWiki Open is an open-source, AI-powered wiki generator that automatically creates fully navigable, richly structured wiki documentation for GitHub, GitLab, or Bitbucket repositories by combining code analysis, vector embeddings, retrieval-augmented generation (RAG), and visualization tools. Users can enter a repository URL and the system will clone the project, build semantic embeddings of its codebase, extract architecture and relationships, generate human-readable documentation, and...
    Downloads: 0 This Week
  • 13
    ChemCrow

    Chemcrow

    ChemCrow is an AI-powered framework designed to assist in chemical research and discovery. It integrates AI models with chemical knowledge bases to provide intelligent recommendations for synthesis planning, reaction prediction, and material discovery. This tool helps automate and accelerate research in computational chemistry and drug development.
    Downloads: 12 This Week
  • 14
    Recommenders

    Best practices on recommendation systems

    The Recommenders repository provides examples and best practices for building recommendation systems, provided as Jupyter notebooks. The module reco_utils contains functions to simplify common tasks used when developing and evaluating recommender systems. Several utilities are provided in reco_utils to support common tasks such as loading datasets in the format expected by different algorithms, evaluating model outputs, and splitting training/test data. Implementations of several...
    Downloads: 0 This Week
  • 15
    fastai

    Deep learning library

    ...It aims to do both things without substantial compromises in ease of use, flexibility, or performance. This is possible thanks to a carefully layered architecture, which expresses common underlying patterns of many deep learning and data processing techniques in terms of decoupled abstractions. These abstractions can be expressed concisely and clearly by leveraging the dynamism of the underlying Python language and the flexibility of the PyTorch library. fastai is organized around two main design goals: to be approachable and rapidly productive, while also being deeply hackable and configurable. ...
    Downloads: 0 This Week
  • 16
    MedgeClaw

    Open-source AI research assistant for biomedicine

    MedgeClaw is a specialized AI-powered research assistant tailored for biomedical and scientific workflows, built on top of OpenClaw and Claude Code architectures. It integrates a large library of domain-specific skills, enabling it to perform complex analyses in areas such as genomics, drug discovery, and clinical research. The system connects conversational interfaces with computational environments, allowing users to initiate research tasks through messaging platforms while the backend...
    Downloads: 1 This Week
  • 17
    Label Sleuth

    Open source no-code system for text annotation and building of text

    An open-source no-code system for text annotation and building text classifiers. No AI knowledge needed. From task definition to working model in just a few hours! While domain experts label their data, Label Sleuth automatically trains appropriate machine learning models in the background. To avoid wasted labeling effort, Label Sleuth employs active learning techniques to guide the user on what they should label next. Domain experts can quickly start labeling their data through an...
    Downloads: 8 This Week
  • 18
    Gradio

    Create UIs for your machine learning model in Python in 3 minutes

    ...One of the best ways to share your machine learning model, API, or data science workflow with others is to create an interactive demo that allows your users or colleagues to try out the demo in their browsers.
    Downloads: 9 This Week
  • 19
    Wanwu AI Agent Platform

    Enterprise AI agent platform for workflows, models, and RAG apps

    Wanwu is an enterprise-grade AI agent development platform designed to help organizations build and deploy intelligent applications at scale. It provides a multi-tenant environment that enables teams to create AI agents, orchestrate workflows, and implement retrieval-augmented generation systems within a unified framework. Wanwu integrates large language models with business process automation, allowing developers to design complex, production-ready AI solutions tailored to enterprise needs....
    Downloads: 12 This Week
  • 20
    GitDiagram

    AI tool that converts GitHub repositories into interactive diagrams

    GitDiagram is an open source web application designed to help developers quickly understand the structure and architecture of GitHub repositories by automatically generating interactive diagrams. It analyzes repository metadata such as the file tree and project documentation to build a visual representation of how different components of a project relate to one another. It uses an AI-powered pipeline to interpret repository structure and transform that information into system design diagrams...
    Downloads: 0 This Week
  • 21
    ai-cookbook

    Examples and tutorials to help developers build AI systems

    ...The repository contains examples that demonstrate how to build AI workflows using modern tools such as large language models, autonomous agents, and external APIs. Developers can learn how to construct applications like intelligent assistants, automation pipelines, and AI-powered data analysis tools through step-by-step tutorials and ready-to-run scripts. The code examples are designed to emphasize practical architecture patterns that are commonly used in production environments, helping developers understand how to integrate AI services into software products.
    Downloads: 0 This Week
  • 22
    UltraRAG

    Less Code, Lower Barrier, Faster Deployment

    UltraRAG 2.0 is a low-code, MCP-enabled RAG framework that aims to lower the barrier to building complex retrieval pipelines for research and production. It provides end-to-end recipes—from encoding and indexing corpora to deploying retrievers and LLMs—so users can reproduce baselines and iterate rapidly. The toolkit comes with built-in support for popular RAG datasets, large corpora, and canonical baselines, plus documentation that walks from “quick start” to debugging and case analysis. It...
    Downloads: 8 This Week
  • 23
    TorchRec

    PyTorch domain library for recommendation systems

    TorchRec is a PyTorch domain library built to provide common sparsity & parallelism primitives needed for large-scale recommender systems (RecSys). It allows authors to train models with large embedding tables sharded across many GPUs. It provides parallelism primitives that enable easy authoring of large, performant multi-device/multi-node models using hybrid data-parallelism/model-parallelism. The TorchRec sharder can shard embedding tables with different sharding strategies including data-parallel,...
    Downloads: 2 This Week
  • 24
    RecBole

    A unified, comprehensive and efficient recommendation library

    A unified, comprehensive and efficient recommendation library. We design general and extensible data structures to unify the formatting and usage of various recommendation datasets. We implement more than 100 commonly used recommendation algorithms and provide formatted copies of 28 recommendation datasets. We support a series of widely adopted evaluation protocols or settings for testing and comparing recommendation algorithms.
    Downloads: 2 This Week
  • 25
    Avalanche

    End-to-End Library for Continual Learning based on PyTorch

    Avalanche is an end-to-end Continual Learning library based on PyTorch, born within ContinualAI with the unique goal of providing a shared and collaborative open-source (MIT-licensed) codebase for fast prototyping, training and reproducible evaluation of continual learning algorithms. Avalanche can help Continual Learning researchers in several ways. This module maintains a uniform API for data handling: mostly generating a stream of data from one or more datasets. It contains all the major...
    Downloads: 4 This Week