Showing 1396 open source projects for "python data analysis"

  • 1
    Python Client For NLP Cloud

    NLP Cloud serves high performance pre-trained or custom models for NER

    NLP Cloud serves high performance pre-trained or custom models for NER, sentiment-analysis, classification, summarization, dialogue summarization, paraphrasing, intent classification, product description and ad generation, chatbot, grammar and spelling correction, keywords and keyphrases extraction, text generation, image generation, blog post generation, source code generation, question answering, automatic speech recognition, machine translation, language detection, semantic search,...
    Downloads: 0 This Week
  • 2
    MCP Snowflake Server

    A Model Context Protocol (MCP) server implementation

    An MCP server implementation that facilitates database interactions with Snowflake, allowing execution of SQL queries and presentation of data insights as resources.
    Downloads: 0 This Week
  • 3
    NVIDIA NeMo

    Toolkit for conversational AI

    NVIDIA NeMo, part of the NVIDIA AI platform, is a toolkit for building new state-of-the-art conversational AI models. NeMo has separate collections for Automatic Speech Recognition (ASR), Natural Language Processing (NLP), and Text-to-Speech (TTS) models. Each collection consists of prebuilt modules that include everything needed to train on your data. Every module can easily be customized, extended, and composed to create new conversational AI model architectures. Conversational AI...
    Downloads: 2 This Week
  • 4
    DataFrame

    C++ DataFrame for statistical, financial, and ML analysis

    This is a C++ analytical library for data analysis, similar to data-analysis libraries in Python and R; it is comparable to Pandas, R's data.frame, or Polars. You can slice the data in many different ways, and you can join, merge, and group it. You can run various statistical, summarization, financial, and ML algorithms on the data, and you can easily add your own custom algorithms.
    Downloads: 0 This Week
  • 5
    Open Interpreter

    A natural language interface for computers

    Open Interpreter is an open-source tool that provides a natural-language interface for interacting with your computer. It lets large language models (LLMs) run code locally (Python, JavaScript, shell, etc.), so you can ask your computer to do tasks such as data analysis, file manipulation, and browsing in plain language (“chat with your computer”), with safeguards. It runs locally or via configured remote LLM servers and inference backends, so you can use models you trust or host yourself. It prompts you to approve code before executing it, and supports both online LLMs and local inference servers. ...
    Downloads: 13 This Week
  • 6
    Biomni

    Biomni: a general-purpose biomedical AI agent

    Biomni is a general-purpose biomedical AI agent designed to autonomously perform complex research tasks across a wide range of scientific domains, combining language model reasoning with structured planning and execution. It integrates retrieval-augmented generation with code-based execution, allowing it to access external knowledge, process data, and generate testable hypotheses in scientific workflows. The system is built to support researchers by automating repetitive and time-consuming tasks such as literature review, data analysis, and experimental design. Biomni operates within a comprehensive environment that includes tools, APIs, and datasets, enabling it to execute multi-step research processes rather than just generating text responses. ...
    Downloads: 0 This Week
  • 7
    Databend

    Cloud-native open source data warehouse for analytics and AI queries

    ...Databend provides a unified engine capable of handling analytics, vector search, and full-text search within a single platform. Databend supports SQL-based workflows and enables real-time data ingestion, transformation, and analysis through streaming and task orchestration features. With its cloud-native design and distributed architecture, Databend can run either as a self-hosted system or within managed environments to power data analytics, AI workloads, and large-scale data.
    Downloads: 2 This Week
  • 8
    Claude Skills

    Public repository for Agent Skills

    ...Rather than relying on handcrafted prompts every time, Skills teach an AI agent procedural knowledge and task-specific workflows so it can apply that expertise reliably, whether the task involves document creation, data analysis, design generation, or technical automation. Each Skill lives in its own directory with a SKILL.md file containing metadata and instructions, and can include supplemental scripts or assets that the agent uses to perform complex operations when relevant.
    Downloads: 104 This Week
  • 9
    SEO Machine

    A specialized Claude Code workspace for creating long-form

    SEO Machine is an AI-powered content production system built as a structured workspace for generating long-form, SEO-optimized blog content through automated workflows. It integrates research, writing, analysis, and optimization into a single pipeline, allowing users to produce high-quality articles tailored to search engine performance. The system uses specialized commands and agents to perform tasks such as keyword research, competitor analysis, content drafting, and optimization. It incorporates real data sources like Google Analytics and Search Console to guide decision-making and improve content effectiveness. ...
    Downloads: 0 This Week
  • 10
    LOTUS

    AI-Powered Data Processing: Use LOTUS to process all of your datasets

    LOTUS is an open-source framework and query engine designed to enable efficient processing of structured and unstructured datasets using large language models. The system provides a declarative programming model that allows developers to express complex AI data operations using high-level commands rather than manually orchestrating model calls. It offers a Python interface with a Pandas-like API, making it familiar for data scientists and engineers already working with data analysis libraries. The core concept of the framework is the use of semantic operators, which extend traditional relational database operations to support reasoning over text and other unstructured data.
    Downloads: 0 This Week
  • 11
    deepdoctection

    A Repo For Document AI

    deepdoctection is a document AI framework that applies deep learning techniques to analyze and extract structured data from scanned documents, PDFs, and images. It is a Python library that orchestrates document extraction and document layout analysis tasks using deep learning models. It does not implement models itself, but lets you build pipelines using well-established libraries for object detection, OCR, and selected NLP tasks, and provides an integrated framework for fine-tuning, evaluating, and running models. ...
    Downloads: 1 This Week
  • 12
    IDA Pro MCP

    MCP Server for IDA Pro

    The IDA Pro MCP Server is a Model Context Protocol (MCP) server designed to integrate with IDA Pro, a popular disassembler and debugger. It enables AI assistants to interact with IDA Pro, facilitating tasks such as code analysis and reverse engineering.
    Downloads: 8 This Week
  • 13
    Model Context Protocol Python SDK

    The official Python SDK for Model Context Protocol servers and clients

    The Python SDK for Model Context Protocol provides utilities to interact with the protocol, enabling seamless communication with AI models.
    Downloads: 0 This Week
  • 14
    fastdup

    An unsupervised and free tool for image and video dataset analysis

    fastdup is a free tool designed to rapidly extract insights from your image and video datasets. It helps you improve the quality of your dataset's images and labels and reduce your data operations costs at scale.
    Downloads: 0 This Week
  • 15
    Machine Learning and Data Science Apps

    A curated list of applied machine learning and data science notebooks

    ...Most examples are written in Python and frequently use Jupyter notebooks to present practical implementations and experiments. The project encourages contributions from data scientists and domain experts who want to share applied analytics projects and techniques that address real business challenges.
    Downloads: 0 This Week
  • 16
    fireworks-tech-graph

    Claude Code skill for generating production-quality SVG+PNG technical

    fireworks-tech-graph is an AI-driven project focused on building structured knowledge graphs that map relationships between technologies, concepts, and entities within technical domains. It aims to transform unstructured information into interconnected graphs that can be queried and analyzed for insights, making it easier to understand complex ecosystems such as software stacks or research fields. The system likely leverages AI techniques for entity extraction, relationship mapping, and...
    Downloads: 9 This Week
  • 17
    C3

    The goal of CLAIMED is to enable low-code/no-code rapid prototyping

    ...The system emphasizes reproducibility and scalability, allowing researchers and engineers to reuse existing components and integrate them into larger scientific or data engineering workflows. It also aims to support trusted and explainable AI systems by integrating tools for fairness analysis, explainability, and adversarial robustness.
    Downloads: 0 This Week
  • 18
    EconML

    Python Package for ML-Based Heterogeneous Treatment Effects Estimation

    EconML is a Python package for estimating heterogeneous treatment effects from observational data via machine learning. This package was designed and built as part of the ALICE project at Microsoft Research with the goal of combining state-of-the-art machine learning techniques with econometrics to bring automation to complex causal inference problems.
    Downloads: 0 This Week
  • 19
    NBA Sports Betting Machine Learning

    NBA sports betting using machine learning

    NBA-Machine-Learning-Sports-Betting is an open-source Python project that applies machine learning techniques to predict outcomes of National Basketball Association games for analytical and betting-related research. The system gathers historical team statistics and game data spanning multiple seasons, beginning with the 2007–2008 NBA season and continuing through the present. Using this dataset, the project constructs matchup features that represent team performance trends and contextual information about each game. ...
    Downloads: 3 This Week
  • 20
    MinerU

    A high-quality tool for converting PDF to Markdown and JSON

    MinerU is an open-source, high-quality document extraction toolkit focused on converting PDFs (and other document formats) into structured Markdown and JSON. It leverages OCR and layout analysis to preserve semantic structure and metadata, which makes it well suited to research and data science workflows.
    Downloads: 10 This Week
  • 21
    reverse-SynthID

    Reverse engineering Gemini's SynthID detection

    Reverse-SynthID is a research-focused project that analyzes and reverse-engineers Google’s SynthID watermarking system used in AI-generated images. It leverages signal processing and spectral analysis techniques to identify hidden watermark patterns without access to proprietary encoding methods. The project introduces a multi-resolution “SpectralCodebook” that maps watermark characteristics across different image sizes. Using this approach, it can detect SynthID watermarks with high...
    Downloads: 15 This Week
  • 22
    audioFlux

    A library for audio and music analysis, feature extraction

    audioFlux is a deep learning tool library for audio and music analysis and feature extraction. It can be used for deep learning, pattern recognition, signal processing, bioinformatics, statistics, finance, etc. It supports dozens of time-frequency analysis transformation methods and hundreds of corresponding time-domain and frequency-domain feature combinations. It can be provided to deep learning networks for training and is used to...
    Downloads: 0 This Week
  • 23
    FlexLLMGen

    Running large language models on a single GPU

    FlexLLMGen is an open-source inference engine designed to run large language models efficiently on limited hardware resources such as a single GPU. The system focuses on high-throughput generation workloads where large batches of text must be processed quickly, such as large-scale data extraction or document analysis tasks. Instead of requiring expensive multi-GPU systems, the framework uses techniques such as memory offloading, compression, and optimized batching to run large models on commodity hardware. The architecture distributes computation and memory usage across the GPU, CPU, and disk in order to maximize the number of tokens processed during inference. ...
    Downloads: 0 This Week
  • 24
    FinGLM

    Committed to building an open, public welfare

    FinGLM is an open-source financial large language model initiative aimed at advancing artificial intelligence applications within the finance industry. The project focuses on developing domain-specific language models that understand financial terminology, corporate reports, and economic datasets. By combining large language model architectures with financial datasets such as corporate annual reports and structured financial records, FinGLM aims to improve AI performance on tasks that...
    Downloads: 0 This Week
  • 25
    MiroThinker

    MiroThinker is an open source deep research agent

    MiroThinker is an open-source deep research AI agent designed to perform complex reasoning, information gathering, and predictive analysis tasks. The system focuses on enabling long-horizon research workflows by allowing the agent to interact repeatedly with external tools, search systems, and data sources while refining its reasoning through iterative steps. Rather than simply generating responses from a single prompt, the agent performs structured multi-step reasoning processes that involve searching for information, analyzing evidence, and synthesizing conclusions. ...
    Downloads: 1 This Week