Showing 178 open source projects for "python data analysis"

View related business solutions
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • Try Google Cloud Risk-Free With $300 in Credit Icon
    Try Google Cloud Risk-Free With $300 in Credit

    No hidden charges. No surprise bills. Cancel anytime.

    Use your credit across every product. Compute, storage, AI, analytics. When it runs out, 20+ products stay free. You only pay when you choose to.
    Start Free
  • 1
    Khoj

    Khoj

    An AI personal assistant for your digital brain

    Get more done with your open-source AI personal assistant. Khoj is a desktop application to search and chat with your notes, documents, and images. It is an offline-first, open-source AI personal assistant that is accessible from Emacs, Obsidian or your Web browser. Khoj is a thinking tool that is transparent, fun, and easy to engage with. You can build faster and better by using Khoj to search and reason across all your data sources. Khoj learns from your notes and documents to function as...
    Downloads: 12 This Week
    Last Update:
    See Project
  • 2
    Swirl

    Swirl

    Swirl queries any number of data sources with APIs

    ...It's intended for use by developers and data scientists who want to solve multi-silo search problems from enterprise search to new monitoring & alerting solutions that push information to users continuously. Built on the Python/Django/RabbitMQ stack, SWIRL includes connectors to Apache Solr, ChatGPT, Elastic, OpenSearch | PostgreSQL, Google BigQuery plus generic HTTP/GET/JSON with configurations for premium services.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    spacy-llm

    spacy-llm

    Integrating LLMs into structured NLP pipelines

    Large Language Models (LLMs) feature powerful natural language understanding capabilities. With only a few (and sometimes no) examples, an LLM can be prompted to perform custom NLP tasks such as text categorization, named entity recognition, coreference resolution, information extraction and more. This package integrates Large Language Models (LLMs) into spaCy, featuring a modular system for fast prototyping and prompting, and turning unstructured responses into robust outputs for various...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 4
    Streamline Analyst

    Streamline Analyst

    AI agent that streamlines the entire process of data analysis

    Streamline Analyst is a cutting-edge, open-source application powered by Large Language Models (LLMs) designed to revolutionize data analysis. This Data Analysis Agent effortlessly automates all the tasks such as data cleaning, preprocessing, and even complex operations like identifying target objects, partitioning test sets, and selecting the best-fit models based on your data. With Streamline Analyst, results visualization and evaluation become seamless.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Earn up to 16% annual interest with Nexo. Icon
    Earn up to 16% annual interest with Nexo.

    Let your crypto work for you

    Put idle assets to work with competitive interest rates, borrow without selling, and trade with precision. All in one platform. Geographic restrictions, eligibility, and terms apply.
    Get started with Nexo.
  • 5
    Ludwig AI

    Ludwig AI

    Low-code framework for building custom LLMs, neural networks

    Declarative deep learning framework built for scale and efficiency. Ludwig is a low-code framework for building custom AI models like LLMs and other deep neural networks. Declarative YAML configuration file is all you need to train a state-of-the-art LLM on your data. Support for multi-task and multi-modality learning. Comprehensive config validation detects invalid parameter combinations and prevents runtime failures. Automatic batch size selection, distributed training (DDP, DeepSpeed),...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 6
    GenAI Agents

    GenAI Agents

    Implementations for various Generative AI Agent techniques

    GenAI Agents is a large, tutorial-driven repository that teaches you how to design, build, and experiment with generative AI agents. It spans a spectrum from simple conversational bots and basic question-answering agents to complex multi-agent systems that coordinate on research, education, business workflows, and creative tasks. The implementations leverage modern frameworks such as LangChain, LangGraph, AutoGen, PydanticAI, CrewAI, and more, showing how each can be wired into realistic...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 7
    HN Time Capsule

    HN Time Capsule

    Analyzing Hacker News discussions from a decade ago in hindsight

    HN Time Capsule is a creative and nostalgic project that captures and preserves snapshots of Hacker News content over time, providing a historical look at how topics, discussions, and popular threads have evolved. Rather than functioning like a live aggregator, it stores periodic captures of posts and comments, creating a time capsule that lets researchers, enthusiasts, and historians trace changes in sentiment, technology trends, and community priorities across different eras of the Hacker...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    Engram

    Engram

    A New Axis of Sparsity for Large Language Models

    Engram is a high-performance embedding and similarity search library focused on making retrieval-augmented workflows efficient, scalable, and easy to adopt by developers building search, recommendation, or semantic matching systems. It provides utilities to generate embeddings from text or other structured data, index them using efficient approximate nearest neighbor algorithms, and perform real-time similarity queries even on large corpora. Engineered with speed and memory efficiency in...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    Slack MCP Server

    Slack MCP Server

    The most powerful MCP Slack Server with no permission requirements

    Slack MCP Server is an open-source server implementation that connects Slack workspaces to AI systems through the Model Context Protocol (MCP). MCP is a standardized protocol that allows large language models and AI agents to securely interact with external tools and data sources such as messaging platforms, databases, or file systems. The slack-mcp-server acts as an intermediary layer that exposes Slack data and messaging functionality to AI clients while enforcing access rules and communication standards. Through this architecture, AI assistants can read message histories, interact with channels, and retrieve contextual information from Slack conversations in order to perform tasks such as automated analysis, collaboration assistance, or contextual code review. ...
    Downloads: 3 This Week
    Last Update:
    See Project
  • Gemini 3 and 200+ AI Models on One Platform Icon
    Gemini 3 and 200+ AI Models on One Platform

    Access Google's best plus Claude, Llama, and Gemma. Fine-tune and deploy from one console.

    Build generative AI apps with Vertex AI. Switch between models without switching platforms.
    Start Free
  • 10
    Paper2Slides

    Paper2Slides

    From Paper to Presentation in One Click

    Paper2Slides is an automation tool that converts research papers, reports, and other documents into polished slide decks and posters with minimal manual effort. It is designed to replace the repetitive work of turning dense technical documents into presentation-friendly structure by extracting key points, figures, and data into a coherent visual narrative. The system supports multiple input formats, so you can process PDFs and common office documents rather than being locked to a single file...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 11
    Parallax

    Parallax

    Parallax is a distributed model serving framework

    Parallax is a decentralized inference framework designed to run large language models across distributed computing resources. Instead of relying on centralized GPU clusters in data centers, the system allows multiple heterogeneous machines to collaborate in serving AI inference workloads. Parallax divides model layers across different nodes and dynamically coordinates them to form a complete inference pipeline. A two-stage scheduling architecture determines how model layers are allocated to...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 12
    OM1

    OM1

    Modular AI runtime for robots

    OM1 is an open-source AI platform designed to build autonomous agents capable of interacting with digital environments and completing complex tasks. The project focuses on creating a modular architecture where language models can coordinate with external tools, APIs, and knowledge sources to accomplish multi-step objectives. Instead of operating as simple conversational systems, OM1 agents can plan actions, retrieve information, and execute tasks across different services. The framework...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 13
    PaperBanana

    PaperBanana

    Extension of Google Research’s PaperBanana

    PaperBanana is an open-source agentic framework designed to automatically generate publication-quality academic diagrams and statistical plots directly from text descriptions. The project focuses on helping researchers, educators, and data scientists transform conceptual descriptions of figures into structured visual outputs suitable for research papers, presentations, and technical reports. Instead of manually designing charts or diagrams using traditional visualization tools, users can...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 14
    Lagent

    Lagent

    A lightweight framework for building LLM-based agents

    Lagent is a lightweight open-source framework designed to help developers build autonomous agents powered by large language models. The framework provides tools and abstractions that allow language models to interact with external tools, execute tasks, and perform multi-step reasoning processes. Instead of using LLMs only for text generation, Lagent enables developers to transform models into agents capable of performing actions such as retrieving data, executing code, or interacting with...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 15
    Learn AI Engineering

    Learn AI Engineering

    Learn AI and LLMs from scratch using free resources

    ...The curation recognizes modern AI realities, including data pipelines, evaluation, prompt engineering, retrieval-augmented generation, and cost/performance trade-offs. It’s equally useful for refreshers—dipping into a specific module before a project—as it is for a full, self-directed curriculum. By centralizing the best references in one place, the repo reduces the overhead of finding, filtering, and sequencing resources, letting you focus on learning and building.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 16
    LLM Foundry

    LLM Foundry

    LLM training code for MosaicML foundation models

    Introducing MPT-7B, the first entry in our MosaicML Foundation Series. MPT-7B is a transformer trained from scratch on 1T tokens of text and code. It is open source, available for commercial use, and matches the quality of LLaMA-7B. MPT-7B was trained on the MosaicML platform in 9.5 days with zero human intervention at a cost of ~$200k. Large language models (LLMs) are changing the world, but for those outside well-resourced industry labs, it can be extremely difficult to train and deploy...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 17
    DeepSearcher

    DeepSearcher

    Open Source Deep Research Alternative to Reason and Search

    DeepSearcher is an open-source “deep research” style system that combines retrieval with evaluation and reasoning to answer complex questions using private or enterprise data. It is designed around the idea that high-quality answers require more than top-k retrieval, so it orchestrates multi-step search, evidence collection, and synthesis into a comprehensive response. The project integrates with vector databases (including Milvus and related options) so organizations can index internal...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    kg-gen

    kg-gen

    Knowledge Graph Generation from Any Text

    kg-gen is an open-source framework developed by the STAIR Lab that automatically generates knowledge graphs from unstructured text using large language models. The system is designed to transform plain text sources such as documents, articles, or conversation transcripts into structured graphs composed of entities and relationships. Instead of relying on traditional rule-based extraction techniques, KG-Gen uses language models to identify entities and their relationships, producing...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 19
    MING

    MING

    A large-scale model of medical consultation in Chinese

    MING is an open-source medical large language model designed for intelligent medical consultation and question answering in Chinese. The project focuses on building a healthcare-focused conversational system capable of responding to medical questions, analyzing case descriptions, and guiding diagnostic reasoning. It is trained using medical instruction tuning so that the model can understand patient symptoms and respond with structured explanations and clinical suggestions. One of its...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    LLM-Pruner

    LLM-Pruner

    On the Structural Pruning of Large Language Models

    LLM-Pruner is an open-source framework designed to compress large language models through structured pruning techniques while maintaining their general capabilities. Large language models often require enormous computational resources, making them expensive to deploy and inefficient for many practical applications. LLM-Pruner addresses this issue by identifying and removing non-essential components within transformer architectures, such as redundant attention heads or feed-forward...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    AICGSecEval

    AICGSecEval

    A.S.E (AICGSecEval) is a repository-level AI-generated code security

    AICGSecEval is an open-source benchmark framework designed to evaluate the security of code generated by artificial intelligence systems. The project was developed to address concerns that AI-assisted programming tools may produce insecure code containing vulnerabilities such as injection flaws or unsafe logic. The framework constructs evaluation tasks based on real-world software repositories and known vulnerability cases derived from CVE records. By simulating realistic development...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    LLMCompiler

    LLMCompiler

    An LLM Compiler for Parallel Function Calling

    LLMCompiler is an open-source framework designed to optimize how large language models orchestrate multiple external tool or function calls during complex reasoning tasks. Traditional LLM agent systems typically execute tool calls sequentially, which can create latency, higher costs, and reduced reliability when solving multi-step problems. LLMCompiler addresses this limitation by applying principles from classical compilers to analyze a task and construct an execution plan that allows...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    BISHENG

    BISHENG

    BISHENG is an open LLM devops platform for next generation apps

    BISHENG is an open LLM application DevOps platform, focusing on enterprise scenarios. It has been used by a large number of industry-leading organizations and Fortune 500 companies. "Bi Sheng" was the inventor of movable type printing, which played a vital role in promoting the transmission of human knowledge. We hope that BISHENG can also provide strong support for the widespread implementation of intelligent applications. Everyone is welcome to participate.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 24
    BAML

    BAML

    The AI framework that adds the engineering to prompt engineering

    BAML is an open-source framework and domain-specific language designed to bring structured engineering practices to prompt development for large language model applications. Instead of treating prompts as unstructured text, BAML introduces a schema-driven approach where prompts are defined as typed functions with explicit inputs and outputs. This design allows developers to treat language model interactions as predictable software components rather than ad-hoc prompt strings. The framework...
    Downloads: 19 This Week
    Last Update:
    See Project
  • 25
    BuildingAI

    BuildingAI

    Build your own AI application system for free

    BuildingAI is an open-source project focused on applying artificial intelligence techniques to architectural design and building information modeling workflows. The platform aims to bridge the gap between natural language interfaces and building design tools by allowing AI systems to interpret user instructions and convert them into structured architectural operations. By combining generative AI capabilities with building data models, the system can assist with tasks such as design...
    Downloads: 6 This Week
    Last Update:
    See Project
MongoDB Logo MongoDB