Showing 39 open source projects for "python data analysis"

  • 1
    Databend

    Cloud-native open source data warehouse for analytics and AI queries

    ...Databend provides a unified engine capable of handling analytics, vector search, and full-text search within a single platform. Databend supports SQL-based workflows and enables real-time data ingestion, transformation, and analysis through streaming and task orchestration features. With its cloud-native design and distributed architecture, Databend can run either as a self-hosted system or within managed environments to power data analytics, AI workloads, and large-scale data.
    Downloads: 11 This Week
  • 2
    Scanopy

    Clean network diagrams. One-time setup, zero upkeep.

    Scanopy is a network discovery and documentation tool that scans a network, identifies hosts and the services running on them, and generates topology diagrams from the results. ...
    Downloads: 14 This Week
  • 3
    Ruff

    An extremely fast Python linter, written in Rust

    An extremely fast Python linter, written in Rust. Ruff aims to be orders of magnitude faster than alternative tools while integrating more functionality behind a single, common interface. Ruff can be used to replace Flake8 (plus dozens of plugins), isort, pydocstyle, yesqa, eradicate, pyupgrade, and autoflake, all while executing tens or hundreds of times faster than any individual tool. Ruff is extremely actively developed and used in major open-source projects. Ruff can be configured...
    Downloads: 18 This Week
  • 4
    QSV

    Blazing-fast Data-Wrangling toolkit

    qsv is a fast, command-line CSV data toolkit written in Rust that extends the capabilities of xsv. It’s designed to make working with CSV files at scale easy and efficient, offering over 40 powerful subcommands for tasks like querying, sampling, splitting, deduplicating, and more. qsv is ideal for data engineers, analysts, and developers who need high-performance CSV manipulation on the command line.
    Downloads: 85 This Week
  • 5
    Rerun

    Visualize streams of multimodal data

    Rerun is an open-source tool that helps developers visualize real-time multimodal data streams, such as images, point clouds, and tensors, for debugging and understanding ML and robotics systems. Designed for use with Python and Rust, it captures logged data and renders it through an interactive desktop interface, making it easier to understand how complex systems behave over time.
    Downloads: 16 This Week
  • 6
    Peroxide

    Rust numeric library with high performance and friendly syntax

    Peroxide is a Rust numeric library containing linear algebra, numerical analysis, statistics, and machine learning tools, with R-, MATLAB-, and Python-like macros. Peroxide uses a 1D data structure to represent matrices, making it straightforward to integrate with BLAS (Basic Linear Algebra Subprograms). Linear algebra computations can therefore use the optimized routines that BLAS provides.
    Downloads: 3 This Week
  • 7
    Flowsurface

    A native desktop charting platform for crypto markets

    Flowsurface is a powerful open-source desktop charting platform tailored for crypto markets, built primarily in Rust with a focus on real-time data visualization and market microstructure analysis. Instead of traditional price charts alone, Flowsurface emphasizes order flow and liquidity visualization through advanced chart types like historical DOM heatmaps, footprint charts, and depth ladder displays. This enables traders and analysts to understand actual executed trades, liquidity distribution, and tempo changes that often precede significant market movements. ...
    Downloads: 9 This Week
  • 8
    Spider

    High-performance Rust web crawler and scraper for large-scale data

    ...Spider can operate concurrently across many pages, allowing it to gather large datasets in a short period of time. Spider also provides mechanisms for subscribing to crawl events so developers can process page data such as URLs, status codes, or HTML content as it is discovered. It supports advanced capabilities such as headless browser rendering, background crawling tasks, and configurable rules that control crawl depth or ignored paths. These capabilities make the project suitable for building search indexers, data extraction pipelines, and SEO analysis tools.
    Downloads: 10 This Week
  • 9
    Monty

    A minimal, secure Python interpreter written in Rust for use by AI

    ...It prioritizes guardrails like resource limits and restricted capabilities, which is especially useful for agentic workflows that need to execute small pieces of Python for data transforms, validation, or tool-like computations. Because it’s written in Rust, it’s positioned to deliver a compact, portable runtime that can be embedded into larger systems that need dependable isolation.
    Downloads: 5 This Week
  • 10
    CocoIndex

    ETL framework to index data for AI, such as RAG

    CocoIndex is an open-source framework designed for building powerful, local-first semantic search systems. It lets users index and retrieve content based on meaning rather than keywords, making it ideal for modern AI-based search applications. CocoIndex leverages vector embeddings and integrates with various models and frameworks, including OpenAI and Hugging Face, to provide high-quality semantic understanding. It’s built for transparency, ease of use, and local control over your search...
    Downloads: 6 This Week
  • 11
    Polars

    Dataframes powered by a multithreaded, vectorized query engine

    Polars is a high-performance, multi-language DataFrame library built in Rust using Apache Arrow. It delivers blazing-fast, vectorized, and parallel data manipulation with both eager and lazy execution, making it an excellent tool for data processing in Python, Rust, Node.js, R, and SQL contexts.
    Downloads: 1 This Week
  • 12
    Linfa

    A Rust machine learning framework

    linfa aims to provide a comprehensive toolkit to build Machine Learning applications with Rust. Kin in spirit to Python's scikit-learn, it focuses on common preprocessing tasks and classical ML algorithms for your everyday ML tasks.
    Downloads: 0 This Week
  • 13
    Stoolap

    A Modern Embedded SQL Database written in Rust

    Stoolap is an embedded SQL database written in Rust. It runs inside the host application rather than as a separate server, and supports transactional SQL workloads with in-memory or persistent storage.
    Downloads: 6 This Week
  • 14
    PostgresML

    The GPU-powered AI application database

    PostgresML is a complete platform in a PostgreSQL extension. Build simpler, faster, and more scalable models right inside your database. Explore the SDK and test open source models in our hosted database. Combine and automate the entire workflow from embedding generation to indexing and querying for the simplest (and fastest) knowledge-based chatbot implementation. Leverage multiple types of natural language processing and machine learning models such as vector search and personalization...
    Downloads: 8 This Week
  • 15
    Code2Prompt

    Convert codebases into structured prompts optimized for LLM analysis

    code2prompt is an open source command line tool designed to convert an entire codebase into a structured prompt that can be easily used with large language models. It analyzes a project directory, gathers relevant source files, and formats them into a single prompt that includes the source tree and code content. This approach helps developers quickly provide full project context to AI models without manually copying files or assembling prompts. code2prompt is built in Rust and focuses on...
    Downloads: 1 This Week
  • 16
    LOL HTML

    Low output latency streaming HTML parser/rewriter with CSS API

    Low Output Latency streaming HTML rewriter/parser with CSS-selector based API. It is designed to modify HTML on the fly with minimal buffering. It can quickly handle very large documents, and operate in environments with limited memory resources. The crate serves as a back-end for the HTML rewriting functionality of Cloudflare Workers, but can be used as a standalone library with a convenient API for a wide variety of HTML rewriting/analysis tasks. The parser switches back to the tag scanner...
    Downloads: 4 This Week
  • 17
    Qdrant

    Vector Database for the next generation of AI applications

    Qdrant is a vector similarity engine and vector database. It deploys as an API service providing search for the nearest high-dimensional vectors. With Qdrant, embeddings or neural network encoders can be turned into full-fledged applications for matching, searching, recommending, and much more. It provides an OpenAPI v3 specification from which a client library can be generated in almost any programming language. Alternatively, use a ready-made client for Python or other programming languages with additional...
    Downloads: 72 This Week
  • 18
    SurrealDB

    A scalable, distributed, collaborative, document-graph database

    With an SQL-style query language, real-time queries with highly-efficient related data retrieval, advanced security permissions for multi-tenant access, and support for performant analytical workloads, SurrealDB is the next generation serverless database. SurrealDB is the ultimate cloud database for tomorrow's applications. SurrealDB is an innovative NewSQL cloud database, suitable for serverless applications, jamstack applications, single-page applications, and traditional applications. It...
    Downloads: 0 This Week
  • 19
    OpenAI Harmony

    Renderer for the harmony response format to be used with gpt-oss

    Harmony is a response format developed by OpenAI for use with the gpt-oss model series. It defines a structured way for language models to produce outputs, including regular text, reasoning traces, tool calls, and structured data. By mimicking the OpenAI Responses API, Harmony provides developers with a familiar interface while enabling more advanced capabilities such as multiple output channels, instruction hierarchies, and tool namespaces. The format is essential for ensuring gpt-oss...
    Downloads: 2 This Week
  • 20
    BAML

    The AI framework that adds the engineering to prompt engineering

    BAML is an open-source framework and domain-specific language designed to bring structured engineering practices to prompt development for large language model applications. Instead of treating prompts as unstructured text, BAML introduces a schema-driven approach where prompts are defined as typed functions with explicit inputs and outputs. This design allows developers to treat language model interactions as predictable software components rather than ad-hoc prompt strings. The framework...
    Downloads: 19 This Week
  • 21
    Summa

    Full-text IPFS-friendly and WASM-compatible Search in Rust

    Summa is a full-text IPFS-friendly search engine that may be launched on both large servers and inside your browser. Thanks to the embedded IPFS daemon, your data can be replicated and published through P2P, allowing for a truly distributed and uncensorable search experience. And, thanks to compatibility with WASM, Summa can be launched entirely inside your browser, enabling you to search in network-published indices without ever having to execute search queries on remote servers.
    Downloads: 5 This Week
  • 22
    Extractous

    Fast and efficient unstructured data extraction

    Extractous is a Rust-based unstructured data extraction library focused on fast local parsing of documents and other content-heavy files. Its purpose is to extract text and metadata efficiently from formats such as PDF, Word, HTML, email archives, images, and more, without depending on external APIs or separate parsing servers. The project emphasizes performance and low memory usage, and its maintainers describe it as a local-first alternative to heavier extraction stacks. For broader format...
    Downloads: 0 This Week
  • 23
    Korvus

    Korvus is a search SDK that unifies the entire RAG pipeline

    Korvus is an open-source retrieval-augmented generation (RAG) pipeline designed to run entirely inside PostgreSQL, allowing developers to build AI search and knowledge systems directly within a database environment. The project consolidates the typical steps of a RAG pipeline—including embedding generation, document retrieval, reranking, and text generation—into a single query executed within the Postgres ecosystem. By leveraging PostgresML and vector extensions such as pgvector, Korvus...
    Downloads: 0 This Week
  • 24
    X For You Feed Algorithm

    Algorithm powering the For You feed on X

    X For You Feed Algorithm is the open-sourced core recommendation system that powers the For You feed on X (the social network formerly known as Twitter), and it represents one of the first times a major social platform has published production-level ranking code for public review and experimentation. The repository contains the full pipeline that ingests user engagement and content candidate data, processes it through retrieval, hydration, filtering, scoring, and selection layers, and...
    Downloads: 1 This Week
  • 25
    Bend

    A massively parallel, high-level programming language

    Bend is a high-level, massively parallel programming language. It aims to offer the expressiveness of languages like Python and Haskell, including higher-order functions and unrestricted recursion, while running on parallel hardware such as multi-core CPUs and GPUs without explicit thread or lock management. Bend runs on the HVM2 runtime.
    Downloads: 0 This Week