Showing 116 open source projects for "compute"

  • 1
    Compute Library

    The Compute Library is a set of computer vision and machine learning functions optimized for both Arm CPUs and GPUs using SIMD technologies. The library provides superior performance to other open-source alternatives and immediate support for new Arm® technologies such as SVE2.
    Downloads: 2 This Week
  • 2
    tt-metal

    TT-NN operator library, and TT-Metalium low level kernel programming

    ...The project is designed for developers who need direct access to the company’s Tensix processor architecture, exposing a programming model that is closer to hardware control than high-level inference frameworks. Instead of following a traditional GPU model centered on massive thread parallelism, the platform is built around a grid of specialized compute nodes called Tensix cores, each with local SRAM, dedicated compute units, and multiple RISC-V control processors. The SDK provides the abstractions and APIs needed to manage data movement, compute kernels, memory coordination, and execution flow across this architecture.
    Downloads: 2 This Week
  • 3
    PySyft

    Data science on data without acquiring a copy

    Most software libraries let you compute over the information you own and see inside of machines you control. However, this means that you cannot compute on information without first obtaining (at least partial) ownership of that information. It also means that you cannot compute using machines without first obtaining control over those machines. This is very limiting to human collaboration and systematically drives the centralization of data, because you cannot work with a bunch of data without first putting it all in one (central) place. ...
    Downloads: 0 This Week
  • 4
    Step 3.5 Flash

    Fast, Sharp & Reliable Agentic Intelligence

    ...Unlike dense models that activate all their parameters for every token, Step 3.5 Flash uses a sparse Mixture-of-Experts (MoE) architecture that selectively engages only about 11 billion of its roughly 196 billion total parameters per token, delivering high-quality reasoning and interaction at far lower compute cost and latency than traditional large models. Its design targets deep reasoning, long-context handling, coding, and real-time responsiveness, making it suitable for building autonomous agents, advanced assistants, and long-chain cognitive workflows without sacrificing performance.
    Downloads: 5 This Week
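The sparse activation described for Step 3.5 Flash can be illustrated with a toy sketch. This is not the model's actual router: the expert count, the gate scores, and the function names `top_k_experts` and `active_fraction` are assumptions made for illustration. Only the 11 billion and 196 billion figures come from the description above.

```python
# Toy sketch of sparse Mixture-of-Experts routing (hypothetical names and sizes).

def top_k_experts(gate_scores, k):
    """Return the indices of the k highest-scoring experts for one token."""
    ranked = sorted(range(len(gate_scores)),
                    key=lambda i: gate_scores[i], reverse=True)
    return sorted(ranked[:k])

def active_fraction(active_params, total_params):
    """Share of parameters that take part in processing a single token."""
    return active_params / total_params

# Assumed gate scores for a single token over four made-up experts.
scores = [0.1, 0.7, 0.05, 0.15]
chosen = top_k_experts(scores, k=2)

# Figures quoted in the description: about 11B active of roughly 196B total.
frac = active_fraction(11e9, 196e9)
print(chosen, f"{frac:.1%}")
```

Under these figures, roughly one parameter in eighteen is used per token, which is where the lower compute cost relative to a dense model of the same total size comes from.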
  • 5
    PyTorch/XLA

    Enabling PyTorch on Google TPU

    ...Cloud TPU VM is now generally available and provides direct access to the TPU host. The recommended setup for running distributed training on TPU Pods pairs Compute VM Instance Groups with TPU Pods. Each Compute VM in the instance group drives 8 cores on the TPU Pod.
    Downloads: 0 This Week
  • 6
    DeepSeekMath-V2

    Towards self-verifiable mathematical reasoning

    DeepSeekMath-V2 is a large-scale open-source AI model designed specifically for advanced mathematical reasoning, theorem proving, and rigorous proof verification. It’s built by DeepSeek as a successor to their earlier math-specialist models. Unlike general-purpose LLMs that might generate plausible-looking math but sometimes hallucinate or mishandle rigorous logic, Math-V2 is engineered to not only generate solutions but also self-verify them, meaning it examines the derivations, checks...
    Downloads: 10 This Week
  • 7
    clip-retrieval

    Easily compute clip embeddings and build a clip retrieval system

    clip-retrieval is an open-source toolkit designed to build large-scale semantic search systems for images and text by leveraging CLIP embeddings to enable multimodal retrieval. It allows developers to compute embeddings for both images and text efficiently and then index them for fast similarity search across massive datasets. The system is optimized for performance and scalability, capable of processing tens or even hundreds of millions of embeddings using GPU acceleration. It includes components for inference, indexing, filtering, and serving results through APIs, making it a complete pipeline for building production-ready retrieval systems. ...
    Downloads: 1 This Week
  • 8
    MiniRAG

    Making RAG Simpler with Small and Open-Sourced Language Models

    MiniRAG is a lightweight retrieval-augmented generation tool designed to bring the benefits of RAG workflows to smaller datasets, edge environments, and constrained compute settings by simplifying embedding, indexing, and retrieval. It extracts text from documents, code, or other structured inputs and converts it into embeddings using efficient models, then stores these vectors for fast nearest-neighbor search without requiring huge databases or separate vector servers. When a query is issued, MiniRAG retrieves the most relevant contexts and feeds them into a generative model to produce an answer that is grounded in the source material rather than hallucinated. ...
    Downloads: 1 This Week
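The retrieve-then-generate flow described for MiniRAG can be sketched in a few lines. This does not use MiniRAG's API: the three-dimensional vectors stand in for real embeddings, and `cosine`, `retrieve`, and the in-memory `store` are hypothetical names chosen for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vec, store, top_k=2):
    """Return the texts of the top_k stored items most similar to the query."""
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [text for text, _ in ranked[:top_k]]

# Hypothetical in-memory vector store: (text, embedding) pairs.
store = [
    ("doc about cats", [1.0, 0.0, 0.0]),
    ("doc about dogs", [0.8, 0.6, 0.0]),
    ("doc about tax law", [0.0, 0.0, 1.0]),
]

query = [0.9, 0.1, 0.0]  # assumed embedding of the user's question
contexts = retrieve(query, store)

# The retrieved contexts would then be placed into the generator's prompt.
prompt = "Answer using only these contexts:\n" + "\n".join(contexts)
print(prompt)
```

A brute-force scan like this is only reasonable for small stores, which matches the small-dataset and edge settings the description targets.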
  • 9
    Z80-μLM

    Z80-μLM is a 2-bit quantized language model

    ...The project sits at the intersection of machine learning and systems constraints, showing how model architecture, quantization, and inference code generation can be adapted to extreme memory and compute limits. It also functions as an educational reference for how to reduce inference to operations that fit an old-school instruction set and runtime environment.
    Downloads: 0 This Week
  • 10
    MiniMind-V

    "Big Model" trains a visual multimodal VLM with 26M parameters

    MiniMind-V is an experimental open-source project that aims to train a very small multimodal vision–language model (VLM) from scratch with extremely low compute and cost, making research and experimentation accessible to more people. The repository showcases training workflows and code designed to produce a 26-million parameter model—including both image and text capabilities—using minimal resources in very little time, reflecting a trend toward democratizing AI research. MiniMind-V combines techniques from modern vision-language modeling but focuses on efficiency and simplicity so that individuals or small teams can explore multimodal learning without massive GPU clusters. ...
    Downloads: 0 This Week
  • 11
    Netflix Maestro

    Netflix’s Workflow Orchestrator

    ...It was designed to support the demanding internal infrastructure of Netflix, where thousands of workflows must process massive volumes of data reliably and efficiently every day. The platform enables engineers and data scientists to define workflows using structured configuration files and execute tasks across diverse compute environments, including scripts, containers, and notebook environments. Maestro provides built-in mechanisms for retry logic, task scheduling, dependency management, and error handling, which are essential when orchestrating production-scale pipelines.
    Downloads: 0 This Week
  • 12
    OpenMythos

    A theoretical reconstruction of the Claude Mythos architecture

    ...The architecture incorporates advanced techniques such as mixture-of-experts routing, adaptive computation time, and multiple attention mechanisms to dynamically allocate compute where needed. It is highly configurable through a centralized configuration system, allowing experimentation with architectural parameters such as loop depth and attention type.
    Downloads: 27 This Week
  • 13
    MLJAR Studio

    Python package for AutoML on Tabular Data with Feature Engineering

    We are working on a new way of visual programming. We developed a desktop application called MLJAR Studio. It is a notebook-based development environment with interactive code recipes and a managed Python environment, all running locally on your machine. We are waiting for your feedback. The mljar-supervised package is an Automated Machine Learning Python package that works with tabular data. It is designed to save time for a data scientist. It abstracts the common way to preprocess the data,...
    Downloads: 1 This Week
  • 14
    FlashMLA

    FlashMLA: Efficient Multi-head Latent Attention Kernels

    ...The library supports both BF16 and FP16 data types, and includes a paged KV cache implementation with a block size of 64 to efficiently manage memory during decoding. In heavily compute-bound settings, it can reach up to ~660 TFLOPS on H800 SXM5 hardware, while in memory-bound configurations it can push memory throughput to ~3000 GB/s. The team regularly updates it with performance improvements; for example, a 2025 update claims 5% to 15% gains on compute-bound workloads while maintaining API compatibility.
    Downloads: 0 This Week
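The paged KV cache mentioned for FlashMLA can be pictured with a small address-translation sketch. This is not FlashMLA's implementation or API: only the block size of 64 comes from the description, while `blocks_needed`, `locate`, and the example `block_table` are hypothetical.

```python
import math

BLOCK_SIZE = 64  # block size quoted in the description above

def blocks_needed(seq_len, block_size=BLOCK_SIZE):
    """Number of fixed-size cache blocks required to hold seq_len tokens."""
    return math.ceil(seq_len / block_size)

def locate(token_pos, block_table, block_size=BLOCK_SIZE):
    """Map a token position to (physical block id, offset inside the block)."""
    logical_block, offset = divmod(token_pos, block_size)
    return block_table[logical_block], offset

# Hypothetical block table for one 130-token sequence: logical blocks 0, 1, 2
# live in physical blocks 7, 2, 9, which need not be contiguous in memory.
block_table = [7, 2, 9]

print(blocks_needed(130))        # 3 blocks: 64 + 64 + 2 tokens
print(locate(129, block_table))  # last token sits in physical block 9, offset 1
```

Because a sequence only claims whole blocks as it grows, at most one partially filled block per sequence is wasted, which is the memory benefit paging aims for during decoding.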
  • 15
    Burn

    Burn is a new comprehensive dynamic Deep Learning Framework

    Burn is a new comprehensive dynamic Deep Learning Framework from Tracel AI built using Rust with extreme flexibility, compute efficiency and portability as its primary goals. Burn emphasizes performance, flexibility, and portability for both training and inference. Developed in Rust, it is designed to empower machine learning engineers and researchers across industry and academia.
    Downloads: 2 This Week
  • 16
    Feynman

    The open source AI research agent

    Feynman is a command-line AI research agent designed to automate complex research workflows by orchestrating multiple specialized agents that collaborate to gather, analyze, and synthesize information into structured outputs. It operates as a “Claude Code for research,” allowing users to input natural language queries and receive fully developed, source-grounded research briefs, literature reviews, or experimental analyses. The system is built around a multi-agent architecture that includes...
    Downloads: 11 This Week
  • 17
    TurboQuant PyTorch

    From-scratch PyTorch implementation of Google's TurboQuant

    TurboQuant PyTorch is a specialized deep learning optimization framework designed to accelerate neural network inference and training through advanced quantization techniques within the PyTorch ecosystem. The project focuses on reducing the computational and memory footprint of models by converting floating-point representations into lower-precision formats while preserving performance. It provides tools for experimenting with different quantization strategies, enabling developers to balance...
    Downloads: 2 This Week
  • 18
    GLM-4.1V

    GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning

    GLM-4.1V, often described as a smaller, lighter member of the GLM-V family, offers a more resource-efficient option for users who want multimodal capabilities without requiring large compute resources. Though smaller in scale, GLM-4.1V maintains competitive performance for a model of its size: on a number of multimodal reasoning and vision-language tasks it outperforms some much larger models from other families. It represents a trade-off: somewhat reduced capacity compared to 4.5V or 4.6V, but with benefits in speed, deployability, and lower hardware requirements, making it especially useful for developers experimenting locally, building lightweight agents, or deploying on limited infrastructure. ...
    Downloads: 0 This Week
  • 19
    MiniMax-M1

    Open-weight, large-scale hybrid-attention reasoning model

    ...It is built on the MiniMax-Text-01 foundation and keeps the same massive parameter budget, but reworks the attention and training setup for better reasoning and test-time compute scaling. Architecturally, it combines Mixture-of-Experts layers with lightning attention, enabling the model to support a native context length of 1 million tokens while using far fewer FLOPs than comparable reasoning models for very long generations. The team emphasizes efficient scaling of test-time compute: at 100K-token generation lengths, M1 reportedly uses only about 25 percent of the FLOPs of some competing models, making extended “think step” traces more feasible. ...
    Downloads: 0 This Week
  • 20
    Substra

    Low-level Python library used to interact with a Substra network

    An open-source framework supporting privacy-preserving, traceable federated learning and machine learning orchestration. Offers a Python SDK, high-level FL library (SubstraFL), and web UI to define datasets, models, tasks, and orchestrate secure, auditable collaborations.
    Downloads: 0 This Week
  • 21
    Model Explorer

    A modern model graph visualizer and debugger

    Model Explorer is a visual tool for exploring, debugging, and optimizing ML models deployed on edge devices. Developed by Google AI Edge, it offers a browser-based interface to inspect layer-wise performance, memory usage, and inference timing of TensorFlow Lite and other supported models. It’s a powerful utility for developers optimizing models for constrained environments.
    Downloads: 0 This Week
  • 22
    Multica

    The open-source managed agents platform

    ...It introduces a paradigm where agents can be assigned tasks, participate in discussions, and autonomously execute work while reporting progress and blockers in real time. The system integrates with multiple AI coding tools and provides a unified interface for managing tasks, compute environments, and agent execution pipelines. It includes both a web interface and a CLI that connects local or cloud-based runtimes to the platform, enabling flexible deployment and scaling. Multica emphasizes collaboration between humans and AI by allowing agents to operate alongside developers in shared workspaces. It also supports reusable skill accumulation, meaning that solutions generated by agents can be reused across projects to improve efficiency over time.
    Downloads: 5 This Week
  • 23
    DeepSeek Coder

    DeepSeek Coder: Let the Code Write Itself

    DeepSeek-Coder is a series of code-specialized language models designed to generate, complete, and infill code (and mixed code + natural language) with high fluency in both English and Chinese. The models are trained from scratch on a massive corpus (~2 trillion tokens), of which about 87% is code and 13% is natural language. This dataset covers project-level code structure (not just line-by-line snippets), using a large context window (e.g. 16K) and a secondary fill-in-the-blank objective...
    Downloads: 11 This Week
  • 24
    Spice.ai OSS

    A self-hostable CDN for databases

    Spice is a portable runtime offering developers a unified SQL interface to materialize, accelerate, and query data from any database, data warehouse, or data lake. Spice connects, fuses, and delivers data to applications, machine-learning models, and AI backends, functioning as an application-specific, tier-optimized Database CDN. The Spice runtime, written in Rust, is built with industry-leading technologies such as Apache DataFusion, Apache Arrow, Apache Arrow Flight, SQLite, and DuckDB....
    Downloads: 4 This Week
  • 25
    WebLLM

    Bringing large-language models and chat to web browsers

    WebLLM is a modular, customizable JavaScript package that brings language model chats directly onto web browsers with hardware acceleration. Everything runs inside the browser with no server support and is accelerated with WebGPU. This opens up many opportunities to build AI assistants for everyone and to preserve privacy while enjoying GPU acceleration. WebLLM offers a minimalist and modular interface to access the chatbot in the browser. The WebLLM package itself does not come...
    Downloads: 4 This Week