Showing 80 open source projects for "throughput"

  • 1
    Qwen-Agent

    Agent framework and applications built upon Qwen>=3.0

    Qwen-Agent is a framework for building applications and agents with Qwen models (version 3.0+). It provides components for instruction following, tool use (function calling), planning, memory, RAG (retrieval-augmented generation), a code interpreter, and more. It ships with example applications (Browser Assistant, Code Interpreter, Custom Assistant) and supports GUI front ends, backends, and server setups. Agent workflows can maintain context and memory to perform multi-turn or more complex logic over time....
    Downloads: 5 This Week
  • 2
    Pocket TTS

    A TTS that fits in your CPU (and pocket)

    ...Because it is CPU-oriented, it fits well in server environments where GPU access is limited, in desktop apps, or in edge deployments where simplicity matters more than maximum throughput. It also emphasizes developer ergonomics, providing a straightforward API surface that can be integrated into pipelines, assistants, accessibility tools, or batch generation scripts.
    Downloads: 2 This Week
  • 3
    VidGear

    A High-performance cross-platform Video Processing Python framework

    ...The framework is built around modular components called “gears,” each responsible for tasks such as video capture, streaming, encoding, and network transmission. It supports multi-threaded and asynchronous operations, enabling low-latency processing and efficient handling of high-throughput video streams. VidGear is designed to handle a wide range of use cases, including live streaming, video stabilization, screencasting, and distributed video systems. Its emphasis on simplicity allows developers to implement advanced multimedia pipelines with minimal code while maintaining performance and flexibility.
    Downloads: 0 This Week
  • 4
    NeMo Retriever Library

    Document content and metadata extraction microservice

    ...The system is built on NVIDIA NIM microservices, enabling high-performance parallel processing and efficient handling of large datasets. It supports multiple extraction strategies for different document formats, balancing accuracy and throughput depending on the use case. Additionally, it can generate embeddings for extracted content and integrate with vector databases like Milvus, making it well-suited for retrieval-augmented generation pipelines.
    Downloads: 0 This Week
  • 5
    Stable Diffusion WebUI Forge

    Stable Diffusion WebUI Forge is a platform on top of Stable Diffusion

    Stable Diffusion WebUI Forge is a performance- and feature-oriented fork of the popular AUTOMATIC1111 interface that experiments with new backends, memory optimizations, and UX improvements. It targets heavy users and researchers who push large models, ControlNets, and high-resolution pipelines where default settings can become bottlenecks. The fork typically introduces toggles for scheduler behavior, attention implementations, caching, and precision modes to reach better speed or quality...
    Downloads: 2 This Week
  • 6
    Scweet

    Scrape tweets, profiles, followers and following from Twitter/X

    Scweet is a Python-based Twitter/X scraping library and CLI designed to collect tweets, profile timelines, followers, following lists, and user profile data without requiring the official Twitter/X API or a developer account. Instead of depending on deprecated unauthenticated scraping methods, it works by using X’s web GraphQL API together with authenticated browser cookies, which gives it a more current and practical approach for data extraction. The project supports a broad set of...
    Downloads: 1 This Week
  • 7
    MaxText

    A simple, performant and scalable Jax LLM

    ...It is optimized to run efficiently on Google Cloud TPUs and GPUs, enabling researchers and engineers to train models ranging from small experiments to extremely large distributed workloads. The framework focuses on simplicity while still supporting advanced techniques such as model sharding, distributed computation, and high-throughput training pipelines. MaxText includes ready-to-use configurations and reproducible training examples that help developers understand how to deploy large-scale AI workloads with modern machine learning infrastructure.
    Downloads: 1 This Week
  • 8
    bitnet.cpp

    Official inference framework for 1-bit LLMs

    ...BitNet is built to scale across architectures, with configurable kernels and tiling strategies that adapt to different hardware, and it supports large models with impressive throughput even on modest resources.
    Downloads: 1 This Week
  • 9
    GLM-OCR

    Accurate × Fast × Comprehensive

    GLM-OCR is an open-source multimodal optical character recognition (OCR) model built on a GLM-V encoder–decoder foundation that brings robust, accurate document understanding to complex real-world layouts and modalities. Designed to handle text recognition, table parsing, formula extraction, and general information retrieval from documents containing mixed content, GLM-OCR excels across major benchmarks while remaining highly efficient with a relatively compact parameter size (~0.9B),...
    Downloads: 1 This Week
  • 10
    OpenAI Forward

    An efficient forwarding service designed for LLMs

    ...The project can proxy both local and cloud-hosted language model services, which makes it useful for teams that want a single control layer regardless of whether they are using something like LocalAI or a hosted provider compatible with OpenAI-style APIs. A major emphasis of the repository is asynchronous performance, using tools such as uvicorn, aiohttp, and asyncio to support high-throughput forwarding workloads.
    Downloads: 0 This Week
  • 11
    Loki Mode

    Multi-agent autonomous startup system for Claude Code

    ...By supporting multiple AI providers (like Claude Code, OpenAI Codex CLI, and Google Gemini CLI), loki-mode dynamically selects and spawns only the needed agents for a given project, optimizing computational resources and task throughput. Its Reason-Act-Reflect-Verify (RARV) cycle with self-verification loops emphasizes quality and resilience, automating end-to-end development lifecycles.
    Downloads: 0 This Week
  • 12
    FastDeploy

    High-performance Inference and Deployment Toolkit for LLMs and VLMs

    ...The platform enables developers to deploy trained models quickly using optimized inference pipelines that support GPUs, specialized AI accelerators, and other hardware architectures. FastDeploy includes advanced acceleration technologies such as speculative decoding, multi-token prediction, and efficient KV cache management to improve throughput and latency during inference. It also offers compatibility with OpenAI-style APIs and vLLM-like interfaces, allowing developers to integrate deployed models easily into existing applications and services.
    Downloads: 0 This Week
  • 13
    slime LLM

    slime is an LLM post-training framework for RL Scaling

    slime is an open-source large language model (LLM) post-training framework developed to support reinforcement learning (RL)-based scaling and high-performance training workflows for advanced LLMs, blending training and rollout modules into an extensible system. It offers a flexible architecture that connects high-throughput training (e.g., via Megatron-LM) with a customizable data generation pipeline, enabling researchers and engineers to iterate on new RL training paradigms effectively. The framework is designed to support a wide range of training modes, allowing both synchronous and asynchronous RL workflows and programmable rollout interfaces that simplify experimentation with custom environments and reward signals. ...
    Downloads: 0 This Week
  • 14
    nanochat

    The best ChatGPT that $100 can buy

    nanochat is a from-scratch, end-to-end “mini ChatGPT” that shows the entire path from raw text to a chatty web app in one small, dependency-lean codebase. The repository stitches together every stage of the lifecycle: tokenizer training, pretraining a Transformer on a large web corpus, mid-training on dialogue and multiple-choice tasks, supervised fine-tuning, optional reinforcement learning for alignment, and finally efficient inference with caching. Its north star is approachability and...
    Downloads: 0 This Week
  • 15
    ZeusDB Vector Database

    Blazing-fast vector DB with similarity search and metadata filtering

    ZeusDB is a vector database built for fast, scalable similarity search with strong production ergonomics. It combines high-performance approximate nearest neighbor indexes with clean APIs and metadata filtering so applications can retrieve semantically relevant items at low latency. The storage layer is designed for durability and growth, supporting sharding, replication, and background compaction while keeping query tails predictable. Developers get multiple ingestion paths—batch,...
    Downloads: 0 This Week
  • 16
    fairseq2

    FAIR Sequence Modeling Toolkit 2

    fairseq2 is a modern, modular sequence modeling framework developed by Meta AI Research as a complete redesign of the original fairseq library. Built from the ground up for scalability, composability, and research flexibility, fairseq2 supports a broad range of language, speech, and multimodal content generation tasks, including instruction fine-tuning, reinforcement learning from human feedback (RLHF), and large-scale multilingual modeling. Unlike the original fairseq—which evolved into a...
    Downloads: 0 This Week
  • 17
    DLRM

    An implementation of a deep learning recommendation model (DLRM)

    ...It includes data loaders for standard benchmarks (like Criteo), training scripts, evaluation tools, and capabilities like mixed precision, gradient compression, and memory fusion to maximize throughput.
    Downloads: 0 This Week
  • 18
    xFormers

    Hackable and optimized Transformers building blocks

    xFormers is a modular, performance-oriented library of transformer building blocks, designed to let researchers and engineers compose, experiment with, and optimize transformer architectures more flexibly than monolithic frameworks allow. It abstracts components like attention layers, feedforward modules, normalization, and positional encoding, so you can mix and match or swap optimized kernels easily. One of its key goals is efficient attention: it supports dense, sparse, low-rank, and...
    Downloads: 0 This Week
  • 19
    Smallpond

    A lightweight data processing framework built on DuckDB and 3FS

    ...Users write Python-like code (via DataFrame APIs or SQL strings) to express their transformations; behind the scenes, tasks are scheduled (often via Ray) and pushed into DuckDB instances operating on partitioned data. Because the storage layer (3FS) is optimized for random access and high throughput, smallpond can shuffle data, repartition, and manage intermediate results across nodes.
    Downloads: 0 This Week
  • 20
    Ring

    Ring is a reasoning MoE LLM provided and open-sourced by InclusionAI

    Ring is a reasoning Mixture-of-Experts (MoE) large language model (LLM) developed by inclusionAI and derived from Ling. Its design emphasizes reasoning, efficiency, and modular expert activation. In its “flash” variant (Ring-flash-2.0), it optimizes inference by activating only a subset of experts. It applies reinforcement learning and reasoning-optimization techniques, and its architecture and training approach are tuned for efficient, capable reasoning performance....
    Downloads: 0 This Week
  • 21
    Matrix

    Multi-Agent daTa geneRation Infra and eXperimentation framework

    Matrix is a distributed, large-scale engine for multi-agent synthetic data generation and experiments: it provides the infrastructure to run thousands of “agentic” workflows concurrently (e.g. multiple LLMs interacting, reasoning, generating content, data-processing pipelines) by leveraging distributed computing (like Ray + cluster management). The idea is to treat data generation as a “data-to-data” transformation: each input item defines a task, and the runtime orchestrates asynchronous,...
    Downloads: 0 This Week
  • 22
    marqo

    Tensor search for humans

    Marqo is a tensor-based search and analytics engine that can be integrated into applications, websites, and workflows. Because it scales horizontally, Marqo keeps query times fast even with millions of documents. Marqo helps you configure deep-learning models like CLIP to pull semantic meaning from images. It can seamlessly handle image-to-image, image-to-text and...
    Downloads: 0 This Week
  • 23
    MiniMax-01

    Large-language-model & vision-language-model based on Linear Attention

    ...MiniMax-Text-01 uses a hybrid attention architecture that blends Lightning Attention, standard softmax attention, and Mixture-of-Experts (MoE) routing to achieve both high throughput and long-context reasoning. It has 456 billion total parameters with 45.9 billion activated per token and is trained with advanced parallel strategies such as LASP+, varlen ring attention, and Expert Tensor Parallelism, enabling a training context of 1 million tokens and up to 4 million tokens at inference. MiniMax-VL-01 extends this core by adding a 303M-parameter Vision Transformer and a two-layer MLP projector in a ViT–MLP–LLM framework, allowing the model to process images at dynamic resolutions up to 2016×2016.
    Downloads: 0 This Week
  • 24
    Nitrux

    A Linux system for modern computers on an immutable foundation.

    Powered by OpenRC, MauiKit, NX AppHub, and Hyprland.
    Downloads: 423 This Week
  • 25
    DeiT (Data-efficient Image Transformers)

    ...Its key idea is a specialized distillation strategy—including a learnable “distillation token”—that lets a transformer learn effectively from a CNN or transformer teacher on modest-scale datasets. The project provides compact ViT variants (Tiny/Small/Base) that achieve excellent accuracy–throughput trade-offs, making transformers practical beyond massive pretraining regimes. Training involves carefully tuned augmentations, regularization, and optimization schedules to stabilize learning and improve sample efficiency. The repo offers pretrained checkpoints, reference scripts, and ablation studies that clarify which ingredients matter most for data-efficient ViT training.
    Downloads: 0 This Week