Showing 115 open source projects for "runtime"

  • 1
    ONNX Runtime

    ONNX Runtime: cross-platform, high performance ML inferencing

    ONNX Runtime is a cross-platform inference and training machine-learning accelerator. ONNX Runtime inference can enable faster customer experiences and lower costs, supporting models from deep learning frameworks such as PyTorch and TensorFlow/Keras as well as classical machine learning libraries such as scikit-learn, LightGBM, XGBoost, etc.
    Downloads: 27 This Week
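
    A minimal inference sketch with the Python API; "model.onnx" and the input shape are placeholders, and the input name is discovered from the session rather than assumed:

    ```python
    # Load a model and run one inference pass on the CPU provider.
    import numpy as np
    import onnxruntime as ort

    session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
    input_name = session.get_inputs()[0].name                  # ask the model for its input name
    dummy = np.random.rand(1, 3, 224, 224).astype(np.float32)  # placeholder shape
    outputs = session.run(None, {input_name: dummy})           # None = fetch all outputs
    print(outputs[0].shape)
    ```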
  • 2
    Monoio

    Rust async runtime based on io-uring

    Monoio is a Rust asynchronous runtime designed for high-performance I/O-bound servers and applications, built directly on native OS async I/O primitives (io_uring on Linux, with epoll and kqueue as fallbacks on Linux and other Unix-like systems) rather than layering atop an existing runtime. Its design philosophy centers on a “thread-per-core” model where each core runs its own event loop, minimizing cross-thread synchronization, avoiding the overhead and complexity of work-stealing task scheduling, and letting developers write efficient, low-overhead asynchronous networking and I/O code. ...
    Downloads: 0 This Week
  • 3
    ByteHook

    ByteHook is an Android PLT hook library

    ...As such, ByteHook (also known as bhook) serves developers who need fine-grained control over runtime execution, e.g. to intercept calls, log behavior, protect processes, or adapt system behavior dynamically.
    Downloads: 0 This Week
  • 4
    kokoro-onnx

    TTS with kokoro and onnx runtime

    kokoro-onnx is a text-to-speech toolkit that wraps the Kokoro neural TTS model in an easy-to-use ONNX Runtime interface, so you can generate speech from Python with minimal setup. It focuses on running efficiently on commodity hardware, including macOS with Apple Silicon, while still delivering near real-time performance for many use cases. The project ships prebuilt model files and a simple example script, so you can go from installation to producing an audio.wav file in just a few steps. ...
    Downloads: 4 This Week
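
    A short sketch of the flow described above, based on the project's README; the model and voice file names and the voice id are assumptions that may change between releases:

    ```python
    # Synthesize speech and write the audio.wav mentioned above.
    # File names and the voice id ("af_sarah") are assumed from the README.
    import soundfile as sf
    from kokoro_onnx import Kokoro

    kokoro = Kokoro("kokoro-v1.0.onnx", "voices-v1.0.bin")
    samples, sample_rate = kokoro.create(
        "Hello from kokoro-onnx!", voice="af_sarah", speed=1.0, lang="en-us"
    )
    sf.write("audio.wav", samples, sample_rate)
    ```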
  • anny is an all-in-one platform for managing hybrid workplaces and shared resources. Icon
    anny is an all-in-one platform for managing hybrid workplaces and shared resources.

    For Businesses looking for a flexible solution for internal and external bookings

    Enable your employees to easily book desks, meeting rooms, parking spots, equipment, and more – all in one place. With flexible rules and group permissions, you stay in full control of who can access what.
    Learn More
  • 5
    Luminal

    Deep learning at the speed of light

    ...Instead of treating data processing as a series of ad-hoc scripts, Luminal models transformations as strongly typed building blocks that can be composed into reliable, scalable pipelines. The project emphasizes correctness and performance by requiring explicit types for the data flowing through transformations, reducing runtime surprises and allowing for highly optimized execution. It is particularly well-suited for data engineering workflows where large datasets must be processed incrementally, efficiently, and deterministically. The framework also includes a runtime capable of executing pipelines across multiple backends, making it flexible in cloud and local environments.
    Downloads: 1 This Week
  • 6
    LiteRT

    LiteRT is the new name for TensorFlow Lite (TFLite)

    LiteRT is Google AI Edge's on-device inference runtime and the continuation of TensorFlow Lite (TFLite), built to run lightweight ML models on edge devices with low latency. It focuses on delivering predictable and consistent performance for models used in time-critical applications like robotics, AR/VR, and IoT. LiteRT is designed to be hardware-agnostic, with minimal dependencies and tight control over execution scheduling.
    Downloads: 1 This Week
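
    LiteRT inherits the long-standing TFLite interpreter API, so a sketch using that API should carry over; "model.tflite" is a placeholder (newer releases also ship the same interface in the ai-edge-litert package):

    ```python
    # Run one inference with the TFLite/LiteRT interpreter API.
    import numpy as np
    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path="model.tflite")
    interpreter.allocate_tensors()

    inp = interpreter.get_input_details()[0]
    out = interpreter.get_output_details()[0]

    x = np.zeros(inp["shape"], dtype=inp["dtype"])  # dummy input of the right shape
    interpreter.set_tensor(inp["index"], x)
    interpreter.invoke()
    print(interpreter.get_tensor(out["index"]).shape)
    ```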
  • 7
    IREE

    A retargetable MLIR-based machine learning compiler runtime toolkit

    IREE (Intermediate Representation Execution Environment, pronounced as "eerie") is an MLIR-based end-to-end compiler and runtime that lowers Machine Learning (ML) models to a unified IR that scales up to meet the needs of the data center and down to satisfy the constraints and special considerations of mobile and edge deployments.
    Downloads: 1 This Week
  • 8
    langrocks

    Tools like web browser, computer access and code runner for LLMs

    Langrocks gives LLM applications a set of tools they can call to act on the outside world, such as a web browser, computer access, and a code runner.
    Downloads: 0 This Week
  • 9
    whisper.cpp

    Port of OpenAI's Whisper model in C/C++

    whisper.cpp is a lightweight C/C++ reimplementation of OpenAI’s Whisper automatic speech recognition (ASR) model, designed for efficient, standalone transcription without external dependencies. The entire high-level implementation of the model is contained in whisper.h and whisper.cpp; the rest of the code is part of the ggml machine learning library. The project's example command downloads the base.en model, converted to the custom ggml format, and runs inference on all .wav samples in the samples folder. ...
    Downloads: 312 This Week
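
    A sketch of that flow driven from Python; it assumes the repo has been built and the base.en ggml model downloaded, and the binary path reflects older releases (newer builds place it at build/bin/whisper-cli):

    ```python
    # Transcribe one of the bundled samples via the whisper.cpp CLI.
    import subprocess

    result = subprocess.run(
        ["./main", "-m", "models/ggml-base.en.bin", "-f", "samples/jfk.wav"],
        capture_output=True, text=True, check=True,
    )
    print(result.stdout)  # the transcription is printed to stdout
    ```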
  • 10
    Spice.ai OSS

    A self-hostable CDN for databases

    Spice is a portable runtime offering developers a unified SQL interface to materialize, accelerate, and query data from any database, data warehouse, or data lake. Spice connects, fuses, and delivers data to applications, machine-learning models, and AI backends, functioning as an application-specific, tier-optimized Database CDN. The Spice runtime, written in Rust, is built with industry-leading technologies such as Apache DataFusion, Apache Arrow, Apache Arrow Flight, SQLite, and DuckDB. ...
    Downloads: 4 This Week
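
    A hedged sketch of querying a locally running runtime over HTTP; the port and the /v1/sql path follow the project's documentation at the time of writing and should be treated as assumptions:

    ```python
    # POST a SQL query to a local Spice runtime and print the JSON result.
    import requests

    resp = requests.post(
        "http://localhost:8090/v1/sql",          # assumed default HTTP endpoint
        data="SELECT 1 AS answer",
        headers={"Content-Type": "text/plain"},
        timeout=10,
    )
    resp.raise_for_status()
    print(resp.json())
    ```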
  • 11
    Torch-TensorRT

    PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT

    Torch-TensorRT is a compiler for PyTorch/TorchScript, targeting NVIDIA GPUs via NVIDIA’s TensorRT Deep Learning Optimizer and Runtime. Unlike PyTorch’s Just-In-Time (JIT) compiler, Torch-TensorRT is an Ahead-of-Time (AOT) compiler, meaning that before you deploy your TorchScript code, you go through an explicit compile step to convert a standard TorchScript program into a module targeting a TensorRT engine. Torch-TensorRT operates as a PyTorch extension and compiles modules that integrate into the JIT runtime seamlessly. ...
    Downloads: 4 This Week
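
    A sketch of the explicit AOT compile step described above; the network, input shape, and fp16 setting are illustrative and assume a CUDA-capable GPU:

    ```python
    # Compile a PyTorch module into a TensorRT-backed module, then run it.
    import torch
    import torch_tensorrt
    import torchvision.models as models

    model = models.resnet18(weights=None).eval().cuda()
    trt_module = torch_tensorrt.compile(
        model,
        inputs=[torch_tensorrt.Input((1, 3, 224, 224))],  # fixed input shape
        enabled_precisions={torch.half},                  # allow fp16 kernels
    )
    x = torch.randn(1, 3, 224, 224, device="cuda")
    print(trt_module(x).shape)
    ```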
  • 12
    Elkeid

    Open source solution for securing hosts, containers, and cloud-native workloads

    ...It was born out of ByteDance’s internal security best practices, offering community users a subset of its enterprise-grade capabilities. Elkeid combines kernel-level data collection, user-space agents, and runtime instrumentation (RASP) to detect malicious behavior, file anomalies, runtime exploits, and suspicious container activity. For container or cloud-native workloads, it also supports gathering audit logs from Kubernetes and correlating events across process, network, and file activity to detect security threats. The platform packages data collection, event streaming, and a rule/event engine (called “HUB”), letting users define detection rules, alerts, baseline checks, and policy enforcement.
    Downloads: 0 This Week
  • 13
    mlx

    MLX: An array framework for Apple silicon

    MLX is an array framework for machine learning research on Apple silicon, developed by Apple. It provides a NumPy-like Python API (with C++, C, and Swift counterparts), composable function transformations for automatic differentiation and vectorization, lazy computation, and a unified memory model in which arrays live in shared memory, so operations can run on the CPU or GPU without copying data. Ideal for researchers experimenting with models directly on Mac hardware.
    Downloads: 0 This Week
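
    A minimal sketch of the array API: computation is lazy until evaluated, and gradients are ordinary function transformations:

    ```python
    # Lazy arrays plus a composable gradient transformation in MLX.
    import mlx.core as mx

    def loss(w):
        return mx.sum((w - 1.0) ** 2)  # simple scalar loss

    w = mx.array([0.0, 2.0, 4.0])
    g = mx.grad(loss)(w)  # d(loss)/dw, built like any other function
    mx.eval(g)            # nothing is computed until evaluation is forced
    print(g)              # -> array([-2, 2, 6], ...)
    ```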
  • 14
    llama.cpp

    Port of Facebook's LLaMA model in C/C++

    The llama.cpp project enables the inference of Meta's LLaMA model (and other models) in pure C/C++ without requiring a Python runtime. It is designed for efficient and fast model execution, offering easy integration for applications needing LLM-based capabilities. The repository focuses on providing a highly optimized and portable implementation for running large language models directly within C/C++ environments.
    Downloads: 93 This Week
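
    One common way to use it from another language is through the bundled llama-server, which exposes an OpenAI-compatible HTTP API; this sketch assumes a server already running on its default port with a model loaded:

    ```python
    # Chat with a local llama.cpp server over its OpenAI-compatible API.
    import requests

    resp = requests.post(
        "http://localhost:8080/v1/chat/completions",
        json={
            "messages": [{"role": "user", "content": "Say hello in one sentence."}],
            "max_tokens": 64,
        },
        timeout=60,
    )
    resp.raise_for_status()
    print(resp.json()["choices"][0]["message"]["content"])
    ```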
  • 15
    OpenVINO

    OpenVINO™ Toolkit repository

    ...Reduce resource demands and efficiently deploy on a range of Intel® platforms from edge to cloud. This open-source version includes several components: namely Model Optimizer, OpenVINO™ Runtime, Post-Training Optimization Tool, as well as CPU, GPU, MYRIAD, multi-device, and heterogeneous plugins to accelerate deep learning inferencing on Intel® CPUs and Intel® Processor Graphics. It supports pre-trained models from the Open Model Zoo, along with 100+ open source and public models in popular formats such as TensorFlow, ONNX, PaddlePaddle, MXNet, Caffe, and Kaldi.
    Downloads: 12 This Week
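
    A minimal sketch with the OpenVINO Runtime Python API; "model.xml" stands in for an IR file produced by the Model Optimizer:

    ```python
    # Read an IR model, compile it for CPU, and run one inference.
    import numpy as np
    from openvino.runtime import Core

    core = Core()
    model = core.read_model("model.xml")
    compiled = core.compile_model(model, "CPU")

    x = np.zeros(tuple(compiled.input(0).shape), dtype=np.float32)  # dummy input
    result = compiled([x])[compiled.output(0)]
    print(result.shape)
    ```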
  • 16
    Agentex

    Open source codebase for Scale Agentex

    AgentEX is an open framework from Scale for building, running, and evaluating agentic workflows, with an emphasis on reproducibility and measurable outcomes rather than ad-hoc demos. It treats an “agent” as a composition of a policy (the LLM), tools, memory, and an execution runtime so you can test the whole loop, not just prompting. The repo focuses on structured experiments: standardized tasks, canonical tool interfaces, and logs that make it possible to compare models, prompts, and tool sets fairly. It also includes evaluation harnesses that capture success criteria and partial credit, plus traces you can inspect to understand where reasoning or tool use failed. ...
    Downloads: 2 This Week
  • 17
    SGLang

    SGLang is a fast serving framework for large language models

    SGLang is a fast serving framework for large language models and vision language models. It makes your interaction with models faster and more controllable by co-designing the backend runtime and frontend language.
    Downloads: 14 This Week
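
    A sketch of the frontend language; it assumes a backend server was already launched (e.g. python -m sglang.launch_server --model-path <model> --port 30000):

    ```python
    # Define a small generation program and run it against a local server.
    import sglang as sgl

    @sgl.function
    def qa(s, question):
        s += sgl.user(question)
        s += sgl.assistant(sgl.gen("answer", max_tokens=128))

    sgl.set_default_backend(sgl.RuntimeEndpoint("http://localhost:30000"))
    state = qa.run(question="What is a runtime?")
    print(state["answer"])
    ```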
  • 18
    Matrix

    Multi-Agent daTa geneRation Infra and eXperimentation framework

    Matrix is a distributed, large-scale engine for multi-agent synthetic data generation and experiments: it provides the infrastructure to run thousands of “agentic” workflows concurrently (e.g. multiple LLMs interacting, reasoning, generating content, data-processing pipelines) by leveraging distributed computing (like Ray + cluster management). The idea is to treat data generation as a “data-to-data” transformation: each input item defines a task, and the runtime orchestrates asynchronous, peer-to-peer agent workflows, avoiding global synchronization bottlenecks. That design makes Matrix particularly well-suited for large-batch inference, model benchmarking, data curation, augmentation, or generation — whether for language, code, dialogue, or multimodal tasks. It supports both open-source LLMs and proprietary models (via integration with model backends), and works with containerized or sandboxed environments for safe tool execution or external code runs.
    Downloads: 2 This Week
  • 19
    MobileCLIP

    Implementation of "MobileCLIP" CVPR 2024

    ...It includes an iOS demo app and Core ML artifacts to showcase practical, offline photo search and classification on iPhone-class hardware. Project notes highlight latency/accuracy trade-offs, with MobileCLIP2 variants matching or surpassing larger baselines at notably lower parameter counts and runtime on mobile devices. A companion “mobileclip-dr” repository details large-scale, distributed data-generation pipelines used to reinforce datasets across billions of samples on thousands of GPUs. Overall, MobileCLIP emphasizes end-to-end practicality: scalable training, deployable models, and consumer-grade demos.
    Downloads: 0 This Week
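
    A sketch of zero-shot classification following the repo's README; the checkpoint path and image file are placeholders:

    ```python
    # Score an image against text prompts with a small MobileCLIP variant.
    import torch
    from PIL import Image
    import mobileclip

    model, _, preprocess = mobileclip.create_model_and_transforms(
        "mobileclip_s0", pretrained="checkpoints/mobileclip_s0.pt"
    )
    tokenizer = mobileclip.get_tokenizer("mobileclip_s0")

    image = preprocess(Image.open("photo.jpg").convert("RGB")).unsqueeze(0)
    text = tokenizer(["a dog", "a cat", "a car"])

    with torch.no_grad():
        image_features = model.encode_image(image)
        text_features = model.encode_text(text)
        image_features /= image_features.norm(dim=-1, keepdim=True)
        text_features /= text_features.norm(dim=-1, keepdim=True)
        probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)
    print(probs)
    ```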
  • 20
    DeepGEMM

    Clean and efficient FP8 GEMM kernels with fine-grained scaling

    ...It supports both standard and “grouped” GEMMs, which is useful for architectures like Mixture of Experts (MoE) that require segmented matrix multiplications. One distinguishing aspect is that DeepGEMM compiles its kernels at runtime via a lightweight Just-In-Time (JIT) module, so users don’t need to precompile CUDA kernels before installation. Despite its lean design, it includes fine-grained scaling strategies and optimizations inspired by cutting-edge systems (drawing on ideas from CUTLASS and CuTe) in a more streamlined form.
    Downloads: 1 This Week
  • 21
    shadowhook

    Android inline hook library which supports thumb, arm32 and arm64

    shadowhook is an open-source native code hooking library for Android, designed to let developers intercept and override native (C/C++) functions inside Android apps at runtime. It supports both ARM32 (including the Thumb instruction set) and ARM64 architectures and works across a wide range of Android OS versions. The library allows you to specify hook targets either by function address or by library name plus function name, and it automatically handles newly loaded shared libraries (ELFs), ensuring hooks remain effective even when code is dynamically loaded at runtime. ...
    Downloads: 1 This Week
  • 22
    NVIDIA FLARE

    NVIDIA Federated Learning Application Runtime Environment

    NVIDIA FLARE (Federated Learning Application Runtime Environment) is a domain-agnostic, open-source, extensible SDK that allows researchers and data scientists to adapt existing ML/DL workflows (PyTorch, TensorFlow, scikit-learn, XGBoost, etc.) to a federated paradigm. It enables platform developers to build a secure, privacy-preserving offering for distributed multi-party collaboration.
    Downloads: 0 This Week
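
    A hedged sketch of the federated client loop using FLARE's Client API; local_train is a hypothetical stand-in for the site's own training step:

    ```python
    # Receive the global model, "train" locally, and send the update back.
    import nvflare.client as flare
    from nvflare.app_common.abstract.fl_model import FLModel

    def local_train(params):
        # hypothetical local training step; returns params unchanged here
        return params

    flare.init()
    while flare.is_running():
        input_model = flare.receive()               # global weights from the server
        new_params = local_train(input_model.params)
        flare.send(FLModel(params=new_params))      # site update back to the server
    ```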
  • 23
    MochiDiffusion

    Run Stable Diffusion on Mac natively

    MochiDiffusion is a native macOS application that allows users to run Stable Diffusion models locally, leveraging Apple Silicon GPU acceleration via Core ML. It offers users GUI controls for prompts and model configuration without needing Python or Docker, enabling offline image generation.
    Downloads: 1 This Week
  • 24
    5ire

    5ire is a cross-platform desktop AI assistant, MCP client

    5ire is a sleek, cross-platform desktop AI assistant and MCP client that connects to major service providers and supports a local knowledge base and tool integration via MCP servers, enabling robust RAG and assistant features. Supporting runtime components are required to run MCP servers; if you don't anticipate using the tools feature immediately, you can skip that installation step and complete it later when the need arises. MCP is an open protocol that standardizes how applications provide context to LLMs. Think of MCP as a USB-C port for AI applications: just as USB-C provides a standardized way to connect your devices to various peripherals and accessories, MCP provides a standardized way to connect AI models to different data sources and tools.
    Downloads: 15 This Week
  • 25
    Agent Stack

    Deploy and share agents with open infrastructure

    Agent Stack is an open infrastructure platform designed to take AI agents from prototype to production, no matter how they were built. It includes a runtime environment, multi-tenant web UI, catalog of agents, and deployment flow that seeks to remove vendor lock-in and provide greater autonomy. Under the hood it’s built on the “Agent2Agent” (A2A) protocol, enabling interoperability between different agent ecosystems, runtime services, and frameworks. The platform supports agents built in frameworks like LangChain, CrewAI, etc., enabling them to be hosted, managed and shared through a unified interface. ...
    Downloads: 0 This Week