Showing 36 open source projects for "inference"

  • 1
    Text Embeddings Inference

    High-performance inference server for text embedding models

    Text Embeddings Inference is a high-performance server designed to serve text embedding models efficiently in production environments. It focuses on delivering fast and scalable embedding generation by leveraging optimized inference techniques and modern hardware acceleration. It is built to support transformer-based embedding models, making it suitable for tasks such as semantic search, clustering, and retrieval-augmented systems.
    Downloads: 0 This Week
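As a rough illustration of the semantic-search use case this entry mentions, the sketch below ranks documents by cosine similarity between embedding vectors. It does not call the Text Embeddings Inference server: the toy three-dimensional vectors stand in for embeddings a server would return, and the names `cosine_similarity` and `rank_documents` are hypothetical helpers chosen for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction; treat as unrelated
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors (assumption): real embeddings have hundreds of dimensions.
docs = {
    "cats": [0.9, 0.1, 0.0],
    "dogs": [0.8, 0.3, 0.1],
    "taxes": [0.0, 0.2, 0.9],
}
query = [1.0, 0.2, 0.0]

ranking = rank_documents(query, docs)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.3f}")
```

In a real deployment the vectors would come from the embedding server and the ranking would usually be delegated to a vector index rather than a linear scan.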
  • 2
    Open WebUI

    User-friendly AI Interface

    Open WebUI is an extensible, feature-rich, and user-friendly self-hosted AI platform designed to operate entirely offline. It supports various LLM runners like Ollama and OpenAI-compatible APIs, with a built-in inference engine for Retrieval Augmented Generation (RAG), making it a powerful AI deployment solution. Key features include effortless setup via Docker or Kubernetes, seamless integration with OpenAI-compatible APIs, granular permissions and user groups for enhanced security, responsive design across devices, and full Markdown and LaTeX support for enriched interactions. ...
    Downloads: 130 This Week
  • 3
    DeepCamera

    Open-Source AI Camera. Empower any camera/CCTV

    DeepCamera empowers your traditional surveillance cameras and CCTV/NVR with machine learning technologies. It provides open-source facial recognition-based intrusion detection, fall detection, and parking lot monitoring, with the inference engine running on your local device. SharpAI-hub is the cloud hosting service for AI applications that helps you deploy them with your CCTV camera on your edge device in minutes. SharpAI yolov7_reid is an open-source Python application that uses AI to detect intruders with traditional surveillance cameras. It uses Yolov7 as a person detector, FastReID for person feature extraction, Milvus as the local vector database for self-supervised learning to identify unseen persons, and Labelstudio to host images locally for further use, such as labeling data and training your own classifier. ...
    Downloads: 12 This Week
  • 4
    Kokoro

    An inference library for Kokoro-82M

    Kokoro is an open-weight text-to-speech model and inference library built around the lightweight Kokoro-82M model. It is designed to generate high-quality speech from text while staying fast, compact, and cost-efficient compared with larger TTS systems. The project is useful for developers who want deployable speech synthesis without depending on a closed platform. It can be installed as a Python package and used in applications, scripts, experiments, or production workflows.
    Downloads: 0 This Week
  • 5
    NemoClaw

    NVIDIA plugin for secure installation of OpenClaw

    ...It installs and configures the NVIDIA OpenShell runtime, which provides a secure environment for running autonomous AI agents. NemoClaw enables users to launch sandboxed agent environments that control network access, file permissions, and inference requests through policy-based security. The platform integrates with AI models such as NVIDIA Nemotron and supports multiple inference backends including cloud APIs, local NIM deployments, and vLLM. Through its command-line interface, developers can deploy, monitor, and manage AI assistants running inside isolated sandboxes. By combining sandbox orchestration, agent management, and AI model integration, NemoClaw provides a secure foundation for building and operating autonomous AI assistants.
    Downloads: 2 This Week
  • 6
    Harbor LLM

    Run a full local LLM stack with one command using Docker

    ...With a single command, users can start preconfigured tools like Ollama and Open WebUI, enabling chat, workflows, and integrations immediately. Harbor supports multiple inference engines, including llama.cpp and vLLM, and connects them seamlessly to user interfaces. It also includes tools for web retrieval, image generation, voice interaction, and workflow automation. Built on Docker, Harbor allows services to run in isolated containers while communicating over a local network. It is intended for local development and experimentation rather than production deployment, giving developers a flexible way to explore AI systems, test configurations, and manage complex LLM stacks without manual wiring or setup overhead.
    Downloads: 0 This Week
  • 7
    LLM Course

    Course to get into Large Language Models (LLMs)

    ...Learners get exposure to multiple adaptation strategies—LoRA/QLoRA, instruction fine-tuning, and alignment techniques—so they can choose approaches that fit their hardware and budgets. The materials also cover inference optimization and quantization to make serving LLMs feasible on commodity GPUs or even CPUs, which is crucial for side projects and startups. Evaluation is treated as a first-class topic, with examples of automatic and human-in-the-loop methods to catch regressions and verify quality beyond simple loss values. By the end, students have a mental model and a practical toolkit for iterating on datasets, training configs, etc.
    Downloads: 0 This Week
  • 8
    Model Explorer

    A modern model graph visualizer and debugger

    Model Explorer is a visual tool for exploring, debugging, and optimizing ML models deployed on edge devices. Developed by Google AI Edge, it offers a browser-based interface to inspect layer-wise performance, memory usage, and inference timing of TensorFlow Lite and other supported models. It’s a powerful utility for developers optimizing models for constrained environments.
    Downloads: 1 This Week
  • 9
    Operit AI

    Powerful Android AI agent with tools, automation, and Linux shell

    Operit is a full-featured AI assistant and agent platform designed specifically for Android devices, aiming to go far beyond traditional chat-based interfaces. It integrates deep system-level capabilities with a wide range of tools, allowing the AI to perform real tasks such as file management, automation, and system control directly on the device. A standout aspect of the project is its built-in Ubuntu 24 environment, which enables users to run Linux commands, scripts, and development tools...
    Downloads: 13 This Week
  • 10
    Groq Desktop

    Local Groq Desktop chat app with MCP support

    ...Developers can also use groq-desktop-beta as a lightweight interface to test prompts, media inputs, or function-calling capabilities before embedding them into larger projects. The project offers installable builds (including via Homebrew on macOS) and supports easy setup, giving quick access to Groq’s inference services without needing to spin up a full backend.
    Downloads: 13 This Week
  • 11
    GitHub Actions for DigitalOcean

    GitHub Actions for DigitalOcean - doctl

    ...Powerful and production-ready, our cloud platform has the solutions that devs like you need to succeed, whether you're building world-changing AI apps, running a side project, or building a business. GPU solutions for everyone—novice to expert. Run training and inference, process large data sets and complex neural networks, and deploy high-performance computing clusters.
    Downloads: 0 This Week
  • 12
    ARC-AGI

    The Abstraction and Reasoning Corpus

    ARC-AGI is a benchmark dataset and experimental framework designed to evaluate and advance artificial general intelligence by testing systems on abstract reasoning tasks that require human-like problem-solving abilities. It consists of a curated set of tasks where models must infer patterns from input-output examples and apply those rules to new unseen cases, without relying on memorization or prior training data. The dataset is structured as grid-based puzzles, where each task requires...
    Downloads: 3 This Week
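To make the "infer a rule from input-output examples, then apply it to an unseen case" idea in this entry concrete, here is a deliberately tiny sketch. The task dictionary, the three candidate rules, and the helper `infer_rule` are all assumptions made for illustration; real ARC-AGI tasks are far harder than picking from a fixed list of transformations.

```python
def identity(grid):
    """Return a copy of the grid unchanged."""
    return [list(row) for row in grid]


def flip_horizontal(grid):
    """Mirror each row left to right."""
    return [list(reversed(row)) for row in grid]


def transpose(grid):
    """Swap rows and columns."""
    return [list(col) for col in zip(*grid)]


# Hypothetical rule library; checked in insertion order.
CANDIDATE_RULES = {
    "identity": identity,
    "flip_horizontal": flip_horizontal,
    "transpose": transpose,
}


def infer_rule(train_pairs):
    """Return the name of the first rule consistent with every example pair."""
    for name, rule in CANDIDATE_RULES.items():
        if all(rule(pair["input"]) == pair["output"] for pair in train_pairs):
            return name
    return None


# Toy task (assumption): grids are lists of rows of small integers.
task = {
    "train": [
        {"input": [[1, 0], [2, 0]], "output": [[0, 1], [0, 2]]},
        {"input": [[3, 3, 0]], "output": [[0, 3, 3]]},
    ],
    "test_input": [[5, 0, 0], [0, 6, 0]],
}

rule_name = infer_rule(task["train"])
prediction = CANDIDATE_RULES[rule_name](task["test_input"]) if rule_name else None
print(rule_name, prediction)
```

The point of the sketch is the shape of the problem: the rule is never given, only demonstrated, and the solver is scored on the grid it produces for the held-out input.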
  • 13
    Cognita

    Open source RAG framework for building scalable modular AI apps

    Cognita is an open source framework designed to help developers build, organize, and deploy Retrieval-Augmented Generation (RAG) applications in a structured and production-ready way. It addresses the gap between quick experimentation in notebooks and the complexity of deploying scalable AI systems by introducing a modular and API-driven architecture. Cognita provides reusable components such as parsers, data loaders, embedders, retrievers, and query controllers, allowing teams to customize...
    Downloads: 1 This Week
  • 14
    JSON to Go

    Translates JSON into a Go type in your browser instantly (original)

    JSON to Go is a browser-based developer tool that converts JSON samples into Go struct definitions. It is designed to save Go developers time when working with APIs, configuration files, or external JSON payloads. Users paste JSON into the tool, and it generates a matching Go type that can be copied into a project. The tool makes reasonable assumptions about field names, types, nested objects, arrays, and struct tags, but it still expects users to review the output before using it in...
    Downloads: 0 This Week
  • 15
    Supertonic

    Lightning-fast, on-device TTS, running natively via ONNX

    Supertonic is a lightning-fast, on-device text-to-speech system built around ONNX Runtime for maximum speed and portability. It focuses on running entirely locally, eliminating the need for cloud APIs and providing low latency and strong privacy guarantees, even on constrained devices like Raspberry Pi boards and e-readers. The core model is highly compact at around 66 million parameters, yet benchmarks show it can generate speech up to 167× faster than real time on modern consumer hardware...
    Downloads: 1 This Week
  • 16
    Jaaz

    Open source multimodal creative AI assistant with infinite canvas tool

    ...It combines AI agents with visual editing tools, allowing users to generate media through prompts, sketches, or simple instructions. Jaaz supports multiple AI models and can integrate both local and cloud-based inference systems, enabling flexible creative workflows. Jaaz emphasizes privacy and local-first operation, allowing creators to run AI models locally so that their data does not leave their device. It also includes collaborative planning tools such as visual layouts and storyboard organization to support complex creative projects. By combining generative AI with a canvas-based interface, the project aims to provide a creative platform.
    Downloads: 0 This Week
  • 17
    Text-to-image Playground

    A playground to generate images from any text prompt using SD

    ...The platform demonstrates how large generative models can be integrated into user-friendly tools for creative exploration and rapid prototyping. It also serves as a reference architecture for building full-stack generative AI applications that connect model inference pipelines with web interfaces.
    Downloads: 0 This Week
  • 18
    ChatAnyLLM

    Unified interface for local models, cloud providers, and custom agents.

    ChatAnyLLM is a desktop GUI application for local inference engines (Ollama, LM Studio, OpenClaw) and cloud providers like OpenRouter. Users can manually configure any OpenAI-compatible API endpoint to connect third-party providers such as Groq or Cerebras. The application stores conversation history locally and saves API keys with system-level encryption. It supports reasoning models, multimodal inputs, and formatting for LaTeX and code.
    Downloads: 8 This Week
  • 19
    pipeless

    A computer vision framework to create and deploy apps in minutes

    ...You provide some functions that are executed for new video frames and Pipeless takes care of everything else. You can easily use industry-standard models, such as YOLO, or load your custom model in one of the supported inference runtimes. Pipeless ships some of the most popular inference runtimes, such as the ONNX Runtime, allowing you to run inference with high performance on CPU or GPU out-of-the-box. You can deploy your Pipeless application with a single command to edge and IoT devices or the cloud.
    Downloads: 0 This Week
  • 20
    gpu_poor

    Calculate token/s & GPU memory requirement for any LLM

    gpu_poor is an open-source tool designed to help developers determine whether their hardware is capable of running a specific large language model and to estimate the performance they can expect from it. The project focuses on calculating GPU memory requirements and predicted inference speed for different models, hardware configurations, and quantization strategies. By analyzing factors such as model size, context length, batch size, and GPU specifications, the system estimates how much VRAM will be required and how fast tokens can be generated during inference. The tool also provides a detailed breakdown of where GPU memory is allocated, including model weights, KV cache, activations, and other runtime overhead. ...
    Downloads: 0 This Week
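The kind of estimate this entry describes can be sketched with back-of-the-envelope arithmetic. The code below is a simplified approximation, not gpu_poor's actual formula: it counts only model weights and the KV cache, ignores activations and runtime overhead, and the model dimensions (7 billion parameters, 32 layers, 32 KV heads, head dimension 128) are assumed values for a generic 7B-class transformer.

```python
def weights_bytes(num_params, bytes_per_param):
    """Memory for the model weights alone."""
    return num_params * bytes_per_param


def kv_cache_bytes(num_layers, num_kv_heads, head_dim,
                   context_len, batch_size, bytes_per_value):
    """Memory for cached keys and values.

    The leading factor of 2 accounts for storing both a key and a value
    vector per token, per layer, per KV head.
    """
    return (2 * num_layers * num_kv_heads * head_dim
            * context_len * batch_size * bytes_per_value)


def to_gib(num_bytes):
    """Convert bytes to gibibytes (2**30 bytes)."""
    return num_bytes / 2**30


# Assumed configuration: fp16 weights and cache (2 bytes per value).
w = weights_bytes(num_params=7_000_000_000, bytes_per_param=2)
kv = kv_cache_bytes(num_layers=32, num_kv_heads=32, head_dim=128,
                    context_len=4096, batch_size=1, bytes_per_value=2)

print(f"weights:  {to_gib(w):.2f} GiB")
print(f"KV cache: {to_gib(kv):.2f} GiB")
print(f"total:    {to_gib(w + kv):.2f} GiB (before activations and overhead)")
```

Changing `bytes_per_param` to 0.5 approximates 4-bit weight quantization, and raising `context_len` or `batch_size` shows how quickly the KV cache grows relative to the weights.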
  • 21
    enhancr

    Video Frame Interpolation & Super Resolution using NVIDIA's TensorRT

    ...The GUI was designed to provide a polished experience powered by state-of-the-art technologies without feeling clunky and outdated like other alternatives. It features fast TensorRT inference by NVIDIA, which can speed up AI processing significantly, and it comes pre-packaged, with no need to install Docker or WSL (Windows Subsystem for Linux). It also offers NCNN inference by Tencent, which is lightweight and runs on NVIDIA, AMD, and even Apple Silicon hardware, in contrast to the much heavier PyTorch inference stack, which only runs on NVIDIA GPUs.
    Downloads: 8 This Week
  • 22
    Style2Paints

    Sketch + style = paints

    style2paints is an AI-assisted colorization system aimed primarily at line art and manga, turning monochrome drawings into colored illustrations with minimal manual effort. It combines automatic color inference with user guidance, letting artists nudge the model using sparse color hints, masks, or style references. The pipeline focuses on preserving line quality while spreading coherent colors and shading across regions that are often ambiguous to purely automatic methods. Iterative refinement is a core workflow: you can add or adjust hints, rerun inference, and progressively converge on a desired palette and lighting. ...
    Downloads: 11 This Week
  • 23
    Caramel

    Functional language for building type-safe applications

    ...Caramel leverages the OCaml compiler to provide a pragmatic type system and industrial-strength type safety, and the Erlang VM, known for running low-latency, distributed, and fault-tolerant systems used in a wide range of industries. It offers excellent type inference, so you never need to annotate your code, and it supports sources in OCaml (and soon Reason syntax too). Caramel aims to make building type-safe concurrent programs a productive and fun experience. Caramel should let anyone with existing OCaml or Reason experience be up and running without having to relearn the entire language. Caramel strives to integrate with the larger ecosystem of BEAM languages, like Erlang, Elixir, Gleam, Purerl, LFE, and Hamler.
    Downloads: 0 This Week
  • 24
    commit-autosuggestions

    An AI tool that automatically recommends commit messages

    ...However, most code changes do not consist only of added code; some parts of the code are deleted as well. We plan to gradually add support for languages that are not currently supported. To run this project, you need a Flask-based inference server (GPU) and a client (commit module). If you don't have a GPU, don't worry: you can run it through Google Colab.
    Downloads: 0 This Week
  • 25
    SQLFlow

    SQL compiler bridging databases and machine learning workflows

    SQLFlow is an open source project designed to bridge the gap between traditional SQL-based data processing and modern machine learning workflows by extending SQL syntax with AI capabilities. It acts as a compiler that translates SQL programs into executable workflows, enabling users to train, evaluate, and deploy machine learning models directly from SQL statements. It integrates with multiple database engines such as MySQL, Hive, and MaxCompute, while also supporting machine learning...
    Downloads: 1 This Week