inference free download

Showing 36 open source projects for "inference"

1

Text Embeddings Inference

High-performance inference server for text embedding models

Text Embeddings Inference is a high-performance server designed to serve text embedding models efficiently in production environments. It focuses on delivering fast and scalable embedding generation by leveraging optimized inference techniques and modern hardware acceleration. It is built to support transformer-based embedding models, making it suitable for tasks such as semantic search, clustering, and retrieval-augmented systems.

Downloads: 0 This Week

Last Update: 2026-03-23
See Project
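
The semantic search use case mentioned in the entry above usually comes down to comparing embedding vectors. The following is a minimal sketch of that comparison step only, assuming the vectors have already been produced by some embedding server; the toy vectors and the helper names `cosine_similarity` and `rank_documents` are illustrative and are not part of Text Embeddings Inference.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scores = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical 2-dimensional embeddings; real models emit hundreds of dimensions.
docs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
ranking = rank_documents([1.0, 0.0], docs)
print([doc_id for doc_id, _ in ranking])
```

In practice a vector database or approximate nearest-neighbor index would replace the linear scan once the document set grows.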
2

Open WebUI

User-friendly AI Interface

Open WebUI is an extensible, feature-rich, and user-friendly self-hosted AI platform designed to operate entirely offline. It supports various LLM runners like Ollama and OpenAI-compatible APIs, with a built-in inference engine for Retrieval Augmented Generation (RAG), making it a powerful AI deployment solution. Key features include effortless setup via Docker or Kubernetes, seamless integration with OpenAI-compatible APIs, granular permissions and user groups for enhanced security, responsive design across devices, and full Markdown and LaTeX support for enriched interactions. ...

Downloads: 130 This Week

Last Update: 2026-06-02
See Project
3

DeepCamera

Open-Source AI Camera. Empower any camera/CCTV

DeepCamera empowers your traditional surveillance cameras and CCTV/NVR with machine learning technologies. It provides open-source facial recognition-based intrusion detection, fall detection, and parking lot monitoring, with the inference engine running on your local device. SharpAI-hub is the cloud hosting for AI applications that helps you deploy AI applications with your CCTV camera on your edge device in minutes. SharpAI yolov7_reid is an open-source Python application that leverages AI technologies to detect intruders with traditional surveillance cameras. It uses YOLOv7 as a person detector, FastReID for person feature extraction, Milvus as a local vector database for self-supervised learning to identify unseen persons, and Label Studio to host images locally for further use, such as labeling data and training your own classifier. ...

Downloads: 12 This Week

Last Update: 2026-03-20
See Project
4

Kokoro

An inference library for Kokoro-82M

Kokoro is an open-weight text-to-speech model and inference library built around the lightweight Kokoro-82M model. It is designed to generate high-quality speech from text while staying fast, compact, and cost-efficient compared with larger TTS systems. The project is useful for developers who want deployable speech synthesis without depending on a closed platform. It can be installed as a Python package and used in applications, scripts, experiments, or production workflows.

Downloads: 0 This Week

Last Update: 2026-06-08
See Project
5

NemoClaw

NVIDIA plugin for secure installation of OpenClaw

...It installs and configures the NVIDIA OpenShell runtime, which provides a secure environment for running autonomous AI agents. NemoClaw enables users to launch sandboxed agent environments that control network access, file permissions, and inference requests through policy-based security. The platform integrates with AI models such as NVIDIA Nemotron and supports multiple inference backends including cloud APIs, local NIM deployments, and vLLM. Through its command-line interface, developers can deploy, monitor, and manage AI assistants running inside isolated sandboxes. By combining sandbox orchestration, agent management, and AI model integration, NemoClaw provides a secure foundation for building and operating autonomous AI assistants.

Downloads: 2 This Week

Last Update: 15 hours ago
See Project
6

Harbor LLM

Run a full local LLM stack with one command using Docker

...With a single command, users can start preconfigured tools like Ollama and Open WebUI, enabling chat, workflows, and integrations immediately. Harbor supports multiple inference engines, including llama.cpp and vLLM, and connects them seamlessly to user interfaces. It also includes tools for web retrieval, image generation, voice interaction, and workflow automation. Built on Docker, Harbor allows services to run in isolated containers while communicating over a local network. It is intended for local development and experimentation rather than production deployment, giving developers a flexible way to explore AI systems, test configurations, and manage complex LLM stacks without manual wiring or setup overhead.

Downloads: 0 This Week

Last Update: 2026-06-15
See Project
7

LLM Course

Course to get into Large Language Models (LLMs)

...Learners get exposure to multiple adaptation strategies—LoRA/QLoRA, instruction fine-tuning, and alignment techniques—so they can choose approaches that fit their hardware and budgets. The materials also cover inference optimization and quantization to make serving LLMs feasible on commodity GPUs or even CPUs, which is crucial for side projects and startups. Evaluation is treated as a first-class topic, with examples of automatic and human-in-the-loop methods to catch regressions and verify quality beyond simple loss values. By the end, students have a mental model and a practical toolkit for iterating on datasets, training configs, etc.

Downloads: 0 This Week

Last Update: 2026-02-05
See Project
8

Model Explorer

A modern model graph visualizer and debugger

Model Explorer is a visual tool for exploring, debugging, and optimizing ML models deployed on edge devices. Developed by Google AI Edge, it offers a browser-based interface to inspect layer-wise performance, memory usage, and inference timing of TensorFlow Lite and other supported models. It’s a powerful utility for developers optimizing models for constrained environments.

Downloads: 1 This Week

Last Update: 2026-02-09
See Project
9

Operit AI

Powerful Android AI agent with tools, automation, and Linux shell

Operit is a full-featured AI assistant and agent platform designed specifically for Android devices, aiming to go far beyond traditional chat-based interfaces. It integrates deep system-level capabilities with a wide range of tools, allowing the AI to perform real tasks such as file management, automation, and system control directly on the device. A standout aspect of the project is its built-in Ubuntu 24 environment, which enables users to run Linux commands, scripts, and development tools...

Downloads: 13 This Week

Last Update: 2026-05-16
See Project
10

Groq Desktop

Local Groq Desktop chat app with MCP support

...Developers can also use groq-desktop-beta as a lightweight interface to test prompts, media inputs, or function-calling capabilities before embedding them into larger projects. The project offers installable builds (including via Homebrew on macOS) and supports easy setup, giving quick access to Groq’s inference services without needing to spin up a full backend.

Downloads: 13 This Week

Last Update: 2025-12-12
See Project
11

GitHub Actions for DigitalOcean

GitHub Actions for DigitalOcean - doctl

...Powerful and production-ready, our cloud platform has the solutions that devs like you need to succeed, whether you're building world-changing AI apps, running a side project, or building a business. GPU solutions for everyone—novice to expert. Run training and inference, process large data sets and complex neural networks, and deploy high-performance computing clusters.

Downloads: 0 This Week

Last Update: 2026-04-22
See Project
12

ARC-AGI

The Abstraction and Reasoning Corpus

ARC-AGI is a benchmark dataset and experimental framework designed to evaluate and advance artificial general intelligence by testing systems on abstract reasoning tasks that require human-like problem-solving abilities. It consists of a curated set of tasks where models must infer patterns from input-output examples and apply those rules to new unseen cases, without relying on memorization or prior training data. The dataset is structured as grid-based puzzles, where each task requires...

Downloads: 3 This Week

Last Update: 2026-04-03
See Project
13

Cognita

Open source RAG framework for building scalable modular AI apps

Cognita is an open source framework designed to help developers build, organize, and deploy Retrieval-Augmented Generation (RAG) applications in a structured and production-ready way. It addresses the gap between quick experimentation in notebooks and the complexity of deploying scalable AI systems by introducing a modular and API-driven architecture. Cognita provides reusable components such as parsers, data loaders, embedders, retrievers, and query controllers, allowing teams to customize...

Downloads: 1 This Week

Last Update: 3 days ago
See Project
14

JSON to Go

Translates JSON into a Go type in your browser instantly (original)

JSON to Go is a browser-based developer tool that converts JSON samples into Go struct definitions. It is designed to save Go developers time when working with APIs, configuration files, or external JSON payloads. Users paste JSON into the tool, and it generates a matching Go type that can be copied into a project. The tool makes reasonable assumptions about field names, types, nested objects, arrays, and struct tags, but it still expects users to review the output before using it in...

Downloads: 0 This Week

Last Update: 2026-05-13
See Project
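
To make the idea in the JSON to Go entry above concrete, here is a rough sketch of inferring a Go struct definition from a JSON sample. It is not the tool's actual algorithm: the naming rule, the choice of `int` versus `float64`, and the helper names `go_field_name`, `go_type`, and `json_to_go` are all assumptions made for illustration.

```python
import json
import re


def go_field_name(key):
    """Turn a JSON key such as 'user_id' into an exported Go name ('UserId')."""
    parts = re.split(r"[^0-9A-Za-z]+", key)
    return "".join(p[:1].upper() + p[1:] for p in parts if p)


def go_type(value, depth):
    """Map a decoded JSON value to a Go type expression."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "bool"
    if isinstance(value, int):
        return "int"
    if isinstance(value, float):
        return "float64"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        # Assumption: the first element is representative of the whole array.
        return "[]" + (go_type(value[0], depth) if value else "interface{}")
    if isinstance(value, dict):
        pad = "\t" * (depth + 1)
        lines = [
            f'{pad}{go_field_name(k)} {go_type(v, depth + 1)} `json:"{k}"`'
            for k, v in value.items()
        ]
        return "struct {\n" + "\n".join(lines) + "\n" + "\t" * depth + "}"
    return "interface{}"  # JSON null carries no type information


def json_to_go(sample, type_name="AutoGenerated"):
    """Build a Go type declaration from a JSON sample string."""
    return f"type {type_name} {go_type(json.loads(sample), 0)}"


generated = json_to_go('{"user_id": 1, "name": "a", "tags": ["x"], "active": true}', "User")
print(generated)
```

As the entry notes, output like this still needs review: a single sample cannot reveal optional fields, mixed-type arrays, or numbers that should be `float64` despite appearing as integers.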
15

Supertonic

Lightning-fast, on-device TTS, running natively via ONNX

Supertonic is a lightning-fast, on-device text-to-speech system built around ONNX Runtime for maximum speed and portability. It focuses on running entirely locally, eliminating the need for cloud APIs and providing low latency and strong privacy guarantees, even on constrained devices like Raspberry Pi boards and e-readers. The core model is highly compact at around 66 million parameters, yet benchmarks show it can generate speech up to 167× faster than real time on modern consumer hardware...

Downloads: 1 This Week

Last Update: 2026-01-06
See Project
16

Jaaz

Open source multimodal creative AI assistant with infinite canvas tool

...It combines AI agents with visual editing tools, allowing users to generate media through prompts, sketches, or simple instructions. Jaaz supports multiple AI models and can integrate both local and cloud-based inference systems, enabling flexible creative workflows. Jaaz emphasizes privacy and local-first operation, allowing creators to run AI models locally so that their data does not leave their device. It also includes collaborative planning tools such as visual layouts and storyboard organization to support complex creative projects. By combining generative AI with a canvas-based interface, the project aims to provide a creative platform.

Downloads: 0 This Week

Last Update: 2026-03-17
See Project
17

Text-to-image Playground

A playground to generate images from any text prompt using SD

...The platform demonstrates how large generative models can be integrated into user-friendly tools for creative exploration and rapid prototyping. It also serves as a reference architecture for building full-stack generative AI applications that connect model inference pipelines with web interfaces.

Downloads: 0 This Week

Last Update: 2026-03-11
See Project
18

ChatAnyLLM

Unified interface for local model, cloud provider, and custom agent.

ChatAnyLLM is a desktop GUI application for local inference engines (Ollama, LM Studio, OpenClaw) and cloud providers like OpenRouter. Users may manually configure any OpenAI-compatible API endpoint, connecting third-party providers such as Groq or Cerebras. The application stores conversation history locally and saves API keys with system-level encryption. It supports reasoning models, multimodal inputs, and formatting for LaTeX and code.

1 Review

Downloads: 8 This Week

Last Update: 6 days ago
See Project
19

pipeless

A computer vision framework to create and deploy apps in minutes

...You provide some functions that are executed for new video frames and Pipeless takes care of everything else. You can easily use industry-standard models, such as YOLO, or load your custom model in one of the supported inference runtimes. Pipeless ships some of the most popular inference runtimes, such as the ONNX Runtime, allowing you to run inference with high performance on CPU or GPU out-of-the-box. You can deploy your Pipeless application with a single command to edge and IoT devices or the cloud.

Downloads: 0 This Week

Last Update: 2024-02-23
See Project
20

gpu_poor

Calculate token/s & GPU memory requirement for any LLM

gpu_poor is an open-source tool designed to help developers determine whether their hardware is capable of running a specific large language model and to estimate the performance they can expect from it. The project focuses on calculating GPU memory requirements and predicted inference speed for different models, hardware configurations, and quantization strategies. By analyzing factors such as model size, context length, batch size, and GPU specifications, the system estimates how much VRAM will be required and how fast tokens can be generated during inference. The tool also provides a detailed breakdown of where GPU memory is allocated, including model weights, KV cache, activations, and other runtime overhead. ...

Downloads: 0 This Week

Last Update: 2026-03-09
See Project
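
The kind of estimate described in the gpu_poor entry above can be approximated with simple arithmetic. The sketch below is not gpu_poor's own formula: it covers only model weights and the KV cache, treats everything else as a single user-supplied `overhead_bytes` figure, and uses a hypothetical function name and example model shape.

```python
GIB = 1024 ** 3


def estimate_vram(n_params, bytes_per_param, n_layers, n_kv_heads, head_dim,
                  context_len, batch_size, kv_bytes=2, overhead_bytes=0):
    """Rough VRAM breakdown in GiB for serving a decoder-only transformer.

    Assumptions: weights are stored at a uniform precision, and the KV cache
    holds one key and one value vector per layer, KV head, and token.
    Activations and framework overhead are lumped into overhead_bytes.
    """
    weights = n_params * bytes_per_param
    kv_cache = (2 * n_layers * n_kv_heads * head_dim
                * context_len * batch_size * kv_bytes)
    total = weights + kv_cache + overhead_bytes
    return {
        "weights_gib": weights / GIB,
        "kv_cache_gib": kv_cache / GIB,
        "overhead_gib": overhead_bytes / GIB,
        "total_gib": total / GIB,
    }


# Hypothetical 7B-parameter model in 16-bit precision with a 4096-token context.
breakdown = estimate_vram(
    n_params=7_000_000_000, bytes_per_param=2,
    n_layers=32, n_kv_heads=32, head_dim=128,
    context_len=4096, batch_size=1,
)
for name, gib in breakdown.items():
    print(f"{name}: {gib:.2f}")
```

Lowering `bytes_per_param` (for example 0.5 for 4-bit quantization) shrinks the weights term, while longer contexts and larger batches grow only the KV cache term.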
21

enhancr

Video Frame Interpolation & Super Resolution using NVIDIA's TensorRT

...The GUI was designed to provide a stunning experience powered by state-of-the-art technologies without feeling clunky and outdated like other alternatives. It features fast TensorRT inference by NVIDIA, which can speed up AI processing significantly and comes pre-packaged, with no need to install Docker or WSL (Windows Subsystem for Linux). It also offers NCNN inference by Tencent, which is lightweight and runs on NVIDIA, AMD, and even Apple Silicon, in contrast to the much heavier PyTorch inference, which the project describes as running only on NVIDIA GPUs.

1 Review

Downloads: 8 This Week

Last Update: 2023-06-07
See Project
22

Style2Paints

Sketch + style = paints

style2paints is an AI-assisted colorization system aimed primarily at line art and manga, turning monochrome drawings into colored illustrations with minimal manual effort. It combines automatic color inference with user guidance, letting artists nudge the model using sparse color hints, masks, or style references. The pipeline focuses on preserving line quality while spreading coherent colors and shading across regions that are often ambiguous to purely automatic methods. Iterative refinement is a core workflow: you can add or adjust hints, rerun inference, and progressively converge on a desired palette and lighting. ...

Downloads: 11 This Week

Last Update: 2025-10-21
See Project
23

Caramel

Functional language for building type-safe applications

...Caramel leverages the OCaml compiler to provide you with a pragmatic type system and industrial-strength type safety, and the Erlang VM, known for running low-latency, distributed, and fault-tolerant systems used in a wide range of industries. It offers excellent type inference, so you never need to annotate your code. It supports sources in OCaml (and soon Reason syntax too). Caramel aims to make building type-safe concurrent programs a productive and fun experience. Caramel should let anyone with existing OCaml or Reason experience be up and running without having to relearn the entire language. Caramel strives to integrate with the larger ecosystem of BEAM languages, like Erlang, Elixir, Gleam, Purerl, LFE, and Hamler.

Downloads: 0 This Week

Last Update: 2022-10-11
See Project
24

commit-autosuggestions

A tool that uses AI to automatically recommend commit messages

...However, most code changes are not made only by adding code; some parts of the code are also deleted. We plan to gradually add support for languages that are not currently supported. To run this project, you need a Flask-based inference server (GPU) and a client (commit module). If you don't have a GPU, don't worry, you can use it through Google Colab.

Downloads: 0 This Week

Last Update: 2023-03-23
See Project
25

SQLFlow

SQL compiler bridging databases and machine learning workflows

SQLFlow is an open source project designed to bridge the gap between traditional SQL-based data processing and modern machine learning workflows by extending SQL syntax with AI capabilities. It acts as a compiler that translates SQL programs into executable workflows, enabling users to train, evaluate, and deploy machine learning models directly from SQL statements. It integrates with multiple database engines such as MySQL, Hive, and MaxCompute, while also supporting machine learning...

Downloads: 1 This Week

Last Update: 1 day ago
See Project