inference free download

Showing 34 open source projects for "inference"

1

MegEngine

Easy-to-use deep learning framework with 3 key features

...On Windows 10 you can either install the Linux distribution through Windows Subsystem for Linux (WSL) or install the Windows distribution directly. Many other platforms are supported for inference.

Downloads: 3 This Week

Last Update: 2024-04-30
2

bitnet.cpp

Official inference framework for 1-bit LLMs

bitnet.cpp is the official open-source inference framework and ecosystem for efficient execution of 1-bit large language models (LLMs), which quantize most model parameters to ternary values (-1, 0, +1) while remaining competitive with full-precision counterparts. Its core is a highly optimized C++ backend that supports fast, low-memory inference on both CPUs and GPUs, so models such as BitNet b1.58 can run without large-scale compute infrastructure. ...

Downloads: 6 This Week

Last Update: 2026-03-10
3

BentoML

Unified Model Serving Framework

...Standard .bento format for packaging code, models and dependencies for easy versioning and deployment. Integrate with any training pipeline or ML experimentation platform. Parallelize compute-intensive model inference workloads to scale separately from the serving logic. Adaptive batching dynamically groups inference requests for optimal performance. Orchestrate a distributed inference graph with multiple models via Yatai on Kubernetes. Easily configure CUDA dependencies for running inference on GPU. Automatically generate Docker images for production deployment.

Downloads: 0 This Week

Last Update: 2026-04-02
4

NNCF

Neural Network Compression Framework for enhanced OpenVINO

NNCF (Neural Network Compression Framework) is an optimization toolkit for deep learning models, designed to apply quantization, pruning, and other techniques to improve inference efficiency.

Downloads: 2 This Week

Last Update: 2026-04-08
5

MNN

MNN is a blazing fast, lightweight deep learning framework

MNN is a highly efficient and lightweight deep learning framework. It supports inference and training of deep learning models, and has industry-leading performance for on-device inference and training. At present, MNN has been integrated into more than 20 Alibaba apps, such as Taobao, Tmall, Youku, Dingtalk and Xianyu, covering more than 70 usage scenarios such as live broadcast, short video capture, search recommendation, product searching by image, interactive marketing, equity distribution, and security risk control. ...

Downloads: 14 This Week

Last Update: 2026-04-07
6

Seldon Core

An MLOps framework to package, deploy, monitor and manage models

The de facto standard open-source platform for rapidly deploying machine learning models on Kubernetes. Seldon Core, our open-source framework, makes it easier and faster to deploy your machine learning models and experiments at scale on Kubernetes. Seldon Core serves models built in any open-source or commercial model building framework. You can make use of powerful Kubernetes features like custom resource definitions to manage model graphs. And then connect your continuous integration and...

Downloads: 1 This Week

Last Update: 2026-01-23
7

Superduper

Superduper: Integrate AI models and machine learning workflows

Superduper is a Python-based framework for building end-to-end AI-data workflows and applications on your own data, integrating with major databases. It supports the latest technologies and techniques, including LLMs, vector search, RAG, and multimodality, as well as classical AI and ML paradigms. Developers can use Superduper by building compositional and declarative objects that outsource the details of deployment, orchestration, versioning, and more to the Superduper engine. This...

Downloads: 3 This Week

Last Update: 2025-08-26
8

StabilityMatrix

Multi-Platform Package Manager for Stable Diffusion

StabilityMatrix is a project that helps organize, evaluate, and compare generative AI models and their behavior across prompts, datasets, or configuration settings. It provides a framework to run experiments systematically—capturing inputs, model configurations, outputs, and metrics—so researchers and practitioners can reason about differences in quality, robustness, and failure modes. The repository often bundles tooling for automated prompt sweeping, scoring heuristics (such as diversity,...

Downloads: 228 This Week

Last Update: 6 days ago
9

tinygrad

Deep learning framework

This may not be the best deep learning framework, but it is a deep learning framework. Due to its extreme simplicity, it aims to be the easiest framework to add new accelerators to, with support for both inference and training. If XLA is CISC, tinygrad is RISC.

Downloads: 3 This Week

Last Update: 2026-01-12
10

Ray

A unified framework for scalable computing

Modern workloads like deep learning and hyperparameter tuning are compute-intensive and require distributed or parallel execution. Ray makes it effortless to parallelize single-machine code: go from a single CPU to multi-core, multi-GPU, or multi-node with minimal code changes. Accelerate your PyTorch and TensorFlow workloads with a more resource-efficient and flexible distributed execution framework powered by Ray. Accelerate your hyperparameter search workloads with Ray Tune. Find the best...

Downloads: 4 This Week

Last Update: 7 days ago
11

Meridian

Meridian is an MMM framework

Meridian is a comprehensive, open source marketing mix modeling (MMM) framework developed by Google to help advertisers analyze and optimize the impact of their marketing investments. Built on Bayesian causal inference principles, Meridian enables organizations to evaluate how different marketing channels influence key performance indicators (KPIs) such as revenue or conversions while accounting for external factors like seasonality or economic trends. The framework provides a robust foundation for constructing in-house MMM pipelines capable of handling both national and geo-level data, with built-in support for calibration using experimental data or prior knowledge. ...

Downloads: 0 This Week

Last Update: 4 hours ago
12

waifu2x ncnn Vulkan

waifu2x converter, ncnn version; runs fast on GPU with Vulkan

An ncnn implementation of the waifu2x converter. Runs fast on Intel/AMD/Nvidia/Apple-Silicon with the Vulkan API. waifu2x-ncnn-vulkan uses the ncnn project as its universal neural network inference framework.

Downloads: 2 This Week

Last Update: 2025-09-15
13

easystats

The R easystats-project

easystats is a meta‑package that installs and unifies a suite of R packages for post‑processing statistical models. It delivers a consistent API to assess model performance, effect sizes, parameters, and to generate reports and visualizations, all with minimal dependencies and maximum clarity.

Downloads: 0 This Week

Last Update: 2025-07-30
14

pep484 stubs for Django

PEP-484 stubs for Django

This package contains type stubs and a custom mypy plugin to provide more precise static types and type inference for the Django framework. Django uses some Python "magic" that makes having precise types for some code patterns problematic. This is why we need this project. The final goal is to be able to get precise types for the most common patterns. We are independent from Django at the moment. There's a proposal to merge our project into Django itself.

Downloads: 5 This Week

Last Update: 2026-04-01
15

Nimble

A Matcher Framework for Swift and Objective-C

Use Nimble to express the expected outcomes of Swift or Objective-C expressions. Inspired by Cedar. Apple's Xcode includes the XCTest framework, which provides assertion macros to test whether code behaves properly. XCTest assertions have a couple of drawbacks. Not enough macros. There's no easy way to assert that a string contains a particular substring, or that a number is less than or equal to another. It's hard to write asynchronous tests. XCTest forces you to write a lot of boilerplate...

Downloads: 6 This Week

Last Update: 2025-11-28
16

TorchQuantum

A PyTorch-based framework for Quantum Classical Simulation

A PyTorch-based framework for Quantum Classical Simulation, Quantum Machine Learning, Quantum Neural Networks, and Parameterized Quantum Circuits, with support for easy deployment on real quantum computers. Intended for researchers working on quantum algorithm design, parameterized quantum circuit training, quantum optimal control, quantum machine learning, and quantum neural networks. Features include a dynamic computation graph, automatic gradient computation, fast GPU support, and batch mode tensorized processing.

Downloads: 1 This Week

Last Update: 2024-09-30
17

MMDeploy

OpenMMLab Model Deployment Framework

...Models can be exported and run in several backends, and more will be compatible. All kinds of modules in the SDK can be extended, such as Transform for image processing, Net for Neural Network inference, Module for postprocessing and so on. Install and build your target backend. ONNX Runtime is a cross-platform inference and training accelerator compatible with many popular ML/DNN frameworks. Please read getting_started for the basic usage of MMDeploy.

Downloads: 0 This Week

Last Update: 2023-12-25
18

pipeless

A computer vision framework to create and deploy apps in minutes

...You provide some functions that are executed for new video frames and Pipeless takes care of everything else. You can easily use industry-standard models, such as YOLO, or load your custom model in one of the supported inference runtimes. Pipeless ships some of the most popular inference runtimes, such as the ONNX Runtime, allowing you to run inference with high performance on CPU or GPU out-of-the-box. You can deploy your Pipeless application with a single command to edge and IoT devices or the cloud.

Downloads: 16 This Week

Last Update: 2024-02-23
19

towhee

Framework that is dedicated to making neural data processing

...From images to text to 3D molecular structures, Towhee supports data transformation for nearly 20 different unstructured data modalities. We provide end-to-end pipeline optimizations, covering everything from data decoding/encoding to model inference, making your pipeline execution 10x faster. Towhee provides out-of-the-box integration with your favorite libraries, tools, and frameworks, making development quick and easy. Towhee includes a Pythonic method-chaining API for describing custom data processing pipelines. We also support schemas, making processing unstructured data as easy as handling tabular data.

Downloads: 1 This Week

Last Update: 2023-12-05
20

KotlinDL

High-level Deep Learning Framework written in Kotlin

...Under the hood, it uses TensorFlow Java API and ONNX Runtime API for Java. KotlinDL offers simple APIs for training deep learning models from scratch, importing existing Keras and ONNX models for inference, and leveraging transfer learning for tailoring existing pre-trained models to your tasks. This project aims to make Deep Learning easier for JVM and Android developers and simplify deploying deep learning models in production environments.

Downloads: 3 This Week

Last Update: 2024-01-29
21

Darknet

Convolutional Neural Networks

Darknet is an open source neural network framework written in C and CUDA, developed by Joseph Redmon. It is best known as the original implementation of the YOLO (You Only Look Once) real-time object detection system. Darknet is lightweight, fast, and easy to compile, making it suitable for research and production use. The repository provides pre-trained models, configuration files, and tools for training custom object detection models. With GPU acceleration via CUDA and OpenCV integration,...

Downloads: 21 This Week

Last Update: 7 days ago
22

typera

Type-safe routes for Express and Koa

Typera helps you build backends in a type-safe manner by leveraging io-ts and some TypeScript type inference magic. It works with both Express and Koa.

Downloads: 2 This Week

Last Update: 2024-01-11
23

MACE

Deep learning inference framework optimized for mobile platforms

Mobile AI Compute Engine (or MACE for short) is a deep learning inference framework optimized for mobile heterogeneous computing on Android, iOS, Linux and Windows devices. The runtime is optimized with NEON, OpenCL and Hexagon, and the Winograd algorithm is used to speed up convolution operations. Initialization is also optimized to be faster. Chip-dependent power options such as big.LITTLE scheduling and Adreno GPU hints are included as advanced APIs.

Downloads: 0 This Week

Last Update: 2022-01-13
24

TNN

Uniform deep learning inference framework for mobile

...As a basic acceleration framework for Tencent Cloud AI, TNN has provided acceleration support for many business deployments. Contributions that help further improve the TNN inference framework are welcome.

Downloads: 0 This Week

Last Update: 2022-08-03
25

ThinkTs

Based on koa and typeorm, asynchronous non-blocking reactive coding

ThinkTs is an MVC web framework based on koa and typeorm, with asynchronous, non-blocking, reactive coding. Inspired by ThinkPHP, Nestjs, and FastAPI, it aims for fast development and fast performance.

Downloads: 1 This Week

Last Update: 2024-01-19