Search Results for "inference" - Page 4

Showing 520 open source projects for "inference"

1

Orpheus TTS

Towards Human-Sounding Speech

...The project ships both pretrained and finetuned English models, as well as a family of multilingual models released as a research preview, and includes data-processing scripts so users can train or finetune their own variants. Inference is provided through a Python package that uses vLLM under the hood for high-throughput, low-latency generation, including streaming examples that show how to generate audio chunks in real time. The maintainers provide Colab notebooks, a standardized prompting format, and one-click deployment via Baseten for production-grade, FP8/FP16 optimized inference with ~200 ms streaming latency.

Downloads: 1 This Week

Last Update: 2025-12-05
2

ZML

Any model. Any hardware. Zero compromise

...One of its key strengths is cross-compilation, enabling developers to build once and deploy across various platforms without rewriting code. zml provides example implementations of models and workflows, demonstrating how to run inference tasks such as image classification or large language models. It is designed to handle complex distributed setups, including scenarios where model components are split across devices connected via networks.

Downloads: 1 This Week

Last Update: 5 days ago
3

EconML

Python Package for ML-Based Heterogeneous Treatment Effects Estimation

...This package was designed and built as part of the ALICE project at Microsoft Research with the goal of combining state-of-the-art machine learning techniques with econometrics to bring automation to complex causal inference problems. One of the biggest promises of machine learning is to automate decision-making in a multitude of domains. At the core of many data-driven personalized decision scenarios is the estimation of heterogeneous treatment effects: what is the causal effect of an intervention on an outcome of interest for a sample with a particular set of features? ...

Downloads: 7 This Week

Last Update: 2025-07-10
4

Mosec

A high-performance ML model serving framework that offers dynamic batching

Mosec is a high-performance and flexible model-serving framework for building ML model-enabled backends and microservices. It bridges the gap between the machine learning models you have just trained and an efficient online service API.

Downloads: 7 This Week

Last Update: 2025-11-25
5

LightLLM

LightLLM is a Python-based LLM (Large Language Model) inference

LightLLM is a high-performance inference and serving framework designed specifically for large language models, focusing on lightweight architecture, scalability, and efficient deployment. The framework enables developers to run and serve modern language models with significantly improved speed and resource efficiency compared to many traditional inference systems. Built primarily in Python, the project integrates optimization techniques and ideas from several leading open-source implementations, including FasterTransformer, vLLM, and FlashAttention, to accelerate token generation and reduce latency. ...

Downloads: 0 This Week

Last Update: 2026-03-05
6

TorchRec

PyTorch domain library for recommendation systems

...Pipelined training overlaps dataloading device transfer (copy to GPU), inter-device communications (input_dist), and computation (forward, backward) for increased performance. Optimized kernels for RecSys powered by FBGEMM. Quantization support for reduced precision training and inference. Common modules for RecSys.

Downloads: 3 This Week

Last Update: 2026-03-15
7

OpenJarvis

Personal AI, On Personal Devices

...The framework provides shared primitives for building local-first agents, along with evaluation tools that measure performance using metrics such as energy consumption, latency, cost, and accuracy. OpenJarvis integrates with local inference engines like Ollama, vLLM, SGLang, and llama.cpp to run language models directly on personal hardware. It also includes a learning loop that allows models to improve over time using locally generated interaction traces. By prioritizing local execution and efficiency, OpenJarvis aims to provide a foundation for privacy-preserving personal AI assistants.

Downloads: 132 This Week

Last Update: 2026-03-16
8

PEFT

State-of-the-art Parameter-Efficient Fine-Tuning

Parameter-Efficient Fine-Tuning (PEFT) methods enable efficient adaptation of pre-trained language models (PLMs) to various downstream applications without fine-tuning all the model's parameters. Fine-tuning large-scale PLMs is often prohibitively costly. In this regard, PEFT methods only fine-tune a small number of (extra) model parameters, thereby greatly decreasing the computational and storage costs. Recent State-of-the-Art PEFT techniques achieve performance comparable to that of full...

Downloads: 4 This Week

Last Update: 2026-01-09
9

Llama Recipes

Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT method

The 'llama-recipes' repository is a companion to the Meta Llama models. We support the latest version, Llama 3.1, in this repository. The goal is to provide a scalable library for fine-tuning Meta Llama models, along with some example scripts and notebooks to quickly get started with using the models in a variety of use-cases, including fine-tuning for domain adaptation and building LLM-based applications with Llama and other tools in the LLM ecosystem. The examples here showcase how to run...

Downloads: 0 This Week

Last Update: 2025-01-22
10

Bespoke Curator

Synthetic data curation for post-training and data extraction

...Curator includes tools for monitoring data generation processes and managing dataset quality while large batches of examples are being created. The framework also integrates with multiple inference systems and APIs, allowing users to generate data using different model providers or open-source inference engines.

Downloads: 6 This Week

Last Update: 2026-03-14
11

WiFi DensePose

Turn WiFi signals into real-time human pose estimation and detection

...It is designed to showcase the emerging field of RF-based sensing, where machine learning models interpret wireless channel data to reconstruct human movement and posture. The repository includes components for data processing, model inference, and real-time visualization, making it suitable for research and experimental deployments. Its architecture emphasizes performance and reproducibility, allowing developers to explore non-visual motion capture systems using accessible hardware. Overall, WiFi DensePose functions as an advanced research-grade toolkit for WiFi-based human sensing and pose estimation.

Downloads: 53 This Week

Last Update: 2026-04-06
12

SparseML

Libraries for applying sparsification recipes to neural networks

SparseML is an optimization toolkit for training and deploying deep learning models using sparsification techniques like pruning and quantization to improve efficiency.

Downloads: 0 This Week

Last Update: 2025-06-02
13

DocTR

Library for OCR-related tasks powered by Deep Learning

DocTR provides an easy and powerful way to extract valuable information from your documents. Seamlessly process documents for Natural Language Understanding tasks: we provide OCR predictors to parse textual information (localize and identify each word) from your documents. Robust 2-stage (detection + recognition) OCR predictors with pretrained parameters. User-friendly, 3 lines of code to load a document and extract text with a predictor. State-of-the-art performances on public document...

Downloads: 9 This Week

Last Update: 2026-02-04
14

FlashAttention

Fast and memory-efficient exact attention

...The project provides implementations of FlashAttention, FlashAttention-2, and newer iterations optimized for modern GPU architectures such as NVIDIA Hopper and AMD accelerators. By improving both forward and backward pass efficiency, it enables training and inference of large language models with longer sequence lengths and higher throughput. The library integrates with PyTorch and supports various attention configurations, including causal masking, multi-query attention, and rotary embeddings.

Downloads: 49 This Week

Last Update: 2026-03-18
15

Genv

GPU environment management and cluster orchestration

Genv is an open-source environment and cluster management system for GPUs. Genv lets you easily control, configure, monitor, and enforce the GPU resources that you are using on a GPU machine or cluster. It is intended to ease GPU allocation for data scientists without requiring code changes.

Downloads: 19 This Week

Last Update: 2024-05-16
16

LLM Action

Technical principles related to large models

LLM-Action is a knowledge and tutorial repository that shares principles, techniques, and real-world experience related to large language models (LLMs), focusing on LLM engineering, deployment, optimization, inference, compression, and tooling. It organizes content into domains such as training, inference, compression, alignment, evaluation, pipelines, and applications. It includes sections covering infrastructure, engineering, and deployment; repository templates, sample code, and resource links; and articles and code on LLM compression (quantization, pruning).

Downloads: 1 This Week

Last Update: 2026-03-12
17

OpenVINO Training Extensions

Trainable models and NN optimization tools

OpenVINO™ Training Extensions provide a convenient environment to train Deep Learning models and convert them using the OpenVINO™ toolkit for optimized inference. When ote_cli is installed in the virtual environment, you can use the ote command line interface to perform various actions for templates related to the chosen task type, such as running, training, evaluating, exporting, etc. ote train trains a model (a particular model template) on a dataset and saves results in two files. ote optimize optimizes a pre-trained model using NNCF or POT depending on the model format. ...

Downloads: 8 This Week

Last Update: 2025-10-13
18

tiny-llm

A course on LLM inference serving on Apple Silicon

tiny-llm is an educational open-source project designed to teach system engineers how large language model inference and serving systems work by building them from scratch. The project is structured as a guided course that walks developers through the process of implementing the core components required to run a modern language model, including attention mechanisms, token generation, and optimization techniques. Rather than relying on high-level machine learning frameworks, the codebase uses mostly low-level array and matrix manipulation APIs so that developers can understand exactly how model inference works internally. ...

Downloads: 0 This Week

Last Update: 2026-03-26
19

SAHI

A lightweight vision library for performing large-scale object detection

...In this work, an open-source framework called Slicing Aided Hyper Inference (SAHI) is proposed that provides a generic slicing aided inference and fine-tuning pipeline for small object detection.

Downloads: 0 This Week

Last Update: 2025-09-28
20

KVCache-Factory

Unified KV Cache Compression Methods for Auto-Regressive Models

...It also supports advanced inference configurations such as Flash Attention v2 and multi-GPU inference setups for very large models.

Downloads: 1 This Week

Last Update: 2026-03-09
21

API-for-Open-LLM

OpenAI-style API for open large language models

API-for-Open-LLM is a lightweight API server designed for deploying and serving open large language models (LLMs), offering a simple way to integrate LLMs into applications.

Downloads: 0 This Week

Last Update: 2025-01-22
22

DeepSeek Coder

DeepSeek Coder: Let the Code Write Itself

...This dataset covers project-level code structure (not just line-by-line snippets), using a large context window (e.g. 16K) and a secondary fill-in-the-blank objective to encourage better contextual completions and infilling. Multiple sizes of the model are offered (e.g. 1B, 5.7B, 6.7B, 33B) so users can trade off inference cost vs capability. The repo provides model weights, documentation on training setup, evaluation results on common benchmarks (HumanEval, MultiPL-E, APPS, etc.), and inference tools.

Downloads: 10 This Week

Last Update: 2025-11-11
23

Cybergod

A program that can do anything to earn money without human operators

AGI Computer Control is an experimental autonomous software system designed to operate independently and generate income without human intervention. It aims to simulate artificial general intelligence (AGI) by leveraging evolutionary algorithms, deep active inference, and other advanced AI techniques. The project explores the boundaries of machine autonomy and self-directed behavior in computational environments.

Downloads: 6 This Week

Last Update: 2025-05-21
24

LLaMA Models

Utilities intended for use with Llama models

...It complements separate repos that carry code and demos (for example inference kernels or cookbook content) by keeping authoritative metadata and specs here. Model lineages and size variants are documented externally (e.g., Llama 3.x and beyond), with this repo providing the “single source of truth” links and utilities. In practice, teams use llama-models as a reference when selecting variants, aligning licenses, and wiring in helper scripts for deployment.

Downloads: 6 This Week

Last Update: 2025-10-08
25

AudioCraft

AudioCraft is a library for audio processing and generation

AudioCraft is a PyTorch library for text-to-audio and text-to-music generation, packaging research models and tooling for training and inference. It includes MusicGen for music generation conditioned on text (and optionally melody) and AudioGen for text-conditioned sound effects and environmental audio. Both models operate over discrete audio tokens produced by a neural codec (EnCodec), which acts like a tokenizer for waveforms and enables efficient sequence modeling. The repo provides inference scripts, checkpoints, and simple Python APIs so you can generate clips from prompts or incorporate the models into applications. ...

Downloads: 9 This Week

Last Update: 2025-10-13
