Join/Login
Business Software
Open Source Software
For Vendors
Blog
About
More

For Vendors Help Create Join Login

Business Software

Open Source Software

SourceForge Podcast

Resources

Articles
Case Studies
Blog

Menu

Help
Create
Join
Login

Home
Open Source Software
Artificial Intelligence
LLM Inference Tools
Search Results

Search Results for "i86bi-linux-l3-adventerprisek..."

x

Sort By:

Relevance

Clear All Filters

OS

Linux 152
Windows 143
Mac 138
More...
BSD 16
ChromeOS 15
Mobile Operating Systems 7

Category

Artificial Intelligence 152
Software Development 30
System 4
Business 3
Mobile 2
Multimedia 2
Communications 1
Database 1
Education 1
Formats and Protocols 1

License

OSI-Approved Open Source 147
Creative Commons Attribution License 1
GNU Free Documentation License 1

Translations

English 4

Programming Language

Python 90
C++ 34
Go 8
TypeScript 4
More...
C 3
JavaScript 3
Julia 2
Rust 2
Swift 2
C# 1
Java 1
Kotlin 1

Status

Production/Stable 4
Planning 1

Showing 152 open source projects for "i86bi-linux-l3-adventerprisek..."

View related business solutions

LLM Inference Linux Clear Filters & Widen Search

Stop Cyber Threats with VM-Series Next-Gen Firewall on Azure
Native application identity and user-based security for your Azure cloud

Gain integrated visibility across all traffic in a single pass. Deploy Palo Alto Networks VM-Series to determine application identity and content while automating security policy updates via rich APIs.

Get a free trial
$300 Free Credits for Your Google Cloud Projects
Start building on Google Cloud with $300 in free credits. No commitment, no credit card required until you're ready to scale.

Launch your next project with $300 in free Google Cloud credits—no strings attached. Test, build, and deploy without risk. Use your credits across the entire Google Cloud platform to find what works best for your needs. After your credits are used, continue with always-free tier services. Only pay when you're ready to scale. Sign up in minutes and start exploring.

Start Free Trial
1

llama.cpp

Port of Facebook's LLaMA model in C/C++

The llama.cpp project enables the inference of Meta's LLaMA model (and other models) in pure C/C++ without requiring a Python runtime. It is designed for efficient and fast model execution, offering easy integration for applications needing LLM-based capabilities. The repository focuses on providing a highly optimized and portable implementation for running large language models directly within C/C++ environments.

1 Review

Downloads: 380 This Week

Last Update: 8 hours ago
See Project
2

whisper.cpp

Port of OpenAI's Whisper model in C/C++

whisper.cpp is a lightweight, C/C++ reimplementation of OpenAI’s Whisper automatic speech recognition (ASR) model—designed for efficient, standalone transcription without external dependencies. The entire high-level implementation of the model is contained in whisper.h and whisper.cpp. The rest of the code is part of the ggml machine learning library. The command downloads the base.en model converted to custom ggml format and runs the inference on all .wav samples in the folder samples....

Downloads: 461 This Week

Last Update: 2026-06-01
See Project
3

Open WebUI

User-friendly AI Interface

Open WebUI is an extensible, feature-rich, and user-friendly self-hosted AI platform designed to operate entirely offline. It supports various LLM runners like Ollama and OpenAI-compatible APIs, with a built-in inference engine for Retrieval Augmented Generation (RAG), making it a powerful AI deployment solution. Key features include effortless setup via Docker or Kubernetes, seamless integration with OpenAI-compatible APIs, granular permissions and user groups for enhanced security,...

Downloads: 178 This Week

Last Update: 2026-06-02
See Project
4

GPT4All

Run Local LLMs on Any Device. Open-source

GPT4All is an open-source project that allows users to run large language models (LLMs) locally on their desktops or laptops, eliminating the need for API calls or GPUs. The software provides a simple, user-friendly application that can be downloaded and run on various platforms, including Windows, macOS, and Ubuntu, without requiring specialized hardware. It integrates with the llama.cpp implementation and supports multiple LLMs, allowing users to interact with AI models privately. This...

1 Review

Downloads: 121 This Week

Last Update: 2025-03-17
See Project
Compliant and Reliable File Transfers Backed by Top Security Certifications
Cerberus FTP Server delivers SOC 2 Type II certified security and FIPS 140-2 validated encryption.

Stop relying on non-certified, legacy file transfer tools that creak under the weight of modern security demands. Get full audit trails, advanced access controls and more supported by an award-winning team of experts. Start your free 25-day trial today.

Start Free Trial
5

ncnn

High-performance neural network inference framework for mobile

ncnn is a high-performance neural network inference computing framework designed specifically for mobile platforms. It brings artificial intelligence right at your fingertips with no third-party dependencies, and speeds faster than all other known open source frameworks for mobile phone cpu. ncnn allows developers to easily deploy deep learning algorithm models to the mobile platform and create intelligent APPs. It is cross-platform and supports most commonly used CNN networks, including...

Downloads: 90 This Week

Last Update: 2026-05-27
See Project
6

ONNX Runtime

ONNX Runtime: cross-platform, high performance ML inferencing

ONNX Runtime is a cross-platform inference and training machine-learning accelerator. ONNX Runtime inference can enable faster customer experiences and lower costs, supporting models from deep learning frameworks such as PyTorch and TensorFlow/Keras as well as classical machine learning libraries such as scikit-learn, LightGBM, XGBoost, etc. ONNX Runtime is compatible with different hardware, drivers, and operating systems, and provides optimal performance by leveraging hardware accelerators...

Downloads: 48 This Week

Last Update: 2026-05-22
See Project
7

FlashInfer

FlashInfer: Kernel Library for LLM Serving

FlashInfer is a kernel library designed to enhance the serving of Large Language Models (LLMs) by optimizing inference performance. It provides a high-performance framework that integrates seamlessly with existing systems, aiming to reduce latency and improve efficiency in LLM deployments. FlashInfer supports various hardware architectures and is built to scale with the demands of production environments.

Downloads: 36 This Week

Last Update: 3 days ago
See Project
8

vLLM

A high-throughput and memory-efficient inference and serving engine

vLLM is a fast and easy-to-use library for LLM inference and serving. High-throughput serving with various decoding algorithms, including parallel sampling, beam search, and more.

Downloads: 21 This Week

Last Update: 2026-06-05
See Project
9

MNN

MNN is a blazing fast, lightweight deep learning framework

MNN is a highly efficient and lightweight deep learning framework. It supports inference and training of deep learning models, and has industry leading performance for inference and training on-device. At present, MNN has been integrated in more than 20 apps of Alibaba Inc, such as Taobao, Tmall, Youku, Dingtalk, Xianyu and etc., covering more than 70 usage scenarios such as live broadcast, short video capture, search recommendation, product searching by image, interactive marketing, equity...

Downloads: 30 This Week

Last Update: 2026-04-07
See Project
MongoDB Atlas runs apps anywhere
Deploy in 115+ regions with the modern database for every enterprise.

MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.

Start Free
10

TensorRT

C++ library for high performance inference on NVIDIA GPUs

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference. It includes a deep learning inference optimizer and runtime that delivers low latency and high throughput for deep learning inference applications. TensorRT-based applications perform up to 40X faster than CPU-only platforms during inference. With TensorRT, you can optimize neural network models trained in all major frameworks, calibrate for lower precision with high accuracy, and deploy to hyperscale data centers,...

Downloads: 27 This Week

Last Update: 2026-06-02
See Project
11

OpenVINO

OpenVINO™ Toolkit repository

OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference. Boost deep learning performance in computer vision, automatic speech recognition, natural language processing and other common tasks. Use models trained with popular frameworks like TensorFlow, PyTorch and more. Reduce resource demands and efficiently deploy on a range of Intel® platforms from edge to cloud. This open-source version includes several components: namely Model Optimizer, OpenVINO™ Runtime,...

Downloads: 25 This Week

Last Update: 2026-05-25
See Project
12

Gitleaks

Protect and discover secrets using Gitleaks

Gitleaks is a fast, lightweight, portable, and open-source secret scanner for git repositories, files, and directories. With over 6.8 million docker downloads, 11.2k GitHub stars, 1.7 million GitHub Downloads, thousands of weekly clones, and over 400k homebrew installs, gitleaks is the most trusted secret scanner among security professionals, enterprises, and developers. Gitleaks-Action is our official GitHub Action. You can use it to automatically run a gitleaks scan on all your team's pull...

Downloads: 20 This Week

Last Update: 2026-03-21
See Project
13

Oumi

Everything you need to build state-of-the-art foundation models

Oumi is an open-source framework that provides everything needed to build state-of-the-art foundation models, end-to-end. It aims to simplify the development of large-scale machine-learning models.

Downloads: 13 This Week

Last Update: 2026-05-06
See Project
14

LocalAI

The free, Open Source alternative to OpenAI, Claude and others

LocalAI is an open-source platform that allows users to run large language models and other AI systems locally on their own hardware. It acts as a drop-in replacement for APIs such as OpenAI, enabling developers to build AI-powered applications without relying on external cloud services. The platform supports a wide range of model types, including text generation, image creation, speech processing, and embeddings. LocalAI can run on consumer-grade hardware and does not necessarily require a...

Downloads: 17 This Week

Last Update: 24 hours ago
See Project
15

OpenVINO Model Server

A scalable inference server for models optimized with OpenVINO

OpenVINO™ Model Server is a high-performance inference serving system designed to host and serve machine learning models that have been optimized with the OpenVINO toolkit. It’s implemented in C++ for scalability and efficiency, making it suitable for both edge and cloud deployments where inference workloads must be reliable and high throughput. The server exposes model inference via standard network protocols like REST and gRPC, allowing any client that speaks those protocols to request...

Downloads: 16 This Week

Last Update: 2026-05-28
See Project
16

Transformer Engine

A library for accelerating Transformer models on NVIDIA GPUs

Transformer Engine (TE) is a library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper GPUs, to provide better performance with lower memory utilization in both training and inference. TE provides a collection of highly optimized building blocks for popular Transformer architectures and an automatic mixed precision-like API that can be used seamlessly with your framework-specific code. TE also includes a framework-agnostic C++...

Downloads: 16 This Week

Last Update: 4 days ago
See Project
17

ONNX

Open standard for machine learning interoperability

ONNX is an open format built to represent machine learning models. ONNX defines a common set of operators - the building blocks of machine learning and deep learning models - and a common file format to enable AI developers to use models with a variety of frameworks, tools, runtimes, and compilers. Open Neural Network Exchange (ONNX) is an open ecosystem that empowers AI developers to choose the right tools as their project evolves. ONNX provides an open source format for AI models, both...

Downloads: 16 This Week

Last Update: 2026-03-27
See Project
18

RamaLama

Simplifies the local serving of AI models from any source

RamaLama is an open-source developer tool that simplifies working with and serving AI models locally or in production by leveraging container technologies like Docker, Podman, and OCI registries, allowing AI inference workflows to be treated like standard container deployments. It abstracts away much of the complexity of configuring AI runtimes, dependencies, and hardware optimizations by detecting available GPUs (or falling back to CPU) and automatically pulling a container image...

Downloads: 15 This Week

Last Update: 2026-06-05
See Project
19

Lean Copilot

LLMs as Copilots for Theorem Proving in Lean

LeanCopilot integrates large language models (LLMs) as copilots for theorem proving in the Lean proof assistant. It assists users by suggesting tactics, premises, and searching for proofs, thereby enhancing the efficiency of formal verification processes. LeanCopilot supports both built-in models from LeanDojo and custom models, offering flexibility for various use cases.

Downloads: 11 This Week

Last Update: 2026-06-02
See Project
20

AIMET

AIMET is a library that provides advanced quantization and compression

Qualcomm Innovation Center (QuIC) is at the forefront of enabling low-power inference at the edge through its pioneering model-efficiency research. QuIC has a mission to help migrate the ecosystem toward fixed-point inference. With this goal, QuIC presents the AI Model Efficiency Toolkit (AIMET) - a library that provides advanced quantization and compression techniques for trained neural network models. AIMET enables neural networks to run more efficiently on fixed-point AI hardware...

Downloads: 16 This Week

Last Update: 2026-06-04
See Project
21

LLamaSharp

C#/.NET binding of llama.cpp, including LLaMa/GPT model inference

The C#/.NET binding of llama.cpp. It provides APIs to infer the LLaMa Models and deploy it on the local environment. It works on both Windows, Linux and MAC without the requirement for compiling llama.cpp yourself. Its performance is close to llama.cpp. Furthermore, it provides integrations with other projects such as BotSharp to provide higher-level applications and UI.

Downloads: 9 This Week

Last Update: 2026-04-26
See Project
22

optillm

Optimizing inference proxy for LLMs

OptiLLM is an optimizing inference proxy for Large Language Models (LLMs) that implements state-of-the-art techniques to enhance performance and efficiency. It serves as an OpenAI API-compatible proxy, allowing for seamless integration into existing workflows while optimizing inference processes. OptiLLM aims to reduce latency and resource consumption during LLM inference.

Downloads: 10 This Week

Last Update: 2026-05-07
See Project
23

RWKV Runner

A RWKV management and startup tool, full automation, only 8MB

RWKV (pronounced as RwaKuv) is an RNN with GPT-level LLM performance, which can also be directly trained like a GPT transformer (parallelizable). So it's combining the best of RNN and transformer - great performance, fast inference, fast training, saves VRAM, "infinite" ctxlen, and free text embedding. Moreover it's 100% attention-free. Default configs has enabled custom CUDA kernel acceleration, which is much faster and consumes much less VRAM. If you encounter possible compatibility...

Downloads: 11 This Week

Last Update: 2026-05-08
See Project
24

Turing.jl

Bayesian inference with probabilistic programming

Bayesian inference with probabilistic programming.

Downloads: 8 This Week

Last Update: 2026-05-02
See Project
25

Norfair

Lightweight Python library for adding real-time multi-object tracking

Norfair is a customizable lightweight Python library for real-time multi-object tracking. Using Norfair, you can add tracking capabilities to any detector with just a few lines of code. Any detector expressing its detections as a series of (x, y) coordinates can be used with Norfair. This includes detectors performing tasks such as object or keypoint detection. It can easily be inserted into complex video processing pipelines to add tracking to existing projects. At the same time, it is...

Downloads: 12 This Week

Last Update: 2025-04-30
See Project

Previous
You're on page 1
2
3
4
5
Next

Related Searches

whisper-windows-x64.exe

offline artificial intelligence\

whisper-bin-x64.zip

whisper.cpp

gpt4all

openvino

llama

c++

llama.cpp

whisper-cli.exe

Related Categories

Artificial Intelligence

Software Development

System

Business

Mobile

SourceForge

Create a Project
Open Source Software
Business Software
Top Downloaded Projects

Company

About
Team
SourceForge Headquarters
1320 Columbia Street Suite 310
San Diego, CA 92101
+1 (858) 422-6466

Resources

Support
Site Documentation
Site Status
SourceForge Reviews

© 2026 Slashdot Media. All Rights Reserved.

Terms Privacy Opt Out Advertise