Join/Login
Business Software
Open Source Software
For Vendors
Blog
About
More

For Vendors Help Create Join Login

Business Software

Open Source Software

SourceForge Podcast

Resources

Articles
Case Studies
Blog

Menu

Help
Create
Join
Login

Home
Open Source Software
Artificial Intelligence
LLM Inference Tools
Search Results

Search Results for "c++ programming language"

x

Sort By:

Relevance

Clear All Filters

OS

Linux 21
Windows 21
Mac 19
More...
BSD 3
ChromeOS 3

Category

Artificial Intelligence 21
Software Development 3
Multimedia 1

License

OSI-Approved Open Source 21

Translations

English 2

Programming Language

C++ 15
C 2
Go 2
Python 2
More...
C# 1
Julia 1

Status

Production/Stable 1

Showing 21 open source projects for "c++ programming language"

View related business solutions

LLM Inference Clear Filters & Widen Search

Our Free Plans just got better! | Auth0
With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.

Try free now
Sales CRM and Pipeline Management Software | Pipedrive
The easy and effective CRM for closing deals

Pipedrive’s simple interface empowers salespeople to streamline workflows and unite sales tasks in one workspace. Unlock instant sales insights with Pipedrive’s visual sales pipeline and fine-tune your strategy with robust reporting features and a personalized AI Sales Assistant.

Try it for free
1

llama.cpp

Port of Facebook's LLaMA model in C/C++

The llama.cpp project enables the inference of Meta's LLaMA model (and other models) in pure C/C++ without requiring a Python runtime. It is designed for efficient and fast model execution, offering easy integration for applications needing LLM-based capabilities. The repository focuses on providing a highly optimized and portable implementation for running large language models directly within C/C++ environments.

1 Review

Downloads: 72 This Week

Last Update: 6 hours ago
See Project
2

GPT4All

Run Local LLMs on Any Device. Open-source

GPT4All is an open-source project that allows users to run large language models (LLMs) locally on their desktops or laptops, eliminating the need for API calls or GPUs. The software provides a simple, user-friendly application that can be downloaded and run on various platforms, including Windows, macOS, and Ubuntu, without requiring specialized hardware. It integrates with the llama.cpp implementation and supports multiple LLMs, allowing users to interact with AI models privately...

1 Review

Downloads: 73 This Week

Last Update: 2025-03-17
See Project
3

LocalAI

Self-hosted, community-driven, local OpenAI compatible API

Self-hosted, community-driven, local OpenAI compatible API. Drop-in replacement for OpenAI running LLMs on consumer-grade hardware. Free Open Source OpenAI alternative. No GPU is required. Runs ggml, GPTQ, onnx, TF compatible models: llama, gpt4all, rwkv, whisper, vicuna, koala, gpt4all-j, cerebras, falcon, dolly, starcoder, and many others. LocalAI is a drop-in replacement REST API that’s compatible with OpenAI API specifications for local inferencing. It allows you to run LLMs (and not...

Downloads: 28 This Week

Last Update: 3 days ago
See Project
4

OpenVINO

OpenVINO™ Toolkit repository

OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference. Boost deep learning performance in computer vision, automatic speech recognition, natural language processing and other common tasks. Use models trained with popular frameworks like TensorFlow, PyTorch and more. Reduce resource demands and efficiently deploy on a range of Intel® platforms from edge to cloud. This open-source version includes several components: namely Model Optimizer, OpenVINO™ Runtime, Post-Training...

Downloads: 20 This Week

Last Update: 2025-06-06
See Project
Crowdtesting That Delivers | Testeum
Unfixed bugs delaying your launch? Test with real users globally – check it out for free, results in days.

Testeum connects your software, app, or website to a worldwide network of testers, delivering detailed feedback in under 48 hours. Ensure functionality and refine UX on real devices, all at a fraction of traditional costs. Trusted by startups and enterprises alike, our platform streamlines quality assurance with actionable insights.

Click to perfect your product now.
5

TensorRT

C++ library for high performance inference on NVIDIA GPUs

..., embedded, or automotive product platforms. TensorRT is built on CUDA®, NVIDIA’s parallel programming model, and enables you to optimize inference leveraging libraries, development tools, and technologies in CUDA-X™ for artificial intelligence, autonomous machines, high-performance computing, and graphics. With new NVIDIA Ampere Architecture GPUs, TensorRT also leverages sparse tensor cores providing an additional performance boost.

Downloads: 12 This Week

Last Update: 7 days ago
See Project
6

ChatGLM.cpp

C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)

ChatGLM.cpp is a C++ implementation of the ChatGLM-6B model, enabling efficient local inference without requiring a Python environment. It is optimized for running on consumer hardware.

Downloads: 7 This Week

Last Update: 2025-01-21
See Project
7

Gen.jl

A general-purpose probabilistic programming system

An open-source stack for generative modeling and probabilistic inference. Gen’s inference library gives users building blocks for writing efficient probabilistic inference algorithms that are tailored to their models, while automating the tricky math and the low-level implementation details. Gen helps users write hybrid algorithms that combine neural networks, variational inference, sequential Monte Carlo samplers, and Markov chain Monte Carlo. Gen features an easy-to-use modeling language...

Downloads: 2 This Week

Last Update: 2025-07-11
See Project
8

Lean Copilot

LLMs as Copilots for Theorem Proving in Lean

LeanCopilot integrates large language models (LLMs) as copilots for theorem proving in the Lean proof assistant. It assists users by suggesting tactics, premises, and searching for proofs, thereby enhancing the efficiency of formal verification processes. LeanCopilot supports both built-in models from LeanDojo and custom models, offering flexibility for various use cases.

Downloads: 2 This Week

Last Update: 2025-07-01
See Project
9

Distributed Llama

Connect home devices into a powerful cluster to accelerate LLM

Distributed Llama is an open-source project that enables users to connect multiple home devices into a powerful cluster to accelerate Large Language Model (LLM) inference. By leveraging tensor parallelism and high-speed synchronization over Ethernet, it allows for faster performance as more devices are added to the cluster. The system supports various operating systems, including Linux, macOS, and Windows, and is optimized for both ARM and x86_64 AVX2 CPUs.

Downloads: 2 This Week

Last Update: 2025-07-07
See Project
Gen AI apps are built with MongoDB Atlas
Build gen AI apps with an all-in-one modern database: MongoDB Atlas

MongoDB Atlas provides built-in vector search and a flexible document model so developers can build, scale, and run gen AI apps without stitching together multiple databases. From LLM integration to semantic search, Atlas simplifies your AI architecture—and it’s free to get started.

Start Free
10

ChatLLM.cpp

Pure C++ implementation of several models for real-time chatting

chatllm.cpp is a pure C++ implementation designed for real-time chatting with Large Language Models (LLMs) on personal computers, supporting both CPU and GPU executions. It enables users to run various LLMs ranging from less than 1 billion to over 300 billion parameters, facilitating responsive and efficient conversational AI experiences without relying on external servers.

Downloads: 1 This Week

Last Update: 2025-06-12
See Project
11

rwkv.cpp

INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model

Besides the usual FP32, it supports FP16, quantized INT4, INT5 and INT8 inference. This project is focused on CPU, but cuBLAS is also supported. RWKV is a novel large language model architecture, with the largest model in the family having 14B parameters. In contrast to Transformer with O(n^2) attention, RWKV requires only state from the previous step to calculate logits. This makes RWKV very CPU-friendly on large context lengths.

Downloads: 0 This Week

Last Update: 2025-03-23
See Project
12

EvaDB

Database system for building simpler and faster AI-powered application

Over the last decade, AI models have radically changed the world of natural language processing and computer vision. They are accurate on various tasks ranging from question answering to object tracking in videos. To use an AI model, the user needs to program against multiple low-level libraries, like PyTorch, Hugging Face, Open AI, etc. This tedious process often leads to a complex AI app that glues together these libraries to accomplish the given task. This programming complexity prevents...

Downloads: 0 This Week

Last Update: 2023-11-19
See Project
13

LLamaSharp

C#/.NET binding of llama.cpp, including LLaMa/GPT model inference

The C#/.NET binding of llama.cpp. It provides APIs to infer the LLaMa Models and deploy it on the local environment. It works on both Windows, Linux and MAC without the requirement for compiling llama.cpp yourself. Its performance is close to llama.cpp. Furthermore, it provides integrations with other projects such as BotSharp to provide higher-level applications and UI.

Downloads: 0 This Week

Last Update: 2025-05-14
See Project
14

gemma.cpp

lightweight, standalone C++ inference engine for Google's Gemma models

Gemma.cpp is a C++ implementation for running inference with Gemma models efficiently on CPUs and GPUs. Developed by Google, it allows running large language models (LLMs) like Gemma with minimal hardware, focusing on optimized performance and low latency. Gemma.cpp is intended for developers seeking to deploy LLMs in production environments without needing massive computational resources.

Downloads: 0 This Week

Last Update: 2025-03-25
See Project
15

Bolt NLP

Bolt is a deep learning library with high performance

Bolt is a high-performance deep learning inference framework developed by Huawei Noah's Ark Lab. It is designed to optimize and accelerate the deployment of deep learning models across various hardware platforms. Bolt is a light-weight library for deep learning. Bolt, as a universal deployment tool for all kinds of neural networks, aims to automate the deployment pipeline and achieve extreme acceleration. Bolt has been widely deployed and used in many departments of HUAWEI company, such as...

Downloads: 0 This Week

Last Update: 2025-01-30
See Project
16

Neural Speed

An innovative library for efficient LLM inference

neural-speed is an innovative library developed by Intel to enhance the efficiency of Large Language Model (LLM) inference through low-bit quantization techniques. By reducing the precision of model weights and activations, neural-speed aims to accelerate inference while maintaining model accuracy, making it suitable for deployment in resource-constrained environments.

Downloads: 0 This Week

Last Update: 2025-03-19
See Project
17

TensorFlow Serving

Serving system for machine learning models

.... The easiest and most straight-forward way of using TensorFlow Serving is with Docker images. We highly recommend this route unless you have specific needs that are not addressed by running in a container. In order to serve a Tensorflow model, simply export a SavedModel from your Tensorflow program. SavedModel is a language-neutral, recoverable, hermetic serialization format that enables higher-level systems and tools to produce, consume, and transform TensorFlow models.

Downloads: 0 This Week

Last Update: 2025-05-05
See Project
18

LLaMA.go

llama.go is like llama.cpp in pure Golang

llama.go is like llama.cpp in pure Golang. The code of the project is based on the legendary ggml.cpp framework of Georgi Gerganov written in C++ with the same attitude to performance and elegance. Both models store FP32 weights, so you'll needs at least 32Gb of RAM (not VRAM or GPU RAM) for LLaMA-7B. Double to 64Gb for LLaMA-13B.

Downloads: 0 This Week

Last Update: 2023-08-25
See Project
19

Temporal Inference Engine

A real time inference engine for temporal logical specifications

.... The accepted language provides timed logic and mathematical operators, conditional operators, interval operators, bounded quantifiers and parametrization of signals.

1 Review

Downloads: 0 This Week

Last Update: 2025-06-20
See Project
20

Coqui STT

The deep learning toolkit for speech-to-text

... in post. With Coqui, dubbing is a delight. Effortlessly clone the voice of your talent into another language and let the clone do the dub. With text-to-speech, experience the immediacy of script-to-performance. Cast from a wide selection of high-quality, directable, emotive voices or clone a voice to suit your needs. With Coqui text-to-speech, production times go from months to minutes.

Downloads: 5 This Week

Last Update: 2022-09-03
See Project
21

TurboTransformers

Fast and user-friendly runtime for transformer inference

TurboTransformers is a high-performance inference framework optimized for running Transformer models efficiently on CPUs and GPUs. It improves latency and throughput for NLP applications.

Downloads: 0 This Week

Last Update: 2025-01-24
See Project

Previous
You're on page 1
Next

Related Searches

llama.cpp

offline artificial intelligence\

gpt4all

llama-b5828-bin-win-cpu-x64.zip

python

offline artificial intelligence assistant

llama

delphi speech recognition

ai

gen

Related Categories

Artificial Intelligence

Software Development

Multimedia

SourceForge

Create a Project
Open Source Software
Business Software
Top Downloaded Projects

Company

About
Team
SourceForge Headquarters
225 Broadway Suite 1600
San Diego, CA 92101
+1 (858) 422-6466

Resources

Support
Site Documentation
Site Status
SourceForge Reviews

© 2025 Slashdot Media. All Rights Reserved.

Terms Privacy Opt Out Advertise

Want the latest updates on software, tech news, and AI?

Get latest updates about software, tech news, and AI from SourceForge directly in your inbox once a month.

Thanks for helping keep SourceForge clean.

X

You seem to have CSS turned off. Please don't fill out this field.

You seem to have CSS turned off. Please don't fill out this field.

Briefly describe the problem (required):

Upload screenshot of ad (required):

Select a file, or drag & drop file here.

✔

✘

Screenshot instructions:

Click URL instructions:
Right-click on the ad, choose "Copy Link", then paste here →
(This may not be possible with some types of ads)

More information about our ad policies

Ad destination/click URL: