Search Results for "tiny-core-plus"

Showing 41 open source projects for "tiny-core-plus"

1

Seldon Core

An MLOps framework to package, deploy, monitor and manage models

The de facto standard open-source platform for rapidly deploying machine learning models on Kubernetes. Seldon Core, our open-source framework, makes it easier and faster to deploy your machine learning models and experiments at scale on Kubernetes. Seldon Core serves models built in any open-source or commercial model building framework. You can make use of powerful Kubernetes features like custom resource definitions to manage model graphs. You can then connect your continuous integration and deployment (CI/CD) tools to scale and update your deployment. ...

Downloads: 0 This Week

Last Update: 2026-01-23
See Project
2

DeepDetect

Deep Learning API and Server in C++14 with support for Caffe, PyTorch

The core idea is to remove the error sources and difficulties of Deep Learning applications by providing a safe haven of commoditized practices, all available as a single core. While the Open Source Deep Learning Server is the core element, with REST API, and multi-platform support that allows training & inference everywhere, the Deep Learning Platform allows higher level management for training neural network models and using them as if they were simple code snippets. ...

Downloads: 2 This Week

Last Update: 7 days ago
See Project
3

MNN

MNN is a blazing fast, lightweight deep learning framework

...MNN Workbench can be downloaded from MNN's homepage; it provides pretrained models, visualized training tools, and one-click deployment of models to devices. On Android, the core .so library is about 400KB, the OpenCL .so is about 400KB, and the Vulkan .so is about 400KB. Supports hybrid computing on multiple devices. Currently supports CPU and GPU.

Downloads: 10 This Week

Last Update: 2026-06-16
See Project
4

ncnn

High-performance neural network inference framework for mobile

ncnn is a high-performance neural network inference computing framework designed specifically for mobile platforms. It brings artificial intelligence to your fingertips with no third-party dependencies, and runs faster than all other known open source frameworks on mobile phone CPUs. ncnn allows developers to easily deploy deep learning algorithm models to the mobile platform and create intelligent apps. It is cross-platform and supports most commonly used CNN networks, including...

Downloads: 62 This Week

Last Update: 2026-05-27
See Project
5

whisper.cpp

Port of OpenAI's Whisper model in C/C++

whisper.cpp is a lightweight C/C++ reimplementation of OpenAI’s Whisper automatic speech recognition (ASR) model, designed for efficient, standalone transcription without external dependencies. The entire high-level implementation of the model is contained in whisper.h and whisper.cpp. The rest of the code is part of the ggml machine learning library. The command downloads the base.en model converted to custom ggml format and runs the inference on all .wav samples in the folder samples....

Downloads: 525 This Week

Last Update: 2026-06-19
See Project
6

PaddlePaddle

PArallel Distributed Deep LEarning: Machine Learning Framework

...It is the only independent R&D deep learning platform in China, and has been widely adopted in various sectors including manufacturing, agriculture and enterprise service. PaddlePaddle covers core deep learning frameworks, basic model libraries, end-to-end development kits and more, with support for both dynamic and static graphs.

Downloads: 0 This Week

Last Update: 2026-01-31
See Project
7

llama.cpp

Port of Facebook's LLaMA model in C/C++

The llama.cpp project enables the inference of Meta's LLaMA model (and other models) in pure C/C++ without requiring a Python runtime. It is designed for efficient and fast model execution, offering easy integration for applications needing LLM-based capabilities. The repository focuses on providing a highly optimized and portable implementation for running large language models directly within C/C++ environments.

1 Review

Downloads: 177 This Week

Last Update: 16 hours ago
See Project
8

GPT4All

Run Local LLMs on Any Device. Open-source

GPT4All is an open-source project that allows users to run large language models (LLMs) locally on their desktops or laptops, eliminating the need for API calls or GPUs. The software provides a simple, user-friendly application that can be downloaded and run on various platforms, including Windows, macOS, and Ubuntu, without requiring specialized hardware. It integrates with the llama.cpp implementation and supports multiple LLMs, allowing users to interact with AI models privately. This...

1 Review

Downloads: 133 This Week

Last Update: 2025-03-17
See Project
9

Ray

A unified framework for scalable computing

...Scale reinforcement learning (RL) with RLlib, a framework-agnostic RL library that ships with 30+ cutting-edge RL algorithms including A3C, DQN, and PPO. Easily build out scalable, distributed systems in Python with simple and composable primitives in Ray Core.

Downloads: 1 This Week

Last Update: 2026-06-24
See Project
10

ONNX Runtime

ONNX Runtime: cross-platform, high performance ML inferencing

ONNX Runtime is a cross-platform inference and training machine-learning accelerator. ONNX Runtime inference can enable faster customer experiences and lower costs, supporting models from deep learning frameworks such as PyTorch and TensorFlow/Keras as well as classical machine learning libraries such as scikit-learn, LightGBM, XGBoost, etc. ONNX Runtime is compatible with different hardware, drivers, and operating systems, and provides optimal performance by leveraging hardware accelerators...

Downloads: 39 This Week

Last Update: 2026-06-22
See Project
11

TensorRT

C++ library for high performance inference on NVIDIA GPUs

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference. It includes a deep learning inference optimizer and runtime that delivers low latency and high throughput for deep learning inference applications. TensorRT-based applications perform up to 40X faster than CPU-only platforms during inference. With TensorRT, you can optimize neural network models trained in all major frameworks, calibrate for lower precision with high accuracy, and deploy to hyperscale data centers,...

Downloads: 23 This Week

Last Update: 2026-06-24
See Project
12

SageMaker Python SDK

Training and deploying machine learning models on Amazon SageMaker

...With the SDK, you can train and deploy models using popular deep learning frameworks Apache MXNet and TensorFlow. You can also train and deploy models with Amazon algorithms, which are scalable implementations of core machine learning algorithms that are optimized for SageMaker and GPU training. If you have your own algorithms built into SageMaker-compatible Docker containers, you can train and host models using these as well.

Downloads: 1 This Week

Last Update: 5 days ago
See Project
13

ONNX

Open standard for machine learning interoperability

ONNX is an open format built to represent machine learning models. ONNX defines a common set of operators - the building blocks of machine learning and deep learning models - and a common file format to enable AI developers to use models with a variety of frameworks, tools, runtimes, and compilers. Open Neural Network Exchange (ONNX) is an open ecosystem that empowers AI developers to choose the right tools as their project evolves. ONNX provides an open source format for AI models, both...

Downloads: 10 This Week

Last Update: 2026-06-15
See Project
14

OpenVINO

OpenVINO™ Toolkit repository

OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference. Boost deep learning performance in computer vision, automatic speech recognition, natural language processing and other common tasks. Use models trained with popular frameworks like TensorFlow, PyTorch and more. Reduce resource demands and efficiently deploy on a range of Intel® platforms from edge to cloud. This open-source version includes several components: namely Model Optimizer, OpenVINO™ Runtime,...

Downloads: 17 This Week

Last Update: 2026-06-09
See Project
15

LLamaSharp

C#/.NET binding of llama.cpp, including LLaMa/GPT model inference

The C#/.NET binding of llama.cpp. It provides APIs to run inference on LLaMA models and deploy them in a local environment. It works on Windows, Linux, and macOS without requiring you to compile llama.cpp yourself. Its performance is close to that of llama.cpp. It also integrates with other projects such as BotSharp to provide higher-level applications and UI.

Downloads: 0 This Week

Last Update: 2026-04-26
See Project
16

Lean Copilot

LLMs as Copilots for Theorem Proving in Lean

LeanCopilot integrates large language models (LLMs) as copilots for theorem proving in the Lean proof assistant. It assists users by suggesting tactics, premises, and searching for proofs, thereby enhancing the efficiency of formal verification processes. LeanCopilot supports both built-in models from LeanDojo and custom models, offering flexibility for various use cases.

Downloads: 1 This Week

Last Update: 2026-06-20
See Project
17

EconML

Python Package for ML-Based Heterogeneous Treatment Effects Estimation

...This package was designed and built as part of the ALICE project at Microsoft Research with the goal of combining state-of-the-art machine learning techniques with econometrics to bring automation to complex causal inference problems. One of the biggest promises of machine learning is to automate decision-making in a multitude of domains. At the core of many data-driven personalized decision scenarios is the estimation of heterogeneous treatment effects: what is the causal effect of an intervention on an outcome of interest for a sample with a particular set of features? In a nutshell, this toolkit is designed to measure the causal effect of some treatment variable(s) T on an outcome variable Y, controlling for a set of features X, W, and to determine how that effect varies as a function of X.

Downloads: 0 This Week

Last Update: 2025-07-10
See Project
18

ChatLLM.cpp

Pure C++ implementation of several models for real-time chatting

chatllm.cpp is a pure C++ implementation designed for real-time chatting with Large Language Models (LLMs) on personal computers, supporting both CPU and GPU execution. It enables users to run various LLMs ranging from less than 1 billion to over 300 billion parameters, facilitating responsive and efficient conversational AI experiences without relying on external servers.

Downloads: 1 This Week

Last Update: 2026-05-27
See Project
19

Distributed Llama

Connect home devices into a powerful cluster to accelerate LLM

Distributed Llama is an open-source project that enables users to connect multiple home devices into a powerful cluster to accelerate Large Language Model (LLM) inference. By leveraging tensor parallelism and high-speed synchronization over Ethernet, it allows for faster performance as more devices are added to the cluster. The system supports various operating systems, including Linux, macOS, and Windows, and is optimized for both ARM and x86_64 AVX2 CPUs.

Downloads: 1 This Week

Last Update: 2026-02-02
See Project
20

ExecuTorch

On-device AI across mobile, embedded and edge for PyTorch

ExecuTorch is an end-to-end solution for enabling on-device inference capabilities across mobile and edge devices including wearables, embedded devices and microcontrollers. It is part of the PyTorch Edge ecosystem and enables efficient deployment of PyTorch models to edge devices.

Downloads: 1 This Week

Last Update: 2026-05-28
See Project
21

OpenVINO Model Server

A scalable inference server for models optimized with OpenVINO

OpenVINO™ Model Server is a high-performance inference serving system designed to host and serve machine learning models that have been optimized with the OpenVINO toolkit. It’s implemented in C++ for scalability and efficiency, making it suitable for both edge and cloud deployments where inference workloads must be reliable and high throughput. The server exposes model inference via standard network protocols like REST and gRPC, allowing any client that speaks those protocols to request...

Downloads: 2 This Week

Last Update: 2026-06-19
See Project
22

DALI

A GPU-accelerated library containing highly optimized building blocks

The NVIDIA Data Loading Library (DALI) is a library for data loading and pre-processing to accelerate deep learning applications. It provides a collection of highly optimized building blocks for loading and processing image, video and audio data. It can be used as a portable drop-in replacement for built-in data loaders and data iterators in popular deep learning frameworks. Deep learning applications require complex, multi-stage data processing pipelines that include loading, decoding,...

Downloads: 2 This Week

Last Update: 2026-06-11
See Project
23

gemma.cpp

Lightweight, standalone C++ inference engine for Google's Gemma models

Gemma.cpp is a C++ implementation for running inference with Gemma models efficiently on CPUs and GPUs. Developed by Google, it allows running large language models (LLMs) like Gemma with minimal hardware, focusing on optimized performance and low latency. Gemma.cpp is intended for developers seeking to deploy LLMs in production environments without needing massive computational resources.

Downloads: 0 This Week

Last Update: 2025-03-25
See Project
24

ChatGLM.cpp

C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)

ChatGLM.cpp is a C++ implementation of the ChatGLM-6B model, enabling efficient local inference without requiring a Python environment. It is optimized for running on consumer hardware.

Downloads: 0 This Week

Last Update: 2025-01-21
See Project
25

CTranslate2

Fast inference engine for Transformer models

CTranslate2 is a C++ and Python library for efficient inference with Transformer models. The project implements a custom runtime that applies many performance optimization techniques such as weight quantization, layer fusion, batch reordering, etc., to accelerate and reduce the memory usage of Transformer models on CPU and GPU. The execution is significantly faster and requires fewer resources than general-purpose deep learning frameworks on supported models and tasks thanks to many...

Downloads: 1 This Week

Last Update: 2026-06-06
See Project

© 2026 Slashdot Media. All Rights Reserved.