Search Results for "gpu hardware" - Page 4

Showing 160 open source projects for "gpu hardware"

1

Lemonade

Lemonade helps users run local LLMs with the highest performance

Lemonade is a local LLM runtime that aims to deliver the highest possible performance on your own hardware by auto-configuring state-of-the-art inference engines for both NPUs and GPUs. The project positions itself as a “local LLM server” you can run on laptops and workstations, abstracting away backend differences while giving you a single place to serve and manage models. Its README emphasizes real-world adoption across startups, research groups, and large companies, signaling a focus on...

Downloads: 3 This Week

Last Update: 2026-04-08
See Project
2

mosaicml composer

Supercharge Your Model Training

composer is a deep learning training framework built on PyTorch and designed to make large-scale model training more efficient, scalable, and customizable. At the center of the project is a highly optimized Trainer abstraction that simplifies the management of training loops, parallelization, metrics, logging, and data loading. The framework is intended for modern workloads that may span anything from a single GPU to very large distributed training environments, which makes it suitable for...

Downloads: 0 This Week

Last Update: 2026-03-10
See Project
3

OpenVINO

OpenVINO™ Toolkit repository

OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference. Boost deep learning performance in computer vision, automatic speech recognition, natural language processing, and other common tasks. Use models trained with popular frameworks like TensorFlow, PyTorch, and more. Reduce resource demands and efficiently deploy on a range of Intel® platforms from edge to cloud. This open-source version includes several components, namely Model Optimizer, OpenVINO™ Runtime,...

Downloads: 22 This Week

Last Update: 2026-03-25
See Project
4

gpt-oss

gpt-oss-120b and gpt-oss-20b are two open-weight language models

gpt-oss is OpenAI’s open-weight family of large language models designed for powerful reasoning, agentic workflows, and versatile developer use cases. The series includes two main models: gpt-oss-120b, a 117-billion-parameter model that can run on a single H100 GPU and is optimized for general-purpose, high-reasoning tasks, and gpt-oss-20b, a lighter 21-billion-parameter model suited to low-latency or specialized applications on smaller hardware. Both models use native MXFP4 quantization for efficient memory use and support OpenAI’s Harmony response format, enabling transparent full chain-of-thought reasoning and advanced tool integrations such as function calling, browsing, and Python code execution. ...

1 Review

Downloads: 10 This Week

Last Update: 2026-01-13
See Project
5

powerMAX

powerMAX is a CPU and GPU burn-in test

powerMAX is a CPU and GPU burn-in tool designed to push your hardware to its absolute thermal and power limits. It helps users uncover stability issues, cooling weaknesses, and power delivery problems by applying maximum, sustained stress to both the processor and graphics card. The utility supports dedicated CPU tests—SSE or AVX—and a demanding GPU 3D rendering test, with the option to run both simultaneously for full-system power load evaluation.

1 Review

Downloads: 25 This Week

Last Update: 2025-11-22
See Project
6

Superposition Benchmark (Unigine)

GPU benchmark testing graphics performance with realistic 3D scenes.

...Widely used by gamers and hardware reviewers, it is proprietary but offers a free edition.

Downloads: 97 This Week

Last Update: 2025-10-07
See Project
7

Unsloth-MLX

Bringing the Unsloth experience to Mac users via Apple's MLX framework

...This project removes traditional barriers that prevent Mac users from prototyping and experimenting with LLM training locally by allowing the same code used in cloud GPU environments to run on M-series hardware, improving workflow continuity and reducing iteration costs. It supports loading and training Hugging Face models with fine-tuning strategies like SFT, DPO, ORPO, and GRPO and even handles exporting models to formats like GGUF for downstream use, although some limitations apply with quantized models. ...

Downloads: 1 This Week

Last Update: 15 hours ago
See Project
8

CUDA-QX

Accelerated libraries for quantum-classical computing built on CUDA-Q

CUDA-QX is a collection of accelerated libraries built on top of the CUDA-Q platform, designed to enable rapid development of hybrid quantum-classical applications. It extends the CUDA-Q programming model by providing optimized implementations of domain-specific quantum computing primitives and workflows. The libraries are intended to help researchers and developers leverage GPUs, CPUs, and quantum processing units together in a unified computational model. CUDA-QX focuses on key areas such...

Downloads: 0 This Week

Last Update: 2026-04-10
See Project
9

JAX Toolbox

Public CI, Docker images for popular JAX libraries

JAX Toolbox is a development toolkit designed to streamline and optimize the use of JAX for machine learning and high-performance computing on NVIDIA GPUs. It provides prebuilt Docker images, continuous integration pipelines, and optimized example implementations that help developers quickly set up and run JAX workloads without complex configuration. The project supports popular JAX-based frameworks and models, including architectures used for large-scale pretraining such as GPT and LLaMA...

Downloads: 0 This Week

Last Update: 2026-04-14
See Project
10

TensorRT LLM

TensorRT LLM provides users with an easy-to-use Python API

TensorRT-LLM is an open-source high-performance inference library specifically designed to optimize and accelerate large language model deployment on NVIDIA GPUs. It provides a Python-based API built on top of PyTorch that allows developers to define, customize, and deploy LLMs efficiently across a variety of hardware configurations, from single GPUs to large multi-node clusters. The library focuses on maximizing throughput and minimizing latency through advanced techniques such as...

Downloads: 0 This Week

Last Update: 6 days ago
See Project
11

Diffrax

Numerical differential equation solvers in JAX

Diffrax is a numerical differential equation solving library built for the JAX ecosystem, with a strong focus on composability, differentiability, and high-performance scientific computing. The project provides tools for solving ordinary differential equations, stochastic differential equations, controlled differential equations, and related systems in a way that fits naturally into modern machine learning and differentiable programming workflows. Because it is written to work closely with...

Downloads: 0 This Week

Last Update: 2026-03-12
See Project
12

CUDA Containers for Edge AI & Robotics

Machine Learning Containers for NVIDIA Jetson and JetPack-L4T

CUDA Containers for Edge AI & Robotics is an open-source project that provides a modular container build system designed for running machine learning and AI workloads on NVIDIA Jetson devices. The repository contains container configurations that package the latest AI frameworks and dependencies optimized for Jetson hardware. These containers simplify the deployment of complex machine learning environments by bundling libraries such as CUDA, TensorRT, and deep learning frameworks into reproducible container images. The project is particularly useful for developers building edge AI and robotics systems that rely on GPU-accelerated inference and real-time computer vision. ...

Downloads: 0 This Week

Last Update: 2026-04-15
See Project
13

wllama

WebAssembly binding for llama.cpp - Enabling on-browser LLM inference

wllama is a WebAssembly-based library that enables large language model inference directly inside a web browser. Built as a binding for the llama.cpp inference engine, the project allows developers to run LLMs locally without requiring a server backend or dedicated GPU hardware. The library leverages WebAssembly SIMD capabilities to achieve efficient execution within modern browsers while maintaining compatibility across platforms. By running models locally on the user’s device, wllama enables privacy-preserving AI applications that do not require sending data to remote servers. The framework provides both high-level APIs for common tasks such as text generation and embeddings and low-level APIs that expose tokenization, sampling controls, and model state management.

Downloads: 0 This Week

Last Update: 2026-03-10
See Project
14

CUDA Agent

Large-Scale Agentic RL for High-Performance CUDA Kernel Generation

CUDA Agent is a research-driven agentic reinforcement learning system designed to automatically generate and optimize high-performance CUDA kernels for GPU workloads. The project addresses the long-standing challenge that efficient CUDA programming typically requires deep hardware expertise by training an autonomous coding agent capable of iterative improvement through execution feedback. Its architecture combines large-scale data synthesis, a skill-augmented CUDA development environment, and long-horizon reinforcement learning to build intrinsic optimization capability rather than relying on simple post-hoc tuning. ...

Downloads: 0 This Week

Last Update: 2026-03-03
See Project
15

EPLB

Expert Parallelism Load Balancer

EPLB is DeepSeek’s open implementation of a load balancing algorithm designed for expert parallelism (EP) settings in MoE architectures. In EP, different “experts” are mapped to different GPUs or nodes, so load imbalance becomes a performance bottleneck if certain experts are invoked much more often. EPLB solves this by duplicating heavily used experts (redundancy) and then placing those duplicates across GPUs to even out computational load. It uses policies like hierarchical load balancing...

Downloads: 0 This Week

Last Update: 2025-10-03
See Project
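
The replicate-then-place idea in the EPLB description above can be illustrated with a small greedy sketch. This is not EPLB's code or API: the function names `replicate_experts` and `place_replicas`, the assumption that an expert's load splits evenly across its replicas, and the "heaviest replica onto the least-loaded GPU" rule are all hypothetical simplifications chosen for illustration, and the sketch ignores the hierarchical policies the description mentions.

```python
def replicate_experts(loads, num_slots):
    """Give every expert one replica, then hand each spare slot to the
    expert whose per-replica load is currently the highest.
    (Hypothetical helper, not part of EPLB.)"""
    if num_slots < len(loads):
        raise ValueError("need at least one slot per expert")
    replicas = [1] * len(loads)
    for _ in range(num_slots - len(loads)):
        hottest = max(range(len(loads)), key=lambda e: loads[e] / replicas[e])
        replicas[hottest] += 1
    return replicas


def place_replicas(loads, replicas, num_gpus):
    """Place replicas heaviest-first, always onto the least-loaded GPU.
    Assumes an expert's load is shared equally by its replicas."""
    shares = sorted(
        ((loads[e] / replicas[e], e)
         for e in range(len(loads))
         for _ in range(replicas[e])),
        reverse=True,
    )
    gpu_loads = [0.0] * num_gpus
    placement = [[] for _ in range(num_gpus)]
    for share, expert in shares:
        gpu = min(range(num_gpus), key=lambda g: gpu_loads[g])
        placement[gpu].append(expert)
        gpu_loads[gpu] += share
    return placement, gpu_loads


# Example: four experts with skewed traffic, six replica slots, three GPUs.
loads = [90, 30, 20, 10]
replicas = replicate_experts(loads, num_slots=6)
placement, gpu_loads = place_replicas(loads, replicas, num_gpus=3)
print(replicas, placement, gpu_loads)
```

In this toy setup the busiest expert receives the spare slots, so its traffic is spread over several GPUs instead of concentrating on one.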
16

WebLLM

Bringing large-language models and chat to web browsers

WebLLM is a modular, customizable JavaScript package that brings language model chats directly to web browsers with hardware acceleration. Everything runs inside the browser with no server support and is accelerated with WebGPU. This opens up opportunities to build AI assistants for everyone and to preserve privacy while enjoying GPU acceleration. WebLLM offers a minimalist and modular interface to access the chatbot in the browser. The WebLLM package itself does not come with a UI and is designed in a modular way to hook into any UI component. ...

Downloads: 0 This Week

Last Update: 2026-03-13
See Project
17

CosyVoice

Multi-lingual large voice generation model, providing inference

CosyVoice is a multilingual large voice generation model that offers a full-stack solution for training, inference, and deployment of high-quality TTS systems. The model supports multiple languages, including Chinese, English, Japanese, Korean, and a range of Chinese dialects such as Cantonese, Sichuanese, Shanghainese, Tianjinese, and Wuhanese. It is designed for zero-shot voice cloning and cross-lingual or mix-lingual scenarios, so a single reference voice can be used to synthesize speech...

Downloads: 2 This Week

Last Update: 2025-11-30
See Project
18

MaxText

A simple, performant and scalable JAX LLM

MaxText is a high-performance, highly scalable open-source framework designed to train and fine-tune large language models using the JAX ecosystem. The project acts as both a reference implementation and a practical training library that demonstrates best practices for building and scaling transformer-based language models on modern accelerator hardware. It is optimized to run efficiently on Google Cloud TPUs and GPUs, enabling researchers and engineers to train models ranging from small...

Downloads: 0 This Week

Last Update: 2026-03-23
See Project
19

TensorFlow Probability

Probabilistic reasoning and statistical analysis in TensorFlow

TensorFlow Probability (TFP) is a Python library for probabilistic reasoning and statistical analysis. Built on TensorFlow, it makes it easy to combine probabilistic models and deep learning on modern hardware (TPU, GPU). It's for data scientists, statisticians, ML researchers, and practitioners who want to encode domain knowledge to understand data and make predictions. Since TFP inherits the benefits of TensorFlow, you can build, fit, and deploy a model using a single language throughout the lifecycle of model exploration and production. ...

Downloads: 0 This Week

Last Update: 2024-11-08
See Project
20

local-llm

Run LLMs locally on Cloud Workstations

local-llm is a development framework that enables developers to run large language models locally within Google Cloud Workstations or standard environments without requiring GPU hardware. It focuses on making generative AI development more accessible by leveraging quantized models and CPU-based execution, eliminating the dependency on expensive GPU infrastructure. The repository includes tools, Docker configurations, and command-line utilities that simplify the process of downloading, running, and interacting with language models directly on local or cloud-based workstations. ...

Downloads: 1 This Week

Last Update: 2026-03-17
See Project
21

MegEngine

Easy-to-use deep learning framework with 3 key features

MegEngine is a fast, scalable, and easy-to-use deep learning framework with 3 key features. You can represent quantization, dynamic shape, image pre-processing, and even derivation in one model. After training, just put everything into your model and run inference on any platform with ease. Speed and precision problems won't bother you anymore because training and inference share the same core. In training, GPU memory usage can drop to one-third at the cost of only one additional line, which enables the DTR...

Downloads: 0 This Week

Last Update: 2024-04-30
See Project
22

VCClient

Software that uses AI to perform real-time voice conversion

...It provides both a graphical user interface and API access, making it suitable for casual users as well as developers who want to integrate voice transformation into their own applications. The project also supports GPU acceleration, enabling faster inference and smoother real-time performance on compatible hardware. Additionally, it includes tools for training and managing voice models, giving users the ability to create personalized voice profiles.

Downloads: 12 This Week

Last Update: 2026-03-23
See Project
23

MSI Kombustor

Advanced OpenGL and Vulkan graphics card stress testing utility

...The tool provides MSI users with an exclusive, streamlined interface for testing their hardware safely and effectively. By driving high temperatures and peak loads, it reveals whether a graphics card can sustain extended heavy usage. Kombustor is ideal for anyone looking to test, validate, or tune their GPU setup.

1 Review

Downloads: 79 This Week

Last Update: 2025-11-22
See Project
24

Bottleneck Calculator

Check CPU and GPU balance with real-time bottleneck analysis

PC Bottleneck Calculator is a performance analysis tool that helps PC gamers and builders identify CPU or GPU bottlenecks in their systems. It provides compatibility insights by comparing hardware data and real-world benchmarks to estimate system balance. Users can instantly see how well their CPU and GPU pair together, test different configurations, and understand which component limits their gaming performance. Built with a clean, responsive interface, the tool offers quick, data-driven results without requiring downloads or complex setup. Website: www.pcbottleneckcalculator.io

Downloads: 0 This Week

Last Update: 2025-10-19
See Project
25

Knema - Frame Continuity Engine

Knema is a lightweight real-time performance & frame continuity engine

...
🔹 Adaptive Frametime Control
  Continuously analyzes frametime distribution (mean, p95, jitter)
  Prioritizes stable frame pacing over artificial FPS boosting
  Reduces micro-stutter and sudden frame spikes
🔹 GPU-Aware Decision Engine
  Accurately detects GPU-bound, CPU-bound, and engine-wait scenarios
  Differentiates real GPU bottlenecks from telemetry glitches
  Prevents false performance corrections
🔹 Intelligent FPS & Power Management
  Dynamically adjusts FPS caps based on real hardware limits
  Reduces unnecessary GPU power consumption in stable scenes
  Avoids aggressive throttling that causes oscillation or jitter
🔹 Real-Time Probing System
  Actively tests GPU headroom instead of relying on assumptions
  Safely probes performance limits without destabilizing gameplay
  Automatically backs off when physical limits

Downloads: 0 This Week

Last Update: 2026-02-10
See Project
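
The frametime statistics named in the Knema description (mean, p95, jitter) can be computed from a window of frame times roughly as follows. This is a minimal sketch, not Knema's implementation: the function name `frametime_stats` is hypothetical, p95 uses the nearest-rank method, and jitter is assumed here to mean the average absolute change between consecutive frames, since the listing does not define it.

```python
from statistics import fmean


def frametime_stats(frametimes_ms):
    """Summarize a window of frame times given in milliseconds.
    (Hypothetical helper; the jitter definition is an assumption.)"""
    n = len(frametimes_ms)
    if n < 2:
        raise ValueError("need at least two frame times")
    ordered = sorted(frametimes_ms)
    # Nearest-rank 95th percentile: ceil(0.95 * n), computed with integers.
    rank = (95 * n + 99) // 100
    p95 = ordered[rank - 1]
    # Jitter as the mean absolute frame-to-frame difference.
    jitter = fmean(abs(b - a) for a, b in zip(frametimes_ms, frametimes_ms[1:]))
    return {"mean": fmean(frametimes_ms), "p95": p95, "jitter": jitter}


# Example: a steady 16 ms cadence with a single 40 ms spike at the end.
window = [16.0] * 19 + [40.0]
print(frametime_stats(window))
```

A single spike barely moves the mean but shows up in the jitter figure, which is one reason a pacing-oriented tool might track more than average FPS.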


© 2026 Slashdot Media. All Rights Reserved.
