Search Results for "gpu max performance" - Page 3


Showing 388 open source projects for "gpu max performance"


1

LMCache

Supercharge Your LLM with the Fastest KV Cache Layer

...These capabilities aim to lower latency, cut GPU cycles, and stabilize performance for production workloads with overlapping prompts or retrieval-augmented contexts. The end result is a cache fabric for LLMs that complements engines rather than replacing them.

Downloads: 0 This Week

Last Update: 2026-04-23
See Project
2

Zed

High-performance, multiplayer code editor from the creators of Atom

Zed is a next-generation code editor designed for high-performance collaboration with humans and AI. Written from scratch in Rust to efficiently leverage multiple CPU cores and your GPU. Integrate upcoming LLMs into your workflow to generate, transform, and analyze code. Chat with teammates, write notes together, and share your screen and project. Multibuffers compose excerpts from across the codebase in one editable surface.

Downloads: 27 This Week

Last Update: 10 hours ago
See Project
3

Pruna AI

Pruna is a model optimization framework built for developers

Pruna is an open-source, self-hostable AI inference engine designed to help teams deploy and manage large language models (LLMs) efficiently across private or hybrid infrastructures. Built with performance and developer ergonomics in mind, Pruna simplifies inference workflows by enabling multi-model orchestration, autoscaling, GPU resource allocation, and compatibility with popular open-source models. It is ideal for companies or teams looking to reduce reliance on external APIs while maintaining speed, cost-efficiency, and full control over their data and AI stack. ...

Downloads: 1 This Week

Last Update: 2026-04-22
See Project
4

libplacebo

Official mirror of libplacebo

libplacebo is a flexible, high-performance graphics library built on top of Vulkan, designed to provide reusable GPU-accelerated components for media applications. It originated as a core part of the rendering pipeline for the mpv media player and has since grown into a standalone library used for tone mapping, dithering, color space conversion, and more. libplacebo is ideal for developers looking to integrate sophisticated video rendering and post-processing into their own applications with full control over shaders and rendering stages.

Downloads: 1 This Week

Last Update: 2026-03-13
See Project
5

TensorRT Node for ComfyUI

Enables the best performance on NVIDIA RTX Graphics Cards

...The repo typically includes instructions for converting models to TensorRT engines and for wiring those engines into ComfyUI nodes. This is particularly attractive for power users who run many generations or who host ComfyUI on dedicated hardware and want to squeeze out every bit of GPU performance. In short, it’s about taking ComfyUI from “it runs” to “it runs fast” on NVIDIA GPUs.

Downloads: 3 This Week

Last Update: 2025-10-30
See Project
6

UCCL

UCCL is an efficient communication library for GPUs

UCCL is a high-performance GPU communication library designed to support distributed machine learning workloads and large-scale AI systems. The library focuses on enabling efficient data transfer and collective communication between GPUs during training and inference processes. It supports a variety of communication patterns including collective operations such as all-reduce as well as peer-to-peer transfers that are commonly used in modern machine learning architectures. ...

Downloads: 0 This Week

Last Update: 2026-03-14
See Project
7

KVCache-Factory

Unified KV Cache Compression Methods for Auto-Regressive Models

...In large language models, the key-value cache stores intermediate attention states that enable efficient token generation during inference, but these caches can consume large amounts of GPU memory when handling long contexts. KVCache-Factory provides a platform for implementing and evaluating multiple compression strategies that reduce memory usage while preserving model performance. The framework integrates several state-of-the-art methods such as PyramidKV, SnapKV, H2O, and StreamingLLM, allowing researchers to compare and experiment with different approaches within the same environment. ...

Downloads: 0 This Week

Last Update: 2026-03-09
See Project
8

FLUX.2-klein-4B

Pure C inference for the Flux 2 image generation model

...Because the implementation is in plain C and focuses on data locality and vectorized operations, flux2.c can be integrated into performance-critical code paths where control over memory layout and execution behavior matters, such as GPU kernels, embedded systems, or custom ML runtime engines.

Downloads: 10 This Week

Last Update: 2026-02-13
See Project
9

Scalene

High-performance CPU, GPU, and memory profiler for Python

Scalene is a high-performance CPU, GPU and memory profiler for Python that does a number of things that other Python profilers do not and cannot do. It runs orders of magnitude faster than other profilers while delivering far more detailed information. Once Scalene has profiled your program, it will launch a web browser with an interactive user interface (all processing is done locally).

Downloads: 0 This Week

Last Update: 2026-03-22
See Project
10

Text Generation Inference

Large Language Model Text Generation Inference

Text Generation Inference is a high-performance inference server for text generation models, optimized for Hugging Face's Transformers. It is designed to serve large language models efficiently with optimizations for performance and scalability.

Downloads: 1 This Week

Last Update: 2025-12-18
See Project
11

Anime4KCPP

A high performance anime upscaler

Anime4KCPP provides an optimized implementation of bloc97's Anime4K algorithm version 0.9, along with its own CNN algorithm, ACNet. It can be used in a variety of ways, including preprocessing and real-time playback, and aims to be a high-performance tool for processing both images and video. This project is for learning and for the exploration task of the algorithm course at SWJTU. Anime4K is a simple, high-quality anime upscaling algorithm. Version 0.9 does not use any machine learning approaches and can be...

Downloads: 18 This Week

Last Update: 2025-08-01
See Project
12

EvoTrees.jl

Boosted trees in Julia

A Julia implementation of boosted trees with CPU and GPU support. Efficient histogram-based algorithms with support for multiple loss functions, including various regressions, multi-classification and Gaussian max likelihood.

Downloads: 0 This Week

Last Update: 2026-02-24
See Project
13

DXVK

Vulkan-based implementation of D3D9, D3D10 and D3D11 for Linux / Wine

...Direct3D is a graphics application programming interface built for Windows and is used for rendering three-dimensional graphics in applications. It is typically useful in applications where performance is vital, such as three-dimensional games. This project aims to provide support for Direct3D11, feature level 11_1, and Direct3D10, feature level 10_1. Currently, however, there are still a few unsupported features, such as shared resources, predication, class linkage and target-independent rasterization. To get the best results out of this project, it is recommended that you use an esync-enabled Wine build to reduce CPU overhead in some games, and that you disable desktop effects on your compositor, as these can cause stuttering when games are GPU-bound.

Downloads: 399 This Week

Last Update: 2025-10-11
See Project
14

LibreHardwareMonitor

Monitor temperature sensors, fan speed, voltage, load & clock speeds

Libre Hardware Monitor is a free, open-source system monitoring tool that provides detailed insights into your computer’s hardware health and performance. It tracks real-time metrics such as temperatures, fan speeds, voltages, clock speeds, and load across a wide range of components. The project includes both a Windows Forms application for visual monitoring and a reusable library for developers who want to integrate hardware monitoring into their own software. LibreHardwareMonitor supports modern Intel and AMD CPUs, major GPU vendors, storage devices, and network adapters. ...

Downloads: 255 This Week

Last Update: 2026-02-14
See Project
15

Ultralight

Lightweight, high-performance HTML renderer for game developers

...Available for desktop apps, game consoles, TVs, embedded device displays, servers, and more. Official API for C and C++, with bindings for more. Render web-content on the GPU via Direct3D, Metal, OpenGL, or your own engine for unmatched visual performance. Render web-content on the CPU via SIMD/parallel for incredibly easy integration with any environment (including server-side!). Ultralight is engineered for peak performance, ensuring minimal CPU and memory usage. Customize low-level platform functionality, integrate JavaScript directly with native code, dive deep into performance tuning, and more. ...

Downloads: 3 This Week

Last Update: 2024-06-12
See Project
16

OpenVINO AI Plugins for Audacity

A set of AI-enabled effects, generators, and analyzers for Audacity

These AI features run 100% locally on your PC; no internet connection is necessary. OpenVINO™ is used to run AI models on supported accelerators found on the user's system, such as CPU, GPU, and NPU.

Downloads: 114 This Week

Last Update: 2024-12-20
See Project
17

ffmpeg-over-ip

Connect to remote ffmpeg servers

ffmpeg-over-ip is a client-server system that enables remote execution of FFmpeg commands on a machine with GPU access while controlling it from another environment such as a container or virtual machine. It allows applications without direct GPU access to offload video transcoding tasks to a remote server, improving performance without requiring complex passthrough setups. The system works by coordinating commands through a lightweight protocol while using a shared filesystem to exchange media data. ...

Downloads: 2 This Week

Last Update: 3 days ago
See Project
18

Faster Whisper

Faster Whisper transcription with CTranslate2

Faster Whisper is an optimized implementation of the Whisper speech recognition model designed to deliver significantly faster inference while maintaining comparable accuracy. It leverages efficient inference engines and optimized computation strategies to reduce latency and resource consumption. The system is particularly useful for real-time or large-scale transcription tasks where performance is critical. It supports multiple model sizes, allowing users to balance speed and accuracy based...

Downloads: 33 This Week

Last Update: 2026-04-06
See Project
19

Alpamayo 1

Bridging Reasoning and Action Prediction

...It incorporates vision-language-action modeling, enabling it to process sensor data and contextual information simultaneously. Alpamayo supports tasks such as trajectory prediction, auto-labeling, and reasoning-based decision making. The system is optimized for high-performance GPU environments and is intended primarily for experimentation and benchmarking. Overall, it represents an advanced step toward integrating reasoning into autonomous driving pipelines.

Downloads: 0 This Week

Last Update: 1 day ago
See Project
20

Diligent Core

A modern cross-platform low-level graphics API

DiligentCore is a low-level, cross-platform rendering library designed to provide a modern graphics abstraction layer over Direct3D11, Direct3D12, OpenGL, Vulkan, and Metal. It’s aimed at developers building high-performance rendering engines and scientific visualization tools. DiligentCore gives precise control over GPU resources and rendering pipelines, while also abstracting away platform-specific boilerplate. The library is modular, extensible, and well-suited for projects that require direct access to modern graphics APIs while maintaining portability and scalability.

Downloads: 0 This Week

Last Update: 2025-03-25
See Project
21

RTP-LLM

Alibaba's high-performance LLM inference engine for diverse apps

RTP-LLM is an open-source large language model inference acceleration engine developed by Alibaba to provide high-performance serving infrastructure for modern LLM deployments. The system focuses on improving throughput, latency, and resource utilization when running large models in production environments. It achieves this by implementing optimized GPU kernels, batching strategies, and memory management techniques tailored for transformer inference workloads.

Downloads: 0 This Week

Last Update: 2026-03-09
See Project
22

clip-retrieval

Easily compute clip embeddings and build a clip retrieval system

...It allows developers to compute embeddings for both images and text efficiently and then index them for fast similarity search across massive datasets. The system is optimized for performance and scalability, capable of processing tens or even hundreds of millions of embeddings using GPU acceleration. It includes components for inference, indexing, filtering, and serving results through APIs, making it a complete pipeline for building production-ready retrieval systems. The framework also supports querying by image, text, or embedding, enabling flexible use cases such as reverse image search or multimodal content discovery. ...

Downloads: 1 This Week

Last Update: 2026-03-18
See Project
23

XFrames

GPU-accelerated GUI development for Node.js and the browser

xframes is a high-performance library that empowers developers to build native desktop applications using familiar web technologies, specifically Node.js and React, without the overhead of the DOM. xframes serves as a streamlined alternative to Electron, designed for developers looking to maximize performance and efficiency.

Downloads: 0 This Week

Last Update: 2024-12-07
See Project
24

Shumai

Fast Differentiable Tensor Library in JavaScript & TypeScript with Bun

Shumai is an experimental differentiable tensor library for TypeScript and JavaScript, developed by Facebook Research. It provides a high-performance framework for numerical computing and machine learning within modern JavaScript runtimes. Built on Bun and Flashlight, with ArrayFire as its numerical backend, Shumai brings GPU-accelerated tensor operations, automatic differentiation, and scientific computing tools directly to JavaScript developers. It allows seamless integration of machine learning, deep learning, and custom differentiable programs into web-based or server-side environments without relying on Python frameworks. ...

Downloads: 0 This Week

Last Update: 5 days ago
See Project
25

G-Helper

Lightweight Armoury Crate alternative for Asus laptops and ROG Ally

Small and lightweight Armoury Crate alternative for Asus laptops, offering almost the same functionality without extra load and unnecessary services. Works with all popular models, such as ROG Zephyrus G14, G15, G16, M16, Flow X13, Flow X16, Flow Z13, DUO, TUF Series, Strix or Scar Series, ProArt, Vivobook, Zenbook, ROG Ally or Ally X, and many more.

Downloads: 150 This Week

Last Update: 2026-04-22
See Project

© 2026 Slashdot Media. All Rights Reserved.
