Search Results for "gpu max performance"

Showing 102 open source projects for "gpu max performance"

Filter: C++

1

AGI (Android GPU Inspector)

Android GPU Inspector

Android GPU Inspector (AGI) is a desktop tool for profiling, tracing, and debugging graphics workloads running on Android devices. It helps developers analyze Vulkan and OpenGL ES applications at the system, frame, and draw-call levels to uncover GPU and CPU bottlenecks. AGI captures detailed performance counters, timelines, and pipeline state to reveal stalls, overdraw, shader hotspots, and inefficient resource usage.

Downloads: 0 This Week

Last Update: 2025-10-11
2

CatBoost

High-performance library for gradient boosting on decision trees

CatBoost is a fast, high-performance open source library for gradient boosting on decision trees. It is a machine learning method with plenty of applications, including ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. CatBoost offers superior performance over other GBDT libraries on many datasets, and has several superb features.

Downloads: 8 This Week

Last Update: 2026-02-21
3

SwiftShader

High-performance CPU-based implementation of the Vulkan 1.3 graphics API

SwiftShader is Google’s high-performance CPU-based implementation of the Vulkan 1.3 graphics API, designed to provide a hardware-independent rendering solution for 3D graphics. Unlike traditional GPU drivers, SwiftShader executes graphics commands entirely on the CPU, making it ideal for environments where dedicated graphics hardware is unavailable or unsuitable. It acts as a drop-in replacement for Vulkan drivers, allowing existing applications to run seamlessly by redirecting API calls through its software-based rendering engine. ...

Downloads: 134 This Week

Last Update: 4 days ago
4

Citron Neo

Research software designed to orchestrate virtual environments

Citron Neo is an advanced emulator project focused on replicating complex system environments with high performance and flexibility. It is designed to emulate modern console behavior while integrating improvements in CPU emulation, GPU rendering, and memory management. The project incorporates optimizations such as dynamic recompilation and Vulkan-based rendering to enhance performance across supported platforms. It also includes continuous updates that improve compatibility with games and system firmware, reflecting an active development cycle. ...

Downloads: 205 This Week

Last Update: 2026-04-27
5

7-max

7-max can increase the performance of some applications by up to 10-20%. Windows uses small (4 KB) RAM pages by default; 7-max lets applications use large (2 MB) RAM pages instead.

2 Reviews

Downloads: 50 This Week

Last Update: 2024-08-25
6

CUDA Core Compute Libraries (CCCL)

CUDA Core Compute Libraries

...By unifying these components, CCCL reduces duplication and improves developer productivity while maintaining performance across different GPU architectures.

Downloads: 1 This Week

Last Update: 3 days ago
7

XenosRecomp

A tool for converting Xbox 360 shaders to HLSL

...The project addresses one of the most complex aspects of console reverse engineering, which is accurately reproducing proprietary GPU behavior in a portable and efficient way. By reconstructing the graphics pipeline, XenosRecomp enables developers to render scenes correctly without relying on emulation layers that can introduce performance overhead or inaccuracies.

Downloads: 1 This Week

Last Update: 2026-03-18
8

Xenia Canary

Xbox 360 Emulator Research Project

Xenia Canary is an experimental fork of the Xenia Xbox 360 emulator that moves faster than the mainline project to trial bleeding-edge improvements. It focuses on game compatibility and performance by iterating quickly on GPU and CPU emulation paths, shader translation, and timing correctness. Canary builds are where risky optimizations, new backends, and rewrites land first so they can be tested by a wider community before stabilizing. The project emphasizes pragmatism: make more titles boot and run with fewer glitches, even if it means carrying experiments that later get refined or rolled back. ...

Downloads: 104 This Week

Last Update: 2 days ago
9

PowerInfer

High-speed Large Language Model Serving for Local Deployment

PowerInfer is a high-performance inference engine designed to run large language models efficiently on personal computers equipped with consumer-grade GPUs. The project focuses on improving the performance of local AI inference by optimizing how neural network computations are distributed between CPU and GPU resources. Its architecture exploits the observation that only a subset of neurons in large models are frequently activated, allowing the system to preload frequently used neurons into GPU memory while processing less common activations on the CPU. ...

Downloads: 0 This Week

Last Update: 2026-03-04
10

OptiScaler

OptiScaler bridges upscaling/frame gen across GPUs

...The tool effectively acts as a compatibility layer between the game engine and multiple upscaling frameworks, enabling cross-GPU access to features that might otherwise be restricted to specific hardware ecosystems. In addition to replacing upscalers, OptiScaler can enable frame generation features in titles that do not officially support them, improving frame rates and perceived smoothness during gameplay.

Downloads: 219 This Week

Last Update: 2026-04-27
11

NVIDIA cuOpt

GPU accelerated decision optimization

...The platform provides multiple interfaces, including C, Python, and server APIs, allowing developers to integrate optimization capabilities into applications and services. cuOpt is designed for high-performance environments and can be deployed across cloud, hybrid, or on-premise infrastructures. By combining GPU acceleration with scalable APIs, cuOpt enables organizations to solve large optimization challenges in logistics, operations research, and decision-making systems.

Downloads: 0 This Week

Last Update: 2026-04-09
12

Codon

A high-performance, zero-overhead, extensible Python compiler

Codon is a high-performance Python compiler that compiles Python code to native machine code without any runtime overhead. Typical speedups over Python are on the order of 100x or more, on a single thread. Codon supports native multithreading which can lead to speedups many times higher still. The Codon framework is fully modular and extensible, allowing for the seamless integration of new modules, compiler optimizations, domain-specific languages and so on. We actively develop Codon...

Downloads: 10 This Week

Last Update: 2026-03-04
13

Anime4KCPP

A high performance anime upscaler

Anime4KCPP provides an optimized implementation of bloc97's Anime4K algorithm version 0.9, along with its own CNN algorithm, ACNet. It can be used in a variety of ways, including preprocessing and real-time playback, and aims to be a high-performance tool for processing both images and video. This project is for learning and the exploration task of the algorithm course at SWJTU. Anime4K is a simple, high-quality anime upscaling algorithm. Version 0.9 does not use any machine learning approaches and can be...

Downloads: 25 This Week

Last Update: 2025-08-01
14

HeavyDB

HeavyDB (formerly MapD/OmniSciDB)

HeavyDB is an open-source GPU-accelerated analytical database designed to perform extremely fast queries on large datasets. The system is built as a SQL-based relational columnar database engine that leverages modern hardware parallelism, including GPUs and multicore CPUs. Its architecture allows users to query datasets containing billions of rows in milliseconds without requiring traditional indexing, pre-aggregation, or sampling techniques. HeavyDB was originally developed as part of the...

Downloads: 0 This Week

Last Update: 2026-03-11
15

UCCL

UCCL is an efficient communication library for GPUs

UCCL is a high-performance GPU communication library designed to support distributed machine learning workloads and large-scale AI systems. The library focuses on enabling efficient data transfer and collective communication between GPUs during training and inference processes. It supports a variety of communication patterns including collective operations such as all-reduce as well as peer-to-peer transfers that are commonly used in modern machine learning architectures. ...

Downloads: 0 This Week

Last Update: 2026-03-14
16

DXVK

Vulkan-based implementation of D3D9, D3D10 and D3D11 for Linux / Wine

...Direct3D is a graphics application programming interface built for Windows and is used for rendering three-dimensional graphics in applications. It is typically useful in applications where performance is vital, such as in three-dimensional games. This project aims to provide support for Direct3D11, feature level 11_1, and Direct3D10, feature level 10_1. Currently however, there are still a few unsupported features, such as shared resources, predication, class linkage and target-independent rasterization. To get the best results out of this project, it is recommended that you use an esync-enabled Wine build to reduce CPU overhead in some games, and to disable desktop effects on your compositor, as this can cause stuttering issues when games are GPU-bound.

Downloads: 396 This Week

Last Update: 2025-10-11
17

OpenVINO AI Plugins for Audacity

A set of AI-enabled effects, generators, and analyzers for Audacity

A set of AI-enabled effects, generators, and analyzers for Audacity. These AI features run 100% locally on your PC, no internet connection is necessary. OpenVINO™ is used to run AI models on supported accelerators found on the user's system such as CPU, GPU, and NPU.

Downloads: 125 This Week

Last Update: 2024-12-20
18

Diligent Core

A modern cross-platform low-level graphics API

DiligentCore is a low-level, cross-platform rendering library designed to provide a modern graphics abstraction layer over Direct3D11, Direct3D12, OpenGL, Vulkan, and Metal. It’s aimed at developers building high-performance rendering engines and scientific visualization tools. DiligentCore gives precise control over GPU resources and rendering pipelines, while also abstracting away platform-specific boilerplate. The library is modular, extensible, and well-suited for projects that require direct access to modern graphics APIs while maintaining portability and scalability.

Downloads: 0 This Week

Last Update: 2025-03-25
19

RTP-LLM

Alibaba's high-performance LLM inference engine for diverse apps

RTP-LLM is an open-source large language model inference acceleration engine developed by Alibaba to provide high-performance serving infrastructure for modern LLM deployments. The system focuses on improving throughput, latency, and resource utilization when running large models in production environments. It achieves this by implementing optimized GPU kernels, batching strategies, and memory management techniques tailored for transformer inference workloads.

Downloads: 0 This Week

Last Update: 2026-03-09
20

Shumai

Fast Differentiable Tensor Library in JavaScript & TypeScript with Bun

Shumai is an experimental differentiable tensor library for TypeScript and JavaScript, developed by Facebook Research. It provides a high-performance framework for numerical computing and machine learning within modern JavaScript runtimes. Built on Bun and Flashlight, with ArrayFire as its numerical backend, Shumai brings GPU-accelerated tensor operations, automatic differentiation, and scientific computing tools directly to JavaScript developers. It allows seamless integration of machine learning, deep learning, and custom differentiable programs into web-based or server-side environments without relying on Python frameworks. ...

Downloads: 0 This Week

Last Update: 5 days ago
21

TensorRT

C++ library for high performance inference on NVIDIA GPUs

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference. It includes a deep learning inference optimizer and runtime that delivers low latency and high throughput for deep learning inference applications. TensorRT-based applications perform up to 40X faster than CPU-only platforms during inference. With TensorRT, you can optimize neural network models trained in all major frameworks, calibrate for lower precision with high accuracy, and deploy to hyperscale data centers,...

Downloads: 17 This Week

Last Update: 2026-03-25
22

Skiko

Kotlin Multiplatform bindings to Skia

...By leveraging Skia’s proven performance and cross-platform consistency, Skiko helps developers write a single graphics pipeline that behaves predictably across environments, simplifying maintenance and reducing platform fragmentation.

Downloads: 24 This Week

Last Update: 3 days ago
23

UIforETW

User interface for recording and managing ETW traces

UIforETW is a Windows performance tracing companion that wraps the Event Tracing for Windows (ETW) toolchain in an approachable GUI. It standardizes trace collection profiles, launches WPR/xperf with the right providers, and organizes the resulting .etl files for repeatable investigations. The tool streamlines the entire loop—record, annotate, open in WPA/XperfView—so engineers can focus on finding scheduling stalls, I/O bottlenecks, GC pauses, or GPU hitches instead of memorizing command-line incantations. ...

Downloads: 0 This Week

Last Update: 2025-10-10
24

FlashMLA

FlashMLA: Efficient Multi-head Latent Attention Kernels

FlashMLA is a high-performance decoding kernel library designed especially for Multi-Head Latent Attention (MLA) workloads, targeting NVIDIA Hopper GPU architectures. It provides optimized kernels for MLA decoding, including support for variable-length sequences, helping reduce latency and increase throughput in model inference systems using that attention style.

Downloads: 0 This Week

Last Update: 2026-04-29
25

cuML

RAPIDS Machine Learning Library

cuML is a suite of libraries that implement machine learning algorithms and mathematical primitive functions, sharing compatible APIs with other RAPIDS projects. cuML enables data scientists, researchers, and software engineers to run traditional tabular ML tasks on GPUs without going into the details of CUDA programming. In most cases, cuML's Python API matches the API from scikit-learn. For large datasets, these GPU-based implementations can complete 10-50x faster than their CPU equivalents. For details on performance, see the cuML Benchmarks Notebook.

Downloads: 0 This Week

Last Update: 2026-04-09