Search Results for "gpu max performance" - Page 2

Showing 106 open source projects for "gpu max performance"

Filters applied: Software Development, Linux

1

DALI

A GPU-accelerated library containing highly optimized building blocks

...Deep learning applications require complex, multi-stage data processing pipelines that include loading, decoding, cropping, resizing, and many other augmentations. These data processing pipelines, which are currently executed on the CPU, have become a bottleneck, limiting the performance and scalability of training and inference. DALI addresses the problem of the CPU bottleneck by offloading data preprocessing to the GPU. Additionally, DALI relies on its own execution engine, built to maximize the throughput of the input pipeline.

Downloads: 1 This Week

Last Update: 2026-04-16
2

HLSL++

Math library using HLSL syntax with multiplatform SIMD support

HLSL++ is a header-only C++ math library designed to replicate the syntax and functionality of the HLSL shading language, making it easier for developers to write CPU-side code that mirrors GPU shader logic. It provides vector, matrix, and math operations with a syntax identical or very similar to HLSL, allowing seamless transition between shader code and application code. The library is optimized for performance and supports SIMD instructions across multiple architectures, including SSE, AVX, AVX2, AVX512, and ARM NEON, ensuring high efficiency on modern hardware. ...

Downloads: 2 This Week

Last Update: 2026-04-08
3

Floem

A native Rust UI library with fine-grained reactivity

Floem is a cross-platform GUI framework for Rust. It aims to be extremely performant while providing world-class developer ergonomics. Supporting both GPU and CPU rendering, Floem aims for performance close to bare metal. Primitives are also provided to help developers write performant UI code without too much effort.

Downloads: 0 This Week

Last Update: 2024-11-15
4

Arcan

Powerful development framework for creating virtually anything

...At its heart lies a robust and portable multimedia engine, with a well-tested and well-documented Lua scripting interface. Development emphasizes security, debuggability, and performance, guided by a principle of least surprise in API design. The main engine has been refactored substantially to reduce input latency; better accommodate variable-refresh-rate displays; prepare for asymmetric, uncooperative multi-GPU setups and GPU handover; and support explicit synchronization and runtime transitions among low (16-bit), standard (32-bit), and high-definition (10-bit + fp16/fp32) rendering.

Downloads: 0 This Week

Last Update: 2025-01-29
5

ChartGPU

Beautiful, open source, WebGPU-based charting library

The ChartGPU repository is an open-source, WebGPU-based charting library written in TypeScript that enables developers to visualize large datasets with high performance and smooth interactivity even when handling millions of data points. By leveraging WebGPU — the next-generation graphics API for the web — ChartGPU offloads rendering work to the GPU, allowing for fast panning, zooming, and real-time updates with minimal latency. This makes the library particularly valuable for data-intensive dashboards, scientific visualizations, and financial charting where performance bottlenecks of traditional canvas or SVG approaches become apparent. ...

Downloads: 0 This Week

Last Update: 2026-02-26
6

The Futhark Programming Language

A data-parallel functional programming language

Futhark is a small programming language designed to be compiled into efficient parallel code. It is a statically typed, data-parallel, and purely functional array language in the ML family, and comes with a heavily optimizing ahead-of-time compiler that presently generates either GPU code via CUDA and OpenCL, or multi-threaded CPU code. Futhark is not designed for graphics programming, but can instead use the compute power of the GPU to accelerate data-parallel array computations. The...

Downloads: 3 This Week

Last Update: 7 days ago
7

FurMark

GPU stress test and OpenGL/Vulkan graphics benchmark for Windows and Linux

FurMark is an intensive benchmarking tool designed to evaluate the performance of graphics cards using fur rendering algorithms. This tool is particularly effective in generating high workloads that can significantly increase the temperature of the GPU, making it a useful utility for testing the stability and stress tolerance of graphics cards. By simulating demanding rendering tasks, FurMark serves as a comprehensive test for assessing the robustness and thermal performance of GPUs under extreme conditions. ...

Downloads: 311 This Week

Last Update: 2024-10-28
8

Face Alignment

2D and 3D face alignment library built using PyTorch

...By default, the package uses the SFD face detector. However, users can alternatively use dlib, BlazeFace, or pre-existing ground truth bounding boxes. While not required, running the code on a CUDA-enabled GPU is highly recommended for optimal performance, especially for the detector. The work is presented here as a black box; to learn more about the intrinsics of the method, see the original paper on arXiv or the author's webpage.

Downloads: 2 This Week

Last Update: 2026-04-06
9

JAX Toolbox

Public CI, Docker images for popular JAX libraries

JAX Toolbox is a development toolkit designed to streamline and optimize the use of JAX for machine learning and high-performance computing on NVIDIA GPUs. It provides prebuilt Docker images, continuous integration pipelines, and optimized example implementations that help developers quickly set up and run JAX workloads without complex configuration. The project supports popular JAX-based frameworks and models, including architectures used for large-scale pretraining such as GPT and LLaMA...

Downloads: 2 This Week

Last Update: 7 days ago
10

ncnn

High-performance neural network inference framework for mobile

ncnn is a high-performance neural network inference framework designed specifically for mobile platforms. It has no third-party dependencies and, according to the project, runs faster than all other known open source frameworks on mobile phone CPUs. ncnn lets developers deploy deep learning models to mobile platforms and build intelligent apps. It is cross-platform and supports most commonly used CNN networks, including...

Downloads: 18 This Week

Last Update: 2026-01-13
11

ArrayFire

ArrayFire, a general purpose GPU library

ArrayFire is a general-purpose tensor library that simplifies the process of software development for the parallel architectures found in CPUs, GPUs, and other hardware acceleration devices. The library serves users in every technical computing market. Data structures in ArrayFire are smartly managed to avoid costly memory transfers and to take advantage of each performance feature provided by the underlying hardware. The community of ArrayFire developers invites you to build with us if...

Downloads: 1 This Week

Last Update: 2025-09-05
12

MNN

MNN is a blazing fast, lightweight deep learning framework

...On Android, the core .so library is about 400 KB, the OpenCL .so is about 400 KB, and the Vulkan .so is about 400 KB. Supports hybrid computing on multiple devices. Currently supports CPU and GPU.

Downloads: 12 This Week

Last Update: 2026-04-07
13

RLax

Library of JAX-based building blocks for reinforcement learning agents

...It supports both on-policy and off-policy learning, as well as value-based, policy-based, and model-based approaches. RLax is fully JIT-compilable with JAX, enabling high-performance execution across CPU, GPU, and TPU backends. The library implements tools for Bellman equations, return distributions, general value functions, and policy optimization in both continuous and discrete action spaces. It integrates seamlessly with DeepMind’s Haiku (for neural network definition) and Optax (for optimization), making it a key component in modular RL pipelines.

Downloads: 0 This Week

Last Update: 2025-10-09
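
The Bellman-equation tools mentioned in the RLax description can be illustrated with a one-step temporal-difference (TD) update. This is a minimal plain-Python sketch, not RLax's API: the function names `td_error` and `td_update`, the dictionary-based value table, and the `step_size` parameter are all assumptions made for illustration.

```python
def td_error(v_prev, reward, discount, v_next):
    """One-step TD error: reward + discount * V(next) - V(prev)."""
    return reward + discount * v_next - v_prev


def td_update(values, state, reward, discount, next_state, step_size=0.1):
    """Move V(state) toward the one-step Bellman target (hypothetical helper).

    `values` is a plain dict mapping state -> estimated value.
    Returns the TD error that was applied.
    """
    delta = td_error(values[state], reward, discount, values[next_state])
    values[state] += step_size * delta
    return delta
```

For example, with `values = {"s0": 0.0, "s1": 1.0}`, calling `td_update(values, "s0", 1.0, 0.9, "s1", step_size=0.5)` moves the estimate for `"s0"` halfway toward the target `1.0 + 0.9 * 1.0`.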
14

Halide

A language for fast, portable data-parallel computation

Halide is a programming language for fast, portable data-parallel computation. It was designed to make writing high-performance image and array processing code much easier on modern machines. It works on all major operating systems and with several CPU architectures (X86, ARM, MIPS, Hexagon, PowerPC) and GPU compute APIs (CUDA, OpenCL, OpenGL, among others). It is not a standalone programming language, however; rather, it is embedded in C++, which means that you write C++ code that builds an in-memory representation of a Halide pipeline using Halide's C++ API. ...

Downloads: 0 This Week

Last Update: 2025-09-17
15

GPUPixel

Real-time image and video processing library similar to GPUImage

GPUPixel is a real-time image and video processing library written in C++11, based on OpenGL/ES. It offers functionalities similar to GPUImage, including built-in beauty filters, enabling efficient processing and rendering of visual effects on images and videos.

Downloads: 1 This Week

Last Update: 2026-01-23
16

ngx-toastr

Angular Toastr

Toast Component Injection without being passed ViewContainerRef. No use of ngFor. Fewer dirty checks and higher performance. AoT compilation and lazy loading compatible. Component inheritance for custom toasts. SystemJS/UMD rollup bundle. Animations using Angular's Web Animations API. Output toasts to an optional target directive. Put toasts in a specific div inside your application. This should probably be somewhere that doesn't get deleted. Add ToastContainerModule to the ngModule where...

Downloads: 0 This Week

Last Update: 2026-02-06
17

GitHub Actions for DigitalOcean

GitHub Actions for DigitalOcean - doctl

...Powerful and production-ready, our cloud platform has the solutions that devs like you need to succeed, whether you're building world-changing AI apps, running a side project, or building a business. GPU solutions for everyone—novice to expert. Run training and inference, process large data sets and complex neural networks, and deploy high-performance computing clusters.

Downloads: 0 This Week

Last Update: 2026-04-22
18

PyOpenCL

OpenCL integration for Python, plus shiny features

PyOpenCL is a Python wrapper for the OpenCL framework, providing seamless access to parallel computing on CPUs, GPUs, and other accelerators. It enables developers to harness the full power of heterogeneous computing directly from Python, combining Python’s ease of use with the performance benefits of OpenCL. PyOpenCL also includes convenient features for managing memory, compiling kernels, and interfacing with NumPy, making it a preferred choice in scientific computing, data analysis, and...

Downloads: 1 This Week

Last Update: 2026-01-09
19

Bend

A massively parallel, high-level programming language

Bend is a high-level, massively parallel programming language. It aims to offer the feel of expressive languages such as Python and Haskell while running on massively parallel hardware such as GPUs, without requiring explicit parallelism constructs such as thread creation, locks, or mutexes. Bend programs run on the HVM2 runtime.

Downloads: 0 This Week

Last Update: 2025-09-21
20

oneDNN

oneAPI Deep Neural Network Library (oneDNN)

This software was previously known as Intel(R) Math Kernel Library for Deep Neural Networks (Intel(R) MKL-DNN) and Deep Neural Network Library (DNNL). oneAPI Deep Neural Network Library (oneDNN) is an open-source cross-platform performance library of basic building blocks for deep learning applications. oneDNN is part of oneAPI. The library is optimized for Intel(R) Architecture Processors, Intel Processor Graphics and Xe Architecture graphics. oneDNN has experimental support for the following architectures: Arm* 64-bit Architecture (AArch64), NVIDIA* GPU, OpenPOWER* Power ISA (PPC64), IBMz* (s390x), and RISC-V. oneDNN is intended for deep learning applications and framework developers interested in improving application performance on Intel CPUs and GPUs. ...

Downloads: 0 This Week

Last Update: 2026-04-17
21

VK-GL-CTS

Khronos Vulkan, OpenGL, and OpenGL ES Conformance Tests

...These tests are essential for vendors seeking certification, as they rigorously check the correctness and completeness of driver implementations against standardized behavior. The suite contains thousands of automated tests that assess rendering accuracy, API behavior, memory usage, and performance consistency. It is widely used by GPU vendors and developers to ensure compatibility, stability, and reliability across platforms and hardware.

Downloads: 1 This Week

Last Update: 2026-03-27
22

bitnet.cpp

Official inference framework for 1-bit LLMs

bitnet.cpp is the official open-source inference framework and ecosystem designed to enable ultra-efficient execution of 1-bit large language models (LLMs), which quantize most model parameters to ternary values (-1, 0, +1) while maintaining competitive performance with full-precision counterparts. At its core is bitnet.cpp, a highly optimized C++ backend that supports fast, low-memory inference on both CPUs and GPUs, enabling models such as BitNet b1.58 to run without requiring enormous...

Downloads: 2 This Week

Last Update: 2026-03-10
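
To make the ternary quantization mentioned in the bitnet.cpp description concrete, here is a minimal plain-Python sketch of an absmean-style scheme: weights are divided by their mean absolute value, rounded, and clamped to -1, 0, or +1. This is an illustrative assumption, not code from bitnet.cpp; the names `ternary_quantize` and `dequantize` and the `eps` parameter are hypothetical, and the input list is assumed to be non-empty.

```python
def ternary_quantize(weights, eps=1e-8):
    """Map floats to {-1, 0, +1} plus one shared scale (absmean-style sketch).

    Assumes `weights` is a non-empty list of floats.
    """
    # The scale is the mean absolute value; eps avoids division by zero.
    scale = sum(abs(w) for w in weights) / len(weights) + eps
    quantized = []
    for w in weights:
        # Round the scaled weight to the nearest integer, then clamp to [-1, 1].
        q = round(w / scale)
        quantized.append(max(-1, min(1, q)))
    return quantized, scale


def dequantize(quantized, scale):
    """Approximate the original weights from ternary codes and the scale."""
    return [q * scale for q in quantized]
```

Each weight then needs only a ternary code, and the single floating-point `scale` is shared across the group, which is where the memory savings in such schemes come from.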
23

Isaac ROS Visual SLAM

Visual SLAM/odometry package based on NVIDIA-accelerated cuVSLAM

Discover a faster, easier way to build advanced AI robotics applications with the NVIDIA Isaac™ ROS collection of accelerated computing packages and AI models, bringing NVIDIA acceleration to ROS developers everywhere. Isaac ROS Visual SLAM provides a high-performance, best-in-class ROS 2 package for VSLAM (visual simultaneous localization and mapping). This package uses one or more stereo cameras and optionally an IMU to estimate odometry as an input to navigation. It is GPU-accelerated to provide real-time, low-latency results in a robotics application. VSLAM provides an additional odometry source for mobile robots (ground-based) and can be the primary odometry source for drones. ...

Downloads: 2 This Week

Last Update: 6 days ago
24

Recursive Language Models

General plug-and-play inference library for Recursive Language Models

RLM (short for Recursive Language Models) is a plug-and-play inference library for recursive language models, an inference approach in which a language model can recursively call itself or other language models on parts of its input instead of processing the whole context in a single pass.

Downloads: 0 This Week

Last Update: 2026-02-18
25

BentoML

Unified Model Serving Framework

...Parallelize compute-intensive model inference workloads to scale separately from the serving logic. Adaptive batching dynamically groups inference requests for optimal performance. Orchestrate a distributed inference graph with multiple models via Yatai on Kubernetes. Easily configure CUDA dependencies for running inference on GPUs. Automatically generate Docker images for production deployment.

Downloads: 0 This Week

Last Update: 2026-04-02
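
The request grouping described above can be sketched in plain Python. This is a simplified illustration, not BentoML's implementation or API: it uses a fixed size cap and a fixed wait window, whereas an adaptive batcher would additionally tune those limits from observed latency. The function name `group_into_batches` and the parameters `max_batch_size` and `max_wait` are assumptions.

```python
def group_into_batches(arrivals, max_batch_size=4, max_wait=0.010):
    """Group (arrival_time, payload) pairs into batches.

    A batch is closed when it is full, or when the next request arrives
    more than `max_wait` seconds after the first request in the batch.
    """
    batches = []
    current = []
    window_start = None
    for arrival_time, payload in sorted(arrivals, key=lambda item: item[0]):
        window_expired = bool(current) and arrival_time - window_start > max_wait
        if current and (len(current) >= max_batch_size or window_expired):
            # Close the open batch before admitting the new request.
            batches.append(current)
            current = []
        if not current:
            # The first request of a batch opens a new wait window.
            window_start = arrival_time
        current.append(payload)
    if current:
        batches.append(current)
    return batches
```

Larger batches usually improve accelerator utilization, while the wait window bounds the extra latency any single request pays for being batched.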
© 2026 Slashdot Media. All Rights Reserved.
