
Search Results for "gpu max performance" - Page 6

Showing 388 open source projects for "gpu max performance"

1

ANE Training

Training neural networks on Apple Neural Engine via APIs

...The repository implements a from-scratch transformer training pipeline capable of running both forward and backward passes on ANE hardware without relying on CoreML, Metal, or GPU acceleration. It explores the internal software stack of the Apple Neural Engine by interfacing with private classes such as _ANEClient and compiling custom compute graphs in the MIL format. The project includes performance benchmarks and kernel breakdowns that show how different components of the training loop are distributed between the ANE and CPU. ...

Downloads: 2 This Week

Last Update: 2026-03-10
2

Audiblez

Generate audiobooks from e-books

Audiblez is a tool for generating high-quality .m4b audiobooks directly from .epub e-books using the Kokoro-82M neural text-to-speech model. It focuses on making audiobook creation easy and fast: from a single command, the tool splits an e-book into chapters, synthesizes audio for each section, and then merges the results into a structured audiobook with chapter-based WAV files and a final .m4b container. The Kokoro-82M model it uses is compact (82M parameters) yet natural sounding, trained...

Downloads: 5 This Week

Last Update: 2025-11-30
3

Lux.jl

Elegant and Performant Deep Learning

Lux.jl is a lightweight and extensible deep learning framework in Julia designed for speed, composability, and clarity. Unlike traditional machine learning libraries that bundle training logic and models, Lux separates model definitions from training routines, encouraging modularity and ease of experimentation. It integrates seamlessly with SciML and other Julia packages, supporting neural differential equations and scientific machine learning workflows.

Downloads: 2 This Week

Last Update: 5 days ago
4

Mach Engine

Zig game engine & graphics toolkit

Mach is a game engine and graphics toolkit written in Zig, built with the goal of enabling high-performance, truly cross-platform 2D, 3D, GUI, and visualization applications. The project aims to deliver a modular, robust foundation where graphics, input, windowing, and rendering are unified under a modern, low-level but ergonomic API. Because Mach is written in Zig (with some shader code / WGSL), it leverages Zig’s performance and modern systems-level features while offering safe-ish...

Downloads: 0 This Week

Last Update: 2026-04-10
5

hashcat

World's fastest and most advanced password recovery utility

hashcat is the world's fastest and most advanced password recovery utility, supporting five unique modes of attack for over 300 highly-optimized hashing algorithms. hashcat currently supports CPUs, GPUs, and other hardware accelerators on Linux, Windows, and macOS, and has facilities to help enable distributed password cracking. Download the latest release and unpack it in the desired location. Please remember to use 7z x when unpacking the archive from the command line to ensure full file...

Downloads: 81 This Week

Last Update: 2025-08-23
6

face.evoLVe

High-Performance Face Recognition Library on PaddlePaddle & PyTorch

face.evoLVe is a high-performance face recognition library designed for research and real-world applications in computer vision. The project provides a comprehensive framework for building and training modern face recognition models using deep learning architectures. It includes components for face alignment, landmark localization, data preprocessing, and model training pipelines that allow developers to construct end-to-end facial recognition systems. The repository supports multiple neural...

Downloads: 0 This Week

Last Update: 2026-03-11
7

Xenia

Xbox 360 Emulator Research Project

Xenia is an open-source experimental emulator for the Xbox 360 that aims to let users run Xbox 360 games on Windows and other platforms by reverse-engineering the console’s hardware and firmware behavior in software. It implements the 360’s CPU (Xenon), GPU (including Direct3D shader logic), and system libraries to translate Xbox instructions into equivalent host machine operations, enabling many titles to launch and in some cases play at improved frame rates compared with the original hardware. Because Xbox 360 games use custom hardware features and proprietary APIs, Xenia developers have progressively mapped and translated these into PC-friendly code while balancing performance and accuracy, and the project includes compatibility tracking so users can see what games work and how well. ...

Downloads: 41 This Week

Last Update: 2026-02-18
8

OpenVINO

OpenVINO™ Toolkit repository

OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference. Boost deep learning performance in computer vision, automatic speech recognition, natural language processing and other common tasks. Use models trained with popular frameworks like TensorFlow, PyTorch and more. Reduce resource demands and efficiently deploy on a range of Intel® platforms from edge to cloud. This open-source version includes several components, namely Model Optimizer, OpenVINO™ Runtime, and Post-Training Optimization Tool, as well as CPU, GPU, MYRIAD, multi-device, and heterogeneous plugins to accelerate deep learning inference on Intel® CPUs and Intel® Processor Graphics. ...

Downloads: 30 This Week

Last Update: 2026-03-25
9

ArrayFire

ArrayFire, a general purpose GPU library

ArrayFire is a general-purpose tensor library that simplifies the process of software development for the parallel architectures found in CPUs, GPUs, and other hardware acceleration devices. The library serves users in every technical computing market. Data structures in ArrayFire are smartly managed to avoid costly memory transfers and to take advantage of each performance feature provided by the underlying hardware. The community of ArrayFire developers invites you to build with us if...

Downloads: 1 This Week

Last Update: 2025-09-05
10

ncnn

High-performance neural network inference framework for mobile

ncnn is a high-performance neural network inference framework designed specifically for mobile platforms. It puts artificial intelligence at your fingertips, has no third-party dependencies, and, according to the project, runs faster than all other known open source frameworks on mobile phone CPUs. ncnn allows developers to easily deploy deep learning models to mobile platforms and build intelligent apps. It is cross-platform and supports most commonly used CNN networks, including...

Downloads: 18 This Week

Last Update: 2026-01-13
11

LightLLM

LightLLM is a Python-based LLM (Large Language Model) inference

LightLLM is a high-performance inference and serving framework designed specifically for large language models, focusing on lightweight architecture, scalability, and efficient deployment. The framework enables developers to run and serve modern language models with significantly improved speed and resource efficiency compared to many traditional inference systems.

Downloads: 0 This Week

Last Update: 2026-03-05
12

RLax

Library of JAX-based building blocks for reinforcement learning agents

...It supports both on-policy and off-policy learning, as well as value-based, policy-based, and model-based approaches. RLax is fully JIT-compilable with JAX, enabling high-performance execution across CPU, GPU, and TPU backends. The library implements tools for Bellman equations, return distributions, general value functions, and policy optimization in both continuous and discrete action spaces. It integrates seamlessly with DeepMind’s Haiku (for neural network definition) and Optax (for optimization), making it a key component in modular RL pipelines.

Downloads: 0 This Week

Last Update: 2025-10-09
13

DLRM

An implementation of a deep learning recommendation model (DLRM)

...The architecture combines dense (MLP) and sparse (embedding) branches, then combines their outputs through dot products or other feature interactions before passing the result through further dense layers to predict click-through, ranking scores, or conversion probabilities. The implementation is optimized for performance at scale, supporting multi-GPU and multi-node execution, quantization, embedding partitioning, and pipelined I/O to feed huge embeddings efficiently. It includes data loaders for standard benchmarks (like Criteo), training scripts, evaluation tools, and capabilities like mixed precision, gradient compression, and memory fusion to maximize throughput.

Downloads: 0 This Week

Last Update: 2026-01-12
14

CTranslate2

Fast inference engine for Transformer models

CTranslate2 is a C++ and Python library for efficient inference with Transformer models. The project implements a custom runtime that applies many performance optimization techniques, such as weight quantization, layer fusion, padding removal, batch reordering, in-place operations, and caching, to accelerate Transformer models and reduce their memory usage on CPU and GPU. On supported models and tasks, execution is significantly faster and requires fewer resources than general-purpose deep learning frameworks. ...

Downloads: 0 This Week

Last Update: 2026-02-04
15

OpenFold

Trainable, memory-efficient, and GPU-friendly PyTorch reproduction

OpenFold carefully reproduces (almost) all of the features of the original open source AlphaFold 2 inference code (v2.0.1). The sole exception is model ensembling, which fared poorly in DeepMind's own ablation testing and is being phased out in future DeepMind experiments. It is omitted here for the sake of reducing clutter. In cases where the Nature paper differs from the source, we always defer to the latter. OpenFold is trainable in full precision, half precision, or bfloat16 with or without...

Downloads: 0 This Week

Last Update: 2025-04-26
16

Halide

A language for fast, portable data-parallel computation

Halide is a programming language for fast, portable data-parallel computation. It was designed to make writing high-performance image and array processing code much easier on modern machines. It works on all major operating systems and with several CPU architectures (X86, ARM, MIPS, Hexagon, PowerPC) and GPU Compute APIs (CUDA, OpenCL, OpenGL, among others). It isn't a standalone programming language, however; rather, it is embedded in C++, which means that you write C++ code that builds an in-memory representation of a Halide pipeline using Halide's C++ API. ...

Downloads: 0 This Week

Last Update: 2025-09-17
17

LightGBM

Gradient boosting framework based on decision tree algorithms

...LightGBM supports parallel and GPU learning, and can handle large-scale data. It has become widely used for ranking, classification, and many other machine learning tasks.

Downloads: 0 This Week

Last Update: 2025-02-15
18

MNN

MNN is a blazing fast, lightweight deep learning framework

...On the Android platform, the core .so library is about 400KB, the OpenCL .so is about 400KB, and the Vulkan .so is about 400KB. Supports hybrid computing on multiple devices. Currently supports CPU and GPU.

Downloads: 12 This Week

Last Update: 2026-04-07
19

DINOv3

Reference PyTorch implementation and models for DINOv3

DINOv3 is the third-generation iteration of Meta’s self-supervised visual representation learning framework, building upon the ideas from DINO and DINOv2. It continues the paradigm of learning strong image representations without labels using teacher–student distillation, but introduces a simplified and more scalable training recipe that performs well across datasets and architectures. DINOv3 removes the need for complex augmentations or momentum encoders, streamlining the pipeline while...

Downloads: 14 This Week

Last Update: 2026-03-30
20

GPUPixel

Real-time image and video processing library similar to GPUImage

GPUPixel is a real-time image and video processing library written in C++11, based on OpenGL/ES. It offers functionalities similar to GPUImage, including built-in beauty filters, enabling efficient processing and rendering of visual effects on images and videos.

Downloads: 1 This Week

Last Update: 2026-01-23
21

RLHF-Reward-Modeling

Recipes to train reward model for RLHF

RLHF-Reward-Modeling is an open-source research framework focused on training reward models used in reinforcement learning from human feedback for large language models. In RLHF pipelines, reward models are responsible for evaluating generated responses and assigning scores that guide the model toward outputs that better match human preferences. The repository provides training recipes and implementations for building reward and preference models using modern machine learning frameworks. It...

Downloads: 1 This Week

Last Update: 2026-03-06
22

xFormers

Hackable and optimized Transformers building blocks

xformers is a modular, performance-oriented library of transformer building blocks, designed to allow researchers and engineers to compose, experiment, and optimize transformer architectures more flexibly than monolithic frameworks. It abstracts components like attention layers, feedforward modules, normalization, and positional encoding, so you can mix and match or swap optimized kernels easily. One of its key goals is efficient attention: it supports dense, sparse, low-rank, and...

Downloads: 1 This Week

Last Update: 2026-02-20
23

Novabench

Benchmark CPU, GPU, memory, and storage

Novabench is computer benchmarking software that helps users evaluate and compare the performance of their system’s CPU, GPU, memory, and storage. It offers rapid testing, enabling comparisons across millions of devices and providing insights for troubleshooting, upgrades, and performance optimization.

Downloads: 11 This Week

Last Update: 2024-10-29
24

LitServe

Minimal Python framework for building scalable AI inference servers fast

LitServe is a minimal Python framework designed for building custom AI inference servers with full control over how models are executed and served. It allows developers to define their own inference logic, making it suitable for complex systems such as multi-model pipelines, agents, and retrieval-augmented generation workflows. Unlike traditional serving tools that enforce rigid abstractions, LitServe focuses on flexibility by letting users control request handling, batching strategies, and...

Downloads: 0 This Week

Last Update: 2026-03-18
25

dots.ocr

Multilingual Document Layout Parsing in a Single Vision-Language Model

dots.ocr is a cutting-edge multilingual document parsing system built on a unified vision-language model that combines layout detection, text recognition, and structural understanding into a single architecture. Unlike traditional OCR pipelines that rely on multiple specialized components, dots.ocr integrates these processes end-to-end, reducing error propagation and improving consistency across tasks. The model is designed to recognize virtually any human script, making it highly effective...

Downloads: 0 This Week

Last Update: 2026-03-24

© 2026 Slashdot Media. All Rights Reserved.
