Search Results for "gpu max performance" - Page 6

Showing 457 open source projects for "gpu max performance"

1

CapFrameX

Frametime capture and analysis tool

...Importantly, the tool also integrates with sensor inputs (CPU, GPU, VRAM, temperatures, etc.) and overlays statistics in-game via RivaTuner Statistics Server, so you get in-situ feedback while you run. For benchmarking, it supports aggregation, filtering, outlier detection, and export of records to CSV/Excel for further analysis or reporting. The project is suited to reviewers, hardware testers, and power users who want to dig deeper than simple FPS numbers and diagnose performance issues.

Downloads: 11 This Week

Last Update: 3 days ago
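To make "deeper than simple FPS numbers" concrete, here is a minimal Python sketch of the kind of statistics a frametime analysis can derive from a capture. It is not CapFrameX's code or API: the function name `frametime_summary`, the nearest-rank percentile, and the 2.5x-median stutter threshold are all assumptions chosen for illustration.

```python
import statistics


def frametime_summary(frametimes_ms, stutter_factor=2.5):
    """Summarize frame times given in milliseconds.

    Hypothetical helper for illustration only; the stutter threshold
    (stutter_factor * median) is an assumed heuristic.
    """
    if not frametimes_ms:
        raise ValueError("need at least one frame time")

    ordered = sorted(frametimes_ms)
    n = len(ordered)

    # Average FPS over the whole capture: frames divided by elapsed seconds.
    avg_fps = 1000.0 * n / sum(ordered)

    # Nearest-rank 99th percentile frame time (integer ceil of 0.99 * n).
    rank = -(-99 * n // 100)
    p99_ms = ordered[rank - 1]

    # Frame rate that the slowest ~1% of frames correspond to.
    p1_fps = 1000.0 / p99_ms

    # Flag frames that took much longer than the typical (median) frame.
    median_ms = statistics.median(ordered)
    stutters = [t for t in frametimes_ms if t > stutter_factor * median_ms]

    return {
        "avg_fps": avg_fps,
        "p99_frametime_ms": p99_ms,
        "p1_fps": p1_fps,
        "median_frametime_ms": median_ms,
        "stutter_count": len(stutters),
    }


if __name__ == "__main__":
    # Synthetic capture: 98 smooth frames at 10 ms and 2 spikes at 50 ms.
    sample = [10.0] * 98 + [50.0] * 2
    for key, value in frametime_summary(sample).items():
        print(f"{key}: {value}")
```

In the synthetic capture the average frame rate stays high while the percentile figure drops sharply, which is the gap between a single FPS number and a frametime-based view.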
2

AIDA64 Extreme

AIDA64 Extreme: Ultimate PC diagnostics & system info tool

AIDA64 Extreme - The Ultimate System Diagnostics & Benchmarking Tool

Unlock the full potential of your PC with AIDA64, the industry-leading system information, diagnostics, and benchmarking software. Trusted by PC enthusiasts, IT professionals, and overclockers, AIDA64 provides detailed insights into your hardware, software, and system performance. Optimize your device, troubleshoot issues, and push performance to the max.

Why Choose AIDA64?
- Comprehensive System Info: Get in-depth details on CPU, GPU, RAM, motherboard, and more.
- Advanced Diagnostics: Identify hardware issues and monitor system health in real-time.
- Benchmarking Power: Test CPU, GPU, and memory performance with accurate metrics...

1 Review

Downloads: 96 This Week

Last Update: 2025-06-03
3

FurMark

GPU stress test OpenGL and Vulkan graphics benchmark Windows/Linux

FurMark is an intensive benchmarking tool designed to evaluate the performance of graphics cards using fur rendering algorithms. This tool is particularly effective in generating high workloads that can significantly increase the temperature of the GPU, making it a useful utility for testing the stability and stress tolerance of graphics cards. By simulating demanding rendering tasks, FurMark serves as a comprehensive test for assessing the robustness and thermal performance of GPUs under extreme conditions. ...

Downloads: 302 This Week

Last Update: 2024-10-28
4

WhisperLive

A nearly-live implementation of OpenAI's Whisper

WhisperLive is a “nearly live” implementation of OpenAI’s Whisper model focused on real-time transcription. It runs as a server–client system in which the server hosts a Whisper backend and clients stream audio to be transcribed with very low delay. The project supports multiple inference backends, including Faster-Whisper, NVIDIA TensorRT, and OpenVINO, allowing you to target GPUs and different CPU architectures efficiently. It can handle microphone input, pre-recorded audio files, and...

Downloads: 7 This Week

Last Update: 2026-03-17
5

Qwen

The official repo of Qwen chat & pretrained large language model

Qwen is a series of large language models developed by Alibaba Cloud, consisting of various pretrained versions like Qwen-1.8B, Qwen-7B, Qwen-14B, and Qwen-72B. These models, which range from smaller to larger configurations, are designed for a wide range of natural language processing tasks. They are openly available for research and commercial use, with Qwen's code and model weights shared on GitHub. Qwen's capabilities include text generation, comprehension, and conversation, making it a...

1 Review

Downloads: 16 This Week

Last Update: 2026-03-05
6

node-llama-cpp

Run AI models locally on your machine with node.js bindings for llama

...The system automatically detects the available hardware on a machine and selects the most appropriate compute backend, including CPU or GPU acceleration. Developers can use the library to perform tasks such as text generation, conversational chat, embedding generation, and structured output generation. Because it runs models locally, the platform is particularly useful for privacy-sensitive environments or offline AI deployments.

Downloads: 2 This Week

Last Update: 2026-03-17
7

Lux.jl

Elegant and Performant Deep Learning

Lux.jl is a lightweight and extensible deep learning framework in Julia designed for speed, composability, and clarity. Unlike traditional machine learning libraries that bundle training logic and models, Lux separates model definitions from training routines, encouraging modularity and ease of experimentation. It integrates seamlessly with SciML and other Julia packages, supporting neural differential equations and scientific machine learning workflows.

Downloads: 4 This Week

Last Update: 6 days ago
8

CUDA Agent

Large-Scale Agentic RL for High-Performance CUDA Kernel Generation

CUDA Agent is a research-driven agentic reinforcement learning system designed to automatically generate and optimize high-performance CUDA kernels for GPU workloads. The project addresses the long-standing challenge that efficient CUDA programming typically requires deep hardware expertise by training an autonomous coding agent capable of iterative improvement through execution feedback. Its architecture combines large-scale data synthesis, a skill-augmented CUDA development environment, and long-horizon reinforcement learning to build intrinsic optimization capability rather than relying on simple post-hoc tuning. ...

Downloads: 0 This Week

Last Update: 2026-03-03
9

hashcat

World's fastest and most advanced password recovery utility

hashcat is the world's fastest and most advanced password recovery utility, supporting five unique modes of attack for over 300 highly-optimized hashing algorithms. hashcat currently supports CPUs, GPUs, and other hardware accelerators on Linux, Windows, and macOS, and has facilities to help enable distributed password cracking. Download the latest release and unpack it in the desired location. Please remember to use 7z x when unpacking the archive from the command line to ensure full file...

Downloads: 94 This Week

Last Update: 2025-08-23
10

Proxyman

Web Debugging Proxy for macOS, iOS, and Android

Don't let cumbersome web debugging tools hold you back. With Proxyman's native macOS app, you can capture, inspect, and manipulate HTTP(S) traffic with ease. It is intuitive, thoughtful, and built with meticulous attention to detail, and it comes with a comprehensive guide for setting up the iOS simulator and iOS and Android devices. Proxyman acts as a man-in-the-middle server that captures the traffic between your applications and an SSL web server. With the built-in macOS setup, you can inspect your HTTP/HTTPS Request and...

Downloads: 32 This Week

Last Update: 2 days ago
11

Handy STT

A free, open source, and extensible speech-to-text application

Handy is a free, open-source, offline speech-to-text application built for privacy, accessibility, and extensibility. Developed using Tauri (Rust + React/TypeScript), it runs natively across Windows, macOS, and Linux while performing local speech recognition without sending any audio to cloud servers. Handy allows users to start transcription instantly using a configurable keyboard shortcut—press to record, release to transcribe—and automatically pastes the resulting text into any active...

Downloads: 51 This Week

Last Update: 2026-04-27
12

LightGBM

Gradient boosting framework based on decision tree algorithms

...LightGBM supports parallel and GPU learning, and can handle large-scale data. It has become widely used for ranking, classification, and many other machine learning tasks.

Downloads: 1 This Week

Last Update: 2025-02-15
13

Xenia

Xbox 360 Emulator Research Project

Xenia is an open-source experimental emulator for the Xbox 360 that aims to let users run Xbox 360 games on Windows and other platforms by reverse-engineering the console’s hardware and firmware behavior in software. It implements the 360’s CPU (Xenon), GPU (including Direct3D shader logic), and system libraries to translate Xbox instructions into equivalent host machine operations, enabling many titles to launch and in some cases play at improved frame rates compared with the original hardware. Because Xbox 360 games use custom hardware features and proprietary APIs, Xenia developers have progressively mapped and translated these into PC-friendly code while balancing performance and accuracy, and the project includes compatibility tracking so users can see what games work and how well. ...

Downloads: 48 This Week

Last Update: 2026-02-18
14

JAX Toolbox

Public CI, Docker images for popular JAX libraries

JAX Toolbox is a development toolkit designed to streamline and optimize the use of JAX for machine learning and high-performance computing on NVIDIA GPUs. It provides prebuilt Docker images, continuous integration pipelines, and optimized example implementations that help developers quickly set up and run JAX workloads without complex configuration. The project supports popular JAX-based frameworks and models, including architectures used for large-scale pretraining such as GPT and LLaMA...

Downloads: 2 This Week

Last Update: 3 days ago
15

ANE Training

Training neural networks on Apple Neural Engine via APIs

...The repository implements a from-scratch transformer training pipeline capable of running both forward and backward passes on ANE hardware without relying on CoreML, Metal, or GPU acceleration. It explores the internal software stack of the Apple Neural Engine by interfacing with private classes such as _ANEClient and compiling custom compute graphs in the MIL format. The project includes performance benchmarks and kernel breakdowns that show how different components of the training loop are distributed between the ANE and CPU. ...

Downloads: 2 This Week

Last Update: 2026-03-10
16

Humanoid-Gym

Reinforcement Learning for Humanoid Robot with Zero-Shot Sim2Real

Humanoid-Gym is a reinforcement learning framework designed to train locomotion and control policies for humanoid robots using high-performance simulation environments. The system is built on top of NVIDIA Isaac Gym, which allows large-scale parallel simulation of robotic environments directly on GPU hardware. Its primary goal is to enable efficient training of humanoid robots in simulation while enabling policies to transfer effectively to real-world hardware without additional training. ...

Downloads: 0 This Week

Last Update: 2026-03-15
17

ncnn

High-performance neural network inference framework for mobile

ncnn is a high-performance neural network inference computing framework designed specifically for mobile platforms. It brings artificial intelligence to your fingertips with no third-party dependencies, and the project claims it runs faster than all other known open-source frameworks on mobile phone CPUs. ncnn allows developers to easily deploy deep learning models to mobile platforms and create intelligent apps. It is cross-platform and supports most commonly used CNN networks, including...

Downloads: 22 This Week

Last Update: 2026-01-13
18

AceBase realtime database

A fast, low-memory, transactional, index- and query-enabled NoSQL database

A fast, low-memory, transactional, index- and query-enabled NoSQL database engine and server for node.js and the browser, with real-time data change notifications. Supports storing JSON objects, arrays, numbers, strings, booleans, dates, bigints, and binary (ArrayBuffer) data. Inspired by (and largely compatible with) the Firebase real-time database, with additional functionality and less data sharding/duplication. Capable of storing up to 2^48 (281 trillion) object nodes in a binary database file...

Downloads: 3 This Week

Last Update: 2026-04-16
19

Audiblez

Generate audiobooks from e-books

Audiblez is a tool for generating high-quality .m4b audiobooks directly from .epub e-books using the Kokoro-82M neural text-to-speech model. It focuses on making audiobook creation easy and fast: from a single command, the tool splits an e-book into chapters, synthesizes audio for each section, and then merges the results into a structured audiobook with chapter-based WAV files and a final .m4b container. The Kokoro-82M model it uses is compact (82M parameters) yet natural sounding, trained...

Downloads: 4 This Week

Last Update: 2025-11-30
20

OpenVINO

OpenVINO™ Toolkit repository

OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference. Boost deep learning performance in computer vision, automatic speech recognition, natural language processing and other common tasks. Use models trained with popular frameworks like TensorFlow, PyTorch and more. Reduce resource demands and efficiently deploy on a range of Intel® platforms from edge to cloud. This open-source version includes several components: namely Model Optimizer, OpenVINO™ Runtime, Post-Training Optimization Tool, as well as CPU, GPU, MYRIAD, multi device and heterogeneous plugins to accelerate deep learning inferencing on Intel® CPUs and Intel® Processor Graphics. ...

Downloads: 32 This Week

Last Update: 2026-03-25
21

Mach Engine

Zig game engine & graphics toolkit

Mach is a game engine and graphics toolkit written in Zig, built with the goal of enabling high-performance, truly cross-platform 2D, 3D, GUI, and visualization applications. The project aims to deliver a modular, robust foundation where graphics, input, windowing, and rendering are unified under a modern, low-level but ergonomic API. Because Mach is written in Zig (with some shader code / WGSL), it leverages Zig’s performance and modern systems-level features while offering safe-ish...

Downloads: 0 This Week

Last Update: 2026-04-10
22

face.evoLVe

High-Performance Face Recognition Library on PaddlePaddle & PyTorch

face.evoLVe is a high-performance face recognition library designed for research and real-world applications in computer vision. The project provides a comprehensive framework for building and training modern face recognition models using deep learning architectures. It includes components for face alignment, landmark localization, data preprocessing, and model training pipelines that allow developers to construct end-to-end facial recognition systems. The repository supports multiple neural...

Downloads: 0 This Week

Last Update: 2026-03-11
23

LitServe

Minimal Python framework for fast, scalable AI inference servers

LitServe is a minimal Python framework designed for building custom AI inference servers with full control over how models are executed and served. It allows developers to define their own inference logic, making it suitable for complex systems such as multi-model pipelines, agents, and retrieval-augmented generation workflows. Unlike traditional serving tools that enforce rigid abstractions, LitServe focuses on flexibility by letting users control request handling, batching strategies, and...

Downloads: 1 This Week

Last Update: 2026-03-18
24

ArrayFire

ArrayFire, a general purpose GPU library

ArrayFire is a general-purpose tensor library that simplifies the process of software development for the parallel architectures found in CPUs, GPUs, and other hardware acceleration devices. The library serves users in every technical computing market. Data structures in ArrayFire are smartly managed to avoid costly memory transfers and to take advantage of each performance feature provided by the underlying hardware. The community of ArrayFire developers invites you to build with us if...

Downloads: 1 This Week

Last Update: 2025-09-05
25

Superposition Benchmark (Unigine)

GPU benchmark testing graphics performance with realistic 3D scenes.

Superposition Benchmark by Unigine is a powerful GPU stress-testing and benchmarking tool designed to evaluate graphics performance using the Unigine 2 Engine. It features advanced visuals, real-time lighting, and physics simulations to test DirectX and OpenGL performance. Superposition provides detailed results, including frame rates, GPU temperatures, and stability data. It supports VR mode and 4K to 8K resolutions.

Downloads: 128 This Week

Last Update: 2025-10-07