Search Results for "gpu max performance" - Page 13

Showing 458 open source projects for "gpu max performance"

1

Veldrid

A low-level, portable graphics library for .NET

Veldrid is a low-level, portable graphics library for .NET, providing a unified API over multiple graphics backends such as Direct3D, Vulkan, OpenGL, and Metal. It enables developers to write high-performance, cross-platform graphics applications without being tied to a specific graphics API. Veldrid is suitable for game development, simulations, and other applications requiring advanced graphics capabilities.

Downloads: 5 This Week

Last Update: 2025-03-19
See Project
2

Forma

An efficient vector-graphics renderer

Forma is an experimental vector graphics renderer written in Rust, developed by Google to explore high-performance, parallelized rendering techniques across multiple platforms. The project aims to achieve portability, performance, simplicity, and small footprint through a streamlined four-stage rendering pipeline. Forma provides both CPU (software) and GPU (hardware) backends, relying on Rust’s SIMD auto-vectorization, Rayon for multithreading, and WebGPU (wgpu) for hardware acceleration. ...

Downloads: 3 This Week

Last Update: 6 days ago
See Project
3

FasterTransformer

Transformer related optimization, including BERT, GPT

FasterTransformer is a high-performance inference library designed to accelerate transformer-based models such as BERT, GPT, and T5 on NVIDIA GPUs. It provides optimized implementations of transformer encoder and decoder layers using CUDA, cuBLAS, and custom kernels to maximize throughput and minimize latency. The library integrates with TensorFlow, PyTorch, and the Triton inference server, allowing developers to add it to existing pipelines without major changes. It...

Downloads: 0 This Week

Last Update: 2026-03-18
See Project
4

FairScale

PyTorch extensions for high performance and large scale training

...FairScale puts emphasis on correctness and debuggability, offering hook points, logging, and reference examples for common trainer patterns. Although many ideas have since landed in core PyTorch, FairScale remains a valuable reference and a practical toolbox for squeezing more performance out of multi-GPU and multi-node jobs.

Downloads: 4 This Week

Last Update: 2025-10-07
See Project
5

Real-ESRGAN Video Enhance

Real-ESRGAN video upscaler with resumability

...It uses Real-ESRGAN-ncnn-Vulkan, FFmpeg and MediaInfo under the hood. REVE takes a segment-based approach to video upscaling, which lets it upscale and encode videos simultaneously. This notably improves performance and makes resumability possible. A portable Windows executable is available for Intel/AMD/Nvidia GPUs; it includes all the required binaries and models. No CUDA or PyTorch environment is needed.

Downloads: 16 This Week

Last Update: 2023-03-30
See Project
6

slide-element

Promise-based library for animating elements with dynamic heights

...The animations themselves are powered by the same mechanics used within CSS transitions, making it one of the best ways to pull it off in terms of performance.

Downloads: 0 This Week

Last Update: 2023-10-03
See Project
7

NBMiner

GPU Miner for ETH, RVN, BEAM, CFX, ZIL, AE, ERGO

nbminer.com & NBMiner_github are the only 2 officially maintained sites for publishing information and new releases of NBMiner. Be careful when downloading NBMiner binaries from other sources. For GPUs with Hynix GDDR6 memory, LHR mode is not recommended because of poor performance. Improved performance on Grin29 & AE. Added support for mining SERO (algo progpow_sero). Devfee can be set as a percentage in the range [0-5]: set it to ‘0’ to turn off devfee at the cost of a lower hash rate; otherwise, devfee = max(set_value, default_value). Memory timing optimization for Nvidia GDDR5 & GDDR5X GPUs, range [1-6]; a higher value gives a higher hashrate. ...

Downloads: 6 This Week

Last Update: 2022-10-25
See Project
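
The devfee rule quoted in the NBMiner description can be read as a small function. This is only a sketch of that one sentence, not NBMiner's code: the function name `effective_devfee` is hypothetical, and `default_value` is passed in as a parameter because the listing does not state the miner's default.

```python
def effective_devfee(set_value: float, default_value: float) -> float:
    """Illustrative reading of the rule in the listing (hypothetical helper).

    set_value:     user-chosen devfee percentage, documented range [0-5]
    default_value: the miner's built-in devfee; not given in the listing,
                   so it is an explicit assumption supplied by the caller
    """
    if not 0 <= set_value <= 5:
        raise ValueError("devfee percentage must be in the range [0-5]")
    if set_value == 0:
        # '0' turns the devfee off (the listing notes a lower hash rate).
        return 0.0
    # Otherwise the larger of the two values applies.
    return float(max(set_value, default_value))
```

Under this reading, a non-zero setting below the default has no effect; only `0` or a value above the default changes the fee.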
8

FLoops.jl

Fast sequential, threaded, and distributed for-loops for Julia

Fast sequential, threaded, and distributed for-loops for Julia, fold for humans. FLoops.jl provides a macro @floop. It can be used to generate a fast generic sequential and parallel iteration over complex collections. Furthermore, the loop written in @floop can be executed with any compatible executors. See FoldsThreads.jl for various thread-based executors that are optimized for different kinds of loops. FoldsCUDA.jl provides an executor for GPU. FLoops.jl also provides a simple distributed...

Downloads: 2 This Week

Last Update: 2023-11-13
See Project
9

Remotery

Single C file, Realtime CPU/GPU Profiler with Remote Web Viewer

Remotery is a real-time CPU/GPU profiler implemented as a single C file, providing developers with immediate insights into the performance of their applications. It features a remote web-based viewer that runs in browsers like Chrome, Firefox, and Safari, allowing for cross-platform performance analysis. Remotery supports profiling multiple threads and GPU contexts, offering a comprehensive view of an application's performance characteristics.

Downloads: 3 This Week

Last Update: 2025-03-19
See Project
10

Coqui STT

The deep learning toolkit for speech-to-text

Coqui STT is a fast, open-source, multi-platform, deep-learning toolkit for training and deploying speech-to-text models. Coqui STT is battle-tested in both production and research. Multiple possible transcripts, each with an associated confidence score. Experience the immediacy of script-to-performance. With Coqui text-to-speech, production times go from months to minutes. With Coqui, the post is a pleasure. Effortlessly clone the voices of your talent and have the clone handle the problems...

Downloads: 3 This Week

Last Update: 2022-09-03
See Project
11

Darknet

Convolutional Neural Networks

...Darknet is lightweight, fast, and easy to compile, making it suitable for research and production use. The repository provides pre-trained models, configuration files, and tools for training custom object detection models. With GPU acceleration via CUDA and OpenCV integration, it achieves high performance in image recognition tasks. Its simplicity, combined with powerful capabilities, has made Darknet one of the most influential projects in the computer vision community.

Downloads: 19 This Week

Last Update: 2026-05-02
See Project
12

TOTimer

Time-Out-Timer for single-threaded environment

Serves timeouts by calling a service function periodically, using an independent time source (toticker). If you need a timer and would like to try this one, please write to me briefly about your experience, and maybe with suggestions, of course. A 10 ms tick time should be possible, maybe less. DO NOT USE versions below 0.1.1; they are definitely buggy! Timers are easy to control via a handle, e.g. auto-create and start a timer via the totimer_setTimeout function. Optionally use callback functions if timeout...

Downloads: 0 This Week

Last Update: 2022-11-22
See Project
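
The pattern the TOTimer description outlines (a tick source, a service function polled from the main loop, handles, and optional callbacks) might look roughly like the sketch below. It is written in Python for illustration and is not TOTimer's API: the class `TimeoutTimers` and its methods are hypothetical names, and only the general idea comes from the listing.

```python
import itertools


class TimeoutTimers:
    """Hypothetical single-threaded timeout service (not TOTimer's API)."""

    def __init__(self):
        self._ticks = 0                      # current time in ticks
        self._pending = {}                   # handle -> (deadline, callback)
        self._expired = set()                # handles that have timed out
        self._handles = itertools.count(1)   # simple handle generator

    def tick(self):
        """Independent time source: advance by one tick (e.g. every 10 ms)."""
        self._ticks += 1

    def set_timeout(self, ticks, callback=None):
        """Create and start a timer; returns a handle for later queries."""
        handle = next(self._handles)
        self._pending[handle] = (self._ticks + ticks, callback)
        return handle

    def is_expired(self, handle):
        return handle in self._expired

    def service(self):
        """Call periodically from the main loop to fire due timers."""
        due = [h for h, (deadline, _) in self._pending.items()
               if deadline <= self._ticks]
        for handle in due:
            _, callback = self._pending.pop(handle)
            self._expired.add(handle)
            if callback is not None:
                callback(handle)


# Usage: a 3-tick timeout, serviced once per tick from a single loop.
fired = []
timers = TimeoutTimers()
h = timers.set_timeout(3, callback=fired.append)
for _ in range(3):
    timers.tick()
    timers.service()
```

Because timers only fire inside `service()`, callbacks run in the caller's own loop and no locking is needed, which is the usual reason for this design in single-threaded code.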
13

CUDA Pathtracer

GPU Raytracer from scratch in C++/CUDA

GPU-Raytracer is a high-performance, real-time ray tracing engine implemented in C++ and CUDA. It demonstrates the power of modern GPU architectures to handle complex lighting calculations, reflections, shadows, and global illumination in real time. This project is educational and experimental, providing insight into GPU parallelism and real-time rendering techniques.

Downloads: 3 This Week

Last Update: 2025-03-25
See Project
14

ASRT Speech Recognition

A Deep-Learning-Based Chinese Speech Recognition System

ASRT is an end-to-end deep-learning Chinese ASR system built with TensorFlow/Keras, using convolution + CTC and a Max-Entropy HMM language model. It provides a REST/gRPC server backend and client SDKs for multiple languages and platforms (Python, Java, Go, Windows). Notably lightweight, it performs well without needing GPU acceleration and runs across platforms, targeting developers and researchers building Chinese voice interfaces.

Downloads: 0 This Week

Last Update: 2025-07-03
See Project
15

Darktile

Darktile is a GPU rendered terminal emulator

Darktile is a GPU-rendered terminal emulator specifically designed for tiling window managers. It utilizes GPU acceleration to provide smooth and efficient rendering, supporting Unicode and a variety of themes. Darktile includes features like font ligatures, context-aware overlays (hints), and customizable cursors, enhancing the terminal experience for users who prefer tiling window environments.

Downloads: 6 This Week

Last Update: 2025-03-19
See Project
16

Ion

Portable suite of libraries and tools for building client applications

Ion is a modular C++ toolkit for building high-performance 2D/3D graphics applications with a strong emphasis on portability, correctness, and developer ergonomics. Rather than a monolithic engine, it offers focused libraries—math, image, GPU resource management, shader utilities, remote inspection, and platform abstractions—that you can adopt à la carte. The rendering layer wraps modern OpenGL/OpenGL ES concepts with a carefully layered API that tracks object lifetimes, deduplicates resources, and enables safe multithreaded recording of draw calls. ...

Downloads: 0 This Week

Last Update: 2025-10-10
See Project
17

TerraForge3D

Cross Platform Professional Procedural Terrain Generation & Texturing

TerraForge3D is an advanced procedural terrain generation tool that allows users to create stunning, customizable landscapes using an intuitive node-based interface. Built in C++ with Vulkan, ImGui, and ImGuiNodeEditor, TerraForge3D supports real-time editing and visualization of terrain, water, and environmental effects. It’s ideal for game developers, VFX artists, and simulation creators who want full control over terrain features without relying on pre-built assets. The software also...

Downloads: 2 This Week

Last Update: 2025-03-25
See Project
18

JAMon API

Monitor Java applications - SQL, HTTP, Methods, Exceptions and more.

JAMon API is a free, simple, high-performance, thread-safe Java API that allows developers to easily monitor the performance and scalability of production applications. JAMon tracks hits, execution times (total, avg, min, max, std dev), and more. JAMon Users Manual: For more on JAMon, including installing, configuring, and using it, see http://jamonapi.sourceforge.net/

4 Reviews

Downloads: 33 This Week

Last Update: 2024-05-09
See Project
19

Big Sleep

A simple command line tool for text to image generation

...You can restrict the number of classes that Big Sleep uses for BigGAN with the --max-classes flag (e.g. 15 classes). This may give extra stability during training, at the cost of lost expressivity.

Downloads: 0 This Week

Last Update: 2022-08-09
See Project
20

feathersui-starling

User interface components for Starling Framework, ActionScript 3

Feathers UI (Starling edition) is a lightweight, open-source library of user interface components designed specifically for use with the Starling Framework. It allows ActionScript developers to build GPU-accelerated interfaces for games and applications that run on desktop and mobile platforms. With a focus on performance and flexibility, Feathers UI includes buttons, sliders, lists, navigators, and layout containers optimized for Starling's rendering pipeline.

Downloads: 0 This Week

Last Update: 2025-07-04
See Project
21

MoCo v3

PyTorch implementation of MoCo v3

MoCo v3 is a PyTorch reimplementation of Momentum Contrast v3 (MoCo v3), Facebook Research’s state-of-the-art self-supervised learning framework for visual representation learning using ResNet and Vision Transformer (ViT) backbones. Originally developed in TensorFlow for TPUs, this version faithfully reproduces the paper’s results on GPUs while offering an accessible and scalable PyTorch interface. MoCo v3 introduces improvements for training self-supervised ViTs by combining contrastive...

Downloads: 1 This Week

Last Update: 2026-05-02
See Project
22

Robust Video Matting (RVM)

Robust Video Matting in PyTorch, TensorFlow, TensorFlow.js, ONNX

We introduce a robust, real-time, high-resolution human video matting method that achieves new state-of-the-art performance. Our method is much lighter than previous approaches and can process 4K at 76 FPS and HD at 104 FPS on an Nvidia GTX 1080Ti GPU. Unlike most existing methods that perform video matting frame-by-frame as independent images, our method uses a recurrent architecture to exploit temporal information in videos and achieves significant improvements in temporal coherence and matting quality. ...

Downloads: 6 This Week

Last Update: 2023-03-25
See Project
23

Skija

Java bindings for Skia

Skija is a high-performance Java bindings library for the Skia graphics engine, allowing JVM and Kotlin applications to access the full capabilities of Skia’s 2D GPU-accelerated graphics without writing native code. Skia is the same graphics engine used in Chrome, Android, Flutter, and other platforms, and Skija leverages this robust foundation to provide fast rendering of paths, text, images, transformations, filters, and animations within desktop and embedded Java environments. ...

Downloads: 0 This Week

Last Update: 2026-01-17
See Project
24

robot-monitor-graphics

Simple and quick 2D/3D graphics engine for simulation.

...Loads 2D/3D model files and texture files and makes it easy to control the pose and appearance of those 2D/3D objects. Lighting and shadow mapping are done in back-end processes. Performance is smooth since the rendering engine uses shader programs and GPU power. The project uses the OpenGL API and other external open source packages like GLFW and wxWidgets, and is designed as a cross-platform API.

Downloads: 0 This Week

Last Update: 2021-11-09
See Project
25

Cuda Simulated Annealing GPU Route Plan

An Optimized GPU-Accelerated Route Planning of Multi-UAV Systems Using

CUDA code for the article "An Optimized GPU-Accelerated Route Planning of Multi-UAV Systems Using Simulated Annealing". Using multiple unmanned aerial vehicles (UAVs) in a mission makes flight route planning more complicated and slower. To obtain better performance, most researchers in the literature propose evolutionary algorithms and heuristic-based artificial intelligence approaches as optimization techniques.

Downloads: 0 This Week

Last Update: 2021-09-04
See Project