Search Results for "gpu max performance" - Page 13

Showing 458 open source projects for "gpu max performance"

  • 1
    Veldrid

    A low-level, portable graphics library for .NET

    Veldrid is a low-level, portable graphics library for .NET, providing a unified API over multiple graphics backends such as Direct3D, Vulkan, OpenGL, and Metal. It enables developers to write high-performance, cross-platform graphics applications without being tied to a specific graphics API. Veldrid is suitable for game development, simulations, and other applications requiring advanced graphics capabilities.
    Downloads: 5 This Week
  • 2
    Forma

    An efficient vector-graphics renderer

    Forma is an experimental vector graphics renderer written in Rust, developed by Google to explore high-performance, parallelized rendering techniques across multiple platforms. The project aims to achieve portability, performance, simplicity, and small footprint through a streamlined four-stage rendering pipeline. Forma provides both CPU (software) and GPU (hardware) backends, relying on Rust’s SIMD auto-vectorization, Rayon for multithreading, and WebGPU (wgpu) for hardware acceleration. ...
    Downloads: 3 This Week
  • 3
    FasterTransformer

    Transformer related optimization, including BERT, GPT

    FasterTransformer is a high-performance inference library designed to accelerate transformer-based models such as BERT, GPT, and T5 on NVIDIA GPUs. It provides optimized implementations of transformer encoder and decoder layers using CUDA, cuBLAS, and custom kernels to maximize throughput and minimize latency. The library supports multiple deep learning frameworks, including TensorFlow, PyTorch, and Triton, allowing developers to integrate it into existing pipelines without major changes. It...
    Downloads: 0 This Week
  • 4
    FairScale

    PyTorch extensions for high performance and large scale training

    ...FairScale puts emphasis on correctness and debuggability, offering hook points, logging, and reference examples for common trainer patterns. Although many ideas have since landed in core PyTorch, FairScale remains a valuable reference and a practical toolbox for squeezing more performance out of multi-GPU and multi-node jobs.
    Downloads: 4 This Week
  • 5
    Real-ESRGAN Video Enhance

    Real-ESRGAN video upscaler with resumability

    ...It utilizes Real-ESRGAN-ncnn-vulkan, FFmpeg and MediaInfo under the hood. REVE employs a segment-based approach to video upscaling, allowing it to upscale and encode videos simultaneously. This notably improves performance and makes upscaling resumable. A Windows executable is available for Intel/AMD/Nvidia GPUs. The executable is portable and includes all the required binaries and models. No CUDA or PyTorch environment is needed.
    Downloads: 16 This Week
  • 6
    slide-element

    Promise-based library for animating elements with dynamic heights

    ...The animations themselves are powered by the same mechanics used within CSS transitions, making it one of the best ways to pull it off in terms of performance.
    Downloads: 0 This Week
  • 7
    NBMiner

    GPU Miner for ETH, RVN, BEAM, CFX, ZIL, AE, ERGO

    nbminer.com and NBMiner_github are the only two officially maintained sites for publishing information and new releases of NBMiner. Be careful when downloading NBMiner binaries from other sources. For GPUs with Hynix GDDR6 memory, LHR mode is not recommended because of poor performance. Improve performance on Grin29 & AE. Add support for mining SERO (algo progpow_sero). Change devfee as a percentage, range [0-5]. Set to ‘0’ to turn off devfee, at the cost of a lower hash rate. Otherwise, devfee = max(set_value, default_value). Memory timing optimization for Nvidia GDDR5 & GDDR5X GPUs, range [1-6]. A higher value means a higher hashrate. ...
    Downloads: 6 This Week
  • 8
    FLoops.jl

    Fast sequential, threaded, and distributed for-loops for Julia

    Fast sequential, threaded, and distributed for-loops for Julia, fold for humans. FLoops.jl provides a macro, @floop, that can be used to generate fast generic sequential and parallel iteration over complex collections. Furthermore, a loop written with @floop can be executed with any compatible executor. See FoldsThreads.jl for various thread-based executors that are optimized for different kinds of loops. FoldsCUDA.jl provides an executor for GPU. FLoops.jl also provides a simple distributed...
    Downloads: 2 This Week
  • 9
    Remotery

    Single C file, Realtime CPU/GPU Profiler with Remote Web Viewer

    Remotery is a real-time CPU/GPU profiler implemented as a single C file, providing developers with immediate insights into the performance of their applications. It features a remote web-based viewer that runs in browsers like Chrome, Firefox, and Safari, allowing for cross-platform performance analysis. Remotery supports profiling multiple threads and GPU contexts, offering a comprehensive view of an application's performance characteristics.
    Downloads: 3 This Week
  • 10
    Coqui STT

    The deep learning toolkit for speech-to-text

    Coqui STT is a fast, open-source, multi-platform, deep-learning toolkit for training and deploying speech-to-text models. Coqui STT is battle-tested in both production and research. Multiple possible transcripts, each with an associated confidence score. Experience the immediacy of script-to-performance. With Coqui text-to-speech, production times go from months to minutes. With Coqui, the post is a pleasure. Effortlessly clone the voices of your talent and have the clone handle the problems...
    Downloads: 3 This Week
  • 11
    Darknet

    Convolutional Neural Networks

    ...Darknet is lightweight, fast, and easy to compile, making it suitable for research and production use. The repository provides pre-trained models, configuration files, and tools for training custom object detection models. With GPU acceleration via CUDA and OpenCV integration, it achieves high performance in image recognition tasks. Its simplicity, combined with powerful capabilities, has made Darknet one of the most influential projects in the computer vision community.
    Downloads: 19 This Week
  • 12

    TOTimer

    Time-Out-Timer for single-threaded environments

    Serves timeouts by periodically calling a service function and an independent time source (toticker). If you need a timer and would like to try this one, please write to me briefly about your experiences, and maybe with suggestions, of course. A 10 ms tick time should be possible, maybe less. DO NOT USE versions below 0.1.1; they are definitely buggy! Timers are easy to control via a handle, e.g. automatically create and start a timer via the totimer_setTimeout function. Optionally use callback functions if timeout...
    Downloads: 0 This Week
  • 13
    CUDA Pathtracer

    GPU Raytracer from scratch in C++/CUDA

    GPU-Raytracer is a high-performance, real-time ray tracing engine implemented from scratch in C++ and CUDA. It demonstrates the power of modern GPU architectures to handle complex lighting calculations, reflections, shadows, and global illumination in real time. This project is educational and experimental, providing insight into GPU parallelism and real-time rendering techniques.
    Downloads: 3 This Week
  • 14
    ASRT Speech Recognition

    A Deep-Learning-Based Chinese Speech Recognition System

    ASRT is an end-to-end deep-learning Chinese ASR system built with TensorFlow/Keras, using convolution + CTC and a Max-Entropy HMM language model. It provides a REST/gRPC server backend and client SDKs for multiple languages and platforms (Python, Java, Go, Windows). Notably lightweight, it performs well without needing GPU acceleration and runs across platforms, targeting developers and researchers building Chinese voice interfaces.
    Downloads: 0 This Week
  • 15
    Darktile

    Darktile is a GPU rendered terminal emulator

    Darktile is a GPU-rendered terminal emulator specifically designed for tiling window managers. It utilizes GPU acceleration to provide smooth and efficient rendering, supporting Unicode and a variety of themes. Darktile includes features like font ligatures, context-aware overlays (hints), and customizable cursors, enhancing the terminal experience for users who prefer tiling window environments.
    Downloads: 6 This Week
  • 16
    Ion

    Portable suite of libraries and tools for building client applications

    Ion is a modular C++ toolkit for building high-performance 2D/3D graphics applications with a strong emphasis on portability, correctness, and developer ergonomics. Rather than a monolithic engine, it offers focused libraries—math, image, GPU resource management, shader utilities, remote inspection, and platform abstractions—that you can adopt à la carte. The rendering layer wraps modern OpenGL/OpenGL ES concepts with a carefully layered API that tracks object lifetimes, deduplicates resources, and enables safe multithreaded recording of draw calls. ...
    Downloads: 0 This Week
  • 17
    TerraForge3D

    Cross Platform Professional Procedural Terrain Generation & Texturing

    TerraForge3D is an advanced procedural terrain generation tool that allows users to create stunning, customizable landscapes using an intuitive node-based interface. Built in C++ with Vulkan, ImGui, and ImGuiNodeEditor, TerraForge3D supports real-time editing and visualization of terrain, water, and environmental effects. It’s ideal for game developers, VFX artists, and simulation creators who want full control over terrain features without relying on pre-built assets. The software also...
    Downloads: 2 This Week
  • 18
    JAMon API

    Monitor Java applications - SQL, HTTP, Methods, Exceptions and more.

    JAMon API is a free, simple, high-performance, thread-safe Java API that allows developers to easily monitor the performance and scalability of production applications. JAMon tracks hits, execution times (total, avg, min, max, std dev), and more. JAMon Users Manual: for more on JAMon, including installing, configuring, and using it, see http://jamonapi.sourceforge.net/
    Downloads: 33 This Week
  • 19
    Big Sleep

    A simple command line tool for text to image generation

    ...You can set the number of classes that you wish to restrict Big Sleep to use for the Big GAN with the --max-classes flag as follows (ex. 15 classes). This may lead to extra stability during training, at the cost of lost expressivity.
    Downloads: 0 This Week
  • 20
    feathersui-starling

    User interface components for Starling Framework, ActionScript 3

    Feathers UI (Starling edition) is a lightweight, open-source library of user interface components designed specifically for use with the Starling Framework. It allows ActionScript developers to build GPU-accelerated interfaces for games and applications that run on desktop and mobile platforms. With a focus on performance and flexibility, Feathers UI includes buttons, sliders, lists, navigators, and layout containers optimized for Starling's rendering pipeline.
    Downloads: 0 This Week
  • 21
    MoCo v3

    PyTorch implementation of MoCo v3

    MoCo v3 is a PyTorch reimplementation of Momentum Contrast v3 (MoCo v3), Facebook Research’s state-of-the-art self-supervised learning framework for visual representation learning using ResNet and Vision Transformer (ViT) backbones. Originally developed in TensorFlow for TPUs, this version faithfully reproduces the paper’s results on GPUs while offering an accessible and scalable PyTorch interface. MoCo v3 introduces improvements for training self-supervised ViTs by combining contrastive...
    Downloads: 1 This Week
  • 22
    Robust Video Matting (RVM)

    Robust Video Matting in PyTorch, TensorFlow, TensorFlow.js, ONNX

    We introduce a robust, real-time, high-resolution human video matting method that achieves new state-of-the-art performance. Our method is much lighter than previous approaches and can process 4K at 76 FPS and HD at 104 FPS on an Nvidia GTX 1080Ti GPU. Unlike most existing methods that perform video matting frame-by-frame as independent images, our method uses a recurrent architecture to exploit temporal information in videos and achieves significant improvements in temporal coherence and matting quality. ...
    Downloads: 6 This Week
  • 23
    Skija

    Java bindings for Skia

    Skija is a high-performance Java bindings library for the Skia graphics engine, allowing JVM and Kotlin applications to access the full capabilities of Skia’s 2D GPU-accelerated graphics without writing native code. Skia is the same graphics engine used in Chrome, Android, Flutter, and other platforms, and Skija leverages this robust foundation to provide fast rendering of paths, text, images, transformations, filters, and animations within desktop and embedded Java environments. ...
    Downloads: 0 This Week
  • 24
    robot-monitor-graphics

    Simple and quick 2D/3D graphics engine for simulation.

    ...Loads 2D/3D model files and texture files and makes it easy to control the pose and appearance of those 2D/3D objects. Lighting and shadow mapping are done in back-end processes. Performance is smooth because the rendering engine uses shader programs and GPU power. The project uses the OpenGL API and other external open source packages such as GLFW and wxWidgets, and is designed as a cross-platform API.
    Downloads: 0 This Week
  • 25
    Cuda Simulated Annealing GPU Route Plan

    An Optimized GPU-Accelerated Route Planning of Multi-UAV Systems Using

    CUDA code for the article "An Optimized GPU-Accelerated Route Planning of Multi-UAV Systems Using Simulated Annealing". Using multiple unmanned aerial vehicles (UAVs) in a single mission makes flight route planning more complicated and slower. To obtain better performance, most researchers in the literature propose evolutionary algorithms and heuristic-based artificial intelligence approaches as optimization techniques.
    Downloads: 0 This Week