Search Results for "gpu max performance" - Page 11

Showing 388 open source projects for "gpu max performance"

  • 1
    SIG Rust

    Rust language bindings for TensorFlow

    SIG Rust provides idiomatic Rust bindings for TensorFlow, making it possible for developers to work with TensorFlow functionality from within the Rust programming language. Rather than replacing TensorFlow itself, it acts as an integration layer that connects Rust applications to the TensorFlow C API. The repository is designed for developers who want Rust’s performance, safety, and systems programming strengths while still accessing TensorFlow’s machine learning capabilities. It includes...
    Downloads: 0 This Week
  • 2
    Chinese-LLaMA-Alpaca-2 v2.0

    Chinese LLaMA & Alpaca large language model + local CPU/GPU training

    This project has open-sourced the Chinese LLaMA model and the Alpaca large model with instruction fine-tuning to further promote the open research of large models in the Chinese NLP community. Based on the original LLaMA, these models expand the Chinese vocabulary and use Chinese data for secondary pre-training, which further improves the basic semantic understanding of Chinese. At the same time, the Chinese Alpaca model further uses Chinese instruction data for fine-tuning, which...
    Downloads: 0 This Week
  • 3
    SRM

    C library for the development of Linux OpenGL DRM/KMS apps

    SRM is a C library that simplifies the development of Linux DRM/KMS API applications. With SRM, you can focus on the OpenGL ES 2.0 logic of your application. For each available display, you can start a rendering thread that triggers common events like initializeGL(), paintGL(), resizeGL(), pageFlipped() and uninitializeGL(). SRM allows you to use multiple GPUs simultaneously and automatically finds the most efficient configuration. It also offers functions for creating OpenGL textures,...
    Downloads: 0 This Week
  • 4
    OptiMate

    Libraries for optimizing AI models, inference speed, and GPU usage

    ...One of the core components, Speedster, focuses on accelerating model inference by applying state of the art optimization techniques to increase performance while lowering operational costs. Another component, Nos, targets infrastructure optimization by improving GPU utilization in Kubernetes clusters through dynamic partitioning and elastic resource quotas.
    Downloads: 2 This Week
  • 5
    LLaMA.go

    llama.go is like llama.cpp in pure Golang

    llama.go is like llama.cpp in pure Golang. The code of the project is based on the legendary ggml.cpp framework of Georgi Gerganov, written in C++ with the same attitude to performance and elegance. Both models store FP32 weights, so you'll need at least 32 GB of RAM (not VRAM or GPU RAM) for LLaMA-7B. Double that to 64 GB for LLaMA-13B.
    Downloads: 0 This Week
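
The memory figures quoted for LLaMA.go follow from simple arithmetic: FP32 uses 4 bytes per weight. A rough sketch, assuming nominal parameter counts of 7 and 13 billion (the exact counts are not given in the listing) and ignoring runtime overhead such as activations and caches:

```python
BYTES_PER_FP32 = 4  # one 32-bit float occupies 4 bytes


def fp32_weight_gb(num_params: float) -> float:
    """Return the size of FP32 weights in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_FP32 / 1e9


# Nominal parameter counts (assumption: "7B" and "13B" taken at face value).
MODELS = {"LLaMA-7B": 7e9, "LLaMA-13B": 13e9}

for name, params in MODELS.items():
    print(f"{name}: about {fp32_weight_gb(params):.0f} GB of weights")
```

Under these assumptions the weights alone come to roughly 28 GB and 52 GB, which is consistent with the 32 GB and 64 GB recommendations once working memory is added.
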
  • 6
    Ambient

    The multiplayer game engine

    Ambient is an open-source, cross-platform runtime and engine for building and deploying high-performance multiplayer games and 3D applications, using a modern stack built on Rust, WebAssembly (WASM), and WebGPU. It aims to make multiplayer game development accessible and flexible, providing an entity-component-system (ECS) at its core that doubles as a real-time in-game database; everything in the game — from world objects to runtime data — is represented as entities + components, which can...
    Downloads: 1 This Week
  • 7
    Alphafold

    Open source code for AlphaFold

    This package provides an implementation of the inference pipeline of AlphaFold v2.0. This is a completely new model that was entered in CASP14 and published in Nature. For simplicity, we refer to this model as AlphaFold throughout the rest of this document. Any publication that discloses findings arising from using this source code or the model parameters should cite the AlphaFold paper. Please also refer to the Supplementary Information for a detailed description of the method. You can use...
    Downloads: 5 This Week
  • 8
    MLPACK is a C++ machine learning library with emphasis on scalability, speed, and ease-of-use. Its aim is to make machine learning possible for novice users by means of a simple, consistent API, while simultaneously exploiting C++ language features to provide maximum performance and flexibility for expert users.
    More info and downloads: https://mlpack.org
    Git repo: https://github.com/mlpack/mlpack
    Downloads: 0 This Week
  • 9
    MetalPetal

    A GPU accelerated image and video processing framework built on Metal

    MetalPetal is an image processing framework based on Metal designed to provide real-time processing for still images and video with easy-to-use programming interfaces. This chapter covers the key concepts of MetalPetal, and will help you to get a better understanding of its design, implementation, performance implications, and best practices. An MTIImage object is a representation of an image to be processed or produced. It does not directly represent image bitmap data; instead it has all the...
    Downloads: 0 This Week
  • 10
    Veldrid

    A low-level, portable graphics library for .NET

    Veldrid is a low-level, portable graphics library for .NET, providing a unified API over multiple graphics backends such as Direct3D, Vulkan, OpenGL, and Metal. It enables developers to write high-performance, cross-platform graphics applications without being tied to a specific graphics API. Veldrid is suitable for game development, simulations, and other applications requiring advanced graphics capabilities.
    Downloads: 0 This Week
  • 11
    Forma

    An efficient vector-graphics renderer

    Forma is an experimental vector graphics renderer written in Rust, developed by Google to explore high-performance, parallelized rendering techniques across multiple platforms. The project aims to achieve portability, performance, simplicity, and small footprint through a streamlined four-stage rendering pipeline. Forma provides both CPU (software) and GPU (hardware) backends, relying on Rust’s SIMD auto-vectorization, Rayon for multithreading, and WebGPU (wgpu) for hardware acceleration. ...
    Downloads: 4 This Week
  • 12
    FasterTransformer

    Transformer related optimization, including BERT, GPT

    FasterTransformer is a high-performance inference library designed to accelerate transformer-based models such as BERT, GPT, and T5 on NVIDIA GPUs. It provides optimized implementations of transformer encoder and decoder layers using CUDA, cuBLAS, and custom kernels to maximize throughput and minimize latency. The library integrates with TensorFlow and PyTorch and can be served through a Triton backend, allowing developers to integrate it into existing pipelines without major changes. It...
    Downloads: 0 This Week
  • 13
    FairScale

    PyTorch extensions for high performance and large scale training

    ...FairScale puts emphasis on correctness and debuggability, offering hook points, logging, and reference examples for common trainer patterns. Although many ideas have since landed in core PyTorch, FairScale remains a valuable reference and a practical toolbox for squeezing more performance out of multi-GPU and multi-node jobs.
    Downloads: 0 This Week
  • 14
    slide-element

    Promise-based library for animating elements with dynamic heights

    ...The animations themselves are powered by the same mechanics used within CSS transitions, making it one of the best ways to pull it off in terms of performance.
    Downloads: 0 This Week
  • 15
    NBMiner

    GPU Miner for ETH, RVN, BEAM, CFX, ZIL, AE, ERGO

    nbminer.com and NBMiner_github are the only two officially maintained sites for publishing information and new releases of NBMiner. Be careful when downloading NBMiner binaries from other sources. For GPUs with Hynix GDDR6 memory, LHR mode is not recommended because of poor performance. Improved performance on Grin29 and AE. Added support for mining SERO (algo progpow_sero). The devfee can be set as a percentage in the range [0-5]. Set it to ‘0’ to turn off the devfee at the cost of a lower hash rate; otherwise, devfee = max(set_value, default_value). Memory timing optimization for Nvidia GDDR5 and GDDR5X GPUs, range [1-6]; a higher value gives a higher hashrate. ...
    Downloads: 6 This Week
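
The devfee rule quoted in the NBMiner notes can be written out as a small function. This is a sketch of the stated rule only, not NBMiner code: the function name is hypothetical, and so is the default of 2 percent used in the example, since the listing does not state the default.

```python
def effective_devfee(set_value: float, default_value: float) -> float:
    """Apply the stated rule: 0 disables the fee, otherwise max(set, default)."""
    if not 0 <= set_value <= 5:
        raise ValueError("devfee must be a percentage in the range [0, 5]")
    if set_value == 0:
        return 0.0  # fee turned off (the notes say this lowers the hash rate)
    return float(max(set_value, default_value))


# Hypothetical default of 2 percent, chosen only for illustration.
HYPOTHETICAL_DEFAULT = 2.0

for requested in (0, 1, 3):
    print(requested, "->", effective_devfee(requested, HYPOTHETICAL_DEFAULT))
```

Under this reading, any nonzero setting below the default is raised to the default, so only an explicit 0 removes the fee.
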
  • 16
    FLoops.jl

    Fast sequential, threaded, and distributed for-loops for Julia

    Fast sequential, threaded, and distributed for-loops for Julia: fold for humans. FLoops.jl provides a macro, @floop, that can be used to generate fast generic sequential and parallel iteration over complex collections. Furthermore, a loop written with @floop can be executed with any compatible executor. See FoldsThreads.jl for various thread-based executors that are optimized for different kinds of loops. FoldsCUDA.jl provides an executor for GPUs. FLoops.jl also provides a simple distributed...
    Downloads: 0 This Week
  • 17
    Coqui STT

    The deep learning toolkit for speech-to-text

    Coqui STT is a fast, open-source, multi-platform, deep-learning toolkit for training and deploying speech-to-text models. Coqui STT is battle-tested in both production and research. Multiple possible transcripts, each with an associated confidence score. Experience the immediacy of script-to-performance. With Coqui text-to-speech, production times go from months to minutes. With Coqui, the post is a pleasure. Effortlessly clone the voices of your talent and have the clone handle the problems...
    Downloads: 6 This Week
  • 18
    Remotery

    Single C file, Realtime CPU/GPU Profiler with Remote Web Viewer

    Remotery is a real-time CPU/GPU profiler implemented as a single C file, providing developers with immediate insights into the performance of their applications. It features a remote web-based viewer that runs in browsers like Chrome, Firefox, and Safari, allowing for cross-platform performance analysis. Remotery supports profiling multiple threads and GPU contexts, offering a comprehensive view of an application's performance characteristics.
    Downloads: 0 This Week
  • 19

    TOTimer

    Time-Out-Timer for single-threaded environments

    Serves timeouts by periodically calling a service function and using an independent time source (toticker). If you need a timer and would like to try this one, please write to me briefly about your experience, and with suggestions, of course. A 10 ms tick time should be possible, maybe less. DO NOT USE versions earlier than 0.1.1; they are definitely buggy! Timers are easy to control via a handle, e.g. automatically create and start a timer via the totimer_setTimeout function. Optionally use callback functions if timeout...
    Downloads: 0 This Week
  • 20
    Darknet

    Convolutional Neural Networks

    ...Darknet is lightweight, fast, and easy to compile, making it suitable for research and production use. The repository provides pre-trained models, configuration files, and tools for training custom object detection models. With GPU acceleration via CUDA and OpenCV integration, it achieves high performance in image recognition tasks. Its simplicity, combined with powerful capabilities, has made Darknet one of the most influential projects in the computer vision community.
    Downloads: 28 This Week
  • 21
    CUDA Pathtracer

    GPU Raytracer from scratch in C++/CUDA

    GPU-Raytracer is a high-performance, real-time ray tracing engine implemented in C++ and CUDA. It demonstrates the power of modern GPU architectures to handle complex lighting calculations, reflections, shadows, and global illumination in real time. This project is educational and experimental, providing insight into GPU parallelism and real-time rendering techniques.
    Downloads: 0 This Week
  • 22
    ASRT Speech Recognition

    A Deep-Learning-Based Chinese Speech Recognition System

    ASRT is an end-to-end deep-learning Chinese ASR system built with TensorFlow/Keras, using convolution + CTC and a Max-Entropy HMM language model. It provides a REST/gRPC server backend and client SDKs in multiple languages (Python, Java, Go, Windows). Notably lightweight, it performs well without needing GPU acceleration and runs across platforms, targeting developers and researchers building Chinese voice interfaces.
    Downloads: 0 This Week
  • 23
    Darktile

    Darktile is a GPU rendered terminal emulator

    Darktile is a GPU-rendered terminal emulator specifically designed for tiling window managers. It utilizes GPU acceleration to provide smooth and efficient rendering, supporting Unicode and a variety of themes. Darktile includes features like font ligatures, context-aware overlays (hints), and customizable cursors, enhancing the terminal experience for users who prefer tiling window environments.
    Downloads: 1 This Week
  • 24
    Ion

    Portable suite of libraries and tools for building client applications

    Ion is a modular C++ toolkit for building high-performance 2D/3D graphics applications with a strong emphasis on portability, correctness, and developer ergonomics. Rather than a monolithic engine, it offers focused libraries—math, image, GPU resource management, shader utilities, remote inspection, and platform abstractions—that you can adopt à la carte. The rendering layer wraps modern OpenGL/OpenGL ES concepts with a carefully layered API that tracks object lifetimes, deduplicates resources, and enables safe multithreaded recording of draw calls. ...
    Downloads: 0 This Week
  • 25
    TerraForge3D

    Cross Platform Professional Procedural Terrain Generation & Texturing

    TerraForge3D is an advanced procedural terrain generation tool that allows users to create stunning, customizable landscapes using an intuitive node-based interface. Built in C++ with Vulkan, ImGui, and ImGuiNodeEditor, TerraForge3D supports real-time editing and visualization of terrain, water, and environmental effects. It’s ideal for game developers, VFX artists, and simulation creators who want full control over terrain features without relying on pre-built assets. The software also...
    Downloads: 9 This Week