Showing 24 open source projects for "gpu max performance"

  • 1
    NVIDIA Warp

    A Python framework for accelerated simulation, data generation

    NVIDIA Warp is a high-performance Python framework developed by NVIDIA for building and accelerating simulation, graphics, and physics-based workloads using GPU computing. It enables developers to write kernel-level code in Python that is automatically compiled into efficient CUDA kernels, combining ease of use with near-native performance. The framework is designed for applications such as robotics, reinforcement learning, physical simulation, and differentiable computing, where performance and flexibility are critical. ...
    Downloads: 2 This Week
  • 2
    Meridian

    Meridian is a marketing mix modeling (MMM) framework

    ...Meridian uses the No-U-Turn Sampler (NUTS) for Markov Chain Monte Carlo (MCMC) sampling to produce statistically rigorous results, and it includes GPU acceleration to significantly reduce computation time.
    Downloads: 10 This Week
  • 3
    Nuclio

    High-Performance Serverless event and data processing platform

    Nuclio is an open source and managed serverless platform used to minimize development and maintenance overhead and automate the deployment of data-science-based applications. It delivers real-time performance, running up to 400,000 function invocations per second, and is portable across low-power devices, laptops, edge, on-prem, and multi-cloud deployments. The first serverless platform supporting GPUs for optimized utilization and sharing. Automated deployment to production in a few clicks from Jupyter notebook. Deploy one of...
    Downloads: 3 This Week
  • 4
    MNN

    MNN is a blazing fast, lightweight deep learning framework

    ...Android platform, the core .so is about 400KB, the OpenCL .so is about 400KB, and the Vulkan .so is about 400KB. Supports hybrid computing on multiple devices. Currently supports CPU and GPU.
    Downloads: 12 This Week
  • 5
    bitnet.cpp

    Official inference framework for 1-bit LLMs

    bitnet.cpp is the official open-source inference framework and ecosystem designed to enable ultra-efficient execution of 1-bit large language models (LLMs), which quantize most model parameters to ternary values (-1, 0, +1) while maintaining competitive performance with full-precision counterparts. At its core is bitnet.cpp, a highly optimized C++ backend that supports fast, low-memory inference on both CPUs and GPUs, enabling models such as BitNet b1.58 to run without requiring enormous...
    Downloads: 2 This Week
  • 6
    BentoML

    Unified Model Serving Framework

    ...Parallelize compute-intense model inference workloads to scale separately from the serving logic. Adaptive batching dynamically groups inference requests for optimal performance. Orchestrate a distributed inference graph with multiple models via Yatai on Kubernetes. Easily configure CUDA dependencies for running inference on a GPU. Automatically generate Docker images for production deployment.
    Downloads: 0 This Week
  • 7
    TensorFlow Model Garden

    Models and examples built with TensorFlow

    The TensorFlow Model Garden is a repository with a number of different implementations of state-of-the-art (SOTA) models and modeling solutions for TensorFlow users. We aim to demonstrate the best practices for modeling so that TensorFlow users can take full advantage of TensorFlow for their research and product development. To improve the transparency and reproducibility of our models, training logs on TensorBoard.dev are also provided for models to the extent possible though not all models...
    Downloads: 0 This Week
  • 8
    tvm

    Open deep learning compiler stack for CPU, GPU, etc.

    Apache TVM is an open source machine learning compiler framework for CPUs, GPUs, and machine learning accelerators. It aims to enable machine learning engineers to optimize and run computations efficiently on any hardware backend. The vision of the Apache TVM Project is to host a diverse community of experts and practitioners in machine learning, compilers, and systems architecture to build an accessible, extensible, and automated open-source framework that optimizes current and emerging...
    Downloads: 0 This Week
  • 9
    Deep Java Library (DJL)

    An engine-agnostic deep learning framework in Java

    ...Because DJL is deep learning engine agnostic, you don't have to make a choice between engines when creating your projects. You can switch engines at any point. To ensure the best performance, DJL also provides automatic CPU/GPU choice based on hardware configuration.
    Downloads: 1 This Week
  • 10
    QtAV

    A multimedia framework based on Qt and FFmpeg

    QtAV is a cross-platform, high-performance multimedia playback framework based on Qt and FFmpeg. Features: timeline preview, GPU decoding, etc.
    Downloads: 26 This Week
  • 11
    pipeless

    A computer vision framework to create and deploy apps in minutes

    ...You can easily use industry-standard models, such as YOLO, or load your custom model in one of the supported inference runtimes. Pipeless ships some of the most popular inference runtimes, such as the ONNX Runtime, allowing you to run inference with high performance on CPU or GPU out-of-the-box. You can deploy your Pipeless application with a single command to edge and IoT devices or the cloud.
    Downloads: 0 This Week
  • 12
    Louvre

    High performance C++ library for building Wayland compositors

    Louvre is a high-performance C++ library designed for building Wayland compositors with a strong emphasis on ease of development. It provides a default way for managing protocols, enabling you to have a basic but functional compositor from day one and progressively explore and customize its functionality to precisely match your requirements. Within Louvre, you have the flexibility to either employ your own OpenGL ES 2.0 shaders/programs, use the LPainter class for fundamental 2D...
    Downloads: 0 This Week
  • 13
    MetalPetal

    A GPU accelerated image and video processing framework built on Metal

    MetalPetal is an image processing framework based on Metal designed to provide real-time processing for still images and video with easy-to-use programming interfaces. This chapter covers the key concepts of MetalPetal, and will help you to get a better understanding of its design, implementation, performance implications, and best practices. An MTIImage object is a representation of an image to be processed or produced. It does not directly represent image bitmap data; instead, it has all the...
    Downloads: 0 This Week
  • 14
    Darknet

    Convolutional Neural Networks

    ...Darknet is lightweight, fast, and easy to compile, making it suitable for research and production use. The repository provides pre-trained models, configuration files, and tools for training custom object detection models. With GPU acceleration via CUDA and OpenCV integration, it achieves high performance in image recognition tasks. Its simplicity, combined with powerful capabilities, has made Darknet one of the most influential projects in the computer vision community.
    Downloads: 28 This Week
  • 15
    Tone.js

    A Web Audio framework for making interactive music in the browser

    ...It has common DAW (digital audio workstation) features for those looking to schedule events and tinker with pre-built synths and effects. There’s also a great selection of high-performance building blocks for signal-processing programmers familiar with languages like Max/MSP. With Tone.js they can create their own synthesizers, effects, and complex control signals.
    Downloads: 4 This Week
  • 16
    GPUImage 2

    Framework for GPU-accelerated video and image processing

    ...By relying on the GPU to run these operations, performance improvements of 100X or more over CPU-bound code can be realized. This is particularly noticeable in mobile or embedded devices. On an iPhone 4S, this framework can easily process 1080p video at over 60 FPS. On a Raspberry Pi 3, it can perform Sobel edge detection on live 720p video at over 20 FPS.
    Downloads: 0 This Week
  • 17
    Mocha.jl

    Deep Learning framework for Julia

    ...It offers efficient implementations of gradient descent solvers and common neural network layers, supports optional unsupervised pre-training, and allows switching to a GPU backend for accelerated performance. Mocha.jl was developed in the relatively early days of Julia. Now that both Julia and its ecosystem have evolved significantly, with exciting new technology such as writing GPU kernels directly in Julia and general auto-differentiation support, the Mocha codebase has become excessively old and primitive. ...
    Downloads: 0 This Week
  • 18
    Intel neon

    Intel® Nervana™ reference deep learning framework

    neon is Intel's reference deep learning framework committed to best performance on all hardware. Designed for ease of use and extensibility. See the new features in our latest release. We want to highlight that neon v2.0.0+ has been optimized for much better performance on CPUs by enabling Intel Math Kernel Library (MKL). The DNN (Deep Neural Networks) component of MKL that is used by neon is provided free of charge and downloaded automatically as part of the neon installation. ...
    Downloads: 0 This Week
  • 19
    Caffe2

    Caffe2 is a lightweight, modular, and scalable deep learning framework

    Caffe2 is a lightweight, modular, and scalable deep learning framework. Building on the original Caffe, Caffe2 is designed with expression, speed, and modularity in mind. Caffe2 is a deep learning framework that provides an easy and straightforward way for you to experiment with deep learning and leverage community contributions of new models and algorithms. You can bring your creations to scale using the power of GPUs in the cloud or to the masses on mobile with Caffe2’s cross-platform...
    Downloads: 0 This Week
  • 20
    Starling Extension Graphics

    flash.display.Graphics style extension for the Starling Flash GPU

    Starling-Extension-Graphics is an extension for the Starling framework (which itself is a GPU-accelerated 2D framework for Flash/AIR via Stage3D). This extension adds graphics primitives (fills, strokes, planes, etc.) that mimic flash.display.Graphics-style drawing but are implemented in a GPU-friendly manner. It automatically triangulates vector shapes, letting developers use familiar drawing APIs while getting the performance benefits of GPU rendering via Starling. ...
    Downloads: 0 This Week
  • 21
    Starling Extension Particle System

    A particle system for the Starling framework

    The Starling Extension Particle System is an ActionScript extension for the Starling framework that enables developers to integrate particle effects created with the "Particle Designer" tool by 71squared into Starling-based applications. The demo-directory contains a sample project. To compile it, add a reference to the Starling library and add the source directory that contains the particle system classes. The project contains 4 sample configurations. Switch between configurations in...
    Downloads: 0 This Week
  • 22
    ND2D

    A Flash Molehill (Stage3D) GPU accelerated 2D game engine

    ND2D is a 2D game framework for Flash that uses Stage3D / Molehill (i.e. the GPU acceleration in newer Flash Player versions). It allows game developers to build 2D games with lots of sprites, leveraging the GPU for better performance. It includes display tree constructs, sprite sheets, particle systems, cameras, post-processing, etc., made to simplify building high-performance 2D content in Flash. ND2D was built to make it easy to use hardware-accelerated 2D content in the Flash Player. ...
    Downloads: 0 This Week
  • 23
    HIPAcc

    Heterogeneous Image Processing Acceleration (HIPACC) Framework

    HIPAcc development has moved to GitHub (https://github.com/hipacc). HIPAcc allows programmers to design image processing kernels and algorithms in a domain-specific language (DSL). From this high-level description, low-level target code for GPU accelerators is generated using source-to-source translation. As back ends, the framework supports CUDA, OpenCL, and Renderscript. HIPAcc lets programmers develop imaging applications with high productivity, flexibility, and portability as well as competitive performance: the same algorithm description serves as the basis for targeting different GPU accelerators and low-level languages.
    Downloads: 0 This Week
  • 24
    StarlingPunk

    StarlingPunk is a framework built on top of the Starling library

    StarlingPunk is a game framework built on top of the Starling GPU-accelerated 2D library (AS3 / Flash / AIR). It is inspired by FlashPunk: it gives structure (entities, worlds), collision detection systems, tile maps, etc., and is intended to help developers organize 2D game code more cleanly while benefiting from Starling’s performance. It has features for quick prototyping and reusing code between projects.
    Downloads: 0 This Week