Showing 117 open source projects for "gpu speed"

  • 1
    GPU Hot

    Real-time NVIDIA GPU dashboard

    ...The dashboard collects and displays a wide range of performance metrics including temperature, memory usage, power consumption, clock speeds, fan speed, and active processes. It can scale from monitoring a single GPU workstation to large distributed environments with dozens or even hundreds of GPUs by running lightweight containers on each node and aggregating the data centrally.
    Downloads: 7 This Week
  • 2
    TrafficMonitor

    Floating window used to display current network speed, CPU & memory

    TrafficMonitor is network monitoring software for Windows with a floating window. It displays the current internet speed along with CPU and RAM usage. Other capabilities include an embedded taskbar display, changeable display skins, and historical traffic statistics. There are two versions of TrafficMonitor: the standard version and the Lite version. The standard version includes all functions, while the Lite version omits hardware monitoring functions such as temperature monitoring, GPU usage, and hard disk usage. ...
    Downloads: 116 This Week
  • 3
    GPUArrays

    Reusable array functionality for Julia's various GPU backends

    Reusable GPU array functionality for Julia's various GPU backends. This package is the counterpart of Julia's AbstractArray interface, but for GPU array types: it provides functionality and tooling to speed up development of new GPU array types. This package is not intended for end users! Instead, you should use one of the packages that builds on GPUArrays.jl, such as CUDA.jl, oneAPI.jl, AMDGPU.jl, or Metal.jl.
    Downloads: 4 This Week
  • 4
    GPU-Z

    Lightweight GPU information and diagnostics tool.

    ...It accurately reports clock speeds, including default, overclocked, 3D, and boost clocks. Furthermore, it provides a detailed analysis of the memory subsystem, including size, type, speed, and bus width. Unique features include a GPU load test to verify PCI-Express configuration, results validation, and the ability to back up your graphics card BIOS. It is portable (requires no installation) and fully supports all modern Windows versions, including Windows 11. (GPU-Z, graphics card info, GPU specs, video card diagnostics, NVIDIA, AMD, Intel, BIOS backup, overclocking, sensor monitoring, free download, portable, TechPowerUp.)
    Downloads: 154 This Week
  • 5
    Fan Control

    Highly customizable fan controlling software for Windows

    Fan Control is a Windows utility designed to give users fine-grained, customizable control over system fans (CPU, GPU, case, etc.) based on temperature and sensor inputs. Rather than relying solely on BIOS fan curves, it allows dynamic adjustment of fan behaviour at the operating-system level — letting you react to real-time load, mix multiple sensors (CPU, GPU, motherboard, drives, etc.), and define custom fan-speed curves for different situations.
    Downloads: 305 This Week
  • 6
    CatBoost

    High-performance library for gradient boosting on decision trees

    ...It is a machine learning method with plenty of applications, including ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. CatBoost offers superior performance over other GBDT libraries on many datasets, and has several superb features. It has best in class prediction speed, supports both numerical and categorical features, has a fast and scalable GPU version, and readily comes with visualization tools. CatBoost was developed by Yandex and is used in various areas including search, self-driving cars, personal assistance, weather prediction and more.
    Downloads: 7 This Week
  • 7
    llmfit

    157 models, 30 providers, one command to find what runs on hardware

    llmfit is a terminal-based utility that helps developers determine which large language models can realistically run on their local hardware by analyzing system resources and model requirements. The tool automatically detects CPU, RAM, GPU, and VRAM specifications, then ranks available models based on performance factors such as speed, quality, and memory fit. It provides both an interactive terminal user interface and a traditional CLI mode, enabling flexible workflows for different user preferences. llmfit also supports advanced configurations including multi-GPU setups, mixture-of-experts architectures, and dynamic quantization recommendations. ...
    Downloads: 31 This Week
  • 8
    PowerInfer

    High-speed Large Language Model Serving for Local Deployment

    PowerInfer is a high-performance inference engine designed to run large language models efficiently on personal computers equipped with consumer-grade GPUs. The project focuses on improving the performance of local AI inference by optimizing how neural network computations are distributed between CPU and GPU resources. Its architecture exploits the observation that only a subset of neurons in large models are frequently activated, allowing the system to preload frequently used neurons into GPU memory while processing less common activations on the CPU. This hybrid execution strategy significantly reduces memory bottlenecks and improves overall inference speed.
    Downloads: 0 This Week
  • 9
    CuPy

    A NumPy-compatible array library accelerated by CUDA

    CuPy is an open source implementation of a NumPy-compatible multi-dimensional array accelerated with NVIDIA CUDA. It consists of cupy.ndarray, a core multi-dimensional array class, and many functions on it. CuPy offers GPU-accelerated computing with Python, using CUDA-related libraries to fully utilize the GPU architecture. According to benchmarks, it can speed up some operations by more than 100x. CuPy is highly compatible with NumPy, serving as a drop-in replacement in most cases. CuPy is easy to install through pip or through precompiled binary packages called wheels for recommended environments. ...
    Downloads: 37 This Week
  • 10
    HunyuanVideo

    HunyuanVideo: A Systematic Framework For Large Video Generation Model

    ...The framework aims to push the boundaries of video generation quality, incorporating multiple innovative approaches to improve the realism and coherence of the generated content. FP8 model weights have been released to reduce GPU memory usage and improve efficiency. Parallel inference code is provided to speed up sampling, and utilities and tests are included.
    Downloads: 9 This Week
  • 11
    how-to-optim-algorithm-in-cuda

    How to optimize some algorithms in CUDA

    ...The repository also contains extensive learning notes that summarize CUDA programming concepts, GPU architecture details, and performance engineering strategies.
    Downloads: 9 This Week
  • 12
    Rio Terminal

    A hardware-accelerated GPU terminal emulator

    ...It is cross-platform, with support for Windows, macOS, Linux, and FreeBSD. Its main value is offering a modern, GPU-powered terminal that combines speed, visual polish, and broad platform reach.
    Downloads: 17 This Week
  • 13
    LightGBM

    Gradient boosting framework based on decision tree algorithms

    LightGBM, or Light Gradient Boosting Machine, is a high-performance, open source gradient boosting framework based on decision tree algorithms. Compared to other boosting frameworks, LightGBM offers several advantages in terms of speed, efficiency and accuracy. Parallel experiments have shown that LightGBM can attain linear speed-up across multiple machines for training in specific settings, all while consuming less memory. LightGBM supports parallel and GPU learning, and can handle large-scale data. It has become widely used for ranking, classification and many other machine learning tasks.
    Downloads: 7 This Week
  • 14
    Shumai

    Fast Differentiable Tensor Library in JavaScript & TypeScript with Bun

    ...It can automatically leverage GPU acceleration on Linux (via CUDA) and CPU computation on macOS.
    Downloads: 0 This Week
  • 15
    Nvitop

    An interactive NVIDIA-GPU process viewer and beyond

    nvitop is an interactive NVIDIA device and process monitoring tool. It has a colorful and informative interface that continuously updates the status of the devices and processes. As a resource monitor, it includes many features and options, such as tree-view, environment variable viewing, process filtering, process metrics monitoring, etc. Beyond that, the package also ships a CUDA device selection tool nvisel for deep learning researchers. It also provides handy APIs that allow developers...
    Downloads: 1 This Week
  • 16
    Stats

    macOS system monitor in your menu bar

    Stats is currently supported on macOS 10.13 (High Sierra) and higher. Stats is an application that lets you monitor your macOS system: CPU utilization, GPU utilization, memory usage, disk utilization, sensor information (temperature, voltage, power), battery level, network usage, fan speed, fan control, and Bluetooth devices. It supports many languages, such as English, Polski, Українська, Русский, and many more. You can help by adding a new language or improving an existing translation.
    Downloads: 10 This Week
  • 17
    Habitat-Sim

    A flexible, high-performance 3D simulator for Embodied AI research

    ...It ships with connectors to popular datasets and scene formats, plus tools for dataset generation and scene replay. Determinism and reproducibility are first-class goals, which is critical for benchmarking agents and comparing algorithms. Thanks to its speed and modular design, Habitat-Sim is widely used to prototype embodied agents, train at scale, and evaluate in standardized environments with consistent metrics.
    Downloads: 7 This Week
  • 18
    Faster Whisper

    Faster Whisper transcription with CTranslate2

    Faster Whisper is an optimized implementation of the Whisper speech recognition model designed to deliver significantly faster inference while maintaining comparable accuracy. It leverages efficient inference engines and optimized computation strategies to reduce latency and resource consumption. The system is particularly useful for real-time or large-scale transcription tasks where performance is critical. It supports multiple model sizes, allowing users to balance speed and accuracy based...
    Downloads: 50 This Week
  • 19
    Citron Neo

    Research software designed to orchestrate virtual environments

    ...It supports multiple operating systems, including desktop and mobile environments, making it accessible across different devices. Overall, it represents a modern approach to emulation with a focus on speed, compatibility, and extensibility.
    Downloads: 119 This Week
  • 20
    LuxTTS

    A high-quality rapid TTS voice cloning model

    LuxTTS is an open-source text-to-speech (TTS) system focused on delivering high-quality, rapid voice synthesis and voice cloning that runs extremely fast and efficiently on consumer hardware. It implements a lightweight architecture based on ZipVoice and optimized sampling techniques so that it can generate speech at speeds up to roughly 150 times real-time on a single GPU and faster than real-time on CPU, all while producing audio at high fidelity with 48 kHz quality. The project supports...
    Downloads: 10 This Week
  • 21
    cuDF

    GPU DataFrame Library

    ...The RAPIDS suite of open-source software libraries aims to enable the execution of end-to-end data science and analytics pipelines entirely on GPUs. It relies on NVIDIA® CUDA® primitives for low-level compute optimization but exposes that GPU parallelism and high-bandwidth memory speed through user-friendly Python interfaces.
    Downloads: 2 This Week
  • 22
    NumPy

    The fundamental package for scientific computing with Python

    ...NumPy offers comprehensive mathematical functions, random number generators, linear algebra routines, Fourier transforms, and more. NumPy supports a wide range of hardware and computing platforms, and plays well with distributed, GPU, and sparse array libraries. The core of NumPy is well-optimized C code. Enjoy the flexibility of Python with the speed of compiled code. NumPy’s high level syntax makes it accessible and productive for programmers from any background or experience level. Distributed under a liberal BSD license, NumPy is developed and maintained publicly on GitHub by a vibrant, responsive, and diverse community. ...
    Downloads: 65 This Week
  • 23
    LightLLM

    LightLLM is a Python-based LLM (Large Language Model) inference

    LightLLM is a high-performance inference and serving framework designed specifically for large language models, focusing on lightweight architecture, scalability, and efficient deployment. The framework enables developers to run and serve modern language models with significantly improved speed and resource efficiency compared to many traditional inference systems. Built primarily in Python, the project integrates optimization techniques and ideas from several leading open-source implementations, including FasterTransformer, vLLM, and FlashAttention, to accelerate token generation and reduce latency. LightLLM is designed to handle large-scale model workloads in production environments, supporting efficient batching and GPU utilization for fast inference across multiple requests. ...
    Downloads: 2 This Week
  • 24
    PyTorch

    Open source machine learning framework

    ...PyTorch can be used as a replacement for NumPy, or as a deep learning research platform that provides optimum flexibility and speed.
    Downloads: 93 This Week
  • 25
    Pruna AI

    Pruna is a model optimization framework built for developers

    Pruna is an open-source, self-hostable AI inference engine designed to help teams deploy and manage large language models (LLMs) efficiently across private or hybrid infrastructures. Built with performance and developer ergonomics in mind, Pruna simplifies inference workflows by enabling multi-model orchestration, autoscaling, GPU resource allocation, and compatibility with popular open-source models. It is ideal for companies or teams looking to reduce reliance on external APIs while maintaining speed, cost-efficiency, and full control over their data and AI stack. With a focus on extensibility and observability, Pruna empowers engineers to scale LLM applications from prototype to production securely and reliably.
    Downloads: 6 This Week
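
The PowerInfer entry above describes preloading frequently activated neurons into GPU memory while less common activations are processed on the CPU. A minimal Python sketch of that partitioning idea might look like the following. The function name `split_hot_cold`, the profiled activation counts, and the budget expressed as a number of neurons are all hypothetical illustrations, not PowerInfer's actual API or algorithm.

```python
def split_hot_cold(activation_counts, gpu_budget):
    """Partition neuron indices into a GPU ("hot") set and a CPU ("cold") set.

    activation_counts: how often each neuron fired during a profiling run
                       (assumed input, not taken from PowerInfer).
    gpu_budget: how many neurons fit in GPU memory (assumed, counted in
                neurons rather than bytes to keep the sketch short).
    """
    if gpu_budget < 0:
        raise ValueError("gpu_budget must be non-negative")
    # Rank neuron indices from most to least frequently activated.
    ranked = sorted(
        range(len(activation_counts)),
        key=lambda i: activation_counts[i],
        reverse=True,
    )
    # The most active neurons go to the GPU; the remainder stay on the CPU.
    hot = sorted(ranked[:gpu_budget])
    cold = sorted(ranked[gpu_budget:])
    return hot, cold


if __name__ == "__main__":
    counts = [5, 100, 2, 80, 1]  # made-up profiling data
    hot, cold = split_hot_cold(counts, gpu_budget=2)
    print("GPU neurons:", hot)   # GPU neurons: [1, 3]
    print("CPU neurons:", cold)  # CPU neurons: [0, 2, 4]
```

A real engine would size the budget in bytes and choose neurons per layer; the sketch only shows the ranking-and-cutoff step.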