9 projects for "tensorrt" with 1 filter applied:

  • $300 Free Credits to Build on Google Cloud Icon
    $300 Free Credits to Build on Google Cloud

    New to Google Cloud? Get $300 in credits to explore Compute Engine, BigQuery, Cloud Run, Gemini Enterprise Agent Platform, and more.

    Start your next project with $300 in free Google Cloud credit. Spin up VMs, run containers, query petabytes in BigQuery, or build agents with Gemini Enterprise Agent Platform. Once your credits are used, keep building with 20+ always-free tier products including Compute Engine, Cloud Storage, GKE, and Cloud Run functions. No commitment required—just sign up and start building.
    Claim $300 Free
  • Build Securely on AWS with Proven Frameworks Icon
    Build Securely on AWS with Proven Frameworks

    Lay a foundation for success with Tested Reference Architectures developed by Fortinet’s experts. Learn more in this white paper.

    Moving to the cloud brings new challenges. How can you manage a larger attack surface while ensuring great network performance? Turn to Fortinet’s Tested Reference Architectures, blueprints for designing and securing cloud environments built by cybersecurity experts. Learn more and explore use cases in this white paper.
    Download Now
  • 1
    TokenSpeed

    TokenSpeed

    TokenSpeed is a speed-of-light LLM inference engine

    TokenSpeed is an LLM inference engine designed for high-performance production agent workloads. It aims to combine TensorRT-LLM-level speed with vLLM-level usability, making it relevant for teams that need fast generation without sacrificing developer ergonomics. The project is focused on the specific needs of agentic systems, where latency, throughput, and efficient scheduling matter across many short or tool-heavy requests. It builds on ideas and components from the broader open-source inference ecosystem while presenting its own execution stack. ...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 2
    WhisperLive

    WhisperLive

    A nearly-live implementation of OpenAI's Whisper

    ...It runs as a server–client system in which the server hosts a Whisper backend and clients stream audio to be transcribed with very low delay. The project supports multiple inference backends, including Faster-Whisper, NVIDIA TensorRT, and OpenVINO, allowing you to target GPUs and different CPU architectures efficiently. It can handle microphone input, pre-recorded audio files, and network streams such as RTSP and HLS, making it flexible for live events, monitoring, or accessibility workflows. Configuration options let you control the number of clients, maximum connection time, and threading behavior so the server can be tuned for different deployment environments. ...
    Downloads: 8 This Week
    Last Update:
    See Project
  • 3
    CUDA Containers for Edge AI & Robotics

    CUDA Containers for Edge AI & Robotics

    Machine Learning Containers for NVIDIA Jetson and JetPack-L4T

    ...The repository contains container configurations that package the latest AI frameworks and dependencies optimized for Jetson hardware. These containers simplify the deployment of complex machine learning environments by bundling libraries such as CUDA, TensorRT, and deep learning frameworks into reproducible container images. The project is particularly useful for developers building edge AI and robotics systems that rely on GPU-accelerated inference and real-time computer vision. By using containerized environments, developers can ensure that their applications run consistently across different Jetson platforms and JetPack versions. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    Ultralytics

    Ultralytics

    Ultralytics YOLO

    Ultralytics is a comprehensive computer vision framework that provides state-of-the-art implementations of the YOLO (You Only Look Once) family of models, enabling developers to perform tasks such as object detection, segmentation, classification, tracking, and pose estimation within a unified system. It is designed to be fast, accurate, and easy to use, offering both command-line and Python-based interfaces for training, validation, and deployment of machine learning models. The framework...
    Downloads: 4 This Week
    Last Update:
    See Project
  • Custom VMs From 1 to 96 vCPUs With 99.95% Uptime Icon
    Custom VMs From 1 to 96 vCPUs With 99.95% Uptime

    General-purpose, compute-optimized, or GPU/TPU-accelerated. Built to your exact specs.

    Live migration and automatic failover keep workloads online through maintenance. One free e2-micro VM every month.
    Try Free
  • 5
    Mooncake

    Mooncake

    Mooncake is the serving platform for Kimi

    Mooncake is an open-source infrastructure platform designed to optimize large language model serving by focusing on efficient management and transfer of model data and KV cache. The platform was originally developed as part of the serving infrastructure for the Kimi large language model system. Its architecture centers on a high-performance transfer engine that provides unified data transfer across different storage and networking technologies. This engine enables efficient movement of...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    AI-Aimbot

    AI-Aimbot

    CS2, Valorant, Fortnite, APEX, every game

    AI-Aimbot is a computer vision project that demonstrates how artificial intelligence can be used to automatically identify and target opponents in video games. The system uses an object detection model based on the YOLOv5 architecture to detect human-shaped characters in gameplay screenshots or video frames. Once a target is identified, the program automatically adjusts the player’s aim toward the detected target, effectively automating the aiming process in first-person shooter games. The...
    Downloads: 2,549 This Week
    Last Update:
    See Project
  • 7
    FasterTransformer

    FasterTransformer

    Transformer related optimization, including BERT, GPT

    ...FasterTransformer is particularly focused on inference workloads, where it significantly improves performance compared to standard framework implementations. Although development has transitioned toward TensorRT-LLM, the project remains an important reference for understanding optimized transformer execution.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    Hunyuan-A13B-Instruct

    Hunyuan-A13B-Instruct

    Efficient 13B MoE language model with long context and reasoning modes

    ...It excels in mathematics, science, coding, and multi-turn conversation tasks, rivaling or outperforming larger models in several areas. Deployment is supported via TensorRT-LLM, vLLM, and SGLang, with Docker images and integration guides provided. Open-source under a custom license, it's ideal for researchers and developers seeking scalable, high-context AI capabilities with optimized inference.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    GigaChat 3 Ultra

    GigaChat 3 Ultra

    High-performance MoE model with MLA, MTP, and multilingual reasoning

    GigaChat 3 Ultra is a flagship instruct-model built on a custom Mixture-of-Experts architecture with 702B total and 36B active parameters. It leverages Multi-head Latent Attention to compress the KV cache into latent vectors, dramatically reducing memory demand and improving inference speed at scale. The model also employs Multi-Token Prediction, enabling multi-step token generation in a single pass for up to 40% faster output through speculative and parallel decoding techniques. Its...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Compliant and Reliable File Transfers Backed by Top Security Certifications Icon
    Compliant and Reliable File Transfers Backed by Top Security Certifications

    Cerberus FTP Server delivers SOC 2 Type II certified security and FIPS 140-2 validated encryption.

    Stop relying on non-certified, legacy file transfer tools that creak under the weight of modern security demands. Get full audit trails, advanced access controls and more supported by an award-winning team of experts. Start your free 25-day trial today.
    Start Free Trial
  • Previous
  • You're on page 1
  • Next