Showing 32 open source projects for "tensorrt"

  • 1
    YOLOX

    YOLOX is a high-performance anchor-free YOLO, exceeding yolov3~v5

    YOLOX is a high-performance anchor-free YOLO that exceeds YOLOv3 through YOLOv5, with support for MegEngine, ONNX, TensorRT, ncnn, and OpenVINO. It is an anchor-free version of YOLO with a simpler design but better performance, and it aims to bridge the gap between the research and industrial communities. Prepare your own dataset with images and labels first; for labeling images, you can use tools like Labelme or CVAT. You should also implement the pull_item and load_anno methods for the Mosaic and MixUp augmentations. ...
    Downloads: 13 This Week
  • 2
    TensorRT Pro

    C++ library based on TensorRT integration

    Provides a high-level interface for C++/Python. It simplifies the implementation of custom plugins and encapsulates serialization and deserialization for easier use. It also simplifies compiling models in FP32, FP16, and INT8 to ease deployment with C++/Python on servers or embedded devices. Ready-to-use models with examples include RetinaFace, Scrfd, YoloV5, YoloX, Arcface, AlphaPose, CenterNet, and DeepSORT (C++).
    Downloads: 0 This Week
  • 3
    Hugging Face Transformer

    CPU/GPU inference server for Hugging Face transformer models

    ...You will usually get 2X to 4X faster inference compared to vanilla PyTorch. However, if you want best-in-class performance on GPU, there is only one possible combination: NVIDIA TensorRT and Triton. With it, you will usually get 5X faster inference compared to vanilla PyTorch.
    Downloads: 0 This Week
  • 4
    MMdnn

    Tools to help users inter-operate among deep learning frameworks

    MMdnn is a set of tools that helps users interoperate among different deep learning frameworks, for example through model conversion and visualization. It converts models between Caffe, Keras, MXNet, TensorFlow, CNTK, PyTorch, ONNX, and CoreML. MMdnn is a comprehensive, cross-framework tool to convert, visualize, and diagnose deep learning (DL) models. The "MM" stands for model management, and "dnn" is an acronym for deep neural network. We implement a universal converter to convert DL models between frameworks,...
    Downloads: 0 This Week
  • 5
    Downloads: 2 This Week
  • 6
    Hunyuan-A13B-Instruct

    Efficient 13B MoE language model with long context and reasoning modes

    ...It excels in mathematics, science, coding, and multi-turn conversation tasks, rivaling or outperforming larger models in several areas. Deployment is supported via TensorRT-LLM, vLLM, and SGLang, with Docker images and integration guides provided. Open-source under a custom license, it's ideal for researchers and developers seeking scalable, high-context AI capabilities with optimized inference.
    Downloads: 0 This Week
  • 7
    GigaChat 3 Ultra

    High-performance MoE model with MLA, MTP, and multilingual reasoning

    GigaChat 3 Ultra is a flagship instruct-model built on a custom Mixture-of-Experts architecture with 702B total and 36B active parameters. It leverages Multi-head Latent Attention to compress the KV cache into latent vectors, dramatically reducing memory demand and improving inference speed at scale. The model also employs Multi-Token Prediction, enabling multi-step token generation in a single pass for up to 40% faster output through speculative and parallel decoding techniques. Its...
    Downloads: 0 This Week