Showing 6 open source projects for "latency"

  • 1
    DeepSpeed MII

    MII makes low-latency and high-throughput inference possible

    ...While open-sourcing has democratized access to AI capabilities, their application is still restricted by two critical factors: inference latency and cost. DeepSpeed-MII is a new open-source Python library from DeepSpeed, aimed at making low-latency, low-cost inference of powerful models not only feasible but also easily accessible. MII offers access to highly optimized implementations of thousands of widely used DL models. MII-supported models achieve significantly lower latency and cost compared to their original implementations.
    Downloads: 0 This Week
  • 2
    DeepSpeed

    Deep learning optimization library: makes distributed training easy

    ...Achieve excellent system throughput and efficiently scale to thousands of GPUs
    3. Train and run inference on resource-constrained GPU systems
    4. Achieve unprecedented low latency and high throughput for inference
    5. Achieve extreme compression for unparalleled reductions in inference latency and model size at low cost

    DeepSpeed offers a confluence of system innovations that has made large-scale DL training effective and efficient, greatly improved ease of use, and redefined the DL training landscape in terms of the scale that is possible. ...
    Downloads: 2 This Week
  • 3
    TensorRT

    C++ library for high performance inference on NVIDIA GPUs

    NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference. It includes a deep learning inference optimizer and runtime that delivers low latency and high throughput for deep learning inference applications. TensorRT-based applications perform up to 40X faster than CPU-only platforms during inference. With TensorRT, you can optimize neural network models trained in all major frameworks, calibrate for lower precision with high accuracy, and deploy to hyperscale data centers, embedded, or automotive product platforms. ...
    Downloads: 8 This Week
  • 4
    Synapse Machine Learning

    Simple and distributed Machine Learning

    ...With the HTTP on Spark project, users can embed any web service into their SparkML models. For production-grade deployment, the Spark Serving project enables high throughput, sub-millisecond latency web services, backed by your Spark cluster.
    Downloads: 0 This Week
  • 5
    AWS Neuron

    Powering Amazon custom machine learning chips

    AWS Neuron is a software development kit (SDK) for running machine learning inference using AWS Inferentia chips. It consists of a compiler, runtime, and profiling tools that enable developers to run high-performance, low-latency inference on AWS Inferentia-based Amazon EC2 Inf1 instances. Using Neuron, developers can train their machine learning models in any popular framework, such as TensorFlow, PyTorch, or MXNet, and run them optimally on Amazon EC2 Inf1 instances. You can continue to use the same ML frameworks you use today and migrate your software onto Inf1 instances with minimal code changes and without tie-in to vendor-specific solutions. ...
    Downloads: 0 This Week
  • 6
    The Google Cloud Developer's Cheat Sheet

    Cheat sheet for Google Cloud developers

    Every product in the Google Cloud family described in <=4 words (with liberal use of hyphens and slashes) by the Google Developer Relations Team. This list only includes products that are publicly available. Several products in pre-release or private alpha will not be included until they reach public beta or GA. Many of these products have a free tier. There is also a free trial that will enable you to try almost everything. API platforms and ecosystems, developer and management tools,...
    Downloads: 0 This Week