Framework dedicated to neural data processing
Multilingual Automatic Speech Recognition with word-level timestamps
Tensor search for humans
Sparsity-aware deep learning inference runtime for CPUs
PyTorch extensions for fast R&D prototyping and Kaggle farming
Optimized cloud inference serving with the Triton Inference Server
Standardized Serverless ML Inference Platform on Kubernetes
Powering Amazon's custom machine learning chips
Optimizing inference proxy for LLMs
Replace OpenAI GPT with another LLM in your app
Bring the notion of Model-as-a-Service to life
Library for serving Transformers models on Amazon SageMaker
A set of Docker images for training and serving models in TensorFlow
LLMFlows - Simple, Explicit and Transparent LLM Apps
Data manipulation and transformation for audio signal processing
State-of-the-art Parameter-Efficient Fine-Tuning
Probabilistic reasoning and statistical analysis in TensorFlow
Database system for building simpler and faster AI-powered applications
Lightweight Python library for real-time multi-object tracking
Gaussian processes in TensorFlow
Low-latency REST API for serving text-embeddings
LLM Chatbot Assistant for the Openfire server
Libraries for applying sparsification recipes to neural networks
Open platform for training, serving, and evaluating language models
A library to communicate with ChatGPT, Claude, Copilot, Gemini