High-performance inference server and API layer for text embedding models
Hypernetworks that adapt LLMs for specific benchmark tasks
Document (PDF, Word, PPTX ...) extraction and parsing API
A playground to generate images from any text prompt using SD
TTS with Kokoro and ONNX Runtime
Use Microsoft Edge's online text-to-speech service from Python
A robust, efficient, low-latency speech-to-text library
Official inference repo for FLUX.1 models
Contexts Optical Compression
Code for running inference and finetuning with the SAM 3 model
A lightweight text-to-speech model with zero-shot voice cloning
High-Quality Voice Cloning TTS for 600+ Languages
Robust Speech Recognition via Large-Scale Weak Supervision
Python library and CLI tool to interface with Google Translate
A high-quality rapid TTS voice cloning model
Offline inference engine for art, real-time voice conversations
Tokenizer-Free TTS for Multilingual Speech Generation
Official inference repo for FLUX.2 models
Qwen3-TTS is an open-source series of TTS models
TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
A Unified Framework for Text-to-3D and Image-to-3D Generation
Handwritten Text Recognition (HTR) system implemented with TensorFlow
Lightning-fast, on-device TTS, running natively via ONNX
A TTS that fits in your CPU (and pocket)