High-performance inference server for text embedding models (API layer)
A playground to generate images from any text prompt using SD
Hypernetworks that adapt LLMs for specific benchmark tasks
Code for openai.fm, a demo for the OpenAI Speech API
TTS with Kokoro and ONNX Runtime
Offline inference engine for art, real-time voice conversations
Speech-AI-Forge is a project developed around TTS generation models
High-Quality Voice Cloning TTS for 600+ Languages
Official inference repo for FLUX.1 models
Canvas-based WYSIWYG rich text editor with advanced layout tools
Multimodal-Driven Architecture for Customized Video Generation
Tokenizer-Free TTS for Multilingual Speech Generation
A robust, efficient, low-latency speech-to-text library
A Family of Open Sourced Music Foundation Models
Lightning-fast, on-device TTS, running natively via ONNX
Handwritten Text Recognition (HTR) system implemented with TensorFlow
JavaScript OCR and text extraction for images and PDFs
Super expressive prompting model based on ltx2.3
TextWorld is a sandbox learning environment for the training
Industrial-level controllable zero-shot text-to-speech system
GLM-Image: Auto-regressive for Dense-knowledge and High-fidelity Image
A fast TTS architecture with conditional flow matching
OCR model for complex documents with layout-aware structured outputs
Framework for building real-time multimodal voice AI agent apps
A high-quality rapid TTS voice cloning model