High-performance inference server and API layer for text embedding models
A playground to generate images from any text prompt using Stable Diffusion
Hypernetworks that adapt LLMs for specific benchmark tasks
Code for openai.fm, a demo for the OpenAI Speech API
TTS with Kokoro and ONNX Runtime
Offline inference engine for art, real-time voice conversations
A robust, efficient, low-latency speech-to-text library
Canvas-based WYSIWYG rich text editor with advanced layout tools
Speech-AI-Forge is a project developed around TTS generation models
High-Quality Voice Cloning TTS for 600+ Languages
Tokenizer-Free TTS for Multilingual Speech Generation
Official inference repo for FLUX.1 models
Multimodal-Driven Architecture for Customized Video Generation
Handwritten Text Recognition (HTR) system implemented with TensorFlow
A Family of Open Sourced Music Foundation Models
Lightning-fast, on-device TTS, running natively via ONNX
Super expressive prompting model based on ltx2.3
A fast TTS architecture with conditional flow matching
JavaScript OCR and text extraction for images and PDFs
A high-quality rapid TTS voice cloning model
Framework for building realtime multimodal voice AI agent apps
GLM-Image: Auto-regressive for Dense-knowledge and High-fidelity Image Generation
Industrial-level controllable zero-shot text-to-speech system
OCR model for complex documents with layout-aware structured outputs
Unifying 3D Mesh Generation with Language Models