Dockerized FastAPI wrapper for the Kokoro-82M text-to-speech model
StreamSpeech is a seamless model for offline speech recognition
State-of-the-art TTS model under 25MB
Convert various image, audio and video formats from your context menu.
An Open Source text-to-speech system built by inverting Whisper
Generate audiobooks from e-books
A Telegram RSS bot that cares about your reading experience
Official PyTorch Implementation
Private AI platform for agents, enterprise search and RAG pipelines
Get your documents ready for gen AI
VMZ: Model Zoo for Video Modeling
High-resolution models for human tasks
Official repository for LTX-Video
Document Image Parsing via Heterogeneous Anchor Prompting
Large Multimodal Models for Video Understanding and Editing
Multi-lingual large voice generation model, providing inference
Multi-Modal Neural Networks for Semantic Search, based on Mid-Fusion
The data structure for multimodal data
Instill Core is a full-stack AI infrastructure tool for data
HunyuanVideo: A Systematic Framework For Large Video Generation Model
A Telegram bot that integrates with OpenAI's official ChatGPT APIs
Large language model for a sales livestream host
Minimal scripts to run the emulator in a container for various systems
A simple native web interface that uses ChatTTS to synthesize text
Open-Source Low-Latency Accelerated Linux WebRTC HTML5 Remote Desktop