Streamline your ML workflow
Automatically translates the text of a video based on a subtitle file
Supercharge Your LLM with the Fastest KV Cache Layer
Multilingual large voice generation model, providing inference
A fast TTS architecture with conditional flow matching
Official code base for LeWorldModel: Stable End-to-End Joint-Embedding
A Gym environment for web task automation
Taming Stable Diffusion for Lip Sync
Machine Learning Pipelines for Kubeflow
Python observability platform for tracing apps, metrics, and logs
A comprehensive quantitative trading system with AI-powered analysis
AI-driven multi-agent research assistant automating hypothesis
A simple, secure MCP-to-OpenAPI proxy server
Superduper: Integrate AI models and machine learning workflows
Real-time voice interactive digital human
On-device Speech-to-Intent engine powered by deep learning
Qwen3-Omni is a natively end-to-end, omni-modal LLM
Autonomous LLM agent for end-to-end data science workflows
A GPT-4o-level MLLM for Vision, Speech and Multimodal Live Streaming
Multilingual Automatic Speech Recognition with word-level timestamps
The official repository for ERNIE 4.5 and ERNIEKit
VGGSfM: Visual Geometry Grounded Deep Structure From Motion
Conversational voice AI agents
Bailing is a voice dialogue robot similar to GPT-4o
Converts text to speech in real time