State-of-the-art (SoTA) text-to-video pre-trained model
Streamlink is a CLI utility which pipes video streams
Capable of understanding text, audio, vision, video
Your Ultimate IPTV & Stream Management Companion
Cross-platform GUI for image upscaler Real-ESRGAN
Lightweight Python tool for downloading videos from many platforms
Qwen3-Omni is a natively end-to-end, omni-modal LLM
RGBD video generation model conditioned on camera input
Generate short videos with one click using AI LLM
Streaming Real-time Audio-Driven Avatar Generation
Open-Source Low-Latency Accelerated Linux WebRTC HTML5 Remote Desktop
Document Image Parsing via Heterogeneous Anchor Prompting
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
Python inference and LoRA trainer package for the LTX-2 audio–video
Yet another bunkrr album downloader
A youtube-dl fork with additional features and fixes
Expressive Portrait Image Animation for Live Streaming
Large Audio Language Model built for natural interactions
A Python tool that uses GPT-4, FFmpeg, and OpenCV
Taming Stable Diffusion for Lip Sync
Multimodal-Driven Architecture for Customized Video Generation
Topic Modelling for Humans
GPT4V-level open-source multi-modal model based on Llama3-8B
Build Vision Agents quickly with any model or video provider
Douyin TikTok Download API