Contexts Optical Compression
Qwen3-omni is a natively end-to-end, omni-modal LLM
All-in-one AI productivity platform with agents, workflows, and IM
State-of-the-art Image & Video CLIP, Multimodal Large Language Models
ComfyUI nodes for LivePortrait
Expressive Portrait Image Animation for Live Streaming
PyTorch3D is FAIR's library of reusable components for deep learning
Taming Stable Diffusion for Lip Sync
Chinese and English multimodal conversational language model
The library to build & auto-optimize LLM applications
ASCII art library for Python
InvokeAI is a leading creative engine for Stable Diffusion models
Foundation model for image generation
Transform your favorite cities into beautiful, minimalist designs
Benchmarking Multimodal Agents for Open-Ended Tasks
Entity-relationship diagram generation tool
GPT Image 2 prompt gallery, image prompt library, agentic skill
Programs to process GoPro MP4 & Generic GPX/FIT files
Videomass is a free, open source and cross-platform GUI for FFmpeg
Open-source and free to self-host
Stable Virtual Camera: Generative View Synthesis with Diffusion Models
Phi-3.5 for Mac: Locally-run Vision and Language Models
Browse the web directly from Cursor and similar tools
PDF to Markdown with vision models
Automatic GitLab code review tool based on large language models