Python inference and LoRA trainer package for the LTX-2 audio–video
Qwen2.5-VL is a multimodal large language model series
Audio foundation model excelling in audio understanding
My personal Claude Code configuration
Qwen2.5-VL-3B-Instruct: Multimodal model for chat, vision & video