A Family of Open-Source Music Foundation Models
The most powerful local music generation model
Official Python inference and LoRA trainer package
Multimodal Diffusion with Representation Alignment
Audio foundation model excelling in audio understanding
Qwen2.5-VL is a multimodal large language model series
Di♪♪Rhythm: Blazingly Fast & Simple End-to-End Song Generation
Chat & pretrained large audio language model proposed by Alibaba Cloud
Code for the paper "Hybrid Spectrogram and Waveform Source Separation"