HeartMuLa

HeartMuLa is the open-source library and reference implementation for the HeartMuLa family of music foundation models, covering both music generation and music understanding in a single stack. The core model, also named HeartMuLa, is a music language model that generates music conditioned on inputs such as lyrics and tags, and its multilingual support extends lyric-driven generation beyond a single language. The project also includes HeartCodec, a music codec optimized for high reconstruction fidelity, which provides the efficient tokenization and reconstruction steps that training and generation pipelines depend on. For extracting text from audio, it provides HeartTranscriptor, a Whisper-based model tuned specifically for lyrics transcription, which turns generated or recorded audio back into structured text. Finally, HeartCLAP aligns audio and text in a shared embedding space.

Features

Music generation model conditioned on lyrics and descriptive tags
Multilingual lyric support for broader creative workflows
High-fidelity music codec for audio tokenization and reconstruction
Lyrics transcription model tuned from a Whisper baseline
Audio–text alignment embeddings for cross-modal retrieval
Reference library with example workflows for inference and evaluation
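
To make the idea of cross-modal retrieval in a shared embedding space concrete, here is a minimal sketch in plain Python. It does not use HeartCLAP or any HeartMuLa API: the vectors are hand-written toy values standing in for embeddings that an audio encoder and a text encoder would produce, and the names cosine_similarity, rank_by_similarity, text_query and audio_clips are hypothetical, chosen only for illustration. Cosine similarity is a common ranking score for this kind of model and is assumed here, not confirmed by the project.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_embedding, candidates):
    """Return (name, score) pairs sorted from most to least similar to the query."""
    scored = [
        (name, cosine_similarity(query_embedding, embedding))
        for name, embedding in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy stand-ins for embeddings (assumption: real embeddings would come from
# an audio encoder and a text encoder trained into the same space).
text_query = [0.9, 0.1, 0.0]
audio_clips = {
    "clip_a": [0.8, 0.2, 0.1],
    "clip_b": [0.0, 0.1, 0.9],
    "clip_c": [0.1, 0.9, 0.2],
}

ranking = rank_by_similarity(text_query, audio_clips)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Because both modalities live in one space, the same ranking function works in either direction: a text description can retrieve audio clips, or an audio clip can retrieve matching descriptions.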

License

Apache License V2.0
