model-builder free download

ACE-Step 1.5

The most powerful local music generation model

ACE-Step 1.5 is an open-source foundation model for AI-driven music generation that aims to improve on earlier models in speed, musical coherence, and controllability through changes to architecture and training design. It combines diffusion-based synthesis with compressed autoencoders and lightweight transformer elements to produce high-quality, full-length music tracks. It can generate a complete song in seconds on modern GPUs while remaining efficient enough to run on consumer-grade hardware with minimal memory requirements. ...

Downloads: 53 This Week

Last Update: 2026-05-18


HunyuanVideo-Foley

Multimodal Diffusion with Representation Alignment

HunyuanVideo-Foley is a multimodal diffusion model from Tencent Hunyuan for high-fidelity Foley (sound effects) audio generation synchronized to video scenes. It is designed to generate audio that matches both visual content and textual semantic cues, for use in video production, film, advertising, games, etc. The model architecture aligns audio, video, and text representations to produce realistic synchronized soundtracks.

Downloads: 0 This Week

Last Update: 2025-09-28


AudioLM - Pytorch

Implementation of AudioLM audio generation model in Pytorch

Implementation of AudioLM, a Language Modeling Approach to Audio Generation out of Google Research, in Pytorch. It also extends the work to support conditioning with classifier-free guidance using T5. This allows one to do text-to-audio or TTS, which is not offered in the paper. Yes, this means VALL-E can be trained from this repository. It is essentially the same. This repository now also contains an MIT-licensed version of SoundStream. It is also compatible with EnCodec, however, be aware that it...

Downloads: 2 This Week

Last Update: 2025-01-12


DiffRhythm

Di♪♪Rhythm: Blazingly Fast & Simple End-to-End Song Generation

DiffRhythm is an open-source, diffusion-based model designed to generate full-length songs. It uses a latent diffusion architecture to produce coherent, high-quality, long-form music. It is available on Hugging Face, where users can try a demo or download the model for further use.

1 Review

Downloads: 7 This Week

Last Update: 2025-03-06


MusicLM - Pytorch

Implementation of MusicLM music generation model in Pytorch

Implementation of MusicLM, Google's new SOTA model for music generation using attention networks, in Pytorch. The approach is essentially a text-conditioned AudioLM, but, surprisingly, it uses embeddings from a text-audio contrastive learning model named MuLan. MuLan is what will be built out in this repository, with AudioLM modified from the other repository to support the music generation needs here.

Downloads: 0 This Week

Last Update: 2023-09-06


DeepMozart

Audio generation using diffusion models

Audio generation using diffusion models in PyTorch. The code is based on the audio-diffusion-pytorch repository.

Downloads: 0 This Week

Last Update: 2023-03-29


Search Results for "model-builder"

Showing 6 open source projects for "model-builder"

