Search Results for "music-generation" - Page 4

Showing 1318 open source projects for "music-generation"

1

HunyuanVideo-I2V

A Customizable Image-to-Video Model based on HunyuanVideo

HunyuanVideo-I2V is a customizable image-to-video generation framework from Tencent Hunyuan, built on their HunyuanVideo foundation. It extends video generation so that given a static reference image plus an optional prompt, it generates a video sequence that preserves the reference image’s identity (especially in the first frame) and allows stylized effects via LoRA adapters. The repository includes pretrained weights, inference and sampling scripts, training code for LoRA effects, and support for parallel inference via xDiT. ...

Downloads: 2 This Week

Last Update: 2025-09-23
See Project
2

Bespoke Curator

Synthetic data curation for post-training and data extraction

Curator is an open-source Python library designed to build synthetic data pipelines for training and evaluating machine learning models, particularly large language models. The system helps developers generate, transform, and curate high-quality datasets by combining automated generation with structured validation and filtering. It supports workflows where models are used to produce synthetic examples that can later be refined into reliable training datasets for reasoning, question answering, or structured information extraction tasks. Curator includes tools for monitoring data generation processes and managing dataset quality while large batches of examples are being created. ...

Downloads: 0 This Week

Last Update: 2026-03-14
See Project
3

LongCat-Image

Foundation model for image generation

...The model excels at both text-to-image generation and instruction-guided image editing, offering users versatile capabilities for creative and practical tasks—whether generating art, mockups, or adjusting existing visuals with fine control.

Downloads: 1 This Week

Last Update: 2026-03-24
See Project
4

CodeGeeX4

CodeGeeX4-ALL-9B, a versatile model for all AI software development

CodeGeeX4 is the fourth-generation open source multilingual code large language model (LLM) developed by ZhipuAI. Designed as a powerful AI coding assistant, it supports over 100 programming languages and has been trained on a massive code and natural language corpus. Compared to its predecessors, CodeGeeX4 introduces improved reasoning, stronger alignment with developer needs, and better performance on real-world programming benchmarks.

Downloads: 1 This Week

Last Update: 4 days ago
See Project
5

UNO

A Universal Customization Method for Single and Multi Conditioning

...It suggests a framework for image (or more general generative) modeling where the model can be conditioned either on a single subject or multiple subjects — which may correspond to generating or customizing images featuring specific people, styles, or objects, possibly with fine-grained control over subject identity or composition. Because the project is new (see activity logs for 2025), it seems to aim at bridging between single-subject customization and multi-subject generation in generative modeling — potentially useful for personalized content creation, flexible composition, or controlled generation tasks. UNO likely offers tools to fine-tune or condition generation models so that they can incorporate novel subjects, enabling users to produce custom outputs beyond standard training distribution.

Downloads: 0 This Week

Last Update: 2025-12-02
See Project
6

DataDreamer

DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models

DataDreamer is a tool designed to assist in the generation and manipulation of synthetic data for various applications, including testing and machine learning.

Downloads: 0 This Week

Last Update: 2025-02-02
See Project
7

BIP Utility Library

Generation of mnemonics, seeds, private/public keys and addresses

Generation of mnemonics, seeds, private/public keys, and addresses for different types of cryptocurrencies. A Python library for handling cryptocurrency wallet standards like BIP32, BIP39, and BIP44.

Downloads: 0 This Week

Last Update: 2026-03-02
See Project
8

pyVideoTrans

Translate the video from one language to another and embed dubbing

pyVideoTrans is an ambitious open-source multimedia processing project that assembles speech recognition, subtitle generation, AI translation, voice synthesis, and video assembly into a unified pipeline for converting videos from one language to another with embedded dubbing and captions. At its core it runs speech-to-text models to transcribe audio tracks, translates the resulting text into a target language using local or cloud-based translation engines, synthesizes new speech to match the translated subtitles, and then merges that speech back into the video, creating a fully localized media file. ...

Downloads: 16 This Week

Last Update: 5 days ago
See Project
9

Synthetic Data Vault (SDV)

Synthetic Data Generation for tabular, relational and time series data

The Synthetic Data Vault (SDV) is a Synthetic Data Generation ecosystem of libraries that allows users to easily learn single-table, multi-table and time series datasets and later generate new Synthetic Data that has the same format and statistical properties as the original dataset. Synthetic data can then be used to supplement, augment and in some cases replace real data when training Machine Learning models.

Downloads: 2 This Week

Last Update: 5 days ago
See Project
10

BioNeMo

BioNeMo Framework: For building and adapting AI models

BioNeMo is an AI-powered framework developed by NVIDIA for protein and molecular generation using deep learning models. It provides researchers and developers with tools to design, analyze, and optimize biological molecules, aiding in drug discovery and synthetic biology applications.

Downloads: 4 This Week

Last Update: 2025-10-01
See Project
11

WavTokenizer

SOTA discrete acoustic codec models with 40/75 tokens per second

...Extensive experiments show that WavTokenizer matches or surpasses previous neural codecs across speech, music, and general audio on both objective metrics and subjective listening tests.

Downloads: 1 This Week

Last Update: 2025-11-28
See Project
12

Step-Video-T2V

State-of-the-art (SoTA) text-to-video pre-trained model

...Under the hood it uses a compressed latent representation (a Video-VAE) to reduce spatial and temporal redundancy, and a denoising diffusion (or similar) process over that latent space to generate smooth, plausible motion and visuals. The model handles bilingual input (e.g. English and Chinese) thanks to dual encoders, and supports end-to-end text-to-video generation without requiring external assets. Its training and generation pipeline includes techniques like flow-matching, full 3D attention for temporal consistency, and fine-tuning approaches (e.g. video-based DPO) to improve fidelity and reduce artifacts. As a result, Step-Video-T2V aims to push the frontier of open-source video generation.

Downloads: 1 This Week

Last Update: 2025-12-02
See Project
13

SDGym

Benchmarking synthetic data generation methods

The Synthetic Data Gym (SDGym) is a benchmarking framework for modeling and generating synthetic data. Measure performance and memory usage across different synthetic data modeling techniques – classical statistics, deep learning and more! The SDGym library integrates with the Synthetic Data Vault ecosystem. You can use any of its synthesizers, datasets or metrics for benchmarking. You can also customize the process to include your own work. Select any of the publicly available datasets from the...

Downloads: 1 This Week

Last Update: 5 hours ago
See Project
14

EasyOCR

Ready-to-use OCR with 80+ supported languages

...EasyOCR is a Python module for extracting text from images. It is a general OCR that can read both natural scene text and dense text in documents. We currently support 80+ languages and are expanding. Second-generation models: multiple times smaller size, multiple times faster inference, additional characters and comparable accuracy to the first-generation models. EasyOCR will choose the latest model by default, but you can also specify which model to use. Model weights for the chosen language will be automatically downloaded, or you can download them manually from the model hub. ...

Downloads: 29 This Week

Last Update: 2024-09-24
See Project
15

Pocket TTS

A TTS that fits in your CPU (and pocket)

...It also emphasizes developer ergonomics, providing a straightforward API surface that can be integrated into pipelines, assistants, accessibility tools, or batch generation scripts.

Downloads: 5 This Week

Last Update: 2026-02-16
See Project
16

ComfyUI-3D-Pack

An extensive node suite that enables ComfyUI to process 3D inputs

...The package allows the platform to process inputs such as meshes and UV textures and integrate them into generative workflows similar to those used for image and video generation. It incorporates modern 3D generation technologies including neural radiance fields, Gaussian splatting, and other AI-driven reconstruction techniques. Through these nodes, users can convert images into 3D models, manipulate geometry, and experiment with generative 3D workflows inside the visual pipeline editor.

Downloads: 1 This Week

Last Update: 2026-03-11
See Project
17

LongWriter

Unleashing 10,000+ Word Generation from Long Context LLMs

LongWriter is an open-source framework and set of large language models designed to enable ultra-long text generation that can exceed 10,000 words while maintaining coherence and structure. Traditional large language models can process large inputs but often struggle to generate long outputs due to limitations in training data and alignment strategies. LongWriter addresses this challenge by introducing a specialized dataset and training approach that encourages models to produce longer responses. ...

Downloads: 1 This Week

Last Update: 2026-03-06
See Project
18

Sygil WebUI

Stable Diffusion web UI

Sygil WebUI is a browser-based interface for running Stable Diffusion image generation locally or on a server, wrapping common text-to-image and image-to-image workflows into a practical UI. It provides multiple UI modes (including a legacy Gradio interface) and focuses on making iterative prompting, parameter tuning, and post-processing accessible without writing code. The UI exposes core generation controls like resolution, CFG guidance, sampling steps, samplers, seeds, and batch generation so users can reproduce results and refine outputs systematically. ...

Downloads: 1 This Week

Last Update: 2026-02-03
See Project
19

InvokeAI

InvokeAI is a leading creative engine for Stable Diffusion models

InvokeAI is an implementation of Stable Diffusion, the open source text-to-image and image-to-image generator. It provides a streamlined process with various new features and options to aid the image generation process. It runs on Windows, Mac and Linux machines, and runs on GPU cards with as little as 4 GB of RAM. InvokeAI is a leading creative engine built to empower professionals and enthusiasts alike. Generate and create stunning visual media using the latest AI-driven technologies. InvokeAI offers an industry leading Web Interface, interactive Command Line Interface, and also serves as the foundation for multiple commercial products. ...

1 Review

Downloads: 19 This Week

Last Update: 2026-03-22
See Project
20

PDFStitcher

Code repository for PDFStitcher, a utility to stitch together PDFs

The open source PDF stitching software for sewists, by sewists. PDFStitcher is a utility for stitching together many PDF pages from one document into a single page. This is also called "N-Up" or page imposition. This program was created in order to convert sewing patterns into a convenient format for projecting, though it could be used to stitch together any PDF. Since version 0.4, it is also possible to select layers for inclusion/exclusion in the final output. Additionally, line properties...

Downloads: 12 This Week

Last Update: 2025-06-26
See Project
21

Janus

Unified Multimodal Understanding and Generation Models

Janus is a sophisticated open-source project from DeepSeek AI that aims to unify both visual understanding and image generation in a single model architecture. Rather than having separate systems for “look and describe” and “prompt and generate”, Janus uses an autoregressive transformer framework with a decoupled visual encoder—allowing it to ingest images for comprehension and to produce images from text prompts with shared internal representations. The design tackles long-standing conflicts in multimodal models: namely that the visual encoder has to serve both analysis (understanding) and synthesis (generation) roles. ...

Downloads: 0 This Week

Last Update: 2025-10-20
See Project
22

MoneyPrinter V2

Automate the process of making money online

MoneyPrinter V2 is an open-source automation platform designed to streamline and scale online income generation workflows by combining content creation, social media automation, and marketing strategies into a single system. It is a complete rewrite of the original MoneyPrinter project, focusing on modularity, extensibility, and broader functionality across multiple monetization channels. The platform operates primarily through Python-based scripts that automate tasks such as generating and publishing YouTube Shorts, posting on social media platforms like Twitter, and executing affiliate marketing campaigns. ...

Downloads: 16 This Week

Last Update: 6 days ago
See Project
23

LangChain

⚡ Building applications with LLMs through composability ⚡

Large language models (LLMs) are emerging as a transformative technology, enabling developers to build applications that they previously could not. But using these LLMs in isolation is often not enough to create a truly powerful app - the real power comes when you can combine them with other sources of computation or knowledge. This library is aimed at assisting in the development of those types of applications.

1 Review

Downloads: 11 This Week

Last Update: 1 hour ago
See Project
24

InfiniteYou

Flexible Photo Recrafting While Preserving Your Identity

InfiniteYou is an open-source image-generation and “identity-preserving image editing / generation” framework from ByteDance, designed to generate high-fidelity images that preserve a subject’s identity while allowing flexible editing or re-creation according to textual prompts. Using an architecture built around diffusion transformers (DiTs), InfiniteYou introduces a component called InfuseNet that injects identity features derived from reference images into the generation process — via residual connections — so that the output matches the person’s identity closely, without sacrificing visual quality or text-image alignment. ...

Downloads: 0 This Week

Last Update: 2025-12-02
See Project
25

ComfyUI-Copilot

AI assistant for ComfyUI workflow generation, debugging, and tuning

...It functions as a custom node integrated directly into the ComfyUI environment, allowing users to interact with workflows through natural language and intelligent suggestions. ComfyUI-Copilot focuses on reducing the complexity of building node-based pipelines for generative AI tasks such as image generation, making it more accessible to both beginners and experienced users. It supports the entire workflow lifecycle, including generation, debugging, rewriting, and parameter optimization, helping users iterate more efficiently. ComfyUI-Copilot leverages large language model capabilities to analyze user intent, recommend nodes, and suggest models that match specific requirements. ...

Downloads: 0 This Week

Last Update: 2026-03-18
See Project