DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models
Qwen-Image is a powerful image generation foundation model
Ready-to-use OCR with 80+ supported languages
State-of-the-art (SoTA) text-to-video pre-trained model
Project Lyra: Open Generative 3D World Models
A TTS that fits in your CPU (and pocket)
ComfyUI integration for Microsoft's VibeVoice text-to-speech model
⚡ Building applications with LLMs through composability ⚡
Genome modeling and design across all domains of life
AI-powered video generation skill for OpenClaw
Stable Diffusion web UI
Official implementation of DreamCraft3D
Toolkit for audio, music, and speech generation
InvokeAI is a leading creative engine for Stable Diffusion models
Unofficial Python API and agentic skill for Google NotebookLM
Unified Multimodal Understanding and Generation Models
Python Audio Analysis Library: Feature Extraction, Classification
Retrieval and Retrieval-augmented LLMs
Generate high-definition story short videos with one click using AI
BioNeMo Framework: For building and adapting AI models
CodeGeeX4-ALL-9B, a versatile model for all AI software development
Even 1 minute of voice data can be used to train a good TTS model
Multilingual large voice generation model, providing inference
100–200× Acceleration for Video Diffusion Models
SOTA discrete acoustic codec models with 40/75 tokens per second