A simple, high-quality voice conversion tool focused on ease of use
Stable Diffusion web UI
TTS with Kokoro and ONNX Runtime
High-Resolution Image Synthesis with Latent Diffusion Models
Visual Causal Flow
Public repository for Agent Skills
Research code artifacts for Code World Model (CWM)
GPT-4o for Windows, macOS and Linux
The largest collection of PyTorch image encoders / backbones
Contexts Optical Compression
Open-source infrastructure for Computer-Use Agents. Sandboxes
Block Diffusion for Ultra-Fast Speculative Decoding
Generate short videos with one click using an AI LLM
Audiocraft is a library for audio processing and generation
A reactive notebook for Python
Create videos with Stable Diffusion
The official gpt4free repository
The repository provides code for running inference with SAM 2
Scripts for fine-tuning Meta Llama 3 with composable FSDP & PEFT methods
Clone a voice in 5 seconds to generate arbitrary speech in real-time
Definitions for AI/ML tasks like dataset creation
CLIP: predict the most relevant text snippet given an image
Release for Improved Denoising Diffusion Probabilistic Models
Specification and documentation for Agent Skills
Advanced techniques for RAG systems