Stable Virtual Camera: Generative View Synthesis with Diffusion Models
A Systematic Framework for Interactive World Modeling
Virtual AI anchor that combines state-of-the-art technology
A Model Context Protocol (MCP) Gateway & Registry
High-Resolution 3D Assets Generation with Large Scale Diffusion Models
Best practices on recommendation systems
Python scraper based on AI
Comprehensive Gradio WebUI for audio processing
Python Client for Supabase. Query Postgres from Flask, Django
A Unified Framework for Image Customization
An LLM-powered knowledge curation system that researches topics
Foundational model for human-like, expressive TTS
GLM-4-Voice | End-to-End Chinese-English Conversational Model
Generate Any 3D Scene in Seconds
Agent S: an open agentic framework that uses computers like a human
Photorealistic Synthetic Dataset for Holistic Indoor Scene
Open source machine learning framework to automate text conversations
A library for deep learning end-to-end dialog systems and chatbots
Large Audio Language Model built for natural interactions
Benchmarking synthetic data generation methods
Trainable models and NN optimization tools
State-of-the-art diffusion models for image and audio generation
Style-Bert-VITS2: Bert-VITS2 with more controllable voice styles
Multi-Voice and Prompt-Controlled TTS Engine
Simple and powerful voice changer for Linux, written in Python with GTK