Stable Virtual Camera: Generative View Synthesis with Diffusion Models
A Systematic Framework for Interactive World Modeling
Comprehensive Gradio WebUI for audio processing
High-Resolution 3D Assets Generation with Large Scale Diffusion Models
A Model Context Protocol (MCP) Gateway & Registry
Virtual AI anchor that combines state-of-the-art technology
An LLM-powered knowledge curation system that researches topics
Python Client for Supabase. Query Postgres from Flask, Django
Python scraper based on AI
A Unified Framework for Image Customization
Agent S: an open agentic framework that uses computers like a human
Style-Bert-VITS2: Bert-VITS2 with more controllable voice styles
Open-source machine learning framework to automate text conversations
GLM-4-Voice | End-to-End Chinese-English Conversational Model
State-of-the-art diffusion models for image and audio generation
Generate Any 3D Scene in Seconds
Photorealistic Synthetic Dataset for Holistic Indoor Scene
Foundational model for human-like, expressive TTS
Trainable models and NN optimization tools
Best practices on recommendation systems
A library for deep learning end-to-end dialog systems and chatbots
Benchmarking synthetic data generation methods
Multi-Voice and Prompt-Controlled TTS Engine
Simple and powerful voice changer for Linux, written with Python & GTK
Interpretability and explainability of data and machine learning models