Deep learning optimization library: makes distributed training easy
Generate high-definition short story videos with one click using AI
This repo contains the code for a 1D tokenizer and generator
A Universal Customization Method for Single and Multi Conditioning
A Unified Framework for Image Customization
Flexible Photo Recrafting While Preserving Your Identity
A SOTA open-source image editing model
Multi-Agent daTa geneRation Infra and eXperimentation framework
Bailing is a voice dialogue robot similar to GPT-4o
Build Vision Agents quickly with any model or video provider
An Open Source text-to-speech system built by inverting Whisper
Reading book source
MARS5 speech model (TTS) from CAMB.AI
This repository provides an advanced RAG
An MCP server that autonomously evaluates web applications
A state-of-the-art open visual language model
Chinese and English multimodal conversational language model
Repo of the Qwen2-Audio chat and pretrained large audio language models
Helping you get the most out of AWS, wherever you use MCP
Multi-Modal Neural Networks for Semantic Search, based on Mid-Fusion
Tensor search for humans
The data structure for multimodal data
Django-friendly finite state machine support
Implementation of Imagen, Google's Text-to-Image Neural Network
Open Source Differentiable Computer Vision Library