A SOTA open-source image editing model
Chinese and English multimodal conversational language model
Implementation of Imagen, Google's Text-to-Image Neural Network
Fast image augmentation library and an easy-to-use wrapper
Autoregressive Model Beats Diffusion
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
An Open Source text-to-speech system built by inverting Whisper
Large-language-model & vision-language-model based on Linear Attention
Capable of understanding text, audio, vision, video
Private chat with local GPT with documents, images, video, etc.
Easy Docker setup for Stable Diffusion with user-friendly UI
An AI for Music Generation
Implementation of Make-A-Video, new SOTA text-to-video generator
Implementation of Video Diffusion Models
AI-powered tool to quickly remove watermarks from images flawlessly
High-Resolution Image Synthesis with Latent Diffusion Models
A Pioneering Open-Source Alternative to GPT-4o
AI Suite for upscaling, interpolating & restoring images/videos
Scientific Visualisation Made Easy
Chat & pretrained large vision-language model
Plug-n-play module turning text-to-image models into animation generators
Towards Real-World Vision-Language Understanding
Overcoming Data Limitations for High-Quality Video Diffusion Models
A Customizable Image-to-Video Model based on HunyuanVideo
Free AI-powered church presentation & live sermon display app