A playground to generate images from any text prompt using SD
Foundation model for image generation
GLM-Image: Auto-regressive for Dense-knowledge and High-fidelity Image
Image generation model with single-stream diffusion transformer
General-purpose image editing model that delivers high-fidelity
Multimodal-Driven Architecture for Customized Video Generation
Official inference repo for FLUX.2 models
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
Qwen-Image-Layered: Layered Decomposition for Inherent Editability
A Powerful Native Multimodal Model for Image Generation
Official inference repo for FLUX.1 models
CLIP: predict the most relevant text snippet given an image
A 0.1B Omni model trained from scratch
Tokenizer-Free TTS for Multilingual Speech Generation
A Multi-Modal World Model for Reconstructing, Generating, Simulation
A Unified Framework for Text-to-3D and Image-to-3D Generation
Autoregressive Model Beats Diffusion
Collection of Gemma 3 variants that are trained for performance
Offline inference engine for art, real-time voice conversations
Code for openai.fm, a demo for the OpenAI Speech API
Contexts Optical Compression
JavaScript OCR and text extraction for images and PDFs
Framework for building neural networks
Flexible Photo Recrafting While Preserving Your Identity
AI-powered code assistant for Vim. OpenAI and ChatGPT plugin for Vim