ChatGPT interface with better UI
Controllable & emotion-expressive zero-shot TTS
Stable Diffusion WebUI Forge is a platform on top of Stable Diffusion
DeepMind model for tracking arbitrary points across videos & robotics
Global weather forecasting model using graph neural networks and JAX
Code for Mesh R-CNN, ICCV 2019
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
Language modeling in a sentence representation space
An AI-powered security review GitHub Action using Claude
Large language model & vision-language model based on linear attention
Capable of understanding text, audio, vision, video
MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training
A Unified Framework for Text-to-3D and Image-to-3D Generation
Reproduction of Poetiq's record-breaking submission to the ARC-AGI-1
OCR expert VLM powered by Hunyuan's native multimodal architecture
Audio foundation model excelling in audio understanding
Stable Diffusion with Core ML on Apple Silicon
Towards Real-World Vision-Language Understanding
The ChatGPT Retrieval Plugin lets you easily find personal documents
Pushing the Limits of Mathematical Reasoning in Open Language Models
Chat & pretrained large vision language model
Chat & pretrained large audio language model proposed by Alibaba Cloud
High-Resolution Image Synthesis with Latent Diffusion Models
AI Suite for upscaling, interpolating & restoring images/videos
Open-source, high-performance Mixture-of-Experts large language model