Code for Mesh R-CNN, ICCV 2019
VGGSfM: Visual Geometry Grounded Deep Structure From Motion
State-of-the-art Image & Video CLIP, Multimodal Large Language Models
Language modeling in a sentence representation space
An AI-powered security review GitHub Action using Claude
Renderer for the harmony response format to be used with gpt-oss
Implementation of the Surya Foundation Model for Heliophysics
A SOTA open-source image editing model
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Multi-modal large language model designed for audio understanding
ChatGPT interface with better UI
Stable Diffusion with Core ML on Apple Silicon
Towards Real-World Vision-Language Understanding
The ChatGPT Retrieval Plugin lets you easily find personal documents
High-Resolution Image Synthesis with Latent Diffusion Models
Pushing the Limits of Mathematical Reasoning in Open Language Models
Qwen2.5-Coder is the code version of Qwen2.5, the large language model
Chat & pretrained large vision language model
A Conversational Speech Generation Model
Open-source, high-performance Mixture-of-Experts large language model
AI-powered tool to quickly remove watermarks from images flawlessly
Di♪♪Rhythm: Blazingly Fast & Simple End-to-End Song Generation
StudioOllamaUI is a local, portable interface for Ollama