An AI-powered security review GitHub Action using Claude
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
Revolutionizing Database Interactions with Private LLM Technology
Tool for exploring and debugging transformer model behaviors
OpenTinker is an RL-as-a-Service infrastructure for foundation models
Block Diffusion for Ultra-Fast Speculative Decoding
C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
State-of-the-art (SoTA) text-to-video pre-trained model
One-click local MCP server installation in desktop apps
Open-source industrial-grade ASR models
Pushing the Limits of Mathematical Reasoning in Open Language Models
Foundation model for image generation
Recovering the Visual Space from Any Views
VMZ: Model Zoo for Video Modeling
Video understanding codebase from FAIR for reproducing video models
Towards Real-World Vision-Language Understanding
Renderer for the harmony response format to be used with gpt-oss
Easy Docker setup for Stable Diffusion with user-friendly UI
PyTorch code and models for the DINOv2 self-supervised learning
Stable Diffusion with Core ML on Apple Silicon
Long-form streaming TTS system for multi-speaker dialogue generation
Global weather forecasting model using graph neural networks and JAX
General-purpose image editing model that delivers high-fidelity
Generate Any 3D Scene in Seconds