Phi-3.5 for Mac: Locally-run Vision and Language Models
Reproduction of Poetiq's record-breaking submission to the ARC-AGI-1
Example Discord bot written in Python that uses the completions API
CogView4, CogView3-Plus and CogView3 (ECCV 2024)
Real-time behaviour synthesis with MuJoCo, using Predictive Control
Hackable and optimized Transformers building blocks
GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning
DeepSeek Coder: Let the Code Write Itself
CodeGeeX2: A More Powerful Multilingual Code Generation Model
LTX-Video Support for ComfyUI
Tool for exploring and debugging transformer model behaviors
A Unified Framework for Text-to-3D and Image-to-3D Generation
Qwen2.5-VL is the multimodal large language model series
Programmatic access to the AlphaGenome model
GLM-4 series: Open Multilingual Multimodal Chat LMs
GPT4V-level open-source multi-modal model based on Llama3-8B
A PyTorch library for implementing flow matching algorithms
Pushing the Limits of Mathematical Reasoning in Open Language Models
Global weather forecasting model using graph neural networks and JAX
An AI-powered security review GitHub Action using Claude
Renderer for the harmony response format to be used with gpt-oss
Inference script for Oasis 500M
FAIR Sequence Modeling Toolkit 2
Open-source large language model family from Tencent Hunyuan
Chat & pretrained large audio language model proposed by Alibaba Cloud