A new set of lightweight, state-of-the-art open foundation models
Port of Facebook's LLaMA model in C/C++
Phi-3.5 for Mac: Locally-run Vision and Language Models
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Clean and efficient FP8 GEMM kernels with fine-grained scaling
New family of code large language models (LLMs)
Tiny vision language model
A Family of Open Foundation Models for Code Intelligence
Block Diffusion for Ultra-Fast Speculative Decoding
ICLR 2024 Spotlight: curation/training code, metadata, and distribution
MiniMax-M2, a model built for Max coding & agentic workflows
Qwen2.5-Coder is the code-specialized version of the Qwen2.5 large language model
React app for inspecting, building and debugging with the Realtime API
Lightweight multimodal translation model for 55 languages
Custom BLEURT model for evaluating text similarity using PyTorch
Lightweight 24B agentic coding model with vision and long context
Efficient MoE reasoning model for coding and math workloads
Jan-v1-edge: efficient 1.7B reasoning model optimized for edge devices
Compact 8B multimodal instruct model optimized for edge deployment
Small 3B-base multimodal model ideal for custom AI on edge hardware
Ultra-efficient 3B multimodal instruct model built for edge deployment
Compact 3B-parameter multimodal model for efficient on-device reasoning
OpenAI’s compact 20B open model for fast, agentic, and local use
Self-evolving AI model for agents, coding, and complex workflows