Python inference and LoRA trainer package for the LTX-2 audio–video
FlashMLA: Efficient Multi-head Latent Attention Kernels
ChatGLM-6B: An Open Bilingual Dialogue Language Model
Reference PyTorch implementation and models for DINOv3
RGBD video generation model conditioned on camera input
MiniMax-M2, a model built for Max coding & agentic workflows
PyTorch implementation of JiT
Analyze computation-communication overlap in V3/R1
Open-source, high-performance Mixture-of-Experts large language model
Chinese LLaMA-2 & Alpaca-2 Large Model Phase II Project
Efficient MoE model for million-token reasoning and coding
ClinicalBERT model trained on MIMIC notes for clinical NLP tasks
Lightweight multimodal translation model for 55 languages
Efficient 14B multimodal instruct model supporting edge deployment and FP8
Grok-2.5, a large-scale xAI model for local inference with SGLang
Tiny pre-trained IBM model for multivariate time series forecasting
Quantized 675B multimodal instruct model optimized for NVFP4
Ultra-efficient 3B multimodal instruct model built for edge deployment