Python inference and LoRA trainer package for the LTX-2 audio–video model
FlashMLA: Efficient Multi-head Latent Attention Kernels
ChatGLM-6B: An Open Bilingual Dialogue Language Model
Reference PyTorch implementation and models for DINOv3
RGBD video generation model conditioned on camera input
PyTorch implementation of JiT
MiniMax-M2, a model built for Max coding & agentic workflows
Analyze computation-communication overlap in V3/R1
Open-source, high-performance Mixture-of-Experts large language model
Chinese LLaMA-2 & Alpaca-2 Large Model Phase II Project
Lightweight multimodal translation model for 55 languages
Efficient MoE model for million-token reasoning and coding
ClinicalBERT model trained on MIMIC notes for clinical NLP tasks
Efficient 14B multimodal instruct model with edge deployment and FP8
Grok-2.5, a large-scale xAI model for local inference with SGLang
Tiny pre-trained IBM model for multivariate time series forecasting
Quantized 675B multimodal instruct model optimized for NVFP4
Ultra-efficient 3B multimodal instruct model built for edge deployment