MoBA: Mixture of Block Attention for Long-Context LLMs
Qwen3.5 is the large language model series developed by the Qwen team
MiMo-V2-Flash: Efficient Reasoning, Coding, and Agentic Foundation
Running a big model on a small laptop
Open-source, high-performance AI model with advanced reasoning
A Powerful Native Multimodal Model for Image Generation
A Next-Generation Training Engine Built for Ultra-Large MoE Models
Towards self-verifiable mathematical reasoning
From nobody to large language model (LLM) hero
System-Level Intelligent Router for Mixture-of-Models in the Cloud
Ling-V2 is an MoE LLM developed and open-sourced by InclusionAI
Clean and efficient FP8 GEMM kernels with fine-grained scaling
157 models, 30 providers, one command to find what runs on your hardware
Powerful AI language model (MoE) optimized for efficiency and performance
Kimi K2 is the large language model series developed by Moonshot AI
Open-weight, large-scale hybrid-attention reasoning model
Strong, Economical, and Efficient Mixture-of-Experts Language Model
Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
The home of the ICU project source code
GLM-4.5: Open-source LLM for intelligent agents by Z.ai
A record of study notes and interview questions
Fully automatic censorship removal for language models
PRML algorithms implemented in Python
Python-free Rust inference server
Research papers and blog posts for transitioning into AI Engineering