MoBA: Mixture of Block Attention for Long-Context LLMs
Qwen3.5 is the large language model series developed by the Qwen team
MiMo-V2-Flash: Efficient Reasoning, Coding, and Agentic Foundation
Open-source, high-performance AI model with advanced reasoning
A Powerful Native Multimodal Model for Image Generation
Running a big model on a small laptop
A Next-Generation Training Engine Built for Ultra-Large MoE Models
Towards self-verifiable mathematical reasoning
From zero to large language model (LLM) hero
System-Level Intelligent Router for Mixture-of-Models in the Cloud
Ling-V2 is a MoE LLM provided and open-sourced by InclusionAI
Clean and efficient FP8 GEMM kernels with fine-grained scaling
157 models, 30 providers, one command to find what runs on your hardware
Kimi K2 is the large language model series developed by Moonshot AI
Powerful AI language model (MoE) optimized for efficiency and performance
Open-weight, large-scale hybrid-attention reasoning model
Strong, Economical, and Efficient Mixture-of-Experts Language Model
GLM-4.5: Open-source LLM for intelligent agents by Z.ai
The home of the ICU project source code
Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
Mainly records knowledge notes and interview questions
Fully automatic censorship removal for language models
PRML algorithms implemented in Python
Python-free Rust inference server
Research papers and blogs for transitioning into AI Engineering