Self-hosted outdoor activity tracker
Strong, Economical, and Efficient Mixture-of-Experts Language Model
Open-weight, large-scale hybrid-attention reasoning model
Implementation for MatMul-free LM
BitNet: Scaling 1-bit Transformers for Large Language Models
Drop-in replacement for standard residual connections in Transformers
On the Structural Pruning of Large Language Models
UCCL is an efficient communication library for GPUs
Collection of common code shared among different research projects
DE-based Weight Optimisation for Heterogeneous Ensemble
A fast implementation of LeCun's convolutional neural network
version 0.1
Qwen3-Next: 80B instruct LLM with ultra-long context up to 1M tokens
Frontier-scale 675B multimodal base model for custom AI training