Strong, Economical, and Efficient Mixture-of-Experts Language Model
Open-weight, large-scale hybrid-attention reasoning model
Usable Implementation of "Bootstrap Your Own Latent" self-supervised
Implementation for MatMul-free LM
Text and image to video generation: CogVideoX and CogVideo
BitNet: Scaling 1-bit Transformers for Large Language Models
Drop-in replacement for standard residual connections in Transformers
On the Structural Pruning of Large Language Models
UCCL is an efficient communication library for GPUs
Open platform for training, serving, and evaluating language models
Interactive tool designed to help users understand how neural network
VITS2 backbone with multilingual-bert
The TensorFlow Object Counting API is an open source framework
Deep Learning Chinese Word Segmentation
DE-based Weight Optimisation for Heterogeneous Ensemble
A fast implementation of LeCun's convolutional neural network
A neural network library for Java
Qwen3-Next: 80B instruct LLM with ultra-long context up to 1M tokens
Frontier-scale 675B multimodal base model for custom AI training