Dataset of GPT-2 outputs for research in detection, biases, and more
Towards Ultimate Expert Specialization in Mixture-of-Experts Language
Diffusion Transformer with Fine-Grained Chinese Understanding
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
Chinese and English multimodal conversational language model
NVIDIA Isaac GR00T N1.5 is the world's first open foundation model
GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning
GLM-4 series: Open Multilingual Multimodal Chat LMs
Inference framework for 1-bit LLMs
Chinese LLaMA-2 & Alpaca-2 Large Model Phase II Project
GLM-130B: An Open Bilingual Pre-Trained Model (ICLR 2023)
Chinese LLaMA & Alpaca large language model + local CPU/GPU training
Repo for external large-scale work
Official PyTorch Implementation of "Scalable Diffusion Models"
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech)
LLaMA: Open and Efficient Foundation Language Models
Implementation of model parallel autoregressive transformers on GPUs
Open-Source Financial Large Language Models!
Open-source, high-performance Mixture-of-Experts large language model
Qwen2.5-Coder is the code version of Qwen2.5, the large language model
Janus-Series: Unified Multimodal Understanding and Generation Models
An Open Bilingual Chat LLM
Open Multilingual Multimodal Chat LMs
Open-source pre-training implementation of Google's LaMDA in PyTorch