Implementation of model-parallel autoregressive transformers on GPUs
Framework and no-code GUI for fine-tuning LLMs
INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
State-of-the-art Parameter-Efficient Fine-Tuning
Ongoing research on training transformer models at scale
BISHENG is an open LLM DevOps platform for next-generation applications
LLM training code for MosaicML foundation models
Training and serving large-scale neural networks
Open-source large language model by Alibaba
Open-source, high-performance AI model with advanced reasoning
Powerful Mixture-of-Experts (MoE) language model optimized for efficiency and performance
Qwen2.5-Coder is the code-focused version of the Qwen2.5 large language model
Open-source, high-performance Mixture-of-Experts large language model
An open-source bilingual conversational chat LLM
An implementation of model-parallel GPT-2 and GPT-3-style models