A high-performance ML model serving framework that offers dynamic batching
BertViz: Visualize Attention in NLP Models (BERT, GPT2, BART, etc.)
Large language model and vision-language model based on linear attention
Open-source, high-performance Mixture-of-Experts large language model
Open-source pre-training implementation of Google's LaMDA in PyTorch