ChatGLM2-6B: An Open Bilingual Chat LLM
Implement a CPU from scratch and play with large model deployments
ChatGLM3 series: Open Bilingual Chat LLMs
A high-performance ML model serving framework that offers dynamic batching
Chinese Llama-3 LLMs developed from Meta Llama 3
Gemma open-weight LLM library, from Google DeepMind
Low-latency REST API for serving text embeddings
Tools for merging pretrained large language models
Tensor search for humans
Run LLMs locally on Cloud Workstations
Run Mixtral-8x7B models in Colab or consumer desktops
Chinese LLaMA-2 & Alpaca-2 Large Model Phase II Project
Run any Llama 2 model locally with a Gradio UI on GPU or CPU from anywhere
Chinese LLaMA & Alpaca large language model + local CPU/GPU training