Find the local LLM that actually runs and performs best
ChatGLM2-6B: An Open Bilingual Chat LLM
A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)
Agentic, Reasoning, and Coding (ARC) foundation models
A.S.E (AICGSecEval) is a repository-level AI-generated code security
Code for the paper "Evaluating Large Language Models Trained on Code"
LongBench v2 and LongBench (ACL '25 & '24)
INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)
Leaderboard Comparing LLM Performance at Producing Hallucinations
Benchmark LLMs by fighting in Street Fighter 3
LLM inference in C/C++
A high-performance ML model serving framework that offers dynamic batching
High-speed Large Language Model Serving for Local Deployment
Implement CPU from scratch and play with large model deployments
Run AI models locally on your machine with Node.js bindings for llama
ChatGLM3 series: Open Bilingual Chat LLMs
C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)
A Gym environment for web task automation
157 models, 30 providers, one command to find what runs on hardware
Private Open AI on Kubernetes
Advanced language and coding AI model
Open-source model for program synthesis
Fast, flexible LLM inference
OpenCompass is an LLM evaluation platform