Port of Facebook's LLaMA model in C/C++
Get up and running with Llama 2 and other large language models
Run Llama 2 inference in one file of pure C
Python bindings for the Transformer models implemented in C/C++
Locally run an instruction-tuned, chat-style LLM