Alpaca.cpp
Locally run an Instruction-Tuned Chat-Style LLM
...Download the zip file for your operating system from the latest release. The weights are based on the published fine-tunes from alpaca-lora, which were converted back into a PyTorch checkpoint with a modified script and then quantized with llama.cpp using its standard procedure.
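To make the "converted back into a PyTorch checkpoint" step more concrete, here is a minimal sketch of what folding a LoRA adapter into a base weight matrix generally looks like: the low-rank update `B @ A`, scaled by `alpha / rank`, is added to the original weight. This is not the modified script mentioned above. The function names, the toy matrices, and the `alpha` and `rank` values are all illustrative assumptions, and plain Python lists stand in for tensors.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_a, lora_b, alpha, rank):
    """Return weight + (alpha / rank) * (lora_b @ lora_a) as a new matrix."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)  # low-rank update, same shape as weight
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy 2x2 base weight with a rank-1 adapter (all values are made up).
base = [[1.0, 0.0], [0.0, 1.0]]
lora_a = [[1.0, 2.0]]      # shape (rank, in_features)
lora_b = [[0.5], [0.0]]    # shape (out_features, rank)

merged = merge_lora(base, lora_a, lora_b, alpha=2.0, rank=1)
print(merged)
```

Once the adapter is merged this way, the result is an ordinary dense weight matrix, which is why a merged checkpoint can presumably go through the same quantization path as any other model.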