...For davinci and other non-chat models, the output is prefixed to the prompt. Compose shell commands as you would in a script. Try with a custom model. By default, gptee uses gpt-3.5-turbo.
...This combines the LLaMA foundation model with an open reproduction of Stanford Alpaca, a fine-tuning of the base model to obey instructions (akin to the RLHF used to train ChatGPT), and a set of modifications to llama.cpp to add a chat interface. Download the zip file corresponding to your operating system from the latest release. The weights are based on the published fine-tunes from alpaca-lora, converted back into a PyTorch checkpoint with a modified script and then quantized with llama.cpp in the usual way.
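To give a rough feel for what the quantization step does to the weights, here is a minimal sketch of symmetric block-wise 4-bit quantization in plain Python. It is an illustration only, not llama.cpp's actual on-disk format or code: the function names, the block size of 32, and the single scale stored per block are all assumptions made for this example.

```python
def quantize_block(values, bits=4):
    """Quantize one block of floats to signed integers sharing a single scale.

    Hypothetical helper for illustration; not part of llama.cpp.
    """
    qmax = 2 ** (bits - 1) - 1                  # 7 for 4-bit signed values
    amax = max(abs(v) for v in values)          # largest magnitude in the block
    scale = amax / qmax if amax > 0 else 1.0    # avoid dividing by zero
    # Round each weight to the nearest integer step and clamp to the range.
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return scale, q


def dequantize_block(scale, q):
    """Recover approximate floats from one quantized block."""
    return [scale * x for x in q]


def quantize(weights, block_size=32, bits=4):
    """Split a flat weight list into blocks and quantize each one."""
    return [
        quantize_block(weights[i:i + block_size], bits)
        for i in range(0, len(weights), block_size)
    ]


def dequantize(blocks):
    """Concatenate the dequantized blocks back into one flat list."""
    restored = []
    for scale, q in blocks:
        restored.extend(dequantize_block(scale, q))
    return restored
```

Calling `dequantize(quantize(weights))` returns a list of the same length whose entries differ from the originals by at most half a quantization step in each block. That bounded error is the price paid for the much smaller file.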