Add Support for llama.cpp Server Mode
Add a provider that communicates with the llama.cpp HTTP server (llama-server). This server exposes a simple API for running local LLM inference. Supporting it would allow Askimo to connect directly to standalone llama.cpp instances on macOS, Linux, Windows, and ARM devices.
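For orientation, here is a minimal sketch of talking to a standalone llama-server instance from the JVM side, assuming Kotlin and the standard java.net.http client (no Askimo-specific types). The /health route used below is the readiness check described in the llama.cpp server documentation; the exact path and response shape should be verified against the llama.cpp version being targeted.

```kotlin
import java.net.URI
import java.net.http.HttpClient
import java.net.http.HttpRequest
import java.net.http.HttpResponse

// Probe a running llama-server instance before wiring it into a provider.
// The /health route is documented in the llama.cpp server README as a
// readiness check; verify path and response against the version in use.
fun main() {
    val baseUrl = "http://localhost:8080" // llama-server's default host:port
    val request = HttpRequest.newBuilder()
        .uri(URI.create("$baseUrl/health"))
        .GET()
        .build()
    val response = HttpClient.newHttpClient()
        .send(request, HttpResponse.BodyHandlers.ofString())
    // A 200 response indicates the server is up and the model has finished loading.
    println("llama-server replied: ${response.statusCode()} ${response.body()}")
}
```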
Add a new provider: LlamaCppServerProvider.
Default base URL: http://localhost:8080.
Allow the user to override the endpoint (see the sketch below).
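A rough sketch of how LlamaCppServerProvider could satisfy these points, again assuming Kotlin and java.net.http. Only the class name, the http://localhost:8080 default, and the endpoint override come from this issue; the ChatProvider interface, the chat method, and the use of llama-server's OpenAI-compatible /v1/chat/completions route are illustrative assumptions, and the JSON handling is deliberately simplified.

```kotlin
import java.net.URI
import java.net.http.HttpClient
import java.net.http.HttpRequest
import java.net.http.HttpResponse

// Hypothetical provider abstraction; Askimo's real interface will differ.
interface ChatProvider {
    fun chat(userMessage: String): String
}

// Proposed provider: defaults to llama-server's standard local endpoint,
// but lets the user point it at any host/port via the constructor.
class LlamaCppServerProvider(
    private val baseUrl: String = "http://localhost:8080",
    private val client: HttpClient = HttpClient.newHttpClient(),
) : ChatProvider {

    override fun chat(userMessage: String): String {
        // Assumes the OpenAI-compatible chat route exposed by llama-server;
        // JSON is hand-rolled here only to keep the sketch dependency-free.
        val escaped = userMessage.replace("\\", "\\\\").replace("\"", "\\\"")
        val body = """{"messages":[{"role":"user","content":"$escaped"}]}"""
        val request = HttpRequest.newBuilder()
            .uri(URI.create("${baseUrl.trimEnd('/')}/v1/chat/completions"))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build()
        val response = client.send(request, HttpResponse.BodyHandlers.ofString())
        return response.body() // caller would parse choices[0].message.content from this JSON
    }
}

// Example: the default covers a local llama-server, the override covers remote instances.
fun main() {
    val local = LlamaCppServerProvider()
    val remote = LlamaCppServerProvider(baseUrl = "http://192.168.1.50:9000")
    println(local.chat("Say hello in one sentence."))
    println(remote.chat("Say hello in one sentence."))
}
```

Keeping baseUrl as a constructor default, rather than a hard-coded constant, preserves the out-of-the-box behavior while leaving room for the override to be exposed through whatever configuration mechanism Askimo already uses.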