Secret Llama
Fully private LLM chatbot that runs entirely with a browser
...The interface mirrors the modern chat UX you’d expect—streaming responses, markdown, and a clean layout—so there’s no usability tradeoff to gain privacy. Under the hood it uses a web-native inference engine to accelerate model execution with GPU/WebGPU when available, keeping responses responsive even without a backend. It’s a great option for developers and teams who want to prototype assistants or handle sensitive text without sending prompts to external APIs.