llama2-webui

Running Llama 2 with gradio web UI on GPU or CPU from anywhere (Linux/Windows/Mac).

Features

Supporting all Llama 2 models (7B, 13B, 70B, GPTQ, GGML) with 8-bit, 4-bit mode
Use llama2-wrapper as your local llama2 backend for Generative Agents/Apps; colab example
Run OpenAI Compatible API on Llama2 models
Supporting models: Llama-2-7b/13b/70b, all Llama-2-GPTQ, all Llama-2-GGML
Supporting model backends: tranformers, bitsandbytes(8-bit inference), AutoGPTQ(4-bit inference), llama.cpp
Demos: Run Llama2 on MacBook Air; Run Llama2 on free Colab T4 GPU

Project Samples

Project Activity

See All Activity >

License

MIT License

Follow llama2-webui

llama2-webui Web Site

Other Useful Business Software

Forever Free Full-Stack Observability | Grafana Cloud

Our generous forever free tier includes the full platform, including the AI Assistant, for 3 users with 10k metrics, 50GB logs, and 50GB traces.

Built on open standards like Prometheus and OpenTelemetry, Grafana Cloud includes Kubernetes Monitoring, Application Observability, Incident Response, plus the AI-powered Grafana Assistant. Get started with our generous free tier today.

Create free account

Rate This Project

User Reviews

Be the first to post a review of llama2-webui!

Additional Project Details

Programming Language

Python

Related Categories

Python Large Language Models (LLM), Python LLM Inference Tool

Registered

2023-08-25

Similar Business Software

Gemini Enterprise Agent Platform

Gemini Enterprise Agent Platform is a comprehensive solution from Google Cloud designed to help organizations build, scale, govern, and optimize AI agents. It represents the evolution of Vertex AI, combining advanced model development with new capabilities for agent orchestration and...

See Software
LM-Kit.NET

LM-Kit.NET is a cutting-edge, high-level inference SDK designed specifically to bring the advanced capabilities of Large Language Models (LLM) into the C# ecosystem. Tailored for developers working within .NET, LM-Kit.NET provides a comprehensive suite of powerful Generative AI tools, making...

See Software
RunPod

RunPod offers a cloud-based platform designed for running AI workloads, focusing on providing scalable, on-demand GPU resources to accelerate machine learning (ML) model training and inference. With its diverse selection of powerful GPUs like the NVIDIA A100, RTX 3090, and H100, RunPod supports...

See Software
Google AI Studio

Google AI Studio is a unified development platform that helps teams explore, build, and deploy applications using Google’s most advanced AI models, including Gemini 3. It brings text, image, audio, and video models together in one interactive playground. With vibe coding, developers can use...

See Software
GMI Cloud

GMI Cloud provides a complete platform for building scalable AI solutions with enterprise-grade GPU access and rapid model deployment. Its Inference Engine offers ultra-low-latency performance optimized for real-time AI predictions across a wide range of applications. Developers can deploy...

See Software
Mistral 7B

Mistral 7B is a 7.3-billion-parameter language model that outperforms larger models like Llama 2 13B across various benchmarks. It employs Grouped-Query Attention (GQA) for faster inference and Sliding Window Attention (SWA) to efficiently handle longer sequences. Released under the Apache 2.0...

See Software

Report inappropriate content

llama2-webui

Run any Llama 2 locally with gradio UI on GPU or CPU from anywhere

Get an email when there's a new version of llama2-webui

Features

Project Samples

Project Activity

Categories

License

Follow llama2-webui

User Reviews

Additional Project Details

Programming Language

Related Categories

Registered