LiteRT-LM is Google’s open-source inference framework for deploying large language models on edge devices. It is built for production-oriented local LLM execution across Android, iOS, desktop, web, embedded, and IoT environments. The framework focuses on performance, hardware acceleration, and efficient model serving close to the user instead of relying only on remote cloud inference. It supports CPU execution across major platforms and adds GPU or NPU acceleration where available. LiteRT-LM is especially relevant for developers building private, low-latency AI features on phones, laptops, Raspberry Pi-style devices, and other edge hardware. Its goal is to make modern language models usable in local applications with a consistent deployment stack.

Features

  • Edge LLM inference
  • Android, iOS, desktop, web, and IoT support
  • CPU, GPU, and NPU acceleration
  • Prebuilt binaries and mobile demos
  • Production-ready deployment focus
  • Local low-latency AI execution

Project Samples

Project Activity

See All Activity >

Categories

Machine Learning

License

Apache License V2.0

Follow LiteRT-LM

LiteRT-LM Web Site

Other Useful Business Software
Custom VMs From 1 to 96 vCPUs With 99.95% Uptime Icon
Custom VMs From 1 to 96 vCPUs With 99.95% Uptime

General-purpose, compute-optimized, or GPU/TPU-accelerated. Built to your exact specs.

Live migration and automatic failover keep workloads online through maintenance. One free e2-micro VM every month.
Try Free
Rate This Project
Login To Rate This Project

User Reviews

Be the first to post a review of LiteRT-LM!

Additional Project Details

Operating Systems

Android, Apple iPhone, Linux, Mac, Windows

Programming Language

C++

Related Categories

C++ Machine Learning Software

Registered

1 day ago