LiteRT-LM is Google’s open-source inference framework for deploying large language models on edge devices. It is built for production use and runs LLMs locally across Android, iOS, desktop, web, embedded, and IoT environments. The framework focuses on performance, hardware acceleration, and serving models close to the user rather than depending solely on remote cloud inference. It supports CPU execution across major platforms and adds GPU or NPU acceleration where available. LiteRT-LM is especially relevant for developers building private, low-latency AI features on phones, laptops, Raspberry Pi-style devices, and other edge hardware. Its goal is to make modern language models usable in local applications through a consistent deployment stack.

Features

  • Edge LLM inference
  • Android, iOS, desktop, web, and IoT support
  • CPU, GPU, and NPU acceleration
  • Prebuilt binaries and mobile demos
  • Production-ready deployment focus
  • Local low-latency AI execution

Categories

Machine Learning

License

Apache License V2.0

Additional Project Details

Operating Systems

Android, Apple iPhone, Linux, Mac, Windows

Programming Language

C++

Related Categories

C++ Machine Learning Software

Registered

5 hours ago