LiteRT-LM is Google’s open-source inference framework for deploying large language models on edge devices. It is built for production use and runs LLMs locally across Android, iOS, desktop, web, embedded, and IoT environments. The framework emphasizes performance, hardware acceleration, and efficient model serving close to the user rather than relying solely on remote cloud inference. It supports CPU execution across major platforms and adds GPU or NPU acceleration where available. LiteRT-LM is especially relevant for developers building private, low-latency AI features on phones, laptops, Raspberry Pi-style devices, and other edge hardware. Its goal is to make modern language models usable in local applications through a consistent deployment stack.

Features

  • Edge LLM inference
  • Android, iOS, desktop, web, and IoT support
  • CPU, GPU, and NPU acceleration
  • Prebuilt binaries and mobile demos
  • Production-ready deployment focus
  • Local low-latency AI execution

Categories

Machine Learning

License

Apache License V2.0

Additional Project Details

Operating Systems

Android, Apple iPhone, Linux, Mac, Windows

Programming Language

C++

Related Categories

C++ Machine Learning Software

Registered

11 hours ago