Google's On-device framework for high-performance ML & GenAI deployment on edge platforms, via efficient conversion, runtime, and optimization
Get Started | Contributing | License | Security Policy | Documentation
LiteRT continues the legacy of TensorFlow Lite as the trusted, high-performance runtime for on-device AI.
LiteRT V2 (aka "Next", as announced at Google I/O '25) introduces a new set of APIs featuring advanced GPU/NPU acceleration, delivering superior performance and making on-device ML inference easier than ever.

- New LiteRT v2 API: Streamline development with automated accelerator selection, true async execution, and efficient I/O buffer handling (see the sketch after this list).
- Unified NPU Acceleration: Offer seamless access to NPUs from major chipset providers with a consistent developer experience. LiteRT NPU acceleration is available through an Early Access Program.
- Best-in-class GPU Performance: Use state-of-the-art GPU acceleration for on-device ML. The new buffer interoperability enables zero-copy I/O and minimizes latency across various GPU buffer types.
- Superior Generative AI inference: Enable the simplest integration with the best performance for GenAI models.
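To make the first bullet concrete, here is a rough Kotlin sketch of compiling a model against a chosen accelerator and running it through the v2 buffer API. Treat it as illustrative only: the package import, the `CompiledModel.Options`/`Accelerator` names, and the buffer helper methods are assumptions based on the LiteRT Next announcement, and `model.tflite` is a placeholder asset; the Get Started guide has the authoritative API.

```kotlin
// Illustrative sketch of the LiteRT v2 (Next) Kotlin API. Package path,
// option/accelerator names, and buffer helpers are assumptions; see the
// Get Started guide for the authoritative API.
import android.content.Context
import com.google.ai.edge.litert.Accelerator
import com.google.ai.edge.litert.CompiledModel

fun classify(context: Context, input: FloatArray): FloatArray {
    // Compile the model once, explicitly requesting the GPU accelerator.
    // Leaving the accelerator unspecified lets the runtime pick one.
    val model = CompiledModel.create(
        context.assets,
        "model.tflite",                         // placeholder asset name
        CompiledModel.Options(Accelerator.GPU),
    )

    // The runtime allocates I/O buffers that can interoperate with GPU
    // buffers, enabling zero-copy hand-off where the hardware supports it.
    val inputBuffers = model.createInputBuffers()
    val outputBuffers = model.createOutputBuffers()

    inputBuffers[0].writeFloat(input)
    model.run(inputBuffers, outputBuffers)
    return outputBuffers[0].readFloat()
}
```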
LiteRT is designed for cross-platform deployment on a wide range of hardware.
| Platform | CPU Support | GPU Support | NPU Support |
|---|---|---|---|
| Android | ✅ | ✅ OpenCL, WebGPU* | Google Tensor*, Qualcomm ✅, MediaTek ✅, S.LSI* |
| iOS | ✅ | Metal* | ANE* |
| Linux | ✅ | WebGPU* | N/A |
| macOS | ✅ | Metal* | ANE* |
| Windows | ✅ | WebGPU* | Intel* |
| Web | Coming soon | Coming soon | Coming soon |
| Embedded | Broadcom*, Raspberry Pi* | | |
*Coming soon
For a comprehensive guide to setting up your application with LiteRT Next, see the Get Started guide.
You can build LiteRT from source:
Run build_with_docker.sh under docker_build/. The script automatically creates a Linux Docker image, which allows you to build artifacts for Linux and Android (through cross-compilation). See BUILD_INSTRUCTIONS.md for more information on how to build the runtime libraries with the Docker container.
For more information about using the Docker interactive shell or building different targets, please refer to docker_build/README.md.
Every developer's path is different. Here are a few common journeys to help you get started based on your goals:
Convert your model to the .tflite format, and use AI Edge Quantizer to optimize it for performance under resource constraints. From there, you can deploy it using the standard LiteRT runtime, as sketched below.
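As a minimal illustration of that last step, the following Kotlin sketch loads a quantized .tflite file with the classic Interpreter API and runs a single inference. The file name, tensor shapes, and the org.tensorflow.lite package coordinates are assumptions made for this example; adapt them to your own model and to the LiteRT artifacts your app depends on.

```kotlin
// Minimal sketch: running a quantized .tflite model with the classic
// Interpreter API. The file name and tensor shapes are placeholders.
import org.tensorflow.lite.Interpreter
import java.io.File

fun main() {
    // Load the model produced by the converter and AI Edge Quantizer.
    val interpreter = Interpreter(File("model_quant.tflite"))

    // Hypothetical signature: one [1, 4] float input, one [1, 2] float output.
    val input = arrayOf(floatArrayOf(0.1f, 0.2f, 0.3f, 0.4f))
    val output = arrayOf(FloatArray(2))

    interpreter.run(input, output)
    println("Scores: ${output[0].joinToString()}")

    interpreter.close()
}
```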
Where next:
- Beta by Dec 2025
- General Availability by Google I/O, May 2026
Our commitment is to make LiteRT the best runtime for any on-device ML deployment. The roadmap above is based on the following product strategy:
Going forward, LiteRT will establish a release cadence of a minor release every 4-6 weeks.
This roadmap is subject to change. We encourage community feedback; please open an issue to discuss proposals or ideas!
We welcome contributions to LiteRT. Please see the CONTRIBUTING.md file for more information on how to contribute.
We encourage you to reach out if you need help.
LiteRT is part of a larger ecosystem of tools for on-device machine learning. Check out these other projects from Google:
This project is dedicated to fostering an open and welcoming environment. Please read our Code of Conduct to understand the standards of behavior we expect from all participants in our community.
LiteRT is licensed under the Apache-2.0 License.