12.2 ExecuTorch

Getting Started with ExecuTorch

Created Date: 2025-07-20

This tutorial describes the steps needed to take a PyTorch model and run it with ExecuTorch. To use the framework, you will typically need to take the following steps:

  • Install the ExecuTorch python package and runtime libraries.

  • Export the PyTorch model for the target hardware configuration.

  • Run the model using the ExecuTorch runtime APIs on your development platform.

  • Deploy the model to the target platform using the ExecuTorch runtime.

12.2.1 Installation

To use ExecuTorch, you will need to install both the Python package and the appropriate platform-specific runtime libraries. Pip is the recommended way to install the Python package.

This package includes the dependencies needed to export a PyTorch model, as well as Python runtime bindings for model testing and evaluation. Consider installing ExecuTorch within a virtual environment, such as one provided by conda or venv.

pip install executorch

12.2.2 Preparing the Model

Exporting is the process of taking a PyTorch model and converting it to the .pte file format used by the ExecuTorch runtime. This is done using Python APIs. PTE files for common models, such as Llama 3.2, can be found on HuggingFace under ExecuTorch Community. These models have been exported and lowered for ExecuTorch, and can be directly deployed without needing to go through the lowering process.

12.2.2.1 Exporting MobileNet V2

A complete example of exporting, lowering, and verifying MobileNet V2 is available as a Colab notebook.

ExecuTorch provides backends that accelerate inference on a wide variety of hardware. The most commonly used backends are XNNPACK (for Arm and x86 CPUs), Core ML (for iOS), Vulkan (for Android GPUs), and Qualcomm (for Qualcomm-powered Android phones).

12.2.2.2 Testing the Model

After successfully generating a .pte file, it is common to use the Python runtime APIs to validate the model on the development platform. This can be used to evaluate model accuracy before running on-device.

For the MobileNet V2 model from torchvision used in this example, image inputs are expected as a normalized, float32 tensor with dimensions of (batch, channels, height, width). See torchvision.models.mobilenet_v2 for more information on the input and output tensor formats for this model.

12.2.3 Running on Device

ExecuTorch provides runtime APIs in Java, Objective-C, and C++.

12.2.3.1 Android

ExecuTorch provides Java bindings for Android usage, which can be consumed from both Java and Kotlin. To add the library to your app, add the following dependency to your Gradle build file.
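A Gradle dependency along these lines pulls in the Android bindings; the version shown is illustrative, so check for the latest ExecuTorch Android release before using it.

```kotlin
dependencies {
    // Version is illustrative — check for the latest ExecuTorch
    // Android release on Maven Central.
    implementation("org.pytorch:executorch-android:0.6.0")
}
```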

12.2.3.2 iOS

ExecuTorch supports both iOS and macOS via C++, as well as hardware backends for Core ML, MPS, and CPU. The iOS runtime library is provided as a collection of .xcframework targets and is made available as a Swift PM package.

12.2.3.3 C++

ExecuTorch provides C++ APIs, which can be used to target embedded or mobile devices. The C++ APIs provide a greater level of control compared to other language bindings, allowing for advanced memory management, data loading, and platform integration.
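At the simpler end of the C++ API surface, the Module extension wraps program and method loading behind one class. The sketch below assumes the ExecuTorch C++ headers and libraries are linked into the build; the file name and input shape are illustrative.

```cpp
#include <executorch/extension/module/module.h>
#include <executorch/extension/tensor/tensor.h>

using executorch::extension::Module;
using executorch::extension::make_tensor_ptr;

int main() {
  // Load the exported program; Module lazily loads the program
  // and its methods on first use.
  Module module("model.pte");

  // Prepare a float input tensor (shape is illustrative).
  float input[1 * 3 * 224 * 224] = {};
  auto tensor = make_tensor_ptr({1, 3, 224, 224}, input);

  // Run inference and read back the first output tensor.
  const auto result = module.forward(tensor);
  if (result.ok()) {
    const auto& output = result->at(0).toTensor();
    // ... consume output.const_data_ptr<float>() ...
  }
  return 0;
}
```

Lower-level C++ APIs expose memory planning and data loading directly, which is what the advanced memory management mentioned above refers to.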

12.2.4 Next Steps