TensorFlow SavedModel to TFLite conversion that supports the i.MX8MP NPU

Hi,

I would like to convert a TensorFlow SavedModel to a TFLite model that runs on the i.MX8MP NPU.

I followed the steps below without success:

python models/research/object_detection/exporter_main_v2.py \
  --input_type image_tensor \
  --pipeline_config_path training_dir/pipeline.config \
  --trained_checkpoint_dir training_dir/checkpoint \
  --output_directory exported-model
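For what it's worth, the Object Detection API also provides a TFLite-specific exporter, `export_tflite_graph_tf2.py`, which, if I understand correctly, produces a SavedModel with TFLite-friendly detection postprocessing (unlike `exporter_main_v2.py`). I may try something like:

```
python models/research/object_detection/export_tflite_graph_tf2.py \
  --pipeline_config_path training_dir/pipeline.config \
  --trained_checkpoint_dir training_dir/checkpoint \
  --output_directory exported-model-tflite
```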

I also made sure it is a fixed-shape model:

model {
  ssd {
    image_resizer {
      fixed_shape_resizer {
        height: 320
        width: 320
      }
    }
  }
}

I also ensured TFLite-compatible ops:

ssd {
  feature_extractor {
    type: "ssd_mobilenet_v2_fpn_keras"
    use_depthwise: true
  }
  box_predictor {
    convolutional_box_predictor {
      use_depthwise: true
    }
  }
}

TFLite conversion script:

import pathlib

import tensorflow as tf

saved_model_dir = "exported-model/saved_model"

converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
converter.optimizations = [tf.lite.Optimize.DEFAULT]

# Provide representative dataset for INT8 calibration
def representative_data_gen():
    data_dir = pathlib.Path("dataset/val")
    for img_path in data_dir.glob("*.jpg"):
        img = tf.keras.preprocessing.image.load_img(img_path, target_size=(320, 320))
        img = tf.keras.preprocessing.image.img_to_array(img)
        img = img[tf.newaxis, ...] / 255.0
        yield [img.astype("float32")]

converter.representative_dataset = representative_data_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8

tflite_model = converter.convert()

with open("model_int8.tflite", "wb") as f:
    f.write(tflite_model)

Command to run inference with model_int8.tflite:

$ USE_GPU_INFERENCE=0 \
python3 label_image.py -m model_int8.tflite \
-e /usr/lib/liblitert_vx_delegate.so

Please let me know whether these steps are correct.

I also tried converting the model using the eIQ tool, but that did not work either.

Related post: Re: Tensorflow Savedmodel to tflite conversion which supports IMX8MP NPU - NXP Community

I have been facing this problem for almost a month with no solution; the steps above do not work. Is it possible to develop a TF SavedModel with TFLite-supported ops from scratch?

Current Verdin BSP version:

TDX Wayland with XWayland 6.4.0-devel-20250926072415+build.0 (kirkstone)
Verdin-iMX8MP_Reference-Multimedia-Image

Hi, I’m the Toradex AI Assistant.

I might make mistakes or occasionally misunderstand you. Our support team will step in if needed.


TensorFlow Lite Model Conversion for iMX8MP NPU

Your approach is on the right track, but there are several key considerations for NPU compatibility on Toradex modules with iMX8MP.

Model Architecture Recommendations

Based on the Toradex documentation, the SSD MobileNet V2 architecture is specifically recommended for NPU inference on iMX8MP modules(1). The documentation states that this approach “results in a model with high performance when needed for high FPS inference on embedded devices”(1).

Required Environment Variables

To enable NPU acceleration, you need to set the correct environment variables(2):

USE_HW_ACCELERATED_INFERENCE=1
USE_GPU_INFERENCE=0
ACCEPT_FSL_EULA=1

(2)

Delegate Configuration

For NPU inference on iMX8MP, you should use the VX delegate. The correct delegate path is /usr/lib/libvx_delegate.so(2). Your command should be:

USE_HW_ACCELERATED_INFERENCE=1 USE_GPU_INFERENCE=0 \
python3 label_image.py -m model_int8.tflite \
-e /usr/lib/libvx_delegate.so
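If you prefer loading the delegate from Python instead of label_image.py's `-e` flag, the sketch below shows one way to do it. The delegate path comes from the Toradex documentation; the fallback behaviour and function name are illustrative assumptions, not Toradex-verified code:

```python
import tensorflow as tf

# Path assumed from the Toradex BSP; adjust if your image differs.
DELEGATE_PATH = "/usr/lib/libvx_delegate.so"

def make_interpreter(model_path: str) -> tf.lite.Interpreter:
    """Create a TFLite interpreter, preferring the VX (NPU) delegate."""
    try:
        delegate = tf.lite.experimental.load_delegate(DELEGATE_PATH)
        return tf.lite.Interpreter(model_path=model_path,
                                   experimental_delegates=[delegate])
    except (ValueError, OSError):
        # Delegate library not found (e.g. on a development host):
        # fall back to plain CPU execution.
        return tf.lite.Interpreter(model_path=model_path)
```

On the module itself you would still export the environment variables above before running, so the delegate actually targets the NPU rather than the GPU.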

Docker Configuration

If you’re running in a container, ensure your Dockerfile includes the necessary packages(2):

RUN apt-get -y update && apt-get install -y \
    libovxlib \
    && apt-get clean && apt-get autoremove && rm -rf /var/lib/apt/lists/*

(2)

Model Quantization Requirements

Your quantization approach looks correct. The documentation confirms that “to infer this model on the NPU of modules that have it, it is necessary to quantify the model”(1).
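As a sanity check after conversion, you can confirm the model really carries uint8 input and output types, since a float fallback anywhere at the boundaries usually means the NPU cannot take the graph. A minimal self-contained sketch follows; a toy Keras model stands in for your exported SavedModel, and with your real file you would instead pass `model_path="model_int8.tflite"` to the interpreter:

```python
import numpy as np
import tensorflow as tf

# Toy stand-in model; in practice, convert your exported SavedModel.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(8, 8, 1)),
    tf.keras.layers.Conv2D(2, 3, activation="relu"),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(2),
])

def representative_data_gen():
    # Random data only for this toy example; use real samples for your model.
    for _ in range(10):
        yield [np.random.rand(1, 8, 8, 1).astype("float32")]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8
tflite_model = converter.convert()

# Inspect the converted model's boundary types.
interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()
in_dtype = interpreter.get_input_details()[0]["dtype"]
out_dtype = interpreter.get_output_details()[0]["dtype"]
print(in_dtype, out_dtype)  # both should be uint8 for a full-integer model
```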

Performance Expectations

According to Toradex testing, NPU acceleration provides approximately 12x FPS improvement compared to CPU inference(1). The documentation shows these performance metrics for the “ssd-mobilenet-v2-fpnlite-320” architecture:

  • NPU: 28.57 FPS
  • CPU: 2.34 FPS
  • GPU: 1.13 FPS(1)

Alternative Approach: Use Toradex Training Repository

Consider using the official Toradex training repository which is specifically designed for iMX8MP NPU compatibility(1). You can find it at the Training TFLite Toradex repository(1).

Troubleshooting Steps

  1. Verify NPU Support: Ensure your Verdin iMX8MP variant has NPU support by checking the feature table(1)
  2. Check Delegate Loading: If the delegate fails to load, verify that all required packages are installed(2)
  3. Model Compatibility: Ensure your model uses only NPU-supported operations
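For the model-compatibility check, recent TensorFlow versions ship `tf.lite.experimental.Analyzer`, which prints every operator in a converted model so you can spot ones the VX delegate may push back to the CPU. A minimal sketch with a toy model follows; with your model, pass `model_path="model_int8.tflite"` instead of `model_content`:

```python
import tensorflow as tf

# Build a tiny model purely for demonstration purposes.
@tf.function(input_signature=[tf.TensorSpec([1, 4], tf.float32)])
def double(x):
    return x * 2.0

converter = tf.lite.TFLiteConverter.from_concrete_functions(
    [double.get_concrete_function()])
tflite_model = converter.convert()

# Prints each subgraph's operators and tensor details to stdout.
tf.lite.experimental.Analyzer.analyze(model_content=tflite_model)
```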

Testing Your Setup

You can test NPU functionality using the official TFLite RTSP repository, which provides scripts for both CPU and NPU inference(1).

The key difference from your current approach is ensuring the correct delegate path and environment variables are used for Toradex hardware.

Best regards, Toradex AI Assistant.


Did you like my answer? :+1: :-1:

If I answered your question, please tick the “Solution” checkbox.

For follow-up questions to me, please use @ToradexAI in your post.