NPU acceleration for TFLite Python application

Hi there,
I have successfully gotten my Python application to a working stage, but I am struggling to enable NPU delegation for my tflite-runtime. I have read all the documents I know of on the Torizon pages and torizon-samples/tflite-rtsp and have not come up with anything.
How do I modify my Dockerfile, torizonPackages.json, requirements.txt, and docker-compose.yml files to enable this? The hardware I have supports CPU, GPU & NPU acceleration.

Kind regards,

Ben Johnson

Hi!

Were you able to successfully run the tflite-rtsp sample? The recommended development workflow is to fork that example and modify the Dockerfile/application layer (https://github.com/toradex/torizon-samples/tree/bookworm/tflite/tflite-rtsp/demos/object-detection) as needed to run your particular model.

Cheers,

Hi Leon,

Yes, I can run the tflite example by pushing the image to my device and then running it. For my application code I used a Torizon IDE Python template, so how do I alter my setup using the Dockerfile, requirements.txt, docker-compose, etc.?

Kind regards,

Ben

Hi Leon,
How do I build and run my default Dockerfile and Dockerfile.debug containers so that the dependencies from the recipes folder install correctly? I have attached my two Dockerfiles to this message.

Help on this issue would be much appreciated, as I have been at this for a long time.

Dockerfile (1.8 KB)
Dockerfile.debug (2.8 KB)

Kind regards,

Ben

Hi @btj2,

Please correct me if I understand this differently: you are asking how to build the tflite-rtsp demo, which uses the NPU, from torizon-samples using the VS Code extension v2.
I am not sure about the exact changes, as we have not tested this on our end, but ideally if you copy the required folder into the existing extension project structure and modify Dockerfile and Dockerfile.debug accordingly, it should build fine.

Note that the Dockerfile for tflite-rtsp uses a multi-stage build, which we need to replicate.
Dockerfile.debug (5.9 KB)
You may be required to make some more changes at the end to deploy the right Python 3 program. I have used the Python 3 console template to show the changes required.
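As a rough illustration of the multi-stage idea (a sketch only: the base image tag and copy paths are assumptions, while the recipe script names are the ones used in torizon-samples):

```Dockerfile
# Build stage: compile the TFLite / VX-delegate packages from the
# sample's recipe scripts.
FROM torizon/debian:3-bookworm AS build
COPY recipes/ /build/
WORKDIR /build
RUN ./nn-imx_1.3.0.sh && ./tim-vx.sh \
    && ./tensorflow-lite_2.9.1.sh && ./tflite-vx-delegate.sh

# Runtime stage: copy only the built libraries into the image that
# actually runs main.py, so they end up on the dynamic loader's path.
FROM torizon/debian:3-bookworm
COPY --from=build /usr/lib/libtensorflow-lite.so* /usr/lib/
COPY --from=build /usr/lib/libvx_delegate.so /usr/lib/
COPY src/ /home/torizon/src/
CMD ["python3", "/home/torizon/src/main.py"]
```

The key point is the second stage: if the built .so files are not copied out of the build stage, the final image will not contain them at runtime.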

Your project structure should look like this:

Please test and let us know if you face any issue.
Best Regards
Ritesh Kumar

Hi Ritesh,
I have tried your recommendation and the application built, but I am now getting this error:
"Exception has occurred: ImportError
libtensorflow-lite.so.2.9.1: cannot open shared object file: No such file or directory
  File "/usr/local/lib/python3.11/dist-packages/tflite_runtime/interpreter.py", line 33, in <module>
    from tflite_runtime import _pywrap_tensorflow_interpreter_wrapper as _interpreter_wrapper
  File "/home/agriai/Torizon/mvagriai/src/services.py", line 23, in <module>
    import tflite_runtime.interpreter as tflite
  File "/home/agriai/Torizon/mvagriai/src/main.py", line 26, in <module>
    import services
ImportError: libtensorflow-lite.so.2.9.1: cannot open shared object file: No such file or directory"
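One quick way to narrow this down is to ask the dynamic loader directly whether the library is visible inside the container (a hypothetical diagnostic, not part of the sample; the library name comes from the error message above):

```python
import ctypes

def lib_visible(name: str) -> bool:
    """Ask the dynamic loader whether a shared library can be found.
    Returns False when dlopen fails, which is exactly the condition
    behind a 'cannot open shared object file' ImportError."""
    try:
        ctypes.CDLL(name)
        return True
    except OSError:
        return False

# Run inside the running container:
print(lib_visible("libtensorflow-lite.so.2.9.1"))
```

If this prints False, the library either was never copied out of the build stage into the final image, or it sits outside the loader's search path (e.g. not in /usr/lib).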

I have attached the python application folder below for your review.

image.png

requirements-debug.txt (8 Bytes)

Dockerfile.debug (5.88 KB)

torizonPackages.json (48 Bytes)

docker-compose.yml (330 Bytes)

Dockerfile (3.27 KB)

Hi @btj2,

Please download the zip using the link below:
https://share.toradex.com/vygyia7mj8bd681

Please note you may be required to adapt it in various places according to your use case. At the moment we have only copied the required folder and built the image successfully.
When you deploy the image you also need to pass the right permissions for accessing devices; please check the torizon-samples repo README.
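For reference, the device access flags from the sample's docker run command can be expressed in docker-compose roughly like this (a sketch; the service name is illustrative and the rules should be matched against your own project):

```yaml
services:
  tflite-rtsp:
    environment:
      - ACCEPT_FSL_EULA=1
      - USE_HW_ACCELERATED_INFERENCE=1
      - USE_GPU_INFERENCE=0   # 0 selects the NPU on i.MX8MP
    volumes:
      - /dev:/dev
      - /tmp:/tmp
      - /run/udev/:/run/udev/
    device_cgroup_rules:
      - 'c 4:* rmw'
      - 'c 13:* rmw'
      - 'c 199:* rmw'
      - 'c 226:* rmw'
      - 'c 81:* rmw'
```

Without these cgroup rules and volume mounts, the container deployed from the IDE cannot reach the video and accelerator device nodes even if the delegate library is present.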

For the error you get, can you share the exact steps you followed? Additionally, please share main.py as well.

Best Regards
Ritesh Kumar

Hi Ritesh,
Thank you for that! So can't you deploy the tflite demo application you got working via the IDE v2, i.e. Run & Debug?

image.png

Hi Ritesh,
The steps I have taken are:

  • Downloaded & extracted your folder.
  • Changed the permissions on the build packages:

RUN chmod +x nn-imx_1.3.0.sh
RUN ./nn-imx_1.3.0.sh
RUN chmod +x tim-vx.sh
RUN ./tim-vx.sh
RUN chmod +x tensorflow-lite_2.9.1.sh
RUN ./tensorflow-lite_2.9.1.sh
RUN chmod +x tflite-vx-delegate.sh
RUN ./tflite-vx-delegate.sh

  • I then changed the service names to my current one (i.e. agriaitflite) in docker-compose.yml:

version: "3.9"
services:
  agriaitflite-debug:
    build:
      context: .
      dockerfile: Dockerfile.debug
    image: ${LOCAL_REGISTRY}:5002/agriaitflite-debug:${TAG}
    ports:
      - 6502:6502
      - 6512:6512

  agriaitflite:
    build:
      context: .
      dockerfile: Dockerfile
    image: ${DOCKER_LOGIN}/agriaitflite:${TAG}

  • It built when deploying using "Run & Debug (Torizon ARMv8)".
  • I checked my internet connection, reflashed the board, and checked all possible networking issues.
  • The error I am getting is in the error.txt file I uploaded to this post, which contains the entire command output.
  • I have also uploaded my whole directory for reference.

I hope you can help!

poc2.zip (2.82 MB)
error.txt.txt (93.0 KB)

Hi @btj2,

Thanks for sharing the error and the zip; allow me some time to check in detail and get back to you.

I suggest you first use the one I shared, where we modified main.py with the content of object-detection.py, to verify it works, and then modify it per your use case.

Coming to the error you shared, it seems like it failed to connect to the device.
Can you check on the device whether you see the deployed Docker image:

docker images

Then test starting the Docker image using the command below:

docker run -it --rm -p 8554:8554   -v /dev:/dev   -v /tmp:/tmp   -v /run/udev/:/run/udev/   --device-cgroup-rule='c 4:* rmw'   --device-cgroup-rule='c 13:* rmw'   --device-cgroup-rule='c 199:* rmw'   --device-cgroup-rule='c 226:* rmw'   --device-cgroup-rule='c 81:* rmw'   -e ACCEPT_FSL_EULA=1   -e CAPTURE_DEVICE=/dev/video2   -e USE_HW_ACCELERATED_INFERENCE=1   -e USE_GPU_INFERENCE=0   --name tflite-rtsp-pre torizon-agriaitflite-debug-1:arm64

I was able to build but not able to test; as we speak I am arranging an i.MX8MP module for myself. Give me some time to test and share my findings with you.

Best Regards
Ritesh Kumar

Hi Ritesh,

I just tried connecting over SSH and I received this error:

"The authenticity of host '192.168.1.26 (192.168.1.26)' can't be established.
ED25519 key fingerprint is SHA256:2WwyyPT70cBT/JdBb027FuE/+laQo+/wk2s8R2WxPCU.
This key is not known by any other names
Are you sure you want to continue connecting (yes/no/[fingerprint])? no
Host key verification failed."

Kind regards,

Ben

Hi Ritesh,

Have you had any success over the last six days? I can get the GPU delegate to work, but not the NPU or RTSP streaming.

Kind regards,

Ben

Hi @btj2,

Yes, I can confirm similar behaviour. For me, if I build outside VS Code I can see the vx-delegate, but not with VS Code. I have yet to find the reason for this. I am not sure if we hit a similar issue to this post: Tflite RTSP demo for torizoncore not working - #6 by leon.tx

I will check internally with the concerned person and get back to you.

Best Regards
Ritesh Kumar

Hi Ritesh,
That's great, because I have been struggling for a long time. I have gotten GPU delegation ("tflite.load_delegate('/usr/lib/libvx_delegate.so')") to work, but not the NPU, which is critical for my application, and I am under time pressure. So anything would help ASAP.

Kind regards,

Ben

Hi @btj2,

Quick update: can you try copying mobilenet_v1_1.0_224_quant.tflite from /usr/bin/tensorflow-lite-2.9.1/examples to the same directory as main.py, and updating main.py as below to load the new tflite model?

# Create the tensorflow-lite interpreter
        self.interpreter = tf.Interpreter(model_path="mobilenet_v1_1.0_224_quant.tflite",
                                          experimental_delegates=delegates)
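To make the CPU/GPU/NPU choice explicit, the delegates list passed in there can be built with a small helper (a sketch; the delegate path is the one used in this thread, everything else is illustrative):

```python
def build_delegates(tflite, use_hw_inference: bool,
                    delegate_path: str = "/usr/lib/libvx_delegate.so"):
    """Return the experimental_delegates list for tflite.Interpreter.

    On i.MX8MP the same VX delegate library drives both GPU and NPU;
    the sample's USE_GPU_INFERENCE env var (0 = NPU) steers the driver.
    Falls back to an empty list (CPU/XNNPACK) if the .so cannot load.
    """
    if not use_hw_inference:
        return []
    try:
        return [tflite.load_delegate(delegate_path)]
    except (ValueError, OSError):
        # Delegate library missing inside the container
        return []
```

In main.py this would be used as, e.g., `delegates = build_delegates(tflite, os.environ.get("USE_HW_ACCELERATED_INFERENCE") == "1")` before constructing the Interpreter.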

Please test and let us know if this works. Also please share any error logs you get so we can check further.

Best Regards
Ritesh Kumar

Hi @btj2 ,

Did you get time to check? Can you share whether you are able to run with the NPU?
Please note, if you are using a USB camera for testing, please update Torizon OS to 6.4.0. We tested by building our own Docker image using the same Dockerfile we shared with you, and the NPU is working well.

docker run -it --entrypoint=bash --rm -p 8554:8554   -v /dev:/dev   -v /tmp:/tmp   -v /run/udev/:/run/udev/   --device-cgroup-rule='c 4:* rmw'   --device-cgroup-rule='c 13:* rmw'   --device-cgroup-rule='c 199:* rmw'   --device-cgroup-rule='c 226:* rmw'   --device-cgroup-rule='c 81:* rmw'   -e ACCEPT_FSL_EULA=1   -e CAPTURE_DEVICE=/dev/video2   -e USE_HW_ACCELERATED_INFERENCE=1   -e USE_GPU_INFERENCE=0   --name tflite-rtsp rkt0589/tflite-rtsp:tc6
root@ddcb0819ffac:/home/torizon# ls
src
root@ddcb0819ffac:/home/torizon# cd src/
root@ddcb0819ffac:/home/torizon/src# python3 main.py 
[ WARN:0@4.096] global ./modules/videoio/src/cap_gstreamer.cpp (1405) open OpenCV | GStreamer warning: Cannot query video position: status=0, value=-1, duration=-1
Vx delegate: allowed_cache_mode set to 0.
Vx delegate: allowed_builtin_code set to 0.
Vx delegate: error_during_init set to 0.
Vx delegate: error_during_prepare set to 0.
Vx delegate: error_during_invoke set to 0.
WARNING: Fallback unsupported op 32 to TfLite
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
W [HandleLayoutInfer:272]Op 162: default layout inference pass.
W [HandleLayoutInfer:272]Op 162: default layout inference pass.
W [HandleLayoutInfer:272]Op 162: default layout inference pass.
W [HandleLayoutInfer:272]Op 162: default layout inference pass.
W [HandleLayoutInfer:272]Op 162: default layout inference pass.
W [HandleLayoutInfer:272]Op 162: default layout inference pass.
W [HandleLayoutInfer:272]Op 162: default layout inference pass.
W [HandleLayoutInfer:272]Op 162: default layout inference pass.
W [HandleLayoutInfer:272]Op 162: default layout inference pass.
W [HandleLayoutInfer:272]Op 162: default layout inference pass.
W [HandleLayoutInfer:272]Op 162: default layout inference pass.
W [HandleLayoutInfer:272]Op 162: default layout inference pass.
W [HandleLayoutInfer:272]Op 162: default layout inference pass.

Best Regards
Ritesh Kumar

Hi Ritesh,

Sorry for the late reply; I have been away on a work trip. Yes, I finally got it working by using a quantised (uint8) tflite model, after making a tonne of changes to the Dockerfile scripts. Thank you!
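For anyone hitting the same wall: whether a model's inputs are actually quantised can be checked from the interpreter's tensor details (a sketch; `input_details` is the list returned by `Interpreter.get_input_details()`):

```python
import numpy as np

def is_fully_quantized(input_details) -> bool:
    """True when every input tensor is uint8/int8. Float inputs mean
    the VX delegate will fall back (partly or wholly) to the CPU
    instead of running on the NPU."""
    return all(d["dtype"] in (np.uint8, np.int8) for d in input_details)
```

For example, `is_fully_quantized(interpreter.get_input_details())` on the mobilenet_v1_1.0_224_quant.tflite model should return True.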

May I ask another question regarding getting the AR0521 CSI camera working? I have followed the tutorial linked below from the Toradex developer resources page: https://developer.toradex.com/torizon/application-development/use-cases/multimedia/first-steps-with-csi-camera-set-5mp-ar0521-color-torizon

I cannot get Torizon OS to build and I am getting multiple errors! Is there any advice you can give me to get the camera working? Also, I eventually want to get the MIPI CSI-2 driver working for the VC MIPI IMX327C sensor. Here is the repo for it, but it appears to support only the Verdin iMX8M Mini and Dahlia carrier board. I am using the Verdin iMX8M Plus and the Mallow carrier board, and I am planning on handling the mismatch in MIPI lane wiring with a daughter PCB.

It would be great to get this working and I would be happy to use the Toradex contracting services to do it if at all possible?

Hi @btj2,

I cannot get Torizon OS to build and I am getting multiple errors! Is there any advice you can give me to get the camera working? Also, I eventually want to get the MIPI CSI-2 driver working for the VC MIPI IMX327C sensor. Here is the repo for it, but it appears to support only the Verdin iMX8M Mini and Dahlia carrier board. I am using the Verdin iMX8M Plus and the Mallow carrier board, and I am planning on handling the mismatch in MIPI lane wiring with a daughter PCB.

Can you please create a new post for this question and we will check it accordingly. Also, I don't see any link, so please do check and share it.
For now I am marking this post as solved.

Best Regards
Ritesh Kumar