ML-TN-001 - AI at the edge: comparison of different embedded platforms - Part 2


Revision as of 09:34, 1 October 2020

Info Box
Applies to: Machine Learning
Applies to: MITO 8M
Status: Work in progress


History

Version | Date           | Notes
1.0.0   | September 2020 | First public release

Introduction

This Technical Note (TN for short) belongs to the series introduced here. Specifically, it illustrates the execution of an inference application (fruit classifier) that makes use of the model described in this section when executed on the Mito8M SoM, a system-on-module based on the NXP i.MX8M SoC.

Environment

The kernel and the root file system of the tested platform were built with the L4.14.98_2.0.0 release of the Yocto Board Support Package for the i.MX 8 family of devices. They were built with support for eIQ: "a collection of software and development tools for NXP microprocessors and microcontrollers to do inference of neural network models on embedded systems".

Model deployment

To run the model on the target, a new C++ application was written. After being debugged on a host PC, the application was migrated to the edge device, where it was built natively: the eIQ root file system provides a native C++ compiler as well.

The application uses OpenCV 4.0.1 to pre-process the input image and TensorFlow Lite (TFL) 1.12 as the inference engine. The model, originally created and trained with the Keras API of TensorFlow (TF) 1.15, was therefore converted into the TFL format.
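For reference, the pre-processing pipeline visible in the execution log below (center crop, resize to 224x224, RGB pixel values scaled to the 0-1 range) can be sketched in plain NumPy. This is an illustrative stand-in for the OpenCV calls used by the actual C++ application, not its code; the nearest-neighbor resize is a simplification of `cv::resize`:

```python
import numpy as np

def preprocess(image, size=224):
    """Center-crop a HxWx3 uint8 image to a square, resize it with
    nearest-neighbor sampling, and scale pixel values to the 0-1 range."""
    h, w, _ = image.shape
    side = min(h, w)
    top, left = (h - side) // 2, (w - side) // 2
    cropped = image[top:top + side, left:left + side]
    # Nearest-neighbor resize (simplified stand-in for cv::resize).
    idx = np.arange(size) * side // size
    resized = cropped[idx][:, idx]
    # Map 0-255 pixel values to the 0-1 range expected by the model.
    return resized.astype(np.float32) / 255.0

# A dummy 600x600x3 image, matching the input size reported in the log.
img = np.random.default_rng(0).integers(0, 256, (600, 600, 3)).astype(np.uint8)
tensor = preprocess(img)
print(tensor.shape)   # (224, 224, 3)
```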

Then, the same model was recreated and retrained with the Keras API of TF 1.12. This allowed converting it into TFL with post-training quantization of the weights, without compatibility issues with the target inference engine.

After that, it was also recreated and retrained using the quantization-aware training support of TF 1.15. In this way, a fully quantized model was obtained after conversion.

In the end, three converted models were obtained: a regular 32-bit floating-point model, an 8-bit half-quantized model (only the weights quantized, not the activations), and a fully quantized model.
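To illustrate what weight quantization amounts to, the following sketch implements a per-tensor affine 8-bit scheme (a scale and a zero-point), which is the general kind of mapping the TFL converter applies; the function names are hypothetical and the real converter handles all of this internally:

```python
import numpy as np

def quantize_weights(w):
    """Affine 8-bit quantization: map float weights to uint8 using a
    per-tensor scale and zero-point. The representable range must
    include zero so that zero-valued weights stay exact."""
    lo, hi = min(float(w.min()), 0.0), max(float(w.max()), 0.0)
    scale = (hi - lo) / 255.0 or 1.0
    zero_point = int(round(-lo / scale))
    q = np.clip(np.round(w / scale) + zero_point, 0, 255).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float weights from their 8-bit codes."""
    return (q.astype(np.float32) - zero_point) * scale

w = np.random.default_rng(1).normal(0.0, 0.1, size=1000).astype(np.float32)
q, s, zp = quantize_weights(w)
# Round-trip error is bounded by one quantization step.
err = float(np.abs(dequantize(q, s, zp) - w).max())
print(f"scale={s:.6f} zero_point={zp} max_error={err:.6f}")
```

The fully quantized model goes further and applies the same idea to activations as well, which is why it needs quantization-aware training (or calibration data) to fix the activation ranges.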

The following images show the graphs of the models before and after conversion.

[ML - Keras1.15_fruitsmodel.png] [ML - TFL_float_fruitsmodel.png]

[ML - Keras1.12_fruitsmodel.png] [ML - TFL_halfquant_fruitsmodel.png]

[ML - TF1.15QAT_fruitsmodel.png] [ML - TFL_QAT_fruitsmodel.png]

Running the application

In order to obtain reproducible and reliable results, some measures were taken:

  • The inference was repeated several times and the average execution time was computed.
  • All the files required to run the test (the executable, the image files, etc.) are stored on a tmpfs RAM disk, so that the file system/storage medium overhead is negligible.
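The first measure (averaging over repeated runs, after a warm-up pass that absorbs one-off initialization costs, as visible in the "Warmup time" line of the log) can be sketched as follows; `fake_inference` is a hypothetical stand-in for the actual interpreter invocation:

```python
import time

def average_inference_time(run_inference, warmup=1, repeats=3):
    """Time an inference callable: discard warm-up runs, then average
    the wall-clock time of the measured repetitions (milliseconds)."""
    for _ in range(warmup):
        run_inference()                      # first run pays one-off costs
    samples = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        run_inference()
        samples.append((time.perf_counter() - t0) * 1000.0)
    return sum(samples) / len(samples), samples

def fake_inference():
    time.sleep(0.01)                         # stand-in for the real Invoke()

avg_ms, samples = average_inference_time(fake_inference)
print(f"Average inference time: {avg_ms:.3f} ms")
```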

The following blocks show the execution of the classifier on the embedded platform.

With the floating point model:

root@imx8qmmek:~/devel/image_classifier_eIQ# ./image_classifier_cv 2 my_converted_model.tflite labels.txt testdata/red-apple1.jpg 
Number of threads: undefined
Warmup time: 233.403 ms
Original image size: 600x600x3
Cropped image size: 600x600x3
Resized image size: 224x224x3
Input tensor index: 1
Input tensor name: conv2d_8_input
Selected order of channels: RGB
Selected pixel values range: 0-1
Filling time: 1.06354 ms
Inference time 1: 219.723 ms
Inference time 2: 220.512 ms
Inference time 3: 221.897 ms
Average inference time: 220.711 ms
Total prediction time: 221.774 ms
Output tensor index: 0
Output tensor name: Identity
Top results:
 1      Red Apple
 1.13485e-10    Orange
 5.58774e-18    Avocado
 7.49395e-20    Hand
 1.40372e-22    Banana
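The "Top results" listing is the output tensor's per-class scores paired with their labels and sorted in descending order. A minimal sketch of that step, using scores rounded from the log above purely for illustration:

```python
def top_results(scores, labels, k=5):
    """Pair each class score with its label and sort in descending
    score order, reproducing the classifier's 'Top results' listing."""
    ranked = sorted(zip(scores, labels), reverse=True)
    return ranked[:k]

# Illustrative values rounded from the log above.
labels = ["Red Apple", "Orange", "Avocado", "Hand", "Banana"]
scores = [1.0, 1.13e-10, 5.59e-18, 7.49e-20, 1.40e-22]
for score, label in top_results(scores, labels):
    print(f"{score:<12g} {label}")
```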

Results