Specifically, it illustrates the execution of an inference application (fruit classifier) that makes use of the model described in [[ML-TN-001_-_AI_at_the_edge:_comparison_of_different_embedded_platforms_-_Part_1#Reference_application_.231:_fruit_classifier|this section]] when executed on the [[:Category:Mito8M|Mito8M SoM]], a system-on-module based on the NXP [https://www.nxp.com/products/processors-and-microcontrollers/arm-processors/i-mx-applications-processors/i-mx-8-processors/i-mx-8m-family-armcortex-a53-cortex-m4-audio-voice-video:i.MX8M i.MX8M SoC].
== Test bed ==
The kernel and the root file system of the tested platform were built with the L4.14.98_2.0.0 release of the Yocto Board Support Package for the i.MX 8 family of devices. They were built with support for [https://www.nxp.com/design/software/development-software/eiq-ml-development-environment:EIQ eIQ]: "a collection of software and development tools for NXP microprocessors and microcontrollers to do inference of neural network models on embedded systems".
{| class="wikitable"
|-
!BSP release
|L4.14.98_2.0.0
|-
!Inference engine
|TensorFlow Lite 1.12
|}
== Model deployment ==
To run the model on the target, a new C++ application was written. After debugging this application on a host PC, it was migrated to the edge device, where it was built natively. The root file system for eIQ, in fact, provides the native C++ compiler as well.
The application uses OpenCV 4.0.1 to pre-process the input image and TensorFlow Lite (TFL) 1.12 as the inference engine. The model, originally created and trained with the Keras API of TensorFlow (TF) 1.15, was therefore converted into the TFL format.
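The listing below is a minimal sketch of such an inference path, not the actual application's code: the model file name, the input resolution and the [0, 1] normalization are assumptions for illustration. Note also that with TFL 1.12 the headers may be located under <code>tensorflow/contrib/lite/</code> rather than <code>tensorflow/lite/</code>.
<pre>
// Minimal sketch of the inference path described above (not the actual application).
// With TFL 1.12 the headers may live under tensorflow/contrib/lite/ instead of tensorflow/lite/.
#include <cstdio>
#include <cstring>
#include <memory>
#include <opencv2/opencv.hpp>
#include "tensorflow/lite/interpreter.h"
#include "tensorflow/lite/kernels/register.h"
#include "tensorflow/lite/model.h"

int main(int argc, char* argv[]) {
    // Assumed file names and input size: adjust them to the model actually deployed.
    const char* model_path = argc > 1 ? argv[1] : "fruit_classifier.tflite";
    const char* image_path = argc > 2 ? argv[2] : "fruit.jpg";
    const int input_size = 224;

    // Load the converted TFL model and build the interpreter.
    auto model = tflite::FlatBufferModel::BuildFromFile(model_path);
    if (!model) { std::fprintf(stderr, "Cannot load %s\n", model_path); return 1; }
    tflite::ops::builtin::BuiltinOpResolver resolver;
    std::unique_ptr<tflite::Interpreter> interpreter;
    tflite::InterpreterBuilder(*model, resolver)(&interpreter);
    interpreter->AllocateTensors();

    // Pre-process the input image with OpenCV: resize, BGR->RGB, normalize to [0, 1].
    cv::Mat img = cv::imread(image_path);
    cv::resize(img, img, cv::Size(input_size, input_size));
    cv::cvtColor(img, img, cv::COLOR_BGR2RGB);
    img.convertTo(img, CV_32FC3, 1.0 / 255.0);

    // Copy the image into the (float) input tensor and run the inference.
    std::memcpy(interpreter->typed_input_tensor<float>(0),
                img.ptr<float>(0), img.total() * img.elemSize());
    interpreter->Invoke();

    // Print the raw class scores.
    const float* scores = interpreter->typed_output_tensor<float>(0);
    const TfLiteTensor* out = interpreter->tensor(interpreter->outputs()[0]);
    for (int i = 0; i < out->dims->data[out->dims->size - 1]; ++i)
        std::printf("class %d: %f\n", i, scores[i]);
    return 0;
}
</pre>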
The model was then also recreated and retrained with the quantization-aware training of TF 1.15, so that a fully quantized model was obtained after conversion.
In the end, three converted models were therefore obtained: a regular 32-bit floating-point one, an 8-bit half-quantized one (only the weights are quantized, not the activations), and a fully-quantized one.
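Since the fully-quantized model exposes <code>uint8</code> input/output tensors while the other two keep <code>float32</code> tensors, the application has to fill the input tensor accordingly. The helper below is a hedged sketch of how this can be done by inspecting the tensor type at run time; the function name and the surrounding assumptions (image already resized to the model resolution and converted to RGB) are illustrative only.
<pre>
#include <cstdint>
#include <cstring>
#include <opencv2/opencv.hpp>
#include "tensorflow/lite/interpreter.h"

// Illustrative helper (not taken from the actual application): copy an OpenCV image,
// already resized to the model input resolution and converted to RGB, into the input
// tensor of either a float or a fully-quantized (uint8) model.
void fill_input_tensor(tflite::Interpreter& interpreter, const cv::Mat& rgb)
{
    TfLiteTensor* in = interpreter.tensor(interpreter.inputs()[0]);
    if (in->type == kTfLiteFloat32) {
        // Floating-point and half-quantized models: normalized float32 pixels.
        cv::Mat f;
        rgb.convertTo(f, CV_32FC3, 1.0 / 255.0);
        std::memcpy(interpreter.typed_input_tensor<float>(0),
                    f.ptr<float>(0), f.total() * f.elemSize());
    } else if (in->type == kTfLiteUInt8) {
        // Fully-quantized model: raw 8-bit pixels; in->params.scale and
        // in->params.zero_point define how they map back to real values.
        std::memcpy(interpreter.typed_input_tensor<uint8_t>(0),
                    rgb.ptr<uint8_t>(0), rgb.total() * rgb.elemSize());
    }
}
</pre>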
The following images show the graphs of the models before conversion (click to enlarge):
The following blocks show the execution of the classifier on the embedded platform.
With the floating-point model:
<pre class="board-terminal">
</pre>
With the half-quantized model:
<pre class="board-terminal">
</pre>
With the fully-quantized model:
<pre class="board-terminal">
</pre>
== Results ==
As shown above, the total prediction times for a single image are:
* ~220 ms with the floating-point model;
* ~330 ms with the half-quantized model;
* ~200 ms with the fully-quantized model.
The total prediction time takes into account the time needed to fill the input tensor with the image and the average inference time over three predictions.
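As an example of how such a measurement can be implemented, the following sketch times the tensor fill and averages the inference over three runs with <code>std::chrono</code>. It reuses the hypothetical <code>fill_input_tensor</code> helper of the previous sketch and is not the actual benchmarking code.
<pre>
#include <chrono>
#include <cstdio>
#include <opencv2/opencv.hpp>
#include "tensorflow/lite/interpreter.h"

// Hypothetical helper defined in the previous sketch.
void fill_input_tensor(tflite::Interpreter& interpreter, const cv::Mat& rgb);

// Sketch of the timing scheme described above: total prediction time =
// input-tensor fill time + inference time averaged over three runs.
double predict_and_time(tflite::Interpreter& interpreter, const cv::Mat& rgb)
{
    using clock = std::chrono::steady_clock;
    const int runs = 3;

    auto t0 = clock::now();
    fill_input_tensor(interpreter, rgb);   // time needed to fill the input tensor
    auto t1 = clock::now();

    for (int i = 0; i < runs; ++i)
        interpreter.Invoke();              // inference, averaged over 'runs' iterations
    auto t2 = clock::now();

    double fill_ms  = std::chrono::duration<double, std::milli>(t1 - t0).count();
    double infer_ms = std::chrono::duration<double, std::milli>(t2 - t1).count() / runs;
    std::printf("fill: %.1f ms, average inference: %.1f ms\n", fill_ms, infer_ms);
    return fill_ms + infer_ms;
}
</pre>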
The same tests were also repeated without using a RAM disk, storing the files on an ext4 file system on a microSD card instead. No significant variations in the prediction times were observed.