Specifically, it illustrates the execution of an inference application (fruit classifier) that makes use of the model described in [[ML-TN-001_-_AI_at_the_edge:_comparison_of_different_embedded_platforms_-_Part_1#Reference_application_.231:_fruit_classifier|this section]] when executed on the [[:Category:Mito8M|Mito8M SoM]], a system-on-module based on the NXP [https://www.nxp.com/products/processors-and-microcontrollers/arm-processors/i-mx-applications-processors/i-mx-8-processors/i-mx-8m-family-armcortex-a53-cortex-m4-audio-voice-video:i.MX8M i.MX8M SoC].
== Test bed ==
The kernel and the root file system of the tested platform were built with the L4.14.98_2.0.0 release of the Yocto Board Support Package for the i.MX 8 family of devices. They were built with support for [https://www.nxp.com/design/software/development-software/eiq-ml-development-environment:EIQ eIQ]: "a collection of software and development tools for NXP microprocessors and microcontrollers to do inference of neural network models on embedded systems".
{| class="wikitable"
|-
!BSP release
|L4.14.98_2.0.0
|-
!Inference engine
|TensorFlow Lite 1.12
|}
== Model deployment ==
To run the model on the target, a new C++ application was written. After debugging this application on a host PC, it was migrated to the edge device, where it was built natively. The root file system for eIQ, in fact, provides the native C++ compiler as well.
The application uses OpenCV 4.0.1 to pre-process the input image and TensorFlow Lite (TFL) 1.12 as the inference engine. The model, originally created and trained with the Keras API of TensorFlow (TF) 1.15, was therefore converted into the TFL format.
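The listing below is a minimal sketch of such an inference path, not the actual application's code: the model file name, the input resolution and the [0, 1] normalization are assumptions for illustration. Note also that with TFL 1.12 the headers may be located under <code>tensorflow/contrib/lite/</code> rather than <code>tensorflow/lite/</code>.
<pre>
// Minimal sketch of the inference path described above (not the actual application).
// With TFL 1.12 the headers may live under tensorflow/contrib/lite/ instead of tensorflow/lite/.
#include <cstdio>
#include <cstring>
#include <memory>
#include <opencv2/opencv.hpp>
#include "tensorflow/lite/interpreter.h"
#include "tensorflow/lite/kernels/register.h"
#include "tensorflow/lite/model.h"

int main(int argc, char* argv[]) {
    // Assumed file names and input size: adjust them to the model actually deployed.
    const char* model_path = argc > 1 ? argv[1] : "fruit_classifier.tflite";
    const char* image_path = argc > 2 ? argv[2] : "fruit.jpg";
    const int input_size = 224;

    // Load the converted TFL model and build the interpreter.
    auto model = tflite::FlatBufferModel::BuildFromFile(model_path);
    if (!model) { std::fprintf(stderr, "Cannot load %s\n", model_path); return 1; }
    tflite::ops::builtin::BuiltinOpResolver resolver;
    std::unique_ptr<tflite::Interpreter> interpreter;
    tflite::InterpreterBuilder(*model, resolver)(&interpreter);
    interpreter->AllocateTensors();

    // Pre-process the input image with OpenCV: resize, BGR->RGB, normalize to [0, 1].
    cv::Mat img = cv::imread(image_path);
    cv::resize(img, img, cv::Size(input_size, input_size));
    cv::cvtColor(img, img, cv::COLOR_BGR2RGB);
    img.convertTo(img, CV_32FC3, 1.0 / 255.0);

    // Copy the image into the (float) input tensor and run the inference.
    std::memcpy(interpreter->typed_input_tensor<float>(0),
                img.ptr<float>(0), img.total() * img.elemSize());
    interpreter->Invoke();

    // Print the raw class scores.
    const float* scores = interpreter->typed_output_tensor<float>(0);
    const TfLiteTensor* out = interpreter->tensor(interpreter->outputs()[0]);
    for (int i = 0; i < out->dims->data[out->dims->size - 1]; ++i)
        std::printf("class %d: %f\n", i, scores[i]);
    return 0;
}
</pre>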
The model was then also recreated and retrained with the quantization-aware training of TF 1.15, so that a fully quantized model was obtained after conversion.
In the end, three converted models were therefore obtained: a regular 32-bit floating-point one, an 8-bit half-quantized one (only the weights are quantized, not the activations), and a fully-quantized one.
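Since the fully-quantized model exposes <code>uint8</code> input/output tensors while the other two keep <code>float32</code> tensors, the application has to fill the input tensor accordingly. The helper below is a hedged sketch of how this can be done by inspecting the tensor type at run time; the function name and the surrounding assumptions (image already resized to the model resolution and converted to RGB) are illustrative only.
<pre>
#include <cstdint>
#include <cstring>
#include <opencv2/opencv.hpp>
#include "tensorflow/lite/interpreter.h"

// Illustrative helper (not taken from the actual application): copy an OpenCV image,
// already resized to the model input resolution and converted to RGB, into the input
// tensor of either a float or a fully-quantized (uint8) model.
void fill_input_tensor(tflite::Interpreter& interpreter, const cv::Mat& rgb)
{
    TfLiteTensor* in = interpreter.tensor(interpreter.inputs()[0]);
    if (in->type == kTfLiteFloat32) {
        // Floating-point and half-quantized models: normalized float32 pixels.
        cv::Mat f;
        rgb.convertTo(f, CV_32FC3, 1.0 / 255.0);
        std::memcpy(interpreter.typed_input_tensor<float>(0),
                    f.ptr<float>(0), f.total() * f.elemSize());
    } else if (in->type == kTfLiteUInt8) {
        // Fully-quantized model: raw 8-bit pixels; in->params.scale and
        // in->params.zero_point define how they map back to real values.
        std::memcpy(interpreter.typed_input_tensor<uint8_t>(0),
                    rgb.ptr<uint8_t>(0), rgb.total() * rgb.elemSize());
    }
}
</pre>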
The following images show the graphs of the models before conversion (click to enlarge):
The following blocks show the execution of the classifier on the embedded platform.
With the floating-point model:
<pre class="board-terminal">
</pre>
With the half-quantized model:
<pre class="board-terminal">
</pre>
With the fully-quantized model:
<pre class="board-terminal">
</pre>
== Results ==
As shown above, the total prediction times for a single image are:
* ~220 ms with the floating-point model;
* ~330 ms with the half-quantized model;
* ~200 ms with the fully-quantized model.
The total prediction time takes into account the time needed to fill the input tensor with the image and the average inference time over three predictions.
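As an example of how such a measurement can be implemented, the following sketch times the tensor fill and averages the inference over three runs with <code>std::chrono</code>. It reuses the hypothetical <code>fill_input_tensor</code> helper of the previous sketch and is not the actual benchmarking code.
<pre>
#include <chrono>
#include <cstdio>
#include <opencv2/opencv.hpp>
#include "tensorflow/lite/interpreter.h"

// Hypothetical helper defined in the previous sketch.
void fill_input_tensor(tflite::Interpreter& interpreter, const cv::Mat& rgb);

// Sketch of the timing scheme described above: total prediction time =
// input-tensor fill time + inference time averaged over three runs.
double predict_and_time(tflite::Interpreter& interpreter, const cv::Mat& rgb)
{
    using clock = std::chrono::steady_clock;
    const int runs = 3;

    auto t0 = clock::now();
    fill_input_tensor(interpreter, rgb);   // time needed to fill the input tensor
    auto t1 = clock::now();

    for (int i = 0; i < runs; ++i)
        interpreter.Invoke();              // inference, averaged over 'runs' iterations
    auto t2 = clock::now();

    double fill_ms  = std::chrono::duration<double, std::milli>(t1 - t0).count();
    double infer_ms = std::chrono::duration<double, std::milli>(t2 - t1).count() / runs;
    std::printf("fill: %.1f ms, average inference: %.1f ms\n", fill_ms, infer_ms);
    return fill_ms + infer_ms;
}
</pre>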
The same tests were also repeated without using a RAM disk, storing the files on an ext4 file system on a microSD card instead. No significant variations in the prediction times were observed.