=== Version 2 ===
The version 1 application was modified to accelerate inference using the ML module, i.e. the NPU integrated into the i.MX8M Plus SoC. Neither the floating-point model nor the half-quantized model works on the NPU. Moreover, as NXP's documentation states, "the GPU/ML module driver does not support per-channel quantization yet. Therefore post-training quantization of models with TensorFlow v2 cannot be used if the model is supposed to run on the GPU/ML module (inference on CPU does not have this limitation). TensorFlow v1 quantization-aware training and model conversion is recommended in this case". Consequently, only the fully-quantized model was tested.
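For illustration, the following minimal C++ sketch shows how a fully-quantized TensorFlow Lite model can be delegated to the NPU through TensorFlow Lite's NNAPI delegate. The model file name is a placeholder and error handling is kept to a minimum; this is a sketch of the general technique, not the application's actual code.

<syntaxhighlight lang="cpp">
// Minimal sketch: run a fully-quantized TFLite model on the NPU via NNAPI.
// The model path is a placeholder; build against the BSP's TensorFlow Lite.
#include <cstdio>
#include <memory>

#include "tensorflow/lite/interpreter.h"
#include "tensorflow/lite/kernels/register.h"
#include "tensorflow/lite/model.h"
#include "tensorflow/lite/delegates/nnapi/nnapi_delegate.h"

int main() {
  // Load the fully-quantized (uint8) model; float and half-quantized
  // models are not delegated to the NPU and would run on the CPU instead.
  auto model = tflite::FlatBufferModel::BuildFromFile("model_quant.tflite");
  if (!model) {
    std::fprintf(stderr, "Failed to load model\n");
    return 1;
  }

  tflite::ops::builtin::BuiltinOpResolver resolver;
  std::unique_ptr<tflite::Interpreter> interpreter;
  tflite::InterpreterBuilder(*model, resolver)(&interpreter);

  // Route supported operations to the GPU/ML module (NPU) through NNAPI.
  tflite::StatefulNnApiDelegate::Options options;
  auto* delegate = new tflite::StatefulNnApiDelegate(options);
  if (interpreter->ModifyGraphWithDelegate(delegate) != kTfLiteOk) {
    std::fprintf(stderr, "NNAPI delegation failed\n");
    return 1;
  }
  interpreter->AllocateTensors();

  // Fill the uint8 input tensor with image data here, then run inference.
  interpreter->Invoke();
  return 0;
}
</syntaxhighlight>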
=== Version 3 ===
A new C++ application was written to apply the inference to frames captured from an image sensor instead of images retrieved from files. As in version 2, inference runs on the NPU, so only the fully-quantized model was tested. A sketch of the resulting capture-and-infer loop is shown below.
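The following hypothetical sketch outlines such a capture-and-infer loop. OpenCV's <code>VideoCapture</code> is used purely for illustration (the actual application may rely on a different capture pipeline, e.g. GStreamer/V4L2), and the interpreter is assumed to be configured with the NNAPI delegate as in the previous sketch.

<syntaxhighlight lang="cpp">
// Sketch of a capture-and-infer loop feeding camera frames to a
// fully-quantized TFLite model. OpenCV is an assumption, not the
// application's documented capture method.
#include <cstdint>
#include <cstring>

#include <opencv2/opencv.hpp>
#include "tensorflow/lite/interpreter.h"

// 'interpreter' is assumed to be set up with the NNAPI delegate already;
// 'width' and 'height' are the model's input dimensions.
void capture_loop(tflite::Interpreter& interpreter, int width, int height) {
  cv::VideoCapture cap(0);  // first video device, e.g. /dev/video0
  cv::Mat frame, rgb;
  while (cap.read(frame)) {
    // Convert the captured frame to the RGB, model-sized uint8 layout
    // expected by the fully-quantized model.
    cv::cvtColor(frame, rgb, cv::COLOR_BGR2RGB);
    cv::resize(rgb, rgb, cv::Size(width, height));
    std::memcpy(interpreter.typed_input_tensor<uint8_t>(0),
                rgb.data, width * height * 3);
    interpreter.Invoke();  // inference runs on the NPU
    // Post-process the output tensor(s) here.
  }
}
</syntaxhighlight>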
== Running the applications ==
=== Version 2 ===
"The first execution of model inference using the NN API always takes many times longer, because of model graph initialization needed by the GPU/ML module"
=== Version 3 ===
== Results ==