The following variants of the classification model were tested:
* 32-bit floating-point model;
* half-quantized model (post-training 8-bit quantization of the weights only);
* fully-quantized model (TensorFlow v1 quantization-aware training and 8-bit quantization of the weights and activations); see the conversion sketch below.
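For reference, the three variants can be produced with the TensorFlow Lite converter along the lines of the following sketch. This is not the actual pipeline used for the wiki's models: the network, the file names, and the calibration generator are placeholders, and the fully-quantized path shown here is the TensorFlow v2 post-training flow, whereas the model tested below was produced with TensorFlow v1 quantization-aware training (see the note in the "Version 2" section).
<syntaxhighlight lang="python">
import numpy as np
import tensorflow as tf

# Stand-in classifier; the network actually used by the application is not shown here.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(224, 224, 3)),
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10, activation="softmax"),
])

def representative_dataset():
    # Calibration samples for full quantization; real images should be used in practice.
    for _ in range(100):
        yield [np.random.rand(1, 224, 224, 3).astype(np.float32)]

# 1) 32-bit floating-point model: plain conversion, no quantization.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
open("classifier_float.tflite", "wb").write(converter.convert())

# 2) Half-quantized model: post-training 8-bit quantization of the weights only.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
open("classifier_weights_only.tflite", "wb").write(converter.convert())

# 3) Fully-quantized model: 8-bit weights and activations. Note that this
#    TensorFlow v2 post-training flow yields per-channel quantization, which
#    the i.MX8M Plus ML module does not accept; the wiki's model was produced
#    with TensorFlow v1 quantization-aware training instead.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8
open("classifier_full_int8.tflite", "wb").write(converter.convert())
</syntaxhighlight>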
=== Version 2 ===
The version 1 application was then modified to accelerate the inference using the ML module (NPU) of the i.MX8M Plus SoC.
Neither the floating-point nor the half-quantized model works on the NPU (ML module). Moreover, "the GPU/ML module driver does not support per-channel quantization yet. Therefore post-training quantization of models with TensorFlow v2 cannot be used if the model is supposed to run on the GPU/ML module (inference on CPU does not have this limitation). TensorFlow v1 quantization-aware training and model conversion is recommended in this case".
Consequently, only the fully-quantized model was tested with the version 2 application.
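For illustration, the following minimal Python sketch shows how a fully-quantized model can be dispatched to the NPU through an external TensorFlow Lite delegate. The delegate library path is an assumption based on NXP's eIQ BSPs, which ship the VX delegate as <code>/usr/lib/libvx_delegate.so</code> (older releases routed NPU inference through the NNAPI delegate instead); the model file name is a placeholder as well.
<syntaxhighlight lang="python">
import numpy as np
import tflite_runtime.interpreter as tflite

# Load the NPU delegate; the library path is an assumption (recent NXP BSPs).
npu_delegate = tflite.load_delegate("/usr/lib/libvx_delegate.so")

interpreter = tflite.Interpreter(
    model_path="classifier_full_int8.tflite",  # placeholder file name
    experimental_delegates=[npu_delegate],
)
interpreter.allocate_tensors()

inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

# Dummy uint8 input matching the model's input tensor shape.
dummy = np.zeros(inp["shape"], dtype=np.uint8)
interpreter.set_tensor(inp["index"], dummy)
interpreter.invoke()  # runs on the ML module via the delegate
scores = interpreter.get_tensor(out["index"])[0]
print("top class:", int(scores.argmax()))
</syntaxhighlight>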
=== Version 3 ===
A new C++ application was written to apply the inference to frames captured from an image sensor instead of images retrieved from files. As in version 2, inference runs on the NPU, so only the fully-quantized model was tested with the version 3 application.
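For illustration only, a rough Python equivalent of that capture-and-classify loop is sketched below (the actual application, as mentioned above, is written in C++). It assumes the image sensor is exposed as the first V4L2 capture device and reuses the hypothetical delegate path and model file name from the previous sketch.
<syntaxhighlight lang="python">
import cv2
import numpy as np
import tflite_runtime.interpreter as tflite

interpreter = tflite.Interpreter(
    model_path="classifier_full_int8.tflite",  # placeholder file name
    experimental_delegates=[tflite.load_delegate("/usr/lib/libvx_delegate.so")],
)
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]
_, height, width, _ = inp["shape"]

cap = cv2.VideoCapture(0)  # first V4L2 capture device (assumption)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # Resize the frame to the model's input size and convert BGR -> RGB.
    rgb = cv2.cvtColor(cv2.resize(frame, (width, height)), cv2.COLOR_BGR2RGB)
    interpreter.set_tensor(inp["index"], np.expand_dims(rgb, 0).astype(np.uint8))
    interpreter.invoke()
    print("top class:", int(interpreter.get_tensor(out["index"]).argmax()))
cap.release()
</syntaxhighlight>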
== Running the applications ==
=== Version 1 ===
The following sections detail the execution of the first version of the classifier on the embedded platform. The number of threads was also tweaked in order to test different configurations. During the execution, the well-known <code>[https://en.wikipedia.org/wiki/Htop htop]</code> utility was used to monitor the system. This tool is very convenient for gathering useful information such as core allocation, processor load, and the number of running threads.
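With the Python TensorFlow Lite bindings, for instance, such thread-count experiments can be reproduced through the interpreter's <code>num_threads</code> parameter (available in recent TensorFlow releases; the model file name below is a placeholder): sweeping it while watching <code>htop</code> shows how the load is spread across the four Cortex-A53 cores of the i.MX8M Plus.
<syntaxhighlight lang="python">
import time
import numpy as np
import tflite_runtime.interpreter as tflite

# Sweep the interpreter thread count and time the CPU inference for each setting.
for num_threads in (1, 2, 3, 4):
    interpreter = tflite.Interpreter(
        model_path="classifier_float.tflite",  # placeholder file name
        num_threads=num_threads,
    )
    interpreter.allocate_tensors()
    inp = interpreter.get_input_details()[0]
    interpreter.set_tensor(inp["index"], np.zeros(inp["shape"], dtype=inp["dtype"]))
    start = time.monotonic()
    for _ in range(50):
        interpreter.invoke()
    elapsed = (time.monotonic() - start) / 50
    print(f"{num_threads} thread(s): {elapsed * 1000:.1f} ms/inference")
</syntaxhighlight>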
 
==== Floating-point model ====
<pre class="board-terminal">
TBD
</pre>
 
=== Version 2 ===