After that, the model was also recreated and retrained with quantization-aware training in TensorFlow 1.15; in this way, a fully quantized model was obtained after conversion.
In the end, three converted models were obtained: a regular 32-bit floating point one, an 8-bit half-quantized one (only the weights are quantized, not the activations), and a fully quantized one.
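To make the difference between these models concrete, the sketch below illustrates 8-bit affine quantization, the general scheme TFLite applies to weights (a real-valued range is mapped onto the integers 0–255 via a scale and a zero-point). This is an illustrative example, not the project's code; all names in it are assumptions:

```python
# Illustrative sketch of 8-bit affine quantization (q = round(v / scale) + zero_point).
# Not the project's code; function names are hypothetical.

def quantize_params(values, num_bits=8):
    """Compute scale and zero-point mapping [min, max] onto [0, 2**num_bits - 1]."""
    qmin, qmax = 0, 2 ** num_bits - 1
    lo, hi = min(values), max(values)
    lo, hi = min(lo, 0.0), max(hi, 0.0)   # the range must contain 0 so it maps exactly
    scale = (hi - lo) / (qmax - qmin) or 1.0
    zero_point = round(qmin - lo / scale)
    return scale, zero_point

def quantize(values, scale, zero_point):
    # Clamp to the 8-bit range after rounding.
    return [max(0, min(255, round(v / scale + zero_point))) for v in values]

def dequantize(q, scale, zero_point):
    return [(qi - zero_point) * scale for qi in q]

weights = [-1.5, -0.3, 0.0, 0.7, 2.1]
s, z = quantize_params(weights)
q = quantize(weights, s, z)
restored = dequantize(q, s, z)
# The round trip loses at most half a quantization step (scale / 2) per value.
```

In the half-quantized model only the weights go through such a mapping (activations stay in floating point), while the fully quantized model also quantizes the activations, which is what quantization-aware training makes possible without a large accuracy loss.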
The following images show the graphs of the models before and after conversion.
==Running the application==
In order to obtain reproducible and reliable results, some measures were taken:
* The inference was repeated several times and the average execution time was computed.
* All the files required to run the test (the executable, the image files, etc.) are stored on a tmpfs RAM disk, so that file system/storage medium overhead is negligible.
The following blocks show the execution of the classifier on the embedded platform. With the floating point model:
<pre class="board-terminal">
</pre>
== Results ==
As shown above, the total prediction times for a single image are:
* ~ 220 ms with the floating point model;
* ~ 330 ms with the half quantized model;
* ~ 200 ms with the fully quantized model.
The total prediction time takes into account both the time needed to fill the input tensor with the image and the average inference time over three predictions.
 
The same tests were also repeated without using a RAM disk, and the results were the same.