After that, the model was also recreated and retrained with quantization-aware training in TensorFlow 1.15; in this way, a fully quantized model was obtained after conversion.
In the end, three converted models were obtained: a regular 32-bit floating point one, an 8-bit half-quantized one (only the weights are quantized, not the activations), and a fully quantized one.
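To make the difference between these models concrete, the sketch below illustrates 8-bit affine quantization, the general scheme TFLite applies to weights (a real-valued range is mapped onto the integers 0–255 via a scale and a zero-point). This is an illustrative example, not the project's code; all names in it are assumptions:

```python
# Illustrative sketch of 8-bit affine quantization (q = round(v / scale) + zero_point).
# Not the project's code; function names are hypothetical.

def quantize_params(values, num_bits=8):
    """Compute scale and zero-point mapping [min, max] onto [0, 2**num_bits - 1]."""
    qmin, qmax = 0, 2 ** num_bits - 1
    lo, hi = min(values), max(values)
    lo, hi = min(lo, 0.0), max(hi, 0.0)   # the range must contain 0 so it maps exactly
    scale = (hi - lo) / (qmax - qmin) or 1.0
    zero_point = round(qmin - lo / scale)
    return scale, zero_point

def quantize(values, scale, zero_point):
    # Clamp to the 8-bit range after rounding.
    return [max(0, min(255, round(v / scale + zero_point))) for v in values]

def dequantize(q, scale, zero_point):
    return [(qi - zero_point) * scale for qi in q]

weights = [-1.5, -0.3, 0.0, 0.7, 2.1]
s, z = quantize_params(weights)
q = quantize(weights, s, z)
restored = dequantize(q, s, z)
# The round trip loses at most half a quantization step (scale / 2) per value.
```

In the half-quantized model only the weights go through such a mapping (activations stay in floating point), while the fully quantized model also quantizes the activations, which is what quantization-aware training makes possible without a large accuracy loss.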
The following images show the graphs of the models before and after conversion.
==Running the application==
In order to obtain reproducible and reliable results, some measures were taken:
* The inference was repeated several times and the average execution time was computed.
* All the files required to run the test (the executable, the image files, etc.) are stored on a tmpfs RAM disk, so that file system/storage medium overhead is negligible.
The following blocks show the execution of the classifier on the embedded platform. With the floating point model:
<pre class="board-terminal">
</pre>
== Results ==
As shown above, the total prediction times for a single image are:
* ~ 220 ms with the floating point model;
* ~ 330 ms with the half quantized model;
* ~ 200 ms with the fully quantized model.
The total prediction time takes into account both the time needed to fill the input tensor with the image and the average inference time over three predictions.
 
The same tests were also repeated without using a RAM disk, and the results were the same.