After performing the optimization, the new description of the computational graph is the following:
<pre>
...
</pre>
===Quantize the computational graph===
 
Inference is computationally expensive and requires high memory bandwidth to satisfy the low-latency and high-throughput requirements of edge applications. Neural networks are generally trained with 32-bit floating-point weights and activation values. With the Vitis AI quantizer, however, the computational complexity can be reduced without losing prediction accuracy by converting the 32-bit floating-point values to an 8-bit integer format. The resulting fixed-point network model requires less memory bandwidth and therefore provides higher speed and power efficiency than the floating-point model.
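To give an idea of what such a conversion does, the following minimal NumPy sketch illustrates per-tensor symmetric quantization of float32 values to 8-bit integers. It is a simplified illustration of the general technique, not the actual Vitis AI implementation, and the sample data is a stand-in:
<pre>
import numpy as np

# Per-tensor symmetric quantization: float32 -> int8 -> float32.
def quantize_int8(x, scale):
    """Round to the nearest int8 step and saturate to [-128, 127]."""
    q = np.round(x / scale)
    return np.clip(q, -128, 127).astype(np.int8)

def dequantize(q, scale):
    """Recover an approximate float32 tensor from the int8 one."""
    return q.astype(np.float32) * scale

# Calibration: choose the scale so that the observed dynamic range
# of the values maps onto the int8 range.
activations = np.random.randn(1000).astype(np.float32)  # stand-in data
scale = np.abs(activations).max() / 127.0

q = quantize_int8(activations, scale)
error = np.abs(dequantize(q, scale) - activations).max()
print(f"scale = {scale:.6f}, max abs error = {error:.6f}")
</pre>
The scale factor is what the calibration step described below estimates from real data: it must be chosen so that the typical range of each tensor fits into the integer range with as little rounding and clipping error as possible.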
In the quantize calibration process, only a small set of images is required to analyze the distribution of activations. Since no backpropagation is performed, there is no need to provide any labels either. Depending on the size of the neural network, the running time of quantize calibration varies from a few seconds to several minutes.
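The calibration images are fed to the quantizer through a user-supplied Python input function, which returns one batch of preprocessed images per calibration iteration. The following is a minimal sketch of such a function and of the corresponding vai_q_tensorflow invocation; the file names, node names, and input shape are placeholders that depend on the actual model, and the exact flags may vary with the Vitis AI version:
<pre>
# calib_input.py - calibration input function for vai_q_tensorflow.
# The .npy path and the input node name "input" are placeholders.
import numpy as np

BATCH_SIZE = 10
calib_images = np.load("calib_images.npy")  # preprocessed calibration images

def calib_input(iter):
    """Return one batch per calibration iteration, keyed by input node name."""
    batch = calib_images[iter * BATCH_SIZE : (iter + 1) * BATCH_SIZE]
    return {"input": batch}
</pre>
The quantizer is then invoked on the frozen graph, pointing it at this function:
<pre>
vai_q_tensorflow quantize \
    --input_frozen_graph frozen_graph.pb \
    --input_nodes input \
    --input_shapes ?,224,224,3 \
    --output_nodes predictions/Softmax \
    --input_fn calib_input.calib_input \
    --calib_iter 100 \
    --output_dir quantize_results
</pre>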
After calibration, the quantized model is transformed into a DPU-deployable model (named deploy_model.pb for vai_q_tensorflow) which follows the data format of the DPU. This model can then be compiled by the Vitis AI compiler and deployed to the DPU. The deployable model cannot be used by the standard TensorFlow framework to evaluate the loss of accuracy; for this purpose, a second file is produced (named quantize_eval_model.pb for vai_q_tensorflow). For the current application, 100 images are sampled from the training dataset and augmented, resulting in a total of 1000 images used for calibration. The graph is then calibrated with batches of 10 images over 100 iterations. The following log of vai_q_tensorflow shows the result of the whole quantization process:
<pre>