===Prune the model===
 
Weight pruning means eliminating unnecessary values in the weight tensors: in practice, some of the neural network's parameters are set to zero, removing unnecessary connections between the layers of the network. This is done during the training process, so that the network can adapt to the changes. An immediate benefit of this work is disk compression, since sparse tensors are amenable to compression. Hence, by applying simple file compression to the pruned TensorFlow checkpoint, the size of the model can be reduced for storage and/or transmission.
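As a reference, the following is a minimal sketch of how magnitude-based pruning can be applied during training with the TensorFlow Model Optimization toolkit (tfmot); the model, the dataset (<code>train_ds</code>), and the schedule values are placeholders, not the actual ones used here:
<pre>
import tensorflow as tf
import tensorflow_model_optimization as tfmot

# Wrap the model with the pruning wrapper; the sparsity is
# increased gradually from 0% to 50% during training.
pruning_params = {
    'pruning_schedule': tfmot.sparsity.keras.PolynomialDecay(
        initial_sparsity=0.0,
        final_sparsity=0.5,
        begin_step=0,
        end_step=1000)
}
model_for_pruning = tfmot.sparsity.keras.prune_low_magnitude(
    model, **pruning_params)

model_for_pruning.compile(optimizer='adam',
                          loss='sparse_categorical_crossentropy',
                          metrics=['acc'])

# The UpdatePruningStep callback keeps the pruning schedule
# in sync with the training steps.
model_for_pruning.fit(
    train_ds,
    epochs=10,
    callbacks=[tfmot.sparsity.keras.UpdatePruningStep()])
</pre>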
 
The weight sparsity of the model before applying pruning is reported below; note that there is essentially no sparsity in the weights of the model.
<pre>
predictions/bias:0 -- Param: 6 -- Zeros: 00.00%
</pre>
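A report like this can be produced by inspecting the weight tensors of the Keras model; the following is a sketch, assuming <code>model</code> is the loaded model:
<pre>
import numpy as np

# For each weight tensor, print the number of parameters and
# the percentage of zero-valued ones.
for w in model.weights:
    values = w.numpy()
    zeros = 100.0 * np.count_nonzero(values == 0) / values.size
    print(f"{w.name} -- Param: {values.size} -- Zeros: {zeros:05.2f}%")
</pre>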
 
The size in bytes of the compressed model before applying pruning:
<pre>
Size of gzipped loaded model: 17801431.00 bytes
</pre>
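This figure can be obtained with a helper along the lines of the one used in the official tfmot examples; the compressed copy is written to a temporary file just to measure its size:
<pre>
import os
import tempfile
import zipfile

import tensorflow as tf

def get_gzipped_model_size(model):
    # Save the model to a temporary file, compress it with
    # DEFLATE, and return the compressed size in bytes.
    _, keras_file = tempfile.mkstemp('.h5')
    tf.keras.models.save_model(model, keras_file, include_optimizer=False)
    _, zipped_file = tempfile.mkstemp('.zip')
    with zipfile.ZipFile(zipped_file, 'w',
                         compression=zipfile.ZIP_DEFLATED) as f:
        f.write(keras_file)
    return os.path.getsize(zipped_file)

print('Size of gzipped loaded model: %.2f bytes'
      % get_gzipped_model_size(model))
</pre>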
 
The accuracy of the non-pruned model on the test dataset:
<pre>
1/1 [==============================] - 0s 214ms/step - loss: 1.3166 - acc: 0.7083
</pre>
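This is the standard Keras evaluation step; <code>test_ds</code> is a placeholder for the actual test dataset:
<pre>
# Evaluate loss and accuracy on the test set.
loss, acc = model.evaluate(test_ds)
</pre>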
 
For this particular case, a good compromise between compression and accuracy drop is to prune only the two dense layers of the model, which hold a high number of parameters. The weight sparsity after pruning is shown below (bias tensors are not pruned, so their sparsity stays at zero):
<pre>
predictions/bias:0 -- Param: 6 -- Zeros: 00.00%
</pre>
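Pruning can be restricted to the dense layers by cloning the model and wrapping only those layers with the pruning wrapper; a sketch, again assuming <code>model</code> is the trained Keras model:
<pre>
import tensorflow as tf
import tensorflow_model_optimization as tfmot

def apply_pruning_to_dense(layer):
    # Wrap only the Dense layers with the pruning wrapper;
    # every other layer is returned unchanged.
    if isinstance(layer, tf.keras.layers.Dense):
        return tfmot.sparsity.keras.prune_low_magnitude(layer)
    return layer

model_for_pruning = tf.keras.models.clone_model(
    model,
    clone_function=apply_pruning_to_dense)
</pre>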
 
The size in bytes of the compressed model after pruning; the difference in disk occupation between the two compressed versions of the same model (before and after pruning) is remarkable: almost a factor of 3.
<pre>
Size of gzipped loaded model: 5795289.00 bytes
</pre>
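Before measuring the size (and before exporting in general), the pruning wrappers have to be removed with strip_pruning, so that only the sparse weights are saved; the helper is the same get_gzipped_model_size sketched above:
<pre>
import tensorflow_model_optimization as tfmot

# Remove the pruning wrappers, leaving only the (sparse) weights.
model_for_export = tfmot.sparsity.keras.strip_pruning(model_for_pruning)

print('Size of gzipped pruned model: %.2f bytes'
      % get_gzipped_model_size(model_for_export))
</pre>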
 
The accuracy of the pruned model on the test dataset can be measured with the same evaluation step used for the non-pruned model.