ML-TN-001 - AI at the edge: comparison of different embedded platforms - Part 3

From DAVE Developer's Wiki

Revision as of 15:33, 5 October 2020

Info Box
Applies to Machine Learning
Work in progress


History

Version    Date              Notes
1.0.0      September 2020    First public release

Introduction

This Technical Note (TN for short) belongs to the series introduced here. Specifically, it illustrates an inference application (a fruit classifier) that uses the model described in this section, running on the Xilinx Zynq UltraScale+ MPSoC ZCU104 Evaluation Kit. The results are also compared with those produced by the platforms considered in the previous articles of this series.

Building the application

Training the model

Pruning the model

Baseline model

conv2d_1/kernel:0    -- Param:      864 -- Zeros: 00.00%
conv2d_1/bias:0      -- Param:       32 -- Zeros: 00.00%
conv2d_2/kernel:0    -- Param:     9216 -- Zeros: 00.00%
conv2d_2/bias:0      -- Param:       32 -- Zeros: 00.00%
conv2d_3/kernel:0    -- Param:    18432 -- Zeros: 00.00%
conv2d_3/bias:0      -- Param:       64 -- Zeros: 00.00%
conv2d_4/kernel:0    -- Param:    73728 -- Zeros: 00.00%
conv2d_4/bias:0      -- Param:      128 -- Zeros: 00.00%
dense_1/kernel:0     -- Param:  4718592 -- Zeros: 00.00%
dense_1/bias:0       -- Param:      256 -- Zeros: 00.39%
predictions/kernel:0 -- Param:     1536 -- Zeros: 00.00%
predictions/bias:0   -- Param:        6 -- Zeros: 00.00%
Size of gzipped loaded model: 17801431.00 bytes
Test set
1/1 [==============================] - 0s 214ms/step - loss: 1.3166 - acc: 0.7083


Pruned model

conv2d_1/kernel:0    -- Param:      864 -- Zeros: 00.00%
conv2d_1/bias:0      -- Param:       32 -- Zeros: 00.00%
conv2d_2/kernel:0    -- Param:     9216 -- Zeros: 00.00%
conv2d_2/bias:0      -- Param:       32 -- Zeros: 00.00%
conv2d_3/kernel:0    -- Param:    18432 -- Zeros: 00.00%
conv2d_3/bias:0      -- Param:       64 -- Zeros: 00.00%
conv2d_4/kernel:0    -- Param:    73728 -- Zeros: 00.00%
conv2d_4/bias:0      -- Param:      128 -- Zeros: 00.00%
dense_1/kernel:0     -- Param:  4718592 -- Zeros: 80.00%
dense_1/bias:0       -- Param:      256 -- Zeros: 00.00%
predictions/kernel:0 -- Param:     1536 -- Zeros: 80.01%
predictions/bias:0   -- Param:        6 -- Zeros: 00.00%
Size of gzipped loaded model: 5795289.00 bytes
Test set
1/1 [==============================] - 0s 29ms/step - loss: 1.4578 - acc: 0.6667
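
The reports above list, for each weight tensor, the parameter count and the percentage of zero entries, followed by the gzip-compressed size of the model. The following is a minimal sketch of how such figures can be computed. It uses only the Python standard library and a synthetic tensor. The helper names (`zero_percentage`, `gzipped_size`, `prune_by_magnitude`) are hypothetical, and magnitude-based pruning is assumed here purely for illustration; this note does not state which pruning criterion was actually used.

```python
import gzip
import random
import struct


def zero_percentage(weights):
    """Return the percentage of entries that are exactly zero."""
    if not weights:
        return 0.0
    zeros = sum(1 for w in weights if w == 0.0)
    return 100.0 * zeros / len(weights)


def gzipped_size(weights):
    """Return the gzip-compressed size, in bytes, of the weights stored as float32."""
    raw = struct.pack(f"<{len(weights)}f", *weights)
    return len(gzip.compress(raw))


def prune_by_magnitude(weights, sparsity):
    """Zero out the given fraction of weights with the smallest magnitude (assumed criterion)."""
    n_prune = int(len(weights) * sparsity)
    # Indices sorted by increasing absolute value; the first n_prune get zeroed.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


random.seed(0)
dense = [random.gauss(0.0, 0.05) for _ in range(10_000)]  # synthetic stand-in tensor
sparse = prune_by_magnitude(dense, 0.80)

for name, tensor in (("dense", dense), ("pruned", sparse)):
    print(f"{name:8s} -- Param: {len(tensor):8d} "
          f"-- Zeros: {zero_percentage(tensor):05.2f}% "
          f"-- gzip: {gzipped_size(tensor)} bytes")
```

Long runs of zeros compress well, which is why the gzipped size drops sharply after pruning even though the number of stored parameters is unchanged.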

Freezing the computational graph

Baseline model

INFO:tensorflow:Froze 12 variables.
I1002 09:08:49.716494 140705992206144 graph_util_impl.py:334] Froze 12 variables.
INFO:tensorflow:Converted 12 variables to const ops.
I1002 09:08:49.776397 140705992206144 graph_util_impl.py:394] Converted 12 variables to const ops.

Transform the computational graph

Applied transformations

transformations_list = ['remove_nodes(op=Identity, op=CheckNumerics)', 
                        'merge_duplicate_nodes',
                        'strip_unused_nodes',
                        'fold_constants(ignore_errors=true)',
                        'fold_batch_norms']

Baseline model

describe             : frozen_graph.pb
input feature nodes  : ['images_in']
unused nodes         : []
output nodes         : ['predictions/kernel', 'predictions/bias', 'predictions/MatMul/ReadVariableOp', 'predictions/MatMul', 'predictions/BiasAdd/ReadVariableOp', 'predictions/BiasAdd', 'predictions/Softmax']
quantization nodes   : []
constant count       : 16
variable count       : 0
identity count       : 13
total nodes          : 56
Op: Placeholder          -- Name: images_in                     
Op: Const                -- Name: conv2d_1/kernel               
Op: Const                -- Name: conv2d_1/bias                 
Op: Identity             -- Name: conv2d_1/Conv2D/ReadVariableOp
Op: Conv2D               -- Name: conv2d_1/Conv2D               
Op: Identity             -- Name: conv2d_1/BiasAdd/ReadVariableOp
Op: BiasAdd              -- Name: conv2d_1/BiasAdd              
Op: Relu                 -- Name: conv2d_1/Relu                 
Op: MaxPool              -- Name: maxpool_1/MaxPool             
Op: Const                -- Name: conv2d_2/kernel               
Op: Const                -- Name: conv2d_2/bias                 
Op: Identity             -- Name: conv2d_2/Conv2D/ReadVariableOp
Op: Conv2D               -- Name: conv2d_2/Conv2D               
Op: Identity             -- Name: conv2d_2/BiasAdd/ReadVariableOp
Op: BiasAdd              -- Name: conv2d_2/BiasAdd              
Op: Relu                 -- Name: conv2d_2/Relu                 
Op: MaxPool              -- Name: maxpool_2/MaxPool             
Op: Const                -- Name: conv2d_3/kernel               
Op: Const                -- Name: conv2d_3/bias                 
Op: Identity             -- Name: conv2d_3/Conv2D/ReadVariableOp
Op: Conv2D               -- Name: conv2d_3/Conv2D               
Op: Identity             -- Name: conv2d_3/BiasAdd/ReadVariableOp
Op: BiasAdd              -- Name: conv2d_3/BiasAdd              
Op: Relu                 -- Name: conv2d_3/Relu                 
Op: MaxPool              -- Name: maxpool_3/MaxPool             
Op: Const                -- Name: conv2d_4/kernel               
Op: Const                -- Name: conv2d_4/bias                 
Op: Identity             -- Name: conv2d_4/Conv2D/ReadVariableOp
Op: Conv2D               -- Name: conv2d_4/Conv2D               
Op: Identity             -- Name: conv2d_4/BiasAdd/ReadVariableOp
Op: BiasAdd              -- Name: conv2d_4/BiasAdd              
Op: Relu                 -- Name: conv2d_4/Relu                 
Op: MaxPool              -- Name: maxpool_4/MaxPool             
Op: Shape                -- Name: flatten/Shape                 
Op: Const                -- Name: flatten/strided_slice/stack   
Op: Const                -- Name: flatten/strided_slice/stack_1 
Op: Const                -- Name: flatten/strided_slice/stack_2 
Op: StridedSlice         -- Name: flatten/strided_slice         
Op: Const                -- Name: flatten/Reshape/shape/1       
Op: Pack                 -- Name: flatten/Reshape/shape         
Op: Reshape              -- Name: flatten/Reshape               
Op: Const                -- Name: dense_1/kernel                
Op: Const                -- Name: dense_1/bias                  
Op: Identity             -- Name: dense_1/MatMul/ReadVariableOp 
Op: MatMul               -- Name: dense_1/MatMul                
Op: Identity             -- Name: dense_1/BiasAdd/ReadVariableOp
Op: BiasAdd              -- Name: dense_1/BiasAdd               
Op: Relu                 -- Name: dense_1/Relu                  
Op: Identity             -- Name: dropout_1/Identity            
Op: Const                -- Name: predictions/kernel            
Op: Const                -- Name: predictions/bias              
Op: Identity             -- Name: predictions/MatMul/ReadVariableOp
Op: MatMul               -- Name: predictions/MatMul            
Op: Identity             -- Name: predictions/BiasAdd/ReadVariableOp
Op: BiasAdd              -- Name: predictions/BiasAdd           
Op: Softmax              -- Name: predictions/Softmax
describe             : baseline_transf_graph.pb
input feature nodes  : ['images_in']
unused nodes         : []
output nodes         : ['predictions/MatMul', 'predictions/kernel', 'predictions/bias', 'predictions/Softmax', 'predictions/BiasAdd']
quantization nodes   : []
constant count       : 15
variable count       : 0
identity count       : 0
total nodes          : 42
Op: Conv2D               -- Name: conv2d_1/Conv2D               
Op: BiasAdd              -- Name: conv2d_2/BiasAdd              
Op: Relu                 -- Name: conv2d_4/Relu                 
Op: Conv2D               -- Name: conv2d_3/Conv2D               
Op: Const                -- Name: conv2d_2/kernel               
Op: MaxPool              -- Name: maxpool_4/MaxPool             
Op: Const                -- Name: conv2d_1/kernel               
Op: Const                -- Name: conv2d_3/kernel               
Op: Placeholder          -- Name: images_in                     
Op: Pack                 -- Name: flatten/Reshape/shape         
Op: Const                -- Name: conv2d_3/bias                 
Op: Const                -- Name: conv2d_4/kernel               
Op: Reshape              -- Name: flatten/Reshape               
Op: Shape                -- Name: flatten/Shape                 
Op: Conv2D               -- Name: conv2d_4/Conv2D               
Op: Const                -- Name: conv2d_2/bias                 
Op: MaxPool              -- Name: maxpool_2/MaxPool             
Op: Relu                 -- Name: conv2d_1/Relu                 
Op: MatMul               -- Name: predictions/MatMul            
Op: BiasAdd              -- Name: dense_1/BiasAdd               
Op: MaxPool              -- Name: maxpool_1/MaxPool             
Op: Const                -- Name: flatten/strided_slice/stack   
Op: Const                -- Name: dense_1/kernel                
Op: BiasAdd              -- Name: conv2d_1/BiasAdd              
Op: Const                -- Name: flatten/Reshape/shape/1       
Op: Const                -- Name: predictions/kernel            
Op: BiasAdd              -- Name: conv2d_4/BiasAdd              
Op: Const                -- Name: conv2d_1/bias                 
Op: Relu                 -- Name: conv2d_2/Relu                 
Op: Const                -- Name: flatten/strided_slice/stack_1 
Op: Const                -- Name: dense_1/bias                  
Op: Const                -- Name: predictions/bias              
Op: Conv2D               -- Name: conv2d_2/Conv2D               
Op: MaxPool              -- Name: maxpool_3/MaxPool             
Op: Const                -- Name: conv2d_4/bias                 
Op: Relu                 -- Name: dense_1/Relu                  
Op: Relu                 -- Name: conv2d_3/Relu                 
Op: Softmax              -- Name: predictions/Softmax           
Op: BiasAdd              -- Name: conv2d_3/BiasAdd              
Op: MatMul               -- Name: dense_1/MatMul                
Op: StridedSlice         -- Name: flatten/strided_slice         
Op: BiasAdd              -- Name: predictions/BiasAdd
Graph accuracy with test dataset: 0.7083
Graph accuracy with test dataset: 0.6667
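
In the two listings above, the identity count drops from 13 to 0 and the constant count from 16 to 15, which accounts for the reduction from 56 to 42 nodes (56 - 13 - 1 = 42). The only constant that no longer appears is `flatten/strided_slice/stack_2`, which would be consistent with it having been merged into an identical constant by `merge_duplicate_nodes`, although the logs do not confirm this. The following toy sketch illustrates the idea behind the `remove_nodes(op=Identity)` and `merge_duplicate_nodes` transformations on a plain Python dictionary. It is not the implementation used by the TensorFlow Graph Transform Tool, and the node names, values, and function names are hypothetical.

```python
def remove_identity_nodes(nodes):
    """Drop Identity nodes and rewire their consumers to the Identity's input.

    `nodes` maps a node name to a dict with keys "op" and "inputs"
    (plus "value" for Const nodes).
    """
    def resolve(name):
        # Follow chains of Identity nodes back to the first real producer.
        while nodes[name]["op"] == "Identity":
            name = nodes[name]["inputs"][0]
        return name

    result = {}
    for name, node in nodes.items():
        if node["op"] == "Identity":
            continue
        rewired = dict(node)
        rewired["inputs"] = [resolve(i) for i in node["inputs"]]
        result[name] = rewired
    return result


def merge_duplicate_consts(nodes):
    """Keep one Const per distinct value and redirect consumers to it."""
    canonical = {}  # value (as text) -> name of the first Const holding it
    alias = {}      # duplicate Const name -> canonical Const name
    for name, node in nodes.items():
        if node["op"] == "Const":
            key = repr(node["value"])
            if key in canonical:
                alias[name] = canonical[key]
            else:
                canonical[key] = name

    result = {}
    for name, node in nodes.items():
        if name in alias:
            continue
        merged = dict(node)
        merged["inputs"] = [alias.get(i, i) for i in node["inputs"]]
        result[name] = merged
    return result


# Hypothetical miniature graph, loosely modelled on the listings above.
graph = {
    "images_in": {"op": "Placeholder", "inputs": []},
    "kernel": {"op": "Const", "inputs": [], "value": [[0.5, -0.5]]},
    "kernel/read": {"op": "Identity", "inputs": ["kernel"]},
    "matmul": {"op": "MatMul", "inputs": ["images_in", "kernel/read"]},
    "stack": {"op": "Const", "inputs": [], "value": [0]},
    "stack_1": {"op": "Const", "inputs": [], "value": [1]},
    "stack_2": {"op": "Const", "inputs": [], "value": [1]},
    "slice": {"op": "StridedSlice",
              "inputs": ["matmul", "stack", "stack_1", "stack_2"]},
}

simplified = merge_duplicate_consts(remove_identity_nodes(graph))
print(len(graph), "->", len(simplified))  # 8 -> 6
print(simplified["matmul"]["inputs"])     # ['images_in', 'kernel']
print(simplified["slice"]["inputs"])      # ['matmul', 'stack', 'stack_1', 'stack_1']
```

Neither step changes what the graph computes, which matches the unchanged accuracy figures reported after the transformation.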

Quantize the computational graph

Baseline model

graph accuracy with test dataset: 0.7083

Pruned model

graph accuracy with test dataset: 0.7083
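
Quantization replaces 32-bit floating-point values with low-precision integers. As a rough illustration of the principle, the sketch below applies symmetric 8-bit quantization with a power-of-two scale to a handful of made-up weights. The scheme, the function names, and the sample values are assumptions chosen for clarity; the exact algorithm applied by the quantizer used in this note is not described here.

```python
import math


def choose_fraction_bits(values, bit_width=8):
    """Pick the largest number of fractional bits that still fits max |value|."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return bit_width - 1
    q_max = 2 ** (bit_width - 1) - 1  # 127 for 8 bits
    # Largest f such that max_abs * 2**f <= q_max.
    return math.floor(math.log2(q_max / max_abs))


def quantize(values, fraction_bits, bit_width=8):
    """Scale by 2**fraction_bits, round, and clamp to the signed integer range."""
    q_min = -(2 ** (bit_width - 1))
    q_max = 2 ** (bit_width - 1) - 1
    scale = 2.0 ** fraction_bits
    return [max(q_min, min(q_max, round(v * scale))) for v in values]


def dequantize(q_values, fraction_bits):
    """Map the integers back to approximate floating-point values."""
    scale = 2.0 ** fraction_bits
    return [q / scale for q in q_values]


weights = [0.42, -0.13, 0.07, -0.5, 0.25]  # made-up sample values
fraction_bits = choose_fraction_bits(weights)
q_weights = quantize(weights, fraction_bits)
restored = dequantize(q_weights, fraction_bits)

print(fraction_bits)  # 7
print(q_weights)      # [54, -17, 9, -64, 32]
print(max(abs(w - r) for w, r in zip(weights, restored)))
```

With 7 fractional bits the rounding error per value is at most 1/256, which gives an intuition for why the accuracy measured after quantization can stay close to that of the floating-point graph.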

Compiling the model

Baseline model

Kernel topology "custom_cnn_kernel_graph.jpg" for network "custom_cnn"
kernel list info for network "custom_cnn"
                               Kernel ID : Name
                                       0 : custom_cnn_0
                                       1 : custom_cnn_1

                             Kernel Name : custom_cnn_0
--------------------------------------------------------------------------------
                             Kernel Type : DPUKernel
                               Code Size : 0.02MB
                              Param Size : 4.60MB
                           Workload MACs : 498.21MOPS
                         IO Memory Space : 0.52MB
                              Mean Value : 0, 0, 0, 
                      Total Tensor Count : 7
                Boundary Input Tensor(s)   (H*W*C)
                          images_in:0(0) : 224*224*3

               Boundary Output Tensor(s)   (H*W*C)
                 predictions_MatMul:0(0) : 1*1*6

                        Total Node Count : 6
                           Input Node(s)   (H*W*C)
                      conv2d_1_Conv2D(0) : 224*224*3

                          Output Node(s)   (H*W*C)
                   predictions_MatMul(0) : 1*1*6




                             Kernel Name : custom_cnn_1
--------------------------------------------------------------------------------
                             Kernel Type : CPUKernel
                Boundary Input Tensor(s)   (H*W*C)
                predictions_Softmax:0(0) : 1*1*6

               Boundary Output Tensor(s)   (H*W*C)
                predictions_Softmax:0(0) : 1*1*6

                           Input Node(s)   (H*W*C)
                     predictions_Softmax : 1*1*6

                          Output Node(s)   (H*W*C)
                     predictions_Softmax : 1*1*6

Pruned model

Kernel topology "pruned_custom_cnn_kernel_graph.jpg" for network "pruned_custom_cnn"
kernel list info for network "pruned_custom_cnn"
                               Kernel ID : Name
                                       0 : pruned_custom_cnn_0
                                       1 : pruned_custom_cnn_1

                             Kernel Name : pruned_custom_cnn_0
--------------------------------------------------------------------------------
                             Kernel Type : DPUKernel
                               Code Size : 0.02MB
                              Param Size : 4.60MB
                           Workload MACs : 498.21MOPS
                         IO Memory Space : 0.52MB
                              Mean Value : 0, 0, 0, 
                      Total Tensor Count : 7
                Boundary Input Tensor(s)   (H*W*C)
                          images_in:0(0) : 224*224*3

               Boundary Output Tensor(s)   (H*W*C)
                 predictions_MatMul:0(0) : 1*1*6

                        Total Node Count : 6
                           Input Node(s)   (H*W*C)
                      conv2d_1_Conv2D(0) : 224*224*3

                          Output Node(s)   (H*W*C)
                   predictions_MatMul(0) : 1*1*6




                             Kernel Name : pruned_custom_cnn_1
--------------------------------------------------------------------------------
                             Kernel Type : CPUKernel
                Boundary Input Tensor(s)   (H*W*C)
                predictions_Softmax:0(0) : 1*1*6

               Boundary Output Tensor(s)   (H*W*C)
                predictions_Softmax:0(0) : 1*1*6

                           Input Node(s)   (H*W*C)
                     predictions_Softmax : 1*1*6

                          Output Node(s)   (H*W*C)
                     predictions_Softmax : 1*1*6
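
The Param Size of 4.60MB reported for both DPU kernels can be cross-checked against the per-tensor parameter counts listed in the pruning section. The short calculation below sums those counts and converts them to a size, assuming one byte per parameter (8-bit values) and 1 MB = 2^20 bytes. Both are assumptions; the compiler output does not state them.

```python
# Parameter counts copied from the per-tensor report in the pruning section.
param_counts = {
    "conv2d_1/kernel": 864,
    "conv2d_1/bias": 32,
    "conv2d_2/kernel": 9216,
    "conv2d_2/bias": 32,
    "conv2d_3/kernel": 18432,
    "conv2d_3/bias": 64,
    "conv2d_4/kernel": 73728,
    "conv2d_4/bias": 128,
    "dense_1/kernel": 4718592,
    "dense_1/bias": 256,
    "predictions/kernel": 1536,
    "predictions/bias": 6,
}

MIB = 2 ** 20  # assumed meaning of "MB" in the compiler report

total_params = sum(param_counts.values())
int8_size_mb = total_params * 1 / MIB     # one byte per parameter (assumption)
float32_size_mb = total_params * 4 / MIB  # four bytes per parameter

print(total_params)              # 4822886
print(f"{int8_size_mb:.2f}")     # 4.60
print(f"{float32_size_mb:.2f}")  # 18.40
```

Under these assumptions the total matches the reported 4.60MB. It would also explain why the baseline and pruned kernels report the same Param Size: weights set to zero are still stored in the dense tensors, so pruning of this kind reduces the compressed model size but not the parameter memory of the compiled kernel.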

Testing and performance