ML-TN-001 - AI at the edge: comparison of different embedded platforms - Part 3


Info Box

Applies to Machine Learning

Work in progress

History

Version	Date	Notes
1.0.0	September 2020	First public release

Introduction

This Technical Note (TN for short) belongs to the series introduced here. Specifically, it illustrates how this inference application (a fruit classifier) runs on the Xilinx Zynq UltraScale+ MPSoC ZCU104 Evaluation Kit. The results are also compared with those obtained on the other platforms discussed in this series.

Building the application

Training the model

Pruning the model

Baseline model

conv2d_1/kernel:0    -- Param:      864 -- Zeros: 00.00%
conv2d_1/bias:0      -- Param:       32 -- Zeros: 00.00%
conv2d_2/kernel:0    -- Param:     9216 -- Zeros: 00.00%
conv2d_2/bias:0      -- Param:       32 -- Zeros: 00.00%
conv2d_3/kernel:0    -- Param:    18432 -- Zeros: 00.00%
conv2d_3/bias:0      -- Param:       64 -- Zeros: 00.00%
conv2d_4/kernel:0    -- Param:    73728 -- Zeros: 00.00%
conv2d_4/bias:0      -- Param:      128 -- Zeros: 00.00%
dense_1/kernel:0     -- Param:  4718592 -- Zeros: 00.00%
dense_1/bias:0       -- Param:      256 -- Zeros: 00.39%
predictions/kernel:0 -- Param:     1536 -- Zeros: 00.00%
predictions/bias:0   -- Param:        6 -- Zeros: 00.00%

Size of gzipped loaded model: 17801431.00 bytes

Test set
1/1 [==============================] - 0s 214ms/step - loss: 1.3166 - acc: 0.7083

Pruned model

conv2d_1/kernel:0    -- Param:      864 -- Zeros: 00.00%
conv2d_1/bias:0      -- Param:       32 -- Zeros: 00.00%
conv2d_2/kernel:0    -- Param:     9216 -- Zeros: 00.00%
conv2d_2/bias:0      -- Param:       32 -- Zeros: 00.00%
conv2d_3/kernel:0    -- Param:    18432 -- Zeros: 00.00%
conv2d_3/bias:0      -- Param:       64 -- Zeros: 00.00%
conv2d_4/kernel:0    -- Param:    73728 -- Zeros: 00.00%
conv2d_4/bias:0      -- Param:      128 -- Zeros: 00.00%
dense_1/kernel:0     -- Param:  4718592 -- Zeros: 80.00%
dense_1/bias:0       -- Param:      256 -- Zeros: 00.00%
predictions/kernel:0 -- Param:     1536 -- Zeros: 80.01%
predictions/bias:0   -- Param:        6 -- Zeros: 00.00%

Size of gzipped loaded model: 5795289.00 bytes

Test set
1/1 [==============================] - 0s 29ms/step - loss: 1.4578 - acc: 0.6667
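
The two reports above differ in the share of zeros of the dense_1 and predictions kernels (about 80% in the second report) and in the gzipped size. As a rough illustration of how such figures relate, the sketch below computes the percentage of zeros in a flat list of weights and the compression ratio between the two reported sizes. The helper name `zero_fraction` is hypothetical: the scripts that produced the reports are not shown in this note.

```python
def zero_fraction(weights):
    """Return the percentage of exactly-zero entries in a flat sequence."""
    weights = list(weights)
    if not weights:
        return 0.0
    zeros = sum(1 for w in weights if w == 0)
    return 100.0 * zeros / len(weights)


# Gzipped sizes in bytes, copied from the two reports above.
SIZE_BEFORE = 17801431
SIZE_AFTER = 5795289

compression_ratio = SIZE_BEFORE / SIZE_AFTER
size_reduction = 100.0 * (1 - SIZE_AFTER / SIZE_BEFORE)

# A single zero among 256 values would give the 00.39% shown for dense_1/bias:0.
one_zero_in_256 = zero_fraction([0.0] + [0.1] * 255)

print(f"compression ratio: {compression_ratio:.2f}x")
print(f"size reduction:    {size_reduction:.1f}%")
print(f"1 zero in 256:     {one_zero_in_256:.2f}%")
```

With the sizes reported above, the gzipped model shrinks by roughly two thirds (a ratio of about 3.07x), even though the parameter counts of every tensor are unchanged: the gain comes from zeros compressing well, not from smaller tensors.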

Freezing the computational graph

Baseline model

INFO:tensorflow:Froze 12 variables.
I1002 09:08:49.716494 140705992206144 graph_util_impl.py:334] Froze 12 variables.
INFO:tensorflow:Converted 12 variables to const ops.
I1002 09:08:49.776397 140705992206144 graph_util_impl.py:394] Converted 12 variables to const ops.

Transform the computational graph

Applied transformations

transformations_list = ['remove_nodes(op=Identity, op=CheckNumerics)', 
                        'merge_duplicate_nodes',
                        'strip_unused_nodes',
                        'fold_constants(ignore_errors=true)',
                        'fold_batch_norms']
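
The strings in this list follow the syntax of TensorFlow's Graph Transform Tool. To give an intuition of what the first entry, `remove_nodes(op=Identity, ...)`, does, here is a toy sketch in plain Python. It is not the TensorFlow implementation: the dictionary-based graph, the function name `remove_identity_nodes`, and the edges between nodes are assumptions made for illustration (the listings in this note show node names and ops, not their connections).

```python
def remove_identity_nodes(nodes):
    """Drop Identity nodes and rewire their consumers to the Identity's input.

    `nodes` is a list of dicts with keys "name", "op" and "inputs".
    """
    # Map each Identity node to the node that feeds it.
    forward = {n["name"]: n["inputs"][0] for n in nodes if n["op"] == "Identity"}

    def resolve(name):
        # Follow chains of Identity nodes until a real producer is found.
        while name in forward:
            name = forward[name]
        return name

    return [
        {"name": n["name"], "op": n["op"], "inputs": [resolve(i) for i in n["inputs"]]}
        for n in nodes
        if n["op"] != "Identity"
    ]


# Fragment shaped like the first layer of the frozen graph (edges are assumed).
graph = [
    {"name": "images_in", "op": "Placeholder", "inputs": []},
    {"name": "conv2d_1/kernel", "op": "Const", "inputs": []},
    {"name": "conv2d_1/Conv2D/ReadVariableOp", "op": "Identity",
     "inputs": ["conv2d_1/kernel"]},
    {"name": "conv2d_1/Conv2D", "op": "Conv2D",
     "inputs": ["images_in", "conv2d_1/Conv2D/ReadVariableOp"]},
]

cleaned = remove_identity_nodes(graph)
for node in cleaned:
    print(node["op"], node["name"], node["inputs"])
```

In this toy graph the `ReadVariableOp` Identity node disappears and `conv2d_1/Conv2D` reads the constant kernel directly.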

Baseline model

describe             : frozen_graph.pb
input feature nodes  : ['images_in']
unused nodes         : []
output nodes         : ['predictions/kernel', 'predictions/bias', 'predictions/MatMul/ReadVariableOp', 'predictions/MatMul', 'predictions/BiasAdd/ReadVariableOp', 'predictions/BiasAdd', 'predictions/Softmax']
quantization nodes   : []
constant count       : 16
variable count       : 0
identity count       : 13
total nodes          : 56

Op: Placeholder          -- Name: images_in                     
Op: Const                -- Name: conv2d_1/kernel               
Op: Const                -- Name: conv2d_1/bias                 
Op: Identity             -- Name: conv2d_1/Conv2D/ReadVariableOp
Op: Conv2D               -- Name: conv2d_1/Conv2D               
Op: Identity             -- Name: conv2d_1/BiasAdd/ReadVariableOp
Op: BiasAdd              -- Name: conv2d_1/BiasAdd              
Op: Relu                 -- Name: conv2d_1/Relu                 
Op: MaxPool              -- Name: maxpool_1/MaxPool             
Op: Const                -- Name: conv2d_2/kernel               
Op: Const                -- Name: conv2d_2/bias                 
Op: Identity             -- Name: conv2d_2/Conv2D/ReadVariableOp
Op: Conv2D               -- Name: conv2d_2/Conv2D               
Op: Identity             -- Name: conv2d_2/BiasAdd/ReadVariableOp
Op: BiasAdd              -- Name: conv2d_2/BiasAdd              
Op: Relu                 -- Name: conv2d_2/Relu                 
Op: MaxPool              -- Name: maxpool_2/MaxPool             
Op: Const                -- Name: conv2d_3/kernel               
Op: Const                -- Name: conv2d_3/bias                 
Op: Identity             -- Name: conv2d_3/Conv2D/ReadVariableOp
Op: Conv2D               -- Name: conv2d_3/Conv2D               
Op: Identity             -- Name: conv2d_3/BiasAdd/ReadVariableOp
Op: BiasAdd              -- Name: conv2d_3/BiasAdd              
Op: Relu                 -- Name: conv2d_3/Relu                 
Op: MaxPool              -- Name: maxpool_3/MaxPool             
Op: Const                -- Name: conv2d_4/kernel               
Op: Const                -- Name: conv2d_4/bias                 
Op: Identity             -- Name: conv2d_4/Conv2D/ReadVariableOp
Op: Conv2D               -- Name: conv2d_4/Conv2D               
Op: Identity             -- Name: conv2d_4/BiasAdd/ReadVariableOp
Op: BiasAdd              -- Name: conv2d_4/BiasAdd              
Op: Relu                 -- Name: conv2d_4/Relu                 
Op: MaxPool              -- Name: maxpool_4/MaxPool             
Op: Shape                -- Name: flatten/Shape                 
Op: Const                -- Name: flatten/strided_slice/stack   
Op: Const                -- Name: flatten/strided_slice/stack_1 
Op: Const                -- Name: flatten/strided_slice/stack_2 
Op: StridedSlice         -- Name: flatten/strided_slice         
Op: Const                -- Name: flatten/Reshape/shape/1       
Op: Pack                 -- Name: flatten/Reshape/shape         
Op: Reshape              -- Name: flatten/Reshape               
Op: Const                -- Name: dense_1/kernel                
Op: Const                -- Name: dense_1/bias                  
Op: Identity             -- Name: dense_1/MatMul/ReadVariableOp 
Op: MatMul               -- Name: dense_1/MatMul                
Op: Identity             -- Name: dense_1/BiasAdd/ReadVariableOp
Op: BiasAdd              -- Name: dense_1/BiasAdd               
Op: Relu                 -- Name: dense_1/Relu                  
Op: Identity             -- Name: dropout_1/Identity            
Op: Const                -- Name: predictions/kernel            
Op: Const                -- Name: predictions/bias              
Op: Identity             -- Name: predictions/MatMul/ReadVariableOp
Op: MatMul               -- Name: predictions/MatMul            
Op: Identity             -- Name: predictions/BiasAdd/ReadVariableOp
Op: BiasAdd              -- Name: predictions/BiasAdd           
Op: Softmax              -- Name: predictions/Softmax

describe             : baseline_transf_graph.pb
input feature nodes  : ['images_in']
unused nodes         : []
output nodes         : ['predictions/MatMul', 'predictions/kernel', 'predictions/bias', 'predictions/Softmax', 'predictions/BiasAdd']
quantization nodes   : []
constant count       : 15
variable count       : 0
identity count       : 0
total nodes          : 42

Op: Conv2D               -- Name: conv2d_1/Conv2D               
Op: BiasAdd              -- Name: conv2d_2/BiasAdd              
Op: Relu                 -- Name: conv2d_4/Relu                 
Op: Conv2D               -- Name: conv2d_3/Conv2D               
Op: Const                -- Name: conv2d_2/kernel               
Op: MaxPool              -- Name: maxpool_4/MaxPool             
Op: Const                -- Name: conv2d_1/kernel               
Op: Const                -- Name: conv2d_3/kernel               
Op: Placeholder          -- Name: images_in                     
Op: Pack                 -- Name: flatten/Reshape/shape         
Op: Const                -- Name: conv2d_3/bias                 
Op: Const                -- Name: conv2d_4/kernel               
Op: Reshape              -- Name: flatten/Reshape               
Op: Shape                -- Name: flatten/Shape                 
Op: Conv2D               -- Name: conv2d_4/Conv2D               
Op: Const                -- Name: conv2d_2/bias                 
Op: MaxPool              -- Name: maxpool_2/MaxPool             
Op: Relu                 -- Name: conv2d_1/Relu                 
Op: MatMul               -- Name: predictions/MatMul            
Op: BiasAdd              -- Name: dense_1/BiasAdd               
Op: MaxPool              -- Name: maxpool_1/MaxPool             
Op: Const                -- Name: flatten/strided_slice/stack   
Op: Const                -- Name: dense_1/kernel                
Op: BiasAdd              -- Name: conv2d_1/BiasAdd              
Op: Const                -- Name: flatten/Reshape/shape/1       
Op: Const                -- Name: predictions/kernel            
Op: BiasAdd              -- Name: conv2d_4/BiasAdd              
Op: Const                -- Name: conv2d_1/bias                 
Op: Relu                 -- Name: conv2d_2/Relu                 
Op: Const                -- Name: flatten/strided_slice/stack_1 
Op: Const                -- Name: dense_1/bias                  
Op: Const                -- Name: predictions/bias              
Op: Conv2D               -- Name: conv2d_2/Conv2D               
Op: MaxPool              -- Name: maxpool_3/MaxPool             
Op: Const                -- Name: conv2d_4/bias                 
Op: Relu                 -- Name: dense_1/Relu                  
Op: Relu                 -- Name: conv2d_3/Relu                 
Op: Softmax              -- Name: predictions/Softmax           
Op: BiasAdd              -- Name: conv2d_3/BiasAdd              
Op: MatMul               -- Name: dense_1/MatMul                
Op: StridedSlice         -- Name: flatten/strided_slice         
Op: BiasAdd              -- Name: predictions/BiasAdd
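
The two summaries above can be reconciled with a little arithmetic: the node count drops from 56 to 42, the Identity count from 13 to 0, and the constant count from 16 to 15. The sketch below shows one way to count ops in listings shaped like the ones above and checks that the removed Identity and Const nodes account for the whole difference. The function name `count_ops` is hypothetical, and only a short excerpt of the listing is embedded.

```python
from collections import Counter


def count_ops(listing):
    """Count nodes per op type in lines shaped like 'Op: X -- Name: Y'."""
    counts = Counter()
    for line in listing.splitlines():
        line = line.strip()
        if not line.startswith("Op:"):
            continue
        op = line[len("Op:"):].split("--", 1)[0].strip()
        counts[op] += 1
    return counts


# Short excerpt of the frozen-graph listing.
sample = """
Op: Const                -- Name: conv2d_1/kernel
Op: Identity             -- Name: conv2d_1/Conv2D/ReadVariableOp
Op: Conv2D               -- Name: conv2d_1/Conv2D
Op: Identity             -- Name: conv2d_1/BiasAdd/ReadVariableOp
"""
sample_counts = count_ops(sample)

# Summary figures copied from the two reports above.
before = {"total": 56, "identity": 13, "const": 16}
after = {"total": 42, "identity": 0, "const": 15}

removed_identity = before["identity"] - after["identity"]
removed_const = before["const"] - after["const"]
expected_total = before["total"] - removed_identity - removed_const

print(dict(sample_counts))
print("expected total after transform:", expected_total)
```

The single constant that is no longer present is flatten/strided_slice/stack_2, which appears in the first listing but not in the second. A plausible explanation, not confirmed by the logs, is that `merge_duplicate_nodes` merged it with an identical constant.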

Baseline model

Graph accuracy with test dataset: 0.7083

Pruned model

Graph accuracy with test dataset: 0.6667

Quantize the computational graph

Baseline model

graph accuracy with test dataset: 0.7083

Pruned model

graph accuracy with test dataset: 0.7083
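
This note does not show the quantizer settings. As a minimal sketch of the general idea, assuming signed 8-bit fixed-point values with a power-of-two scale (an assumption, not something the logs above confirm), floating-point weights could be mapped to integers as follows. The function names are hypothetical.

```python
import math


def quantize_pow2(values, bits=8):
    """Quantize floats to signed integers with a power-of-two scale.

    Returns (integers, fraction_bits) such that
    value ~= integer / 2**fraction_bits.
    """
    qmax = 2 ** (bits - 1) - 1      # 127 for 8 bits
    qmin = -(2 ** (bits - 1))       # -128 for 8 bits
    peak = max(abs(v) for v in values)
    if peak == 0:
        return [0 for _ in values], 0
    # Largest number of fraction bits that keeps the peak within range.
    fraction_bits = math.floor(math.log2(qmax / peak))
    scale = 2.0 ** fraction_bits
    ints = [min(qmax, max(qmin, round(v * scale))) for v in values]
    return ints, fraction_bits


def dequantize_pow2(ints, fraction_bits):
    """Map the integers back to floats using the same power-of-two scale."""
    return [i / 2.0 ** fraction_bits for i in ints]


weights = [0.5, -0.25, 0.1, 0.0]     # made-up values for illustration
q, frac = quantize_pow2(weights)
restored = dequantize_pow2(q, frac)
print(q, frac, restored)
```

Each restored value differs from the original by at most half a quantization step, which is why a well-calibrated quantized graph can keep an accuracy close to that of the floating-point one.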

Compiling the model

Baseline model

Kernel topology "custom_cnn_kernel_graph.jpg" for network "custom_cnn"
kernel list info for network "custom_cnn"
                               Kernel ID : Name
                                       0 : custom_cnn_0
                                       1 : custom_cnn_1

                             Kernel Name : custom_cnn_0
--------------------------------------------------------------------------------
                             Kernel Type : DPUKernel
                               Code Size : 0.02MB
                              Param Size : 4.60MB
                           Workload MACs : 498.21MOPS
                         IO Memory Space : 0.52MB
                              Mean Value : 0, 0, 0, 
                      Total Tensor Count : 7
                Boundary Input Tensor(s)   (H*W*C)
                          images_in:0(0) : 224*224*3

               Boundary Output Tensor(s)   (H*W*C)
                 predictions_MatMul:0(0) : 1*1*6

                        Total Node Count : 6
                           Input Node(s)   (H*W*C)
                      conv2d_1_Conv2D(0) : 224*224*3

                          Output Node(s)   (H*W*C)
                   predictions_MatMul(0) : 1*1*6




                             Kernel Name : custom_cnn_1
--------------------------------------------------------------------------------
                             Kernel Type : CPUKernel
                Boundary Input Tensor(s)   (H*W*C)
                predictions_Softmax:0(0) : 1*1*6

               Boundary Output Tensor(s)   (H*W*C)
                predictions_Softmax:0(0) : 1*1*6

                           Input Node(s)   (H*W*C)
                     predictions_Softmax : 1*1*6

                          Output Node(s)   (H*W*C)
                     predictions_Softmax : 1*1*6
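
The "Workload MACs" figure of 498.21MOPS can be cross-checked from the layer shapes. The sketch below assumes 3x3 convolutions with stride 1 and 'valid' padding, each followed by 2x2 max pooling. These are assumptions inferred from the parameter counts listed earlier (for example 864 = 3*3*3*32, and the dense_1 kernel 4718592 = 18432*256), not settings stated in this note. Bias additions are ignored.

```python
def conv_macs(in_size, in_ch, out_ch, kernel=3):
    """MACs of a stride-1 'valid' convolution; returns (macs, out_size)."""
    out_size = in_size - kernel + 1
    macs = out_size * out_size * kernel * kernel * in_ch * out_ch
    return macs, out_size


# Assumed architecture: four 3x3 'valid' convolutions, each followed by
# 2x2 max pooling, on a 224*224*3 input.
size, channels, total_macs = 224, 3, 0
for out_ch in (32, 32, 64, 128):
    macs, size = conv_macs(size, channels, out_ch)
    total_macs += macs
    size //= 2                      # 2x2 max pooling, rounding down
    channels = out_ch

flatten_len = size * size * channels
total_macs += flatten_len * 256     # dense_1
total_macs += 256 * 6               # predictions

total_ops = 2 * total_macs          # one multiply and one add per MAC
print(f"flatten length: {flatten_len}")
print(f"MACs: {total_macs}, ops: {total_ops / 1e6:.2f} M")
```

Under these assumptions the flattened feature map has 18432 elements, matching the dense_1 kernel, and twice the MAC count comes to about 498.21 million operations. This suggests that the reported workload counts two operations per multiply-accumulate.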

Pruned model

Kernel topology "pruned_custom_cnn_kernel_graph.jpg" for network "pruned_custom_cnn"
kernel list info for network "pruned_custom_cnn"
                               Kernel ID : Name
                                       0 : pruned_custom_cnn_0
                                       1 : pruned_custom_cnn_1

                             Kernel Name : pruned_custom_cnn_0
--------------------------------------------------------------------------------
                             Kernel Type : DPUKernel
                               Code Size : 0.02MB
                              Param Size : 4.60MB
                           Workload MACs : 498.21MOPS
                         IO Memory Space : 0.52MB
                              Mean Value : 0, 0, 0, 
                      Total Tensor Count : 7
                Boundary Input Tensor(s)   (H*W*C)
                          images_in:0(0) : 224*224*3

               Boundary Output Tensor(s)   (H*W*C)
                 predictions_MatMul:0(0) : 1*1*6

                        Total Node Count : 6
                           Input Node(s)   (H*W*C)
                      conv2d_1_Conv2D(0) : 224*224*3

                          Output Node(s)   (H*W*C)
                   predictions_MatMul(0) : 1*1*6




                             Kernel Name : pruned_custom_cnn_1
--------------------------------------------------------------------------------
                             Kernel Type : CPUKernel
                Boundary Input Tensor(s)   (H*W*C)
                predictions_Softmax:0(0) : 1*1*6

               Boundary Output Tensor(s)   (H*W*C)
                predictions_Softmax:0(0) : 1*1*6

                           Input Node(s)   (H*W*C)
                     predictions_Softmax : 1*1*6

                          Output Node(s)   (H*W*C)
                     predictions_Softmax : 1*1*6
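
Both compiler reports show the same Param Size (4.60MB) and the same workload. As a quick check, the sketch below sums the per-tensor parameter counts listed in the pruning section and converts them to a size, assuming one byte per parameter and 1MB = 2^20 bytes (both are assumptions about how the compiler reports the figure).

```python
# Parameter counts copied from the per-tensor listing in this note.
PARAMS = {
    "conv2d_1/kernel": 864, "conv2d_1/bias": 32,
    "conv2d_2/kernel": 9216, "conv2d_2/bias": 32,
    "conv2d_3/kernel": 18432, "conv2d_3/bias": 64,
    "conv2d_4/kernel": 73728, "conv2d_4/bias": 128,
    "dense_1/kernel": 4718592, "dense_1/bias": 256,
    "predictions/kernel": 1536, "predictions/bias": 6,
}

MIB = 1024 * 1024
total_params = sum(PARAMS.values())
size_int8 = total_params * 1 / MIB   # assumed: one byte per parameter
size_fp32 = total_params * 4 / MIB   # for comparison: 32-bit floats

print(f"total parameters: {total_params}")
print(f"8-bit size:  {size_int8:.2f} MB")
print(f"32-bit size: {size_fp32:.2f} MB")
```

The 8-bit estimate comes to about 4.60MB, in line with the reported Param Size. The identical figures for the pruned model are consistent with pruning having zeroed individual weights without changing any tensor shape, so the compiled kernel stores and processes the same number of parameters.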

Testing and performance


DAVE Developer's Wiki ^β
