Difference between revisions of "ML-TN-001 - AI at the edge: comparison of different embedded platforms - Part 3"

Revision as of 15:33, 5 October 2020

HOME	SOMs	SBCs	ToloMEO Embedded Assistant	GET A QUOTE	ONLINE HELPDESK
	Roadmap		IoT Services			ML/AI services	Embedded Design Services

Info Box

Applies to Machine Learning

Work in progress

History[edit | edit source]

Version	Date	Notes
1.0.0	September 2020	First public release

Introduction[edit | edit source]

This Technical Note (TN for short) belongs to the series introduced here. Specifically, it illustrates the execution of an inference application (fruit classifier) that makes use of the model described in this section when executed on the Xilinx Zynq UltraScale+ MPSoC ZCU104 Evaluation Kit. The results achieved are also compared to the ones produced by the platforms that were considered in the previous articles of this series.

Building the application[edit | edit source]

Training the model[edit | edit source]

Pruning the model[edit | edit source]

conv2d_1/kernel:0    -- Param:      864 -- Zeros: 00.00%
conv2d_1/bias:0      -- Param:       32 -- Zeros: 00.00%
conv2d_2/kernel:0    -- Param:     9216 -- Zeros: 00.00%
conv2d_2/bias:0      -- Param:       32 -- Zeros: 00.00%
conv2d_3/kernel:0    -- Param:    18432 -- Zeros: 00.00%
conv2d_3/bias:0      -- Param:       64 -- Zeros: 00.00%
conv2d_4/kernel:0    -- Param:    73728 -- Zeros: 00.00%
conv2d_4/bias:0      -- Param:      128 -- Zeros: 00.00%
dense_1/kernel:0     -- Param:  4718592 -- Zeros: 00.00%
dense_1/bias:0       -- Param:      256 -- Zeros: 00.39%
predictions/kernel:0 -- Param:     1536 -- Zeros: 00.00%
predictions/bias:0   -- Param:        6 -- Zeros: 00.00%

Size of gzipped loaded model: 17801431.00 bytes

Test set
1/1 [==============================] - 0s 214ms/step - loss: 1.3166 - acc: 0.7083

conv2d_1/kernel:0    -- Param:      864 -- Zeros: 00.00%
conv2d_1/bias:0      -- Param:       32 -- Zeros: 00.00%
conv2d_2/kernel:0    -- Param:     9216 -- Zeros: 00.00%
conv2d_2/bias:0      -- Param:       32 -- Zeros: 00.00%
conv2d_3/kernel:0    -- Param:    18432 -- Zeros: 00.00%
conv2d_3/bias:0      -- Param:       64 -- Zeros: 00.00%
conv2d_4/kernel:0    -- Param:    73728 -- Zeros: 00.00%
conv2d_4/bias:0      -- Param:      128 -- Zeros: 00.00%
dense_1/kernel:0     -- Param:  4718592 -- Zeros: 80.00%
dense_1/bias:0       -- Param:      256 -- Zeros: 00.00%
predictions/kernel:0 -- Param:     1536 -- Zeros: 80.01%
predictions/bias:0   -- Param:        6 -- Zeros: 00.00%

Size of gzipped loaded model: 5795289.00 bytes

Test set
1/1 [==============================] - 0s 29ms/step - loss: 1.4578 - acc: 0.6667

Freezing the computational graph[edit | edit source]

Baseline model

INFO:tensorflow:Froze 12 variables.
I1002 09:08:49.716494 140705992206144 graph_util_impl.py:334] Froze 12 variables.
INFO:tensorflow:Converted 12 variables to const ops.
I1002 09:08:49.776397 140705992206144 graph_util_impl.py:394] Converted 12 variables to const ops.

Transform the computational graph[edit | edit source]

Applied transformations

transformations_list = ['remove_nodes(op=Identity, op=CheckNumerics)', 
                        'merge_duplicate_nodes',
                        'strip_unused_nodes',
                        'fold_constants(ignore_errors=true)',
                        'fold_batch_norms']

Baseline model

describe             : frozen_graph.pb
input feature nodes  : ['images_in']
unused nodes         : []
output nodes         : ['predictions/kernel', 'predictions/bias', 'predictions/MatMul/ReadVariableOp', 'predictions/MatMul', 'predictions/BiasAdd/ReadVariableOp', 'predictions/BiasAdd', 'predictions/Softmax']
quantization nodes   : []
constant count       : 16
variable count       : 0
identity count       : 13
total nodes          : 56

Op: Placeholder          -- Name: images_in                     
Op: Const                -- Name: conv2d_1/kernel               
Op: Const                -- Name: conv2d_1/bias                 
Op: Identity             -- Name: conv2d_1/Conv2D/ReadVariableOp
Op: Conv2D               -- Name: conv2d_1/Conv2D               
Op: Identity             -- Name: conv2d_1/BiasAdd/ReadVariableOp
Op: BiasAdd              -- Name: conv2d_1/BiasAdd              
Op: Relu                 -- Name: conv2d_1/Relu                 
Op: MaxPool              -- Name: maxpool_1/MaxPool             
Op: Const                -- Name: conv2d_2/kernel               
Op: Const                -- Name: conv2d_2/bias                 
Op: Identity             -- Name: conv2d_2/Conv2D/ReadVariableOp
Op: Conv2D               -- Name: conv2d_2/Conv2D               
Op: Identity             -- Name: conv2d_2/BiasAdd/ReadVariableOp
Op: BiasAdd              -- Name: conv2d_2/BiasAdd              
Op: Relu                 -- Name: conv2d_2/Relu                 
Op: MaxPool              -- Name: maxpool_2/MaxPool             
Op: Const                -- Name: conv2d_3/kernel               
Op: Const                -- Name: conv2d_3/bias                 
Op: Identity             -- Name: conv2d_3/Conv2D/ReadVariableOp
Op: Conv2D               -- Name: conv2d_3/Conv2D               
Op: Identity             -- Name: conv2d_3/BiasAdd/ReadVariableOp
Op: BiasAdd              -- Name: conv2d_3/BiasAdd              
Op: Relu                 -- Name: conv2d_3/Relu                 
Op: MaxPool              -- Name: maxpool_3/MaxPool             
Op: Const                -- Name: conv2d_4/kernel               
Op: Const                -- Name: conv2d_4/bias                 
Op: Identity             -- Name: conv2d_4/Conv2D/ReadVariableOp
Op: Conv2D               -- Name: conv2d_4/Conv2D               
Op: Identity             -- Name: conv2d_4/BiasAdd/ReadVariableOp
Op: BiasAdd              -- Name: conv2d_4/BiasAdd              
Op: Relu                 -- Name: conv2d_4/Relu                 
Op: MaxPool              -- Name: maxpool_4/MaxPool             
Op: Shape                -- Name: flatten/Shape                 
Op: Const                -- Name: flatten/strided_slice/stack   
Op: Const                -- Name: flatten/strided_slice/stack_1 
Op: Const                -- Name: flatten/strided_slice/stack_2 
Op: StridedSlice         -- Name: flatten/strided_slice         
Op: Const                -- Name: flatten/Reshape/shape/1       
Op: Pack                 -- Name: flatten/Reshape/shape         
Op: Reshape              -- Name: flatten/Reshape               
Op: Const                -- Name: dense_1/kernel                
Op: Const                -- Name: dense_1/bias                  
Op: Identity             -- Name: dense_1/MatMul/ReadVariableOp 
Op: MatMul               -- Name: dense_1/MatMul                
Op: Identity             -- Name: dense_1/BiasAdd/ReadVariableOp
Op: BiasAdd              -- Name: dense_1/BiasAdd               
Op: Relu                 -- Name: dense_1/Relu                  
Op: Identity             -- Name: dropout_1/Identity            
Op: Const                -- Name: predictions/kernel            
Op: Const                -- Name: predictions/bias              
Op: Identity             -- Name: predictions/MatMul/ReadVariableOp
Op: MatMul               -- Name: predictions/MatMul            
Op: Identity             -- Name: predictions/BiasAdd/ReadVariableOp
Op: BiasAdd              -- Name: predictions/BiasAdd           
Op: Softmax              -- Name: predictions/Softmax

describe             : baseline_transf_graph.pb
input feature nodes  : ['images_in']
unused nodes         : []
output nodes         : ['predictions/MatMul', 'predictions/kernel', 'predictions/bias', 'predictions/Softmax', 'predictions/BiasAdd']
quantization nodes   : []
constant count       : 15
variable count       : 0
identity count       : 0
total nodes          : 42

Op: Conv2D               -- Name: conv2d_1/Conv2D               
Op: BiasAdd              -- Name: conv2d_2/BiasAdd              
Op: Relu                 -- Name: conv2d_4/Relu                 
Op: Conv2D               -- Name: conv2d_3/Conv2D               
Op: Const                -- Name: conv2d_2/kernel               
Op: MaxPool              -- Name: maxpool_4/MaxPool             
Op: Const                -- Name: conv2d_1/kernel               
Op: Const                -- Name: conv2d_3/kernel               
Op: Placeholder          -- Name: images_in                     
Op: Pack                 -- Name: flatten/Reshape/shape         
Op: Const                -- Name: conv2d_3/bias                 
Op: Const                -- Name: conv2d_4/kernel               
Op: Reshape              -- Name: flatten/Reshape               
Op: Shape                -- Name: flatten/Shape                 
Op: Conv2D               -- Name: conv2d_4/Conv2D               
Op: Const                -- Name: conv2d_2/bias                 
Op: MaxPool              -- Name: maxpool_2/MaxPool             
Op: Relu                 -- Name: conv2d_1/Relu                 
Op: MatMul               -- Name: predictions/MatMul            
Op: BiasAdd              -- Name: dense_1/BiasAdd               
Op: MaxPool              -- Name: maxpool_1/MaxPool             
Op: Const                -- Name: flatten/strided_slice/stack   
Op: Const                -- Name: dense_1/kernel                
Op: BiasAdd              -- Name: conv2d_1/BiasAdd              
Op: Const                -- Name: flatten/Reshape/shape/1       
Op: Const                -- Name: predictions/kernel            
Op: BiasAdd              -- Name: conv2d_4/BiasAdd              
Op: Const                -- Name: conv2d_1/bias                 
Op: Relu                 -- Name: conv2d_2/Relu                 
Op: Const                -- Name: flatten/strided_slice/stack_1 
Op: Const                -- Name: dense_1/bias                  
Op: Const                -- Name: predictions/bias              
Op: Conv2D               -- Name: conv2d_2/Conv2D               
Op: MaxPool              -- Name: maxpool_3/MaxPool             
Op: Const                -- Name: conv2d_4/bias                 
Op: Relu                 -- Name: dense_1/Relu                  
Op: Relu                 -- Name: conv2d_3/Relu                 
Op: Softmax              -- Name: predictions/Softmax           
Op: BiasAdd              -- Name: conv2d_3/BiasAdd              
Op: MatMul               -- Name: dense_1/MatMul                
Op: StridedSlice         -- Name: flatten/strided_slice         
Op: BiasAdd              -- Name: predictions/BiasAdd

Graph accuracy with test dataset: 0.7083

Graph accuracy with test dataset: 0.6667

Quantize the computational graph[edit | edit source]

Baseline model

graph accuracy with test dataset: 0.7083

Pruned model

graph accuracy with test dataset: 0.7083

Compiling the model[edit | edit source]

Baseline model

Kernel topology "custom_cnn_kernel_graph.jpg" for network "custom_cnn"
kernel list info for network "custom_cnn"
                               Kernel ID : Name
                                       0 : custom_cnn_0
                                       1 : custom_cnn_1

                             Kernel Name : custom_cnn_0
--------------------------------------------------------------------------------
                             Kernel Type : DPUKernel
                               Code Size : 0.02MB
                              Param Size : 4.60MB
                           Workload MACs : 498.21MOPS
                         IO Memory Space : 0.52MB
                              Mean Value : 0, 0, 0, 
                      Total Tensor Count : 7
                Boundary Input Tensor(s)   (H*W*C)
                          images_in:0(0) : 224*224*3

               Boundary Output Tensor(s)   (H*W*C)
                 predictions_MatMul:0(0) : 1*1*6

                        Total Node Count : 6
                           Input Node(s)   (H*W*C)
                      conv2d_1_Conv2D(0) : 224*224*3

                          Output Node(s)   (H*W*C)
                   predictions_MatMul(0) : 1*1*6




                             Kernel Name : custom_cnn_1
--------------------------------------------------------------------------------
                             Kernel Type : CPUKernel
                Boundary Input Tensor(s)   (H*W*C)
                predictions_Softmax:0(0) : 1*1*6

               Boundary Output Tensor(s)   (H*W*C)
                predictions_Softmax:0(0) : 1*1*6

                           Input Node(s)   (H*W*C)
                     predictions_Softmax : 1*1*6

                          Output Node(s)   (H*W*C)
                     predictions_Softmax : 1*1*6

Pruned model

Kernel topology "pruned_custom_cnn_kernel_graph.jpg" for network "pruned_custom_cnn"
kernel list info for network "pruned_custom_cnn"
                               Kernel ID : Name
                                       0 : pruned_custom_cnn_0
                                       1 : pruned_custom_cnn_1

                             Kernel Name : pruned_custom_cnn_0
--------------------------------------------------------------------------------
                             Kernel Type : DPUKernel
                               Code Size : 0.02MB
                              Param Size : 4.60MB
                           Workload MACs : 498.21MOPS
                         IO Memory Space : 0.52MB
                              Mean Value : 0, 0, 0, 
                      Total Tensor Count : 7
                Boundary Input Tensor(s)   (H*W*C)
                          images_in:0(0) : 224*224*3

               Boundary Output Tensor(s)   (H*W*C)
                 predictions_MatMul:0(0) : 1*1*6

                        Total Node Count : 6
                           Input Node(s)   (H*W*C)
                      conv2d_1_Conv2D(0) : 224*224*3

                          Output Node(s)   (H*W*C)
                   predictions_MatMul(0) : 1*1*6




                             Kernel Name : pruned_custom_cnn_1
--------------------------------------------------------------------------------
                             Kernel Type : CPUKernel
                Boundary Input Tensor(s)   (H*W*C)
                predictions_Softmax:0(0) : 1*1*6

               Boundary Output Tensor(s)   (H*W*C)
                predictions_Softmax:0(0) : 1*1*6

                           Input Node(s)   (H*W*C)
                     predictions_Softmax : 1*1*6

                          Output Node(s)   (H*W*C)
                     predictions_Softmax : 1*1*6

Testing and performances[edit | edit source]

HOME	SOMs	SBCs	ToloMEO Embedded Assistant	GET A QUOTE	ONLINE HELPDESK
	Roadmap		IoT Services			ML/AI services	Embedded Design Services

Difference between revisions of "ML-TN-001 - AI at the edge: comparison of different embedded platforms - Part 3"

Revision as of 15:33, 5 October 2020

Contents

History[edit | edit source]

Introduction[edit | edit source]

Building the application[edit | edit source]

Training the model[edit | edit source]

Pruning the model[edit | edit source]

Freezing the computational graph[edit | edit source]

Transform the computational graph[edit | edit source]

Quantize the computational graph[edit | edit source]

Compiling the model[edit | edit source]

Testing and performances[edit | edit source]

Navigation menu

Personal tools

Namespaces

Variants

Views

More

Search

Navigation

Quick Links

Contact us

How to use wiki

Advanced Search

Tools