ML-TN-003 — AI at the edge: visual inspection of assembled PCBs for defect detection — Part 1


Applies to: Machine Learning


History

Version Date Notes
1.0.0 March 2021 First public release

Introduction

In the ML-TN-001 series of articles (AI at the edge: comparison of different embedded platforms), different embedded platforms suited for building "Edge AI" solutions are compared in terms of inferencing capabilities and features, development tools, and so on. In principle, such platforms can drive a wide range of applications in the industrial world and in other fields as well.

This series of Technical Notes illustrates a feasibility study regarding a common problem in the manufacturing realm that Machine Learning (ML) algorithms are expected to address effectively: defect detection by automatic visual inspection. More specifically, this study deals with the inspection of assembled Printed Circuit Boards (PCBs). The ultimate goal is to determine whether it is possible to design innovative machines that exploit ML algorithms and outperform the traditional equipment employed for this task today. This is a prime example of AI at the edge, as the application requirements include the following:

  • Data — images in this case — must be processed where they originate.
  • Processing latency has to be minimized in order to increase manufacturing line efficiency.


Currently, latency is one of the main factors driving many companies to move from the cloud to the edge, together with the fact that it is not reasonably feasible to afford a GPU for every use case. This has led to the birth of a new computational paradigm called "Edge AI", which combines the efficiency, speed, scalability, and reduced costs of edge computing with the advantages offered by Artificial Intelligence and Machine Learning models. "Edge AI", "intelligence on the edge", or "edge Machine Learning" means that data is processed locally — i.e. near its source — by algorithms running on a hardware device, instead of being processed by algorithms located in the cloud. This not only enables real-time operation, but also helps to significantly reduce the power consumption and the security vulnerabilities associated with processing data in the cloud.

While moving from the cloud to the edge is a vital step in solving resource-constraint issues, many Machine Learning models still require too much computing power and memory to fit the small microprocessors available on the market. Many approach this challenge by creating more efficient software, algorithms, and hardware, or by combining these components in specialized ways. To this end, a new generation of purpose-built accelerators is emerging as chip manufacturers work to speed up and optimize AI and Machine Learning workloads, from training to inference. Faster, cheaper, more power-efficient, and scalable, these accelerators promise to boost edge devices to a new level of performance. In this work, a modern system-on-chip (SoC) embedding a configurable hardware accelerator of this sort was analyzed with a view to using it as a core building block of such machines. Its applicability was also studied in a real-world scenario characterized by issues that are common to a large class of problems in the industrial realm.

Articles in this series

  • Part 1 (this document)
  • Part 2 talks about the classification of surface-mounted components on printed circuit boards.
  • Part 3 deals with the issue of data scarcity.

Test Bed

Dataset

Samples from the FICS-PCB dataset


Samples per class in the Microscope and DSLR subsets


Dataset processing and augmentation


Image augmentation for training samples
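
As the last two figures suggest, the training samples are expanded through image augmentation before being fed to the networks. The snippet below is a minimal, illustrative sketch of this kind of pipeline (random flips, rotations, and brightness/contrast jitter) written with TensorFlow image ops; it is not the exact processing used for this work, and the dataset path and image size are placeholder assumptions.

<pre>
import tensorflow as tf

# Minimal augmentation sketch (illustrative, not the exact pipeline of this work):
# random flips, 90-degree rotations, and photometric jitter applied to each crop.
def augment(images):
    images = tf.image.random_flip_left_right(images)
    images = tf.image.random_flip_up_down(images)
    images = tf.image.rot90(images, k=tf.random.uniform([], 0, 4, dtype=tf.int32))
    images = tf.image.random_brightness(images, max_delta=0.1)
    images = tf.image.random_contrast(images, lower=0.9, upper=1.1)
    return tf.clip_by_value(images, 0.0, 1.0)

# Hypothetical per-class folder layout: dataset/train/<class name>/*.png
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    "dataset/train", image_size=(224, 224), batch_size=32)
train_ds = train_ds.map(
    lambda images, labels: (augment(tf.cast(images, tf.float32) / 255.0), labels),
    num_parallel_calls=tf.data.AUTOTUNE).prefetch(tf.data.AUTOTUNE)
</pre>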


Models

ResNet50

Train and validation accuracy trend over 1000 training epochs for ResNet50 model
Train and validation loss trend over 1000 training epochs for ResNet50 model
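
The curves above were obtained by fine-tuning the network on the six component classes for 1000 epochs. For reference, a minimal Keras transfer-learning setup for a ResNet50 backbone could look like the sketch below; the input resolution, optimizer, and learning rate are illustrative assumptions, and train_ds/val_ds are pipelines like the one sketched in the Dataset section (the actual training configuration used for this work may differ).

<pre>
import tensorflow as tf

NUM_CLASSES = 6  # IC, capacitor, diode, inductor, resistor, transistor

# ImageNet-pretrained backbone with a new softmax classification head.
base = tf.keras.applications.ResNet50(
    include_top=False, weights="imagenet", input_shape=(224, 224, 3), pooling="avg")
outputs = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(base.output)
model = tf.keras.Model(inputs=base.input, outputs=outputs)

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# train_ds and val_ds are the training and validation tf.data pipelines.
history = model.fit(train_ds, validation_data=val_ds, epochs=1000)
</pre>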


lorem ipsum lorem ipsum lorem ipsum


Confusion matrix of ResNet50 model on host machine before quantization
Host machine, classification report
Class Precision Recall F1-score Support
IC 0.95740 0.89900 0.92728 1000
capacitor 0.97278 0.96500 0.96888 1000
diode 0.88558 0.95200 0.91759 1000
inductor 0.97006 0.97200 0.97103 1000
resistor 0.98882 0.97300 0.98085 1000
transistor 0.92262 0.93000 0.92629 1000
Weighted avg 0.94954 0.94850 0.94865 6000
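
Reports like the one above can be produced from the predictions on the test set with scikit-learn, for example as in the following sketch (model, test_images, and test_labels are assumed to be already available; the Support column indicates 1000 test samples per class):

<pre>
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix

CLASSES = ["IC", "capacitor", "diode", "inductor", "resistor", "transistor"]

# Predicted class = argmax of the softmax output for each test image.
y_pred = np.argmax(model.predict(test_images), axis=1)
y_true = test_labels

print(confusion_matrix(y_true, y_pred))
print(classification_report(y_true, y_pred, target_names=CLASSES, digits=5))
</pre>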


lorem ipsum lorem ipsum lorem ipsum


Confusion matrix of ResNet50 model on target device after quantization
Target device, classification report
Class Precision Recall F1-score Support
IC 0.96384 0.85300 0.90504 1000
capacitor 0.99068 0.95700 0.97355 1000
diode 0.83779 0.94000 0.88596 1000
inductor 0.94839 0.97400 0.96103 1000
resistor 0.97211 0.97600 0.97405 1000
transistor 0.89960 0.89600 0.89780 1000
Weighted avg 0.93540 0.93267 0.93290 6000
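
The target-side results refer to the model after quantization for the DPU. As a reference only, a post-training quantization step with the Vitis AI TensorFlow2 flow looks roughly like the sketch below; the model, calibration dataset, and file names are placeholder assumptions, and the exact toolchain and commands used for this work may differ.

<pre>
from tensorflow_model_optimization.quantization.keras import vitis_quantize

# Post-training quantization of the floating-point model using a small,
# unlabeled calibration subset of the training images.
quantizer = vitis_quantize.VitisQuantizer(float_model)
quantized_model = quantizer.quantize_model(calib_dataset=calib_ds)
quantized_model.save("resnet50_pcb_quantized.h5")

# The quantized model is then compiled for the target DPU configuration, e.g.:
#   vai_c_tensorflow2 -m resnet50_pcb_quantized.h5 -a arch.json \
#                     -o compiled_model -n resnet50_pcb
</pre>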


lorem ipsum lorem ipsum lorem ipsum


Utilization of CPU and DPU cores of ResNet50 model for 1, 2, and 4 threads
DPU latency of ResNet50 model for 1, 2, and 4 threads
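
The core-utilization and latency figures are obtained by running the compiled model on the DPU with 1, 2, and 4 worker threads. The sketch below shows how such a multi-threaded benchmark can be structured with the Vitis AI runtime (VART); the .xmodel name, dummy input data, and iteration count are illustrative assumptions rather than the exact benchmarking code used here.

<pre>
import threading
import time
import numpy as np
import vart
import xir

def get_dpu_subgraph(graph):
    # A compiled .xmodel contains one DPU subgraph plus CPU subgraphs.
    root = graph.get_root_subgraph()
    return [s for s in root.toposort_child_subgraph()
            if s.has_attr("device") and s.get_attr("device").upper() == "DPU"][0]

def worker(runner, n_iter):
    in_dims = tuple(runner.get_input_tensors()[0].dims)
    out_dims = tuple(runner.get_output_tensors()[0].dims)
    inp = np.zeros(in_dims, dtype=np.float32)   # dummy batch; real code feeds test images
    out = np.zeros(out_dims, dtype=np.float32)
    for _ in range(n_iter):
        job_id = runner.execute_async([inp], [out])
        runner.wait(job_id)

graph = xir.Graph.deserialize("resnet50_pcb.xmodel")  # hypothetical compiled model
subgraph = get_dpu_subgraph(graph)

for n_threads in (1, 2, 4):
    runners = [vart.Runner.create_runner(subgraph, "run") for _ in range(n_threads)]
    threads = [threading.Thread(target=worker, args=(r, 250)) for r in runners]
    start = time.time()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    elapsed = time.time() - start
    print("%d thread(s): %.1f fps" % (n_threads, n_threads * 250 / elapsed))
</pre>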


ResNet101

Train and validation accuracy trend over 1000 training epochs for ResNet101 model
Train and validation loss trend over 1000 training epochs for ResNet101 model


lorem ipsum lorem ipsum lorem ipsum


Confusion matrix of ResNet101 model on host machine before quantization
Host machine, classification report
Class Precision Recall F1-score Support
IC 0.96375 0.95700 0.96036 1000
capacitor 0.96373 0.98300 0.97327 1000
diode 0.96425 0.94400 0.95402 1000
inductor 0.98500 0.98500 0.98500 1000
resistor 0.98504 0.98800 0.98652 1000
transistor 0.96517 0.97000 0.96758 1000
Weighted avg 0.97116 0.97117 0.97112 6000


lorem ipsum lorem ipsum lorem ipsum


Confusion matrix of ResNet101 model on target device after quantization
Target device, classification report
Class Precision Recall F1-score Support
IC 0.96288 0.88200 0.92067 1000
capacitor 0.95898 0.98200 0.97036 1000
diode 0.93965 0.90300 0.92096 1000
inductor 0.93719 0.95500 0.94601 1000
resistor 0.90428 0.99200 0.94611 1000
transistor 0.93896 0.92300 0.93091 1000
Weighted avg 0.94033 0.93950 0.93917 6000


Utilization of CPU and DPU cores of ResNet101 model for 1, 2, and 4 threads
DPU latency of ResNet101 model for 1, 2, and 4 threads


ResNet152

Train and validation accuracy trend over 1000 training epochs for ResNet152 model
Train and validation loss trend over 1000 training epochs for ResNet152 model


lorem ipsum lorem ipsum lorem ipsum


Confusion matrix of ResNet152 model on host machine before quantization
Host machine, classification report
Class Precision Recall F1-score Support
IC 0.94553 0.97200 0.95858 1000
capacitor 0.95538 0.98500 0.96997 1000
diode 0.98298 0.92400 0.95258 1000
inductor 0.98584 0.97500 0.98039 1000
resistor 0.99390 0.97800 0.98589 1000
transistor 0.92899 0.95500 0.94181 1000
Weighted avg 0.96544 0.96483 0.96487 6000


lorem ipsum lorem ipsum lorem ipsum


Confusion matrix of ResNet152 model on target device after quantization
Target device, classification report
Class Precision Recall F1-score Support
IC 0.91182 0.91000 0.91091 1000
capacitor 0.94460 0.98900 0.96629 1000
diode 0.96464 0.87300 0.91654 1000
inductor 0.94124 0.94500 0.94311 1000
resistor 0.94038 0.97800 0.95882 1000
transistor 0.90358 0.90900 0.90628 1000
Weighted avg 0.93438 0.93400 0.93366 6000


Utilization of CPU and DPU cores of ResNet152 model for 1, 2, and 4 threads
DPU latency of ResNet152 model for 1, 2, and 4 threads


InceptionV4

Train and validation accuracy trend over 1000 training epochs for InceptionV4 model
Train and validation loss trend over 1000 training epochs for InceptionV4 model


lorem ipsum lorem ipsum lorem ipsum


Confusion matrix of InceptionV4 model on host machine before quantization
Host machine, classification report
Class Precision Recall F1-score Support
IC 0.94524 0.86300 0.90225 1000
capacitor 0.98051 0.95600 0.96810 1000
diode 0.88384 0.87500 0.87940 1000
inductor 0.95575 0.97200 0.96381 1000
resistor 0.96847 0.98300 0.97568 1000
transistor 0.83670 0.91200 0.87273 1000
Weighted avg 0.92842 0.92683 0.92699 6000


lorem ipsum lorem ipsum lorem ipsum


Confusion matrix of InceptionV4 model on target device after quantization
Target device, classification report
Class Precision Recall F1-score Support
IC 0.78158 0.89100 0.83271 1000
capacitor 0.99220 0.89000 0.93832 1000
diode 0.88553 0.82000 0.85151 1000
inductor 0.88973 0.94400 0.91606 1000
resistor 0.97319 0.98000 0.97658 1000
transistor 0.83282 0.80700 0.81971 1000
Weighted avg 0.89251 0.88867 0.88915 6000


Utilization of CPU and DPU cores of InceptionV4 model for 1, 2, and 4 threads
DPU latency of InceptionV4 model for 1, 2, and 4 threads


Inception ResNet V1

Train and validation accuracy trend over 1000 training epochs for Inception ResNet V1 model
Train and validation loss trend over 1000 training epochs for Inception ResNet V1 model


lorem ipsum lorem ipsum lorem ipsum


Confusion matrix of Inception ResNet V1 model on host machine before quantization
Host machine, classification report
Class Precision Recall F1-score Support
IC 0.98274 0.96800 0.97531 1000
capacitor 0.97571 0.96400 0.96982 1000
diode 0.94889 0.98400 0.96613 1000
inductor 0.98085 0.97300 0.97691 1000
resistor 0.98211 0.98800 0.98504 1000
transistor 0.97278 0.96500 0.96888 1000
Weighted avg 0.97385 0.97367 0.97368 6000


lorem ipsum lorem ipsum lorem ipsum


Confusion matrix of Inception ResNet V1 model on target device after quantization
Target device, classification report
Class Precision Recall F1-score Support
IC 0.84127 0.95400 0.89410 1000
capacitor 0.99787 0.93600 0.96594 1000
diode 0.94346 0.90100 0.92174 1000
inductor 0.95275 0.98800 0.97005 1000
resistor 0.94852 0.99500 0.97121 1000
transistor 0.93348 0.82800 0.87758 1000
Weighted avg 0.93622 0.93367 0.93344 6000


Utilization of CPU and DPU cores of Inception ResNet V1 model for 1, 2, and 4 threads
DPU latency of Inception ResNet V1 model for 1, 2, and 4 threads


Inception ResNet V2

Train and validation accuracy trend over 1000 training epochs for Inception ResNet V2 model
Train and validation loss trend over 1000 training epochs for Inception ResNet V2 model


lorem ipsum lorem ipsum lorem ipsum


Confusion matrix of Inception ResNet V2 model on host machine before quantization
Host machine, classification report
Class Precision Recall F1-score Support
IC 0.97872 0.96600 0.97232 1000
capacitor 0.99177 0.96400 0.97769 1000
diode 0.98963 0.95400 0.97149 1000
inductor 0.97931 0.99400 0.98660 1000
resistor 0.98213 0.98900 0.98555 1000
transistor 0.93365 0.98500 0.95864 1000
Weighted avg 0.97587 0.97533 0.97538 6000


lorem ipsum lorem ipsum lorem ipsum


Confusion matrix of Inception ResNet V2 model on target device after quantization
Target device, classification report
Class Precision Recall F1-score Support
IC 0.91735 0.89900 0.90808 1000
capacitor 0.99466 0.93200 0.96231 1000
diode 0.98793 0.90000 0.94192 1000
inductor 0.92066 0.99800 0.95777 1000
resistor 0.96970 0.99200 0.98072 1000
transistor 0.87887 0.93600 0.90654 1000
Weighted avg 0.94486 0.94283 0.94289 6000


Utilization of CPU and DPU cores of Inception ResNet V2 model for 1, 2, and 4 threads
DPU latency of Inception ResNet V2 model for 1, 2, and 4 threads


Comparison

Utilization of CPU and DPU cores of Inception ResNet V2 model for 1, 2, and 4 threads
DPU latency of Inception ResNet V2 model for 1, 2, and 4 threads
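
For a quick side-by-side view, the weighted-average F1-scores reported in the per-model sections above can be collected into a small table, for example with pandas:

<pre>
import pandas as pd

# Weighted-average F1-scores taken from the classification reports above
# (host = floating-point model on the host machine, target = quantized model on the DPU).
f1 = pd.DataFrame(
    {
        "host":   [0.94865, 0.97112, 0.96487, 0.92699, 0.97368, 0.97538],
        "target": [0.93290, 0.93917, 0.93366, 0.88915, 0.93344, 0.94289],
    },
    index=["ResNet50", "ResNet101", "ResNet152",
           "InceptionV4", "Inception ResNet V1", "Inception ResNet V2"],
)
f1["drop"] = f1["host"] - f1["target"]
print(f1.sort_values("target", ascending=False))
</pre>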

Useful links