===ResNet50===
During the training phase, the model shows an increasing accuracy trend on both the training and the validation subsets. This indicates that the model is learning correctly and is not underfitting the training data. Furthermore, the trend of the loss over the 1000 training epochs shows that the model is not overfitting the training data either. By saving a checkpoint of the model each time the validation loss improves, the best result is found at '''''epoch 993''''' with an '''''accuracy of 93.59%''''' and a '''''loss of 0.1912''''' on the validation data.
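As a reference, the checkpointing scheme described above can be implemented with a standard Keras callback. The following is a minimal sketch, not the actual training script: the model construction, the six component classes, and the random data used in place of the real dataset are assumptions for illustration.

<pre>
import tensorflow as tf

# Placeholder model and data: stand-ins for the real pipeline, which is
# not shown here. Six classes are assumed (resistor, inductor, capacitor,
# diode, transistor, IC).
model = tf.keras.applications.ResNet50(weights=None, classes=6,
                                       input_shape=(224, 224, 3))
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
x = tf.random.uniform((8, 224, 224, 3))
y = tf.random.uniform((8,), maxval=6, dtype=tf.int32)

# Save a checkpoint only when the validation loss improves, so the file
# on disk always holds the best epoch seen so far.
checkpoint_cb = tf.keras.callbacks.ModelCheckpoint(
    filepath="resnet50_best.h5",
    monitor="val_loss",
    mode="min",
    save_best_only=True,
    verbose=1,
)
model.fit(x, y, validation_split=0.25, epochs=1000,
          callbacks=[checkpoint_cb])
</pre>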
 
Before quantization with the ''vai_q_tensorflow'' tool, the model achieves an overall '''''accuracy of 94.85%''''' and a weighted average '''''F1-score of 94.86%''''' on the test subset of the dataset, showing good generalization on unseen samples. The classes with the highest F1-score, above 96.00%, are ''resistor'' (98.08% F1-score), ''inductor'' (97.10% F1-score), and ''capacitor'' (96.88% F1-score). Conversely, the class on which the model performs worst is ''diode'' (91.75% F1-score), mainly because of its low precision (88.55%).
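For reference, the per-class precision, recall, and F1-score, together with the weighted averages reported here, can be computed from the test predictions. A minimal sketch with scikit-learn, using dummy label arrays in place of the real test set:

<pre>
from sklearn.metrics import classification_report

# Dummy stand-ins for the real test labels and model predictions.
class_names = ["resistor", "inductor", "capacitor",
               "diode", "transistor", "IC"]
y_true = [0, 1, 2, 3, 4, 5, 0, 3]
y_pred = [0, 1, 2, 3, 4, 5, 0, 4]

# Prints per-class precision/recall/F1 plus the weighted average.
print(classification_report(y_true, y_pred, target_names=class_names))
</pre>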
After quantization with the ''vai_q_tensorflow'' tool and deployment on the target device, the model achieves an overall '''''accuracy of 93.27%''''' and a weighted average '''''F1-score of 93.29%''''' on the test subset. The model still performs well on the ''resistor'' (98.08% F1-score), ''inductor'' (97.10% F1-score), and ''capacitor'' (96.88% F1-score) classes. However, it shows the worst results on the ''transistor'' class (89.78% F1-score), with both precision and recall below 90.00% (89.96% and 89.60% respectively), and on the ''diode'' class (88.59% F1-score), whose precision is very low (83.77%).
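The ''vai_q_tensorflow'' quantizer calibrates the fixed-point model by repeatedly calling a user-supplied Python input function, passed on the command line via <code>--input_fn</code>. A minimal sketch of such a function is shown below; the input node name, the image size, and the random data are assumptions for illustration:

<pre>
import numpy as np

CALIB_BATCH_SIZE = 32

def calib_input(iter):
    # Called by vai_q_tensorflow once per calibration iteration; it must
    # return a dict mapping input-node names to batches of preprocessed
    # images. A real implementation would load images from the
    # calibration subset; random data is used here as a placeholder.
    images = np.random.rand(CALIB_BATCH_SIZE, 224, 224, 3).astype(np.float32)
    return {"input": images}
</pre>

The quantizer is then invoked from the command line with the frozen graph, the input/output node names, and the number of calibration iterations, as described in the Vitis AI user guide.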
To perform inference over the images with 1 thread, only one DPU core is used, leading to almost 55% utilization of the DPU-01 core. Increasing the number of threads, e.g. to 4, engages more cores and raises the utilization, very close to 100% on the DPU-00 core and close to 90% on the DPU-01 core. Concerning DPU latency, with 1 thread the average latency per image is about 12 ms (11526.41 μs). Increasing the concurrency raises the latency on both cores: about 13 ms (13318.01 μs) on DPU-00 and 12 ms (12019.21 μs) on DPU-01 with 2 threads, and about 14 ms (14200.19 μs) on DPU-00 and 13 ms (12776.24 μs) on DPU-01 with 4 concurrent threads.
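A multi-threaded measurement of this kind can be sketched with the VART Python API, along the lines of the Vitis AI runtime examples. The model file name, the thread count, and the input dtype below are assumptions, and the loop measures per-thread latency rather than the DPU-internal figure reported by the profiler:

<pre>
import threading
import time
import numpy as np
import vart
import xir

NUM_THREADS = 4   # assumed; vary to reproduce the 1/2/4-thread cases
NUM_IMAGES = 100

# Load the compiled model and pick the DPU subgraph.
graph = xir.Graph.deserialize("resnet50.xmodel")  # illustrative file name
dpu_subgraph = [
    s for s in graph.get_root_subgraph().toposort_child_subgraph()
    if s.has_attr("device") and s.get_attr("device").upper() == "DPU"
][0]

def worker():
    # Each thread owns a runner; the runtime dispatches jobs to the
    # available DPU cores.
    runner = vart.Runner.create_runner(dpu_subgraph, "run")
    in_t = runner.get_input_tensors()[0]
    out_t = runner.get_output_tensors()[0]
    # The expected dtype may be int8 for a quantized model, depending on
    # the Vitis AI version.
    in_data = np.zeros(tuple(in_t.dims), dtype=np.float32)
    out_data = np.zeros(tuple(out_t.dims), dtype=np.float32)
    start = time.time()
    for _ in range(NUM_IMAGES):
        job_id = runner.execute_async([in_data], [out_data])
        runner.wait(job_id)
    avg_us = (time.time() - start) / NUM_IMAGES * 1e6
    print(f"average per-image latency: {avg_us:.2f} us")

threads = [threading.Thread(target=worker) for _ in range(NUM_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
</pre>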
===ResNet101===
During the training phase, the model shows an increasing accuracy trend on both the training and the validation subsets. This indicates that the model is learning correctly and is not underfitting the training data. Furthermore, the trend of the loss over the 1000 training epochs shows that the model is not overfitting the training data either. By saving a checkpoint of the model each time the validation loss improves, the best result is found at '''''epoch 944''''' with an '''''accuracy of 98.12%''''' and a '''''loss of 0.0781''''' on the validation data.
Before quantization with the ''vai_q_tensorflow'' tool, the model achieves an overall '''''accuracy of 97.10%''''' and a weighted average '''''F1-score of 97.11%''''' on the test subset of the dataset, showing very high generalization on unseen samples. All classes have an F1-score above 96.00%, with particularly high values for the ''resistor'' (98.65% F1-score) and ''inductor'' (98.50% F1-score) classes; the only exception is the ''diode'' class (95.40% F1-score), mainly because of its low recall (94.40%).
After quantization with the ''vai_q_tensorflow'' tool and deployment on the target device, the model achieves an overall '''''accuracy of 93.95%''''' and a weighted average '''''F1-score of 93.91%''''' on the test subset. The model still performs very well on the ''capacitor'' class, keeping an F1-score above 96.00% (97.03% F1-score), but the remaining classes show a substantial drop in the metric. The classes exhibiting the worst results are ''diode'' (92.09% F1-score) and ''IC'' (92.06% F1-score), the latter due to its low recall (88.20%). Overall, the performance of the model is still good, similar to that obtained with the ResNet50 model.
 
 
To perform inference over the images with 1 thread, only one DPU core is used, leading to almost 70% utilization of the DPU-01 core. Increasing the number of threads, e.g. to 4, engages more cores and raises the utilization, very close to 100% on the DPU-00 core and close to 95% on the DPU-01 core. Concerning DPU latency, with 1 thread the average latency per image is about 21 ms (21339.73 μs). Increasing the concurrency raises the latency on both cores: about 24 ms (24313.61 μs) on DPU-00 and 22 ms (22231.22 μs) on DPU-01 with 2 threads, and about 25 ms (25385.51 μs) on DPU-00 and 23 ms (23025.89 μs) on DPU-01 with 4 concurrent threads.
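Under the idealized assumption that the two cores work fully in parallel, the per-core latencies above translate into a rough aggregate throughput estimate:

<pre>
# Rough throughput estimate from the ResNet101 2-thread latencies above,
# assuming the two DPU cores run fully in parallel (an idealization).
lat_s = [24313.61e-6, 22231.22e-6]   # DPU-00, DPU-01, in seconds
fps = sum(1.0 / t for t in lat_s)
print(f"~{fps:.0f} images/s")        # roughly 86 images/s
</pre>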
===ResNet152===
During the training phase, the model shows an increasing accuracy trend on both the training and the validation subsets. This indicates that the model is learning correctly and is not underfitting the training data. Furthermore, the trend of the loss over the 1000 training epochs shows that the model is not overfitting the training data either. By saving a checkpoint of the model each time the validation loss improves, the best result is found at '''''epoch 969''''' with an '''''accuracy of 97.66%''''' and a '''''loss of 0.0721''''' on the validation data.
Before quantization with the ''vai_q_tensorflow'' tool, the model achieves an overall '''''accuracy of 96.46%''''' and a weighted average '''''F1-score of 96.48%''''' on the test subset of the dataset, showing good generalization on unseen samples. The classes with the highest F1-score, above 96.00%, are ''resistor'' (98.58% F1-score), ''inductor'' (98.03% F1-score), and ''capacitor'' (96.99% F1-score), a result quite similar to the ResNet50 model. The worst performance is displayed by the ''transistor'' class, with an F1-score of "only" about 94.00% (94.18% F1-score), mainly due to its low precision (92.89%).
After quantization with the ''vai_q_tensorflow'' tool and deployment on the target device, the model achieves an overall '''''accuracy of 93.40%''''' and a weighted average '''''F1-score of 93.36%''''' on the test subset of the dataset.
 
 
To perform inference over the images with 1 thread, only one DPU core is used, leading to almost 80% utilization of the DPU-01 core. Increasing the number of threads, e.g. to 4, engages more cores and raises the utilization, very close to 100% on both the DPU-00 and DPU-01 cores. Concerning DPU latency, with 1 thread the average latency per image is about 28 ms (28867.86 μs). Increasing the concurrency raises the latency on both cores: about 33 ms (32702.59 μs) on DPU-00 and 30 ms (30046.64 μs) on DPU-01 with 2 threads, and about 34 ms (33826.30 μs) on DPU-00 and 30 ms (30834.46 μs) on DPU-01 with 4 concurrent threads.
===InceptionV4===
During the training phase, the model shows an increasing accuracy trend on both the training and the validation subsets. This indicates that the model is learning correctly and is not underfitting the training data. Furthermore, the trend of the loss over the 1000 training epochs shows that the model is not overfitting the training data either. By saving a checkpoint of the model each time the validation loss improves, the best result is found at '''''epoch 957''''' with an '''''accuracy of 95.00%''''' and a '''''loss of 0.1729''''' on the validation data.
===Inception ResNet V1===
During the training phase, the model shows an increasing accuracy trend on both the training and the validation subsets. This indicates that the model is learning correctly and is not underfitting the training data. Furthermore, the trend of the loss over the 1000 training epochs shows that the model is not overfitting the training data either. By saving a checkpoint of the model each time the validation loss improves, the best result is found at '''''epoch 959''''' with an '''''accuracy of 97.97%''''' and a '''''loss of 0.0751''''' on the validation data.
===Inception ResNet V2===
During the training phase, the model shows an increasing accuracy trend on both the training and the validation subsets. This indicates that the model is learning correctly and is not underfitting the training data. Furthermore, the trend of the loss over the 1000 training epochs shows that the model is not overfitting the training data either. By saving a checkpoint of the model each time the validation loss improves, the best result is found at '''''epoch 974''''' with an '''''accuracy of 97.50%''''' and a '''''loss of 0.0724''''' on the validation data.