
DAVE Developer's Wiki β

After performing the quantization with the ''vai_q_tensorflow'' tool and after the deployment on the target device, the model achieves an overall '''''accuracy of 93.95%''''' and an overall weighted average '''''F1-score of 93.91%''''' on the test subset of the dataset. The model still performs very well on the ''capacitor'' class, keeping an F1-score above 96.00% (97.03% F1-score) but, on the other hand, for the remaining classes there is a substantial drop in the value of the metric. The classes that exhibit the worst results are ''diode'' (92.09% F1-score) and ''IC'' (92.06% F1-score), the latter due to a low recall (88.20% recall). In general, the performance of the model is still good, similar to the one obtained with the ResNet50 model.
After performing the quantization with the ''vai_q_tensorflow'' tool and after the deployment on the target device, the model achieves an overall '''''accuracy of 93.40%''''' and an overall weighted average '''''F1-score of 93.36%''''' on the test subset of the dataset. The model still performs very well on the ''capacitor'' class, keeping an F1-score above 96.00% (96.62% F1-score) but, on the other hand, for the remaining classes there is a substantial drop in the value of the metric. The classes that exhibit the worst results are ''diode'' (91.65% F1-score), because its recall is very low (87.30% recall), ''IC'' (91.09% F1-score), with low precision and recall (91.18% precision, 91.00% recall), and ''transistor'' (90.62% F1-score), also with low precision and recall (90.35% precision, 90.62% recall). In general, the performance of the model is still good, similar to that obtained with the two previous models, especially the ResNet101 model.
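The accuracy and weighted-average F1-scores quoted above can be reproduced from the predicted and true labels of the test subset. The following is a minimal sketch of that computation; the class names and label lists in it are illustrative placeholders, not the actual dataset:

```python
from collections import Counter

def f1_report(y_true, y_pred):
    """Per-class (precision, recall, F1), overall accuracy, and
    support-weighted average F1, computed from two label lists."""
    classes = sorted(set(y_true))
    support = Counter(y_true)
    per_class = {}
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        per_class[c] = (prec, rec, f1)
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
    # Weighted average: each class's F1 weighted by its support
    # (number of true samples), as in the figures reported above.
    weighted_f1 = sum(per_class[c][2] * support[c] for c in classes) / len(y_true)
    return accuracy, weighted_f1, per_class
```

This is equivalent to scikit-learn's classification report with `average='weighted'`, kept dependency-free here for clarity.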
To perform the inference over the images, only one DPU core is used with 1 thread, leading to almost 80% utilization of the DPU-01 core. By increasing the number of threads (e.g., to 4 threads), more cores are used and the utilization gets higher, very close to 100% on both the DPU-00 and DPU-01 cores. Concerning the DPU latency, with 1 thread the average latency for one image is about 28ms (28867.86μs). By increasing the concurrency, the latency for both cores is higher: about 33ms (32702.59μs) for the DPU-00 core and 30ms (30046.64μs) for the DPU-01 core when using 2 threads, and about 34ms (33826.30μs) for the DPU-00 core and 30ms (30834.46μs) for the DPU-01 core with 4 concurrent threads.
|}
|}
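From the average per-image DPU latencies quoted above, a rough aggregate throughput can be estimated, assuming each active core processes images independently and in parallel (real throughput also depends on pre/post-processing and scheduling overhead, so this is only a back-of-envelope sketch):

```python
def throughput_fps(latencies_us):
    """Rough total frame rate: sum of per-core rates, one latency
    value (in microseconds) per active DPU core."""
    return sum(1e6 / lat for lat in latencies_us)

# 1 thread: only DPU-01 active, ~28.87ms per image -> roughly 34.6 img/s
single = throughput_fps([28867.86])

# 4 threads: both cores active (~33.83ms and ~30.83ms per image)
# -> roughly 62 img/s in total
dual = throughput_fps([33826.30, 30834.46])
```

Under this simplification, going from 1 thread to 4 threads nearly doubles the aggregate throughput, even though the per-image latency of each core increases slightly.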
 
 
To perform the inference over the images, only one DPU core is used with 1 thread, leading to almost 70% utilization of the DPU-01 core. By increasing the number of threads (e.g., to 4 threads), more cores are used and the utilization gets higher, very close to 100% on both the DPU-00 and DPU-01 cores. Concerning the DPU latency, with 1 thread the average latency for one image is about 30ms (30127.38μs). By increasing the concurrency, the latency for both cores is higher: about 34ms (34105.45μs) for the DPU-00 core and 31ms (30981.59μs) for the DPU-01 core when using 2 threads, and about 35ms (35273.61μs) for the DPU-00 core and 31ms (31761.21μs) for the DPU-01 core with 4 concurrent threads.
|}
|}
 
 
To perform the inference over the images, only one DPU core is used with 1 thread, leading to almost 60% utilization of the DPU-01 core. By increasing the number of threads (e.g., to 4 threads), more cores are used and the utilization gets higher, very close to 100% on the DPU-00 core and to 90% on the DPU-01 core. Concerning the DPU latency, with 1 thread the average latency for one image is about 18ms (17651.31μs). By increasing the concurrency, the latency for both cores is higher: about 21ms (20511.79μs) for the DPU-00 core and 18ms (18466.97μs) for the DPU-01 core when using 2 threads, and about 22ms (21654.99μs) for the DPU-00 core and 20ms (19503.17μs) for the DPU-01 core with 4 concurrent threads.
|}
|}
 
 
To perform the inference over the images, only one DPU core is used with 1 thread, leading to almost 65% utilization of the DPU-01 core. By increasing the number of threads (e.g., to 4 threads), more cores are used and the utilization gets higher, very close to 100% on the DPU-00 core and to 95% on the DPU-01 core. Concerning the DPU latency, with 1 thread the average latency for one image is about 25ms (25185.03μs). By increasing the concurrency, the latency for both cores is higher: about 29ms (28858.88μs) for the DPU-00 core and 26ms (26336.11μs) for the DPU-01 core when using 2 threads, and about 30ms (30229.27μs) for the DPU-00 core and 27ms (27452.70μs) for the DPU-01 core with 4 concurrent threads.