The dataset consists of 9,912 images of 31 PCB samples and contains a total of 77,347 labeled components distributed across six classes: ''IC'', ''capacitor'', ''diode'', ''inductor'', ''resistor'', and ''transistor''. The images were collected using two types of image sensor: a digital microscope and a Digital Single-Lens Reflex (DSLR) camera. To ensure that the dataset also includes samples representing variations in illumination, the authors collected images at three different intensities of the microscope's built-in ring light, i.e. 20, 40, and 60, where 60 is the brightest. In addition, variations in scale were included by using three different magnifications, i.e. 1×, 1.5×, and 2×.
 
[[File:FICS-PCB samples.png|center|thumb|500x500px|FICS-PCB dataset, examples of six types of components]]
 
 
A quick look at the figure below shows that this dataset is highly unbalanced: the vast majority of samples belong to just two classes, ''capacitor'' and ''resistor''. This is no surprise, as these two component types are mounted on a PCB far more commonly than the others. Unfortunately, using the dataset as it is would not be a good idea: the models would be trained on image batches composed mainly of the most common components and would therefore learn only a restricted set of features. As a consequence, the models would probably be very good at classifying capacitors and resistors and rather poor at classifying the remaining classes. Therefore, the under-represented classes must be expanded with image augmentation.
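As a minimal sketch of how the minority classes could be augmented (the transform parameters and the helper function below are illustrative assumptions, not the actual pipeline used for this work):

<syntaxhighlight lang="python">
import tensorflow as tf

# Random transforms that loosely mimic the acquisition variations
# described above: flips/rotations for orientation, zoom for the
# 1x/1.5x/2x magnification spread, brightness for the 20/40/60
# ring-light intensities. All factors are illustrative.
augment = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal_and_vertical"),
    tf.keras.layers.RandomRotation(0.1),
    tf.keras.layers.RandomZoom(0.1),
    tf.keras.layers.RandomBrightness(0.2),
])

def oversample(images, copies):
    """Generate `copies` augmented variants of a minority-class batch."""
    extra = [augment(images, training=True) for _ in range(copies)]
    return tf.concat(extra, axis=0)
</syntaxhighlight>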
 
Before proceeding further, please note that the DSLR subset contains far fewer examples than the Microscope subset.
As the two subsets were acquired with two different kinds of instrument, their characteristics (the resolution, for example) differ significantly. To work with images that are homogeneous in these characteristics, it is preferable to keep only one subset, specifically the most numerous one.
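A minimal sketch of this filtering step, assuming a metadata table that records the acquiring sensor for each image (the file name and column names are hypothetical):

<syntaxhighlight lang="python">
import pandas as pd

# Hypothetical annotations file; the actual FICS-PCB metadata layout
# may differ.
meta = pd.read_csv("fics_pcb_annotations.csv")

# Keep only the Microscope subset, by far the most numerous one.
microscope = meta[meta["sensor"] == "Microscope"]
print(f"kept {len(microscope)} of {len(meta)} images")
</syntaxhighlight>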
===ResNet50===
Considering first the accuracy of the models before quantization, the ones with the highest capability of correctly classifying the test samples are, in descending order, Inception ResNet V2, Inception ResNet V1, and ResNet101; these three models show an accuracy above 97%. In contrast, the models displaying two of the lowest accuracy values are ResNet50 and Inception V4. After quantization the picture changes radically: ResNet101 moves to the top of the list, followed by ResNet50, while Inception ResNet V1 and Inception ResNet V2 drop to the bottom, with an accuracy loss of 6.65% for the former and 5.55% for the latter. Moreover, the worst model among those analyzed is Inception V4, with an accuracy below 90%.
 
[[File:Pre and post quantization accuracy.png|center|thumb|500x500px|Models pre and post quantization accuracy with vai_q_tensorflow tool]]
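For reference, the vai_q_tensorflow quantizer is driven by a user-supplied Python calibration function. The following is a minimal sketch of such a function; the node names, input shape, batch size, and paths are assumptions, not taken from the models above:

<syntaxhighlight lang="python">
# calib_input_fn.py -- sketch of a vai_q_tensorflow calibration input
# function. The tool calls it once per calibration iteration and feeds
# the returned dict (input node name -> batch) to the frozen graph.
import numpy as np

CALIB_BATCH = 32

def calib_input(iter):
    # A real run would load and preprocess CALIB_BATCH images from the
    # training set here; random data only keeps the sketch self-contained.
    images = np.random.rand(CALIB_BATCH, 224, 224, 3).astype(np.float32)
    return {"input_1": images}

# Invoked, for example, as:
#   vai_q_tensorflow quantize \
#       --input_frozen_graph resnet50_frozen.pb \
#       --input_nodes input_1 --input_shapes ?,224,224,3 \
#       --output_nodes predictions/Softmax \
#       --input_fn calib_input_fn.calib_input --calib_iter 100
</syntaxhighlight>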
*'''Parameters size''': the amount of memory, expressed in MB, kB, or bytes, occupied by the DPU Kernel, including weights and biases. As expected, the greater the number of parameters of the model implemented on the host, the greater the amount of memory occupied on the target device (see the sketch after this list).
*'''Total tensor count''': the total number of DPU tensors for a DPU Kernel. This value depends on the number of layers stacked between the input and output layers of the model: the more stacked layers, the higher the tensor count, and the more complex the computation on the DPU. This directly increases the time required for a single inference on a single image.
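As a back-of-the-envelope illustration of the first metric (the layer below is purely hypothetical; after quantization the DPU stores each parameter as INT8, i.e. one byte, versus four bytes for the original float32 weights on the host):

<syntaxhighlight lang="python">
def conv_params(k, c_in, c_out, bias=True):
    """Parameter count of a k x k convolution layer."""
    return k * k * c_in * c_out + (c_out if bias else 0)

# Illustrative layer: 3x3 convolution, 256 -> 512 channels.
n = conv_params(3, 256, 512)
print(f"{n} parameters -> {n / 2**20:.2f} MB as int8, "
      f"{4 * n / 2**20:.2f} MB as float32")
</syntaxhighlight>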
 
Comparing parameters size, tensor count, and measured latency across the deployed models, two trends emerge:
*with the same ''total tensor count'', the latency increases along with the DPU Kernel parameters size;
*with the same DPU Kernel ''parameters size'', the latency decreases as the total tensor count decreases.
 
These considerations suggest that the best models among the implemented ones are ResNet50, ResNet101, and Inception ResNet V1.
 
Finally, it is possible to evaluate the DPU throughput as a function of the number of threads used by the benchmark application. As the figure below shows, all the models achieve similar FPS values with 1 thread, but as the level of concurrency increases the differences become more and more evident.
 
[[File:DPU throughput for 1-2-4 threads.png|center|thumb|500x500px|Deployed models DPU throughput for 1, 2, and 4 threads]]
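A minimal sketch of how such a multi-threaded throughput measurement can be structured; the <code>dpu_infer</code> callable stands in for the actual DPU runner, which is not shown here:

<syntaxhighlight lang="python">
import threading
import time

def benchmark(dpu_infer, images, n_threads):
    """Aggregate FPS when n_threads workers share the DPU."""
    done = [0] * n_threads

    def worker(tid):
        for img in images:
            dpu_infer(img)      # one inference on the DPU
            done[tid] += 1

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(n_threads)]
    t0 = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(done) / (time.perf_counter() - t0)

# for n in (1, 2, 4):
#     print(n, "threads:", benchmark(run_model, test_images, n), "FPS")
</syntaxhighlight>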