ML-TN-003 — AI at the edge: visual inspection of assembled PCBs for defect detection — Part 3

From DAVE Developer's Wiki
Applies to: Machine Learning


History

Version Date Notes
1.0.0 June 2021 First public release

Introduction

This Technical Note (TN for short) belongs to the series of articles dealing with the automatic visual inspection of Printed Circuit Boards (PCB-AVI). Specifically, it describes an attempt to use ML and image processing techniques to build a proprietary dataset of SMD-populated PCBs exhibiting mounting anomalies.

Detecting anomalies on an assembled PCB is a real-world use case affected by the typical problem ML engineers and data scientists face when it comes to spotting defects related to an industrial process: they are extremely rare! As a result, large datasets of samples are generally not available to train the models, and several tools have to be employed to augment the available data. This is challenging, as described in the following sections.

Building the dataset

Generating real defects

Of course, defect detection on assembled PCBs is strictly related to the core business of DAVE Embedded Systems. As a manufacturer of electronic boards, the company has full control over the PCB assembly process. For the case under consideration, this was a great help because it allowed the creation of an initial dataset of defective boards in a controlled fashion. As detailed in the rest of the section, DAVE Embedded Systems' assembly line was configured to deliberately generate a relatively high rate of defects on a limited number of boards set aside for this purpose.

Specifically, 5 panels were prepared, each containing 4 PCBs of the same type built from the same template project, i.e. the same bill of materials (BOM), for the DIVA SoM. Components were mounted only on the top side of the PCB. Furthermore, to reduce complexity and simplify the problem for this test, anomaly generation was restricted to two-terminal SMD passive components, i.e. resistors, capacitors, and inductors. The table below reports all plausible anomalies, which can be generated in one of three ways: by editing the standard template project containing the positions of all components for the pick-and-place (P&P) machine, by decreasing or increasing the quantity of solder deposited by the screen printer on the panel, or directly by the operator.

Anomalies, generation process and numerosity
Anomaly                              P&P   Screen printer   Manual   Numerosity
Missing                                                              21
Manhattan                                                            4
Shift x-axis                                                         51
Shift y-axis                                                         58
Shift&Rotation (shift x+z-axes)                                      57
Rotation (z-axis)                                                    55
Under soldering                                                      All
Over soldering                                                       All

A brief description of each anomaly follows:

  • Missing: the component is not in place according to the PCB design and indeed is not mounted on the board.
  • Manhattan: the component is in place, but it is erected horizontally (the operation has to be performed manually by the operator).
  • Shift x-axis: the component is in place according to the PCB design, but it shows a small shift along its x-axis.
  • Shift y-axis: the component is in place according to the PCB design, but it shows a small shift along its y-axis.
  • Shift&Rotation (x+z-axes): the component is in place according to the PCB design, but it is shifted along its x-axis and rotated about its z-axis.
  • Rotation (z-axis): the component is in place according to the PCB design, but it is rotated about its z-axis.
  • Under soldering: the amount of deposited solder is less than normal.
  • Over soldering: the amount of deposited solder is higher than normal.

The same template for the P&P machine is used for all the 5 panels, but some of them use a different configuration for the program of the screen printer:

  • 2 panels use the normal amount of solder paste;
  • 1 panel uses more than the normal amount of solder. In this case solder is re-applied several times;
  • 1 panel uses less than the normal amount of solder;
  • 1 panel uses the normal amount of solder, but shifts along x and y axes are applied too.

After completing the assembly of all 5 panels, a visual inspection was performed with a traditional AOI machine (*). A total of 832 anomalies was found, and the corresponding unmarked images were saved for building the dataset.

(*) This machine makes use of an RGB color lighting technique to emphasize anomalies of solder joints.

Class subdivision and labelling

In order to build a dataset for training a ML model for a classification application, a set of classes has to be defined. After considering the standard IPC-A-610E-2010 developed by IPC for the Acceptability of Electronic Assemblies and analyzing the features of collected images, the following classes were identified:

  • Acceptable: the AOI machine flags the component as a possible anomaly because its image does not meet the color constraints specified by the machine's software, but there is actually no defect. Generally, this occurs when the component is correctly soldered on both pads, but the amount of red and green is too high with respect to the amount of blue. In this case, quality is below target but still acceptable.
  • Missing: the component is not in place, hence only the pads with applied solder are visible.
  • Tombstoning: the component is lifted from a pad of the PCB; this class also includes all the cases in which the component is lifted and rotated by a certain amount.
  • Under soldering: the component is soldered on both pads, but the amount of solder is too low. In the picture of the reported component, this is clearly visible when one or both pads show more red than blue.

To simplify the problem, all the classes into which the images are divided are mutually exclusive. The Manhattan and over soldering defect typologies are not included among the classes because they are too difficult to generate, so the number of examples obtained is too low for training a model.

Examples of four types of defects

The images were labeled with the makesense.ai tool.

Total number of samples for each defect category
Acceptable Missing Tombstoning Under soldering
17 408 376 24

It is evident that this dataset is unbalanced because it has a relatively low number of acceptable and under soldering defect images. Nevertheless, taking into account the overall results achieved, this first attempt can be considered a successful one.
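The imbalance can be quantified directly from the counts in the table above. The following is a minimal sketch: the class shares follow from the reported numbers, while the inverse-frequency weights are only an illustration of one common way to compensate for imbalance during training. The article does not state that such weights were used, and the dictionary keys are names chosen for this example.

```python
# Sample counts per class, taken from the table above.
counts = {"acceptable": 17, "missing": 408, "tombstoning": 376, "under_soldering": 24}

total = sum(counts.values())
n_classes = len(counts)

# Share of the dataset held by each class.
shares = {name: n / total for name, n in counts.items()}

# "Balanced" inverse-frequency weights: total / (n_classes * count).
# Illustrative assumption only, not a scheme described in the article.
class_weights = {name: total / (n_classes * n) for name, n in counts.items()}

for name in counts:
    print(f"{name:16s} share={shares[name]:6.1%}  weight={class_weights[name]:6.2f}")
```

With these counts, the two minority classes together account for about 5% of the labeled samples.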

Soldering regions extraction

By analyzing the acquired images, it is interesting to note that most defect features are concentrated in the solder regions of the component. In particular, two defects belonging to any two different classes can be easily distinguished by looking at the two soldering regions only. This means that both regions can potentially be used for training an ML model for a classification problem, while all the other parts of the image can be discarded without losing information or accuracy.

To exploit this observation, a methodology was developed for extracting the soldering regions from all the collected images. The algorithm is designed to find the correct position of a rectangular window, i.e. a region of interest (ROI), by employing an adaptive approach that uses OpenCV image processing functions. Note that this approach can be used for all the different classes of defects.

Original missing sample, full typology
upper side ROI solder region
lower side ROI solder region

Data augmentation with image synthesis

In the previous articles, when facing a similar problem of lack of data, data augmentation was employed with ease by simply creating input pipelines with image re-mapping operations, or by using OpenCV functions to increase the amount of data stored in memory, more or less in the same fashion as TensorFlow image APIs.

However, although this is a good and very common way to increase model robustness for the classification task, each new image is generated directly from a single existing image. Thus, newly created images are always based on features that already exist in that image, such as shapes, edges, contours, etc. For this reason, it is worth exploring the possibility of generating new data based on the features of the entire original subset. This can be achieved through ML techniques applied to artificial image synthesis, in particular with the help of Generative Adversarial Networks (GANs).

Generative Adversarial Networks

GANs are a particular type of generative model used for unsupervised learning, which attempt to synthesize new data that is indistinguishable from the training data i.e. with the same distribution function of original data.

A GAN uses two neural networks (NNs), specifically two CNNs in the most recent approaches, which are locked in a competitive game:

  • A generator, which is fed a vector of random numbers i.e. the latent vector and outputs synthesized data i.e. the generated images.
  • A discriminator, which is fed a batch of data, in this case a batch of images, and outputs a prediction of whether it comes from the training set or from the generated set, basically learning a binary classification problem. In other words, the generator creates fake data and the discriminator attempts to distinguish these fake samples from the real ones.
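The competition between the two networks is usually written as a minimax game. The expression below is the classic formulation from the GAN literature and is given only as a reference, since the article does not state which loss function its implementation uses. Here G is the generator, D the discriminator, x a real image drawn from the training data, and z the latent vector.

```latex
\min_{G}\,\max_{D}\; V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_{z}(z)}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

The discriminator tries to maximize V by assigning high scores to real images and low scores to generated ones, while the generator tries to minimize it by producing images that the discriminator scores as real.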

It must be noted that GANs are quite complex in practice and that training them can be very challenging, which makes generating high-resolution, high-quality images from scratch a non-trivial problem. This severely limits the usefulness and applicability of classic GAN architectures for many kinds of practical applications. Fortunately, this issue can be addressed by employing a particular type of network developed by NVIDIA, named proGAN, which is characterized by a progressively growing architecture.

Progressive GAN implementation

When using proGANs to synthesize images at high resolutions, the layers of the generator and discriminator are not all trained at once to generate samples at the target resolution, as is usually done. Instead, the networks are initially created with a single block containing only a few layers. Then, they are progressively grown to output higher resolution versions of the images by adding more blocks, one at a time, after completing the training of the previous one, as illustrated in the figure below.

ProGAN: training progression

This approach leads to a series of advantages:

  • The incremental learning process greatly stabilizes training and reduces the chance of mode collapse, since the networks gradually learn a much simpler piece of the overall problem.
  • The low-to-high resolution trend forces the progressively grown networks to focus on high-level structure first and fill in the details later, resulting in an improvement of the quality of the final images.
  • Increasing the network size gradually is more computationally efficient w.r.t. the classic approach of using all the layers from the start (fewer layers are faster to train because there are fewer parameters).

To implement a proGAN, one possible solution is to pre-define all models prior to training and exploit the TensorFlow Keras functional API, ensuring that layers are shared across the models. This approach requires defining, for each resolution of the discriminator and generator, two models: one named straight-through and the other named fade-in. The latter, as the name suggests, implements the fade-in mechanism and is used to transition from a lower resolution straight-through model to a higher one. The straight-through version has a plain architecture and its purpose is to fine-tune all the layers for a given resolution.

ProGAN: growing progression of the model during training
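The fade-in mechanism can be illustrated outside of any deep learning framework. The sketch below blends the upsampled output of the lower resolution path with the output of the newly added block through a weight alpha that grows from 0 to 1 during the fade-in stage. It is a NumPy illustration of the idea only: the function names, the nearest-neighbour upsampling and the linear ramp are assumptions made for this example, not details taken from the article's Keras implementation.

```python
import numpy as np

def upsample_nearest(img, factor=2):
    """Nearest-neighbour upsampling of an (H, W, C) array."""
    return img.repeat(factor, axis=0).repeat(factor, axis=1)

def fade_in(old_low_res, new_high_res, alpha):
    """Blend the upsampled old output with the new block's output.

    alpha = 0 -> only the old, already trained path contributes
    alpha = 1 -> only the newly added block contributes
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return (1.0 - alpha) * upsample_nearest(old_low_res) + alpha * new_high_res

def alpha_schedule(step, total_steps):
    """Linear ramp from 0 to 1 over the fade-in stage (an assumed schedule)."""
    return step / max(total_steps - 1, 1)

# Example: a 4 x 4 output being faded into an 8 x 8 output.
old = np.zeros((4, 4, 3))
new = np.ones((8, 8, 3))
blended = fade_in(old, new, alpha_schedule(step=5, total_steps=11))
print(blended.shape, blended.mean())
```

Once alpha reaches 1, the old path no longer contributes, which is the point at which training can switch to the straight-through model for that resolution.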

Training configuration and hyperparameters setup

A total of 6 proGAN models were built and trained. Each one was built with the required number of straight-through and fade-in model stages to generate several typologies of synthesized images at the desired target resolution. These images belong either to missing or tombstoning classes and are either full images (512 × 512 resolution) or upper / lower soldering region images (256 × 256 resolution). The training was executed mainly on cloud, initially with the free services provided by Google Colab and finally with AWS SageMaker.

Both discriminators and generators were trained with a learning rate of 5 × 10⁻⁴ for all fade-in stages and with a smaller learning rate of 1 × 10⁻⁴ for all straight-through stages, in order to guarantee a smooth and sufficiently slow fine-tuning of all layers. Each stage of each model was trained for 100 epochs, with the sole exception of the last stage for the 512 × 512 full images, which needed 50 more epochs to generate satisfying results. The batch size is progressively reduced as the resolution increases, starting from 64 images for the first three resolutions (4 × 4, 8 × 8 and 16 × 16), decreasing to 32 (32 × 32 and 64 × 64) and then to 16 (128 × 128 and 256 × 256). In the last stage for the full typology (512 × 512), the batch size is further reduced to 8 images per training step.
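The hyperparameters above can be collected into a small training plan. The sketch below is only a bookkeeping aid with hypothetical function names. It assumes that the first resolution (4 × 4) has a straight-through stage only, since there is nothing to fade in from, and that the "last stage" receiving the 50 extra epochs is the 512 × 512 straight-through stage. Neither detail is stated explicitly in the text.

```python
def resolutions(target, start=4):
    """Resolutions visited while growing from start x start to target x target."""
    res, r = [], start
    while r <= target:
        res.append(r)
        r *= 2
    return res

def batch_size(res):
    """Batch size per resolution, as reported in the text."""
    if res <= 16:
        return 64
    if res <= 64:
        return 32
    if res <= 256:
        return 16
    return 8

LEARNING_RATE = {"fade-in": 5e-4, "straight-through": 1e-4}

def training_plan(target):
    """List of (resolution, stage kind, batch size, learning rate, epochs)."""
    plan = []
    for i, res in enumerate(resolutions(target)):
        # Assumption: the first resolution has no lower resolution to fade in from.
        kinds = ["straight-through"] if i == 0 else ["fade-in", "straight-through"]
        for kind in kinds:
            epochs = 100
            # Assumption: the 50 extra epochs apply to the final 512 x 512 stage.
            if res == 512 and kind == "straight-through":
                epochs = 150
            plan.append((res, kind, batch_size(res), LEARNING_RATE[kind], epochs))
    return plan

for stage in training_plan(512):
    print(stage)
```

Under these assumptions, a 512 × 512 model goes through 15 stages and a 256 × 256 model through 13.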

ProGANs training time on Google Colab and AWS SageMaker
Class        Synth image  Resolution (pixel)  Google Colab (min)  AWS SageMaker (min)
missing      full         512 × 512           ~480                ~410
missing      upper/lower  256 × 256           ~435                ~310
tombstoning  full         512 × 512           ~460                ~390
tombstoning  upper/lower  256 × 256           ~420                ~300

Results validation

In the two groups of images below, some examples taken from the generated sets of missing and tombstoning classes are shown. For each group, from left to right, you can see:

  • Samples of 512×512 and 256×256 resolution full typology.
  • Further on the right, from top to bottom, 256×256 upper and lower soldering region typology.
Synthesized images for missing class
Synthesized images for tombstoning class

The synthesized images can be effectively used to train an SMC defect classification model for a future ML-based PCB-AVI application only if their probability distribution is similar to that of the original ones. In particular, it is interesting to verify whether there is a clear separation in the data. To this end, it is possible to employ t-distributed stochastic neighbor embedding (t-SNE), an ML algorithm for visualization based on the Stochastic Neighbor Embedding (SNE) algorithm. Since this algorithm is computationally expensive, it is highly recommended to apply another dimensionality reduction technique before t-SNE, in order to reduce the number of dimensions to a reasonable amount. For this purpose, it is possible to employ an autoencoder (AE).

NN architecture of the Autoencoder
NN architecture of the encoder part
NN architecture of the decoder part
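The second step of this pipeline, running t-SNE on already compressed data, can be sketched with scikit-learn. In the example below, random arrays stand in for the codes produced by the encoder. Their size, the perplexity and the other t-SNE settings are arbitrary assumptions, since the article does not report the values it used.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Stand-ins for the encoder outputs of real and synthesized images.
# The code length (32) and the sample counts are arbitrary.
real_codes = rng.normal(size=(40, 32))
fake_codes = rng.normal(size=(40, 32))

codes = np.vstack([real_codes, fake_codes])
is_fake = np.array([0] * len(real_codes) + [1] * len(fake_codes))

# Three output dimensions, so that the result can be shown in a 3D scatter plot.
tsne = TSNE(n_components=3, perplexity=15, init="pca", random_state=0)
embedding = tsne.fit_transform(codes)

# Split the embedded points back into the two groups for plotting.
real_points = embedding[is_fake == 0]
fake_points = embedding[is_fake == 1]
print(real_points.shape, fake_points.shape)
```

Plotting the two groups in different colors then shows at a glance whether real and synthesized samples overlap or form separate clusters.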

The two soldering regions are the most important parts of the generated images, and they can be easily extracted with the methodology proposed in the previous section and evaluated with t-SNE as well. Since the two soldering regions of the same image are correlated, and strongly so for tombstoning class samples, they must be compressed together by the AE. Therefore, a dual-stream autoencoder, with two input layers and two output layers, is required.

NN architecture of the dual-stream autoencoder
NN architecture of the dual-stream encoder part
NN architecture of the dual-stream decoder part
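A dual-stream autoencoder of this kind can be sketched with the Keras functional API. The layer types, filter counts and code size below are illustrative assumptions and do not reproduce the architecture shown in the figures. The only feature taken from the text is the structure: two inputs, one shared compressed code, and two outputs.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_dual_stream_autoencoder(size=256, channels=3, code_dim=64):
    """Return (autoencoder, encoder) for pairs of size x size region images."""
    upper_in = keras.Input(shape=(size, size, channels), name="upper_in")
    lower_in = keras.Input(shape=(size, size, channels), name="lower_in")

    def encode_branch(x, prefix):
        # Three stride-2 convolutions reduce each side by a factor of 8.
        for i, filters in enumerate((16, 32, 64)):
            x = layers.Conv2D(filters, 3, strides=2, padding="same",
                              activation="relu", name=f"{prefix}_conv{i}")(x)
        return layers.Flatten(name=f"{prefix}_flat")(x)

    # Both regions are merged into one code, so they are compressed together.
    merged = layers.Concatenate(name="merge")(
        [encode_branch(upper_in, "upper"), encode_branch(lower_in, "lower")])
    code = layers.Dense(code_dim, name="code")(merged)

    reduced = size // 8

    def decode_branch(prefix):
        x = layers.Dense(reduced * reduced * 64, activation="relu",
                         name=f"{prefix}_dense")(code)
        x = layers.Reshape((reduced, reduced, 64), name=f"{prefix}_reshape")(x)
        for i, filters in enumerate((32, 16)):
            x = layers.Conv2DTranspose(filters, 3, strides=2, padding="same",
                                       activation="relu", name=f"{prefix}_up{i}")(x)
        return layers.Conv2DTranspose(channels, 3, strides=2, padding="same",
                                      activation="sigmoid", name=f"{prefix}_out")(x)

    autoencoder = keras.Model([upper_in, lower_in],
                              [decode_branch("upper"), decode_branch("lower")],
                              name="dual_stream_ae")
    encoder = keras.Model([upper_in, lower_in], code, name="dual_stream_encoder")
    return autoencoder, encoder
```

After training the autoencoder to reconstruct both regions, only the encoder is needed: its output for each pair of regions is the compressed vector passed on to t-SNE.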

The values obtained by applying the t-SNE algorithm to the compressed data are displayed in the 3D plots reported below for the missing and tombstoning synthesized images, respectively. Interestingly, the 4 plots for the full typology show no separation between fakes (red dots) and reals (blue dots), a sign that the generated images do not differ much from the original ones used to train the proGANs.

The two figures on the far right show the results of t-SNE on the compressed data obtained from the dual-stream encoder for the missing and tombstoning synthesized images. In the first case, for the missing class, there is no separation between fakes and reals of the upper and lower regions, a good sign that the generated images are not very different from the original ones. This is quite understandable since the two regions look alike. The situation is visibly different in the plot for the tombstoning class, where two distinct clouds of dots can be seen, confirming a well-defined separation between the data related to the upper and lower typologies. It is also possible to see that fake and real dots belonging to the same soldering region are very close to each other, while dots belonging to different soldering regions are distant in the plot, confirming once again that the image generation with the proGAN was successful.

T-SNE algorithm results for missing class synthesized images
512 × 512 resolution full images
256 × 256 resolution full images
256 × 256 resolution upper and lower region images
T-SNE algorithm results for tombstoning class synthesized images
512 × 512 resolution full images
256 × 256 resolution full images
256 × 256 resolution upper and lower region images

Useful links