{| class="wikitable"
!Version
!Date
!Notes
|-
|1.0.0
|June 2021
|First public release
|}
==Building the dataset==
===Generating real defects===
Of course, defect detection on assembled PCBs is a matter strictly related to the core business of DAVE Embedded Systems. As a manufacturer of electronic boards, PCB assembly is an industrial process under the full control of the company itself. For the case under consideration, this was a great help because it allowed the creation of an initial dataset of defective boards. As detailed in the rest of the section, DAVE Embedded Systems' assembly line was purposely configured to deliberately generate a relatively high rate of defects, in a controlled fashion, on a limited number of boards devoted to this purpose.
Specifically, 5 panels were prepared, each one containing 4 PCBs of the same type and with the same template project, i.e. the same bill of materials (BOM), for the [https://www.dave.eu/en/solutions/system-on-modules/diva-som ''DIVA SoM'']. In this case, it was decided to mount components only on the ''top'' side of the PCB. Furthermore, in order to reduce complexity and simplify the problem for this test, anomaly generation was restricted to 2-contact SMD passive components, i.e. resistors, capacitors, and inductors. The table below reports all plausible anomalies: they can be generated by editing the standard template project containing all the positions of the components for the pick-and-place (P&P) machine, by decreasing or increasing the quantity of solder deposited by the screen printer on the panel, or directly by the operator.
{| class="wikitable" style="text-align:center; margin: auto;"
In the previous articles, when facing a similar lack of data, data augmentation was easily employed by creating input pipelines with image re-mapping operations, or by using OpenCV functions to increase the amount of data stored in memory, much in the same fashion as the TensorFlow image APIs.
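For illustration purposes, the following is a minimal sketch of this classic kind of augmentation, written with the standard TensorFlow image APIs (the function name and the chosen transformations are illustrative, not necessarily the ones used in the original pipelines):
<pre>
import tensorflow as tf

def augment(image):
    # Classic augmentation: new samples are derived from an existing image.
    image = tf.image.random_flip_left_right(image)
    image = tf.image.random_flip_up_down(image)
    image = tf.image.random_brightness(image, max_delta=0.1)
    image = tf.image.random_contrast(image, lower=0.9, upper=1.1)
    return image

# Typical usage inside an input pipeline:
# dataset = dataset.map(augment)
</pre>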
However, although this is a good and very common methodology to increase model robustness for the classification task, the new images are generated directly from already existing ones. Thus, newly created images are always based on already existing features, such as shapes, edges, and contours. For this reason, it is a matter of real interest to explore the possibility of generating new data based on the features of the entire original subset. This can be achieved through ML techniques for artificial image synthesis, in particular with the help of [https://en.wikipedia.org/wiki/Generative_adversarial_network Generative Adversarial Networks (GANs)].
===Generative Adversarial Networks===
GANs are a particular type of generative model used for unsupervised learning, which attempt to synthesize new data that is indistinguishable from the training data, i.e. with the same distribution function as the original data.
They use two NNs, specifically two CNNs in the most recent approaches, which are locked in a competition game:
* A generator, which is fed a vector of random numbers (the latent vector) and outputs synthesized data (the generated images).
* A discriminator, which is fed a batch of data, in this case a batch of images, and outputs a prediction of whether it comes from the training set or from the generated set, basically learning a binary classification problem.
In other words, the generator creates fake data and the discriminator attempts to distinguish these fake samples from the real ones.
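Conceptually, a single alternating training step can be sketched as follows with TensorFlow/Keras (this is a generic, illustrative GAN step, not the exact code used for this work; <code>generator</code>, <code>discriminator</code>, and <code>latent_dim</code> are assumed to be defined elsewhere):
<pre>
import tensorflow as tf

cross_entropy = tf.keras.losses.BinaryCrossentropy(from_logits=True)
g_opt = tf.keras.optimizers.Adam(1e-4)
d_opt = tf.keras.optimizers.Adam(1e-4)

def train_step(real_images, generator, discriminator, latent_dim):
    noise = tf.random.normal([tf.shape(real_images)[0], latent_dim])
    with tf.GradientTape() as g_tape, tf.GradientTape() as d_tape:
        fakes = generator(noise, training=True)
        real_pred = discriminator(real_images, training=True)
        fake_pred = discriminator(fakes, training=True)
        # Discriminator: classify real samples as 1 and fakes as 0.
        d_loss = (cross_entropy(tf.ones_like(real_pred), real_pred) +
                  cross_entropy(tf.zeros_like(fake_pred), fake_pred))
        # Generator: fool the discriminator into predicting 1 for fakes.
        g_loss = cross_entropy(tf.ones_like(fake_pred), fake_pred)
    g_opt.apply_gradients(zip(g_tape.gradient(g_loss, generator.trainable_variables),
                              generator.trainable_variables))
    d_opt.apply_gradients(zip(d_tape.gradient(d_loss, discriminator.trainable_variables),
                              discriminator.trainable_variables))
</pre>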
It must be noted that, in practice, GANs are quite complex and their training can be a very challenging task, making the generation from scratch of high-resolution, high-quality images a non-trivial problem. This indeed severely limits the usefulness and the applicability of classic GAN architectures for many kinds of practical applications. Fortunately, this issue can be addressed by employing a particular type of network developed by NVIDIA and named proGAN, which is characterized by a progressively growing architecture.
===Progressive GAN implementation===
When using [https://towardsdatascience.com/progan-how-nvidia-generated-images-of-unprecedented-quality-51c98ec2cbd2 proGANs] to synthesize high-resolution images, instead of attempting to train all layers of the generator and discriminator at once to generate samples at the target resolution, as is usually done, the networks are initially created with a block containing only a few layers. Then, they are progressively grown to output higher-resolution versions of the images by adding more and more blocks, one at a time, after completing the training of the previous one, as illustrated in the figure below.
[[File:Training progression.png|center|thumb|500x500px|ProGAN: training progression]]
This approach leads to a series of advantages:
* The incremental learning process greatly stabilizes training and reduces the chance of mode collapse, since the networks gradually learn a much simpler piece of the overall problem.
* The low-to-high resolution trend forces the progressively grown networks to focus on high-level structure first and fill in the details later, resulting in an improvement of the quality of the final images.
* Increasing the network size gradually is more computationally efficient than the classic approach of using all the layers from the start (fewer layers are faster to train because there are fewer parameters).
To implement a proGAN, one possible solution is to pre-define all models prior to training and exploit the TensorFlow Keras functional API, ensuring that layers are shared across the models. This approach requires defining, for each resolution of the discriminator and generator, two models: one named ''straight-through'' and the other named ''fade-in''. The latter, as the name suggests, implements the fade-in mechanism and so is used to transition from a lower-resolution ''straight-through'' model to a higher-resolution one. The ''straight-through'' version has a plain architecture and its purpose is to fine-tune all the layers for a given resolution.
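As an illustration, the following sketch shows one common way to implement the fade-in mechanism with the Keras functional API: a custom <code>WeightedSum</code> layer blends the up-sampled output of the previous stage with the output of the new block through a coefficient <code>alpha</code>, which is ramped from 0 to 1 during training. All names and layer sizes here are illustrative assumptions, not the actual implementation:
<pre>
import tensorflow as tf
from tensorflow.keras import backend as K
from tensorflow.keras.layers import (Add, Conv2D, Dense, Input, LeakyReLU,
                                     Reshape, UpSampling2D)

class WeightedSum(Add):
    """Blends two tensors: (1 - alpha) * old + alpha * new."""
    def __init__(self, alpha=0.0, **kwargs):
        super().__init__(**kwargs)
        self.alpha = K.variable(alpha, name='ws_alpha')  # ramped 0 -> 1

    def _merge_function(self, inputs):
        # inputs[0]: up-sampled output of the old stage
        # inputs[1]: output of the new, higher-resolution block
        return (1.0 - self.alpha) * inputs[0] + self.alpha * inputs[1]

def base_generator(latent_dim=100):
    # Minimal 4x4 base generator; the last layer is the to-RGB convolution.
    inp = Input(shape=(latent_dim,))
    x = Dense(4 * 4 * 64)(inp)
    x = LeakyReLU(0.2)(Reshape((4, 4, 64))(x))
    return tf.keras.Model(inp, Conv2D(3, 1, padding='same')(x))

def grow_generator(old_model):
    """Returns (straight-through, fade-in) generators at twice the resolution."""
    block_end = old_model.layers[-2].output      # features before old to-RGB
    upsampled = UpSampling2D()(block_end)
    x = LeakyReLU(0.2)(Conv2D(64, 3, padding='same')(upsampled))
    new_rgb = Conv2D(3, 1, padding='same')(x)    # new to-RGB layer
    straight = tf.keras.Model(old_model.input, new_rgb)
    old_rgb = old_model.layers[-1](upsampled)    # reuse the old to-RGB layer
    fade_in = tf.keras.Model(old_model.input,
                             WeightedSum()([old_rgb, new_rgb]))
    return straight, fade_in

g4 = base_generator()
g8_straight, g8_fade_in = grow_generator(g4)     # 4x4 -> 8x8 stage
</pre>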
==Training configuration and hyperparameter setup==
A total of 6 proGAN models were built and trained. Each one was built with the required number of ''straight-through'' and ''fade-in'' model stages to generate several types of synthesized images at the desired target resolution. These images belong either to the ''missing'' or the ''tombstoning'' class, and are either ''full'' images (512 × 512 resolution) or ''upper''/''lower'' soldering region images (256 × 256 resolution). The training was executed mainly in the cloud, initially with the free services provided by [https://colab.research.google.com/ Google Colab] and finally with [https://aws.amazon.com/it/sagemaker/ AWS SageMaker].
Discriminators and generators were both trained with a learning rate of 5 × 10<sup>-4</sup> for all ''fade-in'' stages and with a smaller learning rate of 1 × 10<sup>-4</sup> for all ''straight-through'' stages, in order to guarantee a smooth and slow enough fine-tuning of all layers. Each stage of each model was trained for 100 epochs, with the sole exception of the last stage for the 512 × 512 ''full'' images, which needed 50 more epochs to generate satisfying results. The batch size was progressively reduced as the resolution increased, starting from 64 images for the first three resolutions (4 × 4, 8 × 8, and 16 × 16), decreasing to 32 (32 × 32 and 64 × 64) and to 16 (128 × 128 and 256 × 256). In the last stage of the ''full'' type (512 × 512), the batch size was further reduced to 8 images per training step.
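For reference, the schedule above can be condensed into a simple configuration structure, sketched below (the variable names are illustrative):
<pre>
# Per-resolution training schedule, as described above (illustrative names).
FADE_IN_LR = 5e-4    # learning rate for fade-in stages
STRAIGHT_LR = 1e-4   # learning rate for straight-through stages

SCHEDULE = {
    # resolution: (batch_size, epochs)
    4:   (64, 100),
    8:   (64, 100),
    16:  (64, 100),
    32:  (32, 100),
    64:  (32, 100),
    128: (16, 100),
    256: (16, 100),
    512: (8, 150),   # the last 'full' stage needed 50 extra epochs
}
</pre>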
{| class="wikitable" style="text-align:center; margin: auto;"
==Results validation==
In the two groups of images below, some examples taken from the generated sets of the ''missing'' and ''tombstoning'' classes are shown. For each group, from left to right, you can see:
* Samples of the ''full'' type at 512 × 512 and 256 × 256 resolution.
* Further on the right, from top to bottom, samples of the ''upper'' and ''lower'' soldering region types at 256 × 256 resolution.
{| style="background:transparent; color:black" border="0" align="center" cellpadding="10px" cellspacing="0px" height="550" valign="bottom"
|}
The synthesized images can be effectively used to train an SMC defect classification model for a future ML-based PCB-AVI application only if they have a probability distribution function similar to that of the original ones. In particular, it is interesting to verify whether there is a clear separation in the data. To this end, it is possible to employ t-distributed stochastic neighbor embedding (t-SNE), a ML algorithm for visualization based on the Stochastic Neighbor Embedding (SNE) algorithm. Since t-SNE is computationally expensive, it is highly recommended to apply another dimensionality reduction technique first, in order to reduce the number of dimensions to a reasonable amount. For this purpose, it is possible to employ an autoencoder (AE).
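A minimal sketch of this validation pipeline is reported below, assuming a simple convolutional AE and the t-SNE implementation provided by scikit-learn (the architecture, layer sizes, and array names are illustrative assumptions):
<pre>
import tensorflow as tf
from tensorflow.keras import layers, Model
from sklearn.manifold import TSNE

# Simple convolutional autoencoder (illustrative architecture).
inp = layers.Input(shape=(256, 256, 1))
x = layers.Conv2D(16, 3, strides=2, padding='same', activation='relu')(inp)
x = layers.Conv2D(32, 3, strides=2, padding='same', activation='relu')(x)
code = layers.Dense(128, activation='relu')(layers.Flatten()(x))  # compressed code
x = layers.Dense(64 * 64 * 32, activation='relu')(code)
x = layers.Reshape((64, 64, 32))(x)
x = layers.Conv2DTranspose(16, 3, strides=2, padding='same', activation='relu')(x)
out = layers.Conv2DTranspose(1, 3, strides=2, padding='same', activation='sigmoid')(x)

autoencoder = Model(inp, out)
encoder = Model(inp, code)
autoencoder.compile(optimizer='adam', loss='mse')

# Train the AE on the union of real and generated images (hypothetical arrays),
# then reduce the 128-D codes to 3-D with t-SNE for visualization:
# autoencoder.fit(images, images, epochs=50, batch_size=32)
# codes = encoder.predict(images)
# embedded = TSNE(n_components=3, perplexity=30).fit_transform(codes)
</pre>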
{| style="background:transparent; color:black" border="0" align="center" cellpadding="10px" cellspacing="0px" height="550" valign="bottom"
|}
The two soldering regions are the most important parts of the generated images and, by using the methodology proposed in the previous section, they can easily be extracted and evaluated with the t-SNE algorithm as well. As the two soldering regions associated with the same image are correlated, even strongly correlated for ''tombstoning'' class samples, they must be compressed by the AE concurrently. Therefore, a dual-stream autoencoder, with two input/output layers, is required.
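A possible sketch of such a dual-stream AE, again with the Keras functional API, is shown below: the two region streams are encoded separately and fused into a single shared code, so that their correlation is captured jointly (all names and sizes are illustrative assumptions):
<pre>
from tensorflow.keras import layers, Model

def encoder_branch(inp):
    # Convolutional branch for one soldering region.
    x = layers.Conv2D(16, 3, strides=2, padding='same', activation='relu')(inp)
    x = layers.Conv2D(32, 3, strides=2, padding='same', activation='relu')(x)
    return layers.Flatten()(x)

def decoder_branch(code):
    # Mirror of the encoder, reconstructing one region from the shared code.
    x = layers.Dense(64 * 64 * 32, activation='relu')(code)
    x = layers.Reshape((64, 64, 32))(x)
    x = layers.Conv2DTranspose(16, 3, strides=2, padding='same', activation='relu')(x)
    return layers.Conv2DTranspose(1, 3, strides=2, padding='same', activation='sigmoid')(x)

# Two inputs/outputs: upper and lower soldering regions of the same image.
upper_in = layers.Input(shape=(256, 256, 1), name='upper')
lower_in = layers.Input(shape=(256, 256, 1), name='lower')
merged = layers.Concatenate()([encoder_branch(upper_in), encoder_branch(lower_in)])
code = layers.Dense(128, activation='relu', name='shared_code')(merged)

dual_ae = Model([upper_in, lower_in], [decoder_branch(code), decoder_branch(code)])
dual_ae.compile(optimizer='adam', loss='mse')
# dual_encoder = Model([upper_in, lower_in], code)  # codes fed to t-SNE
</pre>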
{| style="background:transparent; color:black" border="0" align="center" cellpadding="10px" cellspacing="0px" height="550" valign="bottom"
|}
The values obtained by applying the t-SNE algorithm to the compressed data are displayed in several 3D plots, reported below respectively for ''missing'' and ''tombstoning'' synthesized images. Interestingly, the 4 plots related to the ''full'' type do not show any separation in the data between ''fakes'' (red dots) and ''reals'' (blue dots), a sign that the generated images do not differ much from the original ones, which were used to train the proGANs.
The two figures on the far right report the plots showing the results of t-SNE on the compressed data obtained from the dual-stream encoder for ''missing'' and ''tombstoning'' synthesized images. In the first case, for the ''missing'' class, there is virtually no separation between fakes and reals for the ''upper'' and ''lower'' region sides, a good sign that the generated images are not very different from the original ones. This is quite understandable, since the two regions look alike. The situation is visibly different in the plot for the ''tombstoning'' class, where two distinct clouds of dots can be seen, confirming a well-defined separation between the data related to the ''upper'' and ''lower'' types. It is also possible to see that fake and real dots belonging to the same soldering region are very close to each other, while dots belonging to different soldering regions are distant in the plot, confirming once again that the image generation with the proGAN was successful.
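For completeness, 3D scatter plots like the ones discussed above can be obtained, for instance, as in the following sketch (<code>embedded</code> is the N × 3 t-SNE output and <code>is_fake</code> a boolean mask, both assumed to come from the previous steps):
<pre>
import matplotlib.pyplot as plt

fig = plt.figure()
ax = fig.add_subplot(projection='3d')
ax.scatter(*embedded[is_fake].T, c='red', label='fakes', s=5)
ax.scatter(*embedded[~is_fake].T, c='blue', label='reals', s=5)
ax.legend()
plt.show()
</pre>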
{| style="background:transparent; color:black" border="0" align="center" cellpadding="10px" cellspacing="0px" height="550" valign="bottom"