==Data augmentation with image synthesis==
In the previous articles, when facing a similar lack of data, data augmentation was applied straightforwardly by creating input pipelines with image re-mapping operations, or by using OpenCV functions to increase the amount of data held in memory, in much the same fashion as the TensorFlow image APIs.
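As a quick illustration, a classic augmentation step of this kind can be sketched with the TensorFlow image APIs mentioned above; the <code>dataset</code> variable and the specific transformations are placeholders chosen for the example, not taken from the previous articles.

<syntaxhighlight lang="python">
import tensorflow as tf

def augment(image, label):
    # Re-map an existing image with random geometric and photometric operations
    image = tf.image.random_flip_left_right(image)
    image = tf.image.random_brightness(image, max_delta=0.1)
    image = tf.image.random_contrast(image, lower=0.9, upper=1.1)
    return image, label

# 'dataset' is assumed to be a tf.data.Dataset of (image, label) pairs
# dataset = dataset.map(augment, num_parallel_calls=tf.data.AUTOTUNE)
</syntaxhighlight>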
 
However, although this is a good and very common way to increase model robustness for the classification task, each new image is generated directly from an already existing one and is therefore always based on features already present in it, such as shapes, edges and contours. It is therefore interesting to explore the possibility of generating new data based on the features of the entire original subset. This can be achieved through ML techniques for artificial image synthesis, in particular with GANs.
 
===Generative adversarial networks===
GANs are a particular type of generative model used for unsupervised learning, which attempts to synthesize new data that is indistinguishable from the training data, i.e. data drawn from the same distribution as the original data.
 
A GAN uses two NNs, typically two CNNs in the most recent approaches, locked in a competition game: a generator, which is fed a vector of random numbers (the latent vector) and outputs synthesized data (the generated images), and a discriminator, which is fed a batch of data, in this case a batch of images, and outputs a prediction of whether it comes from the training set or the generated set, essentially learning a binary classification problem. In other words, the generator creates fake data and the discriminator attempts to distinguish these fake samples from the real ones.
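A minimal Keras sketch of the two competing networks is shown below; the image size (64x64 RGB), latent dimension and layer sizes are illustrative assumptions, not the configuration used here.

<syntaxhighlight lang="python">
import tensorflow as tf
from tensorflow.keras import layers

latent_dim = 100  # size of the latent vector fed to the generator (assumed)

# Generator: latent vector -> synthesized 64x64 RGB image
generator = tf.keras.Sequential([
    layers.Dense(8 * 8 * 128, input_shape=(latent_dim,)),
    layers.Reshape((8, 8, 128)),
    layers.Conv2DTranspose(128, 4, strides=2, padding="same"),  # 16x16
    layers.LeakyReLU(0.2),
    layers.Conv2DTranspose(64, 4, strides=2, padding="same"),   # 32x32
    layers.LeakyReLU(0.2),
    layers.Conv2DTranspose(3, 4, strides=2, padding="same", activation="tanh"),  # 64x64
])

# Discriminator: image -> probability that it comes from the training set
discriminator = tf.keras.Sequential([
    layers.Conv2D(64, 4, strides=2, padding="same", input_shape=(64, 64, 3)),
    layers.LeakyReLU(0.2),
    layers.Conv2D(128, 4, strides=2, padding="same"),
    layers.LeakyReLU(0.2),
    layers.Flatten(),
    layers.Dense(1, activation="sigmoid"),  # binary classification: real vs. generated
])
</syntaxhighlight>

During training, the discriminator is updated to classify real versus generated batches correctly, while the generator is updated to make the discriminator misclassify its outputs as real.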
 
It must be noted that GANs in practice are quite complex and training can be very challenging, which makes generating high-resolution images from scratch a non-trivial problem. This severely limits the usefulness and applicability of classic GAN architectures for many kinds of practical applications. Fortunately, this issue can be addressed by employing a particular type of network developed by NVIDIA, named proGAN, which is characterized by a progressively growing architecture.
 
===Progressive GAN implementation===
When using proGANs to synthesize images at high resolution, instead of training all layers of the generator and discriminator at once to generate samples at the target resolution, as is usually done, the networks initially consist of a single block containing only a few layers and are progressively grown to output higher-resolution versions of the images by adding more blocks, one at a time, each after the training of the previous one is complete, as illustrated in the figure below.
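The following sketch outlines this idea in Keras terms; the helper function, block sizes and growth schedule are hypothetical and only meant to show how blocks are appended one resolution step at a time (the gradual fade-in of new blocks used by the actual proGAN is omitted).

<syntaxhighlight lang="python">
import tensorflow as tf
from tensorflow.keras import layers

def build_generator(num_blocks, latent_dim=512):
    """Generator 'grown' num_blocks times: 4x4 base, resolution doubles per block."""
    model = tf.keras.Sequential()
    model.add(layers.Dense(4 * 4 * 128, input_shape=(latent_dim,)))
    model.add(layers.Reshape((4, 4, 128)))
    for _ in range(num_blocks):
        # Each new block doubles the output resolution and refines the features
        model.add(layers.UpSampling2D())
        model.add(layers.Conv2D(128, 3, padding="same"))
        model.add(layers.LeakyReLU(0.2))
    model.add(layers.Conv2D(3, 1, padding="same", activation="tanh"))  # to-RGB layer
    return model

# Progressive training schedule: train each resolution before adding the next block
# for num_blocks in range(5):           # 4x4, 8x8, 16x16, 32x32, 64x64
#     generator = build_generator(num_blocks)
#     ... train generator and a matching discriminator at this resolution ...
</syntaxhighlight>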