The model is trained for a total of 100 epochs, with early stopping to prevent overfitting on the training data and with checkpointing of the weights at the best val_loss. After training, a new model is created with all the layers that are useful only during training, such as dropout and batch normalization, disabled (in this case, batchnorm layers are not used).
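The early-stopping and checkpointing behaviour described above can be sketched in plain Python. The function below mimics the Keras-style logic (stop once val_loss has not improved for a number of epochs, keep the weights from the best epoch); the names `patience` and `val_losses` are illustrative and not taken from the original code.

```python
def train_with_early_stopping(val_losses, patience=10):
    """Scan per-epoch validation losses; return (best_epoch, stop_epoch).

    Mimics EarlyStopping combined with checkpointing on best val_loss:
    training stops once val_loss has not improved for `patience` epochs,
    and the checkpointed weights are those from the best epoch.
    """
    best_loss = float("inf")
    best_epoch = 0
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss = loss          # new best: checkpoint weights here
            best_epoch = epoch
            wait = 0
        else:
            wait += 1
            if wait >= patience:      # no improvement for `patience` epochs
                return best_epoch, epoch
    return best_epoch, len(val_losses) - 1
```

With `patience=3`, a loss curve that bottoms out at epoch 2 and then plateaus stops three epochs later, and the checkpointed weights are those of epoch 2.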
[[File:Train Accuracy.png|thumb|left|500px|Plot of model's accuracy during training phase]][[File:Train Loss.png|thumb|right|500px|Plot of model's loss during training phase]]
===Prune the model===
In this particular case, a good compromise between compression and accuracy drop is to prune only the two dense layers of the model, which hold a large number of parameters. The pruning schedule starts at epoch 0 and ends at 1/3 of the total number of epochs (i.e. 100 epochs), ramping from an initial sparsity of 50% to a final sparsity of 80%, with a pruning frequency of 5 steps (i.e. the model is pruned every 5 steps during the training phase).
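The schedule above can be sketched as a polynomial sparsity ramp, in the style of TensorFlow Model Optimization's PolynomialDecay (whose default exponent is 3); the exact formula and the step-based units here are assumptions for illustration, with `end_step=33` standing in for 1/3 of the 100 epochs.

```python
def sparsity_at(step, begin_step=0, end_step=33, initial=0.50, final=0.80,
                exponent=3):
    """Target sparsity at a given training step.

    Sparsity ramps from `initial` at `begin_step` up to `final` at
    `end_step`, following a cubic (exponent-3) polynomial decay.
    """
    if step < begin_step:
        return 0.0
    step = min(step, end_step)
    progress = (step - begin_step) / (end_step - begin_step)
    return final + (initial - final) * (1.0 - progress) ** exponent

def should_prune(step, begin_step=0, end_step=33, frequency=5):
    """Pruning is applied only every `frequency` steps inside the window."""
    return begin_step <= step <= end_step and (step - begin_step) % frequency == 0
```

At step 0 the target sparsity is 50%, at step 33 (and beyond) it is 80%, and the pruning mask is updated only every 5 steps.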
[[File:Pruned Accuracy.png|thumb|center|500px|Plot of model's accuracy during pruning phase]][[File:Pruned Loss.png|thumb|center|500px|Plot of model's loss during pruning phase]]
The weight sparsity of the model after applying pruning:
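Weight sparsity can be checked directly on the weight tensors as the fraction of exactly-zero entries; a minimal sketch with NumPy (the function name is illustrative):

```python
import numpy as np

def weight_sparsity(weights):
    """Return the fraction of zero-valued entries in a weight array."""
    w = np.asarray(weights)
    return float(np.sum(w == 0)) / w.size
```

Applied to each pruned dense layer's kernel, this should report values close to the 80% final sparsity of the schedule.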