"Properties of and Generalization in Pruned Deep Neural Networks"
Via Zoom
Pruning deep neural network (DNN) parameters to reduce memory and computation requirements is an area of much interest, but a variety of pruning approaches also improve generalization (accuracy on unobserved data). Knowing how pruning improves generalization could lead to better pruning algorithms and a better understanding of the factors affecting generalization. Traditionally, pruning was thought to prevent overfitting to the training data by reducing the number of parameters. However, such an explanation appears to conflict with recent DNN generalization theory, which suggests that more parameters can reduce overfitting. We develop a new perspective on how pruning regularizes, which we connect to existing theory and practice and validate via experiments. Additionally, we demonstrate that novel pruning schemes suggested by this perspective can lead to generalization levels that are unsurpassed in the compression literature.
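The abstract does not specify a pruning scheme, but to make the setting concrete, here is a minimal sketch of unstructured magnitude pruning, one of the most common approaches in the compression literature: the smallest-magnitude fraction of the weights is set to zero. The function name and threshold rule are illustrative assumptions, not the authors' method.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude fraction `sparsity` of the weights."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)  # number of weights to remove
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value; everything at or below it is pruned
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask

# Example: prune 50% of a random 4x4 weight matrix
rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4))
pruned = magnitude_prune(w, 0.5)
```

Surviving weights keep their original values; in practice the pruned network is then fine-tuned, and the talk's question is why such sparsified models often generalize better than the dense original.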