Paper Daily: Rethinking the Value of Network Pruning

In this work, the authors make several surprising observations that contradict common beliefs about network pruning. Their results have several implications: 1) training a large, over-parameterized model is not necessary to obtain an efficient final model; 2) the learned "important" weights of the large model are not necessarily useful for the small pruned model; 3) the pruned architecture itself, rather than a set of inherited "important" weights, is what matters most for the final model's efficiency, which suggests that in some cases pruning can be seen as a form of architecture search.
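To make the contrast between inherited weights and the pruned architecture concrete, here is a minimal NumPy sketch of unstructured magnitude pruning, one common pruning approach. The toy 64×64 weight matrix, the 90% sparsity level, and all variable names are illustrative assumptions, not details from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude entries of a weight matrix.

    Returns the pruned weights and the binary keep-mask that defines
    the pruned architecture.
    """
    k = int(weights.size * sparsity)                 # number of weights to drop
    threshold = np.sort(np.abs(weights), axis=None)[k]
    mask = np.abs(weights) >= threshold              # keep only large weights
    return weights * mask, mask

# Conventional pipeline: the small model inherits the large model's
# surviving "important" weights.
large_w = rng.normal(size=(64, 64))
inherited_w, mask = magnitude_prune(large_w, sparsity=0.9)

# Alternative the authors study: keep only the architecture (the mask)
# and re-initialize the remaining weights, then train from scratch.
scratch_w = rng.normal(size=large_w.shape) * mask

print(f"fraction of weights kept: {mask.mean():.3f}")
```

Both `inherited_w` and `scratch_w` share the same sparsity pattern; the paper's finding is that training the re-initialized version from scratch can match the fine-tuned inherited version.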