Paper Daily: Rethinking the value of network pruning

In this work, the authors make several surprising observations that contradict common beliefs. Their results have several implications: 1) training a large, over-parameterized model is not necessary to obtain an efficient final model; 2) the learned "important" weights of the large model are not necessarily useful for the small pruned model; 3) the pruned architecture itself, rather than a set of inherited "important" weights, is what gives the final model its efficiency benefit. This suggests that some pruning algorithms could be viewed as performing network architecture search.
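The contrast between the two pipelines can be sketched with magnitude-based weight pruning (a common baseline; the paper evaluates several pruning methods). The sketch below is illustrative, not the authors' code: the conventional pipeline fine-tunes the inherited weights, while the paper's finding is that training the same pruned architecture from a fresh random initialization works just as well.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dense layer weights from a trained, over-parameterized model.
weights = rng.normal(size=(64, 64))

def magnitude_prune_mask(w, sparsity):
    """Keep the largest-magnitude entries; zero out the given fraction."""
    k = int(w.size * sparsity)                      # number of weights to remove
    threshold = np.sort(np.abs(w), axis=None)[k]    # (k+1)-th smallest magnitude
    return np.abs(w) >= threshold

mask = magnitude_prune_mask(weights, sparsity=0.8)

# Conventional pipeline: keep and fine-tune the inherited "important" weights.
inherited = weights * mask

# The paper's alternative: keep only the pruned architecture (the mask)
# and train it from scratch with a new random initialization.
from_scratch = rng.normal(size=weights.shape) * mask

print(f"fraction of weights kept: {mask.mean():.2f}")
```

Both `inherited` and `from_scratch` share the same sparsity pattern; the observation is that after training, the second matches the first, so the mask (the architecture), not the inherited values, carries the benefit.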
