Sharpness-Aware Minimization

Axel Böhm, 19. Apr 2023

For overparameterized models, the training loss provides few guarantees on model generalization. Prior work has observed connections between the geometry of the loss landscape, in particular the flatness of a minimizer, and generalization. Sharpness-aware minimization attempts to exploit this connection by simultaneously minimizing the loss value and the loss sharpness.
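
To make the idea concrete, here is a minimal sketch of one common way a sharpness-aware update is formulated: take a normalized gradient ascent step of radius rho to approximate the worst-case nearby point, then descend using the gradient evaluated there. The toy quadratic loss, the function names (loss, grad, sam_step) and the values of lr and rho are assumptions chosen for illustration only, not taken from the talk or from any particular library.

```python
import math


def loss(w):
    # Hypothetical toy loss: steep along w[0], nearly flat along w[1].
    return 0.5 * (10.0 * w[0] ** 2 + 0.1 * w[1] ** 2)


def grad(w):
    # Analytic gradient of the toy loss above.
    return [10.0 * w[0], 0.1 * w[1]]


def sam_step(w, lr=0.05, rho=0.05):
    """One sharpness-aware update (illustrative sketch, assumed hyperparameters)."""
    g = grad(w)
    norm = math.sqrt(sum(gi * gi for gi in g))
    if norm == 0.0:
        # No ascent direction is defined at a stationary point; leave w unchanged.
        return list(w)
    # Ascent step: first-order approximation of the worst-case perturbation
    # inside an L2 ball of radius rho around w.
    eps = [rho * gi / norm for gi in g]
    w_adv = [wi + ei for wi, ei in zip(w, eps)]
    # Descent step: apply the gradient taken at the perturbed point to w.
    g_adv = grad(w_adv)
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]


w = [1.0, 1.0]
for _ in range(100):
    w = sam_step(w)
```

Each update costs two gradient evaluations instead of one, which is the usual price of this kind of scheme. With rho set to 0, the step reduces to plain gradient descent.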

Resources:

https://arxiv.org/abs/2010.01412
https://arxiv.org/abs/2211.05729
https://arxiv.org/abs/2206.06232
https://arxiv.org/abs/2004.05884
https://arxiv.org/abs/2010.04925