Sharpness-Aware Minimization

Axel Böhm, 19. Apr 2023

For overparameterized models, a low training loss provides few guarantees about generalization. Prior work has observed connections between the geometry of the loss landscape, in particular the flatness of a minimizer, and generalization. Sharpness-aware minimization (SAM) attempts to exploit this connection by simultaneously minimizing the loss value and the loss sharpness.
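
To make the idea of minimizing loss value and sharpness together more concrete, here is a minimal sketch of one sharpness-aware update. It assumes the commonly used two-step approximation: first move to a nearby point of approximately highest loss within a small radius, then descend using the gradient computed there. The function name `sam_step`, the parameters `rho` and `lr`, and the toy quadratic loss are illustrative assumptions and are not taken from the talk.

```python
import numpy as np


def sam_step(w, grad_fn, lr=0.05, rho=0.05):
    """One sharpness-aware update (illustrative sketch, not a reference implementation)."""
    g = grad_fn(w)
    # First-order approximation of the worst-case perturbation inside an
    # L2 ball of radius rho: a step of length rho along the gradient.
    eps = rho * g / (np.linalg.norm(g) + 1e-12)
    # Descend using the gradient evaluated at the perturbed point.
    return w - lr * grad_fn(w + eps)


# Toy loss L(w) = 0.5 * w^T A w with one sharp and one flat direction (assumed example).
A = np.diag([10.0, 0.1])


def loss(w):
    return 0.5 * w @ A @ w


def grad(w):
    return A @ w


w0 = np.array([1.0, 1.0])
w = w0.copy()
for _ in range(100):
    w = sam_step(w, grad)

print(loss(w0), loss(w))
```

With `rho=0` the perturbation vanishes and the update reduces to plain gradient descent, which is one way to see that the perturbation radius is what controls the sharpness term in this sketch.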