Poster
Fishers and Hessians of Continuous Relaxations
Felix Petersen · Christian Borgelt · Tobias Sutter · Hilde Kuehne · Oliver Deussen · Stefano Ermon
West Ballroom A-D #6207
When training neural networks with custom objectives, such as ranking objectives and shortest-path objectives, a common problem is that these objectives, per se, do not have gradients. A popular approach is to continuously relax the objectives so that they provide gradients, enabling learning. However, such continuous relaxations often suffer from vanishing and exploding gradients. In this work, we explore a technique for using the empirical Fisher matrices and Hessians of relaxations to alleviate the training bottleneck that arises from vanishing and exploding gradients in the objective function. We benchmark our approach on four relaxations each of ranking and shortest-path objectives, showing strong improvements, particularly for rather 'ad-hoc' relaxations that do not come with strong theoretical guarantees.
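As a rough illustration of the general idea, the sketch below preconditions the output-space gradient of a relaxed objective with a damped inverse of its empirical Fisher matrix (the batch average of per-sample gradient outer products), so that ill-scaled gradients from the relaxation are rescaled before being backpropagated into the network. This is a minimal sketch under stated assumptions, not the paper's exact algorithm: the function `empirical_fisher_precondition`, the Tikhonov damping `eps`, and the toy sigmoid surrogate standing in for a real ranking relaxation are all illustrative choices.

```python
import torch

def empirical_fisher_precondition(y, per_sample_loss, eps=1e-3):
    """Precondition output-space gradients of a relaxed objective.

    y:               (B, D) network outputs, part of the autograd graph.
    per_sample_loss: callable mapping y -> (B,) relaxed objective values.
    eps:             Tikhonov damping for the (possibly singular) Fisher.

    Returns per-sample gradients g_i replaced by (F + eps*I)^{-1} g_i,
    where F = mean_i g_i g_i^T is the batch empirical Fisher.
    """
    losses = per_sample_loss(y)
    # Per-sample output gradients (valid because loss_i depends only on y_i).
    g = torch.autograd.grad(losses.sum(), y, retain_graph=True)[0]   # (B, D)
    B, D = g.shape
    fisher = (g.unsqueeze(2) @ g.unsqueeze(1)).mean(dim=0)           # (D, D)
    damped = fisher + eps * torch.eye(D, device=y.device)
    return torch.linalg.solve(damped, g.T).T                         # (B, D)

# Usage with a toy network and an illustrative smooth surrogate objective.
net = torch.nn.Linear(8, 5)
x = torch.randn(32, 8)
targets = torch.randn(32, 5)

def relaxed_objective(y):
    # Placeholder smooth surrogate; a real ranking or shortest-path
    # relaxation would go here.
    return torch.sigmoid(y - targets).sum(dim=1)

y = net(x)
precond_grad = empirical_fisher_precondition(y, relaxed_objective)
# Backpropagate the preconditioned gradient into the network parameters.
y.backward(precond_grad.detach())
```

The damping term `eps * I` keeps the solve well posed when the empirical Fisher is rank-deficient, which is common when the batch size is smaller than the output dimension.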