

Poster

Fishers and Hessians of Continuous Relaxations

Felix Petersen · Christian Borgelt · Tobias Sutter · Hilde Kuehne · Oliver Deussen · Stefano Ermon

West Ballroom A-D #6207
Wed 11 Dec 11 a.m. PST — 2 p.m. PST

Abstract:

When training neural networks with custom objectives, such as ranking objectives and shortest-path objectives, a common problem is that they, per se, do not have gradients. A popular approach is to continuously relax the objectives to provide gradients, enabling learning. However, such continuous relaxations often suffer from vanishing and exploding gradients. In this work, we explore a technique for using the empirical Fisher matrices and Hessians of relaxations to alleviate the training bottleneck that arises from vanishing and exploding gradients in the objective function. We benchmark our approach on four relaxations each of ranking and shortest-path objectives, showing strong improvements, particularly for rather "ad-hoc" relaxations that do not come with strong theoretical guarantees.
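To make the general recipe concrete, below is a minimal sketch of Fisher-preconditioned gradients for a relaxed objective. This is not the authors' implementation: the sigmoid-based `relaxed_ranking_loss`, the batch size, and the damping value are all illustrative assumptions. The sketch only shows the core idea of forming the empirical Fisher from per-sample gradients of the relaxation and applying its damped inverse to the mean gradient.

```python
# Sketch (illustrative, not the paper's exact method): precondition the
# gradient of a continuous relaxation with its damped empirical Fisher,
# which rescales directions whose raw gradients vanish or explode.
import torch

torch.manual_seed(0)

def relaxed_ranking_loss(scores, target_ranks, tau=1.0):
    # Hypothetical smooth stand-in for a ranking objective: a sigmoid
    # relaxation of pairwise rank comparisons (illustrative only).
    diffs = scores.unsqueeze(-1) - scores.unsqueeze(-2)     # (B, n, n)
    soft_ranks = torch.sigmoid(diffs / tau).sum(-1)         # (B, n)
    return ((soft_ranks - target_ranks) ** 2).mean(dim=-1)  # per-sample loss

# Toy batch: scores produced by some model, fixed target ranks.
B, n = 8, 5
scores = torch.randn(B, n, requires_grad=True)
target_ranks = torch.argsort(torch.rand(B, n), dim=-1).float()

losses = relaxed_ranking_loss(scores, target_ranks)

# Per-sample gradients w.r.t. the relaxation's inputs (the predictions),
# not the network weights, which keeps the Fisher matrix small (n x n).
grads = []
for i in range(B):
    g, = torch.autograd.grad(losses[i], scores, retain_graph=True)
    grads.append(g[i].reshape(-1))
G = torch.stack(grads)                      # (B, n)

# Empirical Fisher: average outer product of per-sample gradients.
fisher = G.t() @ G / B                      # (n, n)

# Tikhonov-damped Fisher preconditioning of the mean gradient.
mean_grad = G.mean(dim=0)
damping = 1e-3
precond_grad = torch.linalg.solve(fisher + damping * torch.eye(n), mean_grad)

print("raw grad norm:    ", mean_grad.norm().item())
print("precond grad norm:", precond_grad.norm().item())
```

The damping term keeps the linear solve well-posed: the empirical Fisher is a sum of B rank-one terms and is therefore singular whenever the batch size is smaller than the dimension being preconditioned. An analogous sketch would replace the Fisher with the Hessian of the relaxation.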
