
Fourier-transform-based attribution priors improve the interpretability and stability of deep learning models for genomics
Alex Tseng · Avanti Shrikumar · Anshul Kundaje

Wed Dec 09 09:00 PM -- 11:00 PM (PST) @ Poster Session 4 #1231

Deep learning models can accurately map genomic DNA sequences to associated functional molecular readouts such as protein-DNA binding data. Base-resolution importance (i.e., "attribution") scores inferred from these models can highlight predictive sequence motifs and syntax. Unfortunately, these models are prone to overfitting and are sensitive to random initializations, often resulting in noisy and irreproducible attributions that obscure the underlying motifs. To address these shortcomings, we propose a novel attribution prior, in which the Fourier transform of the input-level attribution scores is computed at training time and the high-frequency components of the Fourier spectrum are penalized. We evaluate different model architectures with and without our attribution prior, training on genome-wide binary labels or continuous molecular profiles. We show that our attribution prior significantly improves models' stability, interpretability, and performance on held-out data, especially when training data is severely limited. Our attribution prior also allows models to identify biologically meaningful sequence motifs more sensitively and precisely within individual regulatory elements. The prior is agnostic to the model architecture and to the predicted experimental assay, yet provides similar gains across all experiments. This work represents an important advance in improving the reliability of deep learning models for deciphering the regulatory code of the genome.
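The core idea of penalizing high-frequency components of an attribution track can be illustrated with a small sketch. This is not the authors' implementation: the function names, the hard frequency cutoff, and the normalization by total non-DC magnitude are all assumptions made for illustration, and the abstract does not specify how the penalty is weighted or how attributions are computed. In real training the attributions would come from the model and the transform would need to be differentiable; here a naive standard-library DFT is applied to a fixed list of scores.

```python
import cmath
import math


def dft_magnitudes(signal):
    """Magnitudes of the non-negative-frequency DFT components (naive O(n^2) DFT)."""
    n = len(signal)
    mags = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal)
        )
        mags.append(abs(coeff))
    return mags


def high_frequency_penalty(attributions, cutoff):
    """Fraction of non-DC spectral magnitude at frequency indices >= cutoff.

    `cutoff` (hypothetical parameter, must be >= 1) is the first frequency
    index treated as "high frequency".
    """
    mags = dft_magnitudes(attributions)[1:]  # drop the DC component (index 0)
    total = sum(mags)
    if total == 0.0:
        return 0.0  # an all-zero track has no spectrum to penalize
    high = sum(mags[cutoff - 1:])  # mags[i] holds frequency index i + 1
    return high / total


# A smooth track (one slow oscillation) versus a maximally jagged one.
smooth = [math.sin(2 * math.pi * t / 16) for t in range(16)]
jagged = [(-1.0) ** t for t in range(16)]

smooth_penalty = high_frequency_penalty(smooth, cutoff=4)
jagged_penalty = high_frequency_penalty(jagged, cutoff=4)
print(smooth_penalty, jagged_penalty)
```

Under these assumptions the smooth track incurs a penalty near zero and the alternating track a penalty near one. A term of this kind, added to the usual prediction loss, would push a model toward attributions concentrated in low frequencies.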

Author Information

Alex Tseng (Stanford University)
Avanti Shrikumar (Stanford University)
Anshul Kundaje (Stanford University)

More from the Same Authors

  • 2021 Workshop: Learning Meaningful Representations of Life (LMRL) »
    Elizabeth Wood · Adji Bousso Dieng · Aleksandrina Goeva · Anshul Kundaje · Barbara Engelhardt · Chang Liu · David Van Valen · Debora Marks · Edward Boyden · Eli N Weinstein · Lorin Crawford · Mor Nitzan · Romain Lopez · Tamara Broderick · Ray Jones · Wouter Boomsma · Yixin Wang
  • 2020 Workshop: Learning Meaningful Representations of Life (LMRL.org) »
    Elizabeth Wood · Debora Marks · Ray Jones · Adji Bousso Dieng · Alan Aspuru-Guzik · Anshul Kundaje · Barbara Engelhardt · Chang Liu · Edward Boyden · Kresten Lindorff-Larsen · Mor Nitzan · Smita Krishnaswamy · Wouter Boomsma · Yixin Wang · David Van Valen · Orr Ashenberg
  • 2016 Poster: Unsupervised Learning from Noisy Networks with Applications to Hi-C Data »
    Bo Wang · Junjie Zhu · Armin Pourshafeie · Oana Ursu · Serafim Batzoglou · Anshul Kundaje
  • 2014 Workshop: Machine Learning in Computational Biology »
    Oliver Stegle · Sara Mostafavi · Anna Goldenberg · Su-In Lee · Michael Leung · Anshul Kundaje · Mark B Gerstein · Martin Renqiang Min · Hannes Bretschneider · Francesco Paolo Casale · Loïc Schwaller · Amit G Deshwar · Benjamin A Logsdon · Yuanyang Zhang · Ali Punjani · Derek C Aguiar · Samuel Kaski