Poster
Reverse-Complement Equivariant Networks for DNA Sequences
Vincent Mallet · Jean-Philippe Vert
As DNA sequencing technologies keep improving in scale and cost, there is a growing need to develop machine learning models to analyze DNA sequences, e.g., to decipher regulatory signals from DNA fragments bound by a particular protein of interest. As a double helix made of two complementary strands, a DNA fragment can be sequenced as two equivalent, so-called reverse complement (RC) sequences of nucleotides. Taking this inherent symmetry of the data into account in machine learning models can facilitate learning. To this end, several authors have recently proposed particular RC-equivariant convolutional neural networks (CNNs). However, it remains unknown whether other RC-equivariant architectures exist, which could potentially increase the set of basic models adapted to DNA sequences for practitioners. Here, we close this gap by characterizing the set of all linear RC-equivariant layers, and show in particular that new architectures exist beyond the ones already explored. We further discuss RC-equivariant pointwise nonlinearities adapted to different architectures, as well as RC-equivariant embeddings of $k$-mers as an alternative to one-hot encoding of nucleotides. We show experimentally that the new architectures can outperform existing ones.
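To make the RC symmetry concrete, here is a minimal Python sketch. It is not the authors' code: the function names, the A, C, G, T channel ordering, and the toy motif score are all assumptions chosen for illustration. It shows that, with this channel ordering, taking the reverse complement of a one-hot encoded sequence amounts to flipping both the position axis and the channel axis, and that averaging a score over both strands gives an RC-invariant prediction. Averaging is only the simplest way to respect the symmetry; it is not one of the equivariant layers characterized in the paper.

```python
# Illustrative sketch only; names and channel ordering are assumptions.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
ALPHABET = "ACGT"  # with this ordering, complementing reverses the channel index


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def one_hot(seq):
    """Encode a DNA string as a list of 4-dimensional one-hot rows."""
    return [[1 if base == letter else 0 for letter in ALPHABET] for base in seq]


def rc_action(encoded):
    """Apply the RC action to a one-hot encoding: flip positions and channels."""
    return [row[::-1] for row in encoded[::-1]]


def rc_invariant(score):
    """Wrap any sequence score so that it gives the same value on both strands."""
    def wrapped(seq):
        return 0.5 * (score(seq) + score(reverse_complement(seq)))
    return wrapped


def toy_motif_score(seq):
    """Hypothetical strand-dependent score: occurrences of the motif 'AAC'."""
    return seq.count("AAC")


seq = "AACGTTAAC"
symmetric_score = rc_invariant(toy_motif_score)
print(reverse_complement(seq))
print(one_hot(reverse_complement(seq)) == rc_action(one_hot(seq)))
print(symmetric_score(seq) == symmetric_score(reverse_complement(seq)))
```

An equivariant layer goes one step further than this wrapper: rather than averaging outputs, it constrains the weights so that applying the RC action to the input produces a known, corresponding action on the layer's output.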
Author Information
Vincent Mallet (Pasteur Institute)
Jean-Philippe Vert (Google)
More from the Same Authors
- 2021 Poster: Framing RNN as a kernel method: A neural ODE approach
  Adeline Fermanian · Pierre Marion · Jean-Philippe Vert · Gérard Biau
- 2021 Oral: Framing RNN as a kernel method: A neural ODE approach
  Adeline Fermanian · Pierre Marion · Jean-Philippe Vert · Gérard Biau
- 2018 Poster: Relating Leverage Scores and Density using Regularized Christoffel Functions
  Edouard Pauwels · Francis Bach · Jean-Philippe Vert
- 2015: Learning from Rankings
  Jean-Philippe Vert
- 2014 Poster: Tight convex relaxations for sparse matrix factorization
  Emile Richard · Guillaume R Obozinski · Jean-Philippe Vert
- 2013 Workshop: Machine Learning in Computational Biology
  Jean-Philippe Vert · Anna Goldenberg · Sara Mostafavi · Oliver Stegle
- 2012 Workshop: Machine Learning in Computational Biology
  Jean-Philippe Vert · Anna Goldenberg · Christina Leslie
- 2012 Session: Oral Session 9
  Jean-Philippe Vert
- 2011 Workshop: Machine Learning in Computational Biology
  Jean-Philippe Vert · Gunnar Rätsch · Yanjun Qi · Tomer Hertz · Anna Goldenberg · Christina Leslie
- 2010 Workshop: Machine Learning in Computational Biology
  Gunnar Rätsch · Jean-Philippe Vert · Tomer Hertz · Yanjun Qi
- 2010 Poster: Fast detection of multiple change-points shared by many signals using group LARS
  Jean-Philippe Vert · Kevin Bleakley
- 2009 Workshop: Temporal Segmentation: Perspectives from Statistics, Machine Learning, and Signal Processing
  Stephane Canu · Olivier Cappé · Arthur Gretton · Zaid Harchaoui · Alain Rakotomamonjy · Jean-Philippe Vert
- 2009 Workshop: Machine Learning in Computational Biology
  Gal Chechik · Tomer Hertz · William S Noble · Yanjun Qi · Jean-Philippe Vert · Alexander Zien
- 2009 Mini Symposium: Machine Learning in Computational Biology
  Yanjun Qi · Jean-Philippe Vert · Gal Chechik · Alexander Zien · Tomer Hertz · William S Noble
- 2009 Poster: White Functionals for Anomaly Detection in Dynamical Systems
  Marco Cuturi · Jean-Philippe Vert · Alexandre d'Aspremont
- 2008 Poster: Clustered Multi-Task Learning: A Convex Formulation
  Laurent Jacob · Francis Bach · Jean-Philippe Vert
- 2008 Spotlight: Clustered Multi-Task Learning: A Convex Formulation
  Laurent Jacob · Francis Bach · Jean-Philippe Vert