Timezone: »

Scalable imputation of genetic data with a discrete fragmentation-coagulation process
Lloyd T Elliott · Yee Whye Teh

Mon Dec 03 07:00 PM -- 12:00 AM (PST) @ Harrah’s Special Events Center 2nd Floor

We present a Bayesian nonparametric model for genetic sequence data in which a set of genetic sequences is modelled using a Markov model of partitions. The partitions at consecutive locations in the genome are related by their clusters first splitting and then merging. Our model can be thought of as a discrete time analogue of continuous time fragmentation-coagulation processes [Teh et al 2011], preserving the important properties of projectivity, exchangeability and reversibility, while being more scalable. We apply this model to the problem of genotype imputation, showing improved computational efficiency while maintaining the same accuracies as in [Teh et al 2011].

Author Information

Lloyd T Elliott (University of Oxford)
Yee Whye Teh (University of Oxford, DeepMind)

I am a Professor of Statistical Machine Learning at the Department of Statistics, University of Oxford and a Research Scientist at DeepMind. I am also an Alan Turing Institute Fellow and a European Research Council Consolidator Fellow. I obtained my Ph.D. at the University of Toronto (working with Geoffrey Hinton), and did postdoctoral work at the University of California at Berkeley (with Michael Jordan) and National University of Singapore (as Lee Kuan Yew Postdoctoral Fellow). I was a Lecturer then a Reader at the Gatsby Computational Neuroscience Unit, UCL, and a tutorial fellow at University College Oxford, prior to my current appointment. I am interested in the statistical and computational foundations of intelligence, and works on scalable machine learning, probabilistic models, Bayesian nonparametrics and deep learning. I was programme co-chair of ICML 2017 and AISTATS 2010.

More from the Same Authors