We consider the problem of learning a low-dimensional representation for compositional data. Compositional data consist of collections of nonnegative components that sum to a constant. Because the parts of such a collection are statistically dependent, many standard tools cannot be applied directly; the data must first be transformed before analysis. Focusing on principal component analysis (PCA), we propose an approach that learns low-dimensional representations directly from the original data. Our approach combines the benefits of the log-ratio transformation from compositional data analysis with exponential family PCA. A key tool in its derivation is a generalization of the scaled Bregman theorem, which relates the perspective transform of a Bregman divergence to the Bregman divergence of a perspective transform plus a remainder conformal divergence. Our approach includes a convenient surrogate (upper-bound) loss for exponential family PCA that is easy to optimize. We also derive the corresponding loss for nonlinear autoencoders. Experiments on simulated data and microbiome data show the promise of our method.
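The classical pipeline the abstract contrasts against — transform the compositional data first, then apply a standard method — can be sketched with the centered log-ratio (CLR) transform followed by PCA. This is a minimal illustration of that baseline, not the paper's proposed method; the function names and the pseudocount `eps` are illustrative assumptions.

```python
import numpy as np

def clr(X, eps=1e-6):
    """Centered log-ratio transform for compositional rows of X.

    Each row is nonnegative and sums to a constant; a small pseudocount
    eps (an assumption here) handles zeros, which the log cannot take.
    """
    logX = np.log(X + eps)
    # Subtracting the row-wise mean makes each transformed row sum to zero.
    return logX - logX.mean(axis=1, keepdims=True)

def pca_scores(Z, k):
    """Project rows of Z onto their top-k principal components via SVD."""
    Zc = Z - Z.mean(axis=0)
    U, S, Vt = np.linalg.svd(Zc, full_matrices=False)
    return Zc @ Vt[:k].T

# Toy compositional data: 5 samples over 4 parts, each row summing to 1.
rng = np.random.default_rng(0)
X = rng.dirichlet(np.ones(4), size=5)
scores = pca_scores(clr(X), k=2)
print(scores.shape)  # (5, 2)
```

The dependence the abstract mentions is visible here: the CLR output rows sum to zero by construction, so the transformed data live in a lower-dimensional subspace, which is why a transform is needed before standard PCA applies.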
Author Information
Marta Avalos (INRIA, INSERM U1219, University of Bordeaux)
Marta Avalos, Ph.D., has been an Associate Professor of Biostatistics at the Bordeaux School of Public Health, University of Bordeaux, France, since 2005, and has led the interdisciplinary first year of the Master of Public Health since 2011. She is a member of the research team "Statistics In Systems Biology and Translational Medicine" (SISTM), joint between the French national research institutes for computer science and automation (INRIA) and for health and medical research (INSERM). Marta received her Ph.D. in Information and Systems Technologies from the University of Technology of Compiègne. She completed a Master's degree in Public Health at Paris-Sud University and a Bachelor's degree in Mathematics at the University of Barcelona, Spain. Her work focuses on developing and integrating innovative statistical approaches, particularly Lasso-type regularization methods, to advance population health.
Richard Nock (Data61, the Australian National University and the University of Sydney)
Cheng Soon Ong (Data61 and ANU)
Cheng Soon Ong is a principal research scientist at the Machine Learning Research Group, Data61, CSIRO, and is the director of the machine learning and artificial intelligence future science platform at CSIRO. He is also an adjunct associate professor at the Australian National University. He is interested in enabling scientific discovery by extending statistical machine learning methods.
Julien Rouar (University of Bordeaux)
Ke Sun (Data61, CSIRO)
More from the Same Authors
- 2021: Gaussian Process Bandits with Aggregated Feedback » Mengyan Zhang · Russell Tsuchida · Cheng Soon Ong
- 2021: Factorized Fourier Neural Operators » Alasdair Tran · Alexander Mathews · Lexing Xie · Cheng Soon Ong
- 2022: When are equilibrium networks scoring algorithms? » Russell Tsuchida · Cheng Soon Ong
- 2023 Poster: Squared Neural Families: A New Class of Tractable Density Models » Russell Tsuchida · Cheng Soon Ong · Dino Sejdinovic
- 2023 Poster: Boosting with Tempered Exponential Measures » Richard Nock · Ehsan Amid · Manfred Warmuth
- 2022 Poster: Fair Wrapping for Black-box Predictions » Alexander Soen · Ibrahim Alabdulmohsin · Sanmi Koyejo · Yishay Mansour · Nyalleng Moorosi · Richard Nock · Ke Sun · Lexing Xie
- 2021 Poster: Contrastive Laplacian Eigenmaps » Hao Zhu · Ke Sun · Peter Koniusz
- 2021 Poster: On the Variance of the Fisher Information for Deep Learning » Alexander Soen · Ke Sun
- 2020 Tutorial: (Track1) There and Back Again: A Tale of Slopes and Expectations » Marc Deisenroth · Cheng Soon Ong
- 2019 Poster: Disentangled behavioural representations » Amir Dezfouli · Hassan Ashtiani · Omar Ghattas · Richard Nock · Peter Dayan · Cheng Soon Ong
- 2019 Poster: A Primal-Dual link between GANs and Autoencoders » Hisham Husain · Richard Nock · Robert Williamson
- 2017 Poster: f-GANs in an Information Geometric Nutshell » Richard Nock · Zac Cranko · Aditya K Menon · Lizhen Qu · Robert Williamson
- 2017 Spotlight: f-GANs in an Information Geometric Nutshell » Richard Nock · Zac Cranko · Aditya K Menon · Lizhen Qu · Robert Williamson
- 2016 Poster: A scaled Bregman theorem with applications » Richard Nock · Aditya Menon · Cheng Soon Ong
- 2016 Poster: On Regularizing Rademacher Observation Losses » Richard Nock
- 2015 Workshop: Learning and privacy with incomplete data and weak supervision » Giorgio Patrini · Tony Jebara · Richard Nock · Dimitrios Kotzias · Felix Xinnan Yu
- 2014 Poster: (Almost) No Label No Cry » Giorgio Patrini · Richard Nock · Tiberio Caetano · Paul Rivera
- 2014 Spotlight: (Almost) No Label No Cry » Giorgio Patrini · Richard Nock · Tiberio Caetano · Paul Rivera
- 2013 Workshop: Machine Learning Open Source Software: Towards Open Workflows » Antti Honkela · Cheng Soon Ong
- 2011 Poster: Contextual Gaussian Process Bandit Optimization » Andreas Krause · Cheng Soon Ong
- 2010 Workshop: New Directions in Multiple Kernel Learning » Marius Kloft · Ulrich Rueckert · Cheng Soon Ong · Alain Rakotomamonjy · Soeren Sonnenburg · Francis Bach
- 2010 Demonstration: mldata.org - machine learning data and benchmark » Cheng Soon Ong
- 2008 Workshop: Machine Learning Open Source Software » Soeren Sonnenburg · Mikio L Braun · Cheng Soon Ong
- 2008 Poster: On the Efficient Minimization of Classification Calibrated Surrogates » Richard Nock · Frank NIELSEN
- 2008 Spotlight: On the Efficient Minimization of Classification Calibrated Surrogates » Richard Nock · Frank NIELSEN