In many natural settings, the analysis goal is not to characterize a single data set in isolation, but rather to understand the difference between one set of observations and another. For example, given a background corpus of news articles together with writings of a particular author, one may want a topic model that explains word patterns and themes specific to the author. Another example comes from genomics, in which biological signals may be collected from different regions of a genome, and one wants a model that captures the differential statistics observed in these regions. This paper formalizes this notion of contrastive learning for mixture models, and develops spectral algorithms for inferring mixture components specific to a foreground data set when contrasted with a background data set. The method builds on recent moment-based estimators and tensor decompositions for latent variable models, and has the intuitive feature of using background data statistics to appropriately modify moments estimated from foreground data. A key advantage of the method is that the background data need only be coarsely modeled, which is important when the background is too complex, noisy, or not of interest. The method is demonstrated on applications in contrastive topic modeling and genomic sequence analysis.
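As a rough illustration of the idea of using background statistics to modify foreground moments, the following is a minimal sketch that uses second-order moments only. It is not the paper's algorithm, which also relies on higher-order moments and tensor decompositions. The function names, the contrast weight `gamma`, and the toy data are assumptions made for this example.

```python
import numpy as np

def second_moment(X):
    """Empirical second moment E[x x^T], estimated from the rows of X."""
    X = np.asarray(X, dtype=float)
    return X.T @ X / X.shape[0]

def contrastive_directions(X_fg, X_bg, gamma=1.0, k=1):
    """Top-k eigenpairs of M2_fg - gamma * M2_bg (hypothetical helper).

    gamma is an assumed contrast weight controlling how strongly the
    background statistics are subtracted from the foreground moment.
    """
    M = second_moment(X_fg) - gamma * second_moment(X_bg)
    eigvals, eigvecs = np.linalg.eigh(M)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]     # indices of the k largest
    return eigvals[order], eigvecs[:, order]

# Toy data: the background varies only along axis 0; the foreground
# shares that variation and adds weaker variation along axis 1.
X_bg = np.array([[3.0, 0.0, 0.0], [-3.0, 0.0, 0.0]])
X_fg = np.array([[3.0, 0.0, 0.0], [-3.0, 0.0, 0.0],
                 [0.0, 2.0, 0.0], [0.0, -2.0, 0.0]])

vals, vecs = contrastive_directions(X_fg, X_bg, gamma=1.0, k=1)
```

In this toy case the largest second moment of the foreground alone lies along axis 0, while the leading direction of the modified moment lies along axis 1, the direction specific to the foreground.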
Author Information
James Y Zou (Microsoft Research)
Daniel Hsu (Columbia University)
See <https://www.cs.columbia.edu/~djhsu/>
David Parkes (Harvard University)
David C. Parkes is Gordon McKay Professor of Computer Science in the School of Engineering and Applied Sciences at Harvard University. He was the recipient of the NSF Career Award, the Alfred P. Sloan Fellowship, the Thouron Scholarship and the Harvard University Roslyn Abramson Award for Teaching. Parkes received his Ph.D. degree in Computer and Information Science from the University of Pennsylvania in 2001, and an M.Eng. (First class) in Engineering and Computing Science from Oxford University in 1995. At Harvard, Parkes leads the EconCS group and teaches classes in artificial intelligence, optimization, and topics at the intersection between computer science and economics. Parkes has served as Program Chair of ACM EC’07 and AAMAS’08 and General Chair of ACM EC’10, served on the editorial board of Journal of Artificial Intelligence Research, and currently serves as Editor of Games and Economic Behavior and on the boards of Journal of Autonomous Agents and Multi-agent Systems and INFORMS Journal of Computing. His research interests include computational mechanism design, electronic commerce, stochastic optimization, preference elicitation, market design, bounded rationality, computational social choice, networks and incentives, multi-agent systems, crowd-sourcing and social computing.
Ryan Adams (Princeton University)
More from the Same Authors
-
2020 : Biased Programmers? Or Biased Data? A Field Experiment in Operationalizing AI Ethics »
Bo Cowgill · Fabrizio Dell'Acqua · Augustin Chaintreau · Nakul Verma · Samuel Deng · Daniel Hsu -
2021 Spotlight: Slice Sampling Reparameterization Gradients »
David Zoltowski · Diana Cai · Ryan Adams -
2021 Spotlight: Amortized Synthesis of Constrained Configurations Using a Differentiable Surrogate »
Xingyuan Sun · Tianju Xue · Szymon Rusinkiewicz · Ryan Adams -
2021 Spotlight: Bayesian decision-making under misspecified priors with applications to meta-learning »
Max Simchowitz · Christopher Tosh · Akshay Krishnamurthy · Daniel Hsu · Thodoris Lykouris · Miro Dudik · Robert Schapire -
2021 : ProBF: Probabilistic Safety Certificates with Barrier Functions »
Sulin Liu · Athindran Ramesh Kumar · Jaime Fisac · Ryan Adams · Peter J. Ramadge -
2021 : Reading the Road: Leveraging Meta-Learning to Learn Other Driver Behavior »
Anat Kleiman · Ryan Adams -
2021 : Deep Reinforcement Learning Explanation via Model Transforms »
Sarah Keren · Yoav Kolumbus · Jeffrey S Rosenschein · David Parkes · Mira Finkelstein -
2022 : Predictive Multiplicity in Probabilistic Classification »
Jamelle Watson-Daniels · David Parkes · Berk Ustun -
2022 : A code superoptimizer through neural Monte-Carlo tree search »
Wenda Zhou · Olga Solodova · Ryan Adams -
2022 : Learning to Mitigate AI Collusion on E-Commerce Platforms »
Eric Mibuari · Gianluca Brero · David Parkes · Nicolas Lepore -
2022 : Meta-RL for Multi-Agent RL: Learning to Adapt to Evolving Agents »
Matthias Gerstgrasser · David Parkes -
2023 Poster: Data Market Design through Deep Learning »
Sai Srivatsa Ravindranath · Yanchen Jiang · David Parkes -
2023 Poster: Deep Contract Design via Discontinuous Piecewise Affine Neural Networks »
Tonghan Wang · Paul Duetting · Dmitry Ivanov · Inbal Talgam-Cohen · David Parkes -
2023 Poster: Representational Strengths and Limitations of Transformers »
Clayton Sanford · Daniel Hsu · Matus Telgarsky -
2022 Spotlight: Lightning Talks 5A-2 »
Qiang LI · Zhiwei Xu · Jia-Qi Yang · Thai Hung Le · Haoxuan Qu · Yang Li · Artyom Sorokin · Peirong Zhang · Mira Finkelstein · Nitsan levy · Chung-Yiu Yau · dapeng li · Thommen Karimpanal George · De-Chuan Zhan · Nazar Buzun · Jiajia Jiang · Li Xu · Yichuan Mo · Yujun Cai · Yuliang Liu · Leonid Pugachev · Bin Zhang · Lucy Liu · Hoi-To Wai · Liangliang Shi · Majid Abdolshah · Yoav Kolumbus · Lin Geng Foo · Junchi Yan · Mikhail Burtsev · Lianwen Jin · Yuan Zhan · Dung Nguyen · David Parkes · Yunpeng Baiia · Jun Liu · Kien Do · Guoliang Fan · Jeffrey S Rosenschein · Sunil Gupta · Sarah Keren · Svetha Venkatesh -
2022 Spotlight: Explainable Reinforcement Learning via Model Transforms »
Mira Finkelstein · Nitsan levy · Lucy Liu · Yoav Kolumbus · David Parkes · Jeffrey S Rosenschein · Sarah Keren -
2022 Poster: Explainable Reinforcement Learning via Model Transforms »
Mira Finkelstein · Nitsan levy · Lucy Liu · Yoav Kolumbus · David Parkes · Jeffrey S Rosenschein · Sarah Keren -
2022 Poster: Multi-fidelity Monte Carlo: a pseudo-marginal approach »
Diana Cai · Ryan Adams -
2022 Poster: Masked Prediction: A Parameter Identifiability View »
Bingbin Liu · Daniel Hsu · Pradeep Ravikumar · Andrej Risteski -
2022 Poster: Learning to Mitigate AI Collusion on Economic Platforms »
Gianluca Brero · Eric Mibuari · Nicolas Lepore · David Parkes -
2021 : Randomized Automatic Differentiation - Ryan Adams - Princeton University »
Ryan Adams -
2021 Workshop: Learning in Presence of Strategic Behavior »
Omer Ben-Porat · Nika Haghtalab · Annie Liang · Yishay Mansour · David Parkes -
2021 Poster: Slice Sampling Reparameterization Gradients »
David Zoltowski · Diana Cai · Ryan Adams -
2021 Poster: Support vector machines and linear regression coincide with very high-dimensional features »
Navid Ardeshir · Clayton Sanford · Daniel Hsu -
2021 Poster: Amortized Synthesis of Constrained Configurations Using a Differentiable Surrogate »
Xingyuan Sun · Tianju Xue · Szymon Rusinkiewicz · Ryan Adams -
2021 Poster: Why Generalization in RL is Difficult: Epistemic POMDPs and Implicit Partial Observability »
Dibya Ghosh · Jad Rahme · Aviral Kumar · Amy Zhang · Ryan Adams · Sergey Levine -
2021 Poster: Bayesian decision-making under misspecified priors with applications to meta-learning »
Max Simchowitz · Christopher Tosh · Akshay Krishnamurthy · Daniel Hsu · Thodoris Lykouris · Miro Dudik · Robert Schapire -
2020 : Orals 1.1: Randomized Automatic Differentiation »
Deniz Oktay · Nick McGreivy · Alex Beatson · Ryan Adams -
2020 Workshop: Machine Learning for Engineering Modeling, Simulation and Design »
Alex Beatson · Priya Donti · Amira Abdel-Rahman · Stephan Hoyer · Rose Yu · J. Zico Kolter · Ryan Adams -
2020 Workshop: Machine Learning for Economic Policy »
Stephan Zheng · Alexander Trott · Annie Liang · Jamie Morgenstern · David Parkes · Nika Haghtalab -
2020 Poster: On Warm-Starting Neural Network Training »
Jordan Ash · Ryan Adams -
2020 Poster: Task-Agnostic Amortized Inference of Gaussian Process Hyperparameters »
Sulin Liu · Xingyuan Sun · Peter J. Ramadge · Ryan Adams -
2020 Poster: Learning Composable Energy Surrogates for PDE Order Reduction »
Alex Beatson · Jordan Ash · Geoffrey Roeder · Tianju Xue · Ryan Adams -
2020 Poster: From Predictions to Decisions: Using Lookahead Regularization »
Nir Rosenfeld · Anna Hilgard · Sai Srivatsa Ravindranath · David Parkes -
2020 Poster: Ensuring Fairness Beyond the Training Data »
Debmalya Mandal · Samuel Deng · Suman Jana · Jeannette Wing · Daniel Hsu -
2020 Oral: Learning Composable Energy Surrogates for PDE Order Reduction »
Alex Beatson · Jordan Ash · Geoffrey Roeder · Tianju Xue · Ryan Adams -
2019 Poster: SpArSe: Sparse Architecture Search for CNNs on Resource-Constrained Microcontrollers »
Igor Fedorov · Ryan Adams · Matthew Mattina · Paul Whatmough -
2019 Poster: Discrete Object Generation with Reversible Inductive Construction »
Ari Seff · Wenda Zhou · Farhan Damani · Abigail Doyle · Ryan Adams -
2019 Poster: Finding Friend and Foe in Multi-Agent Games »
Jack Serrino · Max Kleiman-Weiner · David Parkes · Josh Tenenbaum -
2019 Spotlight: Finding Friend and Foe in Multi-Agent Games »
Jack Serrino · Max Kleiman-Weiner · David Parkes · Josh Tenenbaum -
2019 Poster: On the number of variables to use in principal component regression »
Ji Xu · Daniel Hsu -
2018 : Discussion Panel: Ryan Adams, Nicolas Heess, Leslie Kaelbling, Shie Mannor, Emo Todorov (moderator: Roy Fox) »
Ryan Adams · Nicolas Heess · Leslie Kaelbling · Shie Mannor · Emo Todorov · Roy Fox -
2018 : Inference and Control of Learning Behavior in Rodents (Ryan Adams) »
Ryan Adams -
2018 Poster: Overfitting or perfect fitting? Risk bounds for classification and regression rules that interpolate »
Mikhail Belkin · Daniel Hsu · Partha P Mitra -
2018 Poster: A Bayesian Nonparametric View on Count-Min Sketch »
Diana Cai · Michael Mitzenmacher · Ryan Adams -
2018 Poster: Benefits of over-parameterization with EM »
Ji Xu · Daniel Hsu · Arian Maleki -
2018 Poster: Leveraged volume sampling for linear regression »
Michal Derezinski · Manfred K. Warmuth · Daniel Hsu -
2018 Spotlight: Leveraged volume sampling for linear regression »
Michal Derezinski · Manfred K. Warmuth · Daniel Hsu -
2017 : Optimal Economic Design through Deep Learning »
David Parkes -
2017 Poster: PASS-GLM: polynomial approximate sufficient statistics for scalable Bayesian GLM inference »
Jonathan Huggins · Ryan Adams · Tamara Broderick -
2017 Poster: Multi-View Decision Processes: The Helper-AI Problem »
Christos Dimitrakakis · David Parkes · Goran Radanovic · Paul Tylkin -
2017 Spotlight: PASS-GLM: polynomial approximate sufficient statistics for scalable Bayesian GLM inference »
Jonathan Huggins · Ryan Adams · Tamara Broderick -
2017 Poster: Linear regression without correspondence »
Daniel Hsu · Kevin Shi · Xiaorui Sun -
2017 Poster: Reducing Reparameterization Gradient Variance »
Andrew Miller · Nick Foti · Alexander D'Amour · Ryan Adams -
2016 : Panel Discussion »
Shakir Mohamed · David Blei · Ryan Adams · José Miguel Hernández-Lobato · Ian Goodfellow · Yarin Gal -
2016 : A Tribute to David MacKay »
Ryan Adams -
2016 Workshop: Bayesian Optimization: Black-box Optimization and Beyond »
Roberto Calandra · Bobak Shahriari · Javier Gonzalez · Frank Hutter · Ryan Adams -
2016 Workshop: Machine Learning in Computational Biology »
Gerald Quon · Sara Mostafavi · James Y Zou · Barbara Engelhardt · Oliver Stegle · Nicolo Fusi -
2016 : Leveraging Structure in Bayesian Optimization »
Ryan Adams -
2016 Poster: Long-term Causal Effects via Behavioral Game Theory »
Panagiotis Toulis · David Parkes -
2016 Poster: Bayesian latent structure discovery from multi-neuron recordings »
Scott Linderman · Ryan Adams · Jonathan Pillow -
2016 Poster: Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings »
Tolga Bolukbasi · Kai-Wei Chang · James Y Zou · Venkatesh Saligrama · Adam T Kalai -
2016 Poster: Global Analysis of Expectation Maximization for Mixtures of Two Gaussians »
Ji Xu · Daniel Hsu · Arian Maleki -
2016 Oral: Global Analysis of Expectation Maximization for Mixtures of Two Gaussians »
Ji Xu · Daniel Hsu · Arian Maleki -
2016 Poster: Composing graphical models with neural networks for structured representations and fast inference »
Matthew Johnson · David Duvenaud · Alex Wiltschko · Ryan Adams · Sandeep R Datta -
2016 Poster: Search Improves Label for Active Learning »
Alina Beygelzimer · Daniel Hsu · John Langford · Chicheng Zhang -
2015 Workshop: Bayesian Optimization: Scalability and Flexibility »
Bobak Shahriari · Ryan Adams · Nando de Freitas · Amar Shah · Roberto Calandra -
2015 : Discovering Salient Features via Adaptively Chosen Comparisons »
James Y Zou -
2015 Workshop: Statistical Methods for Understanding Neural Systems »
Alyson Fletcher · Jakob H Macke · Ryan Adams · Jascha Sohl-Dickstein -
2015 Poster: Mixing Time Estimation in Reversible Markov Chains from a Single Sample Path »
Daniel Hsu · Aryeh Kontorovich · Csaba Szepesvari -
2015 Poster: Convolutional Networks on Graphs for Learning Molecular Fingerprints »
David Duvenaud · Dougal Maclaurin · Jorge Iparraguirre · Rafael Bombarell · Timothy Hirzel · Alan Aspuru-Guzik · Ryan Adams -
2015 Poster: A Gaussian Process Model of Quasar Spectral Energy Distributions »
Andrew Miller · Albert Wu · Jeffrey Regier · Jon McAuliffe · Dustin Lang · Mr. Prabhat · David Schlegel · Ryan Adams -
2015 Poster: Learnability of Influence in Networks »
Harikrishna Narasimhan · David Parkes · Yaron Singer -
2015 Poster: Efficient and Parsimonious Agnostic Active Learning »
Tzu-Kuo Huang · Alekh Agarwal · Daniel Hsu · John Langford · Robert Schapire -
2015 Spotlight: Efficient and Parsimonious Agnostic Active Learning »
Tzu-Kuo Huang · Alekh Agarwal · Daniel Hsu · John Langford · Robert Schapire -
2015 Poster: Spectral Representations for Convolutional Neural Networks »
Oren Rippel · Jasper Snoek · Ryan Adams -
2015 Poster: Dependent Multinomial Models Made Easy: Stick-Breaking with the Polya-gamma Augmentation »
Scott Linderman · Matthew Johnson · Ryan Adams -
2014 Workshop: NIPS’14 Workshop on Crowdsourcing and Machine Learning »
David Parkes · Denny Zhou · Chien-Ju Ho · Nihar Bhadresh Shah · Adish Singla · Jared Heyman · Edwin Simpson · Andreas Krause · Rafael Frongillo · Jennifer Wortman Vaughan · Panagiotis Papadimitriou · Damien Peters -
2014 Workshop: Analysis of Rank Data: Confluence of Social Choice, Operations Research, and Machine Learning »
Shivani Agarwal · Hossein Azari Soufiani · Guy Bresler · Sewoong Oh · David Parkes · Arun Rajkumar · Devavrat Shah -
2014 Workshop: Bayesian Optimization in Academia and Industry »
Zoubin Ghahramani · Ryan Adams · Matthew Hoffman · Kevin Swersky · Jasper Snoek -
2014 Workshop: NIPS Workshop on Transactional Machine Learning and E-Commerce »
David Parkes · David H Wolpert · Jennifer Wortman Vaughan · Jacob D Abernethy · Amos Storkey · Mark Reid · Ping Jin · Nihar Bhadresh Shah · Mehryar Mohri · Luis E Ortiz · Robin Hanson · Aaron Roth · Satyen Kale · Sebastien Lahaie -
2014 Poster: A Statistical Decision-Theoretic Framework for Social Choice »
Hossein Azari Soufiani · David Parkes · Lirong Xia -
2014 Oral: A Statistical Decision-Theoretic Framework for Social Choice »
Hossein Azari Soufiani · David Parkes · Lirong Xia -
2014 Poster: Scalable Non-linear Learning with Adaptive Polynomial Expansions »
Alekh Agarwal · Alina Beygelzimer · Daniel Hsu · John Langford · Matus J Telgarsky -
2014 Poster: The Large Margin Mechanism for Differentially Private Maximization »
Kamalika Chaudhuri · Daniel Hsu · Shuang Song -
2014 Poster: A framework for studying synaptic plasticity with neural spike train data »
Scott Linderman · Christopher H Stock · Ryan Adams -
2013 Workshop: Bayesian Optimization in Theory and Practice »
Matthew Hoffman · Jasper Snoek · Nando de Freitas · Michael A Osborne · Ryan Adams · Sebastien Bubeck · Philipp Hennig · Remi Munos · Andreas Krause -
2013 Workshop: Workshop on Spectral Learning »
Byron Boots · Daniel Hsu · Borja Balle -
2013 Poster: Multi-Task Bayesian Optimization »
Kevin Swersky · Jasper Snoek · Ryan Adams -
2013 Poster: Message Passing Inference with Chemical Reaction Networks »
Nils E Napp · Ryan Adams -
2013 Oral: Message Passing Inference with Chemical Reaction Networks »
Nils E Napp · Ryan Adams -
2013 Poster: A Determinantal Point Process Latent Variable Model for Inhibition in Neural Spiking Data »
Jasper Snoek · Richard Zemel · Ryan Adams -
2013 Poster: Generalized Random Utility Models with Multiple Types »
Hossein Azari Soufiani · Hansheng Diao · Zhenyu Lai · David Parkes -
2013 Poster: When are Overcomplete Topic Models Identifiable? Uniqueness of Tensor Tucker Decompositions with Structured Sparsity »
Anima Anandkumar · Daniel Hsu · Majid Janzamin · Sham M Kakade -
2013 Poster: Generalized Method-of-Moments for Rank Aggregation »
Hossein Azari Soufiani · William Z Chen · David Parkes · Lirong Xia -
2012 Poster: Bayesian n-Choose-k Models for Classification and Ranking »
Kevin Swersky · Danny Tarlow · Richard Zemel · Ryan Adams · Brendan J Frey -
2012 Poster: Learning Mixtures of Tree Graphical Models »
Anima Anandkumar · Daniel Hsu · Furong Huang · Sham M Kakade -
2012 Poster: A Spectral Algorithm for Latent Dirichlet Allocation »
Anima Anandkumar · Dean P Foster · Daniel Hsu · Sham M Kakade · Yi-Kai Liu -
2012 Poster: Identifiability and Unmixing of Latent Parse Trees »
Percy Liang · Sham M Kakade · Daniel Hsu -
2012 Poster: Priors for Diversity in Generative Latent Variable Models »
James Y Zou · Ryan Adams -
2012 Spotlight: A Spectral Algorithm for Latent Dirichlet Allocation »
Anima Anandkumar · Dean P Foster · Daniel Hsu · Sham M Kakade · Yi-Kai Liu -
2012 Poster: Cardinality Restricted Boltzmann Machines »
Kevin Swersky · Danny Tarlow · Ilya Sutskever · Richard Zemel · Russ Salakhutdinov · Ryan Adams -
2012 Poster: Practical Bayesian Optimization of Machine Learning Algorithms »
Jasper Snoek · Hugo Larochelle · Ryan Adams -
2011 Workshop: Bayesian Nonparametric Methods: Hope or Hype? »
Emily Fox · Ryan Adams -
2011 Poster: Stochastic convex optimization with bandit feedback »
Alekh Agarwal · Dean P Foster · Daniel Hsu · Sham M Kakade · Sasha Rakhlin -
2011 Poster: Spectral Methods for Learning Multivariate Latent Tree Structure »
Anima Anandkumar · Kamalika Chaudhuri · Daniel Hsu · Sham M Kakade · Le Song · Tong Zhang -
2010 Workshop: Transfer Learning Via Rich Generative Models. »
Russ Salakhutdinov · Ryan Adams · Josh Tenenbaum · Zoubin Ghahramani · Tom Griffiths -
2010 Workshop: Monte Carlo Methods for Bayesian Inference in Modern Day Applications »
Ryan Adams · Mark A Girolami · Iain Murray -
2010 Oral: Tree-Structured Stick Breaking for Hierarchical Data »
Ryan Adams · Zoubin Ghahramani · Michael Jordan -
2010 Oral: Slice sampling covariance hyperparameters of latent Gaussian models »
Iain Murray · Ryan Adams -
2010 Invited Talk: The Interplay of Machine Learning and Mechanism Design »
David Parkes -
2010 Poster: Tree-Structured Stick Breaking for Hierarchical Data »
Ryan Adams · Zoubin Ghahramani · Michael Jordan -
2010 Poster: Slice sampling covariance hyperparameters of latent Gaussian models »
Iain Murray · Ryan Adams -
2010 Poster: Agnostic Active Learning Without Constraints »
Alina Beygelzimer · Daniel Hsu · John Langford · Tong Zhang -
2009 Poster: A Parameter-free Hedging Algorithm »
Kamalika Chaudhuri · Yoav Freund · Daniel Hsu -
2009 Poster: Multi-Label Prediction via Compressed Sensing »
Daniel Hsu · Sham M Kakade · John Langford · Tong Zhang -
2009 Oral: Multi-Label Prediction via Compressed Sensing »
Daniel Hsu · Sham M Kakade · John Langford · Tong Zhang -
2008 Poster: The Gaussian Process Density Sampler »
Ryan Adams · Iain Murray · David MacKay -
2008 Spotlight: The Gaussian Process Density Sampler »
Ryan Adams · Iain Murray · David MacKay -
2007 Spotlight: A general agnostic active learning algorithm »
Sanjoy Dasgupta · Daniel Hsu · Claire Monteleoni -
2007 Poster: A general agnostic active learning algorithm »
Sanjoy Dasgupta · Daniel Hsu · Claire Monteleoni