Predictive modeling often uses black box machine learning methods, such as deep neural networks, to achieve state-of-the-art performance. In scientific domains, the scientist often wishes to discover which features are actually important for making the predictions. These discoveries may lead to costly follow-up experiments and as such it is important that the error rate on discoveries is not too high. Model-X knockoffs enable important features to be discovered with control of the false discovery rate (FDR). However, knockoffs require rich generative models capable of accurately modeling the knockoff features while ensuring they obey the so-called "swap" property. We develop Deep Direct Likelihood Knockoffs (DDLK), which directly minimizes the KL divergence implied by the knockoff swap property. DDLK consists of two stages: it first maximizes the explicit likelihood of the features, then minimizes the KL divergence between the joint distribution of features and knockoffs and any swap between them. To ensure that the generated knockoffs are valid under any possible swap, DDLK uses the Gumbel-Softmax trick to optimize the knockoff generator under the worst-case swap. We find DDLK has higher power than baselines while controlling the false discovery rate on a variety of synthetic and real benchmarks including a task involving the largest COVID-19 health record dataset in the United States.
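The abstract names two mechanical ingredients of DDLK: the knockoff "swap" operation (exchanging a subset of features with their knockoffs must leave the joint distribution unchanged) and the Gumbel-Softmax trick used to relax per-feature swap indicators so the worst-case swap can be optimized by gradient descent. A minimal numpy sketch of these two pieces follows; the function names and the `(d, 2)` logit parameterization are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def swap(x, x_knock, b):
    """Apply the knockoff swap: exchange feature j between x and its
    knockoff x_knock wherever the indicator b[j] == 1. The swap property
    requires (x, x_knock) to be distributed identically under any b."""
    b = np.asarray(b, dtype=bool)
    x_new = np.where(b, x_knock, x)
    xk_new = np.where(b, x, x_knock)
    return x_new, xk_new

def gumbel_softmax_swap_probs(logits, temperature=0.5, rng=None):
    """Sample relaxed per-feature swap indicators via the Gumbel-Softmax
    trick. `logits` has shape (d, 2): unnormalized log-probabilities of
    (no-swap, swap) per feature. Returns shape (d,) soft indicators in
    (0, 1) that are differentiable with respect to the logits, which is
    what allows gradient-based search for an adversarial (worst-case) swap."""
    rng = np.random.default_rng() if rng is None else rng
    g = -np.log(-np.log(rng.uniform(size=logits.shape)))  # Gumbel(0, 1) noise
    y = (logits + g) / temperature
    y = np.exp(y - y.max(axis=-1, keepdims=True))         # stable softmax
    y = y / y.sum(axis=-1, keepdims=True)
    return y[:, 1]
```

In a training loop along the lines the abstract describes, the soft indicators would parameterize a relaxed swap, and the swap logits would be updated to maximize the KL divergence between the joint and swapped distributions while the knockoff generator is updated to minimize it.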
Author Information
Mukund Sudarshan (New York University)
Wesley Tansey (Columbia University)
Rajesh Ranganath (New York University)
More from the Same Authors
- 2021 Spotlight: Offline RL Without Off-Policy Evaluation
  David Brandfonbrener · Will Whitney · Rajesh Ranganath · Joan Bruna
- 2021: Learning Invariant Representations with Missing Data
  Mark Goldstein · Adriel Saporta · Aahlad Puli · Rajesh Ranganath · Andrew Miller
- 2021: Learning to Accelerate MR Screenings
  Raghav Singhal · Mukund Sudarshan · Angela Tong · Daniel Sodickson · Rajesh Ranganath
- 2021: Individual treatment effect estimation in the presence of unobserved confounding based on a fixed relative treatment effect
  Wouter van Amsterdam · Rajesh Ranganath
- 2021: Quantile Filtered Imitation Learning
  David Brandfonbrener · Will Whitney · Rajesh Ranganath · Joan Bruna
- 2022: An "interpretable-by-design" neural network to decipher RNA splicing regulatory logic
  Susan Liao · Mukund Sudarshan · Oded Regev
- 2023 Poster: Why models take shortcuts when roads are perfect: Understanding and mitigating shortcut learning in tasks with perfect stable features
  Aahlad Manas Puli · Lily Zhang · Yoav Wald · Rajesh Ranganath
- 2021 Poster: Inverse-Weighted Survival Games
  Xintian Han · Mark Goldstein · Aahlad Puli · Thomas Wies · Adler Perotte · Rajesh Ranganath
- 2021 Poster: Offline RL Without Off-Policy Evaluation
  David Brandfonbrener · Will Whitney · Rajesh Ranganath · Joan Bruna
- 2020 Poster: General Control Functions for Causal Effect Estimation from IVs
  Aahlad Puli · Rajesh Ranganath
- 2020 Poster: X-CAL: Explicit Calibration for Survival Analysis
  Mark Goldstein · Xintian Han · Aahlad Puli · Adler Perotte · Rajesh Ranganath
- 2020 Poster: Causal Estimation with Functional Confounders
  Aahlad Puli · Adler Perotte · Rajesh Ranganath
- 2020: Deep Direct Likelihood Knockoffs
  Mukund Sudarshan
- 2019 Poster: Energy-Inspired Models: Learning with Sampler-Induced Distributions
  Dieterich Lawson · George Tucker · Bo Dai · Rajesh Ranganath