Finding significant combinations of features in the presence of categorical covariates

Laetitia Papaxanthos · Felipe Llinares-López · Dean Bodenham · Karsten Borgwardt

Area 5+6+7+8 #65

Keywords: [ (Application) Bioinformatics and Systems Biology ] [ Sparsity and Feature Selection ]


In high-dimensional settings, where the number of features p is typically much larger than the number of samples n, methods that can systematically examine arbitrary combinations of features, a huge 2^p-dimensional space, have recently begun to be explored. However, none of the current methods can assess the association between feature combinations and a target variable while conditioning on a categorical covariate in order to correct for potential confounding effects. We propose the Fast Automatic Conditional Search (FACS) algorithm, a significant discriminative itemset mining method that conditions on categorical covariates and scales only as O(k log k), where k is the number of states of the categorical covariate. Based on the Cochran-Mantel-Haenszel test, FACS demonstrates superior speed and statistical power on simulated and real-world datasets compared to the state of the art, opening the door to numerous applications in biomedicine.
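
To make the role of the Cochran-Mantel-Haenszel (CMH) test concrete, the following is a minimal sketch of the textbook CMH statistic for a single binary feature combination, stratified by the k states of a categorical covariate. It is not the FACS algorithm itself and does not reproduce its pruning or its O(k log k) search. The function name `cmh_test`, the `(a, b, c, d)` table layout, and the toy counts are assumptions made for illustration, and no continuity correction is applied.

```python
import math


def cmh_test(tables):
    """Textbook CMH test over k strata (illustrative sketch, not FACS).

    Each stratum is a 2x2 table given as (a, b, c, d), an assumed layout:
      a = class-1 samples containing the feature combination
      b = class-1 samples without it
      c = class-0 samples containing it
      d = class-0 samples without it
    Returns (statistic, p_value); the statistic is compared against a
    chi-square distribution with 1 degree of freedom.
    """
    observed = expected = variance = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        if n < 2:
            continue  # a stratum with fewer than 2 samples carries no information
        row1, row2 = a + b, c + d
        col1, col2 = a + c, b + d
        observed += a
        expected += row1 * col1 / n  # hypergeometric mean of a
        variance += row1 * row2 * col1 * col2 / (n * n * (n - 1))
    if variance == 0:
        return 0.0, 1.0  # the combination is untestable in every stratum
    stat = (observed - expected) ** 2 / variance
    # Survival function of chi-square with 1 d.o.f.: P(X > s) = erfc(sqrt(s / 2))
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Toy data (made up): within each covariate state the odds ratio is exactly 1,
# but pooling the two states produces an apparent association.
strata = [(8, 2, 4, 1), (1, 4, 2, 8)]
pooled = [tuple(sum(col) for col in zip(*strata))]

stratified_stat, stratified_p = cmh_test(strata)
pooled_stat, pooled_p = cmh_test(pooled)
```

In this toy example the stratified statistic is zero while the pooled statistic is not, which is the kind of covariate-induced spurious association that conditioning is meant to remove.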