Bivariate Causal Discovery for Categorical Data via Classification with Optimal Label Permutation

Yang Ni

Hall J #114

Keywords: Bayesian network, categorical data, discrete data, causal discovery, qualitative data

Thu 1 Dec 9 a.m. PST — 11 a.m. PST


Causal discovery for quantitative data has been extensively studied, but much less is known about categorical data. We propose a novel causal model for categorical data based on a new classification model, termed classification with optimal label permutation (COLP). By design, COLP is a parsimonious classifier, which gives rise to a provably identifiable causal model. A simple learning algorithm that compares the likelihood functions of the causal and anti-causal models suffices to learn the causal direction. Through experiments with synthetic and real data, we demonstrate the favorable performance of the proposed COLP-based causal model compared to state-of-the-art methods. We also make available an accompanying R package, COLP, which contains the proposed causal discovery algorithm and a benchmark dataset of categorical cause-effect pairs.
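
The likelihood-comparison idea in the abstract can be illustrated with a minimal Python sketch. This is not the COLP model and does not use the COLP R package: as a stand-in parsimonious classifier it assumes a hypothetical "noisy function" model, in which the effect equals a deterministic function of the cause with probability 1 - eps and is otherwise uniform over the remaining categories. All function names here (`marginal_loglik`, `conditional_loglik`, `infer_direction`) are illustrative assumptions. A restricted conditional model is needed because a saturated categorical model would fit both directions equally well.

```python
import math
from collections import Counter, defaultdict


def marginal_loglik(values):
    """Maximized log-likelihood of a saturated categorical marginal."""
    n = len(values)
    return sum(c * math.log(c / n) for c in Counter(values).values())


def conditional_loglik(cause, effect):
    """Maximized log-likelihood of a stand-in parsimonious classifier.

    Assumed model (NOT the paper's COLP): effect = f(cause) with
    probability 1 - eps, otherwise uniform over the other k - 1 labels.
    The fit takes f(c) as the modal effect for each cause value and
    eps as the observed mismatch rate.
    """
    n = len(cause)
    k = len(set(effect))  # number of effect categories
    by_cause = defaultdict(Counter)
    for c, e in zip(cause, effect):
        by_cause[c][e] += 1
    matches = sum(max(cnt.values()) for cnt in by_cause.values())
    mismatches = n - matches
    eps = mismatches / n
    ll = 0.0
    if matches:
        ll += matches * math.log(1 - eps)
    if mismatches:  # mismatches > 0 implies k >= 2
        ll += mismatches * math.log(eps / (k - 1))
    return ll


def infer_direction(x, y):
    """Compare the fitted likelihoods of the two candidate directions."""
    ll_xy = marginal_loglik(x) + conditional_loglik(x, y)  # model X -> Y
    ll_yx = marginal_loglik(y) + conditional_loglik(y, x)  # model Y -> X
    if ll_xy > ll_yx:
        return "X->Y"
    if ll_yx > ll_xy:
        return "Y->X"
    return "undecided"
```

Calling `infer_direction(x, y)` on two equal-length, non-empty sequences of category labels returns the direction whose restricted model attains the higher likelihood. The decision rule has the same shape as the one described above, but its conclusions depend entirely on the assumed stand-in classifier.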
