In this paper, we propose a regularization-based approach to impose conditional demographic parity in supervised learning problems. While many methods exist to achieve demographic parity, conditional demographic parity can be much more challenging to achieve, particularly when the conditioning variables are continuous or discrete with many levels. Our regularization approach is based on the bi-causal transport distance, a distance between probability distributions proposed in the optimal transport literature. Our method uses a single regularization term whose computational cost is $O(n^2)$ in the sample size $n$, regardless of the dimension of the conditioning variables and of whether those variables are continuous or discrete. We also target full conditional independence, rather than matching only first moments as many existing methods for demographic parity do. We validate the efficacy of our approach with experiments on real-world and synthetic datasets.