Fourier Feature Methods for Nonlinear Causal Discovery: FFML Scoring, TRFF Scoring, and FFCI Testing in Mixed Data
Gaussian process (GP) marginal likelihood scores and kernel conditional independence tests are theoretically appealing for nonlinear causal discovery but computationally prohibitive at scale. We present three complementary RFF-based methods forming a practical toolkit for score-based, constraint-based, and hybrid causal discovery. The Fourier Feature Marginal Likelihood (FFML) score approximates the exact GP marginal likelihood by replacing the $n x n$ kernel Gram matrix with a finite-dimensional feature representation, reducing cost to $O(nm^2 + m^3)$ while retaining the probabilistic interpretation and automatic complexity penalty of the exact score. FFML extends to mixed (continuous and discrete) parent sets via a product-kernel construction, with a Kronecker path for small discrete parent sets and a Hadamard-product path otherwise. The Tetrad Random Fourier Feature (TRFF) score is a complementary BIC-style alternative using penalized Student-t regression with random Fourier features. TRFF offers robustness to heavy-tailed noise and faster runtime than FFML. Empirically, TRFF and FFML exhibit a complementary precision-recall profile: TRFF achieves higher precision while FFML achieves better recall and lower SHD overall. The Fourier Feature Conditional Independence (FFCI) test is a fast nonparametric CI test for mixed data, using ridge residualization in feature space and a Frobenius-norm cross-covariance statistic approximated as a weighted sum of chi-squared variables. Empirically, BOSS+FFML achieves the lowest SHD on nonlinear data, while BOSS+TRFF offers the highest precision. When run through PC-Max, FFCI and RCIT exhibit complementary precision-recall profiles: RCIT is more precise while FFCI achieves better recall and substantially lower SHD, at approximately twice the runtime.