Source author record

Kyoungjae Lee

Kyoungjae Lee appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Methodology math.ST Statistics Theory

Catalog footprint

What is connected

9 works

3 topics

4 close collaborators


preprint, 2022, arXiv

Bayesian inference on hierarchical nonlocal priors in generalized linear models

Variable selection methods with nonlocal priors have been widely studied in linear regression models, and their theoretical and empirical performances have been reported. However, the crucial model selection properties for hierarchical nonlocal priors in high-dimensional generalized linear regression have rarely been investigated. In this paper, we consider a hierarchical nonlocal prior for high-dimensional logistic regression models and investigate theoretical properties of the posterior distribution. Specifically, a product moment (pMOM) nonlocal prior is imposed over the regression coefficients with an Inverse-Gamma prior on the tuning parameter. Under standard regularity assumptions, we establish strong model selection consistency in a high-dimensional setting, where the number of covariates is allowed to increase at a sub-exponential rate with the sample size. We implement the Laplace approximation for computing the posterior probabilities, and a modified shotgun stochastic search procedure is suggested for efficiently exploring the model space. We demonstrate the validity of the proposed method through simulation studies and an RNA-sequencing dataset for stratifying disease risk.
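As an illustration of the nonlocal prior named in the abstract above, the following is a minimal sketch of a univariate product moment (pMOM) density. It assumes the commonly used form beta^(2r) / ((2r-1)!! tau^r) times a N(0, tau) density, with the tuning parameter tau held fixed rather than given the Inverse-Gamma prior used in the paper. The function names and the midpoint-rule check are hypothetical and are not taken from the paper.

```python
import math


def double_factorial_odd(r):
    """Return (2r-1)!! = 1 * 3 * ... * (2r-1); equals 1 when r = 0."""
    out = 1
    for k in range(1, 2 * r, 2):
        out *= k
    return out


def pmom_density(beta, tau, r=1):
    """Univariate pMOM density of order r with scale tau (assumed form).

    The factor beta**(2r) forces the density to vanish at beta = 0,
    which is the defining feature of a nonlocal prior.
    """
    normal = math.exp(-beta * beta / (2.0 * tau)) / math.sqrt(2.0 * math.pi * tau)
    return beta ** (2 * r) / (double_factorial_odd(r) * tau ** r) * normal


def integrate(f, lo, hi, n=20000):
    """Midpoint-rule quadrature, used here only as a sanity check."""
    h = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * h) for i in range(n)) * h


# The density should integrate to (approximately) one and be zero at the origin.
total_mass = integrate(lambda b: pmom_density(b, tau=1.0, r=1), -12.0, 12.0)
density_at_zero = pmom_density(0.0, tau=1.0, r=1)
```

Because the density is exactly zero at the origin, small coefficients receive little prior mass, which is what separates a nonlocal prior from a local one such as a plain normal prior.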

preprint, 2022, arXiv

Consistent and scalable Bayesian joint variable and graph selection for disease diagnosis leveraging functional brain network

We consider the joint inference of regression coefficients and the inverse covariance matrix for covariates in high-dimensional probit regression, where the predictors are both relevant to the binary response and functionally related to one another. A hierarchical model with spike and slab priors over regression coefficients and the elements in the inverse covariance matrix is employed to simultaneously perform variable and graph selection. We establish joint selection consistency for both the variable and the underlying graph when the dimension of predictors is allowed to grow much larger than the sample size, which is the first theoretical result in the Bayesian literature. A scalable Gibbs sampler is derived that performs better in high-dimensional simulation studies compared with other state-of-the-art methods. We illustrate the practical impact and utilities of the proposed method via a functional MRI dataset, where both the regions of interest with altered functional activities and the underlying functional brain network are inferred and integrated together for stratifying disease risk.

preprint, 2021, arXiv

Bayesian inference for high-dimensional decomposable graphs

In this paper, we consider high-dimensional Gaussian graphical models where the true underlying graph is decomposable. A hierarchical $G$-Wishart prior is proposed to conduct Bayesian inference for the precision matrix and its graph structure. Although the posterior asymptotics using the $G$-Wishart prior has received increasing attention in recent years, most results assume moderately high-dimensional settings, where the number of variables $p$ is smaller than the sample size $n$. However, this assumption might not hold in many real applications such as genomics, speech recognition and climatology. Motivated by this gap, we investigate asymptotic properties of posteriors under the high-dimensional setting where $p$ can be much larger than $n$. The pairwise Bayes factor consistency, posterior ratio consistency and graph selection consistency are obtained in this high-dimensional setting. Furthermore, the posterior convergence rate for precision matrices under the matrix $\ell_1$-norm is derived, which turns out to coincide with the minimax convergence rate for sparse precision matrices. A simulation study confirms that the proposed Bayesian procedure outperforms competitors.

preprint, 2021, arXiv

The Beta-Mixture Shrinkage Prior for Sparse Covariances with Posterior Minimax Rates

Statistical inference for sparse covariance matrices is crucial to reveal the dependence structure of large multivariate data sets, but lacks scalable and theoretically supported Bayesian methods. In this paper, we propose a beta-mixture shrinkage prior, computationally more efficient than the spike and slab prior, for sparse covariance matrices and establish its minimax optimality in high-dimensional settings. The proposed prior consists of beta-mixture shrinkage and gamma priors for off-diagonal and diagonal entries, respectively. To ensure positive definiteness of the resulting covariance matrix, we further restrict the support of the prior to a subspace of positive definite matrices. We obtain the posterior convergence rate of the induced posterior under the Frobenius norm and establish a minimax lower bound for sparse covariance matrices. The class of sparse covariance matrices for the minimax lower bound considered in this paper is controlled by the number of nonzero off-diagonal elements and has more intuitive appeal than those that have appeared in the literature. The obtained posterior convergence rate coincides with the minimax lower bound unless the true covariance matrix is extremely sparse. In the simulation study, we show that the proposed method is computationally more efficient than competitors, while achieving comparable performance. Advantages of the shrinkage prior are demonstrated based on two real data sets.

preprint, 2020, arXiv

Bayesian High-dimensional Semi-parametric Inference beyond sub-Gaussian Errors

We consider a sparse linear regression model with unknown symmetric error under the high-dimensional setting. The true error distribution is assumed to belong to the locally $\beta$-Hölder class with an exponentially decreasing tail, which does not need to be sub-Gaussian. We obtain posterior convergence rates of the regression coefficient and the error density, which are nearly optimal and adaptive to the unknown sparsity level. Furthermore, we derive the semi-parametric Bernstein-von Mises (BvM) theorem to characterize the asymptotic shape of the marginal posterior for regression coefficients. Under the sub-Gaussianity assumption on the true score function, strong model selection consistency for regression coefficients is also obtained, which eventually asserts the frequentist validity of credible sets.

preprint, 2020, arXiv

Bayesian joint inference for multiple directed acyclic graphs

In many applications, data often arise from multiple groups that may share similar characteristics. A joint estimation method that models several groups simultaneously can be more efficient than estimating parameters in each group separately. We focus on unraveling the dependence structures of data based on directed acyclic graphs and propose a Bayesian joint inference method for multiple graphs. To encourage similar dependence structures across all groups, a Markov random field prior is adopted. We establish the joint selection consistency of the fractional posterior in high dimensions, and benefits of the joint inference are shown under the common support assumption. This is the first Bayesian method for joint estimation of multiple directed acyclic graphs. The performance of the proposed method is demonstrated using simulation studies, and it is shown that our joint inference outperforms competitors. We apply our method to an fMRI data set for simultaneously inferring multiple brain functional networks.

preprint, 2020, arXiv

Joint Bayesian Variable and DAG Selection Consistency for High-dimensional Regression Models with Network-structured Covariates

We consider the joint sparse estimation of regression coefficients and the covariance matrix for covariates in a high-dimensional regression model, where the predictors are both relevant to a response variable of interest and functionally related to one another via a Gaussian directed acyclic graph (DAG) model. Gaussian DAG models introduce sparsity in the Cholesky factor of the inverse covariance matrix, and the sparsity pattern in turn corresponds to specific conditional independence assumptions on the underlying predictors. A variety of methods have been developed in recent years for Bayesian inference in identifying such network-structured predictors in the regression setting, yet crucial sparsity selection properties for these models have not been thoroughly investigated. In this paper, we consider a hierarchical model with spike and slab priors on the regression coefficients and a flexible and general class of DAG-Wishart distributions with multiple shape parameters on the Cholesky factors of the inverse covariance matrix. Under mild regularity assumptions, we establish the joint selection consistency for both the variable and the underlying DAG of the covariates when the dimension of predictors is allowed to grow much larger than the sample size. We demonstrate that our method outperforms existing methods in selecting network-structured predictors in several simulation settings.

preprint, 2020, arXiv

Maximum Pairwise Bayes Factors for Covariance Structure Testing

Hypothesis testing of structure in covariance matrices is of significant importance, but faces great challenges in high-dimensional settings. Although consistent frequentist one-sample covariance tests have been proposed, there is a lack of simple, computationally scalable, and theoretically sound Bayesian testing methods for large covariance matrices. Motivated by this gap and by the need for tests that are powerful against sparse alternatives, we propose a novel testing framework based on the maximum pairwise Bayes factor. Our initial focus is on one-sample covariance testing; the proposed test can optimally distinguish null and alternative hypotheses in a frequentist asymptotic sense. We then propose diagonal tests and a scalable covariance graph selection procedure that are shown to be consistent. A simulation study evaluates the proposed approach relative to competitors. We illustrate advantages of our graph selection method on a gene expression data set.

preprint, 2016, arXiv

Laplace based approximate posterior inference for differential equation models

Ordinary differential equations are arguably the most popular and useful mathematical tool for describing physical and biological processes in the real world. Often, these physical and biological processes are observed with errors, in which case the most natural way to model such data is via regression where the mean function is defined by an ordinary differential equation believed to provide an understanding of the underlying process. These regression based dynamical models are called differential equation models. Parameter inference from differential equation models poses computational challenges mainly due to the fact that analytic solutions to most differential equations are not available. In this paper, we propose an approximation method for obtaining the posterior distribution of parameters in differential equation models. The approximation is done in two steps. In the first step, the solution of a differential equation is approximated by the general one-step method which is a class of numerical methods for ordinary differential equations including the Euler and the Runge-Kutta procedures; in the second step, nuisance parameters are marginalized using Laplace approximation. The proposed Laplace approximated posterior gives a computationally fast alternative to the full Bayesian computational scheme (such as Markov Chain Monte Carlo) and produces more accurate and stable estimators than the popular smoothing methods (called collocation methods) based on frequentist procedures. As theoretical support for the proposed method, we prove that the Laplace approximated posterior converges to the actual posterior under certain conditions and analyze the relation between the order of numerical error and its Laplace approximation. The proposed method is tested on simulated data sets and compared with other existing methods.
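The two ingredients named in the abstract above can be sketched separately on toy problems. The code below is an illustrative assumption rather than the paper's procedure: `euler_solve` is the simplest one-step method, applied to dy/dt = -theta * y, and `laplace_log_integral` applies the Laplace approximation to a one-dimensional integral of exp(-g(x)) around a known mode. All function names, the test equation, and the Gamma-integral check are hypothetical choices made for this sketch.

```python
import math


def euler_solve(f, y0, t0, t1, n_steps, theta):
    """Approximate y(t1) for dy/dt = f(t, y, theta) with the explicit Euler method."""
    h = (t1 - t0) / n_steps
    t, y = t0, y0
    for _ in range(n_steps):
        y += h * f(t, y, theta)  # one-step update: y_{k+1} = y_k + h * f(t_k, y_k)
        t += h
    return y


def laplace_log_integral(g, mode, h=1e-3):
    """Log of the Laplace approximation to the integral of exp(-g(x)) dx.

    Uses log I ~ -g(mode) + 0.5 * log(2 * pi / g''(mode)), with g'' estimated
    by a central finite difference. The mode must be supplied by the caller.
    """
    second = (g(mode + h) - 2.0 * g(mode) + g(mode - h)) / (h * h)
    return -g(mode) + 0.5 * math.log(2.0 * math.pi / second)


# Step 1 (toy): exponential decay, whose exact solution is y0 * exp(-theta * t).
theta = 0.5
y_euler = euler_solve(lambda t, y, th: -th * y, 1.0, 0.0, 2.0, 1000, theta)
y_exact = math.exp(-theta * 2.0)

# Step 2 (toy): the integral of x**a * exp(-x) over x > 0 equals Gamma(a + 1).
# Here g(x) = x - a * log(x), which is minimized at x = a.
a = 50.0
log_laplace = laplace_log_integral(lambda x: x - a * math.log(x), mode=a)
log_exact = math.lgamma(a + 1.0)
```

In the toy Gamma integral the Laplace approximation reduces to Stirling's formula, so its relative error shrinks as `a` grows. The Euler error shrinks roughly in proportion to the step size, which gives a small-scale picture of how numerical error and Laplace error can be studied together.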
