Researcher profile

Xiao-Li Meng

Xiao-Li Meng contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
6works
0followers
7topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

6 published item(s)

preprint2026arXiv

The LIRA-Ising Model: Estimating the boundaries of irregularly shaped X-ray sources

Mapping the boundary of an extended source is a key step in the study of its morphology. The background contamination and statistical fluctuations of typical astronomical images make this a challenging statistical task, particularly for X-ray images with low surface brightness. We develop a three-step Bayesian procedure to identify the boundaries of irregularly shaped sources. We first apply a Bayesian multiscale reconstruction algorithm known as LIRA to obtain posterior pixelwise probability distributions of the source intensity that properly account for known structures, astrophysical background, and the effect of the telescope point spread function. Next, we adopt an Ising model to group pixels with similar intensities into cohesive regions corresponding to background and source. Finally, the boundary is derived on the basis of the most likely aggregation of pixels into the source region. Because the overall model combines LIRA and the Ising model, we call it LIRA-Ising. We verify the proposed method using a set of simulation studies. We then apply it to the Chandra X-ray Observatory images of two high redshift quasars, PKS J1421-0643 and 0730+257, to determine the extent and morphology of X-ray jets. Our method shows a uniform X-ray surface brightness of PKS J1421-0643 jet, and identifies knotty structure in the X-ray jet of 0730+257.

preprint2022arXiv

Scalable Spike-and-Slab

Spike-and-slab priors are commonly used for Bayesian variable selection, due to their interpretability and favorable statistical properties. However, existing samplers for spike-and-slab posteriors incur prohibitive computational costs when the number of variables is large. In this article, we propose Scalable Spike-and-Slab ($S^3$), a scalable Gibbs sampling implementation for high-dimensional Bayesian regression with the continuous spike-and-slab prior of George and McCulloch (1993). For a dataset with $n$ observations and $p$ covariates, $S^3$ has order $\max\{ n^2 p_t, np \}$ computational cost at iteration $t$ where $p_t$ never exceeds the number of covariates switching spike-and-slab states between iterations $t$ and $t-1$ of the Markov chain. This improves upon the order $n^2 p$ per-iteration cost of state-of-the-art implementations as, typically, $p_t$ is substantially smaller than $p$. We apply $S^3$ on synthetic and real-world datasets, demonstrating orders of magnitude speed-ups over existing exact samplers and significant gains in inferential quality over approximate samplers with comparable cost.

preprint2021arXiv

Multiple Improvements of Multiple Imputation Likelihood Ratio Tests

Multiple imputation (MI) inference handles missing data by imputing the missing values $m$ times, and then combining the results from the $m$ complete-data analyses. However, the existing method for combining likelihood ratio tests (LRTs) has multiple defects: (i) the combined test statistic can be negative, but its null distribution is approximated by an $F$-distribution; (ii) it is not invariant to re-parametrization; (iii) it fails to ensure monotonic power owing to its use of an inconsistent estimator of the fraction of missing information (FMI) under the alternative hypothesis; and (iv) it requires nontrivial access to the LRT statistic as a function of parameters instead of data sets. We show, using both theoretical derivations and empirical investigations, that essentially all of these problems can be straightforwardly addressed if we are willing to perform an additional LRT by stacking the $m$ completed data sets as one big completed data set. This enables users to implement the MI LRT without modifying the complete-data procedure. A particularly intriguing finding is that the FMI can be estimated consistently by an LRT statistic for testing whether the $m$ completed data sets can be regarded effectively as samples coming from a common model. Practical guidelines are provided based on an extensive comparison of existing MI tests. Issues related to nuisance parameters are also discussed.

preprint2021arXiv

Prior sample size extensions for assessing prior impact and prior--likelihood discordance

This paper outlines a framework for quantifying the prior's contribution to posterior inference in the presence of prior-likelihood discordance, a broader concept than the usual notion of prior-likelihood conflict. We achieve this dual purpose by extending the classic notion of \textit{prior sample size}, $M$, in three directions: (I) estimating $M$ beyond conjugate families; (II) formulating $M$ as a relative notion, i.e., as a function of the likelihood sample size $k, M(k),$ which also leads naturally to a graphical diagnosis; and (III) permitting negative $M$, as a measure of prior-likelihood conflict, i.e., harmful discordance. Our asymptotic regime permits the prior sample size to grow with the likelihood data size, hence making asymptotic arguments meaningful for investigating the impact of the prior relative to that of likelihood. It leads to a simple asymptotic formula for quantifying the impact of a proper prior that only involves computing a centrality and a spread measure of the prior and the posterior. We use simulated and real data to illustrate the potential of the proposed framework, including quantifying how weak is a "weakly informative" prior adopted in a study of lupus nephritis. Whereas we take a pragmatic perspective in assessing the impact of a prior on a given inference problem under a specific evaluative metric, we also touch upon conceptual and theoretical issues such as using improper priors and permitting priors with asymptotically non-vanishing influence.

preprint2020arXiv

Congenial Differential Privacy under Mandated Disclosure

Differentially private data releases are often required to satisfy a set of external constraints that reflect the legal, ethical, and logical mandates to which the data curator is obligated. The enforcement of constraints, when treated as post-processing, adds an extra phase in the production of privatized data. It is well understood in the theory of multi-phase processing that congeniality, a form of procedural compatibility between phases, is a prerequisite for the end users to straightforwardly obtain statistically valid results. Congenial differential privacy is theoretically principled, which facilitates transparency and intelligibility of the mechanism that would otherwise be undermined by ad-hoc post-processing procedures. We advocate for the systematic integration of mandated disclosure into the design of the privacy mechanism via standard probabilistic conditioning on the invariant margins. Conditioning automatically renders congeniality because any extra post-processing phase becomes unnecessary. We provide both initial theoretical guarantees and a Markov chain algorithm for our proposal. We also discuss intriguing theoretical issues that arise in comparing congenital differential privacy and optimization-based post-processing, as well as directions for further research.