Researcher profile

X. Jessie Jeng

X. Jessie Jeng contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 13 - Unverified | Verification L1 | Unclaimed author
2 works
0 followers
3 topics
1 close collaborator

Published work

2 published items

Preprint, 2022, arXiv

Weak Signal Inclusion Under Sparsity and Dependence

We consider the scenario where important signals are not strong enough to be separable from a large amount of noise. Such weak signals commonly exist in large-scale data analysis and play vital roles in many biomedical applications. Existing methods, however, are mostly underpowered for such weak signals. We address the challenge from the perspective of false negative control and develop a new method to efficiently regulate the false negative proportion at a user-specified level. The new method is developed in a realistic setting with arbitrary covariance dependence between variables. We calibrate the overall dependence through a parameter whose scale is compatible with the existing phase diagram in high-dimensional sparse inference. Utilizing the new calibration, we asymptotically explicate the joint effect of covariance dependence, signal sparsity, and signal intensity on the proposed method. We interpret the results using a new phase diagram, which shows that the proposed method can efficiently retain a high proportion of signals even when they cannot be well separated from noise. The finite-sample performance of the proposed method is compared with that of several existing methods in simulation studies. The proposed method outperforms the others in adapting to a user-specified false negative control level. We apply the new method to an fMRI dataset to locate voxels that are functionally relevant to saccadic eye movements. The new method strikes a good balance between identifying functionally relevant regions and avoiding excessive noise voxels.

Preprint, 2013, arXiv

Identification of Signal, Noise, and Indistinguishable Subsets in High-Dimensional Data Analysis

Motivated by applications in high-dimensional data analysis where strong signals often stand out easily and weak ones may be indistinguishable from the noise, we develop a statistical framework to provide a novel categorization of the data into the signal, noise, and indistinguishable subsets. The three-subset categorization is especially relevant under high dimensionality, as a large proportion of signals can be obscured by the large amount of noise. Understanding the three-subset phenomenon is important for researchers designing efficient follow-up studies in real applications. For example, candidates belonging to the signal subset may have priority for more focused study, while those in the noise subset can be removed; and, for candidates in the indistinguishable subset, additional data may be collected to further separate weak signals from the noise. We develop an efficient data-driven procedure to identify the three subsets. Theoretical study shows that, under certain conditions, only signals are included in the identified signal subset, while the remaining signals are included in the identified indistinguishable subset with high probability. Moreover, the proposed procedure adapts to the unknown signal intensity, so that the identified indistinguishable subset shrinks with the true indistinguishable subset when signals become stronger. The procedure is examined and compared with methods based on FDR control using Monte Carlo simulation. Further, it is applied successfully in a real-data application to identify genomic variants with different signal intensities.