Identification of Signal, Noise, and Indistinguishable Subsets in High-Dimensional Data Analysis

Motivated by applications in high-dimensional data analysis where strong signals often stand out easily and weak ones may be indistinguishable from the noise, we develop a statistical framework to provide a novel categorization of the data into the signal, noise, and indistinguishable subsets. The three-subset categorization is especially relevant under high dimensionality, as a large proportion of signals can be obscured by the large amount of noise. Understanding the three-subset phenomenon helps researchers in real applications design efficient follow-up studies. For example, candidates belonging to the signal subset may have priority for more focused study, while those in the noise subset can be removed; for candidates in the indistinguishable subset, additional data may be collected to further separate weak signals from the noise. We develop an efficient data-driven procedure to identify the three subsets. Theoretical study shows that, under certain conditions and with high probability, only signals are included in the identified signal subset, while the remaining signals are included in the identified indistinguishable subset. Moreover, the proposed procedure adapts to the unknown signal intensity, so that the identified indistinguishable subset shrinks with the true indistinguishable subset when signals become stronger. The procedure is examined and compared with methods based on FDR control using Monte Carlo simulation. Further, it is applied successfully in a real-data application to identify genomic variants with different signal intensities.

Preprint · 2013 · arXiv · Open access
