Source author record

Yoav Benjamini

Yoav Benjamini appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Applications, Methodology, math.ST, Neurons and Cognition, Statistics Theory

Catalog footprint

What is connected

9 works

5 topics

4 close collaborators


preprint · 2015 · arXiv

Coping with Space Neophobia in Drosophila melanogaster: The Asymmetric Dynamics of Crossing a Doorway to the Untrodden

Insects exhibit remarkable cognitive skills in the field and several cognitive abilities have been demonstrated in Drosophila in the laboratory. By devising an ethologically relevant experimental setup that also allows comparison of behavior across remote taxonomic groups we sought to reduce the gap between the field and the laboratory, and reveal as yet undiscovered ethological phenomena within a wider phylogenetic perspective. We tracked individual flies that eclosed in a small (45 mm) arena containing a piece of fruit, connected to a larger (130 mm) arena by a wide (5 mm) doorway. Using this setup we show that the widely open doorway initially functions as a barrier: the likelihood of entering the large arena increases gradually, requiring repeated approaches to the doorway, and even after entering the flies immediately return. Gradually the flies acquire the option to avoid returning, spending more relative time and performing relatively longer excursions in the large arena. The entire process may take up to three successive days. This behavior constitutes coping with space neophobia, the avoidance of untrodden space. It appears to be the same as the neophobic doorway-crossing reported in mouse models of anxiety. In both mice and flies the moment-to-moment developmental dynamics of transition between trodden and untrodden terrain appear to be the same, and in mice it is taken to imply memory and, therefore, cognition. Recent claims have been made for a deep homology between the arthropod central complex and the vertebrate basal ganglia, two structures involved in navigation. The shared dynamics of space occupancy in flies and mice might indicate the existence of cognitive exploration also in the flies or else a convergent structure exhibiting the same developmental dynamics.

preprint · 2015 · arXiv

Many Phenotypes without Many False Discoveries: Error Controlling Strategies for Multi-Traits Association Studies

The genetic basis of multiple phenotypes such as gene expression, metabolite levels, or imaging features is often investigated by testing a large collection of hypotheses, probing the existence of association between each of the traits and hundreds of thousands of genotyped variants. Appropriate multiplicity adjustment is crucial to guarantee replicability of findings, and False Discovery Rate (FDR) is frequently adopted as a measure of global error. In the interest of interpretability, results are often summarized so that reporting focuses on variants discovered to be associated with some phenotypes. We show that applying FDR-controlling procedures on the entire collection of hypotheses fails to control the rate of false discovery of associated variants as well as the average rate of false discovery of phenotypes influenced by such variants. We propose a simple hierarchical testing procedure which allows control of both these error rates and provides a more reliable basis for the identification of variants with functional effects. We demonstrate the utility of this approach through simulation studies comparing various error rates and measures of power for genetic association studies of multiple traits. Finally, we apply the proposed method to identify genetic variants which impact flowering phenotypes in Arabidopsis thaliana, expanding the set of discoveries.

preprint · 2015 · arXiv

Quantifying replicability in systematic reviews: the r-value

In order to assess the effect of a health care intervention, it is useful to look at an ensemble of relevant studies. The Cochrane Collaboration's admirable goal is to provide systematic reviews of all relevant clinical studies, in order to establish whether or not there is conclusive evidence about a specific intervention. This is done mainly by conducting a meta-analysis: a statistical synthesis of results from a series of systematically collected studies. Health practitioners often interpret a significant meta-analysis summary effect as a statement that the treatment effect is consistent across a series of studies. However, the meta-analysis significance may be driven by an effect in only one of the studies. Indeed, in an analysis of two domains of Cochrane reviews we show that in a non-negligible fraction of reviews, the removal of a single study from the meta-analysis of primary endpoints makes the conclusion non-significant. Therefore, reporting the evidence towards replicability of the effect across studies in addition to the significant meta-analysis summary effect will provide credibility to the interpretation that the effect was replicated across studies. We suggest an objective, easily computed quantity, which we term the r-value, that quantifies the extent of this reliance on single studies. We suggest adding the r-values to the main results and to the forest plots of systematic reviews.
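The sensitivity check described in this abstract, dropping one study at a time and re-running the meta-analysis, can be sketched in a few lines of Python. This is an illustration only, not the paper's definition of the r-value: the inverse-variance fixed-effect model, the normal approximation, the function names and the example numbers are all assumptions made here.

```python
import math


def normal_sf(z):
    """Upper-tail probability of the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def fixed_effect_p(effects, std_errors):
    """Two-sided p-value of an inverse-variance fixed-effect meta-analysis.

    Assumption: each study reports an effect estimate and its standard error,
    and the pooled z-statistic is approximately standard normal under the null.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return 2.0 * normal_sf(abs(pooled / pooled_se))


def leave_one_out_p(effects, std_errors):
    """Largest meta-analysis p-value obtained after dropping any single study."""
    return max(
        fixed_effect_p(effects[:i] + effects[i + 1:],
                       std_errors[:i] + std_errors[i + 1:])
        for i in range(len(effects))
    )


# Hypothetical data: one strong study and two near-null studies.
effects = [0.9, 0.05, 0.02]
std_errors = [0.2, 0.2, 0.2]
full_p = fixed_effect_p(effects, std_errors)
worst_p = leave_one_out_p(effects, std_errors)
```

With these made-up numbers the pooled effect is significant at the 0.05 level, but it stops being significant once the first study is removed, which is the kind of reliance on a single study the abstract warns about.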

preprint · 2015 · arXiv

Testing for replicability in a follow-up study when the primary study hypotheses are two-sided

When testing for replication of results from a primary study with two-sided hypotheses in a follow-up study, we are usually interested in discovering the features with discoveries in the same direction in the two studies. The direction of testing in the follow-up study for each feature can therefore be decided by the primary study. We prove that in this case the methods suggested in Heller, Bogomolov, and Benjamini (2014) for control over false replicability claims are valid. Specifically, we prove that if we input into the procedures in Heller, Bogomolov, and Benjamini (2014) the one-sided p-values in the directions favoured by the primary study, then we achieve directional control over the desired error measure (family-wise error rate or false discovery rate).
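The input step described in this abstract, taking one-sided follow-up p-values in the direction favoured by the primary study, might look as follows. This is a minimal sketch that assumes both studies report z-statistics that are approximately standard normal under the null; the function names are hypothetical, and the replicability procedures that would consume these p-values are not reproduced here.

```python
import math


def direction_from_primary(z_primary):
    """Direction favoured by the primary study: +1 or -1."""
    return 1 if z_primary >= 0 else -1


def one_sided_p(z_followup, direction):
    """One-sided p-value of the follow-up statistic in the given direction.

    Assumption: z_followup is standard normal under the null hypothesis.
    """
    return 0.5 * math.erfc(direction * z_followup / math.sqrt(2.0))


# Hypothetical features: (primary z, follow-up z).
pairs = [(3.1, 2.0), (-2.8, 2.0), (2.5, 0.0)]
followup_p = [one_sided_p(zf, direction_from_primary(zp)) for zp, zf in pairs]
```

A follow-up effect that agrees in sign with the primary study gets a small p-value, while the same effect with the opposite sign gets a p-value close to 1, so it cannot count as a directional replication.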

preprint · 2014 · arXiv

Deciding whether follow-up studies have replicated findings in a preliminary large-scale "omics" study

We propose a formal method to declare that findings from a primary study have been replicated in a follow-up study. Our proposal is appropriate for primary studies that involve large-scale searches for rare true positives (i.e. needles in a haystack). Our proposal assigns an $r$-value to each finding; this is the lowest false discovery rate at which the finding can be called replicated. Examples are given and software is available.

preprint · 2014 · arXiv

Selective Correlations - the conditional estimators

The problem of Voodoo correlations is recognized in neuroimaging as the problem of estimating quantities of interest from the same data that was used to select them as interesting. In statistical terminology, the problem of inference following selection from the same data is that of selective inference. Motivated by the unwelcome side effects of the recommended remedy, splitting the data, a method for constructing confidence intervals based on the correct post-selection distribution of the observations has been suggested recently. We utilize a similar approach in order to provide point estimates that account for a large part of the selection bias. We show via extensive simulations that the proposed estimator has favorable properties, namely, that it is likely to reduce estimation bias and the mean squared error compared to the direct estimator without sacrificing power to detect non-zero correlation as in the case of the data splitting approach. We show that both point estimates and confidence intervals are needed in order to get a full assessment of the uncertainty in the point estimates as both are integrated into the Confidence Calibration Plots proposed recently. The computation of the estimators is implemented in an accompanying software package.

preprint · 2013 · arXiv

Revisiting Multi-Subject Random Effects in fMRI: Advocating Prevalence Estimation

Random Effects analysis has been introduced into fMRI research in order to generalize findings from the study group to the whole population. Generalizing findings is obviously harder than detecting activation in the study group since in order to be significant, an activation has to be larger than the inter-subject variability. Indeed, detected regions are smaller when using random effects analysis versus fixed effects. The statistical assumptions behind the classic random effects model are that the effect in each location is normally distributed over subjects, and "activation" refers to a non-null mean effect. We argue this model is unrealistic compared to the true population variability, where, due to functional plasticity and registration anomalies, at each brain location some of the subjects are active and some are not. We propose a finite Gaussian mixture random-effects model that amortizes between-subject spatial disagreement and quantifies it using the "prevalence" of activation at each location. This measure has several desirable properties: (a) It is more informative than the typical active/inactive paradigm. (b) In contrast to the hypothesis testing approach (thus t-maps), in which null hypotheses are trivially rejected for large sample sizes, the prevalence statistic becomes more informative as the sample size grows. In this work we present a formal definition and an estimation procedure of this prevalence. The end result of the proposed analysis is a map of the prevalence at locations with significant activation, highlighting activation regions that are common over many brains.

preprint · 2011 · arXiv

Adjusting for selection bias in testing multiple families of hypotheses

In many large multiple testing problems the hypotheses are divided into families. Given the data, families with evidence for true discoveries are selected, and hypotheses within them are tested. Neither controlling the error-rate in each family separately nor controlling the error-rate over all hypotheses together can assure that an error-rate is controlled in the selected families. We formulate this concern about selective inference in its generality, for a very wide class of error-rates and for any selection criterion, and present an adjustment of the testing level inside the selected families that retains the average error-rate over the selected families.
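The idea of testing inside the selected families at an adjusted level can be sketched in Python. Several things here are assumptions rather than content of the abstract: the adjusted level is taken to be q times the fraction of families selected, Benjamini-Hochberg is used as the within-family procedure, and the family names, p-values and selection are made up for illustration.

```python
def benjamini_hochberg(p_values, q):
    """Indices rejected by the Benjamini-Hochberg step-up procedure at level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, idx in enumerate(order, start=1):
        # Step-up rule: keep the largest rank whose p-value is under its threshold.
        if p_values[idx] <= rank * q / m:
            k = rank
    return sorted(order[:k])


def bh_within_selected(families, selected, q):
    """Run BH inside each selected family at the reduced level q * R / m.

    Assumption: R is the number of selected families and m the total number
    of families; the selection rule itself is supplied by the caller.
    """
    adjusted_q = q * len(selected) / len(families)
    return {name: benjamini_hochberg(families[name], adjusted_q)
            for name in selected}


# Hypothetical p-values for four families, two of which were selected.
families = {
    "A": [0.001, 0.02, 0.6],
    "B": [0.4, 0.7, 0.9],
    "C": [0.012, 0.3, 0.8],
    "D": [0.5, 0.6, 0.7],
}
rejections = bh_within_selected(families, ["A", "C"], q=0.05)
```

In this toy example family C would yield one rejection under plain BH at level 0.05, but none at the adjusted level 0.025, which shows how the adjustment pays for having looked at the data to choose the families.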

preprint · 2010 · arXiv

High-throughput data analysis in behavior genetics

In recent years, a growing need has arisen in different fields for the development of computational systems for automated analysis of large amounts of data (high-throughput). Dealing with nonstandard noise structure and outliers, which could have been detected and corrected in manual analysis, must now be built into the system with the aid of robust methods. We discuss such problems and present insights and solutions in the context of behavior genetics, where data consists of a time series of locations of a mouse in a circular arena. In order to estimate the location, velocity and acceleration of the mouse, and identify stops, we use a nonstandard mix of robust and resistant methods: LOWESS and repeated running median. In addition, we argue that protection against small deviations from experimental protocols can be handled automatically using statistical methods. In our case, it is of biological interest to measure a rodent's distance from the arena's wall, but this measure is corrupted if the arena is not a perfect circle, as required in the protocol. The problem is addressed by estimating robustly the actual boundary of the arena and its center using a nonparametric regression quantile of the behavioral data, with the aid of a fast algorithm developed for that purpose.
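A repeated running median, one of the two smoothers named in this abstract, can be sketched with the Python standard library. The window sizes, the truncation of windows at the ends of the series and the example track are assumptions made for illustration; the paper's actual smoothing scheme and its LOWESS step are not reproduced.

```python
from statistics import median


def running_median(values, half_window):
    """Median-smooth a sequence; windows are truncated at both ends."""
    n = len(values)
    return [
        median(values[max(0, i - half_window): min(n, i + half_window + 1)])
        for i in range(n)
    ]


def repeated_running_median(values, half_windows):
    """Apply running medians in sequence, one pass per half-window size."""
    smoothed = list(values)
    for half_window in half_windows:
        smoothed = running_median(smoothed, half_window)
    return smoothed


# Hypothetical one-dimensional track with a single tracking glitch (9.0).
track = [0.0, 0.1, 0.2, 9.0, 0.4, 0.5, 0.6]
one_pass = running_median(track, 1)
two_pass = repeated_running_median(track, [2, 1])
```

Because the median ignores a single extreme value inside each window, the glitch is removed without pulling its neighbours toward it, which a moving average would not achieve.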
