Source author record

Ya'acov Ritov

Ya'acov Ritov appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

math.ST, Statistics Theory, Applications, Methodology, Machine Learning, physics.soc-ph, Populations and Evolution, stat.OT

Catalog footprint

What is connected

15 works

8 topics

4 close collaborators

preprint · 2022 · arXiv

Generalized maximum likelihood estimation of the mean of parameters of mixtures, with applications to sampling

Let $f(y|\theta), \; \theta\in \Omega$ be a parametric family, $\eta(\theta)$ a given function, and $G$ an unknown mixing distribution. It is desired to estimate $E_G (\eta(\theta))\equiv \eta_G$ based on independent observations $Y_1,\dots,Y_n$, where $Y_i \sim f(y|\theta_i)$, and $\theta_i \sim G$ are iid. We explore the Generalized Maximum Likelihood Estimators (GMLE) for this problem. Some basic properties and representations of those estimators are shown. In particular, we suggest a new perspective on the weak convergence result by Kiefer and Wolfowitz (1956), with implications for a corresponding setup in which $\theta_1,\dots,\theta_n$ are fixed parameters. We also relate the above problem of estimating $\eta_G$ to non-parametric empirical Bayes estimation under a squared loss. Applications of GMLE to sampling problems are presented. The performance of the GMLE is demonstrated both in simulations and through a real data example.

preprint · 2022 · arXiv

Rank-Constrained Least-Squares: Prediction and Inference

In this work, we focus on the high-dimensional trace regression model with a low-rank coefficient matrix. We establish a nearly optimal in-sample prediction risk bound for the rank-constrained least-squares estimator under no assumptions on the design matrix. Lying at the heart of the proof is a covering number bound for the family of projection operators corresponding to the subspaces spanned by the design. By leveraging this complexity result, we perform a power analysis for a permutation test on the existence of a low-rank signal under the high-dimensional trace regression model. We show that the permutation test based on the rank-constrained least-squares estimator achieves non-trivial power with no assumptions on the minimum (restricted) eigenvalue of the covariance matrix of the design. Finally, we use alternating minimization to approximately solve the rank-constrained least-squares problem to evaluate its empirical in-sample prediction risk and power of the resulting permutation test in our numerical study.

preprint · 2021 · arXiv

Inference In High-dimensional Single-Index Models Under Symmetric Designs

The problem of statistical inference for regression coefficients in a high-dimensional single-index model is considered. Under elliptical symmetry, the single index model can be reformulated as a proxy linear model whose regression parameter is identifiable. We construct estimates of the regression coefficients of interest that are similar to the debiased lasso estimates in the standard linear model and exhibit similar properties: root-n-consistency and asymptotic normality. The procedure completely bypasses the estimation of the unknown link function, which can be extremely challenging depending on the underlying structure of the problem. Furthermore, under Gaussianity, we propose more efficient estimates of the coefficients by expanding the link function in the Hermite polynomial basis. Finally, we illustrate our approach via carefully designed simulation experiments.

preprint · 2020 · arXiv

Inference Without Compatibility

We consider hypothesis testing problems for three parameters in high-dimensional linear models with minimal sparsity assumptions of their type but without any compatibility conditions. Under this framework, we construct the first $\sqrt{n}$-consistent estimators for low-dimensional coefficients, the signal strength, and the noise level. We support our results using numerical simulations and provide comparisons with other estimators.

preprint · 2020 · arXiv

Markovian And Non-Markovian Processes with Active Decision Making Strategies For Addressing The COVID-19 Pandemic

We study and predict the evolution of Covid-19 in six US states over the period May 1 through August 31 using a discrete compartment-based model and prescribe active intervention policies, like lockdowns, on the basis of minimizing a loss function, within the broad framework of partially observed Markov decision processes. For each state, Covid-19 data for 40 days (starting from May 1 for two northern states and June 1 for four southern states) are analyzed to estimate the transition probabilities between compartments and other parameters associated with the evolution of the epidemic. These quantities are then used to predict the course of the epidemic in the given state for the next 50 days (test period) under various policy allocations, leading to different values of the loss function over the training horizon. The optimal policy allocation is the one corresponding to the smallest loss. Our analysis shows that none of the six states need lockdowns over the test period, though the no-lockdown prescription is to be interpreted with caution: responsible mask use and social distancing of course need to be continued. The caveats involved in modeling epidemic propagation of this sort are discussed at length. A sketch of a non-Markovian formulation of Covid-19 propagation (and more general epidemic propagation) is presented as an attractive avenue for future research in this area.

preprint · 2020 · arXiv

Optimal Linear Discriminators For The Discrete Choice Model In Growing Dimensions

Manski's celebrated maximum score estimator for the discrete choice model, which is an optimal linear discriminator, has been the focus of much investigation in both the econometrics and statistics literatures, but its behavior under growing dimension scenarios largely remains unknown. This paper addresses that gap. Two different cases are considered: $p$ grows with $n$ but at a slow rate, i.e. $p/n \rightarrow 0$; and $p \gg n$ (fast growth). In the binary response model, we recast Manski's score estimation as empirical risk minimization for a classification problem, and derive the $\ell_2$ rate of convergence of the score estimator under a transition condition in terms of our margin parameter that calibrates the level of difficulty of the estimation problem. We also establish upper and lower bounds for the minimax $\ell_2$ error in the binary choice model that differ by a logarithmic factor, and construct a minimax-optimal estimator in the slow growth regime. Some extensions to the general case, the multinomial response model, are also considered. Last but not least, we use a variety of learning algorithms to compute the maximum score estimator in growing dimensions.

preprint · 2016 · arXiv

On Bayesian robust regression with diverging number of predictors

This paper concerns the robust regression model when the number of predictors and the number of observations grow at a similar rate. Theory for M-estimators in this regime has been recently developed by several authors [El Karoui et al., 2013, Bean et al., 2013, Donoho and Montanari, 2013]. Motivated by the inability of M-estimators to successfully estimate the Euclidean norm of the coefficient vector, we consider a Bayesian framework for this model. We suggest a two-component mixture of normals prior for the coefficients and develop a Gibbs sampler procedure for sampling from relevant posterior distributions, while utilizing a scale mixture of normals representation for the error distribution. Unlike M-estimators, the proposed Bayes estimator is consistent in the Euclidean norm sense. Simulation results demonstrate the superiority of the Bayes estimator over traditional estimation methods.

preprint · 2015 · arXiv

Identifying a minimal class of models for high-dimensional data

Model selection consistency in the high-dimensional regression setting can be achieved only if strong assumptions are fulfilled. We therefore suggest pursuing a different goal, which we call a minimal class of models. The minimal class of models includes models that are similar in their prediction accuracy but not necessarily in their elements. We suggest a random search algorithm to reveal candidate models. The algorithm implements simulated annealing while using a score for each predictor, which we suggest deriving from a combination of the Lasso and the Elastic Net. The utility of using a minimal class of models is demonstrated in the analysis of two datasets.

preprint · 2014 · arXiv

On asymptotically optimal confidence regions and tests for high-dimensional models

We propose a general method for constructing confidence intervals and statistical tests for single or low-dimensional components of a large parameter vector in a high-dimensional model. It can be easily adjusted for multiplicity taking dependence among tests into account. For linear models, our method is essentially the same as in Zhang and Zhang [J. R. Stat. Soc. Ser. B Stat. Methodol. 76 (2014) 217-242]: we analyze its asymptotic properties and establish its asymptotic optimality in terms of semiparametric efficiency. Our method naturally extends to generalized linear models with convex loss functions. We develop the corresponding theory which includes a careful analysis for Gaussian, sub-Gaussian and bounded correlated designs.

preprint · 2012 · arXiv

Around the goal: Examining the effect of the first goal on the second goal in soccer using survival analysis methods

In this paper we apply survival techniques to soccer data, treating the scoring of a goal as the event of interest. The paper specifically concerns the relationship between the time of the first goal in the game and the time of the second goal. To this end, the relevant survival analysis concepts are adapted to fit the problem and a Cox model is developed for the hazard function. Attributes such as time-dependent covariates and a frailty term are also considered. We also use a reliable propensity score to summarize the pre-game covariates. The results indicate that a first goal could either expedite or impede the scoring of the next goal, depending on the time at which it was scored. Moreover, once a goal is scored, another goal becomes more and more likely as the game progresses. Furthermore, the first-goal effect is the same whether the goal was scored or conceded.

preprint · 2011 · arXiv

A Random Walk with Drift: Interview with Peter J. Bickel

I met Peter J. Bickel for the first time in 1981. He came to Jerusalem for a year; I had just started working on my Ph.D. studies. Yossi Yahav, who was my advisor at this time, busy as the Dean of Social Sciences, brought us together. Peter became my chief thesis advisor. A year and a half later I came to Berkeley as a post-doc. Since then we have continued to work together. Peter was first my advisor, then a teacher, and now he is also a friend. It is appropriate that this interview took place in two cities. We spoke together first in Jerusalem, at Mishkenot Shaananim and the Center for Research of Rationality, and then at the University of California at Berkeley. These conversations were not formal interviews, but just questions that prompted Peter to tell his story.

preprint · 2010 · arXiv

On the transductive arguments in statistics

The paper argues that a part of the current statistical discussion is not based on the standard firm foundations of the field. Among the examples we consider are prediction into the future, semi-supervised classification, and causality inference based on observational data.

preprint · 2010 · arXiv

Simultaneous analysis of Lasso and Dantzig selector

We exhibit an approximate equivalence between the Lasso estimator and Dantzig selector. For both methods we derive parallel oracle inequalities for the prediction risk in the general nonparametric regression model, as well as bounds on the $\ell_p$ estimation loss for $1\le p\le 2$ in the linear model when the number of variables can be much larger than the sample size.

preprint · 2010 · arXiv

Sparse Empirical Bayes Analysis (SEBA)

We consider a joint processing of $n$ independent sparse regression problems. Each is based on a sample $(y_{i1},x_{i1}),\dots,(y_{im},x_{im})$ of $m$ i.i.d. observations from $y_{i1}=x_{i1}^{T}\beta_i+\varepsilon_{i1}$, $y_{i1}\in \mathbb{R}$, $x_{i1}\in\mathbb{R}^p$, $i=1,\dots,n$, and $\varepsilon_{i1}\sim N(0,\sigma^2)$, say. Here $p$ is large enough so that the empirical risk minimizer is not consistent. We consider three possible extensions of the lasso estimator to deal with this problem, the lassoes, the group lasso and the RING lasso, each utilizing a different assumption on how these problems are related. For each estimator we give a Bayesian interpretation, and we present both persistency analysis and non-asymptotic error bounds based on restricted eigenvalue type assumptions.

preprint · 2010 · arXiv

The Best Linear Unbiased Estimator for Continuation of a Function

We show how to construct the best linear unbiased predictor (BLUP) for the continuation of a curve in a spline-function model. We assume that the entire curve is drawn from some smooth random process and that the curve is given up to some cut point. We demonstrate how to compute the BLUP efficiently. Confidence bands for the BLUP are discussed. Finally, we apply the proposed BLUP to real-world call center data. Specifically, we forecast the continuation of both the call arrival counts and the workload process at the call center of a commercial bank.
