Source author record

Marianna Pensky

Marianna Pensky appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Topics: math.ST, Statistics Theory, Methodology, Applications, math.FA

Catalog footprint

What is connected

18 works

5 topics

4 close collaborators

Preprint, 2020, arXiv

Clustering in statistical ill-posed linear inverse problems

In many statistical linear inverse problems, one needs to recover classes of similar curves from their noisy images under an operator that does not have a bounded inverse. Problems of this kind appear in many areas of application. Routinely, in such problems clustering is carried out at the pre-processing step and then the inverse problem is solved for each of the cluster averages separately. As a result, the errors of the procedures are usually examined for the estimation step only. The objective of this paper is to examine, both theoretically and via simulations, the effect of clustering on the accuracy of the solutions of general ill-posed linear inverse problems. In particular, we assume that one observes $X_m = A f_m + δε_m$, $m=1, \cdots, M$, where functions $f_m$ can be grouped into $K$ classes and one needs to recover a vector function ${\bf f}= (f_1,\cdots, f_M)^T$. We construct an estimator for ${\bf f}$ as a solution of a penalized optimization problem and derive an oracle inequality for its precision. By deriving upper and minimax lower bounds for the error, we confirm that the estimator is minimax optimal or nearly minimax optimal up to a logarithmic factor of the number of observations. One of the advantages of our estimation procedure is that we do not assume that the number of clusters is known in advance. We conclude that clustering at the pre-processing step is beneficial when the problem is moderately ill-posed. It should be applied with extreme care when the problem is severely ill-posed.

Preprint, 2020, arXiv

Estimation and Clustering in Popularity Adjusted Stochastic Block Model

The paper considers the Popularity Adjusted Block model (PABM) introduced by Sengupta and Chen (2018). We argue that the main appeal of the PABM is the flexibility of the spectral properties of the graph, which makes the PABM an attractive choice for modeling networks that appear in biological sciences. We expand the theory of PABM to the case of an arbitrary number of communities which possibly grows with the number of nodes in the network and is not assumed to be known. We produce the estimators of the probability matrix and the community structure and provide non-asymptotic upper bounds for the estimation and the clustering errors. We use the Sparse Subspace Clustering (SSC) approach to partition the network into communities, an approach that, to the best of our knowledge, has not been used for clustering network data. The theory is supplemented by a simulation study. In addition, we show advantages of the PABM for modeling a butterfly similarity network and a human brain functional network.

Preprint, 2015, arXiv

Estimation of delta-contaminated density of the random intensity of Poisson data

In the present paper, we construct an estimator of a delta-contaminated mixing density function $g(λ)$ of the intensity $λ$ of the Poisson distribution. The estimator is based on an expansion of the continuous portion $g_0(λ)$ of the unknown pdf over an overcomplete dictionary, with the recovery of the coefficients obtained as a solution of an optimization problem with Lasso penalty. In order to apply the Lasso technique in the so-called prediction setting, where it requires virtually no assumptions on the dictionary, and, moreover, to ensure fast convergence of the Lasso estimator, we use a novel formulation of the optimization problem based on inversion of the dictionary elements. The total estimator of the delta-contaminated mixing pdf is obtained using a two-stage iterative procedure. We formulate conditions on the dictionary and the unknown mixing density that yield a sharp oracle inequality for the norm of the difference between $g_0 (λ)$ and its estimator and, thus, obtain a smaller error than in a minimax setting. Numerical simulations and comparisons with the Laguerre functions based estimator recently constructed by Comte and Genon-Catalot (2015) also show advantages of our procedure. Finally, we apply the technique developed in the paper to estimation of a delta-contaminated mixing density of the Poisson intensity of the Saturn's rings data.

Preprint, 2015, arXiv

Laplace deconvolution on the basis of time domain data and its application to Dynamic Contrast Enhanced imaging

In the present paper we consider the problem of Laplace deconvolution with noisy discrete non-equally spaced observations on a finite time interval. We propose a new method for Laplace deconvolution which is based on expansions of the convolution kernel, the unknown function and the observed signal over a Laguerre function basis (which acts as a surrogate eigenfunction basis of the Laplace convolution operator) using a regression setting. The expansion results in a small system of linear equations with the matrix of the system being triangular and Toeplitz. Due to this triangular structure, there is a common number $m$ of terms in the function expansions to control, which is realized via a complexity penalty. The advantage of this methodology is that it leads to very fast computations, produces no boundary effects due to extension at zero and cut-off at $T$ and provides an estimator with the risk within a logarithmic factor of the oracle risk. We emphasize that, in the present paper, we consider the true observational model with possibly nonequispaced observations which are available on a finite interval of length $T$, a model which appears in many different contexts, and account for the bias associated with this model (which is not present when $T\rightarrow\infty$). The study is motivated by perfusion imaging using a short injection of contrast agent, a procedure which is applied for medical assessment of micro-circulation within tissues such as cancerous tumors. The presence of a tuning parameter $a$ allows one to choose the most advantageous time units, so that both the kernel and the unknown right hand side of the equation are well represented for the deconvolution. The methodology is illustrated by an extensive simulation study and a real data example which confirms that the proposed technique is fast, efficient, accurate, usable from a practical point of view and very competitive.
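
The abstract above notes that the expansion leads to a small triangular Toeplitz system. As a rough illustration of why such systems are cheap to solve, here is a minimal sketch of forward substitution for a generic lower triangular Toeplitz matrix. It is not the paper's estimator: the lower triangular orientation, the function names `solve_lower_toeplitz` and `toeplitz_matvec`, and the vectors `g` and `q` are all assumptions made for illustration, and how the entries would be derived from Laguerre coefficients is not specified here.

```python
def toeplitz_matvec(g, f):
    """Multiply the lower triangular Toeplitz matrix G by f.

    G is defined by its first column g (hypothetical input):
    G[i][j] = g[i - j] for i >= j, and 0 otherwise.
    """
    return [sum(g[i - j] * f[j] for j in range(i + 1)) for i in range(len(f))]


def solve_lower_toeplitz(g, q):
    """Solve G f = q by forward substitution, with G built from g as above."""
    if len(g) != len(q):
        raise ValueError("g and q must have the same length")
    if not g or g[0] == 0:
        raise ValueError("g[0] must be nonzero for a unique solution")
    f = []
    for i in range(len(q)):
        # Subtract the contributions of the coefficients already computed.
        acc = sum(g[i - j] * f[j] for j in range(i))
        f.append((q[i] - acc) / g[0])
    return f
```

Because the matrix is fully described by its first column, truncating the system to the first $m$ unknowns only requires the first $m$ entries of `g` and `q`, so trying several values of $m$ reuses the same inputs.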

Preprint, 2015, arXiv

Minimax theory of estimation of linear functionals of the deconvolution density with or without sparsity

The present paper considers a problem of estimating a linear functional $Φ=\int_{-\infty}^\infty φ(x) f(x)dx$ of an unknown deconvolution density $f$ on the basis of i.i.d. observations $Y_i = θ_i + ξ_i$ where $ξ_i$ has a known pdf $g$ and $f$ is the pdf of $θ_i$. Although various aspects and particular cases of this problem have been treated by a number of authors, there are still many gaps. In particular, there are no minimax lower bounds for an estimator of $Φ$ for an arbitrary function $φ$. The general upper risk bounds cover only the case when the Fourier transform of $φ$ exists. Moreover, no theory exists for estimating $Φ$ when the vector of observations is sparse. In addition, until now, the related problem of estimation of functionals $Φ_n = n^{-1} \sum_{i=1}^n φ(θ_i)$ in indirect observations has been treated as a separate problem with no connection to estimation of $Φ$. The objective of the present paper is to fill in the gaps and develop the general minimax theory of estimation of $Φ$ and $Φ_n$. We offer a general approach to estimation of $Φ$ (and $Φ_n$) and provide the upper and the minimax lower risk bounds in the case when function $φ$ is square integrable. Furthermore, we extend the theory to the case when the Fourier transform of $φ$ does not exist and $Φ$ can be presented as a linear functional of the Fourier transform of $f$ and its derivatives. Finally, we generalize our results to handle the situation when vector $θ$ is sparse. As a direct application of the proposed theory, we obtain multiple new results and automatically recover existing ones for a variety of problems such as estimation of the $(2M+1)$-th absolute moment or a generalized moment of the deconvolution density, estimation of the mixing cdf or estimation of the mixing pdf with classical and Berkson errors.

Preprint, 2015, arXiv

Solution of linear ill-posed problems using overcomplete dictionaries

In the present paper we consider application of overcomplete dictionaries to solution of general ill-posed linear inverse problems. Construction of an adaptive optimal solution for such problems usually relies either on a singular value decomposition or representation of the solution via an orthonormal basis. The shortcoming of both approaches lies in the fact that, in many situations, neither the eigenbasis of the linear operator nor a standard orthonormal basis constitutes an appropriate collection of functions for sparse representation of the unknown function. In the context of regression problems, there has been an enormous amount of effort to recover an unknown function using an overcomplete dictionary. One of the most popular methods, Lasso, is based on minimizing the empirical likelihood and requires stringent assumptions on the dictionary, the so-called compatibility conditions. While these conditions may be satisfied for the original dictionary functions, they usually do not hold for their images due to contraction imposed by the linear operator. In what follows, we bypass this difficulty by a novel approach which is based on inverting each of the dictionary functions and matching the resulting expansion to the true function, thus avoiding unrealistic assumptions on the dictionary and using Lasso in a predictive setting. We examine both the white noise and the observational model formulations and also discuss how exact inverse images of the dictionary functions can be replaced by their approximate counterparts. Furthermore, we show how the suggested methodology can be extended to the problem of estimation of a mixing density in a continuous mixture. For all the situations listed above, we provide the oracle inequalities for the risk in a finite sample setting. Simulation studies confirm good computational properties of the Lasso-based technique.

Preprint, 2014, arXiv

Sparse high-dimensional varying coefficient model: non-asymptotic minimax study

The objective of the present paper is to develop a minimax theory for the varying coefficient model in a non-asymptotic setting. We consider a high-dimensional sparse varying coefficient model where only a few of the covariates are present and only some of those covariates are time dependent. Our analysis allows the time dependent covariates to have different degrees of smoothness and to be spatially inhomogeneous. We develop the minimax lower bounds for the quadratic risk and construct an adaptive estimator which attains those lower bounds within a constant (if all time-dependent covariates are spatially homogeneous) or logarithmic factor of the number of observations.

Preprint, 2013, arXiv

Adaptive Nonparametric Empirical Bayes Estimation Via Wavelet Series: the Minimax Study

In the present paper, we derive lower bounds for the risk of the nonparametric empirical Bayes estimators. In order to attain the optimal convergence rate, we propose a generalization of the linear empirical Bayes estimation method which takes advantage of the flexibility of the wavelet techniques. We present an empirical Bayes estimator as a wavelet series expansion and estimate coefficients by minimizing the prior risk of the estimator. As a result, estimation of wavelet coefficients requires solution of a well-posed low-dimensional sparse system of linear equations. The dimension of the system depends on the size of wavelet support and smoothness of the Bayes estimator. An adaptive choice of the resolution level is carried out using the Lepski (1997) method. The method is computationally efficient and provides asymptotically optimal adaptive EB estimators. The theory is supplemented by numerous examples.

Preprint, 2013, arXiv

Anisotropic Denoising in Functional Deconvolution Model with Dimension-free Convergence Rates

In the present paper we consider the problem of estimating a periodic $(r+1)$-dimensional function $f$ based on observations from its noisy convolution. We construct a wavelet estimator of $f$, derive minimax lower bounds for the $L^2$-risk when $f$ belongs to a Besov ball of mixed smoothness and demonstrate that the wavelet estimator is adaptive and asymptotically near-optimal within a logarithmic factor, in a wide range of Besov balls. We prove in particular that choosing this type of mixed smoothness leads to rates of convergence which are free of the "curse of dimensionality" and, hence, are higher than usual convergence rates when $r$ is large. The problem studied in the paper is motivated by seismic inversion which can be reduced to solution of noisy two-dimensional convolution equations that allow one to draw inference on underground layer structures along the chosen profiles. The common practice in seismology is to recover layer structures separately for each profile and then to combine the derived estimates into a two-dimensional function. By studying the two-dimensional version of the model, we demonstrate that this strategy usually leads to estimators which are less accurate than the ones obtained as two-dimensional functional deconvolutions. Indeed, we show that unless the function $f$ is very smooth in the direction of the profiles, very spatially inhomogeneous along the other direction and the number of profiles is very limited, the functional deconvolution solution has a much better precision compared to a combination of $M$ solutions of separate convolution equations. A limited simulation study in the case of $r=1$ confirms theoretical claims of the paper.

Preprint, 2013, arXiv

De-noising procedures for frame operators

The present paper provides a comprehensive study of de-noising properties of frames and, in particular, tight frames, which constitute one of the most popular tools in contemporary signal processing. The objective of the paper is to bridge the existing gap between mathematical and statistical theories on one hand and engineering practice on the other and explore how one can take advantage of a specific structure of a frame in contrast to an arbitrary collection of vectors or an orthonormal basis. For both the general and the tight frames, the paper presents a set of practically implementable de-noising techniques which take frame induced correlation structures into account. These results are supplemented by an examination of the case when the frame is constructed as a collection of orthonormal bases. In particular, recommendations are given for aggregation of the estimators at the stage of frame coefficients. The paper is concluded by a finite sample simulation study which confirms that taking frame structure and frame induced correlations into account indeed improves de-noising precision.

Preprint, 2013, arXiv

Laplace deconvolution with noisy observations

In the present paper we consider Laplace deconvolution for discrete noisy data observed on an interval whose length may increase with the sample size. Although this problem arises in a variety of applications, to the best of our knowledge, it has been given very little attention by the statistical community. Our objective is to fill this gap and provide statistical treatment of the Laplace deconvolution problem with noisy discrete data. The main contribution of the paper is explicit construction of an asymptotically rate-optimal (in the minimax sense) Laplace deconvolution estimator which is adaptive to the regularity of the unknown function. We show that the original Laplace deconvolution problem can be reduced to nonparametric estimation of a regression function and its derivatives on the interval of growing length $T_n$. Whereas the forms of the estimators remain standard, the choices of the parameters and the minimax convergence rates, which are expressed in terms of $T_n^2/n$ in this case, are affected by the asymptotic growth of the length of the interval. We derive an adaptive kernel estimator of the function of interest, and establish its asymptotic minimaxity over a range of Sobolev classes. We illustrate the theory by examples of construction of explicit expressions of Laplace deconvolution estimators. A simulation study shows that, in addition to providing asymptotic optimality as the number of observations tends to infinity, the proposed estimator demonstrates good performance in finite sample examples.

Preprint, 2013, arXiv

Multichannel Deconvolution with Long-Range Dependence: A Minimax Study

We consider the problem of estimating the unknown response function in the multichannel deconvolution model with long-range dependent Gaussian errors. We do not limit our consideration to a specific type of long-range dependence; rather, we assume that the errors should satisfy a general assumption in terms of the smallest and largest eigenvalues of their covariance matrices. We derive minimax lower bounds for the quadratic risk in the proposed multichannel deconvolution model when the response function is assumed to belong to a Besov ball and the blurring function is assumed to possess some smoothness properties, including both regular-smooth and super-smooth convolutions. Furthermore, we propose an adaptive wavelet estimator of the response function that is asymptotically optimal (in the minimax sense), or near-optimal within a logarithmic factor, in a wide range of Besov balls. It is shown that the optimal convergence rates depend on the balance between the smoothness parameter of the response function, the kernel parameters of the blurring function, the long memory parameters of the errors, and how the total number of observations is distributed among the total number of channels. Some examples of inverse problems in mathematical physics where one needs to recover initial or boundary conditions on the basis of observations from a noisy solution of a partial differential equation are used to illustrate the application of the theory we developed. The optimal convergence rates and the adaptive estimators we consider extend the ones studied by Pensky and Sapatinas (2009, 2010) for independent and identically distributed Gaussian errors to the case of long-range dependent Gaussian errors.

Preprint, 2013, arXiv

Non-asymptotic approach to varying coefficient model

In the present paper we consider the varying coefficient model which represents a useful tool for exploring dynamic patterns in many applications. Existing methods typically provide asymptotic evaluation of precision of estimation procedures under the assumption that the number of observations tends to infinity. In practical applications, however, only a finite number of measurements are available. In the present paper we focus on a non-asymptotic approach to the problem. We propose a novel estimation procedure which is based on recent developments in matrix estimation. In particular, for our estimator, we obtain upper bounds for the mean squared and the pointwise estimation errors. The obtained oracle inequalities are non-asymptotic and hold for finite sample size.

Preprint, 2013, arXiv

Spatially inhomogeneous linear inverse problems with possible singularities

The objective of the present paper is to introduce the concept of a spatially inhomogeneous linear inverse problem where the degree of ill-posedness of operator $Q$ depends not only on the scale but also on location. In this case, the rates of convergence are determined by the interaction of four parameters, the smoothness and spatial homogeneity of the unknown function $f$ and degrees of ill-posedness and spatial inhomogeneity of operator $Q$. Estimators obtained in the paper are based either on wavelet-vaguelette decomposition (if the norms of all vaguelettes are finite) or on a hybrid of wavelet-vaguelette decomposition and Galerkin method (if vaguelettes in the neighborhood of the singularity point have infinite norms). The hybrid estimator is a combination of a linear part in the vicinity of the singularity point and the nonlinear block thresholding wavelet estimator elsewhere. To attain adaptivity, an optimal resolution level for the linear, singularity affected, portion of the estimator is obtained using Lepski [Theory Probab. Appl. 35 (1990) 454-466 and 36 (1991) 682-697] method and is used subsequently as the lowest resolution level for the nonlinear wavelet estimator. We show that convergence rates of the hybrid estimator lie within a logarithmic factor of the optimal minimax convergence rates. The theory presented in the paper is supplemented by examples of deconvolution with a spatially inhomogeneous kernel and deconvolution in the presence of locally extreme noise or extremely inhomogeneous design. The first two problems are examined via a limited simulation study which demonstrates advantages of the hybrid estimator when the degree of spatial inhomogeneity is high. In addition, we apply the technique to recovery of a convolution signal transmitted via amplitude modulation.

Preprint, 2012, arXiv

Laplace deconvolution and its application to Dynamic Contrast Enhanced imaging

In the present paper we consider the problem of Laplace deconvolution with noisy discrete observations. The study is motivated by Dynamic Contrast Enhanced imaging using a bolus of contrast agent, a procedure which allows considerable improvement in evaluating the quality of a vascular network and its permeability and is widely used in medical assessment of brain flows or cancerous tumors. Although the study is motivated by a medical imaging application, we obtain a solution of a general problem of Laplace deconvolution based on noisy data which appears in many different contexts. We propose a new method for Laplace deconvolution which is based on expansions of the convolution kernel, the unknown function and the observed signal over a Laguerre function basis. The expansion results in a small system of linear equations with the matrix of the system being triangular and Toeplitz. The number $m$ of the terms in the expansion of the estimator is controlled via a complexity penalty. The advantage of this methodology is that it leads to very fast computations, does not require exact knowledge of the kernel and produces no boundary effects due to extension at zero and cut-off at $T$. The technique leads to an estimator with the risk within a logarithmic factor of $m$ of the oracle risk under no assumptions on the model and within a constant factor of the oracle risk under mild assumptions. The methodology is illustrated by a finite sample simulation study which includes an example of the kernel obtained in the real life DCE experiments. Simulations confirm that the proposed technique is fast, efficient, accurate, usable from a practical point of view and competitive.

Preprint, 2012, arXiv

Nonparametric Regression Estimation Based on Spatially Inhomogeneous Data: Minimax Global Convergence Rates and Adaptivity

We consider the nonparametric regression estimation problem of recovering an unknown response function f on the basis of spatially inhomogeneous data when the design points follow a known compactly supported density g with a finite number of well separated zeros. In particular, we consider two different cases: when g has zeros of a polynomial order and when g has zeros of an exponential order. These two cases correspond to moderate and severe data losses, respectively. We obtain asymptotic minimax lower bounds for the global risk of an estimator of f and construct adaptive wavelet nonlinear thresholding estimators of f which attain those minimax convergence rates (up to a logarithmic factor in the case of a zero of a polynomial order), over a wide range of Besov balls. The spatially inhomogeneous ill-posed problem that we investigate is inherently more difficult than spatially homogeneous problems like, e.g., deconvolution. In particular, due to spatial irregularity, assessment of minimax global convergence rates is a much harder task than the derivation of minimax local convergence rates studied recently in the literature. Furthermore, the resulting estimators exhibit very different behavior and minimax global convergence rates in comparison with the solution of spatially homogeneous ill-posed problems. For example, unlike in deconvolution problem, the minimax global convergence rates are greatly influenced not only by the extent of data loss but also by the degree of spatial homogeneity of f. Specifically, even if 1/g is not integrable, one can recover f as well as in the case of an equispaced design (in terms of minimax global convergence rates) when it is homogeneous enough since the estimator is "borrowing strength" in the areas where f is adequately sampled.

Preprint, 2011, arXiv

Multichannel Boxcar Deconvolution with Growing Number of Channels

We consider the problem of estimating the unknown response function in the multichannel deconvolution model with a boxcar-like kernel which is of particular interest in signal processing. It is known that, when the number of channels is finite, the precision of reconstruction of the response function increases as the number of channels $M$ grows (even when the total number of observations $n$ for all channels $M$ remains constant) and this requires that the parameters of the channels form a Badly Approximable $M$-tuple. Recent advances in data collection and recording techniques made it of urgent interest to study the case when the number of channels $M=M_n$ grows with the total number of observations $n$. However, in real-life situations, the number of channels $M = M_n$ usually refers to the number of physical devices and, consequently, may grow to infinity only at a slow rate as $n \rightarrow \infty$. When $M=M_n$ grows slowly as $n$ increases, we develop a procedure for the construction of a Badly Approximable $M$-tuple on a specified interval, of a non-asymptotic length, together with a lower bound associated with this $M$-tuple, which explicitly shows its dependence on $M$ as $M$ is growing. This result is further used for the evaluation of the $L^2$-risk of the suggested adaptive wavelet thresholding estimator of the unknown response function and, furthermore, for the choice of the optimal number of channels $M$ which minimizes the $L^2$-risk.

Preprint, 2010, arXiv

On convergence rates equivalency and sampling strategies in functional deconvolution models

Using the asymptotic minimax framework, we examine convergence rates equivalency between a continuous functional deconvolution model and its real-life discrete counterpart over a wide range of Besov balls and for the $L^2$-risk. For this purpose, all possible models are divided into three groups. For the models in the first group, which we call uniform, the convergence rates in the discrete and the continuous models coincide no matter what sampling scheme is chosen, and hence the replacement of the discrete model by its continuous counterpart is legitimate. For the models in the second group, to which we refer as regular, one can point out the best sampling strategy in the discrete model, but not every sampling scheme leads to the same convergence rates; there are at least two sampling schemes which deliver different convergence rates in the discrete model (i.e., at least one of the discrete models leads to convergence rates that are different from the convergence rates in the continuous model). The third group consists of models for which, in general, it is impossible to devise the best sampling strategy; we call these models irregular. We formulate the conditions when each of these situations takes place. In the regular case, we not only point out the number and the selection of sampling points which deliver the fastest convergence rates in the discrete model but also investigate when, in the case of an arbitrary sampling scheme, the convergence rates in the continuous model coincide or do not coincide with the convergence rates in the discrete model. We also study what happens if one chooses a uniform, or a more general pseudo-uniform, sampling scheme which can be viewed as an intuitive replacement of the continuous model.
