Source author record

Albert Cohen

Albert Cohen appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

math.NA math.AP Numerical Analysis math.PR Programming Languages Distributed, Parallel, and Cluster Computing math.ST Performance Statistics Theory Cryptography and Security Machine Learning math.FA math.OC Software Engineering Systems and Control

Catalog footprint: 31 works, 15 topics, 4 close collaborators


preprint, 2022, arXiv

Composable and Modular Code Generation in MLIR: A Structured and Retargetable Approach to Tensor Compiler Construction

Despite significant investment in software infrastructure, machine learning systems, runtimes and compilers do not compose properly. We propose a new design aiming at providing unprecedented degrees of modularity, composability and genericity. This paper discusses a structured approach to the construction of domain-specific code generators for tensor compilers, with the stated goal of improving the productivity of both compiler engineers and end-users. The approach leverages the natural structure of tensor algebra. It has been the main driver for the design of progressive lowering paths in MLIR. The proposed abstractions and transformations span data structures and control flow with both functional (SSA form) and imperative (side-effecting) semantics. We discuss the implications of this infrastructure on compiler construction and present preliminary experimental results.

preprint, 2021, arXiv

Secure Optimization Through Opaque Observations

Secure applications implement software protections against side-channel and physical attacks. Such protections are meaningful at machine code or micro-architectural level, but they typically do not carry observable semantics at source level. To prevent optimizing compilers from altering the protection, security engineers embed input/output side-effects into the protection. These side-effects are error-prone and compiler-dependent, and the current practice involves analyzing the generated machine code to make sure security or privacy properties are still enforced. Vu et al. recently demonstrated how to automate the insertion of volatile side-effects in a compiler [52], but these may be too expensive in fine-grained protections such as control-flow integrity. We introduce observations of the program state that are intrinsic to the correct execution of security protections, along with means to specify and preserve observations across the compilation flow. Such observations complement the traditional input/output-preservation contract of compilers. We show how to guarantee their preservation without modifying compilation passes and with as little performance impact as possible. We validate our approach on a range of benchmarks, expressing the secure compilation of these applications in terms of observations to be made at specific program points.

preprint, 2020, arXiv

MLIR: A Compiler Infrastructure for the End of Moore's Law

This work presents MLIR, a novel approach to building reusable and extensible compiler infrastructure. MLIR aims to address software fragmentation, improve compilation for heterogeneous hardware, significantly reduce the cost of building domain-specific compilers, and aid in connecting existing compilers together. MLIR facilitates the design and implementation of code generators, translators and optimizers at different levels of abstraction and also across application domains, hardware targets and execution environments. The contribution of this work includes (1) a discussion of MLIR as a research artifact, built for extension and evolution, identifying the challenges and opportunities posed by this novel design point in design, semantics, optimization specification, system, and engineering; and (2) an evaluation of MLIR as a generalized infrastructure that reduces the cost of building compilers, describing diverse use-cases to show research and educational opportunities for future programming languages, compilers, execution environments, and computer architecture. The paper also presents the rationale for MLIR, its original design principles, structures and semantics.

preprint, 2020, arXiv

Nonlinear Methods for Model Reduction

The usual approach to model reduction for parametric partial differential equations (PDEs) is to construct a linear space $V_n$ which approximates well the solution manifold $\mathcal{M}$ consisting of all solutions $u(y)$ with $y$ the vector of parameters. This linear reduced model $V_n$ is then used for various tasks such as building an online forward solver for the PDE or estimating parameters from data observations. It is well understood in other problems of numerical computation that nonlinear methods such as adaptive approximation, $n$-term approximation, and certain tree-based methods may provide improved numerical efficiency. For model reduction, a nonlinear method would replace the linear space $V_n$ by a nonlinear space $Σ_n$. This idea has already been suggested in recent papers on model reduction where the parameter domain is decomposed into a finite number of cells and a linear space of low dimension is assigned to each cell. Up to this point, little is known in terms of performance guarantees for such a nonlinear strategy. Moreover, most numerical experiments for nonlinear model reduction use a parameter dimension of only one or two. In this work, a step is made towards a more cohesive theory for nonlinear model reduction. Framing these methods in the general setting of library approximation allows us to give a first comparison of their performance with those of standard linear approximation for any general compact set. We then turn to the study of these methods for solution manifolds of parametrized elliptic PDEs. We study a very specific example of library approximation where the parameter domain is split into a finite number $N$ of rectangular cells and where different reduced affine spaces of dimension $m$ are assigned to each cell. The performance of this nonlinear procedure is analyzed from the viewpoint of accuracy of approximation versus $m$ and $N$.

preprint, 2020, arXiv

Optimal reduced model algorithms for data-based state estimation

Reduced model spaces, such as reduced basis and polynomial chaos, are linear spaces $V_n$ of finite dimension $n$ which are designed for the efficient approximation of families of parametrized PDEs in a Hilbert space $V$. The manifold $\mathcal{M}$ that gathers the solutions of the PDE for all admissible parameter values is globally approximated by the space $V_n$ with some controlled accuracy $ε_n$, which is typically much smaller than when using standard approximation spaces of the same dimension such as finite elements. Reduced model spaces have also been proposed in [13] as a vehicle to design a simple linear recovery algorithm of the state $u\in\mathcal{M}$ corresponding to a particular solution when the values of parameters are unknown but a set of data is given by $m$ linear measurements of the state. The measurements are of the form $\ell_j(u)$, $j=1,\dots,m$, where the $\ell_j$ are linear functionals on $V$. The analysis of this approach in [2] shows that the recovery error is bounded by $μ_nε_n$, where $μ_n=μ(V_n,W)$ is the inverse of an inf-sup constant that describes the angle between $V_n$ and the space $W$ spanned by the Riesz representers of $(\ell_1,\dots,\ell_m)$. A reduced model space which is efficient for approximation might thus be ineffective for recovery if $μ_n$ is large or infinite. In this paper, we discuss the existence and construction of an optimal reduced model space for this recovery method, and we extend our search to affine spaces. Our basic observation is that this problem is equivalent to the search of an optimal affine algorithm for the recovery of $\mathcal{M}$ in the worst case error sense. This allows us to perform our search by a convex optimization procedure. Numerical tests illustrate that the reduced model spaces constructed from our approach perform better than the classical reduced basis spaces.

preprint, 2020, arXiv

Optimal Stable Nonlinear Approximation

While it is well known that nonlinear methods of approximation can often perform dramatically better than linear methods, there are still questions on how to measure the optimal performance possible for such methods. This paper studies nonlinear methods of approximation that are compatible with numerical implementation in that they are required to be numerically stable. A measure of optimal performance, called stable manifold widths, for approximating a model class $K$ in a Banach space $X$ by stable manifold methods is introduced. Fundamental inequalities between these stable manifold widths and the entropy of $K$ are established. The effects of requiring stability in the settings of deep learning and compressed sensing are discussed.

preprint, 2020, arXiv

Reduced Basis Greedy Selection Using Random Training Sets

Reduced bases have been introduced for the approximation of parametrized PDEs in applications where many online queries are required. Their numerical efficiency for such problems has been theoretically confirmed in [BCDDPW, DPW], where it is shown that the reduced basis space $V_n$ of dimension $n$, constructed by a certain greedy strategy, has approximation error similar to that of the optimal space associated to the Kolmogorov $n$-width of the solution manifold. The greedy construction of the reduced basis space is performed in an offline stage which requires at each step a maximization of the current error over the parameter space. For the purpose of numerical computation, this maximization is performed over a finite training set obtained through a discretization of the parameter domain. To guarantee a final approximation error $\varepsilon$ for the space generated by the greedy algorithm requires in principle that the snapshots associated to this training set constitute an approximation net for the solution manifold with accuracy of order $\varepsilon$. Hence, the size of the training set is the $\varepsilon$ covering number for $\mathcal{M}$ and this covering number typically behaves like $\exp(C\varepsilon^{-1/s})$ for some $C>0$ when the solution manifold has $n$-width decay $O(n^{-s})$. Thus, the sheer size of the training set prohibits implementation of the algorithm when $\varepsilon$ is small. The main result of this paper shows that, if one is willing to accept results which hold with high probability, rather than with certainty, then for a large class of relevant problems one may replace the fine discretization by a random training set of size polynomial in $\varepsilon^{-1}$. Our proof of this fact is established by using inverse inequalities for polynomials in high dimensions.
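
The greedy construction over a random training set described in this abstract can be illustrated with a short sketch. This is not the paper's algorithm: it assumes a plain "strong" greedy step (the exact projection error is maximized over the training snapshots), and the names `greedy_reduced_basis`, `random_training_set`, `solve` and `sample_parameter`, as well as the toy two-parameter solution map, are hypothetical and chosen only for illustration.

```python
import numpy as np

def random_training_set(solve, sample_parameter, size, rng):
    """Draw `size` random parameters and return their snapshots as rows."""
    return np.array([solve(sample_parameter(rng)) for _ in range(size)])

def greedy_reduced_basis(snapshots, n, tol=1e-12):
    """Pick up to n orthonormal basis vectors greedily from the snapshot rows."""
    residuals = np.array(snapshots, dtype=float)  # copy; holds projection errors
    basis = []
    for _ in range(n):
        errors = np.linalg.norm(residuals, axis=1)
        k = int(np.argmax(errors))        # snapshot that is currently worst approximated
        if errors[k] <= tol:              # training set already captured
            break
        q = residuals[k] / errors[k]      # new orthonormal direction
        basis.append(q)
        residuals -= np.outer(residuals @ q, q)  # remove the new direction everywhere
    worst = float(np.linalg.norm(residuals, axis=1).max())
    return np.array(basis), worst

# Toy example (assumption): solutions depend linearly on two parameters,
# so two greedy steps should capture every training snapshot.
rng = np.random.default_rng(0)
modes = rng.standard_normal((2, 10))
solve = lambda y: y @ modes
training = random_training_set(solve, lambda r: r.uniform(-1, 1, size=2), 50, rng)
basis, worst = greedy_reduced_basis(training, n=2)
```

In this toy setting the worst training error after two steps is at the level of rounding error; for a genuine parametric PDE, `solve` would be a discretized solver and the error would decay gradually with `n`.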

preprint, 2020, arXiv

State Estimation -- The Role of Reduced Models

The exploration of complex physical or technological processes usually requires exploiting available information from different sources: (i) physical laws often represented as a family of parameter dependent partial differential equations and (ii) data provided by measurement devices or sensors. The number of sensors is typically limited and data acquisition may be expensive and in some cases even harmful. This article reviews some recent developments for this "small-data" scenario where inversion is strongly aggravated by the typically large parametric dimensionality. The proposed concepts may be viewed as exploring alternatives to Bayesian inversion in favor of more deterministic accuracy quantification related to the required computational complexity. We discuss optimality criteria which delineate intrinsic information limits, and highlight the role of reduced models for developing efficient computational strategies. In particular, the need to adapt the reduced models -- not to a specific (possibly noisy) data set but rather to the sensor system -- is a central theme. This, in turn, is facilitated by exploiting geometric perspectives based on proper stable variational formulations of the continuous model.

preprint, 2016, arXiv

Diffusion Coefficients Estimation for Elliptic Partial Differential Equations

This paper considers the Dirichlet problem $$ -\mathrm{div}(a\nabla u_a)=f \quad \text{on } D, \qquad u_a=0 \quad \text{on } \partial D, $$ for a Lipschitz domain $D\subset \mathbb R^d$, where $a$ is a scalar diffusion function. For a fixed $f$, we discuss under which conditions $a$ is uniquely determined and when $a$ can be stably recovered from the knowledge of $u_a$. A first result is that whenever $a\in H^1(D)$, with $0<λ\le a\le Λ$ on $D$, and $f\in L_\infty(D)$ is strictly positive, then $$ \|a-b\|_{L_2(D)}\le C\|u_a-u_b\|_{H_0^1(D)}^{1/6}. $$ More generally, it is shown that the assumption $a\in H^1(D)$ can be weakened to $a\in H^s(D)$, for certain $s<1$, at the expense of lowering the exponent $1/6$ to a value that depends on $s$.

preprint, 2016, arXiv

Discrete least-squares approximations over optimized downward closed polynomial spaces in arbitrary dimension

We analyze the accuracy of the discrete least-squares approximation of a function $u$ in multivariate polynomial spaces $\mathbb{P}_Λ:={\rm span} \{y\mapsto y^ν\,: \, ν\in Λ\}$ with $Λ\subset \mathbb{N}_0^d$ over the domain $Γ:=[-1,1]^d$, based on the sampling of this function at points $y^1,\dots,y^m \in Γ$. The samples are independently drawn according to a given probability density $ρ$ belonging to the class of multivariate beta densities, which includes the uniform and Chebyshev densities as particular cases. We restrict our attention to polynomial spaces associated with downward closed sets $Λ$ of prescribed cardinality $n$, and we optimize the choice of the space for the given sample. This implies, in particular, that the selected polynomial space depends on the sample. We are interested in comparing the error of this least-squares approximation measured in $L^2(Γ,dρ)$ with the best achievable polynomial approximation error when using downward closed sets of cardinality $n$. We establish conditions between the dimension $n$ and the size $m$ of the sample, under which these two errors are proven to be comparable. Our main finding is that the dimension $d$ enters only moderately in the resulting trade-off between $m$ and $n$, in terms of a logarithmic factor $\ln(d)$, and is even absent when the optimization is restricted to a relevant subclass of downward closed sets, named anchored sets. In principle, this allows one to use these methods in arbitrarily high or even infinite dimension. Our analysis builds upon [2] which considered fixed and nonoptimized downward closed multi-index sets. Potential applications of the proposed results are found in the development and analysis of numerical methods for computing the solution to high-dimensional parametric or stochastic PDEs.

preprint, 2016, arXiv

Multivariate approximation in downward closed polynomial spaces

The task of approximating a function of d variables from its evaluations at a given number of points is ubiquitous in numerical analysis and engineering applications. When d is large, this task is challenged by the so-called curse of dimensionality. As a typical example, standard polynomial spaces, such as those of total degree type, are often ineffective to reach a prescribed accuracy unless a prohibitive number of evaluations is invested. In recent years it has been shown that, for certain relevant applications, there are substantial advantages in using certain sparse polynomial spaces having anisotropic features with respect to the different variables. These applications include in particular the numerical approximation of high-dimensional parametric and stochastic partial differential equations. We start by surveying several results in this direction, with an emphasis on the numerical algorithms that are available for the construction of the approximation, in particular through interpolation or discrete least-squares fitting. All such algorithms rely on the assumption that the set of multi-indices associated with the polynomial space is downward closed. In the present paper we introduce some tools for the study of approximation in multivariate spaces under this assumption, and use them in the derivation of error bounds, sometimes independent of the dimension d, and in the development of adaptive strategies.
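
The downward closedness assumption mentioned in this abstract can be made concrete with a small check. The sketch below uses the standard definition (a set of multi-indices is downward closed if, whenever it contains $ν$, it also contains every $μ$ with $μ\le ν$ componentwise), which is equivalent to requiring that lowering any positive entry of $ν$ by one stays in the set. The function name `is_downward_closed` is hypothetical and not taken from the paper.

```python
def is_downward_closed(index_set):
    """Return True if the multi-index set is downward closed (a lower set)."""
    indices = {tuple(nu) for nu in index_set}
    for nu in indices:
        for j, nu_j in enumerate(nu):
            if nu_j > 0:
                # Immediate predecessor of nu in coordinate j.
                predecessor = nu[:j] + (nu_j - 1,) + nu[j + 1:]
                if predecessor not in indices:
                    return False
    return True
```

For instance, the total degree set `[(0, 0), (1, 0), (0, 1)]` passes the check, while `[(0, 0), (2, 0)]` fails because `(1, 0)` is missing.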

preprint, 2016, arXiv

Optimal weighted least-squares methods

We consider the problem of reconstructing an unknown bounded function $u$ defined on a domain $X\subset \mathbb{R}^d$ from noiseless or noisy samples of $u$ at $n$ points $(x^i)_{i=1,\dots,n}$. We measure the reconstruction error in a norm $L^2(X,dρ)$ for some given probability measure $dρ$. Given a linear space $V_m$ with ${\rm dim}(V_m)=m\leq n$, we study in general terms the weighted least-squares approximations from the spaces $V_m$ based on independent random samples. The contribution of the present paper is twofold. From the theoretical perspective, we establish results in expectation and in probability for weighted least squares in general approximation spaces $V_m$. These results show that for an optimal choice of sampling measure $dμ$ and weight $w$, which depends on the space $V_m$ and on the measure $dρ$, stability and optimal accuracy are achieved under the mild condition that $n$ scales linearly with $m$ up to an additional logarithmic factor. The present analysis covers also cases where the function $u$ and its approximants from $V_m$ are unbounded, which might occur for instance in the relevant case where $X=\mathbb{R}^d$ and $dρ$ is the Gaussian measure. From the numerical perspective, we propose a sampling method which allows one to generate independent and identically distributed samples from the optimal measure $dμ$. This method becomes of interest in the multivariate setting where $dμ$ is generally not of tensor product type. We illustrate this for particular examples of approximation spaces $V_m$ of polynomial type, where the domain $X$ is allowed to be unbounded and high or even infinite dimensional, motivated by certain applications to parametric and stochastic PDEs.
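
A weighted least-squares fit from random samples, as discussed in this abstract, can be sketched in one dimension. The choices here are assumptions for illustration only and are not the optimal sampling measure constructed in the paper: $X=[-1,1]$, $dρ$ is the uniform probability measure, $V_m$ is spanned by Legendre polynomials of degree below $m$, samples are drawn from the Chebyshev (arcsine) measure $dμ$, and the weight is the density ratio $w=dρ/dμ$. The function names are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre

def orthonormal_legendre(x, m):
    """Columns L_0..L_{m-1}, orthonormal in L^2([-1, 1], dx/2)."""
    V = legendre.legvander(x, m - 1)            # plain Legendre P_k(x)
    return V * np.sqrt(2 * np.arange(m) + 1)    # rescale so each column has unit norm

def weighted_least_squares(u, m, n, rng):
    """Fit u in V_m from n random samples; returns coefficients in the L_k basis."""
    x = np.cos(np.pi * rng.random(n))           # i.i.d. samples from the Chebyshev measure
    w = 0.5 * np.pi * np.sqrt(1.0 - x**2)       # weight w = d rho / d mu
    A = orthonormal_legendre(x, m) * np.sqrt(w)[:, None]
    b = u(x) * np.sqrt(w)
    coeffs, *_ = np.linalg.lstsq(A, b, rcond=None)  # minimizes sum_i w_i (u(x_i) - v(x_i))^2
    return coeffs
```

With noiseless samples and a target that already lies in $V_m$, for example $u(x)=x^2$ with $m=3$, the fit reproduces the target up to rounding error; the interesting regime studied in the abstract is how large $n$ must be relative to $m$ for stability when $u$ is not in $V_m$.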

preprint, 2016, arXiv

Representations of Gaussian random fields and approximation of elliptic PDEs with lognormal coefficients

Approximation of elliptic PDEs with random diffusion coefficients typically requires a representation of the diffusion field in terms of a sequence $y=(y_j)_{j\geq 1}$ of scalar random variables. One may then apply high-dimensional approximation methods to the solution map $y\mapsto u(y)$. Although Karhunen-Loève representations are commonly used, it was recently shown, in the relevant case of lognormal diffusion fields, that they do not generally yield optimal approximation rates. Motivated by these results, we construct wavelet-type representations of stationary Gaussian random fields defined on bounded domains. The size and localization properties of these wavelets are studied, and used to obtain polynomial approximation results for the related elliptic PDE which outperform those achievable when using Karhunen-Loève representations. Our construction is based on a periodic extension of the random field, and the expansion on the domain is then obtained by simple restriction. This makes the approach easily applicable even when the computational domain of the PDE has a complicated geometry. In particular, we apply this construction to the class of Gaussian processes defined by the family of Matérn covariances.

preprint, 2016, arXiv

Sparse polynomial approximation of parametric elliptic PDEs. Part I: affine coefficients

We consider elliptic partial differential equations with diffusion coefficients that depend affinely on countably many parameters. We study the summability properties of polynomial expansions of the function mapping parameter values to solutions of the PDE, considering both Taylor and Legendre series. Our results considerably improve on previously known estimates of this type, in particular taking into account structural features of the affine parametrization of the coefficient. Moreover, the results carry over to more general Jacobi polynomial expansions. We demonstrate that the new bounds are sharp in certain model cases and we illustrate them by numerical experiments.

preprint, 2015, arXiv

Approximation of high-dimensional parametric PDEs

Parametrized families of PDEs arise in various contexts such as inverse problems, control and optimization, risk assessment, and uncertainty quantification. In most of these applications, the number of parameters is large or perhaps even infinite. Thus, the development of numerical methods for these parametric problems is faced with the possible curse of dimensionality. This article is directed at (i) identifying and understanding which properties of parametric equations allow one to avoid this curse and (ii) developing and analyzing effective numerical methods which fully exploit these properties and, in turn, are immune to the growth in dimensionality. The first part of this article studies the smoothness and approximability of the solution map, that is, the map $a\mapsto u(a)$ where $a$ is the parameter value and $u(a)$ is the corresponding solution to the PDE. It is shown that for many relevant parametric PDEs, the parametric smoothness of this map is typically holomorphic and also highly anisotropic in that the relevant parameters are of widely varying importance in describing the solution. These two properties are then exploited to establish convergence rates of $n$-term approximations to the solution map for which each term is separable in the parametric and physical variables. These results reveal that, at least on a theoretical level, the solution map can be well approximated by discretizations of moderate complexity, thereby showing how the curse of dimensionality is broken. This theoretical analysis is carried out through concepts of approximation theory such as best $n$-term approximation, sparsity, and $n$-widths. These notions determine a priori the best possible performance of numerical methods and thus serve as a benchmark for concrete algorithms. The second part of this article turns to the development of numerical algorithms based on the theoretically established sparse separable approximations. The numerical methods studied fall into two general categories. The first uses polynomial expansions in terms of the parameters to approximate the solution map. The second one searches for suitable low dimensional spaces for simultaneously approximating all members of the parametric family. The numerical implementation of these approaches is carried out through adaptive and greedy algorithms. An a priori analysis of the performance of these algorithms establishes how well they meet the theoretical benchmarks.

preprint, 2015, arXiv

Data Assimilation in Reduced Modeling

We consider the problem of optimal recovery of an element $u$ of a Hilbert space $\mathcal{H}$ from $m$ measurements obtained through known linear functionals on $\mathcal{H}$. Problems of this type are well studied [MRW] under an assumption that $u$ belongs to a prescribed model class, e.g. a known compact subset of $\mathcal{H}$. Motivated by reduced modeling for parametric partial differential equations, this paper considers another setting where the additional information about $u$ is in the form of how well $u$ can be approximated by a certain known subspace $V_n$ of $\mathcal{H}$ of dimension $n$, or more generally, how well $u$ can be approximated by each $k$-dimensional subspace $V_k$ of a sequence of nested subspaces $V_0\subset V_1\subset\cdots\subset V_n$. A recovery algorithm for the one-space formulation, proposed in [MPPY], is proven here to be optimal and to have a simple formulation, if certain favorable bases are chosen to represent $V_n$ and the measurements. The major contribution of the present paper is to analyze the multi-space case for which it is shown that the set of all $u$ satisfying the given information can be described as the intersection of a family of known ellipsoids in $\mathcal{H}$. It follows that a near optimal recovery algorithm in the multi-space problem is to identify any point in this intersection which can provide a much better accuracy than in the one-space problem. Two iterative algorithms based on alternating projections are proposed for recovery in the multi-space problem. A detailed analysis of one of them provides a posteriori performance estimates for the iterates, stopping criteria, and convergence rates. Since the limit of the algorithm is a point in the intersection of the aforementioned ellipsoids, it provides a near optimal recovery for $u$.

preprint, 2015, arXiv

Kolmogorov widths and low-rank approximations of parametric elliptic PDEs

Kolmogorov $n$-widths and low-rank approximations are studied for families of elliptic diffusion PDEs parametrized by the diffusion coefficients. The decay of the $n$-widths can be controlled by that of the error achieved by best $n$-term approximations using polynomials in the parametric variable. However, we prove that in certain relevant instances where the diffusion coefficients are piecewise constant over a partition of the physical domain, the $n$-widths exhibit significantly faster decay. This, in turn, yields a theoretical justification of the fast convergence of reduced basis or POD methods when treating such parametric PDEs. Our results are confirmed by numerical experiments, which also reveal the influence of the partition geometry on the decay of the $n$-widths.

preprint, 2015, arXiv

Kolmogorov widths under holomorphic mappings

If $L$ is a bounded linear operator mapping the Banach space $X$ into the Banach space $Y$ and $K$ is a compact set in $X$, then the Kolmogorov widths of the image $L(K)$ do not exceed those of $K$ multiplied by the norm of $L$. We extend this result from linear maps to holomorphic mappings $u$ from $X$ to $Y$ in the following sense: when the $n$-widths of $K$ are $O(n^{-r})$ for some $r>1$, then those of $u(K)$ are $O(n^{-s})$ for any $s<r-1$. We then use these results to prove various theorems about Kolmogorov widths of manifolds consisting of solutions to certain parametrized PDEs. Results of this type are important in the numerical analysis of reduced bases and other reduced modeling methods, since the best possible performance of such methods is governed by the rate of decay of the Kolmogorov widths of the solution manifold.

preprint, 2015, arXiv

Orthogonal Matching Pursuit under the Restricted Isometry Property

This paper is concerned with the performance of Orthogonal Matching Pursuit (OMP) algorithms applied to a dictionary $\mathcal{D}$ in a Hilbert space $\mathcal{H}$. Given an element $f\in \mathcal{H}$, OMP generates a sequence of approximations $f_n$, $n=1,2,\dots$, each of which is a linear combination of $n$ dictionary elements chosen by a greedy criterion. It is studied whether the approximations $f_n$ are in some sense comparable to best $n$-term approximation from the dictionary. One important result related to this question is a theorem of Zhang [TZ] in the context of sparse recovery of finite dimensional signals. This theorem shows that OMP exactly recovers an $n$-sparse signal, whenever the dictionary $\mathcal{D}$ satisfies a Restricted Isometry Property (RIP) of order $An$ for some constant $A$, and that the procedure is also stable in $\ell^2$ under measurement noise. The main contribution of the present paper is to give a structurally simpler proof of Zhang's theorem, formulated in the general context of $n$-term approximation from a dictionary in arbitrary Hilbert spaces $\mathcal{H}$. Namely, it is shown that OMP generates near best $n$-term approximations under a similar RIP condition.
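
The OMP iteration described in this abstract can be sketched for a finite dictionary. This is a minimal illustration under stated assumptions, not the paper's formulation: the Hilbert space is $\mathbb{R}^d$ with the Euclidean inner product, the dictionary is stored as the unit-norm columns of a matrix `D`, and the greedy criterion is the largest absolute inner product with the current residual. The function name `omp` and the toy data are hypothetical.

```python
import numpy as np

def omp(D, f, n_steps):
    """Run n_steps of Orthogonal Matching Pursuit on f with dictionary columns D."""
    f = np.asarray(f, dtype=float)
    residual = f.copy()
    selected = []
    coeffs = np.zeros(0)
    for _ in range(n_steps):
        # Greedy criterion: atom with the largest correlation with the residual.
        scores = np.abs(D.T @ residual)
        if selected:
            scores[selected] = -np.inf          # never select the same atom twice
        selected.append(int(np.argmax(scores)))
        # Orthogonal projection of f onto the span of the selected atoms.
        coeffs, *_ = np.linalg.lstsq(D[:, selected], f, rcond=None)
        residual = f - D[:, selected] @ coeffs
    return selected, coeffs, residual

# Toy example (assumption): an orthonormal dictionary and a 2-sparse target.
D = np.eye(4)
f = np.array([2.0, 0.0, -3.0, 0.0])
selected, coeffs, residual = omp(D, f, n_steps=2)
```

In this toy case the two nonzero entries are picked in order of magnitude and the residual vanishes after two steps; for a redundant dictionary, exact recovery depends on conditions such as the RIP discussed in the abstract.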

preprint, 2015, arXiv

Sparse polynomial approximation of parametric elliptic PDEs. Part II: lognormal coefficients

Elliptic partial differential equations with diffusion coefficients of lognormal form, that is $a=\exp(b)$, where $b$ is a Gaussian random field, are considered. We study the $\ell^p$ summability properties of the Hermite polynomial expansion of the solution in terms of the countably many scalar parameters appearing in a given representation of $b$. These summability results have direct consequences on the approximation rates of best $n$-term truncated Hermite expansions. Our results significantly improve on the state of the art estimates available for this problem. In particular, they take into account the support properties of the basis functions involved in the representation of $b$, in addition to the size of these functions. One interesting conclusion from our analysis is that in certain relevant cases, the Karhunen-Loève representation of $b$ may not be the best choice concerning the resulting sparsity and approximability of the Hermite expansion.

preprint, 2015, arXiv

Stochastic Optimal Control for Online Seller under Reputational Mechanisms

In this work we propose and analyze a model which addresses the pulsing behavior of sellers in an online auction (store). This pulsing behavior is observed when sellers switch between advertising and processing states. We assert that a seller switches her state in order to maximize her profit, and further that this switch can be identified through the seller's reputation. We show that for each seller there is an optimal reputation, i.e., the reputation at which the seller should switch her state in order to maximize her total profit. We design a stochastic behavioral model for an online seller, which incorporates the dynamics of resource allocation and reputation. The design of the model is optimized by using a stochastic advertising model from (16) and used effectively in the Stochastic Optimal Control of Advertising (12). This model of reputation is combined with the effect of online reputation on sales price empirically verified in (9). We derive the Hamilton-Jacobi-Bellman (HJB) differential equation, whose solution relates optimal wealth level to a seller's reputation. We formulate both a full model, as well as a reduced model with fewer parameters, both of which have the same qualitative description of the optimal seller behavior. Coincidentally, the reduced model has a closed form analytical solution that we construct.

preprint, 2014, arXiv

Automatic Detection of Performance Anomalies in Task-Parallel Programs

To efficiently exploit the resources of new many-core architectures, integrating dozens or even hundreds of cores per chip, parallel programming models have evolved to expose massive amounts of parallelism, often in the form of fine-grained tasks. Task-parallel languages, such as OpenStream, X10, Habanero Java and C or StarSs, simplify the development of applications for new architectures, but tuning task-parallel applications remains a major challenge. Performance bottlenecks can occur at any level of the implementation, from the algorithmic level (e.g., lack of parallelism or over-synchronization), to interactions with the operating and runtime systems (e.g., data placement on NUMA architectures), to inefficient use of the hardware (e.g., frequent cache misses or misaligned memory accesses); detecting such issues and determining the exact cause is a difficult task. In previous work, we developed Aftermath, an interactive tool for trace-based performance analysis and debugging of task-parallel programs and run-time systems. In contrast to other trace-based analysis tools, such as Paraver or Vampir, Aftermath offers native support for tasks, i.e., visualization, statistics and analysis tools adapted for performance debugging at task granularity. However, the tool currently does not provide support for the automatic detection of performance bottlenecks and it is up to the user to investigate the relevant aspects of program execution by focusing the inspection on specific slices of a trace file. In this paper, we present ongoing work on two extensions that guide the user through this process.

preprint2014arXiv

Classification algorithms using adaptive partitioning

Algorithms for binary classification based on adaptive tree partitioning are formulated and analyzed for both their risk performance and their friendliness to numerical implementation. The algorithms can be viewed as generating a set approximation to the Bayes set and thus fall into the general category of set estimators. In contrast with the most studied tree-based algorithms, which utilize piecewise constant approximation on the generated partition [IEEE Trans. Inform. Theory 52 (2006) 1335-1353; Mach. Learn. 66 (2007) 209-242], we consider decorated trees, which allow us to derive higher order methods. Convergence rates for these methods are derived in terms of the parameter $α$ of margin conditions and a rate $s$ of best approximation of the Bayes set by decorated adaptive partitions. They can also be expressed in terms of the Besov smoothness $β$ of the regression function that governs its approximability by piecewise polynomials on adaptive partitions. The execution of the algorithms does not require knowledge of the smoothness or margin conditions. Besov smoothness conditions are weaker than the commonly used Hölder conditions, which govern approximation by nonadaptive partitions, and therefore for a given regression function can result in a higher rate of convergence. This in turn mitigates the compatibility conflict between smoothness and margin parameters.

preprint2014arXiv

Sampling and reconstruction of solutions to the Helmholtz equation

We consider the inverse problem of reconstructing general solutions to the Helmholtz equation on some domain $Ω$ from their values at scattered points $x_1,\dots,x_n\in Ω$. This problem typically arises when sampling acoustic fields with $n$ microphones for the purpose of reconstructing this field over a region of interest $Ω$ contained in a larger domain $D$ in which the acoustic field propagates. In many applied settings, the shape of $D$ and the boundary conditions on its border are unknown. Our reconstruction method is based on the approximation of a general solution $u$ by linear combinations of Fourier-Bessel functions or plane waves. We analyze the convergence of the least-squares estimates to $u$ using these families of functions based on the samples $(u(x_i))_{i=1,\dots,n}$. Our analysis describes the amount of regularization needed to guarantee the convergence of the least-squares estimate towards $u$, in terms of a condition that depends on the dimension of the approximation subspace, the sample size $n$ and the distribution of the samples. It reveals the advantage of using non-uniform distributions that have more points on the boundary of $Ω$. Numerical illustrations show that our approach compares favorably with reconstruction methods using other basis functions, and other types of regularization.
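
As a hedged illustration of the least-squares step described in this abstract, one way to write the estimate for a plane-wave family is shown below. The wavenumber $k$, the unit directions $d_j$, the subspace dimension $m$ and the coefficient vector $c$ are notation introduced here for illustration, not taken from the abstract, and the regularization the abstract refers to is deliberately left out.

```latex
u_m(x) = \sum_{j=1}^{m} c_j \, e^{\mathrm{i} k \, d_j \cdot x}, \qquad |d_j| = 1,
\qquad
\hat{c} = \operatorname*{arg\,min}_{c \in \mathbb{C}^m}
  \sum_{i=1}^{n} \Bigl| u(x_i) - \sum_{j=1}^{m} c_j \, e^{\mathrm{i} k \, d_j \cdot x_i} \Bigr|^2 .
```

Each plane wave $e^{\mathrm{i} k \, d_j \cdot x}$ with $|d_j| = 1$ satisfies $Δu + k^2 u = 0$, so any such linear combination is itself a Helmholtz solution.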

preprint2013arXiv

PENCIL: Towards a Platform-Neutral Compute Intermediate Language for DSLs

We motivate the design and implementation of a platform-neutral compute intermediate language (PENCIL) for productive and performance-portable accelerator programming.

preprint2011arXiv

A general wavelet-based profile decomposition in the critical embedding of function spaces

We characterize the lack of compactness in the critical embedding of function spaces $X\subset Y$ having similar scaling properties in the following terms: a sequence $(u_n)_{n\geq 0}$ bounded in $X$ has a subsequence that can be expressed as a finite sum of translations and dilations of functions $(ϕ_l)_{l>0}$ such that the remainder converges to zero in $Y$ as the number of functions in the sum and $n$ tend to $+\infty$. Such a decomposition was established by Gérard for the embedding of the homogeneous Sobolev space $X=\dot H^s$ into $Y=L^p$ in $d$ dimensions with $0<s=d/2-d/p$, and then generalized by Jaffard to the case where $X$ is a Riesz potential space, using wavelet expansions. In this paper, we revisit the wavelet-based profile decomposition, in order to treat a larger range of examples of critical embedding in a hopefully simplified way. In particular we identify two generic properties on the spaces $X$ and $Y$ that are of key use in building the profile decomposition. These properties may then easily be checked for typical choices of $X$ and $Y$ satisfying critical embedding properties. These include Sobolev, Besov, Triebel-Lizorkin, Lorentz, Hölder and BMO spaces.

preprint2011arXiv

Adaptive and anisotropic piecewise polynomial approximation

We survey the main results of approximation theory for adaptive piecewise polynomial functions. In such methods, the partition on which the piecewise polynomial approximation is defined is not fixed in advance, but adapted to the given function f which is approximated. We focus our discussion on (i) the properties that describe an optimal partition for f, (ii) the smoothness properties of f that govern the rate of convergence of the approximation in the Lp-norms, and (iii) fast refinement algorithms that generate near optimal partitions. While these results constitute a fairly established theory in the univariate case and in the multivariate case when dealing with elements of isotropic shape, the approximation theory for adaptive and anisotropic elements is still building up. We put a particular emphasis on some recent results obtained in this direction.

preprint2011arXiv

Adaptive multiresolution analysis based on anisotropic triangulations

A simple greedy refinement procedure for the generation of data-adapted triangulations is proposed and studied. Given a function of two variables, the algorithm produces a hierarchy of triangulations and piecewise polynomial approximations on these triangulations. The refinement procedure consists in bisecting a triangle T in a direction which is chosen so as to minimize the local approximation error in some prescribed norm between the approximated function and its piecewise polynomial approximation after T is bisected. The hierarchical structure allows us to derive various approximation tools such as multiresolution analysis, wavelet bases, adaptive triangulations based either on greedy or optimal CART trees, as well as a simple encoding of the corresponding triangulations. We give a general proof of convergence in the Lp norm of all these approximations. Numerical tests performed in the case of piecewise linear approximation of functions with analytic expressions or of numerical images illustrate the fact that the refinement procedure generates triangles with an optimal aspect ratio (which is dictated by the local Hessian of the approximated function in the case of C2 functions).

preprint2011arXiv

Anisotropic smoothness classes : from finite element approximation to image models

We propose and study quantitative measures of smoothness which are adapted to anisotropic features such as edges in images or shocks in PDEs. These quantities govern the rate of approximation by adaptive finite elements when no constraint is imposed on the aspect ratio of the triangles. The simplest examples of such quantities are based on the determinant of the Hessian of the function to be approximated. Since they are not semi-norms, these quantities cannot be used to define linear function spaces. We show that they can be well defined by mollification when the function to be approximated has jump discontinuities along piecewise smooth curves. This motivates their use in image processing as an alternative to the frequently used total variation semi-norm, which does not account for the geometric smoothness of the edges.

preprint2011arXiv

Greedy bisection generates optimally adapted triangulations

We study the properties of a simple greedy algorithm for the generation of data-adapted anisotropic triangulations. Given a function f, the algorithm produces nested triangulations and corresponding piecewise polynomial approximations of f. The refinement procedure picks the triangle which maximizes the local Lp approximation error, and bisects it in a direction which is chosen so as to minimize this error at the next step. We study the approximation error in the Lp norm when the algorithm is applied to C2 functions with piecewise linear approximations. We prove that as the algorithm progresses, the triangles tend to adopt an optimal aspect ratio which is dictated by the local Hessian of f. For convex functions, we also prove that the adaptive triangulations satisfy a convergence bound which is known to be asymptotically optimal among all possible triangulations.
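
The refinement loop described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the local error is a crude sampled estimate of the sup-norm error of the linear interpolant (standing in for the exact local Lp error), and the names `interp_error`, `bisections`, `greedy_bisection` and `area` are hypothetical.

```python
import heapq
from itertools import count


def interp_error(f, tri):
    """Crude estimate of the sup-norm error of the linear interpolant of f
    on a triangle, sampled at the edge midpoints and the centroid.
    (Assumption: the abstract uses the exact local Lp error instead.)"""
    a, b, c = tri
    fa, fb, fc = f(*a), f(*b), f(*c)
    # Barycentric weights of the four sample points.
    samples = [(0.5, 0.5, 0.0), (0.0, 0.5, 0.5), (0.5, 0.0, 0.5),
               (1 / 3, 1 / 3, 1 / 3)]
    err = 0.0
    for wa, wb, wc in samples:
        x = wa * a[0] + wb * b[0] + wc * c[0]
        y = wa * a[1] + wb * b[1] + wc * c[1]
        linear = wa * fa + wb * fb + wc * fc
        err = max(err, abs(f(x, y) - linear))
    return err


def bisections(tri):
    """Yield the three ways to bisect a triangle: join the midpoint of one
    edge to the opposite vertex."""
    a, b, c = tri
    for p, q, r in ((a, b, c), (b, c, a), (c, a, b)):
        m = ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)  # midpoint of edge pq
        yield (p, m, r), (m, q, r)


def greedy_bisection(f, triangles, steps):
    """Repeatedly split the triangle with the largest estimated error,
    choosing the bisection whose children have the smallest error."""
    tie = count()  # tie-breaker so the heap never compares triangles
    heap = [(-interp_error(f, t), next(tie), t) for t in triangles]
    heapq.heapify(heap)
    for _ in range(steps):
        _, _, worst = heapq.heappop(heap)
        children = min(
            bisections(worst),
            key=lambda pair: max(interp_error(f, t) for t in pair),
        )
        for t in children:
            heapq.heappush(heap, (-interp_error(f, t), next(tie), t))
    return [t for _, _, t in heap]


def area(tri):
    """Area of a triangle given as three (x, y) vertices."""
    (ax, ay), (bx, by), (cx, cy) = tri
    return abs((bx - ax) * (cy - ay) - (cx - ax) * (by - ay)) / 2


# Usage: refine the unit square for an anisotropic quadratic function.
f = lambda x, y: x * x + 25 * y * y
start = [((0, 0), (1, 0), (1, 1)), ((0, 0), (1, 1), (0, 1))]
mesh = greedy_bisection(f, start, 50)
print(len(mesh), sum(area(t) for t in mesh))
```

Each step replaces one triangle by two, so the triangulations are nested and may be non-conforming (hanging nodes are allowed in this sketch). Swapping `interp_error` for a quadrature-based Lp error would bring the sketch closer to the setting of the abstract.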

preprint2011arXiv

The Potential of Synergistic Static, Dynamic and Speculative Loop Nest Optimizations for Automatic Parallelization

Research in automatic parallelization of loop-centric programs started with static analysis, then broadened its arsenal to include dynamic inspection-execution and speculative execution, the best results involving hybrid static-dynamic schemes. Beyond the detection of parallelism in a sequential program, scalable parallelization on many-core processors involves hard and interesting parallelism adaptation and mapping challenges. These challenges include tailoring data locality to the memory hierarchy, structuring independent tasks hierarchically to exploit multiple levels of parallelism, tuning the synchronization grain, balancing the execution load, decoupling the execution into thread-level pipelines, and leveraging heterogeneous hardware with specialized accelerators. The polyhedral framework allows one to model, construct and apply very complex loop nest transformations addressing most of the parallelism adaptation and mapping challenges. But apart from hardware-specific, back-end oriented transformations (if-conversion, trace scheduling, value prediction), loop nest optimization has essentially ignored dynamic and speculative techniques. Research in polyhedral compilation recently reached a significant milestone towards the support of dynamic, data-dependent control flow. This opens a large avenue for blending dynamic analyses and speculative techniques with advanced loop nest optimizations. Selecting real-world examples from SPEC benchmarks and numerical kernels, we make a case for the design of synergistic static, dynamic and speculative loop transformation techniques. We also sketch the embedding of dynamic information, including speculative assumptions, in the heart of affine transformation search spaces.

31 published item(s)

Composable and Modular Code Generation in MLIR: A Structured and Retargetable Approach to Tensor Compiler Construction

Secure Optimization Through Opaque Observations

MLIR: A Compiler Infrastructure for the End of Moore's Law

Nonlinear Methods for Model Reduction

Optimal reduced model algorithms for data-based state estimation

Optimal Stable Nonlinear Approximation

Reduced Basis Greedy Selection Using Random Training Sets

State Estimation -- The Role of Reduced Models

Diffusion Coefficients Estimation for Elliptic Partial Differential Equations

Discrete least-squares approximations over optimized downward closed polynomial spaces in arbitrary dimension

Multivariate approximation in downward closed polynomial spaces

Optimal weighted least-squares methods

Representations of Gaussian random fields and approximation of elliptic PDEs with lognormal coefficients

Sparse polynomial approximation of parametric elliptic PDEs. Part I: affine coefficients

Approximation of high-dimensional parametric PDEs

Data Assimilation in Reduced Modeling

Kolmogorov widths and low-rank approximations of parametric elliptic PDEs

Kolmogorov widths under holomorphic mappings

Orthogonal Matching Pursuit under the Restricted Isometry Property

Sparse polynomial approximation of parametric elliptic PDEs. Part II: lognormal coefficients

Stochastic Optimal Control for Online Seller under Reputational Mechanisms

Automatic Detection of Performance Anomalies in Task-Parallel Programs

Classification algorithms using adaptive partitioning

Sampling and reconstruction of solutions to the Helmholtz equation

PENCIL: Towards a Platform-Neutral Compute Intermediate Language for DSLs

A general wavelet-based profile decomposition in the critical embedding of function spaces

Adaptive and anisotropic piecewise polynomial approximation

Adaptive multiresolution analysis based on anisotropic triangulations

Anisotropic smoothness classes : from finite element approximation to image models

Greedy bisection generates optimally adapted triangulations

The Potential of Synergistic Static, Dynamic and Speculative Loop Nest Optimizations for Automatic Parallelization