Source author record

Mones Raslan

Mones Raslan appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

math.FA Machine Learning math.NA Numerical Analysis math.GN

Catalog footprint

What is connected

5 works

5 topics

4 close collaborators

Preprint, 2020, arXiv

A Theoretical Analysis of Deep Neural Networks and Parametric PDEs

We derive upper bounds on the complexity of ReLU neural networks approximating the solution maps of parametric partial differential equations. In particular, without any knowledge of its concrete shape, we use the inherent low-dimensionality of the solution manifold to obtain approximation rates which are significantly superior to those provided by classical neural network approximation results. Concretely, we use the existence of a small reduced basis to construct, for a large variety of parametric partial differential equations, neural networks that yield approximations of the parametric solution maps in such a way that the sizes of these networks essentially only depend on the size of the reduced basis.

Preprint, 2020, arXiv

Approximation Rates for Neural Networks with Encodable Weights in Smoothness Spaces

We examine the necessary and sufficient complexity of neural networks to approximate functions from different smoothness spaces under the restriction of encodable network weights. Based on an entropy argument, we start by proving lower bounds for the number of nonzero encodable weights for neural network approximation in Besov spaces, Sobolev spaces and more. These results are valid for all sufficiently smooth activation functions. Afterwards, we provide a unifying framework for the construction of approximate partitions of unity by neural networks with fairly general activation functions. This allows us to approximate localized Taylor polynomials by neural networks and make use of the Bramble-Hilbert Lemma. Based on our framework, we derive almost optimal upper bounds in higher-order Sobolev norms. This work advances the theory of approximating solutions of partial differential equations by neural networks.

Preprint, 2020, arXiv

Expressivity of Deep Neural Networks

In this review paper, we give a comprehensive overview of the large variety of approximation results for neural networks. Approximation rates for classical function spaces as well as benefits of deep neural networks over shallow ones for specifically structured function classes are discussed. While the main body of existing results is for general feedforward architectures, we also depict approximation results for convolutional, residual and recurrent neural networks.

Preprint, 2020, arXiv

Numerical Solution of the Parametric Diffusion Equation by Deep Neural Networks

We perform a comprehensive numerical study of the effect of approximation-theoretical results for neural networks on practical learning problems in the context of numerical analysis. As the underlying model, we study the machine-learning-based solution of parametric partial differential equations. Here, approximation theory predicts that the performance of the model should depend only very mildly on the dimension of the parameter space and is determined by the intrinsic dimension of the solution manifold of the parametric partial differential equation. We use various methods to establish comparability between test-cases by minimizing the effect of the choice of test-cases on the optimization and sampling aspects of the learning problem. We find strong support for the hypothesis that approximation-theoretical effects heavily influence the practical behavior of learning problems in numerical analysis.

Preprint, 2020, arXiv

Topological properties of the set of functions generated by neural networks of fixed size

We analyze the topological properties of the set of functions that can be implemented by neural networks of a fixed size. Surprisingly, this set has many undesirable properties. It is highly non-convex, except possibly for a few exotic activation functions. Moreover, the set is not closed with respect to $L^p$-norms, $0 < p < \infty$, for all practically-used activation functions, and also not closed with respect to the $L^\infty$-norm for all practically-used activation functions except for the ReLU and the parametric ReLU. Finally, the function that maps a family of weights to the function computed by the associated network is not inverse stable for every practically used activation function. In other words, if $f_1, f_2$ are two functions realized by neural networks and if $f_1, f_2$ are close in the sense that $\|f_1 - f_2\|_{L^\infty} \leq \varepsilon$ for $\varepsilon > 0$, it is, regardless of the size of $\varepsilon$, usually not possible to find weights $w_1, w_2$ close together such that each $f_i$ is realized by a neural network with weights $w_i$. Overall, our findings identify potential causes for issues in the training procedure of deep learning such as no guaranteed convergence, explosion of parameters, and slow convergence.
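The non-closedness and parameter explosion described in this abstract can be illustrated with a small numerical sketch. This is an illustrative example chosen here, not code from the paper: the two-neuron ReLU networks f_n(x) = n (relu(x + 1/n) - relu(x)) converge in L^2([-1, 1]) to the Heaviside step function, which is discontinuous and therefore not realized by any ReLU network, while the largest weight n grows without bound. The function names, the interval [-1, 1] and the grid size are assumptions made for the illustration. For this family the L^2 error equals sqrt(1 / (3n)).

```python
import numpy as np


def relu(x):
    """Rectified linear unit."""
    return np.maximum(x, 0.0)


def f_n(x, n):
    """Two-neuron ReLU network n * (relu(x + 1/n) - relu(x)).

    Hypothetical illustration: hidden biases 1/n and 0, outer weights n and -n.
    """
    return n * (relu(x + 1.0 / n) - relu(x))


def heaviside(x):
    """Limit function: 0 for x < 0, 1 for x >= 0 (not a ReLU network)."""
    return (x >= 0).astype(float)


def l2_error(n, num_points=400_001):
    """Riemann-sum estimate of ||f_n - heaviside|| in L^2([-1, 1])."""
    x = np.linspace(-1.0, 1.0, num_points)
    dx = x[1] - x[0]
    diff_sq = (f_n(x, n) - heaviside(x)) ** 2
    return float(np.sqrt(np.sum(diff_sq) * dx))


if __name__ == "__main__":
    for n in (1, 10, 100):
        # The error shrinks like sqrt(1 / (3 n)) while the largest weight is n.
        print(f"n = {n:4d}  largest weight = {n:4d}  L2 error = {l2_error(n):.4f}")
```

The sketch only shows the mechanism for one hand-picked sequence: the approximation error tends to zero, yet the limit lies outside the set of network functions and the weights needed to approach it diverge.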
