Source author record

Philipp Petersen

Philipp Petersen appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

math.FA Machine Learning math.NA Artificial Intelligence Numerical Analysis math.AP math.GN math.HO math.ST

Catalog footprint

What is connected

16 works

9 topics

4 close collaborators

Preprint, 2026, arXiv

Adaptivity Under Realizability Constraints: Comparing In-Context and Agentic Learning

We compare in-context learning with fixed queries and agentic learning with adaptive queries for uniform approximation of task families. We consider two settings: an unrestricted regime, where querying and approximation are arbitrary functions, and a realizable regime, where we require these operations to be implemented by ReLU neural networks. In both settings, adaptivity never hinders approximation performance. However, this advantage can change when one passes from the unrestricted regime to the realizable regime. We identify four distinct approximation scenarios, each witnessed by an explicit task family: (a) no advantage of adaptivity; (b) an advantage in the unrestricted regime that persists under ReLU realizability; (c) an advantage that arises only under realizability; and (d) an advantage that disappears under realizability. This demonstrates that representational constraints interact profoundly with the effect of adaptivity.

Preprint, 2026, arXiv

FactoryBench: Evaluating Industrial Machine Understanding

We introduce FactoryBench, a benchmark for evaluating time-series models and LLMs on machine understanding over industrial robotic telemetry. Q&A pairs are organized along four causal levels (state, intervention, counterfactual, decision) instantiating Pearl's ladder of causation, and span five answer formats: four structured formats are scored deterministically and free-form answers are scored by an LLM-as-judge voting protocol. We propose a scalable Q&A generation framework built around structured question templates, present FactoryWave (a dense, multitask, multivariate sensor dataset collected from a UR3 cobot and a KUKA KR10 industrial arm), and construct FactoryBench as a large-scale benchmark of over 70k Q&A items grounded in roughly 15k normalized episodes from FactoryWave, AURSAD, and voraus-AD. Zero-shot evaluation of six frontier LLMs shows that no model exceeds 50% on structured levels or 18% on decision-making, revealing a wide gap between current models and operational machine understanding.

Preprint, 2026, arXiv

FactoryNet: A Large-Scale Dataset toward Industrial Time-Series Foundation Models

We introduce the first universal pretraining corpus for industrial time-series data: FactoryNet. It comprises 51M datapoints across 23k end-to-end task executions (13.3k real, 9.8k synthetic) on six embodiments, unified by a shared schema that enables robust zero-shot cross-embodiment transfer and highly parameter-efficient anomaly detection. We introduce a novel schema, Setpoint, Effort, Feedback, Context (S-E-F-C), that underlies the whole pipeline and maps any actuated system into a common representational frame. The corpus spans 27 annotated anomaly types alongside healthy baselines and counterfactual pairs across robotic manipulation and machining domains. Cross-embodiment transfer experiments yield positive results: under bias-aware metrics our model demonstrates fair cross-embodiment transfer capabilities on the evaluated source-target pair, while 24 schema-aligned signals achieve competitive anomaly detection performance compared to high-dimensional baselines. We release FactoryNet as a growing, multi-embodiment dataset to drive progress toward industrial foundation models.

Preprint, 2026, arXiv

HEPA: A Self-Supervised Horizon-Conditioned Event Predictive Architecture for Time Series

Critical events in multivariate time series, from turbine failures to cardiac arrhythmias, demand accurate prediction, yet labeled data is scarce because such events are rare and costly to annotate. We introduce HEPA (Horizon-conditioned Event Predictive Architecture), built on two key principles. First, a causal Transformer encoder is pretrained via a Joint-Embedding Predictive Architecture (JEPA): a horizon-conditioned predictor learns to forecast future representations rather than future values, forcing the encoder to capture predictable temporal dynamics from unlabeled data alone. Second, we freeze the encoder and finetune only the predictor toward the target event, producing a monotonic survival cumulative distribution function (CDF) over horizons. With fixed architecture and optimiser hyperparameters across all benchmarks, HEPA handles water contamination, cyberattack detection, volatility regimes, and eight further event types across 11 domains, exceeding leading time-series architectures including PatchTST, iTransformer, MAE, and Chronos-2 on at least 10 of 14 benchmarks, with an order of magnitude fewer tuned parameters and, on lifecycle datasets, an order of magnitude less labeled data.
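One generic way to obtain a cumulative event probability that is monotone over horizons is to accumulate per-horizon hazard probabilities. The Python sketch below illustrates that construction only; it is an assumption made for illustration, not HEPA's actual prediction head, and the function name `survival_cdf` and the sample hazards are hypothetical.

```python
from typing import List


def survival_cdf(hazards: List[float]) -> List[float]:
    """Turn per-horizon hazard probabilities into a cumulative event probability.

    hazards[k] is the (hypothetical) probability that the event occurs at
    horizon k given that it has not occurred earlier. The returned list is
    F(k) = 1 - prod_{j <= k} (1 - hazards[j]), which is non-decreasing in k.
    """
    cdf = []
    survival = 1.0  # probability that no event has happened so far
    for h in hazards:
        if not 0.0 <= h <= 1.0:
            raise ValueError("each hazard must lie in [0, 1]")
        survival *= 1.0 - h
        cdf.append(1.0 - survival)
    return cdf


if __name__ == "__main__":
    # Hypothetical hazards for four horizons.
    print(survival_cdf([0.1, 0.2, 0.0, 0.5]))
```

Because each factor `1 - h` lies in [0, 1], the running survival probability can only shrink, so the output is monotone by construction regardless of the hazard values.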

Preprint, 2026, arXiv

Mathematical theory of deep learning

This book provides an introduction to the mathematical analysis of deep learning. It covers fundamental results in approximation theory, optimization theory, and statistical learning theory, which are the three main pillars of deep neural network theory. Serving as a guide for students and researchers in mathematics and related fields, the book aims to equip readers with foundational knowledge on the topic. It prioritizes simplicity over generality, and presents rigorous yet accessible results to help build an understanding of the essential mathematical concepts underpinning deep learning.

Preprint, 2022, arXiv

Neural network approximation and estimation of classifiers with classification boundary in a Barron class

We prove bounds for the approximation and estimation of certain binary classification functions using ReLU neural networks. Our estimation bounds provide a priori performance guarantees for empirical risk minimization using networks of a suitable size, depending on the number of training samples available. The obtained approximation and estimation rates are independent of the dimension of the input, showing that the curse of dimensionality can be overcome in this setting; in fact, the input dimension only enters in the form of a polynomial factor. Regarding the regularity of the target classification function, we assume the interfaces between the different classes to be locally of Barron-type. We complement our results by studying the relations between various Barron-type spaces that have been proposed in the literature. These spaces differ substantially more from each other than the current literature suggests.

Preprint, 2021, arXiv

Equivalence of approximation by convolutional neural networks and fully-connected networks

Convolutional neural networks are the most widely used type of neural networks in applications. In mathematical analysis, however, mostly fully-connected networks are studied. In this paper, we establish a connection between both network architectures. Using this connection, we show that all upper and lower bounds concerning approximation rates of fully-connected neural networks for functions $f \in \mathcal{C}$ -- for an arbitrary function class $\mathcal{C}$ -- translate to essentially the same bounds concerning approximation rates of convolutional neural networks for functions $f \in \mathcal{C}^{equi}$, with the class $\mathcal{C}^{equi}$ consisting of all translation equivariant functions whose first coordinate belongs to $\mathcal{C}$. All presented results consider exclusively the case of convolutional neural networks without any pooling operation and with circular convolutions, i.e., not based on zero-padding.
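The role of circular convolutions in translation equivariance can be checked numerically. The Python sketch below is a minimal one-dimensional illustration, not code from the paper: it verifies that a single circular convolution followed by ReLU commutes with cyclic shifts. The helper names (`circular_conv`, `shift`, `layer`) and the sample signal and kernel are hypothetical.

```python
from typing import List


def circular_conv(x: List[float], kernel: List[float]) -> List[float]:
    """Circular (periodic) convolution: indices wrap around modulo len(x)."""
    n = len(x)
    return [
        sum(kernel[j] * x[(i - j) % n] for j in range(len(kernel)))
        for i in range(n)
    ]


def relu(x: List[float]) -> List[float]:
    """Apply ReLU entrywise."""
    return [max(0.0, v) for v in x]


def shift(x: List[float], s: int) -> List[float]:
    """Cyclic shift by s positions: result[i] = x[(i - s) % n]."""
    n = len(x)
    return [x[(i - s) % n] for i in range(n)]


def layer(x: List[float], kernel: List[float]) -> List[float]:
    """One convolutional layer without pooling: circular convolution, then ReLU."""
    return relu(circular_conv(x, kernel))


if __name__ == "__main__":
    x = [1.0, -2.0, 3.0, 0.0, 4.0, -1.0]
    k = [0.5, -1.0, 2.0]
    # Shifting the input and then applying the layer gives the same result
    # as applying the layer and then shifting the output.
    print(layer(shift(x, 2), k) == shift(layer(x, k), 2))
```

With zero-padding instead of wrap-around indexing, entries near the boundary would generally break this identity, which is consistent with the abstract restricting itself to circular convolutions.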

Preprint, 2020, arXiv

A Theoretical Analysis of Deep Neural Networks and Parametric PDEs

We derive upper bounds on the complexity of ReLU neural networks approximating the solution maps of parametric partial differential equations. In particular, without any knowledge of its concrete shape, we use the inherent low-dimensionality of the solution manifold to obtain approximation rates which are significantly superior to those provided by classical neural network approximation results. Concretely, we use the existence of a small reduced basis to construct, for a large variety of parametric partial differential equations, neural networks that yield approximations of the parametric solution maps in such a way that the sizes of these networks essentially only depend on the size of the reduced basis.

Preprint, 2020, arXiv

Efficient Approximation of Solutions of Parametric Linear Transport Equations by ReLU DNNs

We demonstrate that deep neural networks with the ReLU activation function can efficiently approximate the solutions of various types of parametric linear transport equations. For non-smooth initial conditions, the solutions of these PDEs are high-dimensional and non-smooth. Therefore, approximation of these functions suffers from a curse of dimension. We demonstrate that through their inherent compositionality deep neural networks can resolve the characteristic flow underlying the transport equations and thereby allow approximation rates independent of the parameter dimension.
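For intuition about compositionality, consider the simplest special case, assumed here for illustration and much narrower than the paper's setting: the constant-velocity equation $\partial_t u + a\,\partial_x u = 0$, whose solution is the initial condition composed with the characteristic flow, $u(t, x; a) = u_0(x - a t)$. The Python sketch below writes a non-smooth hat-shaped $u_0$ with three ReLU units and composes it with that flow; all names are hypothetical.

```python
def relu(z: float) -> float:
    return max(0.0, z)


def hat(x: float) -> float:
    """Non-smooth initial condition built from three ReLU units.

    Equals 0 outside [0, 1], rises linearly to 0.5 at x = 0.5, then falls.
    """
    return relu(x) - 2.0 * relu(x - 0.5) + relu(x - 1.0)


def transport_solution(t: float, x: float, a: float) -> float:
    """Solution of u_t + a * u_x = 0 with u(0, x) = hat(x).

    The velocity a plays the role of the parameter; the solution is the
    initial condition evaluated along the characteristic x - a * t.
    """
    return hat(x - a * t)


if __name__ == "__main__":
    # The peak of the hat travels with velocity a.
    for a in (0.5, 1.0, 2.0):
        print(a, transport_solution(1.0, 0.5 + a, a))
```

The parameter enters only through the inner affine map, so the outer ReLU expression is reused unchanged for every velocity.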

Preprint, 2020, arXiv

Numerical Solution of the Parametric Diffusion Equation by Deep Neural Networks

We perform a comprehensive numerical study of the effect of approximation-theoretical results for neural networks on practical learning problems in the context of numerical analysis. As the underlying model, we study the machine-learning-based solution of parametric partial differential equations. Here, approximation theory predicts that the performance of the model should depend only very mildly on the dimension of the parameter space and is determined by the intrinsic dimension of the solution manifold of the parametric partial differential equation. We use various methods to establish comparability between test-cases by minimizing the effect of the choice of test-cases on the optimization and sampling aspects of the learning problem. We find strong support for the hypothesis that approximation-theoretical effects heavily influence the practical behavior of learning problems in numerical analysis.

Preprint, 2020, arXiv

Topological properties of the set of functions generated by neural networks of fixed size

We analyze the topological properties of the set of functions that can be implemented by neural networks of a fixed size. Surprisingly, this set has many undesirable properties. It is highly non-convex, except possibly for a few exotic activation functions. Moreover, the set is not closed with respect to $L^p$-norms, $0 < p < \infty$, for all practically-used activation functions, and also not closed with respect to the $L^\infty$-norm for all practically-used activation functions except for the ReLU and the parametric ReLU. Finally, the function that maps a family of weights to the function computed by the associated network is not inverse stable for every practically used activation function. In other words, if $f_1, f_2$ are two functions realized by neural networks and if $f_1, f_2$ are close in the sense that $\|f_1 - f_2\|_{L^\infty} \leq \varepsilon$ for $\varepsilon > 0$, it is, regardless of the size of $\varepsilon$, usually not possible to find weights $w_1, w_2$ close together such that each $f_i$ is realized by a neural network with weights $w_i$. Overall, our findings identify potential causes for issues in the training procedure of deep learning such as no guaranteed convergence, explosion of parameters, and slow convergence.
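The link between limits of fixed-size networks and exploding parameters can be illustrated with a standard difference-quotient example. The Python sketch below is an illustration under assumptions, not the paper's own argument: the two-neuron networks $x \mapsto n\,\sigma(x + 1/n) - n\,\sigma(x)$ with the logistic sigmoid $\sigma$ approach $\sigma'$ on a grid while their outer weights grow like $n$. The function names and the grid are hypothetical choices.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def two_neuron_net(x: float, n: int) -> float:
    """Network with two sigmoid neurons and outer weights +n and -n."""
    return n * sigmoid(x + 1.0 / n) - n * sigmoid(x)


def sigmoid_prime(x: float) -> float:
    s = sigmoid(x)
    return s * (1.0 - s)


def sup_error(n: int) -> float:
    """Largest deviation from the derivative of sigmoid on a grid over [-5, 5]."""
    grid = [-5.0 + 0.01 * k for k in range(1001)]
    return max(abs(two_neuron_net(x, n) - sigmoid_prime(x)) for x in grid)


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        # The error shrinks while the outer weights (+n, -n) grow.
        print(n, sup_error(n))
```

The architecture stays fixed at two neurons throughout; only the weights change, and improving the approximation requires making them larger.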

Preprint, 2016, arXiv

Regularization and Numerical Solution of the Inverse Scattering Problem using Shearlet Frames

Regularization techniques for the numerical solution of inverse scattering problems in two space dimensions are discussed. Assuming that the boundary of a scatterer is its most prominent feature, we use the class of cartoon-like functions as a model. Since functions in this class are asymptotically optimally sparsely approximated by shearlet frames, we consider shearlets as a means for regularization in a Tikhonov method. We analyze two approaches, namely solvers for the nonlinear problem and for the linearized problem obtained by the Born approximation technique. As an example for the first class we study the acoustic inverse scattering problem, and for the second class, the inverse scattering problem of the Schrödinger equation. In both cases, we derive analytical results for our approaches. Whereas our emphasis for the linearized problem is more on the theoretical side because the associated solvers are standard, we provide numerical examples for the nonlinear problem that highlight the effectiveness of our algorithmic approach.

Preprint, 2016, arXiv

Shearlet approximation of functions with discontinuous derivatives

We demonstrate that, for functions whose first or higher-order derivatives are smooth away from smooth discontinuity curves, shearlet systems yield superior $N$-term approximation rates compared with wavelet systems. We also provide an improved estimate for the decay of shearlet coefficients that intersect a discontinuity curve non-tangentially.

Preprint, 2015, arXiv

Classification of Edges Using Compactly Supported Shearlets

We analyze the detection and classification of singularities of functions $f = \chi_B$, where $B \subset \mathbb{R}^d$ and $d = 2,3$. It will be shown how the set $\partial B$ can be extracted by a continuous shearlet transform associated with compactly supported shearlets. Furthermore, if $\partial S$ is a $(d-1)$-dimensional piecewise smooth manifold with $d=2$ or $3$, we will classify smooth and non-smooth components of $\partial S$. This improves previous results given for shearlet systems with a certain band-limited generator, since the estimates we derive are uniform. Moreover, we will show that our bounds are optimal. Along the way, we also obtain novel results on the characterization of wavefront sets in $3$ dimensions by compactly supported shearlets. Finally, geometric properties of $\partial S$ such as curvature are described in terms of the continuous shearlet transform of $f$.

Preprint, 2015, arXiv

Detection of Geometric Structures by an Optimally Subsampled Shearlet System in Noisy Digital Images

We provide a statistical analysis of the ability of digitized continuous shearlet systems to detect objects embedded in white noise. We analyze the possibility to subsample the shearlet transform and obtain a subset of significantly reduced cardinality that can still yield statistically optimal detection results.

Preprint, 2015, arXiv

Linear independence of compactly supported separable shearlet systems

This paper examines linear independence of shearlet systems. This property has already been studied for wavelets and for other systems such as Gabor systems. In fact, for Gabor systems this problem is commonly known as the HRT conjecture. In this paper we present a proof of linear independence of compactly supported separable shearlet systems. For this, we employ a sampling strategy to utilize the structure of an implicitly given underlying oversampled wavelet system as well as the shape of the supports of the shearlet elements.

Institution

Affiliation not imported yet

This author record came from a source that does not expose affiliation metadata. Once the author claims the profile or we enrich the record from another provider, this section will link to the concrete institution.

Topic footprint

Fields this researcher appears in

math.FA Machine Learning math.NA Artificial Intelligence Numerical Analysis math.AP math.GN math.HO math.ST

Source provenance

Where this author record came from

arXiv, confidence 95%

external id: arxiv:2605.04995:author:3:philipp-petersen

Imported May 20, 2026. Synced May 21, 2026.

arXiv, confidence 95%

external id: arxiv:2605.09081:author:8:philipp-petersen

Imported May 20, 2026. Synced May 21, 2026.

arXiv, confidence 95%

external id: arxiv:2605.07675:author:11:philipp-petersen

Imported May 20, 2026. Synced May 21, 2026.

arXiv, confidence 95%

external id: arxiv:2605.11130:author:6:philipp-petersen

Imported May 20, 2026. Synced May 21, 2026.

4 works

Gitta Kutyniok

Researcher

Gitta Kutyniok contributes to research discovery and scholarly infrastructure.

Open to collaborate

3 works

Camilla Mazzoleni

Researcher

Camilla Mazzoleni contributes to research discovery and scholarly infrastructure.

Open to collaborate

3 works

Federico Martelli

Researcher

Federico Martelli contributes to research discovery and scholarly infrastructure.

Open to collaborate

3 works

Felix Voigtlaender

Researcher

Felix Voigtlaender contributes to research discovery and scholarly infrastructure.

Open to collaborate
