Researcher profile

Philipp Petersen

Philipp Petersen contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
11works
0followers
9topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

11 published item(s)

preprint2026arXiv

Adaptivity Under Realizability Constraints: Comparing In-Context and Agentic Learning

We compare in-context learning with fixed queries and agentic learning with adaptive queries for uniform approximation of task families. We consider two settings: an unrestricted regime, where querying and approximation are arbitrary functions, and a realizable regime, where we require these operations to be implemented by ReLU neural networks. In both settings, adaptivity never hinders approximation performance. However, this advantage can change when one passes from the unrestricted regime to the realizable regime. We identify four distinct approximation scenarios, each witnessed by an explicit task family: (a) no advantage of adaptivity; (b) an advantage in the unrestricted regime that persists under ReLU realizability; (c) an advantage that arises only under realizability; and (d) an advantage that disappears under realizability. This demonstrates that representational constraints interact profoundly with the effect of adaptivity.

preprint2026arXiv

FactoryBench: Evaluating Industrial Machine Understanding

We introduce FactoryBench, a benchmark for evaluating time-series models and LLMs on machine understanding over industrial robotic telemetry. Q&A pairs are organized along four causal levels (state, intervention, counterfactual, decision) instantiating Pearl's ladder of causation, and span five answer formats: four structured formats are scored deterministically and free-form answers are scored by an LLM-as-judge voting protocol. We propose a scalable Q&A generation framework built around structured question templates, present FactoryWave (a dense, multitask, multivariate sensor dataset collected from a UR3 cobot and a KUKA KR10 industrial arm), and construct FactoryBench as a large-scale benchmark of over 70k Q&A items grounded in roughly 15k normalized episodes from FactoryWave, AURSAD, and voraus-AD. Zero-shot evaluation of six frontier LLMs shows that no model exceeds 50% on structured levels or 18% on decision-making, revealing a wide gap between current models and operational machine understanding.

preprint2026arXiv

FactoryNet: A Large-Scale Dataset toward Industrial Time-Series Foundation Models

We introduce the first universal pretraining corpus for industrial time-series data: FactoryNet. 51M datapoints across 23k end-to-end task executions (13.3k real, 9.8k synthetic) on six embodiments, unified by a shared schema that enables robust zero-shot cross-embodiment transfer and highly parameter-efficient anomaly detection. We introduce a novel schema: Setpoint, Effort, Feedback, Context (S-E-F-C) underlying the whole pipeline that maps any actuated system into a common representational frame. The corpus spans 27 annotated anomaly types alongside healthy baselines and counterfactual pairs across robotic manipulation and machining domains. Cross-embodiment transfer experiments yield positive results: under bias-aware metrics our model demonstrates fair cross-embodiment transfer capabilities on the evaluated source-target pair, while 24 schema-aligned signals achieves competitive anomaly detection performance compared to high-dimensional baselines. We release FactoryNet as a growing, multi-embodiment dataset to drive progress toward industrial foundation models.

preprint2026arXiv

HEPA: A Self-Supervised Horizon-Conditioned Event Predictive Architecture for Time Series

Critical events in multivariate time series, from turbine failures to cardiac arrhythmias, demand accurate prediction, yet labeled data is scarce because such events are rare and costly to annotate. We introduce HEPA (Horizon-conditioned Event Predictive Architecture), built on two key principles. First, a causal Transformer encoder is pretrained via a Joint-Embedding Predictive Architecture (JEPA): a horizon-conditioned predictor learns to forecast future representations rather than future values, forcing the encoder to capture predictable temporal dynamics from unlabeled data alone. Second, we freeze the encoder and finetune only the predictor toward the target event, producing a monotonic survival cumulative distribution function (CDF) over horizons. With fixed architecture and optimiser hyperparameters across all benchmarks, HEPA handles water contamination, cyberattack detection, volatility regimes, and eight further event types across 11 domains, exceeding leading time-series architectures including PatchTST, iTransformer, MAE, and Chronos-2 on at least 10 of 14 benchmarks, with an order of magnitude fewer tuned parameters and, on lifecycle datasets, an order of magnitude less labeled data.

preprint2026arXiv

Mathematical theory of deep learning

This book provides an introduction to the mathematical analysis of deep learning. It covers fundamental results in approximation theory, optimization theory, and statistical learning theory, which are the three main pillars of deep neural network theory. Serving as a guide for students and researchers in mathematics and related fields, the book aims to equip readers with foundational knowledge on the topic. It prioritizes simplicity over generality, and presents rigorous yet accessible results to help build an understanding of the essential mathematical concepts underpinning deep learning.

preprint2022arXiv

Neural network approximation and estimation of classifiers with classification boundary in a Barron class

We prove bounds for the approximation and estimation of certain binary classification functions using ReLU neural networks. Our estimation bounds provide a priori performance guarantees for empirical risk minimization using networks of a suitable size, depending on the number of training samples available. The obtained approximation and estimation rates are independent of the dimension of the input, showing that the curse of dimensionality can be overcome in this setting; in fact, the input dimension only enters in the form of a polynomial factor. Regarding the regularity of the target classification function, we assume the interfaces between the different classes to be locally of Barron-type. We complement our results by studying the relations between various Barron-type spaces that have been proposed in the literature. These spaces differ substantially more from each other than the current literature suggests.

preprint2021arXiv

Equivalence of approximation by convolutional neural networks and fully-connected networks

Convolutional neural networks are the most widely used type of neural networks in applications. In mathematical analysis, however, mostly fully-connected networks are studied. In this paper, we establish a connection between both network architectures. Using this connection, we show that all upper and lower bounds concerning approximation rates of {fully-connected} neural networks for functions $f \in \mathcal{C}$ -- for an arbitrary function class $\mathcal{C}$ -- translate to essentially the same bounds concerning approximation rates of convolutional neural networks for functions $f \in {\mathcal{C}^{equi}}$, with the class ${\mathcal{C}^{equi}}$ consisting of all translation equivariant functions whose first coordinate belongs to $\mathcal{C}$. All presented results consider exclusively the case of convolutional neural networks without any pooling operation and with circular convolutions, i.e., not based on zero-padding.

preprint2020arXiv

A Theoretical Analysis of Deep Neural Networks and Parametric PDEs

We derive upper bounds on the complexity of ReLU neural networks approximating the solution maps of parametric partial differential equations. In particular, without any knowledge of its concrete shape, we use the inherent low-dimensionality of the solution manifold to obtain approximation rates which are significantly superior to those provided by classical neural network approximation results. Concretely, we use the existence of a small reduced basis to construct, for a large variety of parametric partial differential equations, neural networks that yield approximations of the parametric solution maps in such a way that the sizes of these networks essentially only depend on the size of the reduced basis.

preprint2020arXiv

Efficient Approximation of Solutions of Parametric Linear Transport Equations by ReLU DNNs

We demonstrate that deep neural networks with the ReLU activation function can efficiently approximate the solutions of various types of parametric linear transport equations. For non-smooth initial conditions, the solutions of these PDEs are high-dimensional and non-smooth. Therefore, approximation of these functions suffers from a curse of dimension. We demonstrate that through their inherent compositionality deep neural networks can resolve the characteristic flow underlying the transport equations and thereby allow approximation rates independent of the parameter dimension.

preprint2020arXiv

Numerical Solution of the Parametric Diffusion Equation by Deep Neural Networks

We perform a comprehensive numerical study of the effect of approximation-theoretical results for neural networks on practical learning problems in the context of numerical analysis. As the underlying model, we study the machine-learning-based solution of parametric partial differential equations. Here, approximation theory predicts that the performance of the model should depend only very mildly on the dimension of the parameter space and is determined by the intrinsic dimension of the solution manifold of the parametric partial differential equation. We use various methods to establish comparability between test-cases by minimizing the effect of the choice of test-cases on the optimization and sampling aspects of the learning problem. We find strong support for the hypothesis that approximation-theoretical effects heavily influence the practical behavior of learning problems in numerical analysis.

preprint2020arXiv

Topological properties of the set of functions generated by neural networks of fixed size

We analyze the topological properties of the set of functions that can be implemented by neural networks of a fixed size. Surprisingly, this set has many undesirable properties. It is highly non-convex, except possibly for a few exotic activation functions. Moreover, the set is not closed with respect to $L^p$-norms, $0 < p < \infty$, for all practically-used activation functions, and also not closed with respect to the $L^\infty$-norm for all practically-used activation functions except for the ReLU and the parametric ReLU. Finally, the function that maps a family of weights to the function computed by the associated network is not inverse stable for every practically used activation function. In other words, if $f_1, f_2$ are two functions realized by neural networks and if $f_1, f_2$ are close in the sense that $\|f_1 - f_2\|_{L^\infty} \leq \varepsilon$ for $\varepsilon > 0$, it is, regardless of the size of $\varepsilon$, usually not possible to find weights $w_1, w_2$ close together such that each $f_i$ is realized by a neural network with weights $w_i$. Overall, our findings identify potential causes for issues in the training procedure of deep learning such as no guaranteed convergence, explosion of parameters, and slow convergence.