Researcher profile

Felix Dietrich

Felix Dietrich contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
13works
0followers
8topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

13 published item(s)

preprint2026arXiv

Dictionary learning for Kernel EDMD

Studying nonlinear dynamical systems through their state space behavior can be challenging, and one possible alternative is to analyze them via their associated Koopman operator. This turns the nonlinear problem into a linear, infinite-dimensional one. To approximate the operator in finite dimensions, extended dynamic mode decomposition (EDMD) is a commonly used algorithm. It requires a finite list of functionals and a set of snapshots from the system to compute an approximation of the operator and its corresponding spectrum. Instead of choosing the list of functionals directly, it can be implicitly defined via kernels, a method known as kernel extended dynamic mode decomposition (kEDMD). However, one still needs to define the kernel and choose its parameter values. In this paper, we aim to streamline this process by extending dictionary learning for EDMD to kernel learning in kEDMD. By simplifying kEDMD we show how to perform gradient-based optimization over the learnable kernel parameters, and demonstrate that this method leads to useful kernels for the original kEDMD. The focus of our work is a method that takes a weighted list of kernels with randomly initialized values as input and outputs a list of kernels and parameter values suitable for approximating the Koopman operator of the underlying system. We demonstrate that unimportant kernels can be removed from the list by analyzing the weights in the weighted sum. We evaluate the method across several experiments, including the Duffing oscillator and the Kuramoto-Sivashinsky PDE, showcasing the method's different strengths.

preprint2022arXiv

Learning the temporal evolution of multivariate densities via normalizing flows

In this work, we propose a method to learn multivariate probability distributions using sample path data from stochastic differential equations. Specifically, we consider temporally evolving probability distributions (e.g., those produced by integrating local or nonlocal Fokker-Planck equations). We analyze this evolution through machine learning assisted construction of a time-dependent mapping that takes a reference distribution (say, a Gaussian) to each and every instance of our evolving distribution. If the reference distribution is the initial condition of a Fokker-Planck equation, what we learn is the time-T map of the corresponding solution. Specifically, the learned map is a multivariate normalizing flow that deforms the support of the reference density to the support of each and every density snapshot in time. We demonstrate that this approach can approximate probability density function evolutions in time from observed sampled data for systems driven by both Brownian and Lévy noise. We present examples with two- and three-dimensional, uni- and multimodal distributions to validate the method.

preprint2022arXiv

On the Parameter Combinations That Matter and on Those That do Not

We present a data-driven approach to characterizing nonidentifiability of a model's parameters and illustrate it through dynamic as well as steady kinetic models. By employing Diffusion Maps and their extensions, we discover the minimal combinations of parameters required to characterize the output behavior of a chemical system: a set of effective parameters for the model. Furthermore, we introduce and use a Conformal Autoencoder Neural Network technique, as well as a kernel-based Jointly Smooth Function technique, to disentangle the redundant parameter combinations that do not affect the output behavior from the ones that do. We discuss the interpretability of our data-driven effective parameters, and demonstrate the utility of the approach both for behavior prediction and parameter estimation. In the latter task, it becomes important to describe level sets in parameter space that are consistent with a particular output behavior. We validate our approach on a model of multisite phosphorylation, where a reduced set of effective parameters (nonlinear combinations of the physical ones) has previously been established analytically.

preprint2022arXiv

Personalized Algorithm Generation: A Case Study in Learning ODE Integrators

We study the learning of numerical algorithms for scientific computing, which combines mathematically driven, handcrafted design of general algorithm structure with a data-driven adaptation to specific classes of tasks. This represents a departure from the classical approaches in numerical analysis, which typically do not feature such learning-based adaptations. As a case study, we develop a machine learning approach that automatically learns effective solvers for initial value problems in the form of ordinary differential equations (ODEs), based on the Runge-Kutta (RK) integrator architecture. We show that we can learn high-order integrators for targeted families of differential equations without the need for computing integrator coefficients by hand. Moreover, we demonstrate that in certain cases we can obtain superior performance to classical RK methods. This can be attributed to certain properties of the ODE families being identified and exploited by the approach. Overall, this work demonstrates an effective learning-based approach to the design of algorithms for the numerical solution of differential equations. This can be readily extended to other numerical tasks.

preprint2022arXiv

Safe Policy Improvement Approaches and their Limitations

Safe Policy Improvement (SPI) is an important technique for offline reinforcement learning in safety critical applications as it improves the behavior policy with a high probability. We classify various SPI approaches from the literature into two groups, based on how they utilize the uncertainty of state-action pairs. Focusing on the Soft-SPIBB (Safe Policy Improvement with Soft Baseline Bootstrapping) algorithms, we show that their claim of being provably safe does not hold. Based on this finding, we develop adaptations, the Adv-Soft-SPIBB algorithms, and show that they are provably safe. A heuristic adaptation, Lower-Approx-Soft-SPIBB, yields the best performance among all SPIBB algorithms in extensive experiments on two benchmarks. We also check the safety guarantees of the provably safe algorithms and show that huge amounts of data are necessary such that the safety bounds become useful in practice.

preprint2022arXiv

Safe Policy Improvement Approaches on Discrete Markov Decision Processes

Safe Policy Improvement (SPI) aims at provable guarantees that a learned policy is at least approximately as good as a given baseline policy. Building on SPI with Soft Baseline Bootstrapping (Soft-SPIBB) by Nadjahi et al., we identify theoretical issues in their approach, provide a corrected theory, and derive a new algorithm that is provably safe on finite Markov Decision Processes (MDP). Additionally, we provide a heuristic algorithm that exhibits the best performance among many state of the art SPI algorithms on two different benchmarks. Furthermore, we introduce a taxonomy of SPI algorithms and empirically show an interesting property of two classes of SPI algorithms: while the mean performance of algorithms that incorporate the uncertainty as a penalty on the action-value is higher, actively restricting the set of policies more consistently produces good policies and is, thus, safer.

preprint2021arXiv

LOCA: LOcal Conformal Autoencoder for standardized data coordinates

We propose a deep-learning based method for obtaining standardized data coordinates from scientific measurements.Data observations are modeled as samples from an unknown, non-linear deformation of an underlying Riemannian manifold, which is parametrized by a few normalized latent variables. By leveraging a repeated measurement sampling strategy, we present a method for learning an embedding in $\mathbb{R}^d$ that is isometric to the latent variables of the manifold. These data coordinates, being invariant under smooth changes of variables, enable matching between different instrumental observations of the same phenomenon. Our embedding is obtained using a LOcal Conformal Autoencoder (LOCA), an algorithm that constructs an embedding to rectify deformations by using a local z-scoring procedure while preserving relevant geometric information. We demonstrate the isometric embedding properties of LOCA on various model settings and observe that it exhibits promising interpolation and extrapolation capabilities. Finally, we apply LOCA to single-site Wi-Fi localization data, and to $3$-dimensional curved surface estimation based on a $2$-dimensional projection.

preprint2021arXiv

Spectral Discovery of Jointly Smooth Features for Multimodal Data

In this paper, we propose a spectral method for deriving functions that are jointly smooth on multiple observed manifolds. This allows us to register measurements of the same phenomenon by heterogeneous sensors, and to reject sensor-specific noise. Our method is unsupervised and primarily consists of two steps. First, using kernels, we obtain a subspace spanning smooth functions on each separate manifold. Then, we apply a spectral method to the obtained subspaces and discover functions that are jointly smooth on all manifolds. We show analytically that our method is guaranteed to provide a set of orthogonal functions that are as jointly smooth as possible, ordered by increasing Dirichlet energy from the smoothest to the least smooth. In addition, we show that the extracted functions can be efficiently extended to unseen data using the Nyström method. We demonstrate the proposed method on both simulated and real measured data and compare the results to nonlinear variants of the seminal Canonical Correlation Analysis (CCA). Particularly, we show superior results for sleep stage identification. In addition, we show how the proposed method can be leveraged for finding minimal realizations of parameter spaces of nonlinear dynamical systems.

preprint2021arXiv

Transformations between deep neural networks

We propose to test, and when possible establish, an equivalence between two different artificial neural networks by attempting to construct a data-driven transformation between them, using manifold-learning techniques. In particular, we employ diffusion maps with a Mahalanobis-like metric. If the construction succeeds, the two networks can be thought of as belonging to the same equivalence class. We first discuss transformation functions between only the outputs of the two networks; we then also consider transformations that take into account outputs (activations) of a number of internal neurons from each network. In general, Whitney's theorem dictates the number of measurements from one of the networks required to reconstruct each and every feature of the second network. The construction of the transformation function relies on a consistent, intrinsic representation of the network input space. We illustrate our algorithm by matching neural network pairs trained to learn (a) observations of scalar functions; (b) observations of two-dimensional vector fields; and (c) representations of images of a moving three-dimensional object (a rotating horse). The construction of such equivalence classes across different network instantiations clearly relates to transfer learning. We also expect that it will be valuable in establishing equivalence between different Machine Learning-based models of the same phenomenon observed through different instruments and by different research groups.

preprint2020arXiv

Domain Adaptation with Optimal Transport on the Manifold of SPD matrices

In this paper, we address the problem of Domain Adaptation (DA) using Optimal Transport (OT) on Riemannian manifolds. We model the difference between two domains by a diffeomorphism and use the polar factorization theorem to claim that OT is indeed optimal for DA in a well-defined sense, up to a volume preserving map. We then focus on the manifold of Symmetric and Positive-Definite (SPD) matrices, whose structure provided a useful context in recent applications. We demonstrate the polar factorization theorem on this manifold. Due to the uniqueness of the weighted Riemannian mean, and by exploiting existing regularized OT algorithms, we formulate a simple algorithm that maps the source domain to the target domain. We test our algorithm on two Brain-Computer Interface (BCI) data sets and observe state of the art performance.

preprint2020arXiv

On Learning Hamiltonian Systems from Data

Concise, accurate descriptions of physical systems through their conserved quantities abound in the natural sciences. In data science, however, current research often focuses on regression problems, without routinely incorporating additional assumptions about the system that generated the data. Here, we propose to explore a particular type of underlying structure in the data: Hamiltonian systems, where an "energy" is conserved. Given a collection of observations of such a Hamiltonian system over time, we extract phase space coordinates and a Hamiltonian function of them that acts as the generator of the system dynamics. The approach employs an autoencoder neural network component to estimate the transformation from observations to the phase space of a Hamiltonian system. An additional neural network component is used to approximate the Hamiltonian function on this constructed space, and the two components are trained jointly. As an alternative approach, we also demonstrate the use of Gaussian processes for the estimation of such a Hamiltonian. After two illustrative examples, we extract an underlying phase space as well as the generating Hamiltonian from a collection of movies of a pendulum. The approach is fully data-driven, and does not assume a particular form of the Hamiltonian function.

preprint2020arXiv

On the Koopman operator of algorithms

A systematic mathematical framework for the study of numerical algorithms would allow comparisons, facilitate conjugacy arguments, as well as enable the discovery of improved, accelerated, data-driven algorithms. Over the course of the last century, the Koopman operator has provided a mathematical framework for the study of dynamical systems, which facilitates conjugacy arguments and can provide efficient reduced descriptions. More recently, numerical approximations of the operator have enabled the analysis of a large number of deterministic and stochastic dynamical systems in a completely data-driven, essentially equation-free pipeline. Discrete or continuous time numerical algorithms (integrators, nonlinear equation solvers, optimization algorithms) are themselves dynamical systems. In this paper, we use this insight to leverage the Koopman operator framework in the data-driven study of such algorithms and discuss benefits for analysis and acceleration of numerical computation. For algorithms acting on high-dimensional spaces by quickly contracting them towards low-dimensional manifolds, we demonstrate how basis functions adapted to the data help to construct efficient reduced representations of the operator. Our illustrative examples include the gradient descent and Nesterov optimization algorithms, as well as the Newton-Raphson algorithm.

preprint2019arXiv

Manifold Learning for Organizing Unstructured Sets of Process Observations

Data mining is routinely used to organize ensembles of short temporal observations so as to reconstruct useful, low-dimensional realizations of an underlying dynamical system. In this paper, we use manifold learning to organize unstructured ensembles of observations ("trials") of a system's response surface. We have no control over where every trial starts; and during each trial operating conditions are varied by turning "agnostic" knobs, which change system parameters in a systematic but unknown way. As one (or more) knobs "turn" we record (possibly partial) observations of the system response. We demonstrate how such partial and disorganized observation ensembles can be integrated into coherent response surfaces whose dimension and parametrization can be systematically recovered in a data-driven fashion. The approach can be justified through the Whitney and Takens embedding theorems, allowing reconstruction of manifolds/attractors through different types of observations. We demonstrate our approach by organizing unstructured observations of response surfaces, including the reconstruction of a cusp bifurcation surface for Hydrogen combustion in a Continuous Stirred Tank Reactor. Finally, we demonstrate how this observation-based reconstruction naturally leads to informative transport maps between input parameter space and output/state variable spaces.