Researcher profile

Sören Hohmann

Sören Hohmann contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
9 works
0 followers
8 topics
4 close collaborators

Published work

9 published items

preprint | 2022 | arXiv

Excitation for Adaptive Optimal Control of Nonlinear Systems in Differential Games

This work focuses on the fulfillment of the Persistent Excitation (PE) condition for signals that result from polynomial transformations. This is essential, e.g., for the convergence of Adaptive Dynamic Programming algorithms, which commonly use polynomial function approximators. As theoretical statements regarding the nonlinear transformation of PE signals are scarce, we propose conditions on the system state such that its transformation by polynomials is PE. To validate our theoretical statements, we develop an exemplary excitation procedure based on our conditions using a feedforward control approach and demonstrate the effectiveness of our method in a nonzero-sum differential game. In this setting, our approach outperforms commonly used probing noise in terms of convergence time and the degree of PE, as shown by a numerical example.
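As a rough illustration of what the PE condition asks of a polynomially transformed state, the sketch below uses the standard textbook definition: a regressor phi(x(t)) is PE over a window if the integrated Gram matrix of phi is positive definite, and the smallest eigenvalue serves as a simple measure of the degree of excitation. The basis `poly_features`, the helper `pe_level`, and the test signals are hypothetical choices made for this example; they are not taken from the paper, and this check is not the paper's proposed condition.

```python
import numpy as np

def poly_features(x):
    """Hypothetical polynomial basis [x, x^2, x^3] for a scalar state x."""
    return np.array([x, x**2, x**3])

def pe_level(states, dt):
    """Smallest eigenvalue of the Gram matrix sum(phi phi^T) * dt over the window.

    A value clearly above zero indicates excitation on this window;
    a value near zero indicates that the regressor is not PE there.
    """
    phi = np.array([poly_features(x) for x in states])  # shape (N, 3)
    gram = phi.T @ phi * dt                              # Riemann-sum approximation
    return float(np.linalg.eigvalsh(gram)[0])            # eigenvalues are ascending

dt = 0.01
t = np.arange(0.0, 10.0, dt)
rich_state = np.sin(t) + 0.5 * np.sin(3.0 * t)  # assumed multi-frequency state signal
constant_state = np.ones_like(t)                # assumed constant state signal

print("multi-frequency state:", pe_level(rich_state, dt))
print("constant state:", pe_level(constant_state, dt))
```

For the constant state, every sample of the regressor is the same vector, so the Gram matrix has rank one and its smallest eigenvalue is numerically zero. The multi-frequency state visits many distinct values, which makes the three polynomial terms linearly independent over the window.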

preprint | 2021 | arXiv

Adaptive Optimal Trajectory Tracking Control Applied to a Large-Scale Ball-on-Plate System

While many theoretical works concerning Adaptive Dynamic Programming (ADP) have been proposed, application results are scarce. Therefore, we design an ADP-based optimal trajectory tracking controller and apply it to a large-scale ball-on-plate system. Our proposed method incorporates an approximated reference trajectory instead of using setpoint tracking and automatically compensates for constant offset terms. Due to the off-policy characteristics of the algorithm, the method requires only a small amount of measured data to train the controller. Our experimental results show that this tracking mechanism significantly reduces the control cost compared to setpoint controllers. Furthermore, a comparison with a model-based optimal controller highlights the benefits of our model-free, data-based ADP tracking controller, which requires neither a system model nor manual tuning; instead, the controller is tuned automatically using measured data.

preprint | 2020 | arXiv

A Scalable Port-Hamiltonian Approach to Plug-and-Play Voltage Stabilization in DC Microgrids

One of the major challenges of voltage stabilization in converter-based DC microgrids is the presence of multiple interacting units displaying intermittent supply behavior. In this paper, we address this with a decentralized, scalable, plug-and-play voltage controller for voltage-source converters (VSCs) at the primary level. In contrast to existing approaches, we follow a systematic and constructive design based on port-Hamiltonian systems (PHSs), which requires neither the heuristic proposition of a Lyapunov function nor the computation of auxiliary variables such as time derivatives. By employing the Hamiltonian naturally obtained from the PHS approach as a Lyapunov function and using the modularity of passive systems, we provide sufficient conditions under which the designed VSC controllers achieve microgrid-wide asymptotic voltage stability. Integral action (IA), which preserves the passive PHS structure, robustifies the design against unknown disturbances and ensures zero voltage errors in the steady state. Numerical simulations illustrate the functionality of the proposed voltage controller.

preprint | 2020 | arXiv

Adaptive Dynamic Programming for Model-free Tracking of Trajectories with Time-varying Parameters

In order to autonomously learn to control unknown systems optimally w.r.t. an objective function, Adaptive Dynamic Programming (ADP) is well-suited to adapt controllers based on experience from interaction with the system. In recent years, many researchers have focused on the tracking case, where the aim is to follow a desired trajectory. So far, ADP tracking controllers assume that the reference trajectory follows time-invariant exo-system dynamics, an assumption that does not hold for many applications. In order to overcome this limitation, we propose a new Q-function which explicitly incorporates a parametrized approximation of the reference trajectory. This allows learning to track a general class of trajectories by means of ADP. Once our Q-function has been learned, the associated controller copes with time-varying reference trajectories without further training and independently of exo-system dynamics. After proposing our general model-free off-policy tracking method, we provide an analysis of the important special case of linear quadratic tracking. We conclude our paper with an example which demonstrates that our new method successfully learns the optimal tracking controller and outperforms existing approaches in terms of tracking error and cost.

preprint | 2020 | arXiv

Inverse Dynamic Games Based on Maximum Entropy Inverse Reinforcement Learning

We consider the inverse problem of dynamic games, where cost function parameters are sought which explain observed behavior of interacting players. Maximum entropy inverse reinforcement learning is extended to the N-player case in order to solve inverse dynamic games with continuous-valued state and control spaces. We present methods for identification of cost function parameters from observed data which correspond to (i) a Pareto efficient solution, (ii) an open-loop Nash equilibrium or (iii) a feedback Nash equilibrium. Furthermore, we give results on the unbiasedness of the estimation of cost function parameters for each arising class of inverse dynamic game. The applicability of the methods is demonstrated with simulation examples of a nonlinear and a linear-quadratic dynamic game.

preprint | 2020 | arXiv

Multi-Robot Task Allocation and Scheduling Considering Cooperative Tasks and Precedence Constraints

In order to fully exploit the advantages inherent to cooperating heterogeneous multi-robot teams, sophisticated coordination algorithms are essential. Time-extended multi-robot task allocation approaches assign and schedule a set of tasks to a group of robots such that certain objectives are optimized and operational constraints are met. This is particularly challenging if cooperative tasks, i.e., tasks that require two or more robots to work directly together, are considered. In this paper, we present an easy-to-implement criterion to validate the feasibility, i.e., executability, of solutions to time-extended multi-robot task allocation problems with cross-schedule dependencies arising from the consideration of cooperative tasks and precedence constraints. Using the introduced feasibility criterion, we propose a local improvement heuristic based on a neighborhood operator for the problem class under consideration. The initial solution is obtained by a greedy constructive heuristic. Both methods use a generalized cost structure and are therefore able to handle various objective function instances. We evaluate the proposed approach using test scenarios of different problem sizes, all comprising the complexity aspects of the regarded problem. The simulation results illustrate the improvement potential arising from the application of the local improvement heuristic.

preprint | 2020 | arXiv

Optimal Control of Port-Hamiltonian Systems: A Time-Continuous Learning Approach

Feedback controllers for port-Hamiltonian systems reveal an intrinsic inverse optimality property since each passivating state feedback controller is optimal with respect to some specific performance index. Due to the nonlinear port-Hamiltonian system structure, however, explicit (forward) methods for optimal control of port-Hamiltonian systems require the generally intractable analytical solution of the Hamilton-Jacobi-Bellman equation. Adaptive dynamic programming methods provide a means to circumvent this issue. However, the few existing approaches for port-Hamiltonian systems hinge on very specific sub-classes of either performance indices or system dynamics, or require opaque guessing of stabilizing initial weights. In this paper, we contribute to this largely unexplored research area by proposing a time-continuous adaptive feedback controller for the optimal control of general time-continuous input-state-output port-Hamiltonian systems with respect to general Lagrangian performance indices. Its control law implements an online learning procedure which uses the Hamiltonian of the system as an initial value function candidate. The time-continuous learning of the value function is achieved by means of a certain Lagrange multiplier that allows the optimality of the current solution to be evaluated. In particular, constructive conditions for stabilizing initial weights are stated and asymptotic stability of the closed-loop equilibrium is proven. Our work concludes with simulations of exemplary linear and nonlinear optimization problems which demonstrate asymptotic convergence of the controllers resulting from the proposed online adaptation procedure.

preprint | 2020 | arXiv

Partner Approximating Learners (PAL): Simulation-Accelerated Learning with Explicit Partner Modeling in Multi-Agent Domains

Mixed cooperative-competitive control scenarios, such as human-machine interaction with individual goals of the interacting partners, are very challenging for reinforcement learning agents. In order to contribute towards intuitive human-machine collaboration, we focus on problems in the continuous state and control domain where no explicit communication is considered and the agents do not know the others' goals or control laws but only sense their control inputs retrospectively. Our proposed framework combines a learned partner model based on online data with a reinforcement learning agent that is trained in a simulated environment including the partner model. Thus, we overcome drawbacks of independent learners and, in addition, benefit from a reduced amount of real-world data required for reinforcement learning, which is vital in the human-machine context. We finally analyze an example that demonstrates the merits of our proposed framework, which learns fast due to the simulated environment and adapts to the continuously changing partner due to the partner approximation.

preprint | 2020 | arXiv

Passivity Conditions for Plug-and-Play Operation of Nonlinear Static AC Loads

The complexity arising in AC microgrids from multiple interacting distributed generation units (DGUs) with intermittent supply behavior requires local voltage-source inverters (VSIs) to be controlled in a distributed or decentralized manner at primary level. In (Strehle et al., 2019), we use passivity theory to design decentralized, plug-and-play voltage and frequency controllers for such VSIs. However, the stability analysis of the closed-loop system requires a load-connected topology, in contrast to real grids where loads are arbitrarily located. In this paper, we expand our former approach by considering the more realistic and general case of nonlinear static AC loads (ZIP and exponential) at arbitrary locations within an AC microgrid. Investigating the monotonicity of differentiable mappings, we derive sufficient inequality conditions for the strict passivity of these nonlinear static AC loads. Together with our plug-and-play VSI controller, this allows us to use passivity arguments to infer asymptotic voltage and frequency stability for AC microgrids with arbitrary topologies. An illustrative simulation validating our theoretical findings concludes our work.
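For readers unfamiliar with the load classes named in this abstract, the sketch below evaluates the standard textbook forms of the two static load models: the ZIP model, a weighted sum of constant-impedance (Z), constant-current (I) and constant-power (P) parts, and the exponential model. The function names, the coefficients and the nominal values are hypothetical and chosen only for illustration. The sketch covers active power only and does not implement the paper's passivity conditions.

```python
def zip_power(v, p0, v0, a_z, a_i, a_p):
    """Active power of a ZIP load at voltage magnitude v.

    p0 is the power at nominal voltage v0; a_z, a_i and a_p are the
    constant-impedance, constant-current and constant-power shares,
    which are usually chosen to sum to one.
    """
    ratio = v / v0
    return p0 * (a_z * ratio**2 + a_i * ratio + a_p)

def exponential_power(v, p0, v0, n):
    """Active power of an exponential load with voltage exponent n."""
    return p0 * (v / v0) ** n

# Assumed example values: 1 kW nominal load on a 230 V bus.
P0, V0 = 1000.0, 230.0
for v in (0.9 * V0, V0, 1.1 * V0):
    p_zip = zip_power(v, P0, V0, a_z=0.3, a_i=0.3, a_p=0.4)
    p_exp = exponential_power(v, P0, V0, n=2.0)
    print(f"V = {v:6.1f} V  ZIP: {p_zip:7.1f} W  exponential: {p_exp:7.1f} W")
```

With an exponent of n = 2 the exponential model reduces to a pure constant-impedance load, and with n = 0 to a pure constant-power load, so the two models overlap in these special cases.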