Researcher profile

Michele Palladino

Michele Palladino contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 17 - Unverified
Verification L1
Unclaimed author
4 works
0 followers
4 topics
4 close collaborators

Published work

4 published items

Preprint, 2022, arXiv

Convergence results for an averaged LQR problem with applications to reinforcement learning

This paper deals with a Linear Quadratic Optimal Control problem with unknown dynamics. As a modeling assumption, we suppose that the knowledge an agent has of the current system is represented by a probability distribution $π$ on the space of matrices. Furthermore, we assume that this probability measure is suitably updated to account for the increased experience that the agent gains while exploring the environment, so that it approximates the underlying dynamics with increasing accuracy. Under these assumptions, we show that the optimal control obtained by solving the "average" Linear Quadratic Optimal Control problem with respect to a certain $π$ converges to the optimal control of the Linear Quadratic Optimal Control problem governed by the actual, underlying dynamics. This approach is closely related to model-based Reinforcement Learning algorithms in which prior and posterior probability distributions describing the knowledge of the uncertain system are recursively updated. In the last section, we present a numerical test that confirms the theoretical results.
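
The convergence claim in this abstract can be illustrated with a toy computation. The sketch below is not the paper's setting or method: it assumes a scalar, discrete-time, finite-horizon system x_{k+1} = a x_k + b u_k with open-loop controls, where only the coefficient a is uncertain and $π$ is replaced by a finite set of samples. All function names and parameter values are hypothetical, chosen for illustration only.

```python
import numpy as np


def lifted_maps(a, b, horizon):
    """Return (phi, gamma) such that the stacked states x_1..x_N equal
    phi * x0 + gamma @ u for the scalar system x_{k+1} = a x_k + b u_k."""
    phi = np.array([a ** (k + 1) for k in range(horizon)])
    gamma = np.zeros((horizon, horizon))
    for k in range(horizon):
        for j in range(k + 1):
            gamma[k, j] = a ** (k - j) * b
    return phi, gamma


def averaged_open_loop_control(a_samples, b, x0, horizon, q=1.0, r=1.0):
    """Minimize the sample average of sum_k (q x_k^2 + r u_k^2) over
    open-loop controls u. The averaged cost is quadratic in u, so the
    minimizer solves one linear system."""
    n = len(a_samples)
    hessian = r * np.eye(horizon)
    linear = np.zeros(horizon)
    for a in a_samples:
        phi, gamma = lifted_maps(a, b, horizon)
        hessian += q * gamma.T @ gamma / n
        linear += q * (gamma.T @ phi) * x0 / n
    return -np.linalg.solve(hessian, linear)


# Hypothetical values: true coefficient, input gain, initial state, horizon.
a_true, b, x0, horizon = 0.9, 1.0, 1.0, 10

# Control for the true dynamics (a distribution concentrated on a_true).
u_true = averaged_open_loop_control([a_true], b, x0, horizon)

# The same noise draws, scaled down, mimic a distribution that
# concentrates around the true coefficient.
rng = np.random.default_rng(0)
noise = rng.standard_normal(200)

gaps = []
for sigma in (0.5, 0.1, 0.001):
    u_avg = averaged_open_loop_control(a_true + sigma * noise, b, x0, horizon)
    gaps.append(float(np.linalg.norm(u_avg - u_true)))
    print(f"sigma={sigma}: distance to true optimal control = {gaps[-1]:.3e}")
```

In this toy setting, the distance between the averaged control and the control for the true dynamics shrinks as the samples concentrate around a_true, which is the qualitative behavior the abstract describes.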

Preprint, 2021, arXiv

Stability-Constrained Markov Decision Processes Using MPC

In this paper, we consider solving discounted Markov Decision Processes (MDPs) under the constraint that the resulting policy is stabilizing. In practice, MDPs are solved using some form of policy approximation. We leverage recent results that propose Model Predictive Control (MPC) as a structured policy in the context of Reinforcement Learning, which makes it possible to introduce stability requirements directly inside the MPC-based policy. This restricts the solution of the MDP to stabilizing policies by construction. The stability theory for MPC is most mature for the undiscounted case. Hence, we first show that stable discounted MDPs can be reformulated as undiscounted ones. This observation entails that the MPC-based policy with stability requirements produces the optimal policy for the discounted MDP if that policy is stable, and the best stabilizing policy otherwise.

Preprint, 2020, arXiv

Variational Problems for Tree Roots and Branches

This paper studies two classes of variational problems introduced in [7], related to the optimal shapes of tree roots and branches. Given a measure $μ$ describing the distribution of leaves, a sunlight functional $\mathcal{S}(μ)$ computes the total amount of light captured by the leaves. For a measure $μ$ describing the distribution of root hair cells, a harvest functional $\mathcal{H}(μ)$ computes the total amount of water and nutrients gathered by the roots. In both cases, we seek a measure $μ$ that maximizes these functionals subject to a ramified transportation cost, for transporting nutrients from the roots to the trunk or from the trunk to the leaves. Compared with [7], here we do not impose any a priori bound on the total mass of the optimal measure $μ$, and more careful a priori estimates are thus required. In the unconstrained optimization problem for branches, we prove that an optimal measure exists, with bounded support and bounded total mass. In the unconstrained problem for tree roots, we prove that an optimal measure exists, with bounded support but possibly unbounded total mass. The last section of the paper analyzes how the size of the optimal tree depends on the parameters defining the various functionals.

Preprint, 2019, arXiv

A geometrically based criterion to avoid infimum-gaps in Optimal Control

In optimal control theory, the expression "infimum gap" means a strictly negative difference between the infimum value of a given minimum problem and the infimum value of a new problem obtained from the former by extending the original family V of controls to a larger family W. For some classes of domain extensions, such as convex relaxation or impulsive embedding of unbounded control problems, the normality of an extended minimizer has been shown to be sufficient to avoid an infimum gap. A natural issue is then the search for a general hypothesis under which the criterion 'normality implies no gap' holds true. We prove that, far from being a peculiarity of those specific extensions and from requiring the convexity of the extended dynamics, this criterion is valid provided the original family V of controls is abundant in the extended family W. Abundance, which is stronger than the mere C^0-density of the original trajectories in the set of extended trajectories, is a dynamical-topological notion introduced by J. Warga, and is used here in a 'non-convex' version which, moreover, is adapted to differential manifolds. To obtain the main result, which is based on set separation arguments, we prove an open mapping result valid for Quasi-Differential-Quotient (QDQ) approximating cones, a notion of 'tangent cone' that arises as a particular specification of H. Sussmann's Approximate-Generalized-Differential-Quotients (AGDQ) approximating cone.