Source author record

Marco Caccamo

Marco Caccamo appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Machine Learning, Robotics, Systems and Control, eess.SY, Cryptography and Security, eess.SP, Information Theory, math.IT

Catalog footprint

What is connected

8 works

8 topics

4 close collaborators


preprint, 2026, arXiv

Higher Resolution, Better Generalization: Unlocking Visual Scaling in Deep Reinforcement Learning

Pixel-based deep reinforcement learning agents are typically trained on heavily downsampled visual observations, a convention inherited from early benchmarks rather than grounded in principled design. In this work, we show that observation resolution is a critical yet overlooked variable for policy learning: higher-resolution inputs can substantially improve both performance and generalization, provided the network architecture can process them effectively. We find that the widely used Impala encoder, which flattens spatial features into a vector, suffers from quadratic parameter growth as resolution increases and fails to leverage the additional visual detail. Replacing this operation with global average pooling, as in the Impoola architecture, decouples parameter count from resolution and yields consistent improvements across resolutions and network widths; at their respective best conditions, visual scaling unlocks a 28% performance gain for Impoola over Impala. These gains are strongest in environments that require precise perception of small or distant objects, and gradient saliency analysis confirms that the underlying mechanism is a more spatially localized visual attention of the policy at higher resolutions. Our results challenge the prevailing practice of aggressive input downsampling and position resolution-independent architectures as a simple, effective path toward scalable visual deep RL. To facilitate future research on resolution scaling in deep RL, we publicly release the open-source code for the Procgen-HD benchmark: https://github.com/raphajaner/procgen-hd.
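The parameter-growth claim in this abstract can be illustrated with simple counting. The sketch below is not taken from the paper's code: the channel count, downsampling factor, and hidden width are assumed values, and the function names are hypothetical. It only compares the size of a dense layer fed by a flattened feature map with the same layer fed by a globally average-pooled vector.

```python
def flatten_head_params(resolution, channels=32, downsample=8, hidden=256):
    """Weights + biases of a dense layer fed by a flattened C x H' x W' feature map.

    channels, downsample and hidden are illustrative assumptions, not values
    confirmed by the abstract.
    """
    side = resolution // downsample  # spatial side length after the conv stack
    return channels * side * side * hidden + hidden


def pooled_head_params(channels=32, hidden=256):
    """Same dense layer, but fed by a global-average-pooled vector of length C."""
    return channels * hidden + hidden


if __name__ == "__main__":
    for res in (64, 128, 256):
        print(res, flatten_head_params(res), pooled_head_params())
```

Under these assumptions, doubling the input side length quadruples the weight count of the flattened head, while the pooled head stays the same size at every resolution.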

preprint, 2022, arXiv

Cloud-Edge Training Architecture for Sim-to-Real Deep Reinforcement Learning

Deep reinforcement learning (DRL) is a promising approach to solve complex control tasks by learning policies through interactions with the environment. However, the training of DRL policies requires large amounts of training experience, making it impractical to learn the policy directly on physical systems. Sim-to-real approaches leverage simulations to pretrain DRL policies and then deploy them in the real world. Unfortunately, the direct real-world deployment of pretrained policies usually suffers from performance deterioration due to the different dynamics, known as the reality gap. Recent sim-to-real methods, such as domain randomization and domain adaptation, focus on improving the robustness of the pretrained agents. Nevertheless, the simulation-trained policies often need to be tuned with real-world data to reach optimal performance, which is challenging due to the high cost of real-world samples. This work proposes a distributed cloud-edge architecture to train DRL agents in the real world in real time. In the architecture, inference is assigned to the edge and training to the cloud, separating the real-time control loop from the computationally expensive training loop. To overcome the reality gap, our architecture exploits sim-to-real transfer strategies to continue the training of simulation-pretrained agents on a physical system. We demonstrate its applicability on a physical inverted-pendulum control system, analyzing critical parameters. The real-world experiments show that our architecture can adapt the pretrained DRL agents to unseen dynamics consistently and efficiently.

preprint, 2022, arXiv

Sandboxing (AI-based) Unverified Controllers in Stochastic Games: An Abstraction-based Approach with Safe-visor Architecture

In this paper, we propose a construction scheme for a Safe-visor architecture for sandboxing unverified controllers, e.g., artificial intelligence-based (a.k.a. AI-based) controllers, in two-player non-cooperative stochastic games. Concretely, we leverage abstraction-based approaches to construct a supervisor that checks and decides whether or not to accept the inputs provided by the unverified controller, and a safety advisor that provides fallback control inputs to ensure safety whenever the unverified controller is rejected. Moreover, by leveraging an $(\epsilon,\delta)$-approximate probabilistic relation between the original game and its finite abstraction, we provide a formal safety guarantee with respect to safety specifications modeled by deterministic finite automata (DFA), while the functionality of the unverified controllers is still exploited. To show the effectiveness of the proposed results, we apply them to a control problem of a quadrotor tracking a moving ground vehicle, in which an AI-based unverified controller is employed to control the quadrotor.

preprint, 2022, arXiv

Synthesizing Safety Controllers for Uncertain Linear Systems: A Direct Data-driven Approach

In this paper, we provide a direct data-driven approach to synthesize safety controllers for unknown linear systems affected by unknown-but-bounded disturbances, in which identifying the unknown model is not required. First, we propose a notion of $\gamma$-robust safety invariant ($\gamma$-RSI) sets and their associated state-feedback controllers, which can be applied to enforce invariance properties. Then, we formulate a data-driven computation of these sets in terms of convex optimization problems with linear matrix inequalities (LMI) as constraints, which can be solved based on a finite number of data samples collected from a single input-state trajectory of the system. To show the effectiveness of the proposed approach, we apply our results to a 4-dimensional inverted pendulum.

preprint, 2021, arXiv

UAV Coverage Path Planning under Varying Power Constraints using Deep Reinforcement Learning

Coverage path planning (CPP) is the task of designing a trajectory that enables a mobile agent to travel over every point of an area of interest. We propose a new method to control an unmanned aerial vehicle (UAV) carrying a camera on a CPP mission with random start positions and multiple options for landing positions in an environment containing no-fly zones. While numerous approaches have been proposed to solve similar CPP problems, we leverage end-to-end reinforcement learning (RL) to learn a control policy that generalizes over varying power constraints for the UAV. Despite recent improvements in battery technology, the maximum flying range of small UAVs is still a severe constraint, which is exacerbated by variations in the UAV's power consumption that are hard to predict. By using map-like input channels to feed spatial information through convolutional network layers to the agent, we are able to train a double deep Q-network (DDQN) to make control decisions for the UAV, balancing limited power budget and coverage goal. The proposed method can be applied to a wide variety of environments and harmonizes complex goal structures with system constraints.
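For readers unfamiliar with the double deep Q-network named in this abstract, the standard double DQN update target can be sketched as follows. This is the generic textbook rule, not the authors' implementation; the function name, the discount factor default, and the plain-list representation of Q-values are assumptions made for illustration.

```python
def ddqn_target(reward, next_q_online, next_q_target, gamma=0.95, done=False):
    """Double DQN target: the online network picks the next action,
    the target network supplies that action's value.

    next_q_online / next_q_target are per-action Q-value lists for the next
    state (hypothetical inputs; a real agent would get them from two networks).
    """
    if done:
        # No bootstrapping from a terminal state.
        return reward
    best_action = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    return reward + gamma * next_q_target[best_action]


if __name__ == "__main__":
    # The online network prefers action 1, so the target network's value
    # for action 1 is the one that gets bootstrapped.
    print(ddqn_target(1.0, [0.2, 0.9, 0.1], [0.5, 0.3, 0.8], gamma=0.5))
```

Separating action selection from action evaluation in this way is what distinguishes double DQN from plain DQN, which takes the maximum over the target network's own values.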

preprint, 2021, arXiv

UAV Path Planning using Global and Local Map Information with Deep Reinforcement Learning

Path planning methods for autonomous unmanned aerial vehicles (UAVs) are typically designed for one specific type of mission. This work presents a method for autonomous UAV path planning based on deep reinforcement learning (DRL) that can be applied to a wide range of mission scenarios. Specifically, we compare coverage path planning (CPP), where the UAV's goal is to survey an area of interest, to data harvesting (DH), where the UAV collects data from distributed Internet of Things (IoT) sensor devices. By exploiting structured map information of the environment, we train double deep Q-networks (DDQNs) with identical architectures on these two distinctly different mission scenarios to make movement decisions that balance the respective mission goal with navigation constraints. By introducing a novel approach exploiting a compressed global map of the environment combined with a cropped but uncompressed local map showing the vicinity of the UAV agent, we demonstrate that the proposed method can efficiently scale to large environments. We also extend previous results for generalizing control policies that require no retraining when scenario parameters change and offer a detailed analysis of crucial map processing parameters' effects on path planning performance.
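The global/local map idea described in this abstract can be sketched on a toy grid. The code below is an illustration only: the abstract does not state how the global map is compressed, so average pooling over non-overlapping blocks is an assumption, as are the function names, the zero padding, and the single-channel map.

```python
def crop_local(grid, row, col, size, pad=0):
    """Return an uncompressed size x size window centered on (row, col),
    filling cells that fall outside the map with `pad`."""
    half = size // 2
    rows, cols = len(grid), len(grid[0])
    window = []
    for r in range(row - half, row - half + size):
        line = []
        for c in range(col - half, col - half + size):
            inside = 0 <= r < rows and 0 <= c < cols
            line.append(grid[r][c] if inside else pad)
        window.append(line)
    return window


def compress_global(grid, factor):
    """Average-pool the full map over non-overlapping factor x factor blocks
    (assumed compression scheme, chosen for illustration)."""
    rows, cols = len(grid), len(grid[0])
    pooled = []
    for r in range(0, rows - rows % factor, factor):
        line = []
        for c in range(0, cols - cols % factor, factor):
            block = [grid[r + i][c + j] for i in range(factor) for j in range(factor)]
            line.append(sum(block) / len(block))
        pooled.append(line)
    return pooled


if __name__ == "__main__":
    # Toy 8x8 single-channel map with one marked cell at (1, 6).
    world = [[0] * 8 for _ in range(8)]
    world[1][6] = 1
    print(crop_local(world, row=0, col=7, size=3))
    print(compress_global(world, factor=4))
```

The local window keeps full detail near the agent at a fixed size, while the pooled map summarizes the whole environment in far fewer cells, so the combined input grows much more slowly than the raw map as the environment gets larger.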

preprint, 2020, arXiv

UAV Path Planning for Wireless Data Harvesting: A Deep Reinforcement Learning Approach

Autonomous deployment of unmanned aerial vehicles (UAVs) supporting next-generation communication networks requires efficient trajectory planning methods. We propose a new end-to-end reinforcement learning (RL) approach to UAV-enabled data collection from Internet of Things (IoT) devices in an urban environment. An autonomous drone is tasked with gathering data from distributed sensor nodes subject to limited flying time and obstacle avoidance. Previous approaches, both learning-based and non-learning-based, must perform expensive recomputations or relearn a behavior when important scenario parameters change, such as the number of sensors, sensor positions, or maximum flying time. In contrast, we train a double deep Q-network (DDQN) with combined experience replay to learn a UAV control policy that generalizes over changing scenario parameters. By exploiting a multi-layer map of the environment fed through convolutional network layers to the agent, we show that our proposed network architecture enables the agent to make movement decisions for a variety of scenario parameters that balance the data collection goal with flight time efficiency and safety constraints. Considerable advantages in learning efficiency from using a map centered on the UAV's position over a non-centered map are also illustrated.

preprint, 2012, arXiv

S3A: Secure System Simplex Architecture for Enhanced Security of Cyber-Physical Systems

Until recently, cyber-physical systems, especially those with safety-critical properties that manage critical infrastructure (e.g., power generation plants and water treatment facilities), were considered to be invulnerable to software security breaches. The recently discovered 'W32.Stuxnet' worm has drastically changed this perception by demonstrating that such systems are susceptible to external attacks. Here we present an architecture that enhances the security of safety-critical cyber-physical systems despite the presence of such malware. Our architecture uses the property that control systems have deterministic execution behavior to detect an intrusion within 0.6 μs while still guaranteeing the safety of the plant. We also show that even if an attack is successful, the overall state of the physical system will still remain safe. Even if the operating system's administrative privileges have been compromised, our architecture will still be able to protect the physical system from coming to harm.
