Source author record

Giacomo Arcieri

Giacomo Arcieri appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Catalog footprint

What is connected

1works
1topics
2close collaborators

Actions

Connect this record

Log in to claim

Research graph

See the researcher in context

Open full explorer

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

1 published item(s)

preprint2022arXiv

Which Model to Trust: Assessing the Influence of Models on the Performance of Reinforcement Learning Algorithms for Continuous Control Tasks

The need for algorithms able to solve Reinforcement Learning (RL) problems with few trials has motivated the advent of model-based RL methods. The reported performance of model-based algorithms has dramatically increased within recent years. However, it is not clear how much of the recent progress is due to improved algorithms or due to improved models. While different modeling options are available to choose from when applying a model-based approach, the distinguishing traits and particular strengths of different models are not clear. The main contribution of this work lies precisely in assessing the model influence on the performance of RL algorithms. A set of commonly adopted models is established for the purpose of model comparison. These include Neural Networks (NNs), ensembles of NNs, two different approximations of Bayesian NNs (BNNs), that is, the Concrete Dropout NN and the Anchored Ensembling, and Gaussian Processes (GPs). The model comparison is evaluated on a suite of continuous control benchmarking tasks. Our results reveal that significant differences in model performance do exist. The Concrete Dropout NN reports persistently superior performance. We summarize these differences for the benefit of the modeler and suggest that the model choice is tailored to the standards required by each specific application.