Source author record

Francesco Belardinelli

Francesco Belardinelli appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Topics: Artificial Intelligence; Logic in Computer Science; Multiagent Systems; Machine Learning; Computer Science and Game Theory; Cryptography and Security

Catalog footprint

What is connected

12 works

6 topics

4 close collaborators


Preprint, 2026, arXiv

Behaviour Policy Optimization: Provably Lower Variance Return Estimates for Off-Policy Reinforcement Learning

Many reinforcement learning algorithms, particularly those that rely on return estimates for policy improvement, can suffer from poor sample efficiency and training instability due to high-variance return estimates. In this paper we leverage new results from off-policy evaluation; it has recently been shown that well-designed behaviour policies can be used to collect off-policy data for provably lower variance return estimates. This result is surprising as it means collecting data on-policy is not variance optimal. We extend this key insight to the online reinforcement learning setting, where both policy evaluation and improvement are interleaved to learn optimal policies. Off-policy RL has been well studied (e.g., IMPALA), with correct and truncated importance weighted samples for de-biasing and managing variance appropriately. Generally these approaches are concerned with reconciling data collected from multiple workers in parallel: while the policy is updated asynchronously, the mismatch between the workers and the policy is corrected in a mathematically sound way. Here we consider only one worker, the behaviour policy, which is used to collect data for policy improvement, with provably lower variance return estimates. In our experiments we extend two policy-gradient methods with this regime, demonstrating better sample efficiency and performance over a diverse set of environments.
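The claim that on-policy data collection is not variance optimal can be illustrated on a one-step toy problem. The sketch below is not the paper's algorithm: it assumes a two-action bandit with known, strictly positive, deterministic rewards, and relies on the textbook importance-sampling fact that a behaviour policy proportional to pi(a)·|r(a)| minimises the variance of the weighted estimate. The names `is_variance`, `pi`, `b_star` and `r` are illustrative only.

```python
def is_variance(pi, b, r):
    """Exact variance of the one-sample importance-weighted return estimate.

    pi: target policy probabilities
    b:  behaviour policy probabilities used to draw the action
    r:  deterministic reward per action (toy assumption)
    """
    mean = sum(p * x for p, x in zip(pi, r))  # true expected return under pi
    second_moment = sum(
        bp * ((p / bp) * x) ** 2 for p, bp, x in zip(pi, b, r) if bp > 0
    )
    return second_moment - mean ** 2


pi = [0.5, 0.5]  # target policy (hypothetical values)
r = [1.0, 3.0]   # reward of each action (hypothetical values)

# On-policy collection: the behaviour policy equals the target policy.
on_policy_var = is_variance(pi, pi, r)

# Off-policy collection with a behaviour policy proportional to pi(a) * |r(a)|.
scores = [p * abs(x) for p, x in zip(pi, r)]
b_star = [s / sum(scores) for s in scores]
off_policy_var = is_variance(pi, b_star, r)

print(on_policy_var, off_policy_var)
```

In this toy case every importance-weighted sample equals the true expected return, so the off-policy variance vanishes while the on-policy variance does not. In practice the rewards are unknown, which is why a behaviour policy has to be learned rather than computed in closed form.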

Preprint, 2026, arXiv

Tatemae: Detecting Alignment Faking via Tool Selection in LLMs

Alignment faking (AF) occurs when an LLM strategically complies with training objectives to avoid value modification, reverting to prior preferences once monitoring is lifted. Current detection methods focus on conversational settings and rely primarily on Chain-of-Thought (CoT) analysis, which provides a reliable signal when strategic reasoning surfaces, but cannot distinguish deception from capability failures if traces are absent or unfaithful. We formalize AF as a composite behavioural event and detect it through observable tool selection, where the LLM selects the safe tool when unmonitored, but switches to the unsafe tool under monitoring that rewards helpfulness over safety, while its reasoning still acknowledges the safe choice. We release a dataset of 108 enterprise IT scenarios spanning Security, Privacy, and Integrity domains under Corruption and Sabotage pressures. Evaluating six frontier LLMs across five independent runs, we find mean AF detection rates between 3.5% and 23.7%, with vulnerability profiles varying by domain and pressure type. These results suggest that susceptibility reflects training methodology rather than capability alone.
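The composite behavioural event described in the abstract can be written as a simple predicate over paired observations. This is a minimal sketch under assumed names: the `Trial` schema, the `"safe"`/`"unsafe"` labels and the helper functions are hypothetical and do not reflect the released dataset or the authors' evaluation code.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    """One scenario observed under both conditions (hypothetical schema)."""
    tool_unmonitored: str       # tool chosen when the model is not monitored
    tool_monitored: str         # tool chosen under monitoring that rewards helpfulness
    reasoning_names_safe: bool  # monitored reasoning still acknowledges the safe tool


def is_alignment_faking(t: Trial, safe: str = "safe", unsafe: str = "unsafe") -> bool:
    """All three conditions must hold together for the event to count."""
    return (
        t.tool_unmonitored == safe
        and t.tool_monitored == unsafe
        and t.reasoning_names_safe
    )


def detection_rate(trials) -> float:
    """Fraction of trials flagged as alignment faking."""
    return sum(is_alignment_faking(t) for t in trials) / len(trials)


trials = [
    Trial("safe", "unsafe", True),     # flagged: switches tool, reasoning names safe tool
    Trial("safe", "safe", True),       # consistent safe behaviour
    Trial("unsafe", "unsafe", False),  # unsafe in both conditions, not a switch
    Trial("safe", "unsafe", False),    # switches, but reasoning gives no acknowledgement
]
print(detection_rate(trials))
```

Requiring the conjunction rather than any single signal is what separates this event from a plain capability failure, where the unsafe tool would be chosen regardless of monitoring.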

Preprint, 2022, arXiv

In a Nutshell, the Human Asked for This: Latent Goals for Following Temporal Specifications

We address the problem of building agents whose goal is to learn to execute out-of-distribution (OOD) multi-task instructions expressed in temporal logic (TL) by using deep reinforcement learning (DRL). Recent works provided evidence that the agent's neural architecture is a key feature when DRL agents are learning to solve OOD tasks in TL. Yet, the studies on this topic are still in their infancy. In this work, we propose a new deep learning configuration with inductive biases that lead agents to generate latent representations of their current goal, yielding a stronger generalization performance. We use these latent-goal networks within a neuro-symbolic framework that executes multi-task formally-defined instructions and contrast the performance of the proposed neural networks against employing different state-of-the-art (SOTA) architectures when generalizing to unseen instructions in OOD environments.

Preprint, 2022, arXiv

Model Checking Strategic Abilities in Information-sharing Systems

We introduce a subclass of concurrent game structures (CGS) with imperfect information in which agents are endowed with private data-sharing capabilities. Importantly, our CGSs are such that it is still decidable to model-check these CGSs against a relevant fragment of ATL. These systems can be thought of as a generalisation of architectures allowing information forks, in the sense that, in the initial states of the system, we allow information forks from agents outside a given set A to agents inside A. For this reason, together with the fact that the communication in our models underpins a specialised form of broadcast, we call our formalism A-cast systems. To underline, the fragment of ATL for which we show the model-checking problem to be decidable over A-cast is a large and significant one; it expresses coalitions over agents in any subset of the set A. Indeed, as we show, our systems and this ATL fragment can encode security problems that are notoriously hard to express faithfully: terrorist-fraud attacks in identity schemes.

Preprint, 2022, arXiv

Program Semantics and a Verification Technique for Knowledge-Based Multi-Agent Systems

We give a relational and a weakest precondition semantics for "knowledge-based programs", i.e., programs that restrict observability of variables so as to richly express changes in the knowledge of agents who can or cannot observe said variables. Based on these knowledge-based programs, we define a program-epistemic logic to model complex epistemic properties of the execution of multi-agent systems. We translate the validity of program-epistemic logic formulae into first-order validity, using our weakest precondition semantics and an ingenious book-keeping of variable assignment. We implement our translation in Haskell in a general way (i.e., independently of the programs in the logical statements), and test this novel verification method for our new program-epistemic logic on a series of well-established examples.
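As background for the weakest precondition semantics mentioned above, the classical (non-epistemic) notion can be sketched in a few lines. This is only an illustration of the standard idea for deterministic, terminating programs; it ignores observability and knowledge entirely and is unrelated to the authors' Haskell implementation. The helpers `assign`, `seq` and `wp` are hypothetical names.

```python
def assign(var, expr):
    """Program `var := expr(state)`, modelled as a state transformer."""
    return lambda s: {**s, var: expr(s)}


def seq(*progs):
    """Sequential composition: run the given programs from left to right."""
    def run(s):
        for p in progs:
            s = p(s)
        return s
    return run


def wp(prog, post):
    """Weakest precondition of a deterministic, terminating program.

    A state satisfies wp(prog, post) exactly when running prog from it
    ends in a state that satisfies post.
    """
    return lambda s: post(prog(s))


# x := x + 1; y := 2 * x, with postcondition y > 4.
prog = seq(
    assign("x", lambda s: s["x"] + 1),
    assign("y", lambda s: 2 * s["x"]),
)
pre = wp(prog, lambda s: s["y"] > 4)

print(pre({"x": 2, "y": 0}), pre({"x": 1, "y": 0}))
```

Here the computed precondition is equivalent to x > 1, which is what substituting the assignments backwards into y > 4 yields by hand.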

Preprint, 2022, arXiv

Reasoning about Human-Friendly Strategies in Repeated Keyword Auctions

In online advertising, search engines sell ad placements for keywords continuously through auctions. This problem can be seen as an infinitely repeated game since the auction is executed whenever a user performs a query with the keyword. As advertisers may frequently change their bids, the game will have a large set of equilibria with potentially complex strategies. In this paper, we propose the use of natural strategies for reasoning in such a setting as they are processable by artificial agents with limited memory and/or computational power as well as understandable by human users. To reach this goal, we introduce a quantitative version of Strategy Logic with natural strategies in the setting of imperfect information. In a first step, we show how to model strategies for repeated keyword auctions and take advantage of the model for proving properties evaluating this game. In a second step, we study the logic in relation to the distinguishing power, expressivity, and model-checking complexity for strategies with and without recall.

Preprint, 2021, arXiv

Aggregating Bipolar Opinions (With Appendix)

We introduce a novel method to aggregate Bipolar Argumentation (BA) Frameworks expressing opinions by different parties in debates. We use Bipolar Assumption-based Argumentation (ABA) as an all-encompassing formalism for BA under different semantics. By leveraging recent results on judgement aggregation in Social Choice Theory, we prove several preservation results, both positive and negative, for relevant properties of Bipolar ABA.

Preprint, 2020, arXiv

A Hennessy-Milner Theorem for ATL with Imperfect Information

We show that a history-based variant of alternating bisimulation with imperfect information allows it to be related to a variant of Alternating-time Temporal Logic (ATL) with imperfect information by a full Hennessy-Milner theorem. The variant of ATL we consider has a common knowledge semantics, which requires that the uniform strategy available for a coalition to accomplish some goal must be common knowledge inside the coalition, while other semantic variants of ATL with imperfect information do not accommodate a Hennessy-Milner theorem. We also show that the existence of a history-based alternating bisimulation between two finite Concurrent Game Structures with imperfect information (iCGS) is undecidable.

Preprint, 2020, arXiv

Extended Markov Games to Learn Multiple Tasks in Multi-Agent Reinforcement Learning

The combination of Formal Methods with Reinforcement Learning (RL) has recently attracted interest as a way for single-agent RL to learn multiple-task specifications. In this paper we extend this convergence to multi-agent settings and formally define Extended Markov Games as a general mathematical model that allows multiple RL agents to concurrently learn various non-Markovian specifications. To introduce this new model we provide formal definitions and proofs as well as empirical tests of RL algorithms running on this framework. Specifically, we use our model to train two different logic-based multi-agent RL algorithms to solve diverse settings of non-Markovian co-safe LTL specifications.
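The reason a non-Markovian co-safe LTL specification becomes learnable is that its progress can be tracked by a finite automaton running alongside the environment. The sketch below shows only that underlying idea for a single agent and one hand-written specification, "eventually a, and from then on eventually b"; it is not the paper's definition of Extended Markov Games. The automaton, the labels and the reward convention are assumptions made for illustration.

```python
ACCEPT = 2  # automaton state meaning the specification has been satisfied


def dfa_step(q, label):
    """Advance the automaton for F(a and F b) on one set of observed propositions."""
    if q == 0 and "a" in label:
        q = 1  # 'a' has been seen, now waiting for 'b'
    if q == 1 and "b" in label:
        q = ACCEPT  # 'b' seen at or after 'a'
    return q


def run(trace):
    """Return the final automaton state and the per-step rewards for a label trace.

    A reward of 1.0 is given only on the step where the specification is first
    satisfied (hypothetical reward convention).
    """
    q, rewards = 0, []
    for label in trace:
        q_next = dfa_step(q, label)
        rewards.append(1.0 if q_next == ACCEPT and q != ACCEPT else 0.0)
        q = q_next
    return q, rewards


# Seeing 'b' before 'a' does not count, so the reward depends on the history.
print(run([{"b"}, {"a"}, set(), {"b"}]))
print(run([{"b"}, set()]))
```

The reward for observing `b` differs between the first and last step of the first trace, so it is not a function of the current observation alone. Pairing each environment state with the automaton state `q` restores the Markov property that standard RL algorithms assume.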

Preprint, 2014, arXiv

Interactions between Knowledge and Time in a First-Order Logic for Multi-Agent Systems: Completeness Results

We investigate a class of first-order temporal-epistemic logics for reasoning about multi-agent systems. We encode typical properties of systems including perfect recall, synchronicity, no learning, and having a unique initial state in terms of variants of quantified interpreted systems, a first-order extension of interpreted systems. We identify several monodic fragments of first-order temporal-epistemic logic and show their completeness with respect to their corresponding classes of quantified interpreted systems.

Preprint, 2014, arXiv

Reasoning about Knowledge and Strategies: Epistemic Strategy Logic

In this paper we introduce Epistemic Strategy Logic (ESL), an extension of Strategy Logic with modal operators for individual knowledge. This enhanced framework allows us to represent explicitly and to reason about the knowledge agents have of their own and other agents' strategies. We provide a semantics to ESL in terms of epistemic concurrent game models, and consider the corresponding model checking problem. We show that the complexity of model checking ESL is not worse than that of (non-epistemic) Strategy Logic.

Preprint, 2013, arXiv

Verification of Agent-Based Artifact Systems

Artifact systems are a novel paradigm for specifying and implementing business processes described in terms of interacting modules called artifacts. Artifacts consist of data and lifecycles, accounting respectively for the relational structure of the artifacts' states and their possible evolutions over time. In this paper we put forward artifact-centric multi-agent systems, a novel formalisation of artifact systems in the context of multi-agent systems operating on them. Differently from the usual process-based models of services, the semantics we give explicitly accounts for the data structures on which artifact systems are defined. We study the model checking problem for artifact-centric multi-agent systems against specifications written in a quantified version of temporal-epistemic logic expressing the knowledge of the agents in the exchange. We begin by noting that the problem is undecidable in general. We then identify two noteworthy restrictions, one syntactical and one semantical, that enable us to find bisimilar finite abstractions and therefore reduce the model checking problem to the instance on finite models. Under these assumptions we show that the model checking problem for these systems is EXPSPACE-complete. We then introduce artifact-centric programs, compact and declarative representations of the programs governing both the artifact system and the agents. We show that, while these in principle generate infinite-state systems, under natural conditions their verification problem can be solved on finite abstractions that can be effectively computed from the programs. Finally we exemplify the theoretical results of the paper through a mainstream procurement scenario from the artifact systems literature.
