Source author record

Francesco Sovrano

Francesco Sovrano appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Artificial Intelligence cs.CY Machine Learning Robotics

Catalog footprint

What is connected

5works

4topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Neuron-Anchored Rule Extraction for Large Language Models via Contrastive Hierarchical Ablation

A key goal of explainable AI (XAI) is to express the decision logic of large language models (LLMs) in symbolic form and link it to internal mechanisms. Global rule-extraction methods typically learn symbolic surrogates without grounding rules in model circuitry, while mechanistic interpretability can connect behaviors to neuron sets but often depends on hand-crafted hypotheses and expensive neuron-level interventions. We introduce MechaRule, a pipeline that grounds rule extraction in LLM circuits by efficiently localizing sparse neurons called agonists, whose activation neutralization disrupts rule-related behaviors. MechaRule rests on two empirical observations. First, within a fixed baseline/flip regime, sparse agonist effects can be approximately monotone and saturating: a few dominant neuron activations can overtop weaker ones at coarse scales, while overlapping neurons flip many of the same examples. This motivates viewing localization as adaptive group testing driven by a regime-conditional strength predicate with confidence-guided conservative pruning, yielding Theta(k log(N/k) + k) interventions over N candidates when k << N neurons are agonists under the monotone-overtopping abstraction. Second, agonists emerge more reliably when ablations are verified through data splits aligned with close-to-faithful rule behavior; spectral splits remain a useful rule-free fallback, while unfaithful splits degrade localization. Empirically, overtopping appears mainly in learned, task-aligned regimes: on arithmetic and jailbreak tasks across Qwen2 and GPT-J, MechaRule recalls 96.8% of high-effect brute-force agonists in completed comparisons, and suppressing localized agonists reduces arithmetic accuracy and jailbreak success by up to 71.1% and 8.8%, respectively.

preprint2022arXiv

Foreseeing the Impact of the Proposed AI Act on the Sustainability and Safety of Critical Infrastructures

The AI Act has been recently proposed by the European Commission to regulate the use of AI in the EU, especially on high-risk applications, i.e. systems intended to be used as safety components in the management and operation of road traffic and the supply of water, gas, heating and electricity. On the other hand, IEC 61508, one of the most adopted international standards for safety-critical electronic components, seem to mostly forbid the use of AI in such systems. Given this conflict between IEC 61508 and the proposed AI Act, also stressed by the fact that IEC 61508 is not an harmonised European standard, with the present paper we study and analyse what is going to happen to industry after the entry into force of the AI Act. In particular, we focus on how the proposed AI Act might positively impact on the sustainability of critical infrastructures by allowing the use of AI on an industry where it was previously forbidden. To do so, we provide several examples of AI-based solutions falling under the umbrella of IEC 61508 that might have a positive impact on sustainability in alignment with the current long-term goals of the EU and the Sustainable Development Goals of the United Nations, i.e., affordable and clean energy, sustainable cities and communities.

preprint2021arXiv

A Survey on Methods and Metrics for the Assessment of Explainability under the Proposed AI Act

This study discusses the interplay between metrics used to measure the explainability of the AI systems and the proposed EU Artificial Intelligence Act. A standardisation process is ongoing: several entities (e.g. ISO) and scholars are discussing how to design systems that are compliant with the forthcoming Act and explainability metrics play a significant role. This study identifies the requirements that such a metric should possess to ease compliance with the AI Act. It does so according to an interdisciplinary approach, i.e. by departing from the philosophical concept of explainability and discussing some metrics proposed by scholars and standardisation entities through the lenses of the explainability obligations set by the proposed AI Act. Our analysis proposes that metrics to measure the kind of explainability endorsed by the proposed AI Act shall be risk-focused, model-agnostic, goal-aware, intelligible & accessible. This is why we discuss the extent to which these requirements are met by the metrics currently under discussion.

preprint2021arXiv

Explanation-Aware Experience Replay in Rule-Dense Environments

Human environments are often regulated by explicit and complex rulesets. Integrating Reinforcement Learning (RL) agents into such environments motivates the development of learning mechanisms that perform well in rule-dense and exception-ridden environments such as autonomous driving on regulated roads. In this paper, we propose a method for organising experience by means of partitioning the experience buffer into clusters labelled on a per-explanation basis. We present discrete and continuous navigation environments compatible with modular rulesets and 9 learning tasks. For environments with explainable rulesets, we convert rule-based explanations into case-based explanations by allocating state-transitions into clusters labelled with explanations. This allows us to sample experiences in a curricular and task-oriented manner, focusing on the rarity, importance, and meaning of events. We label this concept Explanation-Awareness (XA). We perform XA experience replay (XAER) with intra and inter-cluster prioritisation, and introduce XA-compatible versions of DQN, TD3, and SAC. Performance is consistently superior with XA versions of those algorithms, compared to traditional Prioritised Experience Replay baselines, indicating that explanation engineering can be used in lieu of reward engineering for environments with explainable features.

preprint2021arXiv

Making Things Explainable vs Explaining: Requirements and Challenges under the GDPR

The European Union (EU) through the High-Level Expert Group on Artificial Intelligence (AI-HLEG) and the General Data Protection Regulation (GDPR) has recently posed an interesting challenge to the eXplainable AI (XAI) community, by demanding a more user-centred approach to explain Automated Decision-Making systems (ADMs). Looking at the relevant literature, XAI is currently focused on producing explainable software and explanations that generally follow an approach we could term One-Size-Fits-All, that is unable to meet a requirement of centring on user needs. One of the causes of this limit is the belief that making things explainable alone is enough to have pragmatic explanations. Thus, insisting on a clear separation between explainabilty (something that can be explained) and explanations, we point to explanatorY AI (YAI) as an alternative and more powerful approach to win the AI-HLEG challenge. YAI builds over XAI with the goal to collect and organize explainable information, articulating it into something we called user-centred explanatory discourses. Through the use of explanatory discourses/narratives we represent the problem of generating explanations for Automated Decision-Making systems (ADMs) into the identification of an appropriate path over an explanatory space, allowing explainees to interactively explore it and produce the explanation best suited to their needs.

Francesco Sovrano

What is connected

Connect this record

See the researcher in context

Building this map preview

5 published item(s)

Neuron-Anchored Rule Extraction for Large Language Models via Contrastive Hierarchical Ablation

Foreseeing the Impact of the Proposed AI Act on the Sustainability and Safety of Critical Infrastructures

A Survey on Methods and Metrics for the Assessment of Explainability under the Proposed AI Act

Explanation-Aware Experience Replay in Rule-Dense Environments

Making Things Explainable vs Explaining: Requirements and Challenges under the GDPR