Source author record

Mohamed Chetouani

Mohamed Chetouani appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Robotics; Artificial Intelligence; Human-Computer Interaction; Machine Learning; Computation and Language; Computer Science and Game Theory; cs.CY; eess.AS; Multiagent Systems; Sound

Catalog footprint

What is connected

11 works

10 topics

4 close collaborators


preprint · 2026 · arXiv

PRISM: Perception Reasoning Interleaved for Sequential Decision Making

Scaling LLM-based embodied agents from text-only environments to complex multimodal settings remains a major challenge. Recent work identifies a perception-reasoning-decision gap in standalone Vision-Language Models (VLMs), which often overlook task-critical information. In this paper, we introduce PRISM, a framework that tightly couples perception (VLM) and decision (LLM) through a dynamic question-answer (DQA) pipeline. Instead of passively accepting the VLM's description, the LLM critiques it, probes the VLM with goal-oriented questions, and synthesizes a compact image description. This closed-loop interaction yields a sharp, task-driven understanding of the scene. We evaluate PRISM on the ALFWorld and Room-to-Room (R2R) benchmarks. We show that: (1) PRISM significantly outperforms state-of-the-art image-based models, (2) our interactive goal-oriented perception pipeline yields systematic and substantial gains, and (3) PRISM is fully automatic, eliminating the need for handcrafted questions or answers.
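The closed loop described in this abstract (describe, critique and question, answer, synthesize) might be sketched as follows. This is an illustration only, not the PRISM implementation: the function name `dqa_round`, its parameters and the four callables are hypothetical stand-ins for VLM and LLM calls, and the `max_questions` cap is an assumption that the abstract does not mention.

```python
from typing import Callable, List, Tuple

# Hypothetical interfaces; the abstract does not specify any API.
Describe = Callable[[str], str]                 # image id -> initial VLM description
AskQuestions = Callable[[str, str], List[str]]  # (goal, description) -> LLM follow-up questions
Answer = Callable[[str, str], str]              # (image id, question) -> VLM answer
Synthesize = Callable[[str, str, List[Tuple[str, str]]], str]  # -> compact description


def dqa_round(
    image: str,
    goal: str,
    describe: Describe,
    ask_questions: AskQuestions,
    answer: Answer,
    synthesize: Synthesize,
    max_questions: int = 3,  # assumed cap, not taken from the paper
) -> str:
    """One question-answer round between a perception model and a decision model."""
    # 1. The perception model gives an initial description of the image.
    description = describe(image)
    # 2. The decision model critiques it and proposes goal-oriented questions.
    questions = ask_questions(goal, description)[:max_questions]
    # 3. The perception model answers each question about the same image.
    qa_pairs = [(q, answer(image, q)) for q in questions]
    # 4. The decision model condenses everything into a compact description.
    return synthesize(goal, description, qa_pairs)
```

Passing the models in as plain callables keeps the sketch independent of any particular VLM or LLM client.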

preprint · 2022 · arXiv

A new approach to evaluating legibility: Comparing legibility frameworks using framework-independent robot motion trajectories

Robots that share an environment with humans may communicate their intent using a variety of different channels. Movement is one of these channels and, particularly in manipulation tasks, intent communication via movement is called legibility. It alters a robot's trajectory to make it intent expressive. Here we propose a novel evaluation method that improves the data efficiency of collected experimental data when benchmarking approaches generating such legible behavior. The primary novelty of the proposed method is that it uses trajectories that were generated independently of the framework being tested. This makes evaluation easier, enables N-way comparisons between approaches, and allows easier comparison across papers. We demonstrate the efficiency of the new evaluation method by comparing 10 legibility frameworks in 2 scenarios. The paper, thus, provides readers with (1) a novel approach to investigate and/or benchmark legibility, (2) an overview of existing frameworks, (3) an evaluation of 10 legibility frameworks (from 6 papers), and (4) evidence that viewing angle and trajectory progression matter when users evaluate the legibility of a motion.

preprint · 2022 · arXiv

A New Nonlinear speaker parameterization algorithm for speaker identification

In this paper we propose a new parameterization algorithm based on nonlinear prediction, which is an extension of the classical LPC parameters. The parameters' performance is estimated by two different methods: the Arithmetic-Harmonic Sphericity (AHS) and the Auto-Regressive Vector Model (ARVM). Two different methods are proposed for the parameterization based on the Neural Predictive Coding (NPC): classical neural networks initialization and linear initialization. We applied these two parameters to speaker identification. The first parameters obtained smaller rates. We show for the first parameters how they can be combined with the classical parameters (LPCC, MFCC, etc.) in order to improve the results of only one classical parameterization (MFCC provides 97.55% and MFCC+NPC 98.78%). For the linear initialization, we obtain 100%, which is a great improvement. This study opens a new way towards different parameterization schemes that offer better accuracy on speaker recognition tasks.

preprint · 2022 · arXiv

Learning Collective Action under Risk Diversity

Collective risk dilemmas (CRDs) are a class of n-player games that represent societal challenges where groups need to coordinate to avoid the risk of a disastrous outcome. Multi-agent systems incurring such dilemmas face difficulties achieving cooperation and often converge to sub-optimal, risk-dominant solutions where everyone defects. In this paper we investigate the consequences of risk diversity in groups of agents learning to play CRDs. We find that risk diversity poses new challenges to cooperation that are not observed in homogeneous groups. We show that increasing risk diversity significantly reduces overall cooperation and hinders collective target achievement. It leads to asymmetrical changes in agents' policies -- i.e. the increase in contributions from individuals at high risk is unable to compensate for the decrease in contributions from individuals at low risk -- which overall reduces the total contributions in a population. When comparing RL behaviors to rational individualistic and social behaviors, we find that RL populations converge to fairer contributions among agents. Our results highlight the need for aligning risk perceptions among agents or developing new learning techniques that explicitly account for risk diversity.
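To make the idea of a collective risk dilemma with per-agent risk concrete, here is a minimal sketch of a one-shot threshold game. The abstract does not give the payoff function, so everything here is an assumption: each agent starts with the same `endowment`, keeps whatever it does not contribute, and, if the group misses the `threshold`, loses what it kept with its own probability `risks[i]`. The function name `crd_expected_payoffs` is hypothetical.

```python
from typing import List, Sequence


def crd_expected_payoffs(
    contributions: Sequence[float],
    endowment: float,
    threshold: float,
    risks: Sequence[float],
) -> List[float]:
    """Expected payoff per agent in an assumed one-shot threshold game."""
    if len(contributions) != len(risks):
        raise ValueError("need one risk value per agent")
    # Each agent keeps whatever it did not contribute.
    kept = [endowment - c for c in contributions]
    # Collective target reached: everyone keeps the remainder for sure.
    if sum(contributions) >= threshold:
        return kept
    # Target missed: agent i loses its remainder with probability risks[i],
    # so its expected payoff is (1 - risks[i]) * kept[i].
    return [(1.0 - p) * k for p, k in zip(risks, kept)]
```

Under these assumptions, an agent with a low risk value gives up little in expectation when the target is missed, which is one way to see why low-risk individuals have less incentive to contribute.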

preprint · 2022 · arXiv

Two ways to make your robot proactive: reasoning about human intentions, or reasoning about possible futures

Robots sharing their space with humans need to be proactive in order to be helpful. Proactive robots are able to act on their own initiative in an anticipatory way to benefit humans. In this work, we investigate two ways to make robots proactive. One way is to recognize humans' intentions and to act to fulfill them, like opening the door that you are about to cross. The other way is to reason about possible future threats or opportunities and to act to prevent or to foster them, like recommending you to take an umbrella since rain has been forecasted. In this paper, we present approaches to realize these two types of proactive behavior. We then present an integrated system that can generate proactive robot behavior by reasoning on both factors: intentions and predictions. We illustrate our system on a sample use case including a domestic robot and a human. We first run this use case with the two separate proactive systems, intention-based and prediction-based, and then run it with our integrated system. The results show that the integrated system is able to take into account a broader variety of aspects that are needed for proactivity.

preprint · 2021 · arXiv

Explainable Agents Through Social Cues: A Review

The issue of how to make embodied agents explainable has experienced a surge of interest over the last three years, and there are many terms that refer to this concept, e.g., transparency or legibility. One reason for this high variance in terminology is the unique array of social cues that embodied agents can access in contrast to that accessed by non-embodied agents. Another reason is that different authors use these terms in different ways. Hence, we review the existing literature on explainability and organize it by (1) providing an overview of existing definitions, (2) showing how explainability is implemented and how it exploits different social cues, and (3) showing how the impact of explainability is measured. Additionally, we present a list of open questions and challenges that highlight areas that require further investigation by the community. This provides the interested reader with an overview of the current state of the art.

preprint · 2021 · arXiv

Grounding Language to Autonomously-Acquired Skills via Goal Generation

We are interested in the autonomous acquisition of repertoires of skills. Language-conditioned reinforcement learning (LC-RL) approaches are great tools in this quest, as they allow abstract goals to be expressed as sets of constraints on the states. However, most LC-RL agents are not autonomous and cannot learn without external instructions and feedback. Besides, their direct language condition cannot account for the goal-directed behavior of pre-verbal infants and strongly limits the expression of behavioral diversity for a given language input. To resolve these issues, we propose a new conceptual approach to language-conditioned RL: the Language-Goal-Behavior architecture (LGB). LGB decouples skill learning and language grounding via an intermediate semantic representation of the world. To showcase the properties of LGB, we present a specific implementation called DECSTR. DECSTR is an intrinsically motivated learning agent endowed with an innate semantic representation describing spatial relations between physical objects. In a first stage (G -> B), it freely explores its environment and targets self-generated semantic configurations. In a second stage (L -> G), it trains a language-conditioned goal generator to generate semantic goals that match the constraints expressed in language-based inputs. We showcase the additional properties of LGB w.r.t. both an end-to-end LC-RL approach and a similar approach leveraging non-semantic, continuous intermediate representations. Intermediate semantic representations help satisfy language commands in a diversity of ways, enable strategy switching after a failure and facilitate language grounding.

preprint · 2020 · arXiv

Language-Conditioned Goal Generation: a New Approach to Language Grounding for RL

In the real world, linguistic agents are also embodied agents: they perceive and act in the physical world. The notion of Language Grounding questions the interactions between language and embodiment: how do learning agents connect or ground linguistic representations to the physical world? This question has recently been approached by the Reinforcement Learning community under the framework of instruction-following agents. In these agents, behavioral policies or reward functions are conditioned on the embedding of an instruction expressed in natural language. This paper proposes another approach: using language to condition goal generators. Given any goal-conditioned policy, one could train a language-conditioned goal generator to generate language-agnostic goals for the agent. This method decouples sensorimotor learning from language acquisition and enables agents to demonstrate a diversity of behaviors for any given instruction. We propose a particular instantiation of this approach and demonstrate its benefits.

preprint · 2020 · arXiv

MobiAxis: An Embodied Learning Task for Teaching Multiplication with a Social Robot

The use of robots in educational settings is growing increasingly popular. Yet, many of the learning tasks involving social robots do not take full advantage of their physical embodiment. MobiAxis is a proposed learning task which uses the physical capabilities of a Pepper robot to teach the concepts of positive and negative multiplication along a number line. The robot is embodied with a number of multi-modal socially intelligent features and behaviours which are designed to enhance learning. This paper is a position paper describing the technical and theoretical implementation of the task, as well as proposed directions for future studies.

preprint · 2015 · arXiv

Towards engagement models that consider individual factors in HRI: on the relation of extroversion and negative attitude towards robots to gaze and speech during a human-robot assembly task

Estimating engagement is critical for human-robot interaction. Engagement measures typically rely on the dynamics of the social signals exchanged by the partners, especially speech and gaze. However, the dynamics of these signals are likely to be influenced by individual and social factors, such as personality traits, as it is well documented that they critically influence how two humans interact with each other. Here, we assess the influence of two factors, namely extroversion and negative attitude toward robots, on speech and gaze during a cooperative task, where a human must physically manipulate a robot to assemble an object. We evaluate whether the scores of extroversion and negative attitude towards robots covary with the duration and frequency of gaze and speech cues. The experiments were carried out with the humanoid robot iCub and N=56 adult participants. We found that the more extroverted people are, the more and longer they tend to talk with the robot; and the more negative their attitude towards robots, the less they look at the robot's face and the more they look at the robot's hands, where the assembly and the contacts occur. Our results confirm and provide evidence that the engagement models classically used in human-robot interaction should take into account attitudes and personality traits.

preprint · 2015 · arXiv

Trust as indicator of robot functional and social acceptance. An experimental study on user conformation to the iCub's answers

To investigate the functional and social acceptance of a humanoid robot, we carried out an experimental study with 56 adult participants and the iCub robot. Trust in the robot has been considered as a main indicator of acceptance in decision-making tasks characterized by perceptual uncertainty (e.g., evaluating the weight of two objects) and socio-cognitive uncertainty (e.g., evaluating which is the most suitable item in a specific context), and measured by the participants' conformation to the iCub's answers to specific questions. In particular, we were interested in understanding whether specific (i) user-related features (i.e. desire for control), (ii) robot-related features (i.e., attitude towards social influence of robots), and (iii) context-related features (i.e., collaborative vs. competitive scenario), may influence their trust towards the iCub robot. We found that participants conformed more to the iCub's answers when their decisions were about functional issues than when they were about social issues. Moreover, the few participants conforming to the iCub's answers for social issues also conformed less for functional issues. Trust in the robot's functional savvy does not thus seem to be a pre-requisite for trust in its social savvy. Finally, desire for control, attitude towards social influence of robots and type of interaction scenario did not influence the trust in iCub. Results are discussed with relation to methodology of HRI research.
