Source author record

Matej Hoffmann

Matej Hoffmann appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher | Unclaimed source record

Robotics, Artificial Intelligence, Neurons and Cognition, Computer Vision, Neural and Evolutionary Computing, Human-Computer Interaction, Systems and Control

Catalog footprint

What is connected

18 works

7 topics

4 close collaborators

preprint | 2026 | arXiv

BLANKET: Anonymizing Faces in Infant Video Recordings

Ensuring the ethical use of video data involving human subjects, particularly infants, requires robust anonymization methods. We propose BLANKET (Baby-face Landmark-preserving ANonymization with Keypoint dEtection consisTency), a novel approach designed to anonymize infant faces in video recordings while preserving essential facial attributes. Our method comprises two stages. First, a new random face, compatible with the original identity, is generated via inpainting using a diffusion model. Second, the new identity is seamlessly incorporated into each video frame through temporally consistent face swapping with authentic expression transfer. The method is evaluated on a dataset of short video recordings of babies and is compared to the popular anonymization method, DeepPrivacy2. Key metrics assessed include the level of de-identification, preservation of facial attributes, impact on human pose estimation (as an example of a downstream task), and presence of artifacts. Both methods alter the identity, and our method outperforms DeepPrivacy2 in all other respects. The code is available as an easy-to-use anonymization demo at https://github.com/ctu-vras/blanket-infant-face-anonym.

preprint | 2026 | arXiv

Neuromorphic visual attention for Sign-language recognition on SpiNNaker

Sign-language recognition has achieved substantial gains in classification accuracy in recent years; however, the latency and power requirements of most existing methods limit their suitability for real-time deployment. Neuromorphic sensing and processing offer an alternative paradigm based on sparse, event-driven computation that supports low-latency and energy-efficient perception. In this work, we introduce an end-to-end neuromorphic architecture for American Sign Language (ASL) fingerspelling recognition that integrates a spiking visual attention mechanism for online region-of-interest extraction with a compact spiking neural network deployed on the SpiNNaker neuromorphic platform. We benchmark the proposed system against two datasets: a synthetically generated event-based version of the Sign Language MNIST dataset and a natively recorded ASL-DVS dataset, whilst providing a comprehensive overview of Sign-language recognition and related work. This work yields competitive performance in simulation (92.27%) and comparable performance on neuromorphic hardware deployment (83.1%), while achieving the most energy-efficient architecture (0.565 mW) and low latency (3 ms) across all benchmarked approaches. Despite its compact design, the system demonstrates the suitability of task-dependent visual attention applications for edge deployment.

preprint | 2026 | arXiv

Simulating Infant First-Person Sensorimotor Experience via Motion Retargeting from Babies to Humanoids

Motion retargeting from humans to human-like artificial agents is becoming increasingly important as humanoid robots grow more capable. However, most existing approaches focus only on reproducing kinematics and ignore the rich sensorimotor experience associated with human movement. In this work, we present a framework for simulating the multimodal sensorimotor experiences of infants using physical and virtual humanoids. From a single video, our method reconstructs the infant's body configuration by extracting its skeletal structure and estimating the full 3D pose from each frame. Then we map the reconstructed motion onto several developmental platforms: the physical iCub robot and the virtual simulators pyCub, EMFANT and MIMo. Replaying the retargeted motions on these embodiments produces simulated multisensory streams including proprioception (joints and muscles), touch, and vision. For the best-matching embodiment, the retargeting achieves sub-centimeter accuracy and enables a rich multimodal analysis of infant development as well as enhanced automated annotation of behaviors. This framework provides a unique window into the infant's sensorimotor experience, offering new tools for robotics, developmental science, and early detection of neurodevelopmental disorders. The code is available at https://github.com/ctu-vras/motion-retargeting/.

preprint | 2022 | arXiv

Active Visuo-Haptic Object Shape Completion

Recent advancements in object shape completion have enabled impressive object reconstructions using only visual input. However, due to self-occlusion, the reconstructions have high uncertainty in the occluded object parts, which negatively impacts the performance of downstream robotic tasks such as grasping. In this work, we propose an active visuo-haptic shape completion method called Act-VH that actively computes where to touch the objects based on the reconstruction uncertainty. Act-VH reconstructs objects from point clouds and calculates the reconstruction uncertainty using IGR, a recent state-of-the-art implicit surface deep neural network. We experimentally evaluate the reconstruction accuracy of Act-VH against five baselines in simulation and in the real world. We also propose a new simulation environment for this purpose. The results show that Act-VH outperforms all baselines and that an uncertainty-driven haptic exploration policy leads to higher reconstruction accuracy than a random policy and a policy driven by Gaussian Process Implicit Surfaces. As a final experiment, we evaluate Act-VH and the best reconstruction baseline on grasping 10 novel objects. The results show that Act-VH reaches a significantly higher grasp success rate than the baseline on all objects. Together, this work opens up the door for using active visuo-haptic shape completion in more complex cluttered scenes.
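The uncertainty-driven exploration policy described in this abstract can be illustrated with a minimal sketch. It does not reproduce Act-VH or IGR: the function name `pick_touch_target`, the candidate points and the uncertainty values are all hypothetical, and the only idea shown is "touch where the reconstruction is least certain".

```python
def pick_touch_target(points, uncertainties):
    """Return the candidate surface point with the highest uncertainty.

    points: list of (x, y, z) tuples on the reconstructed surface (hypothetical).
    uncertainties: one non-negative score per point (hypothetical values).
    """
    if not points or len(points) != len(uncertainties):
        raise ValueError("need one uncertainty value per candidate point")
    # Index of the largest uncertainty; ties resolve to the first occurrence.
    best_index = max(range(len(points)), key=lambda i: uncertainties[i])
    return points[best_index]


# Made-up example: the third point lies on the occluded back side of the
# object, so its reconstruction uncertainty is assumed to be the largest.
candidates = [(0.00, 0.00, 0.10), (0.05, 0.00, 0.08), (0.00, 0.05, -0.02)]
uncertainty = [0.01, 0.03, 0.20]
target = pick_touch_target(candidates, uncertainty)
```

A random policy, one of the comparisons mentioned in the abstract, would replace the `max` call with a uniformly random index.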

preprint | 2022 | arXiv

Body Models in Humans and Robots

Neurocognitive models of higher-level somatosensory processing have emphasised the role of stored body representations in interpreting real-time sensory signals coming from the body (Longo, Azanon and Haggard, 2010; Tame, Azanon and Longo, 2019). The need for such stored representations arises from the fact that immediate sensory signals coming from the body do not specify metric details about body size and shape. Several aspects of somatoperception, therefore, require that immediate sensory signals be combined with stored body representations. This basic problem is equally true for humanoid robots and, intriguingly, neurocognitive models developed to explain human perception are strikingly similar to those developed independently for localizing touch on humanoid robots, such as the iCub, equipped with artificial electronic skin on the majority of its body surface (Roncone et al., 2014; Hoffmann, 2021). In this chapter, we will review the key features of these models, discuss their similarities and differences to each other, and to other models in the literature. Using robots as embodied computational models is an example of synthetic methodology or 'understanding by building' (e.g., Hoffmann and Pfeifer, 2018), computational embodied neuroscience (Caligiore et al., 2010) or 'synthetic psychology of the self' (Prescott and Camilleri, 2019). Such models have the advantage that they need to be worked out into every detail, making any theory explicit and complete. There is also an additional way of (pre)validating such a theory other than comparing to the biological or psychological phenomenon studied by simply verifying that a particular implementation really performs the task: can the robot localize where it is being touched (see https://youtu.be/pfse424t5mQ)?

preprint | 2022 | arXiv

Body schema or the body as its own best model

Rodney Brooks (1991) put forth the idea that during an agent's interaction with its environment, representations of the world often stand in the way. Instead, using the world as its own best model, i.e. interacting with it directly without making models, often leads to better and more natural behavior. The same perspective can be applied to representations of the agent's body. I analyze different examples from biology -- octopus and humans in particular -- and compare them with robots and their body models. At one end of the spectrum, the octopus, a highly intelligent animal, largely relies on the mechanical properties of its arms and peripheral nervous system. No central representations or maps of its body were found in its central nervous system. Primate brains do contain areas dedicated to processing body-related information and different body maps were found. Yet, these representations are still largely implicit and distributed and some functionality is also offloaded to the periphery. Robots, on the other hand, rely almost exclusively on their body models when planning and executing behaviors. I analyze the pros and cons of these different approaches and propose what may be the best solution for robots of the future.

preprint | 2020 | arXiv

Active exploration for body model learning through self-touch on a humanoid robot with artificial skin

The mechanisms of infant development are far from understood. Learning about one's own body is likely a foundation for subsequent development. Here we look specifically at the problem of how spontaneous touches to the body in early infancy may give rise to first body models and bootstrap further development such as reaching competence. Unlike visually elicited reaching, reaching to own body requires connections of the tactile and motor space only, bypassing vision. Still, the problems of high dimensionality and redundancy of the motor system persist. In this work, we present an embodied computational model on a simulated humanoid robot with artificial sensitive skin on large areas of its body. The robot should autonomously develop the capacity to reach for every tactile sensor on its body. To do this efficiently, we employ the computational framework of intrinsic motivations and variants of goal babbling, as opposed to motor babbling, that prove to make the exploration process faster and alleviate the ill-posedness of learning inverse kinematics. Based on our results, we discuss the next steps in relation to infant studies: what information will be necessary to further ground this computational model in behavioral data.

preprint | 2020 | arXiv

Robot in the mirror: toward an embodied computational model of mirror self-recognition

Self-recognition or self-awareness is a capacity attributed typically only to humans and few other species. The definitions of these concepts vary and little is known about the mechanisms behind them. However, there is a Turing test-like benchmark: the mirror self-recognition, which consists in covertly putting a mark on the face of the tested subject, placing her in front of a mirror, and observing the reactions. In this work, first, we provide a mechanistic decomposition, or process model, of what components are required to pass this test. Based on these, we provide suggestions for empirical research. In particular, in our view, the way the infants or animals reach for the mark should be studied in detail. Second, we develop a model to enable the humanoid robot Nao to pass the test. The core of our technical contribution is learning the appearance representation and visual novelty detection by means of learning the generative model of the face with deep auto-encoders and exploiting the prediction error. The mark is identified as a salient region on the face and reaching action is triggered, relying on a previously learned mapping to arm joint angles. The architecture is tested on two robots with a completely different face.
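The novelty-detection step in this abstract (exploiting the prediction error of a learned face model to find the mark) can be sketched as follows. This is only an illustration under stated assumptions: the reconstruction is passed in directly rather than produced by a deep auto-encoder, images are small grayscale grids of floats, and the function names and the threshold of 0.5 are hypothetical.

```python
def prediction_error(observed, reconstructed):
    """Per-pixel absolute difference between the image and its reconstruction."""
    return [
        [abs(o - r) for o, r in zip(obs_row, rec_row)]
        for obs_row, rec_row in zip(observed, reconstructed)
    ]


def locate_mark(observed, reconstructed, threshold=0.5):
    """Return the (row, col) centroid of high-error pixels, or None if none exist.

    The threshold is an assumed value, not one taken from the paper.
    """
    errors = prediction_error(observed, reconstructed)
    salient = [
        (r, c)
        for r, row in enumerate(errors)
        for c, err in enumerate(row)
        if err > threshold
    ]
    if not salient:
        return None
    mean_row = sum(r for r, _ in salient) / len(salient)
    mean_col = sum(c for _, c in salient) / len(salient)
    return (mean_row, mean_col)


# The assumed model expects a uniform face; the observed image has one bright pixel.
expected_face = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
marked_face = [[0.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0]]
mark_position = locate_mark(marked_face, expected_face)
```

In the architecture described above, the located region would then trigger the reaching action through the previously learned mapping to arm joint angles; that mapping is outside the scope of this sketch.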

preprint | 2020 | arXiv

Robot self-calibration using multiple kinematic chains -- a simulation study on the iCub humanoid robot

Mechanism calibration is an important and non-trivial task in robotics. Advances in sensor technology make affordable but increasingly accurate devices such as cameras and tactile sensors available, making it possible to perform automated self-contained calibration relying on redundant information in these sensory streams. In this work, we use a simulated iCub humanoid robot with a stereo camera system and end-effector contact emulation to quantitatively compare the performance of kinematic calibration by employing different combinations of intersecting kinematic chains -- either through self-observation or self-touch. The parameters varied were: (i) type and number of intersecting kinematic chains used for calibration, (ii) parameters and chains subject to optimization, (iii) amount of initial perturbation of kinematic parameters, (iv) number of poses/configurations used for optimization, (v) amount of measurement noise in end-effector positions / cameras. The main findings are: (1) calibrating parameters of a single chain (e.g. one arm) by employing multiple kinematic chains ("self-observation" and "self-touch") is superior in terms of optimization results as well as observability; (2) when using multi-chain calibration, fewer poses suffice to get similar performance compared to when for example only observation from a single camera is used; (3) parameters of all chains (here 86 DH parameters) can be subject to calibration simultaneously and with 50 (100) poses, end-effector error of around 2 (1) mm can be achieved; (4) adding noise to a sensory modality degrades performance of all calibrations employing the chains relying on this information.

preprint | 2020 | arXiv

Safe physical HRI: Toward a unified treatment of speed and separation monitoring together with power and force limiting

So-called collaborative robots are a current trend in industrial robotics. However, they still face many problems in practical application such as reduced speed to ascertain their collaborativeness. The standards prescribe two regimes: (i) speed and separation monitoring and (ii) power and force limiting, where the former requires reliable estimation of distances between the robot and human body parts and the latter imposes constraints on the energy absorbed during collisions prior to robot stopping. Following the standards, we deploy the two collaborative regimes in a single application and study the performance in a mock collaborative task under the individual regimes, including transitions between them. Additionally, we compare the performance under "safety zone monitoring" with keypoint pair-wise separation distance assessment relying on an RGB-D sensor and skeleton extraction algorithm to track human body parts in the workspace. Best performance has been achieved in the following setting: robot operates at full speed until a distance threshold between any robot and human body part is crossed; then, reduced robot speed per power and force limiting is triggered. Robot is halted only when the operator's head crosses a predefined distance from selected robot parts. We demonstrate our methodology on a setup combining a KUKA LBR iiwa robot, Intel RealSense RGB-D sensor and OpenPose for human pose estimation.
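The best-performing setting reported in this abstract amounts to a three-way regime switch. A minimal sketch of that decision logic follows; the threshold values, the function name `select_regime` and the enum are hypothetical, and the sketch ignores everything the real system needs (pose estimation, filtering, standard-compliant limits).

```python
from enum import Enum


class Regime(Enum):
    FULL_SPEED = "full speed"
    REDUCED_SPEED = "reduced speed (power and force limiting)"
    STOP = "halt"


def select_regime(min_body_distance_m, head_distance_m,
                  slow_threshold_m=1.0, stop_threshold_m=0.3):
    """Pick the robot regime from human-robot separation distances.

    min_body_distance_m: smallest distance between any human keypoint and
        any robot part.
    head_distance_m: distance between the operator's head and the selected
        robot parts.
    Both thresholds are assumed illustrative values, not taken from the paper.
    """
    if head_distance_m < stop_threshold_m:
        return Regime.STOP            # the head is too close: halt
    if min_body_distance_m < slow_threshold_m:
        return Regime.REDUCED_SPEED   # some body part is close: slow down
    return Regime.FULL_SPEED          # nobody nearby: run at full speed


far_away = select_regime(min_body_distance_m=2.0, head_distance_m=2.5)
hand_close = select_regime(min_body_distance_m=0.4, head_distance_m=1.2)
head_close = select_regime(min_body_distance_m=0.2, head_distance_m=0.25)
```

Checking the head condition first reflects the abstract's statement that the robot is halted only when the operator's head crosses the predefined distance, while other body parts merely trigger the reduced speed.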

preprint | 2020 | arXiv

Should a small robot have a small personal space? Investigating personal spatial zones and proxemic behavior in human-robot interaction

This paper presents the first study in a series of proxemics experiments concerned with the role of personal spatial zones in human-robot interaction. In the study 40 participants approached a NAO robot positioned approximately at participants' eye level and entered different social zones around the robot (personal and intimate space). When the robot perceived the approaching person entering its personal space, it started gazing at the participant, and upon the intrusion of its intimate space it leaned back. Our research questions were: (1) given the small size of the robot (58 cm tall), will people expect its social zones to shrink by its size? (2) Will the robot behaviors be interpreted as appropriate social behaviors? We found that the average approach distance of the participants was 48 cm, which represents the inner limit of the human-size personal zone (45-120 cm), but is outside of the personal zone scaled to robot size (16-42 cm). This suggests that most participants did not (fully) scale down the extent of these zones to the robot size. We also found that the leaning back behavior of the robot was correctly interpreted by most participants as the robot's reaction to the intrusion of its personal space; however, our implementation of the behavior was often perceived as "unfriendly". We will discuss this and other limitations of the study in detail. Additionally we found positive correlations between participants' personality traits, Godspeed Questionnaire subscales, and the average approach distance. The technical contribution of this work is the real-time perception of 25 keypoints on the human body using a single compact RGB-D camera and the use of these points for accurate interpersonal distance estimation and as gazing targets for the robot.
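The zone comparison in this abstract can be checked with simple arithmetic. The sketch below assumes that the zones are scaled linearly by the ratio of robot height to a reference human height. The reference height of 165 cm is an assumption chosen here because it approximately reproduces the 16-42 cm range quoted above; the abstract does not state which value the authors used.

```python
ROBOT_HEIGHT_CM = 58.0                   # from the abstract
HUMAN_PERSONAL_ZONE_CM = (45.0, 120.0)   # from the abstract
MEAN_APPROACH_DISTANCE_CM = 48.0         # from the abstract
REFERENCE_HUMAN_HEIGHT_CM = 165.0        # assumption, not stated in the source


def scale_zone(zone_cm, robot_height_cm, human_height_cm):
    """Scale both zone bounds by the robot-to-human height ratio (whole cm)."""
    factor = robot_height_cm / human_height_cm
    return tuple(round(bound * factor) for bound in zone_cm)


def in_zone(distance_cm, zone_cm):
    """True if the distance lies within the zone, bounds included."""
    inner, outer = zone_cm
    return inner <= distance_cm <= outer


robot_zone_cm = scale_zone(
    HUMAN_PERSONAL_ZONE_CM, ROBOT_HEIGHT_CM, REFERENCE_HUMAN_HEIGHT_CM
)
inside_human_zone = in_zone(MEAN_APPROACH_DISTANCE_CM, HUMAN_PERSONAL_ZONE_CM)
inside_robot_zone = in_zone(MEAN_APPROACH_DISTANCE_CM, robot_zone_cm)
```

Under these assumptions the 48 cm average approach distance falls just inside the human-size personal zone and outside the scaled one, which matches the abstract's conclusion that participants did not fully scale the zones down to the robot's size.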

preprint | 2018 | arXiv

Symbol Emergence in Cognitive Developmental Systems: a Survey

Humans use signs, e.g., sentences in a spoken language, for communication and thought. Hence, symbol systems like language are crucial for our communication with other agents and adaptation to our real-world environment. The symbol systems we use in our human society adaptively and dynamically change over time. In the context of artificial intelligence (AI) and cognitive systems, the symbol grounding problem has been regarded as one of the central problems related to symbols. However, the symbol grounding problem was originally posed to connect symbolic AI and sensorimotor information and did not consider many interdisciplinary phenomena in human communication and dynamic symbol systems in our society, which semiotics considered. In this paper, we focus on the symbol emergence problem, addressing not only cognitive dynamics but also the dynamics of symbol systems in society, rather than the symbol grounding problem. We first introduce the notion of a symbol in semiotics from the humanities, to leave the very narrow idea of symbols in symbolic AI. Furthermore, over the years, it became more and more clear that symbol emergence has to be regarded as a multifaceted problem. Therefore, secondly, we review the history of the symbol emergence problem in different fields, including both biological and artificial systems, showing their mutual relations. We summarize the discussion and provide an integrative viewpoint and comprehensive overview of symbol emergence in cognitive systems. Additionally, we describe the challenges facing the creation of cognitive systems that can be part of symbol emergence systems.

preprint | 2017 | arXiv

DAC-h3: A Proactive Robot Cognitive Architecture to Acquire and Express Knowledge About the World and the Self

This paper introduces a cognitive architecture for a humanoid robot to engage in a proactive, mixed-initiative exploration and manipulation of its environment, where the initiative can originate from both the human and the robot. The framework, based on a biologically-grounded theory of the brain and mind, integrates a reactive interaction engine, a number of state-of-the-art perceptual and motor learning algorithms, as well as planning abilities and an autobiographical memory. The architecture as a whole drives the robot behavior to solve the symbol grounding problem, acquire language capabilities, execute goal-oriented behavior, and express a verbal narrative of its own experience in the world. We validate our approach in human-robot interaction experiments with the iCub humanoid robot, showing that the proposed cognitive architecture can be applied in real time within a realistic scenario and that it can be used with naive users.

preprint | 2016 | arXiv

The encoding of proprioceptive inputs in the brain: knowns and unknowns from a robotic perspective

Somatosensory inputs can be grossly divided into tactile (or cutaneous) and proprioceptive -- the former conveying information about skin stimulation, the latter about limb position and movement. The principal proprioceptors are constituted by muscle spindles, which deliver information about muscle length and speed. In primates, this information is relayed to the primary somatosensory cortex and eventually the posterior parietal cortex, where integrated information about body posture (postural schema) is presumably available. However, coming from robotics and seeking a biologically motivated model that could be used in a humanoid robot, we faced a number of difficulties. First, it is not clear what neurons in the ascending pathway and primary somatosensory cortex code. To an engineer, joint angles would seem the most useful variables. However, the lengths of individual muscles have nonlinear relationships with the angles at joints. Kim et al. (Neuron, 2015) found different types of proprioceptive neurons in the primary somatosensory cortex -- sensitive to movement of single or multiple joints or to static postures. Second, there are indications that the somatotopic arrangement ("the homunculus") of these brain areas is to a significant extent learned. However, the mechanisms behind this developmental process are unclear. We will report first results from modeling of this process using data obtained from body babbling in the iCub humanoid robot and feeding them into a Self-Organizing Map (SOM). Our results reveal that the SOM algorithm is only suited to develop receptive fields of the posture-selective type. Furthermore, the SOM algorithm has intrinsic difficulties when combined with population code on its input and in particular with nonlinear tuning curves (sigmoids or Gaussians).

preprint | 2014 | arXiv

Trade-Offs in Exploiting Body Morphology for Control: from Simple Bodies and Model-Based Control to Complex Bodies with Model-Free Distributed Control Schemes

Tailoring the design of robot bodies for control purposes is implicitly performed by engineers; however, a methodology or set of tools is largely absent, and optimization of morphology (shape, material properties of robot bodies, etc.) is lagging behind the development of controllers. This has become even more prominent with the advent of compliant, deformable or "soft" bodies. These carry substantial potential regarding their exploitation for control -- sometimes referred to as "morphological computation" in the sense of offloading computation needed for control to the body. Here, we will argue in favor of a dynamical systems rather than computational perspective on the problem. Then, we will look at the pros and cons of simple vs. complex bodies, critically reviewing the attractive notion of "soft" bodies automatically taking over control tasks. We will address another key dimension of the design space -- whether model-based control should be used and to what extent it is feasible to develop faithful models for different morphologies.

preprint | 2012 | arXiv

The implications of embodiment for behavior and cognition: animal and robotic case studies

In this paper, we will argue that if we want to understand the function of the brain (or the control in the case of robots), we must understand how the brain is embedded into the physical system, and how the organism interacts with the real world. While embodiment has often been used in its trivial meaning, i.e. 'intelligence requires a body', the concept has deeper and more important implications, concerned with the relation between physical and information (neural, control) processes. A number of case studies are presented to illustrate the concept. These involve animals and robots and are concentrated around locomotion, grasping, and visual perception. A theoretical scheme that can be used to embed the diverse case studies will be presented. Finally, we will establish a link between the low-level sensory-motor processes and cognition. We will present an embodied view on categorization, and propose the concepts of 'body schema' and 'forward models' as a natural extension of the embodied approach toward first representations.

preprint | 2011 | arXiv

SNF Project Locomotion: Final report 2009-2010

Summary of results in the last project period (1. 10. 2009 - 30. 9. 2010) of the SNF project "From locomotion to cognition". The research that we have been involved in, and will continue to do, starts from the insight that in order to understand and design intelligent behavior, we must adopt an embodied perspective, i.e. we must take the entire agent, including its shape or morphology, the materials out of which it is built, and its interaction with the environment into account, in addition to the neural control. A lot of our research in the past has been on relatively low-level sensory-motor tasks such as locomotion (e.g. walking, running, jumping), navigation, and grasping. While this research is of interest in itself, in the context of artificial intelligence and cognitive science, this leads to the question of what these kinds of tasks have to do with higher levels of cognition, or to put it more provocatively, "What does walking have to do with thinking?" This question is of course reminiscent of the notorious "symbol grounding problem". In contrast to most of the research on symbol grounding, we propose to exploit the dynamic interaction between the embodied agent and the environment as the basis for grounding. We use the term "morphological computation" to designate the fact that some of the control or computation can be taken over by the dynamic interaction derived from morphological properties (e.g. the passive forward swing of the leg in walking, the spring-like properties of the muscles, and the weight distribution). By taking morphological computation into account, an agent will be able to achieve not only faster, more robust, and more energy-efficient behavior, but also more situated exploration by the agent for the comprehensive understanding of the environment.

preprint | 2011 | arXiv

SNF Project Locomotion: Progress report 2008-2009

Summary of results (project period 1. 10. 2008 - 30. 9. 2009) of the SNF project "From locomotion to cognition". The research that we have been involved in, and will continue to do, starts from the insight that in order to understand and design intelligent behavior, we must adopt an embodied perspective, i.e. we must take the entire agent, including its shape or morphology, the materials out of which it is built, and its interaction with the environment into account, in addition to the neural control. A lot of our research in the past has been on relatively low-level sensory-motor tasks such as locomotion (e.g. walking, running, jumping), navigation, and grasping. While this research is of interest in itself, in the context of artificial intelligence and cognitive science, this leads to the question of what these kinds of tasks have to do with higher levels of cognition, or to put it more provocatively, "What does walking have to do with thinking?" This question is of course reminiscent of the notorious "symbol grounding problem". In contrast to most of the research on symbol grounding, we propose to exploit the dynamic interaction between the embodied agent and the environment as the basis for grounding. We use the term "morphological computation" to designate the fact that some of the control or computation can be taken over by the dynamic interaction derived from morphological properties (e.g. the passive forward swing of the leg in walking, the spring-like properties of the muscles, and the weight distribution). By taking morphological computation into account, an agent will be able to achieve not only faster, more robust, and more energy-efficient behavior, but also more situated exploration by the agent for the comprehensive understanding of the environment.
