Source author record

Dongheui Lee

Dongheui Lee appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Robotics, Computer Vision, Systems and Control, Artificial Intelligence, eess.SY, Computation and Language, Machine Learning, math.OC

Catalog footprint

What is connected

11 works

8 topics

4 close collaborators

preprint, 2026, arXiv

ATLAS: An Annotation Tool for Long-horizon Robotic Action Segmentation

Annotating long-horizon robotic demonstrations with precise temporal action boundaries is crucial for training and evaluating action segmentation and manipulation policy learning methods. Existing annotation tools, however, are often limited: they are designed primarily for vision-only data, do not natively support synchronized visualization of robot-specific time-series signals (e.g., gripper state or force/torque), or require substantial effort to adapt to different dataset formats. In this paper, we introduce ATLAS, an annotation tool tailored for long-horizon robotic action segmentation. ATLAS provides time-synchronized visualization of multi-modal robotic data, including multi-view video and proprioceptive signals, and supports annotation of action boundaries, action labels, and task outcomes. The tool natively handles widely used robotics dataset formats such as ROS bags and the Reinforcement Learning Dataset (RLDS) format, and provides direct support for specific datasets such as REASSEMBLE. ATLAS can be easily extended to new formats via a modular dataset abstraction layer. Its keyboard-centric interface minimizes annotation effort and improves efficiency. In experiments on a contact-rich assembly task, ATLAS reduced the average per-action annotation time by at least 6% compared to ELAN, while the inclusion of time-series data improved temporal alignment with expert annotations by more than 2.8% and decreased boundary error fivefold compared to vision-only annotation tools.

preprint, 2022, arXiv

Visually Grounding Language Instruction for History-Dependent Manipulation

This paper emphasizes the importance of a robot's ability to refer to its task history, especially when it executes a series of pick-and-place manipulations by following language instructions given one by one. The advantage of referring to the manipulation history is twofold: (1) language instructions that omit details but use expressions referring to the past can be interpreted, and (2) the visual information of objects occluded by previous manipulations can be inferred. For this, we introduce a history-dependent manipulation task whose objective is to visually ground a series of language instructions for proper pick-and-place manipulations by referring to the past. We also suggest a relevant dataset and a model that can serve as a baseline, and show that our model trained with the proposed dataset can also be applied to the real world using CycleGAN. Our dataset and code are publicly available on the project website: https://sites.google.com/view/history-dependent-manipulation.

preprint, 2020, arXiv

A Transfer Learning Approach to Cross-Modal Object Recognition: From Visual Observation to Robotic Haptic Exploration

In this work, we introduce the problem of cross-modal visuo-tactile object recognition with robotic active exploration. By this we mean that the robot observes a set of objects with visual perception and is later able to recognize those objects using tactile exploration alone, without having touched any of them before. In machine learning terminology, our application has a visual training set and a tactile test set, or vice versa. To tackle this problem, we propose an approach consisting of four steps: finding a visuo-tactile common representation, defining a suitable set of features, transferring the features across the domains, and classifying the objects. We show the results of our approach using a set of 15 objects, collecting 40 visual examples and five tactile examples for each object. The proposed approach achieves an accuracy of 94.7%, which is comparable with the accuracy of the monomodal case, i.e., when using visual data as both the training set and the test set. Moreover, it performs well compared to human ability, which we have roughly estimated by carrying out an experiment with ten participants.

preprint, 2020, arXiv

Efficient State Abstraction using Object-centered Predicates for Manipulation Planning

The definition of symbolic descriptions that consistently represent relevant geometrical aspects in manipulation tasks is a challenging problem that has received little attention in the robotic community. This definition is usually done from an observer perspective of a finite set of object relations and orientations that only satisfy geometrical constraints to execute experiments in laboratory conditions. This restricts the possible changes with manipulation actions in the object configuration space to those compatible with those particular external reference definitions, which greatly limits the spectrum of possible manipulations. To tackle these limitations, we propose an object-centered representation that permits characterizing a much wider set of possible changes in configuration spaces than the traditional observer-perspective counterpart. Based on this representation, we define universal planning operators for picking and placing actions that permit generating plans with geometric and force consistency in manipulation tasks. This object-centered description is directly obtained from the poses and bounding boxes of objects using a novel learning mechanism that permits generating signal-symbol relations without the need to handcraft these relations for each particular scenario.

preprint, 2020, arXiv

Incremental Skill Learning of Stable Dynamical Systems

Efficient skill acquisition, representation, and on-line adaptation to different scenarios have become of fundamental importance for assistive robotic applications. In the past decade, dynamical systems (DS) have arisen as a flexible and robust tool to represent learned skills and to generate motion trajectories. This work presents a novel approach to incrementally modify the dynamics of a generic autonomous DS when new demonstrations of a task are provided. A control input is learned from demonstrations to modify the trajectory of the system while preserving the stability properties of the reshaped DS. Learning is performed incrementally through Gaussian process regression, increasing the robot's knowledge of the skill every time a new demonstration is provided. The effectiveness of the proposed approach is demonstrated with experiments on a publicly available dataset of complex motions.

preprint, 2020, arXiv

Learning Barrier Functions for Constrained Motion Planning with Dynamical Systems

Stable dynamical systems are a flexible tool to plan robotic motions in real-time. In the robotic literature, dynamical system motions are typically planned without considering possible limitations in the robot's workspace. This work presents a novel approach to learn workspace constraints from human demonstrations and to generate motion trajectories for the robot that lie in the constrained workspace. Training data are incrementally clustered into different linear subspaces and used to fit a low-dimensional representation of each subspace. By considering the learned constraint subspaces as zeroing barrier functions, we are able to design a control input that keeps the system trajectory within the learned bounds. This control input is effectively combined with the original system dynamics, preserving any asymptotic properties of the unconstrained system. Simulations and experiments on a real robot show the effectiveness of the proposed approach.
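
To illustrate the zeroing barrier function idea mentioned in this abstract, here is a minimal one-dimensional sketch. It is not the paper's method: the constraint is a fixed upper bound rather than a learned subspace, and the names `nominal_velocity`, `safe_velocity`, `simulate`, the bound `x_max`, the gain `alpha`, and the target value are all hypothetical choices made for this example. The barrier is h(x) = x_max - x, and the control input u is the smallest correction that keeps dh/dt >= -alpha * h.

```python
def nominal_velocity(x, target=2.0):
    """Stable 1-D dynamical system that converges to `target` (assumed value)."""
    return -(x - target)


def safe_velocity(x, x_max=1.0, alpha=5.0):
    """Nominal velocity plus a barrier-based correction (illustrative only)."""
    h = x_max - x              # barrier value: h >= 0 inside the allowed region
    f = nominal_velocity(x)
    # Zeroing barrier condition: dh/dt >= -alpha * h.
    # Here dh/dt = -(f + u), so the condition becomes f + u <= alpha * h.
    # Smallest-magnitude u that satisfies it:
    u = min(0.0, alpha * h - f)
    return f + u


def simulate(x0=0.0, dt=0.01, steps=2000):
    """Forward-Euler rollout of the constrained system."""
    x = x0
    trajectory = [x]
    for _ in range(steps):
        x += dt * safe_velocity(x)
        trajectory.append(x)
    return trajectory


if __name__ == "__main__":
    traj = simulate()
    print(f"max position reached: {max(traj):.6f}")
    print(f"final position:       {traj[-1]:.6f}")
```

In this sketch the unconstrained system would settle at the target 2.0, while the corrected system approaches the bound `x_max` and stays on the allowed side. The correction is zero whenever the nominal motion already satisfies the barrier condition, so the original dynamics are left untouched away from the bound.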

preprint, 2020, arXiv

Measuring Generalisation to Unseen Viewpoints, Articulations, Shapes and Objects for 3D Hand Pose Estimation under Hand-Object Interaction

We study how well different types of approaches generalise in the task of 3D hand pose estimation under single hand scenarios and hand-object interaction. We show that the accuracy of state-of-the-art methods can drop, and that they fail mostly on poses absent from the training set. Unfortunately, since the space of hand poses is high-dimensional, it is inherently not feasible to cover the whole space densely, despite recent efforts in collecting large-scale training datasets. This sampling problem is even more severe when hands are interacting with objects and/or inputs are RGB rather than depth images, as RGB images also vary with lighting conditions and colors. To address these issues, we designed a public challenge (HANDS'19) to evaluate the abilities of current 3D hand pose estimators (HPEs) to interpolate and extrapolate the poses of a training set. More precisely, HANDS'19 is designed (a) to evaluate the influence of both depth and color modalities on 3D hand pose estimation, under the presence or absence of objects; (b) to assess the generalisation abilities w.r.t. four main axes: shapes, articulations, viewpoints, and objects; (c) to explore the use of a synthetic hand model to fill the gaps of current datasets. Through the challenge, the overall accuracy has dramatically improved over the baseline, especially on extrapolation tasks, from 27mm to 13mm mean joint error. Our analyses highlight the impacts of data pre-processing, ensemble approaches, the use of a parametric 3D hand model (MANO), and different HPE methods/backbones.

preprint, 2020, arXiv

Merging Position and Orientation Motion Primitives

In this paper, we focus on generating complex robotic trajectories by merging sequential motion primitives. A robotic trajectory is a time series of positions and orientations ending at a desired target. Hence, we first discuss the generation of converging pose trajectories via dynamical systems, providing a rigorous stability analysis. Then, we present approaches to merge motion primitives that represent both the position and the orientation part of the motion. The developed approaches preserve the shape of each learned movement and allow for continuous transitions between successive motion primitives. The presented methodologies are theoretically described and experimentally evaluated, showing that it is possible to generate a smooth pose trajectory out of multiple motion primitives.

preprint, 2020, arXiv

Particle Filter Based Monocular Human Tracking with a 3D Cardbox Model and a Novel Deterministic Resampling Strategy

The challenge of markerless human motion tracking is the high dimensionality of the search space. Thus, efficient exploration of the search space is of great significance. In this paper, a motion capturing algorithm is proposed for upper body motion tracking. The proposed system tracks human motion based on monocular silhouette-matching, and it is built on top of a hierarchical particle filter, within which a novel deterministic resampling strategy (DRS) is applied. The proposed system is evaluated quantitatively with the ground truth data measured by an inertial sensor system. In addition, we compare the DRS with the stratified resampling strategy (SRS). It is shown in experiments that DRS outperforms SRS with the same number of particles. Moreover, a new 3D articulated human upper body model, called the 3D cardbox model, is created and is shown to work successfully for motion tracking. Experiments show that the proposed system can robustly track upper body motion without self-occlusion. Motions towards the camera can also be well tracked.
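
The abstract does not describe how the deterministic resampling strategy works, so no attempt is made to reproduce it here. As background for the comparison it mentions, the following is a minimal sketch of standard stratified resampling, the baseline named in the abstract. The function name `stratified_resample` and its arguments are hypothetical, and only the Python standard library is assumed.

```python
import random
from itertools import accumulate


def stratified_resample(weights, rng=None):
    """Return particle indices drawn by stratified resampling.

    The unit interval is split into n equal strata and one uniform sample
    is drawn inside each stratum, then mapped through the cumulative
    distribution of the normalized weights.
    """
    rng = rng or random.Random()
    n = len(weights)
    total = sum(weights)
    cumulative = [c / total for c in accumulate(weights)]
    cumulative[-1] = 1.0  # guard against floating-point rounding

    indices = []
    j = 0
    for i in range(n):
        u = (i + rng.random()) / n  # one sample in the i-th stratum
        # Advance to the first particle whose cumulative weight exceeds u.
        while j < n - 1 and cumulative[j] <= u:
            j += 1
        indices.append(j)
    return indices


if __name__ == "__main__":
    weights = [0.1, 0.2, 0.4, 0.3]
    print(stratified_resample(weights, random.Random(0)))
```

Because each stratum contributes exactly one sample, a particle with normalized weight w is selected roughly n * w times, with less variance than drawing n independent samples.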

preprint, 2020, arXiv

Prioritized Inverse Kinematics: Nonsmoothness, Trajectory Existence, Task Convergence, Stability

In this paper, we study various theoretical properties of a class of prioritized inverse kinematics (PIK) solutions that can be considered as a class of (output regulation or tracking) control laws of a dynamical system with prioritized multiple outputs. We first develop tools to investigate nonsmoothness of PIK solutions and find a sufficient condition for nonsmoothness. It implies that existence and uniqueness of a joint trajectory satisfying a PIK solution cannot be guaranteed by the classical theorems. So, we construct an alternative existence and uniqueness theorem that uses structural information of PIK solutions. Then, we narrow the class of PIK solutions down to the case in which all tasks are designed to follow some desired task trajectories and discover a few properties related to task convergence. The study goes further to analyze stability of equilibrium points of the differential equation whose right-hand side is a PIK solution when all tasks are designed to reach some desired task positions. Finally, we furnish an example with a two-link manipulator that shows how our findings can be used to analyze the behavior of a joint trajectory generated from a PIK solution.

preprint, 2015, arXiv

A Preliminary Study on the Learning Informativeness of Data Subsets

Estimating the internal state of a robotic system is complex: this is performed from multiple heterogeneous sensor inputs and knowledge sources. Discretization of such inputs is done to capture saliences, represented as symbolic information, which often presents structure and recurrence. As these sequences are used to reason over complex scenarios, a more compact representation would aid exactness of technical cognitive reasoning capabilities, which are today constrained by computational complexity issues and fall back on representational heuristics or human intervention. Such problems need to be addressed to ensure timely and meaningful human-robot interaction. Our work aims to understand the variability of learning informativeness when training on subsets of a given input dataset. This is in view of reducing the training size while retaining the majority of the symbolic learning potential. We prove the concept on human-written texts, and conjecture this work will reduce training data size of sequential instructions, while preserving semantic relations, when gathering information from large remote sources.
