Researcher profile

H. Jin Kim

H. Jin Kim contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
12works
0followers
7topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

12 published item(s)

preprint2026arXiv

Quality-Aware Exploration Budget Allocation for Cooperative Multi-Agent Reinforcement Learning

Cooperative multi-agent reinforcement learning (MARL) requires agents to discover joint strategies in a combinatorially large state-action space, yet effective coordination configurations are exceedingly rare. Intrinsic motivation, which augments task rewards with novelty bonuses, is a popular approach for driving exploration, but its effectiveness hinges on the exploration intensity $β$, where too large a value overwhelms the task signal and causes coordination collapse, while too small a value prevents discovery of rare strategies. We address two complementary challenges: adapting $β$ globally over training, and allocating the exploration budget across agents whose intrinsic reward signals vary in reliability. Our framework combines a return-conditioned sigmoid schedule (RCB) for global intensity control with a per-agent Reward Signal Quality (RSQ) metric that concentrates the exploration budget on agents with reliable signals. The core insight is that agents receiving noisy intrinsic rewards should explore less aggressively, and this allocation can be determined automatically from signal-to-noise statistics. Successor Distance (SD), a quasimetric intrinsic reward, naturally produces distinguishable per-agent signal quality, completing the framework with convergence and ordering preservation guarantees. On seven cooperative benchmarks (MPE, SMAX, MABrax), our method achieves top-tier returns across all environments.

preprint2026arXiv

SceneNAT: Masked Generative Modeling for Language-Guided Indoor Scene Synthesis

We present SceneNAT, a single-stage masked non-autoregressive Transformer that synthesizes complete 3D indoor scenes from natural language instructions through only a few parallel decoding passes, offering improved performance and efficiency compared to prior state-of-the-art approaches. SceneNAT is trained via masked modeling over fully discretized representations of both semantic and spatial attributes. By applying a masking strategy at both the attribute level and the instance level, the model can better capture intra-object and inter-object structure. To boost relational reasoning, SceneNAT employs a dedicated triplet predictor for modeling the scene's layout and object relationships by mapping a set of learnable relation queries to a sparse set of symbolic triplets (subject, predicate, object). Extensive experiments on the 3D-FRONT dataset demonstrate that SceneNAT achieves superior performance compared to state-of-the-art autoregressive and diffusion baselines in both semantic compliance and spatial arrangement accuracy, while operating with substantially lower computational cost.

preprint2023arXiv

Stable Contact Guaranteeing Motion/Force Control for an Aerial Manipulator on an Arbitrarily Tilted Surface

This study aims to design a motion/force controller for an aerial manipulator which guarantees the tracking of time-varying motion/force trajectories as well as the stability during the transition between free and contact motions. To this end, we model the force exerted on the end-effector as the Kelvin-Voigt linear model and estimate its parameters by recursive least-squares estimator. Then, the gains of the disturbance-observer (DOB)-based motion/force controller are calculated based on the stability conditions considering both the model uncertainties in the dynamic equation and switching between the free and contact motions. To validate the proposed controller, we conducted the time-varying motion/force tracking experiments with different approach speeds and orientations of the surface. The results show that our controller enables the aerial manipulator to track the time-varying motion/force trajectories.

preprint2023arXiv

SwinDepth: Unsupervised Depth Estimation using Monocular Sequences via Swin Transformer and Densely Cascaded Network

Monocular depth estimation plays a critical role in various computer vision and robotics applications such as localization, mapping, and 3D object detection. Recently, learning-based algorithms achieve huge success in depth estimation by training models with a large amount of data in a supervised manner. However, it is challenging to acquire dense ground truth depth labels for supervised training, and the unsupervised depth estimation using monocular sequences emerges as a promising alternative. Unfortunately, most studies on unsupervised depth estimation explore loss functions or occlusion masks, and there is little change in model architecture in that ConvNet-based encoder-decoder structure becomes a de-facto standard for depth estimation. In this paper, we employ a convolution-free Swin Transformer as an image feature extractor so that the network can capture both local geometric features and global semantic features for depth estimation. Also, we propose a Densely Cascaded Multi-scale Network (DCMNet) that connects every feature map directly with another from different scales via a top-down cascade pathway. This densely cascaded connectivity reinforces the interconnection between decoding layers and produces high-quality multi-scale depth outputs. The experiments on two different datasets, KITTI and Make3D, demonstrate that our proposed method outperforms existing state-of-the-art unsupervised algorithms.

preprint2022arXiv

Automating Reinforcement Learning with Example-based Resets

Deep reinforcement learning has enabled robots to learn motor skills from environmental interactions with minimal to no prior knowledge. However, existing reinforcement learning algorithms assume an episodic setting, in which the agent resets to a fixed initial state distribution at the end of each episode, to successfully train the agents from repeated trials. Such reset mechanism, while trivial for simulated tasks, can be challenging to provide for real-world robotics tasks. Resets in robotic systems often require extensive human supervision and task-specific workarounds, which contradicts the goal of autonomous robot learning. In this paper, we propose an extension to conventional reinforcement learning towards greater autonomy by introducing an additional agent that learns to reset in a self-supervised manner. The reset agent preemptively triggers a reset to prevent manual resets and implicitly imposes a curriculum for the forward agent. We apply our method to learn from scratch on a suite of simulated and real-world continuous control tasks and demonstrate that the reset agent successfully learns to reduce manual resets whilst also allowing the forward policy to improve gradually over time.

preprint2022arXiv

Online Distributed Trajectory Planning for Quadrotor Swarm with Feasibility Guarantee using Linear Safe Corridor

This paper presents a new online multi-agent trajectory planning algorithm that guarantees to generate safe, dynamically feasible trajectories in a cluttered environment. The proposed algorithm utilizes a linear safe corridor (LSC) to formulate the distributed trajectory optimization problem with only feasible constraints, so it does not resort to slack variables or soft constraints to avoid optimization failure. We adopt a priority-based goal planning method to prevent the deadlock without an additional procedure to decide which robot to yield. The proposed algorithm can compute the trajectories for 60 agents on average 15.5 ms per agent with an Intel i7 laptop and shows a similar flight distance and distance compared to the baselines based on soft constraints. We verified that the proposed method can reach the goal without deadlock in both the random forest and the indoor space, and we validated the safety and operability of the proposed algorithm through a real flight test with ten quadrotors in a maze-like environment.

preprint2020arXiv

Aerial Manipulation using Model Predictive Control for Opening a Hinged Door

Existing studies for environment interaction with an aerial robot have been focused on interaction with static surroundings. However, to fully explore the concept of an aerial manipulation, interaction with moving structures should also be considered. In this paper, a multirotor-based aerial manipulator opening a daily-life moving structure, a hinged door, is presented. In order to address the constrained motion of the structure and to avoid collisions during operation, model predictive control (MPC) is applied to the derived coupled system dynamics between the aerial manipulator and the door involving state constraints. By implementing a constrained version of differential dynamic programming (DDP), MPC can generate position setpoints to the disturbance observer (DOB)-based robust controller in real-time, which is validated by our experimental results.

preprint2020arXiv

Detection-Aware Trajectory Generation for a Drone Cinematographer

This work investigates an efficient trajectory generation for chasing a dynamic target, which incorporates the detectability objective. The proposed method actively guides the motion of a cinematographer drone so that the color of a target is well-distinguished against the colors of the background in the view of the drone. For the objective, we define a measure of color detectability given a chasing path. After computing a discrete path optimized for the metric, we generate a dynamically feasible trajectory. The whole pipeline can be updated on-the-fly to respond to the motion of the target. For the efficient discrete path generation, we construct a directed acyclic graph (DAG) for which a topological sorting can be determined analytically without the depth-first search. The smooth path is obtained in quadratic programming (QP) framework. We validate the enhanced performance of state-of-the-art object detection and tracking algorithms when the camera drone executes the trajectory obtained from the proposed method.

preprint2020arXiv

Efficient Multi-Agent Trajectory Planning with Feasibility Guarantee using Relative Bernstein Polynomial

This paper presents a new efficient algorithm which guarantees a solution for a class of multi-agent trajectory planning problems in obstacle-dense environments. Our algorithm combines the advantages of both grid-based and optimization-based approaches, and generates safe, dynamically feasible trajectories without suffering from an erroneous optimization setup such as imposing infeasible collision constraints. We adopt a sequential optimization method with \textit{dummy agents} to improve the scalability of the algorithm, and utilize the convex hull property of Bernstein and relative Bernstein polynomial to replace non-convex collision avoidance constraints to convex ones. The proposed method can compute the trajectory for 64 agents on average 6.36 seconds with Intel Core i7-7700 @ 3.60GHz CPU and 16G RAM, and it reduces more than $50\%$ of the objective cost compared to our previous work. We validate the proposed algorithm through simulation and flight tests.

preprint2020arXiv

Fail-safe Flight of a Fully-Actuated Quadcopter in a Single Motor Failure

In this paper, we introduce a new quadcopter fail-safe flight solution that can perform the same four controllable degrees-of-freedom flight as a regular multirotor even when a single thruster fails. The new solution employs a novel multirotor platform known as the T3-Multirotor and utilizes a distinctive strategy of actively controlling the center of gravity position to restore the nominal flight performance. A dedicated control structure is introduced, along with a detailed analysis of the dynamic characteristics of the platform that change during emergency flights. Experimental results are provided to validate the feasibility of the proposed fail-safe flight strategy.

preprint2020arXiv

Moving object detection for visual odometry in a dynamic environment based on occlusion accumulation

Detection of moving objects is an essential capability in dealing with dynamic environments. Most moving object detection algorithms have been designed for color images without depth. For robotic navigation where real-time RGB-D data is often readily available, utilization of the depth information would be beneficial for obstacle recognition. Here, we propose a simple moving object detection algorithm that uses RGB-D images. The proposed algorithm does not require estimating a background model. Instead, it uses an occlusion model which enables us to estimate the camera pose on a background confused with moving objects that dominate the scene. The proposed algorithm allows to separate the moving object detection and visual odometry (VO) so that an arbitrary robust VO method can be employed in a dynamic situation with a combination of moving object detection, whereas other VO algorithms for a dynamic environment are inseparable. In this paper, we use dense visual odometry (DVO) as a VO method with a bi-square regression weight. Experimental results show the segmentation accuracy and the performance improvement of DVO in the situations. We validate our algorithm in public datasets and our dataset which also publicly accessible.

preprint2020arXiv

Pose Correction Algorithm for Relative Frames between Keyframes in SLAM

With the dominance of keyframe-based SLAM in the field of robotics, the relative frame poses between keyframes have typically been sacrificed for a faster algorithm to achieve online applications. However, those approaches can become insufficient for applications that may require refined poses of all frames, not just keyframes which are relatively sparse compared to all input frames. This paper proposes a novel algorithm to correct the relative frames between keyframes after the keyframes have been updated by a back-end optimization process. The correction model is derived using conservation of the measurement constraint between landmarks and the robot pose. The proposed algorithm is designed to be easily integrable to existing keyframe-based SLAM systems while exhibiting robust and accurate performance superior to existing interpolation methods. The algorithm also requires low computational resources and hence has a minimal burden on the whole SLAM pipeline. We provide the evaluation of the proposed pose correction algorithm in comparison to existing interpolation methods in various vector spaces, and our method has demonstrated excellent accuracy in both KITTI and EuRoC datasets.