Source author record

Hikaru Sasaki

Hikaru Sasaki appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Robotics cond-mat.mtrl-sci

Catalog footprint

What is connected

3 works

2 topics

4 close collaborators

Preprint, 2022, arXiv

Disturbance-Injected Robust Imitation Learning with Task Achievement

Robust imitation learning using disturbance injections overcomes issues of limited variation in demonstrations. However, these methods assume that demonstrations are optimal and that policy stabilization can be learned via simple augmentations. In real-world scenarios, demonstrations are often of diverse quality, and disturbance injection instead learns sub-optimal policies that fail to replicate desired behavior. To address this issue, this paper proposes a novel imitation learning framework that combines both policy robustification and optimal demonstration learning. Specifically, this combinatorial approach forces policy learning and disturbance injection optimization to focus mainly on learning from high task achievement demonstrations, while utilizing low achievement ones to decrease the number of samples needed. The effectiveness of the proposed method is verified through experiments using an excavation task in both simulations and a real robot, resulting in high-achieving policies that are more stable and robust to diverse-quality demonstrations. In addition, this method utilizes all of the weighted sub-optimal demonstrations without eliminating them, resulting in practical data efficiency benefits.

Preprint, 2022, arXiv

Gaussian Process Self-triggered Policy Search in Weakly Observable Environments

The environments of large industrial machines, such as waste cranes in waste incineration plants, are often weakly observable: the observations contain little information about the environmental state because of technical difficulty or maintenance cost (e.g., no sensors for observing the state of the garbage to be handled). Based on the finding that skilled operators in such environments choose predetermined control strategies (e.g., grasping and scattering) and their durations based on sensor values, thereby improving the robustness of their actions, we propose a novel non-parametric policy search algorithm: Gaussian process self-triggered policy search (GPSTPS). GPSTPS has two types of control policies: action and duration. A gating mechanism either maintains the action selected by the action policy for the duration specified by the duration policy or updates the action and duration by passing new observations to the policies; therefore, it is categorized as self-triggered. GPSTPS simultaneously learns both policies by trial and error based on sparse GP priors and variational learning to maximize the return. To verify the performance of our proposed method, we conducted experiments on a garbage-grasping-scattering task for a waste crane with weak observations using a simulation and a robotic waste crane system. In the experiments, the proposed method acquired suitable policies to determine the action and duration based on the garbage's characteristics.
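The gating behavior described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function run_self_triggered, the scalar observations, and the two hand-written demo policies are hypothetical stand-ins for the learned Gaussian process policies, and the learning step (sparse GP priors, variational learning) is omitted entirely.

```python
from typing import Callable, List, Optional, Tuple

Observation = float  # assumption: a single scalar sensor value
Action = str         # assumption: a named predetermined strategy


def run_self_triggered(
    observations: List[Observation],
    action_policy: Callable[[Observation], Action],
    duration_policy: Callable[[Observation], int],
) -> List[Tuple[int, Optional[Action], bool]]:
    """Return (step, action, triggered) for every time step.

    The gate holds the current action until its duration expires, and only
    then passes a new observation to both policies.
    """
    log: List[Tuple[int, Optional[Action], bool]] = []
    remaining = 0  # steps left before the gate opens again
    action: Optional[Action] = None
    for step, obs in enumerate(observations):
        triggered = remaining <= 0
        if triggered:
            # Gate open: query both policies with the new observation.
            action = action_policy(obs)
            remaining = max(1, int(duration_policy(obs)))
        # Gate closed: the observation is ignored and the action is kept.
        log.append((step, action, triggered))
        remaining -= 1
    return log


# Hypothetical hand-written policies, used only to exercise the loop.
def demo_action_policy(obs: Observation) -> Action:
    return "grasp" if obs < 0.5 else "scatter"


def demo_duration_policy(obs: Observation) -> int:
    return 2 if obs < 0.5 else 1


if __name__ == "__main__":
    for entry in run_self_triggered(
        [0.2, 0.9, 0.9, 0.1, 0.8], demo_action_policy, demo_duration_policy
    ):
        print(entry)
```

In this sketch the second and fifth observations never reach the policies, because the previously chosen action is still being held. That is the sense in which the control loop is self-triggered rather than updated at every step.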

Preprint, 2014, arXiv

Effect of varying mixture ratio of raw material powders on the thermoelectric properties of AlMgB14-based materials prepared by spark plasma sintering

The thermoelectric properties of AlMgB14-based materials prepared by spark plasma sintering were investigated. Al, Mg, and B powders were used as raw materials. The raw powders were mixed using a V-shaped mixer, and the mixture was then sintered at 1673 K or 1773 K. The mixture ratio of the raw powders was varied around the stoichiometric ratio of AlMgB14. X-ray diffraction patterns showed that all samples consist of AlMgB14 and some by-products: MgAl2O4, B2O and AlB12. The Seebeck coefficient of the samples changed significantly with the mixture ratio. One sample exhibited a large negative Seebeck coefficient (approximately -500 μV/K) over the temperature range from 573 K to 1073 K, while the others showed positive values (250-450 μV/K). Thus, n-type AlMgB14-based material has been realized simply by varying the raw material ratio.