Source author record

Masakazu Yoshimura

Masakazu Yoshimura appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Computer Vision Robotics eess.IV

Catalog footprint

What is connected

4works

3topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Structuring Open-Ended NAS: Semi-Automated Design Knowledge Structuring with LLMs for Efficient Neural Architecture Search

Current neural architecture search (NAS) methods are often limited by their predefined, restrictive search spaces. While recent large language model (LLM)-assisted NAS methods enable open-ended search spaces, they often suffer from inefficient exploration due to biased or low-quality design ideas. To address these issues, we propose to semi-automatically structure model design knowledge to guide the search process. Our approach first defines a high-level structural template of architectural attributes. An LLM then populates this template by analyzing papers, creating a rich and diverse search space that embodies this structured design knowledge. To efficiently explore this vast space, we introduce FairNAD, using a multi-type mutation that enables broad exploration through mutation with fair idea sampling, Pareto-aware mutation, LLM-driven iterative mutation, and a fine-grained feedback loop. We demonstrate the effectiveness of FairNAD in discovering high-performing architectures that yield 0.84, 2.17, and 2.35 points improvement on CIFAR-10, CIFAR-100, and ImageNet16-120, respectively, compared to current state-of-the-art methods.

preprint2022arXiv

MBAPose: Mask and Bounding-Box Aware Pose Estimation of Surgical Instruments with Photorealistic Domain Randomization

Surgical robots are usually controlled using a priori models based on the robots' geometric parameters, which are calibrated before the surgical procedure. One of the challenges in using robots in real surgical settings is that those parameters can change over time, consequently deteriorating control accuracy. In this context, our group has been investigating online calibration strategies without added sensors. In one step toward that goal, we have developed an algorithm to estimate the pose of the instruments' shafts in endoscopic images. In this study, we build upon that earlier work and propose a new framework to more precisely estimate the pose of a rigid surgical instrument. Our strategy is based on a novel pose estimation model called MBAPose and the use of synthetic training data. Our experiments demonstrated an improvement of 21 % for translation error and 26 % for orientation error on synthetic test data with respect to our previous work. Results with real test data provide a baseline for further research.

preprint2021arXiv

FOSS: Multi-Person Age Estimation with Focusing on Objects and Still Seeing Surroundings

Age estimation from images can be used in many practical scenes. Most of the previous works targeted on the estimation from images in which only one face exists. Also, most of the open datasets for age estimation contain images like that. However, in some situations, age estimation in the wild and for multi-person is needed. Usually, such situations were solved by two separate models; one is a face detector model which crops facial regions and the other is an age estimation model which estimates from cropped images. In this work, we propose a method that can detect and estimate the age of multi-person with a single model which estimates age with focusing on faces and still seeing surroundings. Also, we propose a training method which enables the model to estimate multi-person well despite trained with images in which only one face is photographed. In the experiments, we evaluated our proposed method compared with the traditional approach using two separate models. As the result, the accuracy could be enhanced with our proposed method. We also adapted our proposed model to commonly used single person photographed age estimation datasets and it is proved that our method is also effective to those images and outperforms the state of the art accuracy.

preprint2020arXiv

Single-Shot Pose Estimation of Surgical Robot Instruments' Shafts from Monocular Endoscopic Images

Surgical robots are used to perform minimally invasive surgery and alleviate much of the burden imposed on surgeons. Our group has developed a surgical robot to aid in the removal of tumors at the base of the skull via access through the nostrils. To avoid injuring the patients, a collision-avoidance algorithm that depends on having an accurate model for the poses of the instruments' shafts is used. Given that the model's parameters can change over time owing to interactions between instruments and other disturbances, the online estimation of the poses of the instrument's shaft is essential. In this work, we propose a new method to estimate the pose of the surgical instruments' shafts using a monocular endoscope. Our method is based on the use of an automatically annotated training dataset and an improved pose-estimation deep-learning architecture. In preliminary experiments, we show that our method can surpass state of the art vision-based marker-less pose estimation techniques (providing an error decrease of 55% in position estimation, 64% in pitch, and 69% in yaw) by using artificial images.