Source author record

Kazuya Takeda

Kazuya Takeda appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Artificial Intelligence; Computer Vision; Machine Learning; Robotics; Computation and Language; eess.AS; Applications; eess.IV; eess.SY; Multiagent Systems; Sound; Systems and Control

Catalog footprint

What is connected

12 works

12 topics

4 close collaborators

preprint, 2023, arXiv

Perception and Sensing for Autonomous Vehicles Under Adverse Weather Conditions: A Survey

Automated Driving Systems (ADS) open up a new domain for the automotive industry and offer new possibilities for future transportation with higher efficiency and comfortable experiences. However, autonomous driving under adverse weather conditions has long kept autonomous vehicles (AVs) from reaching level 4 or higher autonomy. This paper assesses the influences and challenges that weather brings to ADS sensors in an analytic and statistical way, and surveys the solutions against inclement weather conditions. State-of-the-art techniques on perception enhancement with regard to each kind of weather are thoroughly reported. External auxiliary solutions, weather conditions coverage in currently available datasets, simulators, and experimental facilities with weather chambers are distinctly sorted out. Additionally, potential future ADS sensor candidates and approaches beyond common senses are provided. By looking into all kinds of major weather problems the autonomous driving field is currently facing, and reviewing both hardware and computer science solutions in recent years, this survey points out the main moving trends of adverse weather problems in autonomous driving, i.e., advanced sensor fusions, more sophisticated networks, and V2X & IoT technologies; and also the limitations brought by emerging 1550 nm LiDARs. In general, this work contributes a holistic overview of the obstacles and directions of ADS development in terms of adverse weather driving conditions.

preprint, 2022, arXiv

Automatic detection of faults in race walking from a smartphone camera: a comparison of an Olympic medalist and university athletes

Automatic fault detection is a major challenge in many sports. In race walking, referees visually judge faults according to the rules. Hence, ensuring objectivity and fairness while judging is important. To address this issue, some studies have attempted to use sensors and machine learning to automatically detect faults. However, there are problems associated with sensor attachments and equipment such as a high-speed camera, which conflict with the visual judgement of referees, and the interpretability of the fault detection models. In this study, we proposed a fault detection system for non-contact measurement. We used pose estimation and machine learning models trained based on the judgements of multiple qualified referees to realize fair fault judgement. We verified them using smartphone videos of normal race walking and walking with intentional faults in several athletes including the medalist of the Tokyo Olympics. The validation results show that the proposed system detected faults with an average accuracy of over 90%. We also revealed that the machine learning model detects faults according to the rules of race walking. In addition, the intentional faulty walking movement of the medalist was different from that of university walkers. This finding informs realization of a more general fault detection model. The code and data are available at https://github.com/SZucchini/racewalk-aijudge.

preprint, 2022, arXiv

Estimating the Effect of Team Hitting Strategies Using Counterfactual Virtual Simulation in Baseball

In baseball, every play on the field is quantitatively evaluated and has an effect on individual and team strategies. The weighted on-base average (wOBA) is well known as a measure of a batter's hitting contribution. However, this measure ignores the game situation, such as the runners on base, which coaches and batters are known to consider when employing multiple hitting strategies. Yet the effectiveness of these strategies is unknown. This is probably because (1) we cannot obtain the batter's strategy and (2) it is difficult to estimate the effect of the strategies. Here, we propose a new method for estimating the effect using counterfactual batting simulation. To this end, we propose a deep learning model that transforms batting ability when batting strategy is changed. This method can estimate the effects of various strategies, which has been traditionally difficult with actual game data. We found that, when the switching cost of batting strategies can be ignored, the use of different strategies increased runs. When the switching cost is considered, the conditions for increasing runs were limited. Our validation results suggest that our simulation could clarify the effect of using multiple batting strategies.

preprint, 2022, arXiv

Evaluation of creating scoring opportunities for teammates in soccer via trajectory prediction

Evaluating how individual soccer players move for their teammates is crucial for assessing teamwork, scouting, and fan engagement. It has been said that players in a 90-min game do not have the ball for about 87 minutes on average. However, it has remained difficult to evaluate an attacking player who does not receive the ball, and to reveal how movement contributes to the creation of scoring opportunities for teammates. In this paper, we evaluate players who create off-ball scoring opportunities by comparing actual movements with the reference movements generated via trajectory prediction. First, we predict the trajectories of players using a graph variational recurrent neural network that can accurately model the relationship between players and predict the long-term trajectory. Next, based on the difference in the modified off-ball evaluation index between the actual and the predicted trajectory as a reference, we evaluate how the actual movement contributes to scoring opportunity compared to the predicted movement. For verification, we examined the relationship with the annual salary, the goals, and the rating in the game by experts for all games of a team in a professional soccer league in a year. The results show that the annual salary and the proposed indicator correlated significantly, which could not be explained by the existing indicators and goals. Our results suggest the effectiveness of the proposed method as an indicator for a player without the ball to create a scoring chance for teammates.

preprint, 2022, arXiv

Pitching strategy evaluation via stratified analysis using propensity score

Recent measurement technologies enable us to analyze baseball at higher levels. There are, however, still many unclear points around the pitching strategy. Two factors make it difficult to measure the effect of pitching strategy. First, most public datasets do not include location data where the catcher demands a ball, which is essential information to obtain the battery's intent. Second, there are many confounders associated with pitching/batting results when evaluating pitching strategy. We here clarify the effect of pitching attempts to a specific location, e.g., inside or outside. We employ a causal inference framework called stratified analysis using a propensity score to evaluate the effects while removing the effect of disturbing factors. We used a pitch-by-pitch dataset of Japanese professional baseball games held in 2014-2019, which includes location data where the catcher demands a ball. The results reveal that an outside pitching attempt is more effective than an inside one at minimizing allowed runs on average. Besides, the stratified analysis shows that the outside pitching attempt was always effective regardless of the magnitude of the estimated batter's ability and the ratio of inside pitches for the pitcher/batter. Our analysis would provide practical insights into selecting a pitching strategy to minimize allowed runs.
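The general idea of stratified analysis with a propensity score can be sketched as follows. This is a minimal illustration and not the authors' code: the function name `stratified_effect`, the equal-size strata, and the toy numbers are all assumptions, and the propensity scores are taken as already estimated by some separate model. Samples are sorted by score, split into strata, and the treated-minus-control difference in mean outcome is averaged across strata, weighted by stratum size.

```python
from statistics import mean


def stratified_effect(scores, treated, outcomes, n_strata=2):
    """Estimate a treatment effect by stratifying on a propensity score.

    scores   : estimated probability of treatment per sample (assumed given)
    treated  : 1 if the sample received the treatment, else 0
    outcomes : observed outcome per sample
    """
    # Sort sample indices by propensity score and cut into equal-size strata.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    n = len(order)
    weighted_sum = 0.0
    total_weight = 0
    for s in range(n_strata):
        idx = order[s * n // n_strata:(s + 1) * n // n_strata]
        y1 = [outcomes[i] for i in idx if treated[i]]
        y0 = [outcomes[i] for i in idx if not treated[i]]
        if not y1 or not y0:
            continue  # a stratum without both groups cannot be compared
        # Within a stratum, treated and control samples have similar scores,
        # so their difference in mean outcome is less confounded.
        weighted_sum += len(idx) * (mean(y1) - mean(y0))
        total_weight += len(idx)
    if total_weight == 0:
        raise ValueError("no stratum contains both treated and control samples")
    return weighted_sum / total_weight


# Toy data (made up): the naive difference in means is zero,
# but within each stratum the treated outcome is lower by 1.
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
treated = [1, 0, 0, 0, 1, 1, 1, 0]
outcomes = [1.0, 2.0, 2.0, 2.0, 3.0, 3.0, 3.0, 4.0]
effect = stratified_effect(scores, treated, outcomes)
```

In the toy data the treated and control groups have the same overall mean outcome, so a naive comparison would report no effect, while the stratified estimate recovers the within-stratum difference.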

preprint, 2022, arXiv

RSG-Net: Towards Rich Sematic Relationship Prediction for Intelligent Vehicle in Complex Environments

Behavioral and semantic relationships play a vital role in intelligent self-driving vehicles and ADAS systems. Different from other research focused on trajectory, position, and bounding boxes, relationship data provides a human-understandable description of the object's behavior, and it could describe an object's past and future status in an amazingly brief way. Therefore it is a fundamental method for tasks such as risk detection, environment understanding, and decision making. In this paper, we propose RSG-Net (Road Scene Graph Net): a graph convolutional network designed to predict potential semantic relationships from object proposals and produce a graph-structured result, called "Road Scene Graph". The experimental results indicate that this network, trained on the Road Scene Graph dataset, could efficiently predict potential semantic relationships among objects around the ego-vehicle.

preprint, 2020, arXiv

A Survey of Autonomous Driving: Common Practices and Emerging Technologies

Automated driving systems (ADSs) promise a safe, comfortable and efficient driving experience. However, fatalities involving vehicles equipped with ADSs are on the rise. The full potential of ADSs cannot be realized unless the robustness of the state-of-the-art is improved further. This paper discusses unsolved problems and surveys the technical aspects of automated driving. Studies regarding present challenges, high-level system architectures, emerging methodologies and core functions (localization, mapping, perception, planning, and human machine interface) were thoroughly reviewed. Furthermore, the state-of-the-art was implemented on our own platform and various algorithms were compared in a real-world driving setting. The paper concludes with an overview of available datasets and tools for ADS development.

preprint, 2020, arXiv

Characterization of Multiple 3D LiDARs for Localization and Mapping using Normal Distributions Transform

In this work, we present a detailed comparison of ten different 3D LiDAR sensors, covering a range of manufacturers, models, and laser configurations, for the tasks of mapping and vehicle localization, using as common reference the Normal Distributions Transform (NDT) algorithm implemented in the self-driving open source platform Autoware. LiDAR data used in this study is a subset of our LiDAR Benchmarking and Reference (LIBRE) dataset, captured independently from each sensor, from a vehicle driven on public urban roads multiple times, at different times of the day. In this study, we analyze the performance and characteristics of each LiDAR for the tasks of (1) 3D mapping, including an assessment of map quality based on mean map entropy, and (2) 6-DOF localization using a ground truth reference map.

preprint, 2020, arXiv

End-to-End Automatic Speech Recognition Integrated With CTC-Based Voice Activity Detection

This paper integrates a voice activity detection (VAD) function with end-to-end automatic speech recognition toward an online speech interface and transcribing very long audio recordings. We focus on connectionist temporal classification (CTC) and its extension of CTC/attention architectures. As opposed to an attention-based architecture, input-synchronous label prediction can be performed based on a greedy search with the CTC (pre-)softmax output. This prediction includes consecutive long blank labels, which can be regarded as a non-speech region. We use the labels as a cue for detecting speech segments with simple thresholding. The threshold value is directly related to the length of a non-speech region, which is more intuitive and easier to control than conventional VAD hyperparameters. Experimental results on unsegmented data show that the proposed method outperformed the baseline methods using the conventional energy-based and neural-network-based VAD methods and achieved an RTF less than 0.2. The proposed method is publicly available.
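The thresholding step described in this abstract can be illustrated with a short sketch. It is not the published implementation: the function name `ctc_vad_segments`, the blank index `0`, and the threshold name `min_blank_run` are assumptions made for illustration. The input is the per-frame label sequence from a greedy search over the CTC output. Any run of consecutive blank labels at least `min_blank_run` frames long is treated as non-speech, and the remaining frames are grouped into speech segments.

```python
def ctc_vad_segments(frame_labels, blank=0, min_blank_run=3):
    """Return speech segments as (start, end) frame ranges, end exclusive."""
    n = len(frame_labels)
    is_nonspeech = [False] * n

    # Mark every blank run that is long enough to count as non-speech.
    i = 0
    while i < n:
        if frame_labels[i] == blank:
            j = i
            while j < n and frame_labels[j] == blank:
                j += 1
            if j - i >= min_blank_run:
                for k in range(i, j):
                    is_nonspeech[k] = True
            i = j
        else:
            i += 1

    # Group the remaining frames (labels and short blank runs) into segments.
    segments = []
    start = None
    for t, nonspeech in enumerate(is_nonspeech):
        if not nonspeech and start is None:
            start = t
        elif nonspeech and start is not None:
            segments.append((start, t))
            start = None
    if start is not None:
        segments.append((start, n))
    return segments


# Hypothetical greedy CTC output: 0 is the blank label.
labels = [0, 0, 0, 0, 5, 0, 7, 7, 0, 0, 0, 0, 0, 3, 0]
segments = ctc_vad_segments(labels, blank=0, min_blank_run=3)
```

Because the threshold is a number of frames, it maps directly to a minimum pause duration once the frame rate is known, which matches the abstract's point that it is easier to reason about than conventional VAD hyperparameters.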

preprint, 2020, arXiv

ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Text-to-Speech Toolkit

This paper introduces a new end-to-end text-to-speech (E2E-TTS) toolkit named ESPnet-TTS, which is an extension of the open-source speech processing toolkit ESPnet. The toolkit supports state-of-the-art E2E-TTS models, including Tacotron 2, Transformer TTS, and FastSpeech, and also provides recipes inspired by the Kaldi automatic speech recognition (ASR) toolkit. The recipes are based on the design unified with the ESPnet ASR recipe, providing high reproducibility. The toolkit also provides pre-trained models and samples of all of the recipes so that users can use it as a baseline. Furthermore, the unified design enables the integration of ASR functions with TTS, e.g., ASR-based objective evaluation and semi-supervised learning with both ASR and TTS models. This paper describes the design of the toolkit and experimental evaluation in comparison with other toolkits. The experimental results show that our models can achieve state-of-the-art performance comparable to the other latest toolkits, resulting in a mean opinion score (MOS) of 4.25 on the LJSpeech dataset. The toolkit is publicly available at https://github.com/espnet/espnet.

preprint, 2020, arXiv

LIBRE: The Multiple 3D LiDAR Dataset

In this work, we present LIBRE: LiDAR Benchmarking and Reference, a first-of-its-kind dataset featuring 10 different LiDAR sensors, covering a range of manufacturers, models, and laser configurations. Data captured independently from each sensor includes three different environments and configurations: static targets, where objects were placed at known distances and measured from a fixed position within a controlled environment; adverse weather, where static obstacles were measured from a moving vehicle, captured in a weather chamber where LiDARs were exposed to different conditions (fog, rain, strong light); and finally, dynamic traffic, where dynamic objects were captured from a vehicle driven on public urban roads, multiple times at different times of the day, and including supporting sensors such as cameras, infrared imaging, and odometry devices. LIBRE will contribute to the research community to (1) provide a means for a fair comparison of currently available LiDARs, and (2) facilitate the improvement of existing self-driving vehicles and robotics-related software, in terms of development and tuning of LiDAR-based perception algorithms.

preprint, 2020, arXiv

Risky Action Recognition in Lane Change Video Clips using Deep Spatiotemporal Networks with Segmentation Mask Transfer

Advanced driver assistance and automated driving systems rely on risk estimation modules to predict and avoid dangerous situations. Current methods use expensive sensor setups and complex processing pipelines, limiting their availability and robustness. To address these issues, we introduce a novel deep learning based action recognition framework for classifying dangerous lane change behavior in short video clips captured by a monocular camera. We designed a deep spatiotemporal classification network that uses the pre-trained state-of-the-art instance segmentation network Mask R-CNN as its spatial feature extractor for this task. The Long Short-Term Memory (LSTM) and shallower final classification layers of the proposed method were trained on a semi-naturalistic lane change dataset with annotated risk labels. A comprehensive comparison of state-of-the-art feature extractors was carried out to find the best network layout and training strategy. The best result, with a 0.937 AUC score, was obtained with the proposed network. Our code and trained models are available open-source.
