Researcher profile

Akira Taniguchi

Akira Taniguchi contributes to research discovery and scholarly infrastructure.

Published work

11 published items

Preprint, 2023, arXiv

Emergent Communication through Metropolis-Hastings Naming Game with Deep Generative Models

Constructive studies on symbol emergence systems seek to investigate computational models that can better explain human language evolution, the creation of symbol systems, and the construction of internal representations. This study provides a new model for emergent communication, which is based on a probabilistic generative model (PGM) instead of a discriminative model based on deep reinforcement learning. We define the Metropolis-Hastings (MH) naming game by generalizing previously proposed models. It is not a referential game with explicit feedback, as assumed by many emergent communication studies. Instead, it is a game based on joint attention without explicit feedback. Mathematically, the MH naming game is proved to be a type of MH algorithm for an integrative PGM that combines two agents that play the naming game. From this viewpoint, symbol emergence is regarded as decentralized Bayesian inference, and semiotic communication is regarded as inter-personal cross-modal inference. This notion leads to the collective predictive coding hypothesis regarding language evolution and, in general, the emergence of symbols. We also propose the inter-Gaussian mixture model (GMM) + variational autoencoder (VAE), a deep generative model for emergent communication based on the MH naming game. The model has been validated on the MNIST and Fruits 360 datasets. Experimental findings demonstrate that categories are formed from real images observed by agents, and signs are correctly shared across agents by successfully utilizing the observations of both agents via the MH naming game. Furthermore, we verified that visual images were recalled from signs uttered by agents. Notably, emergent communication without supervision and reward feedback improved the performance of the unsupervised representation learning of agents.
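
The accept/reject step behind an MH naming game can be illustrated with a toy sketch. This is not the authors' implementation: it assumes discrete signs and hand-written probability tables, whereas the abstract describes a GMM+VAE model. The function name `mh_naming_step` and all of its parameters are hypothetical. The speaker proposes a sign sampled from its own belief, and the listener accepts it with a probability given by the ratio of its own likelihoods under the proposed and the current sign, so no explicit reward or feedback is exchanged.

```python
import random


def mh_naming_step(speaker_sign_probs, listener_likelihood, listener_sign, rng):
    """One toy naming-game turn for a single jointly attended object.

    speaker_sign_probs:  P(sign | speaker's internal state), used as the proposal.
    listener_likelihood: P(listener's internal state | sign), one entry per sign.
    listener_sign:       index of the sign the listener currently holds.
    Returns (sign kept by the listener, whether the proposal was accepted).
    """
    signs = list(range(len(speaker_sign_probs)))
    # Speaker utters a sign sampled from its own belief.
    proposed = rng.choices(signs, weights=speaker_sign_probs, k=1)[0]

    current_lik = listener_likelihood[listener_sign]
    if current_lik == 0.0:
        # Assumption: a current sign with zero likelihood is always replaced.
        accept_prob = 1.0
    else:
        # Metropolis-Hastings style ratio, judged only with the listener's own model.
        accept_prob = min(1.0, listener_likelihood[proposed] / current_lik)

    if rng.random() < accept_prob:
        return proposed, True
    return listener_sign, False


if __name__ == "__main__":
    rng = random.Random(0)
    sign, accepted = mh_naming_step([0.1, 0.9], [0.2, 0.8], 0, rng)
    print(sign, accepted)
```

In a full game the two agents would alternate speaker and listener roles and update their internal parameters between turns. The sketch covers only the single acceptance decision.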

Preprint, 2022, arXiv

A Whole Brain Probabilistic Generative Model: Toward Realizing Cognitive Architectures for Developmental Robots

Building a humanlike integrative artificial cognitive system, that is, an artificial general intelligence (AGI), is the holy grail of the artificial intelligence (AI) field. Furthermore, a computational model that enables an artificial system to achieve cognitive development will be an excellent reference for brain and cognitive science. This paper describes an approach to develop a cognitive architecture by integrating elemental cognitive modules to enable the training of the modules as a whole. This approach is based on two ideas: (1) brain-inspired AI, learning human brain architecture to build human-level intelligence, and (2) a probabilistic generative model (PGM)-based cognitive system to develop a cognitive system for developmental robots by integrating PGMs. The development framework is called a whole brain PGM (WB-PGM), which differs fundamentally from existing cognitive architectures in that it can learn continuously through a system based on sensory-motor information. In this study, we describe the rationale of WB-PGM, the current status of PGM-based elemental cognitive modules, their relationship with the human brain, the approach to the integration of the cognitive modules, and future challenges. Our findings can serve as a reference for brain studies. As PGMs describe explicit informational relationships between variables, this description provides interpretable guidance from computational sciences to brain science. By providing such information, researchers in neuroscience can provide feedback to researchers in AI and robotics on what the current models lack with reference to the brain. Further, it can facilitate collaboration among researchers in neuro-cognitive sciences as well as AI and robotics.

Preprint, 2022, arXiv

Brain-inspired probabilistic generative model for double articulation analysis of spoken language

The human brain, among its several functions, analyzes the double articulation structure in spoken language, i.e., double articulation analysis (DAA). A hierarchical structure in which words are connected to form a sentence and words are composed of phonemes or syllables is called a double articulation structure. Where and how DAA is performed in the human brain has not been established, although some insights have been obtained. In addition, existing computational models based on a probabilistic generative model (PGM) do not incorporate neuroscientific findings, and their consistency with the brain has not been previously discussed. This study compared, mapped, and integrated these existing computational models with neuroscientific findings to bridge this gap, and the findings are relevant for future applications and further research. This study proposes a PGM for a DAA hypothesis that can be realized in the brain based on the outcomes of several neuroscientific surveys. The study involved (i) investigation and organization of anatomical structures related to spoken language processing, and (ii) design of a PGM that matches the anatomy and functions of the region of interest. Therefore, this study provides novel insights that will be foundational to further exploring DAA in the brain.

Preprint, 2022, arXiv

Hippocampal formation-inspired probabilistic generative model

In building artificial intelligence (AI) agents, referring to how brains function in real environments can accelerate development by reducing the design space. In this study, we propose a probabilistic generative model (PGM) for navigation in uncertain environments by integrating the neuroscientific knowledge of hippocampal formation (HF) and the engineering knowledge in robotics and AI, namely, simultaneous localization and mapping (SLAM). We follow the approach of brain reference architecture (BRA) (Yamakawa, 2021) to compose the PGM and outline how to verify the model. To this end, we survey and discuss the relationship between the HF findings and SLAM models. The proposed hippocampal formation-inspired probabilistic generative model (HF-PGM) is designed to be highly consistent with the anatomical structure and functions of the HF. By referencing the brain, we elaborate on the importance of integration of egocentric/allocentric information from the entorhinal cortex to the hippocampus and the use of discrete-event queues.

Preprint, 2022, arXiv

Map completion from partial observation using the global structure of multiple environmental maps

Using the spatial structure of various indoor environments as prior knowledge, a robot could construct a map more efficiently. Autonomous mobile robots generally apply simultaneous localization and mapping (SLAM) methods to understand the reachable area in newly visited environments. However, conventional mapping approaches are limited because they consider only sensor observations and control signals when estimating the current environment map. This paper proposes a novel SLAM method, map completion network-based SLAM (MCN-SLAM), based on a probabilistic generative model incorporating deep neural networks for map completion. These map completion networks are primarily trained in the framework of generative adversarial networks (GANs) to extract the global structure of large amounts of existing map data. We show in experiments that the proposed method can estimate the environment map 1.3 times better than the previous SLAM methods under partial observation.

Preprint, 2022, arXiv

Symbol Emergence as Inter-personal Categorization with Head-to-head Latent Word

In this study, we propose a head-to-head type (H2H-type) inter-personal multimodal Dirichlet mixture (Inter-MDM) by modifying the original Inter-MDM, which is a probabilistic generative model that represents the symbol emergence between two agents as multiagent multimodal categorization. A Metropolis-Hastings method-based naming game based on the Inter-MDM enables two agents to collaboratively perform multimodal categorization and share signs with a solid mathematical foundation of convergence. However, the conventional Inter-MDM presumes a tail-to-tail connection across a latent word variable, which makes it difficult to extend the Inter-MDM to model more complex symbol emergence. Therefore, we propose an H2H-type Inter-MDM that treats a latent word variable as a child node of an internal variable of each agent, in the same way as many prior studies of multimodal categorization. On the basis of the H2H-type Inter-MDM, we propose a naming game in the same way as for the conventional Inter-MDM. The experimental results show that the H2H-type Inter-MDM yields almost the same performance as the conventional Inter-MDM from the viewpoint of multimodal categorization and sign sharing.

Preprint, 2021, arXiv

Autonomous Planning Based on Spatial Concepts to Tidy Up Home Environments with Service Robots

Tidy-up tasks by service robots in home environments are challenging in robotics applications because they involve various interactions with the environment. In particular, robots are required not only to grasp, move, and release various home objects but also to plan the order and positions for placing the objects. In this paper, we propose a novel planning method that can efficiently estimate the order and positions of the objects to be tidied up by learning the parameters of a probabilistic generative model. The model allows a robot to learn the distributions of the co-occurrence probability of the objects and places to tidy up using the multimodal sensor information collected in a tidied environment. Additionally, we develop an autonomous robotic system to perform the tidy-up operation. We evaluate the effectiveness of the proposed method by an experimental simulation that reproduces the conditions of the Tidy Up Here task of the World Robot Summit 2018 international robotics competition. The simulation results show that the proposed method enables the robot to successively tidy up several objects and achieves the best task score among the considered baseline tidy-up methods.

Preprint, 2021, arXiv

StarGAN-VC+ASR: StarGAN-based Non-Parallel Voice Conversion Regularized by Automatic Speech Recognition

Preserving the linguistic content of input speech is essential during voice conversion (VC). The star generative adversarial network-based VC method (StarGAN-VC) is a recently developed method that allows non-parallel many-to-many VC. Although this method is powerful, it can fail to preserve the linguistic content of input speech when the number of available training samples is extremely small. To overcome this problem, we propose the use of automatic speech recognition to assist model training, to improve StarGAN-VC, especially in low-resource scenarios. Experimental results show that using our proposed method, StarGAN-VC can retain more linguistic information than vanilla StarGAN-VC.

Preprint, 2020, arXiv

Improved and Scalable Online Learning of Spatial Concepts and Language Models with Mapping

We propose a novel online learning algorithm, called SpCoSLAM 2.0, for spatial concepts and lexical acquisition with high accuracy and scalability. Previously, we proposed SpCoSLAM as an online learning algorithm based on an unsupervised Bayesian probabilistic model that integrates multimodal place categorization, lexical acquisition, and SLAM. However, our original algorithm had limited estimation accuracy owing to the influence of the early stages of learning, and its computational complexity increased as training data were added. Therefore, we introduce techniques such as fixed-lag rejuvenation to reduce the calculation time while maintaining an accuracy higher than that of the original algorithm. The results show that, in terms of estimation accuracy, the proposed algorithm exceeds the original algorithm and is comparable to batch learning. In addition, the calculation time of the proposed algorithm does not depend on the amount of training data and becomes constant for each step of the scalable algorithm. Our approach will contribute to the realization of long-term spatial language interactions between humans and robots.

Preprint, 2020, arXiv

Neuro-SERKET: Development of Integrative Cognitive System through the Composition of Deep Probabilistic Generative Models

This paper describes a framework for the development of an integrative cognitive system based on probabilistic generative models (PGMs) called Neuro-SERKET. Neuro-SERKET is an extension of SERKET, which can compose elemental PGMs developed in a distributed manner and provide a scheme that allows the composed PGMs to learn throughout the system in an unsupervised way. In addition to the head-to-tail connection supported by SERKET, Neuro-SERKET supports tail-to-tail and head-to-head connections, as well as neural network-based modules, i.e., deep generative models. As an example of a Neuro-SERKET application, an integrative model was developed by composing a variational autoencoder (VAE), a Gaussian mixture model (GMM), latent Dirichlet allocation (LDA), and automatic speech recognition (ASR). The model is called VAE+GMM+LDA+ASR. The performance of VAE+GMM+LDA+ASR and the validity of Neuro-SERKET were demonstrated through a multimodal categorization task using image data and a speech signal of numerical digits.

Preprint, 2020, arXiv

Spatial Concept-Based Navigation with Human Speech Instructions via Probabilistic Inference on Bayesian Generative Model

Robots are required not only to learn spatial concepts autonomously but also to utilize such knowledge for various tasks in a domestic environment. A spatial concept represents a multimodal place category acquired from the robot's spatial experience, including vision, speech-language, and self-position. The aim of this study is to enable a mobile robot to perform navigational tasks with human speech instructions, such as "Go to the kitchen", via probabilistic inference on a Bayesian generative model using spatial concepts. Specifically, path planning was formalized as the maximization of a probability distribution over the path trajectory under a speech instruction, based on a control-as-inference framework. Furthermore, we described the relationship between probabilistic inference based on the Bayesian generative model and control problems, including reinforcement learning. We demonstrated path planning based on human instruction using acquired spatial concepts to verify the usefulness of the proposed approach in the simulator and in real environments. Experimentally, places instructed by the user's speech commands showed high probability values, and the trajectory toward the target place was correctly estimated. Our approach, based on probabilistic inference concerning decision-making, can lead to further improvement in robot autonomy.