Researcher profile

Shun-ichi Amari

Shun-ichi Amari contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
7 works
0 followers
8 topics
4 close collaborators

Published work

7 published items

preprint | 2023 | arXiv

Deep Learning in Random Neural Fields: Numerical Experiments via Neural Tangent Kernel

A biological neural network in the cortex forms a neural field. Neurons in the field have their own receptive fields, and connection weights between two neurons are random but highly correlated when they are in close proximity in receptive fields. In this paper, we study such neural fields in a multilayer architecture to investigate their supervised learning. We empirically compare the performance of our field model with that of randomly connected deep networks. The behavior of a randomly connected network is investigated on the basis of the key idea of the neural tangent kernel regime, a recent development in the machine learning theory of over-parameterized networks; for most randomly connected neural networks, it is shown that global minima always exist in their small neighborhoods. We numerically show that this claim also holds for our neural fields. In more detail, our model has two structures: i) each neuron in a field has a continuously distributed receptive field, and ii) the initial connection weights are random but not independent, having correlations when the positions of neurons are close in each layer. We show that such a multilayer neural field is more robust than conventional models when input patterns are deformed by noise disturbances. Moreover, its generalization ability can be slightly superior to that of conventional models.

preprint | 2020 | arXiv

Any Target Function Exists in a Neighborhood of Any Sufficiently Wide Random Network: A Geometrical Perspective

It is known that any target function is realized in a sufficiently small neighborhood of any randomly connected deep network, provided the width (the number of neurons in a layer) is sufficiently large. There are sophisticated theories and discussions concerning this striking fact, but rigorous theories are very complicated. We give an elementary geometrical proof by using a simple model for the purpose of elucidating its structure. We show that high-dimensional geometry plays a magical role: When we project a high-dimensional sphere of radius 1 to a low-dimensional subspace, the uniform distribution over the sphere reduces to a Gaussian distribution of negligibly small covariances.
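The projection claim in this abstract can be checked numerically. The following is a minimal sketch, not the authors' construction: it draws points uniformly on the unit sphere in a high-dimensional space, keeps the first k coordinates, and measures the variance of one projected coordinate, which should be close to 1/dim. The function names, the choice of dimensions and the sampling method (normalizing Gaussian vectors) are assumptions made for illustration.

```python
import math
import random


def project_sphere_samples(dim, k, n_samples, seed=0):
    """Sample points uniformly on the unit sphere in R^dim and keep the
    first k coordinates (an orthogonal projection onto a k-dim subspace)."""
    rng = random.Random(seed)
    projected = []
    for _ in range(n_samples):
        # A normalized standard Gaussian vector is uniform on the sphere.
        g = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(v * v for v in g))
        projected.append([v / norm for v in g[:k]])
    return projected


def scaled_variance(projected, coord, dim):
    """Sample variance of one projected coordinate, multiplied by dim.
    A value near 1 means the variance is near 1/dim."""
    values = [p[coord] for p in projected]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return var * dim
```

Calling scaled_variance(project_sphere_samples(400, 2, 2000), 0, 400) should return a value near 1, so the variance of each projected coordinate shrinks like 1/dim as the ambient dimension grows.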

preprint | 2020 | arXiv

Unified framework for the entropy production and the stochastic interaction based on information geometry

We show a relationship between the entropy production in stochastic thermodynamics and the stochastic interaction in integrated information theory. To clarify this relationship, we introduce an information geometric interpretation of the entropy production for a total system and the partial entropy productions for subsystems. We show that the violation of the additivity of the entropy productions is related to the stochastic interaction. This framework is a thermodynamic foundation of integrated information theory. We also show that our information geometric formalism leads to a novel expression of the entropy production related to an optimization problem minimizing the Kullback-Leibler divergence. We analytically illustrate this interpretation by using the spin model.

preprint | 2020 | arXiv

Wasserstein statistics in 1D location-scale model

Wasserstein geometry and information geometry are two important structures introduced in a manifold of probability distributions. The former is defined by using the transportation cost between two distributions, so it reflects the metric structure of the base manifold on which distributions are defined. Information geometry is constructed based on the invariance criterion that the geometry is invariant under reversible transformations of the base space. Both have their own merits for applications. Statistical inference is constructed on information geometry, where the Fisher metric plays a fundamental role, whereas Wasserstein geometry is useful for applications to computer vision and AI. We propose statistical inference based on the Wasserstein geometry in the case that the base space is 1-dimensional. By using the location-scale model, we derive the $W$-estimator explicitly and study its asymptotic behavior.

preprint | 2020 | arXiv

Wasserstein Statistics in One-dimensional Location-Scale Model

Wasserstein geometry and information geometry are two important structures introduced in a manifold of probability distributions. Wasserstein geometry is defined by using the transportation cost between two distributions, so it reflects the metric of the base manifold on which the distributions are defined. Information geometry is defined to be invariant under reversible transformations of the base space. Both have their own merits for applications. In particular, statistical inference is based upon information geometry, where the Fisher metric plays a fundamental role, whereas Wasserstein geometry is useful in computer vision and AI applications. In this study, we analyze statistical inference based on the Wasserstein geometry in the case that the base space is one-dimensional. By using the location-scale model, we further derive the W-estimator that explicitly minimizes the transportation cost from the empirical distribution to a statistical model and study its asymptotic behavior. We show that the W-estimator is consistent and explicitly give its asymptotic distribution by using the functional delta method. The W-estimator is Fisher efficient in the Gaussian case.
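To make the idea of minimizing the transportation cost concrete, here is a minimal sketch for the Gaussian location-scale model, under stated assumptions rather than as the authors' implementation. It assumes the squared 2-Wasserstein cost, which in one dimension is the integrated squared difference between quantile functions. Minimizing that cost between the empirical quantile function (a step function over the order statistics) and mu + sigma * q(u), where q is the standard normal quantile function, gives mu as the sample mean and sigma as a weighted sum of order statistics. Each weight is the integral of q over ((i-1)/n, i/n). The function names are hypothetical.

```python
from statistics import NormalDist

_STD = NormalDist()  # assumed reference distribution: standard Gaussian


def _density_at_quantile(u):
    """Standard normal density evaluated at its u-quantile.
    The limit at u = 0 and u = 1 is 0."""
    if u <= 0.0 or u >= 1.0:
        return 0.0
    return _STD.pdf(_STD.inv_cdf(u))


def w_estimate_gaussian(sample):
    """Minimize the 1D squared 2-Wasserstein cost between the empirical
    distribution and N(mu, sigma^2). Returns (mu, sigma)."""
    xs = sorted(sample)
    n = len(xs)
    mu = sum(xs) / n
    # The integral of the normal quantile function over (a, b) equals
    # density_at_quantile(a) - density_at_quantile(b).
    sigma = sum(
        x * (_density_at_quantile(i / n) - _density_at_quantile((i + 1) / n))
        for i, x in enumerate(xs)
    )
    return mu, sigma
```

The weights sum to zero, so shifting every observation by a constant changes only mu, and multiplying every observation by a positive constant scales both mu and sigma by it.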

preprint | 2010 | arXiv

Dually flat structure with escort probability and its application to alpha-Voronoi diagrams

This paper studies the geometrical structure of the manifold of escort probability distributions and shows its new applicability to information science. In order to realize escort probabilities we use a conformal transformation that flattens the so-called alpha-geometry of the space of discrete probability distributions, which well characterizes nonadditive statistics on the space. As a result, escort probabilities are proved to be flat coordinates of the usual probabilities for the derived dually flat structure. Finally, we demonstrate that escort probabilities with the new structure admit a simple algorithm to compute Voronoi diagrams and centroids with respect to alpha-divergences.
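For readers unfamiliar with escort probabilities, the sketch below computes the standard escort distribution of order q for a discrete distribution, P_i = p_i^q / sum_j p_j^q. It illustrates only this definition. The relation between q and the alpha parameter used in the paper is not stated in the abstract and is left unspecified here; the function name is hypothetical.

```python
def escort_distribution(p, q):
    """Escort distribution of order q for a strictly positive discrete
    probability vector p: P_i = p_i**q / sum_j p_j**q."""
    if any(x <= 0.0 for x in p):
        raise ValueError("all probabilities must be strictly positive")
    if abs(sum(p) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    powered = [x ** q for x in p]
    total = sum(powered)
    return [x / total for x in powered]
```

With q = 1 the input distribution is returned unchanged, and with q = 0 the result is uniform. Values of q above 1 sharpen the distribution toward its largest entries.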

preprint | 2010 | arXiv

Modeling Basal Ganglia for understanding Parkinsonian Reaching Movements

We present a computational model that highlights the role of the basal ganglia (BG) in generating simple reaching movements. The model is cast within the reinforcement learning (RL) framework with the correspondence between RL components and neuroanatomy as follows: the dopamine signal of the substantia nigra pars compacta as the Temporal Difference error, the striatum as the substrate for the Critic, and the motor cortex as the Actor. A key feature of this neurobiological interpretation is our hypothesis that the indirect pathway is the Explorer. Chaotic activity, originating from the indirect pathway part of the model, drives the wandering, exploratory movements of the arm. Thus the direct pathway subserves exploitation while the indirect pathway subserves exploration. The motor cortex becomes more and more independent of the corrective influence of the BG as training progresses. Reaching trajectories show diminishing variability with training. Reaching movements associated with Parkinson's disease (PD) are simulated by (a) reducing dopamine and (b) degrading the complexity of indirect pathway dynamics by switching it from chaotic to periodic behavior. Under the simulated PD conditions, the arm exhibits PD motor symptoms such as tremor, bradykinesia and undershoot. The model echoes the notion that PD is a dynamical disease.
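The Temporal Difference error that the abstract maps onto the dopamine signal can be illustrated with a generic tabular TD(0) critic. This is a textbook sketch, not the paper's model, which also includes an actor, an arm and chaotic indirect-pathway dynamics. The function name, the transition format and the learning-rate and discount values are assumptions.

```python
def td0_critic(episodes, n_states, alpha=0.1, gamma=0.9):
    """Tabular TD(0) value learning.

    episodes: list of episodes, each a list of
        (state, reward, next_state) transitions, where next_state is
        None when the episode terminates.
    Returns the learned state values and the TD error of every step.
    """
    values = [0.0] * n_states
    td_errors = []
    for episode in episodes:
        for state, reward, next_state in episode:
            # A terminal transition bootstraps from a value of zero.
            bootstrap = 0.0 if next_state is None else values[next_state]
            # TD error: the quantity the abstract identifies with dopamine.
            delta = reward + gamma * bootstrap - values[state]
            values[state] += alpha * delta  # critic update
            td_errors.append(delta)
    return values, td_errors
```

On a two-state chain that pays a reward of 1 on the final step, the TD error is large at the first rewarded step and shrinks toward zero as the critic's predictions improve. This is the sense in which the error acts as a teaching signal.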