Source author record

Masanori Koyama

Masanori Koyama appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning math.NA math.PR Computer Vision Molecular Networks Neurons and Cognition Quantitative Methods

Catalog footprint

What is connected

6works

7topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2020arXiv

Meta Learning as Bayes Risk Minimization

Meta-Learning is a family of methods that use a set of interrelated tasks to learn a model that can quickly learn a new query task from a possibly small contextual dataset. In this study, we use a probabilistic framework to formalize what it means for two tasks to be related and reframe the meta-learning problem into the problem of Bayesian risk minimization (BRM). In our formulation, the BRM optimal solution is given by the predictive distribution computed from the posterior distribution of the task-specific latent variable conditioned on the contextual dataset, and this justifies the philosophy of Neural Process. However, the posterior distribution in Neural Process violates the way the posterior distribution changes with the contextual dataset. To address this problem, we present a novel Gaussian approximation for the posterior distribution that generalizes the posterior of the linear Gaussian model. Unlike that of the Neural Process, our approximation of the posterior distributions converges to the maximum likelihood estimate with the same rate as the true posterior distribution. We also demonstrate the competitiveness of our approach on benchmark datasets.

preprint2020arXiv

Train Sparsely, Generate Densely: Memory-efficient Unsupervised Training of High-resolution Temporal GAN

Training of Generative Adversarial Network (GAN) on a video dataset is a challenge because of the sheer size of the dataset and the complexity of each observation. In general, the computational cost of training GAN scales exponentially with the resolution. In this study, we present a novel memory efficient method of unsupervised learning of high-resolution video dataset whose computational cost scales only linearly with the resolution. We achieve this by designing the generator model as a stack of small sub-generators and training the model in a specific way. We train each sub-generator with its own specific discriminator. At the time of the training, we introduce between each pair of consecutive sub-generators an auxiliary subsampling layer that reduces the frame-rate by a certain ratio. This procedure can allow each sub-generator to learn the distribution of the video at different levels of resolution. We also need only a few GPUs to train a highly complex generator that far outperforms the predecessor in terms of inception scores.

preprint2015arXiv

Deep learning of fMRI big data: a novel approach to subject-transfer decoding

As a technology to read brain states from measurable brain activities, brain decoding are widely applied in industries and medical sciences. In spite of high demands in these applications for a universal decoder that can be applied to all individuals simultaneously, large variation in brain activities across individuals has limited the scope of many studies to the development of individual-specific decoders. In this study, we used deep neural network (DNN), a nonlinear hierarchical model, to construct a subject-transfer decoder. Our decoder is the first successful DNN-based subject-transfer decoder. When applied to a large-scale functional magnetic resonance imaging (fMRI) database, our DNN-based decoder achieved higher decoding accuracy than other baseline methods, including support vector machine (SVM). In order to analyze the knowledge acquired by this decoder, we applied principal sensitivity analysis (PSA) to the decoder and visualized the discriminative features that are common to all subjects in the dataset. Our PSA successfully visualized the subject-independent features contributing to the subject-transferability of the trained decoder.

preprint2015arXiv

Principal Sensitivity Analysis

We present a novel algorithm (Principal Sensitivity Analysis; PSA) to analyze the knowledge of the classifier obtained from supervised machine learning techniques. In particular, we define principal sensitivity map (PSM) as the direction on the input space to which the trained classifier is most sensitive, and use analogously defined k-th PSM to define a basis for the input space. We train neural networks with artificial data and real data, and apply the algorithm to the obtained supervised classifiers. We then visualize the PSMs to demonstrate the PSA's ability to decompose the knowledge acquired by the trained classifiers.

preprint2014arXiv

An asymptotic relationship between coupling methods for stochastically modeled population processes

This paper is concerned with elucidating a relationship between two common coupling methods for the continuous time Markov chain models utilized in the cell biology literature. The couplings considered here are primarily used in a computational framework by providing reductions in variance for different Monte Carlo estimators, thereby allowing for significantly more accurate results for a fixed amount of computational time. Common applications of the couplings include the estimation of parametric sensitivities via finite difference methods and the estimation of expectations via multi-level Monte Carlo algorithms. While a number of coupling strategies have been proposed for the models considered here, and a number of articles have experimentally compared the different strategies, to date there has been no mathematical analysis describing the connections between them. Such analyses are critical in order to determine the best use for each. In the current paper, we show a connection between the common reaction path (CRP) method and the split coupling (SC) method, which is termed coupled finite differences (CFD) in the parametric sensitivities literature. In particular, we show that the two couplings are both limits of a third coupling strategy we call the "local-CRP" coupling, with the split coupling method arising as a key parameter goes to infinity, and the common reaction path coupling arising as the same parameter goes to zero. The analysis helps explain why the split coupling method often provides a lower variance than does the common reaction path method, a fact previously shown experimentally.

preprint2012arXiv

Weak error analysis of numerical methods for stochastic models of population processes

The simplest, and most common, stochastic model for population processes, including those from biochemistry and cell biology, are continuous time Markov chains. Simulation of such models is often relatively straightforward as there are easily implementable methods for the generation of exact sample paths. However, when using ensemble averages to approximate expected values, the computational complexity can become prohibitive as the number of computations per path scales linearly with the number of jumps of the process. When such methods become computationally intractable, approximate methods, which introduce a bias, can become advantageous. In this paper, we provide a general framework for understanding the weak error, or bias, induced by different numerical approximation techniques in the current setting. The analysis takes into account both the natural scalings within a given system and the step-size of the numerical method. Examples are provided to demonstrate the main analytical results as well as the reduction in computational complexity achieved by the approximate methods.