Source author record

Yutaka Kano

Yutaka Kano appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Methodology, Computation, Machine Learning, Mathematical Software

Catalog footprint

What is connected

4 works

4 topics

4 close collaborators

Preprint, 2014, arXiv

Identification Problem for The Analysis of Binary Data with Non-ignorable Missing

When a missing-data mechanism is NMAR (not missing at random), or non-ignorable, the missingness is itself vital information and must be incorporated into the likelihood, which in turn introduces additional parameters to be estimated. The incompleteness of the data and the additional parameters can cause an identification problem. The problem is more serious when the response variable is binary, because binary data carry less information; however, there are no methods for quickly verifying whether a model is identified. We therefore provide a new necessary and sufficient condition that makes it easy to check model identifiability when analyzing binary data with non-ignorable missingness using conditional models. The condition both shows what is required for a model to be identifiable and makes it easy to check the identifiability of a given model.
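
As a toy illustration of the identification problem described in this abstract (not the paper's condition, which concerns conditional models), consider a single binary response Y with P(Y=1) = p and a response indicator R with P(R=1 | Y=y) = q_y. The function name `observed_distribution` and the parameter values are assumptions made for this sketch. Two different parameter settings produce exactly the same observed-data distribution, so p cannot be recovered from the observed data alone.

```python
from fractions import Fraction as F

def observed_distribution(p, q1, q0):
    """Distribution of what is actually seen under a toy NMAR model.

    p  = P(Y = 1)
    q1 = P(R = 1 | Y = 1), q0 = P(R = 1 | Y = 0), where R = 1 means
    that Y is observed. Y is never seen when R = 0.
    """
    return {
        "observed, Y=1": p * q1,
        "observed, Y=0": (1 - p) * q0,
        "missing": p * (1 - q1) + (1 - p) * (1 - q0),
    }

# Two hypothetical parameter settings with different values of p.
model_a = observed_distribution(F(1, 2), F(1, 2), F(1, 2))
model_b = observed_distribution(F(1, 3), F(3, 4), F(3, 8))

for cell in model_a:
    print(cell, model_a[cell], model_b[cell])
```

Both settings give cell probabilities 1/4, 1/4 and 1/2, even though p is 1/2 in one and 1/3 in the other. The toy model has three free parameters but only two free observed cell probabilities, which is why extra structure is needed before the model can be identified.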

Preprint, 2014, arXiv

Statistical Inference with Different Missing-data Mechanisms

When data are missing due to at most one cause from one time to the next, we can make sampling-distribution inferences about the parameter of the data by modeling the missing-data mechanism correctly. As is well known, if the mechanism is missing at random (MAR) it can be ignored, but if it is not missing at random (NMAR) it cannot. There are, however, no methods for analyzing data whose missingness can arise from several causes, even though such data are common in practice. The aim of this paper is therefore to propose how to make inferences from such data. Concretely, we extend the missing-data indicator from the usual binary random vectors to discrete random vectors, define a missing-data mechanism for every cause, and study the ignorability of mixtures of missing-data mechanisms such as "MAR & MAR" and "MAR & NMAR". In particular, when the combination of mechanisms is "MAR & NMAR", the MAR component generally cannot be ignored, but in special cases it can be.
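
The difference between an ignorable and a non-ignorable mechanism can be sketched with exact arithmetic for the usual single-cause case (this is not the multi-cause extension proposed in the paper). The joint distribution `P_XY` and the two observation-probability functions `mar` and `nmar` are hypothetical values chosen for illustration. Under MAR the conditional distribution of Y given X among observed cases matches the full-data one; under NMAR it does not.

```python
from fractions import Fraction as F

# Hypothetical joint distribution of a binary covariate X and binary response Y.
P_XY = {(0, 0): F(3, 10), (0, 1): F(1, 5), (1, 0): F(1, 10), (1, 1): F(2, 5)}

def always_observed(x, y):
    return F(1)

def mar(x, y):
    # Probability that Y is observed depends only on the fully observed X.
    return F(9, 10) if x == 0 else F(1, 2)

def nmar(x, y):
    # Probability that Y is observed depends on the value of Y itself.
    return F(9, 10) if y == 0 else F(1, 2)

def p_y1_given_x(x, observe_prob=always_observed):
    """P(Y = 1 | X = x) among the cases in which Y is observed."""
    w1 = P_XY[(x, 1)] * observe_prob(x, 1)
    w0 = P_XY[(x, 0)] * observe_prob(x, 0)
    return w1 / (w1 + w0)

full = [p_y1_given_x(x) for x in (0, 1)]
under_mar = [p_y1_given_x(x, mar) for x in (0, 1)]
under_nmar = [p_y1_given_x(x, nmar) for x in (0, 1)]

print(full, under_mar, under_nmar)
```

With these values the full-data and MAR results are both 2/5 and 4/5, while the NMAR results are 10/37 and 20/29, so an analysis of the observed cases that ignores the NMAR mechanism is biased.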

Preprint, 2013, arXiv

Full information maximum likelihood estimation in factor analysis with a lot of missing values

We consider the problem of full information maximum likelihood (FIML) estimation in a factor analysis model when a majority of the data values are missing. The expectation-maximization (EM) algorithm is often used to find the FIML estimates, in which the missing values on observed variables are included in complete data. However, the EM algorithm has an extremely high computational cost when the number of observations is large and/or many missing values are involved. In this paper, we propose a new algorithm that is based on the EM algorithm but that efficiently computes the FIML estimates. A significant improvement in the computational speed is realized by not treating the missing values on observed variables as a part of complete data. Our algorithm is applied to a real data set collected from a Web questionnaire that asks about first impressions of humans; almost $90\%$ of the data values are missing. When there are many missing data values, it is not clear whether the FIML procedure can achieve good estimation accuracy even if the number of observations is large. In order to investigate this, we conduct Monte Carlo simulations under a wide variety of sample sizes.
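
FIML evaluates the likelihood case by case, using only the variables that each case actually observed. The following is a minimal sketch of that idea for a bivariate normal model; it is neither a factor model nor the accelerated EM algorithm proposed in the paper, and the function names (`normal_logpdf`, `case_loglik`, `fiml_loglik`) and data values are assumptions made for illustration.

```python
import math

def normal_logpdf(x, mean, var):
    """Log density of a univariate normal distribution."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def case_loglik(case, mean, cov):
    """Log-likelihood contribution of one case; None marks a missing value."""
    x1, x2 = case
    (s11, s12), (_, s22) = cov
    if x1 is None and x2 is None:
        return 0.0  # nothing observed, so the case contributes nothing
    if x2 is None:
        return normal_logpdf(x1, mean[0], s11)  # marginal of variable 1
    if x1 is None:
        return normal_logpdf(x2, mean[1], s22)  # marginal of variable 2
    # Both observed: factor the joint density as f(x1) * f(x2 | x1).
    cond_mean = mean[1] + s12 / s11 * (x1 - mean[0])
    cond_var = s22 - s12 ** 2 / s11
    return (normal_logpdf(x1, mean[0], s11)
            + normal_logpdf(x2, cond_mean, cond_var))

def fiml_loglik(data, mean, cov):
    """Observed-data log-likelihood: the sum of casewise contributions."""
    return sum(case_loglik(case, mean, cov) for case in data)

# Hypothetical data with several missingness patterns.
data = [(0.5, 1.0), (None, 2.0), (-1.0, None), (None, None)]
mean = (0.0, 1.0)
cov = ((1.0, 0.5), (0.5, 2.0))
total = fiml_loglik(data, mean, cov)
print(total)
```

Maximizing this function over `mean` and `cov` would give FIML estimates for the toy model. Missing values are never imputed here; each case simply uses the marginal density of its observed part.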

Preprint, 2012, arXiv

Discovery of non-gaussian linear causal models using ICA

In recent years, several methods have been proposed for the discovery of causal structure from non-experimental data (Spirtes et al. 2000; Pearl 2000). Such methods make various assumptions on the data generating process to facilitate its identification from purely observational data. Continuing this line of research, we show how to discover the complete causal structure of continuous-valued data, under the assumptions that (a) the data generating process is linear, (b) there are no unobserved confounders, and (c) disturbance variables have non-gaussian distributions of non-zero variances. The solution relies on the use of the statistical method known as independent component analysis (ICA), and does not require any pre-specified time-ordering of the variables. We provide a complete Matlab package for performing this LiNGAM analysis (short for Linear Non-Gaussian Acyclic Model), and demonstrate the effectiveness of the method using artificially generated data.
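
The reason non-Gaussianity helps to fix the causal direction can be sketched with two variables. The code below is not the ICA-based LiNGAM procedure or the Matlab package described in the abstract; it is a simplified pairwise check in the same spirit, and the dependence score (correlation between squared residuals and squared regressor values) is an ad hoc choice made for this illustration. In the true direction the regression residual is independent of the regressor; in the reverse direction it is uncorrelated with the regressor but still dependent on it.

```python
import random

def mean(v):
    return sum(v) / len(v)

def corr(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def residual_dependence(cause, effect):
    """Regress effect on cause by least squares, then score how strongly the
    residual still depends on the regressor via |corr(residual^2, cause^2)|."""
    mc, me = mean(cause), mean(effect)
    slope = (sum((c - mc) * (e - me) for c, e in zip(cause, effect))
             / sum((c - mc) ** 2 for c in cause))
    resid = [(e - me) - slope * (c - mc) for c, e in zip(cause, effect)]
    return abs(corr([r * r for r in resid], [(c - mc) ** 2 for c in cause]))

rng = random.Random(0)
n = 5000
x = [rng.uniform(-1, 1) for _ in range(n)]   # non-Gaussian cause
y = [xi + rng.uniform(-1, 1) for xi in x]    # linear effect plus non-Gaussian noise

forward = residual_dependence(x, y)    # candidate direction x -> y
backward = residual_dependence(y, x)   # candidate direction y -> x
print(forward, backward)
```

The forward score should be close to zero and the backward score clearly larger, so the direction with the smaller score is the one consistent with the linear non-Gaussian model. With Gaussian disturbances both scores would be near zero and the direction could not be distinguished this way.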