Source author record

Naoki Hayashi

Naoki Hayashi appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning math.ST Statistics Theory

Catalog footprint

What is connected

3works

3topics

1close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2021arXiv

The Exact Asymptotic Form of Bayesian Generalization Error in Latent Dirichlet Allocation

Latent Dirichlet allocation (LDA) obtains essential information from data by using Bayesian inference. It is applied to knowledge discovery via dimension reducing and clustering in many fields. However, its generalization error had not been yet clarified since it is a singular statistical model where there is no one-to-one mapping from parameters to probability distributions. In this paper, we give the exact asymptotic form of its generalization error and marginal likelihood, by theoretical analysis of its learning coefficient using algebraic geometry. The theoretical result shows that the Bayesian generalization error in LDA is expressed in terms of that in matrix factorization and a penalty from the simplex restriction of LDA's parameter region. A numerical experiment is consistent to the theoretical result.

preprint2020arXiv

Asymptotic Bayesian Generalization Error in Latent Dirichlet Allocation and Stochastic Matrix Factorization

Latent Dirichlet allocation (LDA) is useful in document analysis, image processing, and many information systems; however, its generalization performance has been left unknown because it is a singular learning machine to which regular statistical theory can not be applied. Stochastic matrix factorization (SMF) is a restricted matrix factorization in which matrix factors are stochastic; the column of the matrix is in a simplex. SMF is being applied to image recognition and text mining. We can understand SMF as a statistical model by which a stochastic matrix of given data is represented by a product of two stochastic matrices, whose generalization performance has also been left unknown because of non-regularity. In this paper, by using an algebraic and geometric method, we show the analytic equivalence of LDA and SMF, both of which have the same real log canonical threshold (RLCT), resulting in that they asymptotically have the same Bayesian generalization error and the same log marginal likelihood. Moreover, we derive the upper bound of the RLCT and prove that it is smaller than the dimension of the parameter divided by two, hence the Bayesian generalization errors of them are smaller than those of regular statistical models.

preprint2020arXiv

Variational Approximation Error in Bayesian Non-negative Matrix Factorization

Non-negative matrix factorization (NMF) is a knowledge discovery method that is used in many fields. Variational inference and Gibbs sampling methods for it are also wellknown. However, the variational approximation error has not been clarified yet, because NMF is not statistically regular and the prior distribution used in variational Bayesian NMF (VBNMF) has zero or divergence points. In this paper, using algebraic geometrical methods, we theoretically analyze the difference in negative log evidence (a.k.a. free energy) between VBNMF and Bayesian NMF, i.e., the Kullback-Leibler divergence between the variational posterior and the true posterior. We derive an upper bound for the learning coefficient (a.k.a. the real log canonical threshold) in Bayesian NMF. By using the upper bound, we find a lower bound for the approximation error, asymptotically. The result quantitatively shows how well the VBNMF algorithm can approximate Bayesian NMF; the lower bound depends on the hyperparameters and the true nonnegative rank. A numerical experiment demonstrates the theoretical result.

Naoki Hayashi

What is connected

Connect this record

See the researcher in context

Building this map preview

3 published item(s)

The Exact Asymptotic Form of Bayesian Generalization Error in Latent Dirichlet Allocation

Asymptotic Bayesian Generalization Error in Latent Dirichlet Allocation and Stochastic Matrix Factorization

Variational Approximation Error in Bayesian Non-negative Matrix Factorization