Source author record

Yoshihiro Nagano

Yoshihiro Nagano appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Topics: cond-mat.stat-mech, cond-mat.str-el, Machine Learning

Catalog footprint

What is connected

2 works, 3 topics, 4 close collaborators

Preprint, 2022, arXiv

On the Surrogate Gap between Contrastive and Supervised Losses

Contrastive representation learning encourages data representations to place semantically similar pairs closer together than randomly drawn negative samples, and it has been successful in various domains such as vision, language, and graphs. Recent theoretical studies have attempted to explain the benefit of a large negative sample size by upper-bounding the downstream classification loss with the contrastive loss. However, the previous surrogate bounds have two drawbacks: they are only legitimate for a limited range of negative sample sizes, and they are prohibitively large even within that range. Due to these drawbacks, there is still no consensus on how negative sample size theoretically correlates with downstream classification performance. Following the simplified setting where positive pairs are drawn from the true distribution (not generated by data augmentation, as assumed in previous studies), this study establishes surrogate upper and lower bounds for the downstream classification loss for all negative sample sizes that best explain the empirical observations on the negative sample size in the earlier studies. Our bounds suggest that the contrastive loss can be viewed as a surrogate objective for the downstream loss and that larger negative sample sizes improve downstream classification because the surrogate gap between contrastive and supervised losses decays. We verify that our theory is consistent with experiments on synthetic, vision, and language datasets.
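To make the role of the negative sample size concrete, the following is a minimal sketch of a contrastive loss for a single anchor with K negatives. It assumes a common InfoNCE-style form with dot-product similarity; the abstract does not state the exact loss used in the paper, and the function name `contrastive_loss` and the synthetic inputs are illustrative only.

```python
import numpy as np

def contrastive_loss(anchor, positive, negatives):
    """InfoNCE-style loss for one anchor (assumed form, not taken from the paper).

    anchor, positive: arrays of shape (d,)
    negatives: array of shape (K, d), where K is the negative sample size
    """
    pos_score = anchor @ positive        # similarity to the positive sample
    neg_scores = negatives @ anchor      # similarities to the K negatives, shape (K,)
    logits = np.concatenate(([pos_score], neg_scores))
    # Numerically stable log-sum-exp over the positive and all negatives.
    m = logits.max()
    log_sum_exp = m + np.log(np.exp(logits - m).sum())
    # Equals -log softmax probability assigned to the positive sample.
    return log_sum_exp - pos_score

# Hypothetical synthetic data: the positive is a noisy copy of the anchor.
rng = np.random.default_rng(0)
d = 8
anchor = rng.normal(size=d)
positive = anchor + 0.1 * rng.normal(size=d)
for K in (1, 8, 64):
    negatives = rng.normal(size=(K, d))
    print(K, contrastive_loss(anchor, positive, negatives))
```

The loss is zero when there are no negatives and equals log(K + 1) when all similarities are equal, which is why bounds relating it to a supervised loss must account for K explicitly.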

Preprint, 2019, arXiv

Monte Carlo study of the critical properties of noncollinear Heisenberg magnets: $O(3)\times O(2)$ universality class

The critical properties of the antiferromagnetic Heisenberg model on the three-dimensional stacked-triangular lattice are studied by means of a large-scale Monte Carlo simulation to gain insight into the controversial issue of the criticality of the noncollinear magnets with the $O(3)\times O(2)$ symmetry. The maximum size studied is $384^3$, considerably larger than the sizes studied in previous numerical works on the model. The availability of such large-size data enables us to examine the detailed critical properties, including the effect of corrections to the leading scaling. Strong numerical evidence of the continuous nature of the transition is obtained. Our data indicate the existence of significant corrections to the leading scaling. Careful analysis that takes the possible corrections into account yields the critical-exponent estimates $α=0.44(3)$, $β=0.26(2)$, $γ=1.03(5)$, $ν=0.52(1)$, $η=0.02(5)$, and the chirality exponents $β_κ=0.40(3)$ and $γ_κ=0.77(6)$, supporting the existence of the $O(3)$ chiral (or $O(3)\times O(2)$) universality class governed by a new 'chiral' fixed point. We also obtain an indication that the underlying fixed point is of the focus type, characterized by the complex-valued correction-to-scaling exponent $ω=0.1^{+0.4}_{-0.05} + i\ 0.7^{+0.1}_{-0.4}$. The focus-like nature of the chiral fixed point, accompanied by the spiral-like renormalization-group (RG) flow, is likely to be the origin of the apparently complicated critical behavior. The results are compared and discussed in conjunction with the results of other numerical simulations, several distinct types of RG calculations including the higher-order perturbative massive and massless RG calculations and the nonperturbative functional RG calculation, and the conformal-bootstrap program.
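As a quick sanity check, the quoted central values can be tested against the standard scaling relations (Rushbrooke, hyperscaling, and Fisher) in d = 3. This is only an illustrative sketch: it ignores the quoted uncertainties, and the abstract does not say which exponents were measured independently, so some of the agreement may hold by construction.

```python
# Central values quoted in the abstract (uncertainties ignored here).
alpha, beta, gamma, nu, eta = 0.44, 0.26, 1.03, 0.52, 0.02
d = 3  # spatial dimension of the stacked-triangular lattice

rushbrooke = alpha + 2 * beta + gamma   # scaling relation predicts 2
hyperscaling = 2 - d * nu               # scaling relation predicts alpha
fisher = nu * (2 - eta)                 # scaling relation predicts gamma

print(f"alpha + 2*beta + gamma = {rushbrooke:.2f} (expected 2)")
print(f"2 - d*nu               = {hyperscaling:.2f} (quoted alpha = {alpha})")
print(f"nu * (2 - eta)         = {fisher:.2f} (quoted gamma = {gamma})")
```

With these inputs the three combinations evaluate to about 1.99, 0.44, and 1.03, all well within the quoted error bars.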