Source author record

Abdulkadir Canatar

Abdulkadir Canatar appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning cond-mat.dis-nn cond-mat.mes-hall

Catalog footprint

What is connected

4works

3topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

Out-of-Distribution Generalization in Kernel Regression

In real word applications, data generating process for training a machine learning model often differs from what the model encounters in the test stage. Understanding how and whether machine learning models generalize under such distributional shifts have been a theoretical challenge. Here, we study generalization in kernel regression when the training and test distributions are different using methods from statistical physics. Using the replica method, we derive an analytical formula for the out-of-distribution generalization error applicable to any kernel and real datasets. We identify an overlap matrix that quantifies the mismatch between distributions for a given kernel as a key determinant of generalization performance under distribution shift. Using our analytical expressions we elucidate various generalization phenomena including possible improvement in generalization when there is a mismatch. We develop procedures for optimizing training and test distributions for a given data budget to find best and worst case generalizations under the shift. We present applications of our theory to real and synthetic datasets and for many kernels. We compare results of our theory applied to Neural Tangent Kernel with simulations of wide networks and show agreement. We analyze linear regression in further depth.

preprint2022arXiv

Spectral Bias and Task-Model Alignment Explain Generalization in Kernel Regression and Infinitely Wide Neural Networks

Generalization beyond a training dataset is a main goal of machine learning, but theoretical understanding of generalization remains an open problem for many models. The need for a new theory is exacerbated by recent observations in deep neural networks where overparameterization leads to better performance, contradicting the conventional wisdom from classical statistics. In this paper, we investigate generalization error for kernel regression, which, besides being a popular machine learning method, also includes infinitely overparameterized neural networks trained with gradient descent. We use techniques from statistical mechanics to derive an analytical expression for generalization error applicable to any kernel or data distribution. We present applications of our theory to real and synthetic datasets, and for many kernels including those that arise from training deep neural networks in the infinite-width limit. We elucidate an inductive bias of kernel regression to explain data with "simple functions", which are identified by solving a kernel eigenfunction problem on the data distribution. This notion of simplicity allows us to characterize whether a kernel is compatible with a learning task, facilitating good generalization performance from a small number of training examples. We show that more data may impair generalization when noisy or not expressible by the kernel, leading to non-monotonic learning curves with possibly many peaks. To further understand these phenomena, we turn to the broad class of rotation invariant kernels, which is relevant to training deep neural networks in the infinite-width limit, and present a detailed mathematical analysis of them when data is drawn from a spherically symmetric distribution and the number of input dimensions is large.

preprint2021arXiv

Spectrum Dependent Learning Curves in Kernel Regression and Wide Neural Networks

We derive analytical expressions for the generalization performance of kernel regression as a function of the number of training samples using theoretical methods from Gaussian processes and statistical physics. Our expressions apply to wide neural networks due to an equivalence between training them and kernel regression with the Neural Tangent Kernel (NTK). By computing the decomposition of the total generalization error due to different spectral components of the kernel, we identify a new spectral principle: as the size of the training set grows, kernel machines and neural networks fit successively higher spectral modes of the target function. When data are sampled from a uniform distribution on a high-dimensional hypersphere, dot product kernels, including NTK, exhibit learning stages where different frequency modes of the target function are learned. We verify our theory with simulations on synthetic data and MNIST dataset.

preprint2021arXiv

Strong localization in a suspended monolayer graphene by intervalley scattering

A gate induced insulating behavior at zero magnetic field is observed in a high mobility suspended monolayer graphene near the charge neutrality point. The graphene device initially cleaned by a current annealing technique was undergone a thermo-pressure cycle to allow short range impurities to be adsorbed directly on the ultra clean graphene surface. The adsorption process generated a strong temperature and electric field dependent behavior on the conductance of the graphene device. The conductance around the neutrality point is observed to be reduced from around $e^2/h$ at 30 K to $\sim0.01~e^2/h$ at 20 mK. A direct transition from insulator to quantum Hall conductor within $\approx0.4~T$ accompanied by broken-symmetry-induced $ν=0,\pm1$ plateaux confirms the presence of intervalley scatterers.

Abdulkadir Canatar

What is connected

Connect this record

See the researcher in context

Building this map preview

4 published item(s)

Out-of-Distribution Generalization in Kernel Regression

Spectral Bias and Task-Model Alignment Explain Generalization in Kernel Regression and Infinitely Wide Neural Networks

Spectrum Dependent Learning Curves in Kernel Regression and Wide Neural Networks

Strong localization in a suspended monolayer graphene by intervalley scattering