Source author record

Konstantinos Pitas

Konstantinos Pitas appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning Artificial Intelligence

Catalog footprint

What is connected

3works

2topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

Cold Posteriors through PAC-Bayes

We investigate the cold posterior effect through the lens of PAC-Bayes generalization bounds. We argue that in the non-asymptotic setting, when the number of training samples is (relatively) small, discussions of the cold posterior effect should take into account that approximate Bayesian inference does not readily provide guarantees of performance on out-of-sample data. Instead, out-of-sample error is better described through a generalization bound. In this context, we explore the connections between the ELBO objective from variational inference and the PAC-Bayes objectives. We note that, while the ELBO and PAC-Bayes objectives are similar, the latter objectives naturally contain a temperature parameter $λ$ which is not restricted to be $λ=1$. For both regression and classification tasks, in the case of isotropic Laplace approximations to the posterior, we show how this PAC-Bayesian interpretation of the temperature parameter captures the cold posterior effect.

preprint2020arXiv

Dissecting Non-Vacuous Generalization Bounds based on the Mean-Field Approximation

Explaining how overparametrized neural networks simultaneously achieve low risk and zero empirical risk on benchmark datasets is an open problem. PAC-Bayes bounds optimized using variational inference (VI) have been recently proposed as a promising direction in obtaining non-vacuous bounds. We show empirically that this approach gives negligible gains when modeling the posterior as a Gaussian with diagonal covariance--known as the mean-field approximation. We investigate common explanations, such as the failure of VI due to problems in optimization or choosing a suboptimal prior. Our results suggest that investigating richer posteriors is the most promising direction forward.

preprint2020arXiv

The role of invariance in spectral complexity-based generalization bounds

Deep convolutional neural networks (CNNs) have been shown to be able to fit a random labeling over data while still being able to generalize well for normal labels. Describing CNN capacity through a posteriori measures of complexity has been recently proposed to tackle this apparent paradox. These complexity measures are usually validated by showing that they correlate empirically with GE; being empirically larger for networks trained on random vs normal labels. Focusing on the case of spectral complexity we investigate theoretically and empirically the insensitivity of the complexity measure to invariances relevant to CNNs, and show several limitations of spectral complexity that occur as a result. For a specific formulation of spectral complexity we show that it results in the same upper bound complexity estimates for convolutional and locally connected architectures (which don't have the same favorable invariance properties). This is contrary to common intuition and empirical results.

Konstantinos Pitas

What is connected

Connect this record

See the researcher in context

Building this map preview

3 published item(s)

Cold Posteriors through PAC-Bayes

Dissecting Non-Vacuous Generalization Bounds based on the Mean-Field Approximation

The role of invariance in spectral complexity-based generalization bounds