Researcher profile

Johan Jonasson

Johan Jonasson contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 19 - UnverifiedVerification L1Unclaimed author
5works
0followers
7topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

5 published item(s)

preprint2022arXiv

Robust Neural Network Classification via Double Regularization

The presence of mislabeled observations in data is a notoriously challenging problem in statistics and machine learning, associated with poor generalization properties for both traditional classifiers and, perhaps even more so, flexible classifiers like neural networks. Here we propose a novel double regularization of the neural network training loss that combines a penalty on the complexity of the classification model and an optimal reweighting of training observations. The combined penalties result in improved generalization properties and strong robustness against overfitting in different settings of mislabeled training data and also against variation in initial parameter values when training. We provide a theoretical justification for our proposed method derived for a simple case of logistic regression. We demonstrate the double regularization model, here denoted by DRFit, for neural net classification of (i) MNIST and (ii) CIFAR-10, in both cases with simulated mislabeling. We also illustrate that DRFit identifies mislabeled data points with very good precision. This provides strong support for DRFit as a practical of-the-shelf classifier, since, without any sacrifice in performance, we get a classifier that simultaneously reduces overfitting against mislabeling and gives an accurate measure of the trustworthiness of the labels.

preprint2021arXiv

Rapid mixing in unimodal landscapes and efficient simulatedannealing for multimodal distributions

We consider nearest neighbor weighted random walks on the $d$-dimensional box $[n]^d$ that are governed by some function $g:[0,1] \ra [0,\iy)$, by which we mean that standing at $x$, a neighbor $y$ of $x$ is picked at random and the walk then moves there with probability $(1/2)g(n^{-1}y)/(g(n^{-1}y)+g(n^{-1}x))$. We do this for $g$ of the form $f^{m_n}$ for some function $f$ which assumed to be analytically well-behaved and where $m_n \ra \iy$ as $n \ra \iy$. This class of walks covers an abundance of interesting special cases, e.g., the mean-field Potts model, posterior collapsed Gibbs sampling for Latent Dirichlet allocation and certain Bayesian posteriors for models in nuclear physics. The following are among the results of this paper: \begin{itemize} \item If $f$ is unimodal with negative definite Hessian at its global maximum, then the mixing time of the random walk is $O(n\log n)$. \item If $f$ is multimodal, then the mixing time is exponential in $n$, but we show that there is a simulated annealing scheme governed by $f^K$ for an increasing sequence of $K$ that mixes in time $O(n^2)$. Using a varying step size that decreases with $K$, this can be taken down to $O(n\log n)$. \item If the process is studied on a general graph rather than the $d$-dimensional box, a simulated annealing scheme expressed in terms of conductances of the underlying network, works similarly. \end{itemize} Several examples are given, including the ones mentioned above.

preprint2020arXiv

Leave-One-Out Cross-Validation for Bayesian Model Comparison in Large Data

Recently, new methods for model assessment, based on subsampling and posterior approximations, have been proposed for scaling leave-one-out cross-validation (LOO) to large datasets. Although these methods work well for estimating predictive performance for individual models, they are less powerful in model comparison. We propose an efficient method for estimating differences in predictive performance by combining fast approximate LOO surrogates with exact LOO subsampling using the difference estimator and supply proofs with regards to scaling characteristics. The resulting approach can be orders of magnitude more efficient than previous approaches, as well as being better suited to model comparison.

preprint2019arXiv

Bayesian leave-one-out cross-validation for large data

Model inference, such as model comparison, model checking, and model selection, is an important part of model development. Leave-one-out cross-validation (LOO) is a general approach for assessing the generalizability of a model, but unfortunately, LOO does not scale well to large datasets. We propose a combination of using approximate inference techniques and probability-proportional-to-size-sampling (PPS) for fast LOO model evaluation for large datasets. We provide both theoretical and empirical results showing good properties for large data.