Source author record

Johan Jonasson

Johan Jonasson appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

math.PR, Machine Learning, Applications, Discrete Mathematics, math.ST, Methodology, Statistics Theory

Catalog footprint

What is connected

10 works

7 topics

4 close collaborators

preprint · 2022 · arXiv

Robust Neural Network Classification via Double Regularization

The presence of mislabeled observations in data is a notoriously challenging problem in statistics and machine learning, associated with poor generalization properties for both traditional classifiers and, perhaps even more so, flexible classifiers like neural networks. Here we propose a novel double regularization of the neural network training loss that combines a penalty on the complexity of the classification model and an optimal reweighting of training observations. The combined penalties result in improved generalization properties and strong robustness against overfitting in different settings of mislabeled training data and also against variation in initial parameter values when training. We provide a theoretical justification for our proposed method derived for a simple case of logistic regression. We demonstrate the double regularization model, here denoted by DRFit, for neural net classification of (i) MNIST and (ii) CIFAR-10, in both cases with simulated mislabeling. We also illustrate that DRFit identifies mislabeled data points with very good precision. This provides strong support for DRFit as a practical off-the-shelf classifier, since, without any sacrifice in performance, we get a classifier that simultaneously reduces overfitting against mislabeling and gives an accurate measure of the trustworthiness of the labels.

preprint · 2021 · arXiv

Rapid mixing in unimodal landscapes and efficient simulated annealing for multimodal distributions

We consider nearest neighbor weighted random walks on the $d$-dimensional box $[n]^d$ that are governed by some function $g:[0,1] \to [0,\infty)$, by which we mean that standing at $x$, a neighbor $y$ of $x$ is picked at random and the walk then moves there with probability $(1/2)g(n^{-1}y)/(g(n^{-1}y)+g(n^{-1}x))$. We do this for $g$ of the form $f^{m_n}$ for some function $f$ which is assumed to be analytically well-behaved and where $m_n \to \infty$ as $n \to \infty$. This class of walks covers an abundance of interesting special cases, e.g., the mean-field Potts model, posterior collapsed Gibbs sampling for Latent Dirichlet allocation and certain Bayesian posteriors for models in nuclear physics. The following are among the results of this paper:

- If $f$ is unimodal with negative definite Hessian at its global maximum, then the mixing time of the random walk is $O(n\log n)$.
- If $f$ is multimodal, then the mixing time is exponential in $n$, but we show that there is a simulated annealing scheme governed by $f^K$ for an increasing sequence of $K$ that mixes in time $O(n^2)$. Using a varying step size that decreases with $K$, this can be taken down to $O(n\log n)$.
- If the process is studied on a general graph rather than the $d$-dimensional box, a simulated annealing scheme expressed in terms of conductances of the underlying network works similarly.

Several examples are given, including the ones mentioned above.
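The transition rule stated in this abstract can be sketched in Python. This is a minimal illustration and not the authors' code: the names `walk_step` and `make_g`, the uniform choice among in-box neighbors at the boundary, the reading of $g$ as a function of the rescaled point $n^{-1}x$, and the example unimodal $f$ are all assumptions made here.

```python
import math
import random

def walk_step(x, n, g, rng):
    """One step of a nearest-neighbor weighted walk on the box {1..n}^d.

    Assumption: the neighbor is drawn uniformly among in-box neighbors.
    """
    neighbors = []
    for i in range(len(x)):
        for delta in (-1, 1):
            if 1 <= x[i] + delta <= n:
                y = list(x)
                y[i] += delta
                neighbors.append(tuple(y))
    y = rng.choice(neighbors)
    gx = g(tuple(c / n for c in x))
    gy = g(tuple(c / n for c in y))
    # Move with probability (1/2) g(y/n) / (g(y/n) + g(x/n)); otherwise stay.
    p_move = 0.5 * gy / (gy + gx) if gx + gy > 0 else 0.0
    return y if rng.random() < p_move else x

def make_g(m):
    """Hypothetical example: g = f**m for a unimodal f peaked at the centre."""
    def g(u):
        f = math.exp(-sum((c - 0.5) ** 2 for c in u))
        return f ** m
    return g

# Usage: run the walk for a while from a corner of a 20 x 20 box.
rng = random.Random(0)
state = (1, 1)
g = make_g(50)
for _ in range(1000):
    state = walk_step(state, 20, g, rng)
```

Because the move probability is at most 1/2, the walk is lazy, which avoids periodicity; larger `m` concentrates the walk more sharply around the maximum of `f`.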

preprint · 2020 · arXiv

Leave-One-Out Cross-Validation for Bayesian Model Comparison in Large Data

Recently, new methods for model assessment, based on subsampling and posterior approximations, have been proposed for scaling leave-one-out cross-validation (LOO) to large datasets. Although these methods work well for estimating predictive performance for individual models, they are less powerful in model comparison. We propose an efficient method for estimating differences in predictive performance by combining fast approximate LOO surrogates with exact LOO subsampling using the difference estimator, and we supply proofs with regard to scaling characteristics. The resulting approach can be orders of magnitude more efficient than previous approaches, as well as being better suited to model comparison.
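The idea of combining cheap surrogates with an exact subsample can be illustrated with the textbook difference estimator from survey sampling. The sketch below is an assumption-laden illustration, not the paper's implementation: it uses simple random sampling without replacement, and `difference_estimate`, `surrogate` and `exact_fn` are hypothetical names. The sum of the surrogates over all observations is corrected by the scaled sum of (exact minus surrogate) over the subsample.

```python
import random

def difference_estimate(surrogate, exact_fn, m, rng):
    """Estimate sum_i exact_fn(i) from surrogates and a subsample of size m.

    surrogate: list of cheap approximations, one per observation.
    exact_fn:  expensive function giving the exact value for index i.
    """
    n = len(surrogate)
    sample = rng.sample(range(n), m)  # simple random sample, no replacement
    correction = sum(exact_fn(i) - surrogate[i] for i in sample)
    return sum(surrogate) + (n / m) * correction

# Usage with made-up numbers: surrogates are off by small, varying errors.
exact = [float(i) for i in range(100)]
approx = [v + 0.1 * ((i % 3) - 1) for i, v in enumerate(exact)]
estimate = difference_estimate(approx, lambda i: exact[i], 10, random.Random(0))
```

The better the surrogates track the exact values, the smaller the variance of the correction term, which is why only a small exact subsample is needed.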

preprint · 2020 · arXiv

The existence phase transition for two Poisson random fractal models

In this paper we study the existence phase transition of the random fractal ball model and the random fractal box model. We show that both of these are in the empty phase at the critical point of this phase transition.

preprint · 2019 · arXiv

Bayesian leave-one-out cross-validation for large data

Model inference, such as model comparison, model checking, and model selection, is an important part of model development. Leave-one-out cross-validation (LOO) is a general approach for assessing the generalizability of a model, but unfortunately, LOO does not scale well to large datasets. We propose a combination of approximate inference techniques and probability-proportional-to-size (PPS) sampling for fast LOO model evaluation for large datasets. We provide both theoretical and empirical results showing good properties for large data.

preprint · 2015 · arXiv

Card-cyclic-to-random shuffling with relabeling

The card-cyclic-to-random shuffle is the card shuffle where the $n$ cards are labeled $1,\ldots,n$ according to their starting positions. Then the cards are mixed by first picking card $1$ from the deck and reinserting it at a uniformly random position, then repeating for card $2$, then for card $3$ and so on until all cards have been reinserted in this way. Then the procedure starts over again, by first picking the card with label $1$ and reinserting, and so on. Morris, Ning and Peres recently showed that the order of the number of shuffles needed to mix the deck in this way is $n\log n$. In the present paper, we consider a variant of this shuffle with relabeling, i.e., a shuffle that differs from the above in that after one round, i.e., after all cards have been reinserted once, we relabel the cards according to the positions in the deck that they now have. The relabeling is then repeated after each round of shuffling. It is shown that even in this case, the correct order of mixing is $n\log n$.
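The two variants of the shuffle described here can be simulated with a short Python sketch. The function name `cyclic_to_random` and its parameters are assumptions made for illustration. Relabeling is modeled by re-reading the processing order from the current deck at the start of each round, whereas the original shuffle keeps the initial order throughout.

```python
import random

def cyclic_to_random(n, rounds, relabel, rng):
    """Simulate card-cyclic-to-random shuffling, optionally with relabeling.

    The deck is a list from top to bottom; cards start as 1..n in order.
    """
    deck = list(range(1, n + 1))
    order = list(deck)  # order[k] is the card carrying label k + 1
    for _ in range(rounds):
        if relabel:
            # Relabel cards by their positions at the start of this round.
            order = list(deck)
        for card in order:
            deck.remove(card)
            # n possible slots once the card is out of the deck.
            deck.insert(rng.randrange(n), card)
    return deck

# Usage: three rounds on a 10-card deck, with and without relabeling.
rng = random.Random(0)
plain = cyclic_to_random(10, 3, False, rng)
relabeled = cyclic_to_random(10, 3, True, rng)
```

One round consists of $n$ reinsertions, so a mixing time of order $n\log n$ shuffles corresponds to order $\log n$ rounds.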

preprint · 2015 · arXiv

Stationary random graphs on $\mathbb{Z}$ with prescribed iid degrees and finite mean connections

Let $F$ be a probability distribution with support on the non-negative integers. A model is proposed for generating stationary simple graphs on $\mathbb{Z}$ with degree distribution $F$ and it is shown for this model that the expected total length of all edges at a given vertex is finite if $F$ has finite second moment. It is not hard to see that any stationary model for generating simple graphs on $\mathbb{Z}$ will give infinite mean for the total edge length per vertex if $F$ does not have finite second moment. Hence, finite second moment of $F$ is a necessary and sufficient condition for the existence of a model with finite mean total edge length.

preprint · 2015 · arXiv

The spectrum and convergence rates of exclusion and interchange processes on the complete graph

We give a short and completely elementary method to find the full spectrum of the exclusion process and a nicely limited superset of the spectrum of the interchange process (a.k.a. random transpositions) on the complete graph. In the case of the exclusion process, this gives a simple closed form expression for all the eigenvalues and their multiplicities. This result is then used to give an exact expression for the distance in $L^2$ from stationarity at any time and upper and lower bounds on the convergence rate for the exclusion process. In the case of the interchange process, upper and lower bounds are similarly found. Our results strengthen or reprove all known results of the mixing time for the two processes in a very simple way.

preprint · 2012 · arXiv

Mixing times for the interchange process

Consider the interchange process on a connected graph $G=(V,E)$ on $n$ vertices. That is, shuffle a deck of cards by first placing one card at each vertex of $G$ in a fixed order and then, at each tick of the clock, picking an edge uniformly at random and switching the two cards at the end vertices of the edge with probability 1/2. Well known special cases are the random transpositions shuffle, where $G$ is the complete graph, and the transposing neighbors shuffle, where $G$ is the $n$-path. Other cases that have been studied are the $d$-dimensional grid, the hypercube, lollipop graphs and Erdős-Rényi random graphs above the threshold for connectedness. In this paper the problem is studied for general $G$. Special attention is focused on trees, random trees and the giant component of critical and supercritical $G(N,p)$ random graphs. Upper and lower bounds on the mixing time are given. In many of the cases, we establish the exact order of the mixing time. We also mention the cases when $G$ is the hypercube and when $G$ is a bounded-degree expander, giving upper and lower bounds on the mixing time.
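The dynamics described in this abstract are simple to simulate. The following Python sketch is illustrative only: the helper names `interchange_step`, `path_edges`, `complete_edges` and `run_interchange` are hypothetical, and vertices are assumed to be numbered from 0 to n - 1.

```python
import random

def interchange_step(cards, edges, rng):
    """One tick: pick a uniform edge, swap its end cards with probability 1/2.

    cards[v] is the card currently sitting at vertex v.
    """
    u, v = rng.choice(edges)
    if rng.random() < 0.5:
        cards[u], cards[v] = cards[v], cards[u]

def path_edges(n):
    """Edges of the n-path (transposing neighbors shuffle)."""
    return [(i, i + 1) for i in range(n - 1)]

def complete_edges(n):
    """Edges of the complete graph (random transpositions shuffle)."""
    return [(i, j) for i in range(n) for j in range(i + 1, n)]

def run_interchange(n, edges, steps, rng):
    cards = list(range(n))  # card i starts at vertex i
    for _ in range(steps):
        interchange_step(cards, edges, rng)
    return cards

# Usage: the same number of ticks on the path and on the complete graph.
rng = random.Random(0)
on_path = run_interchange(8, path_edges(8), 200, rng)
on_complete = run_interchange(8, complete_edges(8), 200, rng)
```

Swapping only with probability 1/2 makes the chain lazy, so it is aperiodic on any connected graph.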

preprint · 2006 · arXiv

Uniqueness and non-uniqueness in percolation theory

This paper is an up-to-date introduction to the problem of uniqueness versus non-uniqueness of infinite clusters for percolation on ${\mathbb{Z}}^d$ and, more generally, on transitive graphs. For iid percolation on ${\mathbb{Z}}^d$, uniqueness of the infinite cluster is a classical result, while on certain other transitive graphs uniqueness may fail. Key properties of the graphs in this context turn out to be amenability and nonamenability. The same problem is considered for certain dependent percolation models -- most prominently the Fortuin--Kasteleyn random-cluster model -- and in situations where the standard connectivity notion is replaced by entanglement or rigidity. So-called simultaneous uniqueness in couplings of percolation processes is also considered. Some of the main results are proved in detail, while for others the proofs are merely sketched, and for yet others they are omitted. Several open problems are discussed.
