Researcher profile

Johan Karlsson

Johan Karlsson contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
8 works
0 followers
9 topics
4 close collaborators

Published work

8 published items

Preprint · 2022 · arXiv

Orthogonalization of data via Gromov-Wasserstein type feedback for clustering and visualization

In this paper we propose an adaptive approach for clustering and visualization of data by an orthogonalization process. Starting with the data points represented by a Markov process using the diffusion map framework, the method adaptively increases the orthogonality of the clusters by applying a feedback mechanism inspired by the Gromov-Wasserstein distance. This mechanism iteratively increases the spectral gap and refines the orthogonality of the data to achieve a clustering with high specificity. By using the diffusion map framework and representing the relation between data points using transition probabilities, the method is robust with respect to the underlying distance, noise in the data, and random initialization. We prove that the method converges globally to a unique fixed point for certain parameter values. We also propose a related approach where the transition probabilities in the Markov process are required to be doubly stochastic, in which case the method generates a minimizer of a nonconvex optimization problem. We apply the method to cryo-electron microscopy image data from biopharmaceutical manufacturing, where we can confirm biologically relevant insights related to therapeutic efficacy. We consider an example with morphological variations of gene packaging and confirm that the method produces biologically meaningful clustering results consistent with human expert classification.
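
The starting point of this abstract, representing data points by a Markov process with transition probabilities, can be illustrated with a minimal sketch. The Gaussian kernel, the bandwidth `epsilon`, the function name `transition_matrix`, and the toy points are all assumptions made for illustration; the Gromov-Wasserstein type feedback step of the paper is not reproduced here.

```python
import math

def transition_matrix(points, epsilon=1.0):
    """Build a row-stochastic Markov matrix from a Gaussian kernel.

    Hypothetical helper: the kernel choice and bandwidth are assumptions,
    not taken from the paper.
    """
    n = len(points)
    kernel = [
        [
            math.exp(-sum((a - b) ** 2 for a, b in zip(points[i], points[j])) / epsilon)
            for j in range(n)
        ]
        for i in range(n)
    ]
    # Normalize each row so that it sums to one (transition probabilities).
    return [[value / sum(row) for value in row] for row in kernel]

# Two well-separated pairs of points (toy data).
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
P = transition_matrix(points)
```

In this toy example, transitions within a pair are far more likely than transitions between pairs, which is the cluster structure a diffusion-based method would build on.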

Preprint · 2022 · arXiv

VidHarm: A Clip Based Dataset for Harmful Content Detection

Automatically identifying harmful content in video is an important task with a wide range of applications. However, there is a lack of professionally labeled open datasets. In this work VidHarm, an open dataset of 3589 video clips from film trailers annotated by professionals, is presented. An analysis of the dataset is performed, revealing among other things the relation between clip-level and trailer-level annotations. Audiovisual models are trained on the dataset and an in-depth study of modeling choices is conducted. The results show that performance is greatly improved by combining the visual and audio modalities, pre-training on large-scale video recognition datasets, and class-balanced sampling. Lastly, biases of the trained models are investigated using discrimination probing. VidHarm is openly available, and further details are available at: https://vidharm.github.io

Preprint · 2020 · arXiv

Incremental inference of collective graphical models

We consider incremental inference problems from aggregate data for collective dynamics. In particular, we address the problem of estimating the aggregate marginals of a Markov chain from noisy aggregate observations in an incremental (online) fashion. We propose a sliding window Sinkhorn belief propagation (SW-SBP) algorithm that utilizes a sliding window filter of the most recent noisy aggregate observations along with encoded information from discarded observations. Our algorithm is built upon the recently proposed multi-marginal optimal transport based SBP algorithm, which leverages standard belief propagation and the Sinkhorn algorithm to solve inference problems from aggregate data. We demonstrate the performance of our algorithm on applications such as inferring population flow from aggregate observations.

Preprint · 2020 · arXiv

M$^2$-Spectral Estimation: A Relative Entropy Approach

This paper deals with M$^2$-signals, namely multivariate (or vector-valued) signals defined over a multidimensional domain. In particular, we propose an optimization technique to solve the covariance extension problem for stationary random vector fields. The multidimensional Itakura-Saito distance is employed as an optimization criterion to select the solution among the spectra satisfying a finite number of moment constraints. In order to avoid technicalities that may happen on the boundary of the feasible set, we deal with the discrete version of the problem where the multidimensional integrals are approximated by Riemann sums. The spectrum solution is also discrete, which occurs naturally when the underlying random field is periodic. We show that a solution to the discrete problem exists, is unique and depends smoothly on the problem data. Therefore, we have a well-posed problem whose solution can be tuned in a smooth manner. Finally, we have applied our theory to the target parameter estimation problem in an integrated system of automotive modules. Simulation results show that our spectral estimator has promising performance.

Preprint · 2020 · arXiv

Modeling collective behaviors: A moment-based approach

In this work we introduce an approach for modeling and analyzing the collective behavior of a group of agents using moments. We represent the group of agents via their distribution and derive a method to estimate the dynamics of the moments. We use this to predict the evolution of the distribution of agents by first computing the moment trajectories and then using these to reconstruct the distribution of the agents. In the latter step an inverse problem is solved in order to reconstruct a nominal distribution and to recover the macro-scale properties of the group of agents. The proposed method is applicable to several types of multi-agent systems, e.g., leader-follower systems. We derive error bounds for the moment trajectories and describe how to take these error bounds into account when computing the moment dynamics. The convergence of the moment dynamics is also analyzed for cases with monomial moments. To illustrate the theory, two numerical examples are given. In the first we consider a multi-agent system with interactions and compare the proposed methods for several types of moments. In the second example we apply the framework to a leader-follower problem for modeling pedestrian crowd dynamics.

Preprint · 2020 · arXiv

Multi-marginal optimal transport and probabilistic graphical models

We study multi-marginal optimal transport problems from a probabilistic graphical model perspective. We point out an elegant connection between the two when the underlying cost for optimal transport allows a graph structure. In particular, an entropy regularized multi-marginal optimal transport problem is equivalent to a Bayesian marginal inference problem for probabilistic graphical models with the additional requirement that some of the marginal distributions are specified. This relation on the one hand extends both optimal transport theory and probabilistic graphical model theory, and on the other hand leads to fast algorithms for multi-marginal optimal transport by leveraging the well-developed algorithms in Bayesian inference. Several numerical examples are provided to highlight the results.
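
As a rough illustration of entropy regularized optimal transport with prescribed marginals, the sketch below runs standard Sinkhorn iterations for the two-marginal special case only. It is not the graph-structured multi-marginal algorithm of the paper; the function name `sinkhorn`, the regularization value `reg`, the iteration count, and the toy cost matrix are assumptions chosen for illustration.

```python
import math

def sinkhorn(cost, mu, nu, reg=1.0, iterations=200):
    """Two-marginal entropy regularized optimal transport via Sinkhorn scaling.

    Hypothetical helper: returns a transport plan whose row sums approximate
    mu and whose column sums match nu.
    """
    n, m = len(mu), len(nu)
    # Gibbs kernel derived from the cost matrix.
    K = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iterations):
        # Alternately rescale rows and columns to match the marginals.
        u = [mu[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [nu[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

cost = [[0.0, 1.0], [1.0, 0.0]]
mu = [0.5, 0.5]
nu = [0.3, 0.7]
plan = sinkhorn(cost, mu, nu)
```

Each iteration only rescales rows and columns of a fixed kernel, which is the kind of structure that makes message-passing style algorithms a natural fit in the multi-marginal setting.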

Preprint · 2020 · arXiv

Smart Resource Management for Data Streaming using an Online Bin-packing Strategy

Data stream processing frameworks provide reliable and efficient mechanisms for executing complex workflows over large datasets. A common challenge for the majority of currently available streaming frameworks is efficient utilization of resources. Most frameworks use static or semi-static settings for resource utilization that work well for established use cases but lead to only marginal improvements for unseen scenarios. Another pressing issue is the efficient processing of large individual objects such as images and matrices typical for scientific datasets. HarmonicIO has proven to be a good solution for streams of relatively large individual objects, as demonstrated in a benchmark comparison with the Spark and Kafka streaming frameworks. We here present an extension of the HarmonicIO framework based on the online bin-packing algorithm, to allow for efficient utilization of resources. Based on a real-world use case from large-scale microscopy pipelines, we compare results of the new system to Spark's auto-scaling mechanism.
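
The abstract does not say which online bin-packing heuristic is used. As a minimal sketch, assuming the common First-Fit rule, each incoming task is placed on the first worker with enough free capacity, and a new worker is opened only when none fits. The function name `first_fit`, the integer capacity units, and the task sizes are hypothetical and do not come from HarmonicIO.

```python
def first_fit(item_sizes, capacity=100):
    """Assign items to bins online with the First-Fit rule.

    Hypothetical helper: items arrive one at a time and are never moved
    once placed. Returns the bin index per item and the number of bins.
    """
    free = []        # remaining capacity of each open bin
    assignment = []  # bin index chosen for each item, in arrival order
    for size in item_sizes:
        if size > capacity:
            raise ValueError("item larger than bin capacity")
        for index, remaining in enumerate(free):
            if size <= remaining:
                free[index] = remaining - size
                assignment.append(index)
                break
        else:
            # No open bin fits the item, so open a new one.
            free.append(capacity - size)
            assignment.append(len(free) - 1)
    return assignment, len(free)

# Task sizes as a percentage of one worker's capacity (toy values).
assignment, bin_count = first_fit([50, 70, 50, 20, 30])
```

Here the five tasks land on three workers, with the third task filling the gap left on the first worker instead of opening a new one.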

Preprint · 2018 · arXiv

Data-driven nonsmooth optimization

In this work, we consider methods for solving large-scale optimization problems with a possibly nonsmooth objective function. The key idea is to first specify a class of optimization algorithms using a generic iterative scheme involving only linear operations and applications of proximal operators. This scheme contains many modern primal-dual first-order solvers like the Douglas-Rachford and hybrid gradient methods as special cases. Moreover, we show convergence to an optimal point for a new method which also belongs to this class. Next, we interpret the generic scheme as a neural network and use unsupervised training to learn the best set of parameters for a specific class of objective functions while imposing a fixed number of iterations. In contrast to other approaches of "learning to optimize", we present an approach which learns parameters only in the set of convergent schemes. As use cases, we consider optimization problems arising in tomographic reconstruction and image deconvolution, and in particular a family of total variation regularization problems.
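
To make the idea of an iterative scheme built from linear operations and proximal operators concrete, here is a minimal sketch with a fixed number of iterations. It uses plain proximal gradient descent (ISTA) on a toy problem, which is a swapped-in textbook method and not the paper's generic scheme; no parameters are learned. The function names, the step size, and the problem data are assumptions.

```python
def soft_threshold(x, threshold):
    """Proximal operator of threshold * ||x||_1, applied elementwise."""
    return [max(abs(v) - threshold, 0.0) * (1 if v > 0 else -1) for v in x]

def proximal_gradient(b, lam, step=0.5, iterations=50):
    """Minimize 0.5 * ||x - b||^2 + lam * ||x||_1 with a fixed iteration count.

    Hypothetical helper: each iteration is a linear (gradient) update
    followed by one application of a proximal operator.
    """
    x = [0.0] * len(b)
    for _ in range(iterations):
        gradient = [xi - bi for xi, bi in zip(x, b)]
        forward = [xi - step * g for xi, g in zip(x, gradient)]
        x = soft_threshold(forward, step * lam)
    return x

solution = proximal_gradient([3.0, -0.5, 1.0], lam=1.0)
```

For this toy problem the minimizer is known in closed form as `soft_threshold(b, lam)`, so the iterate can be checked against it. In a learned scheme, quantities such as `step` would be trainable parameters rather than fixed constants.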