Researcher profile

Arto Klami

Arto Klami contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
6works
0followers
7topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

6 published item(s)

preprint2022arXiv

Lagrangian Manifold Monte Carlo on Monge Patches

The efficiency of Markov Chain Monte Carlo (MCMC) depends on how the underlying geometry of the problem is taken into account. For distributions with strongly varying curvature, Riemannian metrics help in efficient exploration of the target distribution. Unfortunately, they have significant computational overhead due to e.g. repeated inversion of the metric tensor, and current geometric MCMC methods using the Fisher information matrix to induce the manifold are in practice slow. We propose a new alternative Riemannian metric for MCMC, by embedding the target distribution into a higher-dimensional Euclidean space as a Monge patch and using the induced metric determined by direct geometric reasoning. Our metric only requires first-order gradient information and has fast inverse and determinants, and allows reducing the computational complexity of individual iterations from cubic to quadratic in the problem dimensionality. We demonstrate how Lagrangian Monte Carlo in this metric efficiently explores the target distributions.

preprint2022arXiv

Neural network for determining an asteroid mineral composition from reflectance spectra

Chemical and mineral compositions of asteroids reflect the formation and history of our Solar System. This knowledge is also important for planetary defence and in-space resource utilisation. We aim to develop a fast and robust neural-network-based method for deriving the mineral modal and chemical compositions of silicate materials from their visible and near-infrared spectra. The method should be able to process raw spectra without significant pre-processing. We designed a convolutional neural network with two hidden layers for the analysis of the spectra, and trained it using labelled reflectance spectra. For the training, we used a dataset that consisted of reflectance spectra of real silicate samples stored in the RELAB and C-Tape databases, namely olivine, orthopyroxene, clinopyroxene, their mixtures, and olivine-pyroxene-rich meteorites. We used the model on two datasets. First, we evaluated the model reliability on a test dataset where we compared the model classification with known compositional reference values. The individual classification results are mostly within 10 percentage-point intervals around the correct values. Second, we classified the reflectance spectra of S-complex (Q-type and V-type, also including A-type) asteroids with known Bus-DeMeo taxonomy classes. The predicted mineral chemical composition of S-type and Q-type asteroids agree with the chemical composition of ordinary chondrites. The modal abundances of V-type and A-type asteroids show a dominant contribution of orthopyroxene and olivine, respectively. Additionally, our predictions of the mineral modal composition of S-type and Q-type asteroids show an apparent depletion of olivine related to the attenuation of its diagnostic absorptions with space weathering. This trend is consistent with previous results of the slower pyroxene response to space weathering relative to olivine.

preprint2022arXiv

Traversing Time with Multi-Resolution Gaussian Process State-Space Models

Gaussian Process state-space models capture complex temporal dependencies in a principled manner by placing a Gaussian Process prior on the transition function. These models have a natural interpretation as discretized stochastic differential equations, but inference for long sequences with fast and slow transitions is difficult. Fast transitions need tight discretizations whereas slow transitions require backpropagating the gradients over long subtrajectories. We propose a novel Gaussian process state-space architecture composed of multiple components, each trained on a different resolution, to model effects on different timescales. The combined model allows traversing time on adaptive scales, providing efficient inference for arbitrarily long sequences with complex dynamics. We benchmark our novel method on semi-synthetic data and on an engine modeling task. In both experiments, our approach compares favorably against its state-of-the-art alternatives that operate on a single time-scale only.

preprint2021arXiv

Reliable Categorical Variational Inference with Mixture of Discrete Normalizing Flows

Variational approximations are increasingly based on gradient-based optimization of expectations estimated by sampling. Handling discrete latent variables is then challenging because the sampling process is not differentiable. Continuous relaxations, such as the Gumbel-Softmax for categorical distribution, enable gradient-based optimization, but do not define a valid probability mass for discrete observations. In practice, selecting the amount of relaxation is difficult and one needs to optimize an objective that does not align with the desired one, causing problems especially with models having strong meaningful priors. We provide an alternative differentiable reparameterization for categorical distribution by composing it as a mixture of discrete normalizing flows. It defines a proper discrete distribution, allows directly optimizing the evidence lower bound, and is less sensitive to the hyperparameter controlling relaxation.

preprint2020arXiv

Flexible Prior Elicitation via the Prior Predictive Distribution

The prior distribution for the unknown model parameters plays a crucial role in the process of statistical inference based on Bayesian methods. However, specifying suitable priors is often difficult even when detailed prior knowledge is available in principle. The challenge is to express quantitative information in the form of a probability distribution. Prior elicitation addresses this question by extracting subjective information from an expert and transforming it into a valid prior. Most existing methods, however, require information to be provided on the unobservable parameters, whose effect on the data generating process is often complicated and hard to understand. We propose an alternative approach that only requires knowledge about the observable outcomes - knowledge which is often much easier for experts to provide. Building upon a principled statistical framework, our approach utilizes the prior predictive distribution implied by the model to automatically transform experts judgements about plausible outcome values to suitable priors on the parameters. We also provide computational strategies to perform inference and guidelines to facilitate practical use.

preprint2020arXiv

Multi-scale Cloud Detection in Remote Sensing Images using a Dual Convolutional Neural Network

Semantic segmentation by convolutional neural networks (CNN) has advanced the state of the art in pixel-level classification of remote sensing images. However, processing large images typically requires analyzing the image in small patches, and hence features that have large spatial extent still cause challenges in tasks such as cloud masking. To support a wider scale of spatial features while simultaneously reducing computational requirements for large satellite images, we propose an architecture of two cascaded CNN model components successively processing undersampled and full resolution images. The first component distinguishes between patches in the inner cloud area from patches at the cloud's boundary region. For the cloud-ambiguous edge patches requiring further segmentation, the framework then delegates computation to a fine-grained model component. We apply the architecture to a cloud detection dataset of complete Sentinel-2 multispectral images, approximately annotated for minimal false negatives in a land use application. On this specific task and data, we achieve a 16\% relative improvement in pixel accuracy over a CNN baseline based on patching.