Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
15 works
0 followers
20 topics
4 close collaborators

Published work

15 published items

preprint · 2024 · arXiv

Proven Distributed Memory Parallelization of Particle Methods

We provide a mathematically proven parallelization scheme for particle methods on distributed-memory computer systems. Particle methods are a versatile and widely used class of algorithms for computer simulations and numerical predictions in various applications, ranging from continuum fluid dynamics and granular flows, using methods such as Smoothed Particle Hydrodynamics (SPH) and Discrete Element Methods (DEM), to Molecular Dynamics (MD) simulations in molecular modeling. Particle methods naturally lend themselves to implementation on parallel-computing hardware. So far, however, a mathematical proof of correctness and equivalence to sequential implementations was only available for shared-memory parallelism. Here, we leverage a formal definition of the algorithmic class of particle methods to provide a proven parallelization scheme for distributed-memory computers. We prove that these parallelized particle methods on distributed-memory computers are formally equivalent to their sequential counterparts for a well-defined class of particle methods. Notably, the parallelization scheme analyzed here is well known and commonly used. Our analysis is, therefore, of immediate practical relevance to existing and new parallel software implementations of particle methods and places them on solid theoretical grounds.

preprint · 2022 · arXiv

Learning deterministic hydrodynamic equations from stochastic active particle dynamics

We present a principled data-driven strategy for learning deterministic hydrodynamic models directly from stochastic non-equilibrium active particle trajectories. We apply our method to learning a hydrodynamic model for the propagating density lanes observed in self-propelled particle systems and to learning a continuum description of cell dynamics in epithelial tissues. We also infer from stochastic particle trajectories the latent phoretic fields driving chemotaxis. This demonstrates that statistical learning theory combined with physical priors can enable discovery of multi-scale models of non-equilibrium stochastic processes characteristic of collective movement in living systems.

preprint · 2021 · arXiv

Parallel Discrete Convolutions on Adaptive Particle Representations of Images

We present data structures and algorithms for native implementations of discrete convolution operators over Adaptive Particle Representations (APR) of images on parallel computer architectures. The APR is a content-adaptive image representation that locally adapts the sampling resolution to the image signal. It has been developed as an alternative to pixel representations for large, sparse images as they typically occur in fluorescence microscopy. It has been shown to reduce the memory and runtime costs of storing, visualizing, and processing such images. This, however, requires that image processing natively operates on APRs, without intermediately reverting to pixels. Designing efficient and scalable APR-native image processing primitives, however, is complicated by the APR's irregular memory structure. Here, we provide the algorithmic building blocks required to efficiently and natively process APR images using a wide range of algorithms that can be formulated in terms of discrete convolutions. We show that APR convolution naturally leads to scale-adaptive algorithms that efficiently parallelize on multi-core CPU and GPU architectures. We quantify the speedups in comparison to pixel-based algorithms and convolutions on evenly sampled data. We achieve pixel-equivalent throughputs of up to 1 TB/s on a single Nvidia GeForce RTX 2080 gaming GPU, requiring up to two orders of magnitude less memory than a pixel-based implementation.

preprint · 2021 · arXiv

STENCIL-NET: Data-driven solution-adaptive discretization of partial differential equations

Numerical methods for approximately solving partial differential equations (PDE) are at the core of scientific computing. Often, this requires high-resolution or adaptive discretization grids to capture relevant spatio-temporal features in the PDE solution, e.g., in applications like turbulence, combustion, and shock propagation. Numerical approximation also requires knowing the PDE in order to construct problem-specific discretizations. Systematically deriving such solution-adaptive discrete operators, however, is a current challenge. Here we present STENCIL-NET, an artificial neural network architecture for data-driven learning of problem- and resolution-specific local discretizations of nonlinear PDEs. STENCIL-NET achieves numerically stable discretization of the operators in an unknown nonlinear PDE by spatially and temporally adaptive parametric pooling on regular Cartesian grids, and by incorporating knowledge about discrete time integration. Knowing the actual PDE is not necessary, as solution data is sufficient to train the network to learn the discrete operators. A once-trained STENCIL-NET model can be used to predict solutions of the PDE on larger spatial domains and for longer times than it was trained for, hence addressing the problem of PDE-constrained extrapolation from data. To support this claim, we present numerical experiments on long-term forecasting of chaotic PDE solutions on coarse spatio-temporal grids. We also quantify the speed-up achieved by substituting base-line numerical methods with equation-free STENCIL-NET predictions on coarser grids with little compromise on accuracy.

preprint · 2020 · arXiv

A robustness measure for singular point and index estimation in discretized orientation and vector fields

The identification of singular points or topological defects in discretized vector fields occurs in diverse areas ranging from the polarization of the cosmic microwave background to liquid crystals to fingerprint recognition and bio-medical imaging. Due to their discrete nature, defects and their topological charge cannot depend continuously on each single vector, but they discontinuously change as soon as a vector changes by more than a threshold. Considering this threshold of admissible change at the level of vectors, we develop a robustness measure for discrete defect estimators. Here, we compare different template paths for defect estimation in discretized vector or orientation fields. Sampling prototypical vector field patterns around defects shows that the robustness increases with the length of the template path, but less so in the presence of noise on the vectors. We therefore find an optimal trade-off between resolution and robustness against noise for relatively small templates, except for the "single pixel" defect analysis, which cannot exclude zero robustness. The presented robustness measure paves the way for uncertainty quantification of defects in discretized vector fields.
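
As a rough illustration of what a discrete defect estimator along a template path computes, the sketch below sums wrapped angle differences around a closed path and divides by 2π. It is the standard winding-number estimate, not necessarily the estimators compared in the paper; the names `wrap` and `index_along_path` and the `period` parameter (2π for vectors, π for headless orientations) are assumptions made for this example. The robustness measure itself is not reproduced here.

```python
import math


def wrap(delta, period=2 * math.pi):
    """Map an angle difference into the interval [-period/2, period/2)."""
    return (delta + period / 2) % period - period / 2


def index_along_path(angles, period=2 * math.pi):
    """Estimate the topological index from angles sampled along a closed path.

    `angles` lists the field direction at consecutive path points,
    traversed counterclockwise; the path is closed implicitly.
    Use period=math.pi for orientation (headless) fields.
    """
    total = 0.0
    n = len(angles)
    for k in range(n):
        total += wrap(angles[(k + 1) % n] - angles[k], period)
    return total / (2 * math.pi)


# Hypothetical 8-point template path: the boundary of a 3x3 pixel block.
PATH = [(1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1)]

source = [math.atan2(y, x) for x, y in PATH]   # field v = (x, y)
saddle = [math.atan2(-y, x) for x, y in PATH]  # field v = (x, -y)

if __name__ == "__main__":
    print(index_along_path(source), index_along_path(saddle))
```

In this sketch the estimate jumps by an integer as soon as one neighboring angle difference crosses half the period, which is the kind of discontinuity the abstract refers to.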

preprint · 2020 · arXiv

Bionic Tracking: Using Eye Tracking to Track Biological Cells in Virtual Reality

We present Bionic Tracking, a novel method for solving biological cell tracking problems with eye tracking in virtual reality using commodity hardware. Using gaze data, and especially smooth pursuit eye movements, we are able to track cells in time series of 3D volumetric datasets. The problem of tracking cells is ubiquitous in developmental biology, where large volumetric microscopy datasets are acquired on a daily basis, often comprising hundreds or thousands of time points that span hours or days. The image data, however, is only a means to an end, and scientists are often interested in the reconstruction of cell trajectories and cell lineage trees. Reliably tracking cells in crowded three-dimensional space over many timepoints remains an open problem, and many current approaches rely on tedious manual annotation and curation. In our Bionic Tracking approach, we substitute the usual 2D point-and-click annotation to track cells with eye tracking in a virtual reality headset, where users simply have to follow a cell with their eyes in 3D space in order to track it. We detail the interaction design of our approach and explain the graph-based algorithm used to connect different time points, also taking occlusion and user distraction into account. We demonstrate our cell tracking method using the example of two different biological datasets. Finally, we report on a user study with seven cell tracking experts, demonstrating the benefits of our approach over manual point-and-click tracking.

preprint · 2020 · arXiv

Multivariate Newton Interpolation

For $m,n \in \mathbb{N}$, $m\geq 1$ and a given function $f : \mathbb{R}^m\longrightarrow \mathbb{R}$, the polynomial interpolation problem (PIP) is to determine a unisolvent node set $P_{m,n} \subseteq \mathbb{R}^m$ of $N(m,n):=|P_{m,n}|=\binom{m+n}{n}$ points and the uniquely defined polynomial $Q_{m,n,f}\in \Pi_{m,n}$ in $m$ variables of degree $\mathrm{deg}(Q_{m,n,f})\leq n \in \mathbb{N}$ that fits $f$ on $P_{m,n}$, i.e., $Q_{m,n,f}(p) = f(p)$, $\forall\, p \in P_{m,n}$. For $m=1$ the solution to the PIP is well known. In higher dimensions, however, no closed framework was available. We here present a generalization of the classic Newton interpolation from one-dimensional to arbitrary-dimensional spaces. Further we formulate an algorithm, termed PIP-SOLVER, based on a multivariate divided difference scheme that computes the solution $Q_{m,n,f}$ in $\mathcal{O}\big(N(m,n)^2\big)$ time using $\mathcal{O}\big(mN(m,n)\big)$ memory. Further, we introduce unisolvent Newton-Chebyshev nodes and show that these nodes avoid Runge's phenomenon in the sense that arbitrary periodic Sobolev functions $f \in H^k(\Omega,\mathbb{R}) \subsetneq C^0(\Omega,\mathbb{R})$, $\Omega=[-1,1]^m$ of regularity $k >m/2$ can be uniformly approximated, i.e., $\lim_{n\rightarrow \infty}\|f - Q_{m,n,f}\|_{C^0(\Omega)}= 0$. Numerical experiments demonstrate the computational performance and approximation accuracy of the PIP-SOLVER in practice. We expect the presented results to be relevant for many applications, including numerical solvers, quadrature, non-linear optimization, polynomial regression, adaptive sampling, Bayesian inference, and spectral analysis.
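
For orientation, the well-known one-dimensional case ($m=1$) that the paper generalizes can be sketched as follows. This is the classic Newton divided-difference scheme, not the multivariate PIP-SOLVER; the function names `divided_differences`, `newton_eval` and `num_nodes` are chosen for this example only. The node count follows the formula $N(m,n)=\binom{m+n}{n}$ from the abstract.

```python
import math


def num_nodes(m, n):
    """Number of interpolation nodes N(m, n) = binom(m + n, n)."""
    return math.comb(m + n, n)


def divided_differences(xs, ys):
    """Return Newton coefficients f[x_0], f[x_0,x_1], ... for distinct nodes xs."""
    coef = list(ys)
    n = len(xs)
    for j in range(1, n):
        # Update from the back so coef[i - 1] still holds the previous level.
        for i in range(n - 1, j - 1, -1):
            coef[i] = (coef[i] - coef[i - 1]) / (xs[i] - xs[i - j])
    return coef


def newton_eval(xs, coef, x):
    """Evaluate the Newton-form polynomial at x with a Horner-like scheme."""
    result = coef[-1]
    for k in range(len(coef) - 2, -1, -1):
        result = result * (x - xs[k]) + coef[k]
    return result


if __name__ == "__main__":
    nodes = [-1.0, 0.0, 1.0, 2.0]
    values = [x**3 - 2 * x + 1 for x in nodes]
    c = divided_differences(nodes, values)
    print(c, newton_eval(nodes, c, 0.5), num_nodes(2, 3))
```

The double loop performs on the order of $N^2$ operations for $N$ nodes, matching the quadratic cost the abstract quotes for the general scheme.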

preprint · 2020 · arXiv

scenery: Flexible Virtual Reality Visualization on the Java VM

Life science today involves computational analysis of a large amount and variety of data, such as volumetric data acquired by state-of-the-art microscopes, or mesh data from analysis of such data or simulations. Visualization is often the first step in making sense of data, and a crucial part of building and debugging analysis pipelines. It is therefore important that visualizations can be quickly prototyped, as well as developed or embedded into full applications. In order to better judge spatiotemporal relationships, immersive hardware, such as Virtual or Augmented Reality (VR/AR) headsets and associated controllers are becoming invaluable tools. In this work we introduce scenery, a flexible VR/AR visualization framework for the Java VM that can handle mesh and large volumetric data, containing multiple views, timepoints, and color channels. scenery is free and open-source software, works on all major platforms, and uses the Vulkan or OpenGL rendering APIs. We introduce scenery's main features and example applications, such as its use in VR for microscopy, in the biomedical image analysis software Fiji, or for visualizing agent-based simulations.

preprint · 2014 · arXiv

Particle methods enable fast and simple approximation of Sobolev gradients in image segmentation

Bio-image analysis is challenging due to inhomogeneous intensity distributions and high levels of noise in the images. Bayesian inference provides a principled way for regularizing the problem using prior knowledge. A fundamental choice is how one measures "distances" between shapes in an image. It has been shown that the straightforward geometric L2 distance is degenerate and leads to pathological situations. This is avoided when using Sobolev gradients, rendering the segmentation problem less ill-posed. The high computational cost and implementation overhead of Sobolev gradients, however, have hampered practical applications. We show how particle methods as applied to image segmentation allow for a simple and computationally efficient implementation of Sobolev gradients. We show that the evaluation of Sobolev gradients amounts to particle-particle interactions along the contour in an image. We extend an existing particle-based segmentation algorithm to using Sobolev gradients. Using synthetic and real-world images, we benchmark the results for both 2D and 3D images using piecewise smooth and piecewise constant region models. The present particle approximation of Sobolev gradients is 2.8 to 10 times faster than the previous reference implementation, but retains the known favorable properties of Sobolev gradients. This speedup is achieved by using local particle-particle interactions instead of solving a global Poisson equation at each iteration. The computational time per iteration is higher for Sobolev gradients than for L2 gradients. Since Sobolev gradients precondition the optimization problem, however, a smaller number of overall iterations may be necessary for the algorithm to converge, which can in some cases amortize the higher per-iteration cost.

preprint · 2014 · arXiv

Piecewise Constant Sequential Importance Sampling for Fast Particle Filtering

Particle filters are key algorithms for object tracking under non-linear, non-Gaussian dynamics. The high computational cost of particle filters, however, hampers their applicability in cases where the likelihood model is costly to evaluate, or where large numbers of particles are required to represent the posterior. We introduce the approximate sequential importance sampling/resampling (ASIR) algorithm, which aims at reducing the cost of traditional particle filters by approximating the likelihood with a mixture of uniform distributions over pre-defined cells or bins. The particles in each bin are represented by a dummy particle at the center of mass of the original particle distribution and with a state vector that is the average of the states of all particles in the same bin. The likelihood is only evaluated for the dummy particles, and the resulting weight is identically assigned to all particles in the bin. We derive upper bounds on the approximation error of the so-obtained piecewise constant function representation, and analyze how bin size affects tracking accuracy and runtime. Further, we show numerically that the ASIR approximation error converges to that of sequential importance sampling/resampling (SIR) as the bin size is decreased. We present a set of numerical experiments from the field of biological image processing and tracking that demonstrate ASIR's capabilities. Overall, we consider ASIR a promising candidate for simple, fast particle filtering in generic applications.
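
The binning step described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the authors' implementation: states are scalars (so position and state coincide), the dummy particle is the unweighted average of the particles in a bin, and the names `asir_weights`, `likelihood` and `bin_size` are hypothetical.

```python
import math
from collections import defaultdict


def asir_weights(particles, weights, likelihood, bin_size):
    """Update particle weights with one likelihood evaluation per occupied bin.

    Returns the normalized weights and the number of likelihood evaluations.
    Assumes at least one bin has a non-zero likelihood.
    """
    # Group particle indices by the bin (cell) their state falls into.
    bins = defaultdict(list)
    for i, x in enumerate(particles):
        bins[math.floor(x / bin_size)].append(i)

    new_weights = list(weights)
    evaluations = 0
    for members in bins.values():
        # Dummy particle: average state of all particles in this bin.
        dummy = sum(particles[i] for i in members) / len(members)
        lik = likelihood(dummy)
        evaluations += 1
        # The same likelihood value is assigned to every particle in the bin.
        for i in members:
            new_weights[i] = weights[i] * lik

    total = sum(new_weights)
    return [w / total for w in new_weights], evaluations


if __name__ == "__main__":
    xs = [0.1, 0.3, 1.2, 1.4, 1.6]
    w, n_eval = asir_weights(xs, [0.2] * 5, lambda x: math.exp(-(x - 1.0) ** 2), 1.0)
    print(w, n_eval)
```

With five particles in two bins, the likelihood is evaluated twice instead of five times; shrinking `bin_size` moves the result toward the standard per-particle update.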

preprint · 2014 · arXiv

PPF - A Parallel Particle Filtering Library

We present the parallel particle filtering (PPF) software library, which enables hybrid shared-memory/distributed-memory parallelization of particle filtering (PF) algorithms combining the Message Passing Interface (MPI) with multithreading for multi-level parallelism. The library is implemented in Java and relies on OpenMPI's Java bindings for inter-process communication. It includes dynamic load balancing, multi-thread balancing, and several algorithmic improvements for PF, such as input-space domain decomposition. The PPF library hides the difficulties of efficient parallel programming of PF algorithms and provides application developers with the necessary tools for parallel implementation of PF methods. We demonstrate the capabilities of the PPF library using two distributed PF algorithms in two scenarios with different numbers of particles. The PPF library runs a 38 million particle problem, corresponding to more than 1.86 GB of particle data, on 192 cores with 67% parallel efficiency. To the best of our knowledge, the PPF library is the first open-source software that offers a parallel framework for PF applications.

preprint · 2013 · arXiv

Adaptive Distributed Resampling Algorithm with Non-Proportional Allocation

The distributed resampling algorithm with non-proportional allocation (RNA) is key to implementing particle filtering applications on parallel computer systems. We extend the original work by Bolic et al. by introducing an adaptive RNA (ARNA) algorithm, improving RNA by dynamically adjusting the particle-exchange ratio and randomizing the process ring topology. This improves the runtime performance of ARNA by about 9% over RNA with 10% particle exchange. ARNA also significantly improves the speed at which information is shared between processing elements, leading to about 20-fold faster convergence. The ARNA algorithm requires only a few modifications to the original RNA, and is hence easy to implement.

preprint · 2013 · arXiv

Balanced offline allocation of weighted balls into bins

We propose a sorting-based greedy algorithm called SortedGreedy[m] for approximately solving the offline version of the d-choice weighted balls-into-bins problem where the number of choices for each ball is equal to the number of bins. We assume the ball weights to be non-negative. We compare the performance of the sorting-based algorithm with a naive algorithm called Greedy[m]. We show that by sorting the input data according to the weights we are able to achieve an order of magnitude smaller gap (the weight difference between the heaviest and the lightest bin) for small problems (<= 4000 balls), and at least two orders of magnitude smaller gap for larger problems. In practice, SortedGreedy[m] runs almost as fast as Greedy[m]. This makes sorting-based algorithms favorable for solving offline weighted balls-into-bins problems.
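
A minimal sketch of the two strategies compared above: each ball goes to the currently lightest of the m bins, either in input order or after sorting by weight. The abstract does not state the sort direction; heaviest-first is assumed here, and the function names `greedy`, `sorted_greedy` and `gap` are chosen for this example.

```python
import heapq


def greedy(weights, m):
    """Place each ball, in the given order, into the currently lightest of m bins."""
    heap = [(0.0, b) for b in range(m)]  # entries are (load, bin index)
    loads = [0.0] * m
    for w in weights:
        load, b = heapq.heappop(heap)
        loads[b] = load + w
        heapq.heappush(heap, (loads[b], b))
    return loads


def sorted_greedy(weights, m):
    """Same placement rule, applied to the balls sorted heaviest first (assumption)."""
    return greedy(sorted(weights, reverse=True), m)


def gap(loads):
    """Weight difference between the heaviest and the lightest bin."""
    return max(loads) - min(loads)


if __name__ == "__main__":
    balls = [1, 1, 1, 1, 4]
    print(gap(greedy(balls, 2)), gap(sorted_greedy(balls, 2)))
```

In this toy input the unsorted rule places the heavy ball last and ends with a gap of 4, while the sorted variant balances both bins exactly. Sorting adds an O(n log n) step before the placement loop.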

preprint · 2013 · arXiv

Balancing indivisible real-valued loads in arbitrary networks

In parallel computing, a problem is divided into a set of smaller tasks that are distributed across multiple processing elements. Balancing the load of the processing elements is key to achieving good performance and scalability. If the computational costs of the individual tasks vary over time in an unpredictable way, dynamic load balancing aims at migrating them between processing elements so as to maintain load balance. During dynamic load balancing, the tasks amount to indivisible work packets with a real-valued cost. For this case of indivisible, real-valued loads, we analyze the balancing circuit model, a local dynamic load-balancing scheme that does not require global communication. We extend previous analyses to the present case and provide a probabilistic bound for the achievable load balance. Based on an analogy with the offline balls-into-bins problem, we further propose a novel algorithm for dynamic balancing of indivisible, real-valued loads. We benchmark the proposed algorithm in numerical experiments and compare it with the classical greedy algorithm, both in terms of solution quality and communication cost. We find that the increased communication cost of the proposed algorithm is compensated by a higher solution quality, leading on average to about an order of magnitude gain in overall performance.

preprint · 2011 · arXiv

Global parameter identification of stochastic reaction networks from single trajectories

We consider the problem of inferring the unknown parameters of a stochastic biochemical network model from a single measured time-course of the concentration of some of the involved species. Such measurements are available, e.g., from live-cell fluorescence microscopy in image-based systems biology. In addition, fluctuation time-courses from, e.g., fluorescence correlation spectroscopy provide information about the system dynamics that can be used to infer parameters more robustly than when considering only mean concentrations. Estimating model parameters from a single experimental trajectory enables single-cell measurements and quantification of cell-cell variability. We propose a novel combination of an adaptive Monte Carlo sampler, called Gaussian Adaptation, and efficient exact stochastic simulation algorithms that allows parameter identification from single stochastic trajectories. We benchmark the proposed method on a linear and a non-linear reaction network at steady state and during transient phases. In addition, we demonstrate that the present method also provides an ellipsoidal volume estimate of the viable part of parameter space and is able to estimate the physical volume of the compartment in which the observed reactions take place.
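
To make "exact stochastic simulation" concrete, the sketch below generates a single trajectory of a linear birth-death network with Gillespie's direct method. The network, the rate names `k_birth` and `k_death`, and the function name are assumptions for illustration; the abstract does not specify which exact simulation algorithm or which networks the paper uses, and the Gaussian Adaptation sampler is not shown.

```python
import random


def gillespie_birth_death(k_birth, k_death, x0, t_end, rng):
    """Simulate 0 -> X (rate k_birth) and X -> 0 (rate k_death * x) up to t_end.

    Returns the reaction times and the copy number after each reaction.
    """
    t, x = 0.0, x0
    times, counts = [t], [x]
    while True:
        a_birth = k_birth
        a_death = k_death * x
        a_total = a_birth + a_death
        if a_total == 0.0:
            break  # no reaction can fire any more
        # Waiting time to the next reaction is exponential with rate a_total.
        t += rng.expovariate(a_total)
        if t > t_end:
            break
        # Choose which reaction fires, proportionally to its propensity.
        if rng.random() * a_total < a_birth:
            x += 1
        else:
            x -= 1
        times.append(t)
        counts.append(x)
    return times, counts


if __name__ == "__main__":
    ts, xs = gillespie_birth_death(10.0, 1.0, 0, 5.0, random.Random(0))
    print(len(ts), xs[-1])
```

A parameter-identification loop would repeatedly simulate such trajectories for candidate rate values and compare them with the single measured time-course.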