Source author record

Mengyang Gu

Mengyang Gu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Applications, cond-mat.soft, Biological Physics, math.ST, Methodology, physics.chem-ph, physics.data-an, Statistics Theory

Catalog footprint

What is connected

5 works

8 topics

4 close collaborators

Preprint, 2026, arXiv

A framework for modeling and inferring tracer diffusion in crowded environments

Tracer diffusion in crowded environments is central to many biological and soft matter systems, but quantitative frameworks for linking tracer motion to environmental structure remain limited. Here, we study the transport of rigid tracers in suspensions of soft particles and within living cells. Experiments reveal a transition from diffusive to confined motion as the matrix area fraction increases. We develop a minimal simulation that incorporates steric exclusion and hydrodynamic hindrance to reproduce the observed mean-squared displacements (MSDs). Using simulation outputs, we train a parallel partial Gaussian process (PPGP) model that rapidly predicts MSDs from matrix geometric variables, including area fraction, particle size, and polydispersity. The PPGP model accelerates predictions by several orders of magnitude relative to simulation and experiments. Analysis reveals that tracer transport is primarily governed by accessible pore sizes and that distinct global structures can produce indistinguishable MSDs. We find that the minimal model can also capture the MSDs of internalized tracer particles in cells. The framework enables rapid inference of structural properties in crowded environments, including transport in the intracellular environment.
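The mean-squared displacement that this abstract refers to can be illustrated with a short computation. The following is a minimal sketch, assuming a single tracer trajectory sampled at uniform time steps and a time-averaged MSD; the function name and array layout are hypothetical and are not taken from the paper.

```python
import numpy as np

def mean_squared_displacement(traj, max_lag):
    """Time-averaged MSD of one trajectory of shape (T, d) for lags 1..max_lag.

    Assumption: positions are sampled at uniform time intervals.
    """
    traj = np.asarray(traj, dtype=float)
    msd = np.empty(max_lag)
    for lag in range(1, max_lag + 1):
        # Displacements between all pairs of frames separated by `lag` steps.
        disp = traj[lag:] - traj[:-lag]
        msd[lag - 1] = np.mean(np.sum(disp**2, axis=1))
    return msd

# Usage: a tracer moving at constant unit speed along x has MSD = lag**2.
ballistic = np.column_stack([np.arange(10.0), np.zeros(10)])
msd_ballistic = mean_squared_displacement(ballistic, 3)
```

For freely diffusing tracers the MSD grows roughly linearly with lag, whereas confined motion shows a plateau at long lags, which is the transition the abstract describes.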

Preprint, 2024, arXiv

Sequential Kalman filter for fast online changepoint detection in longitudinal health records

This article introduces the sequential Kalman filter, a computationally scalable approach for online changepoint detection with temporally correlated data. The temporal correlation was not considered in the Bayesian online changepoint detection approach due to the large computational cost. Motivated by detecting COVID-19 infections for dialysis patients from massive longitudinal health records with a large number of covariates, we develop a scalable approach to detect multiple changepoints from correlated data by sequentially stitching Kalman filters of subsequences to compute the joint distribution of the observations, which has linear computational complexity with respect to the number of observations between the last detected changepoint and the current observation at each time point, without approximating the likelihood function. Compared to other online changepoint detection methods, simulated experiments show that our approach is more precise in detecting single or multiple changes in mean, variance, or correlation for temporally correlated data. Furthermore, we propose a new way to integrate classification and changepoint detection approaches that improve the detection delay and accuracy for detecting COVID-19 infection compared to other alternatives.
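The reason a Kalman filter makes the exact likelihood of correlated data affordable can be sketched on a toy model. The example below is not the authors' sequential Kalman filter or their changepoint procedure; it only assumes a zero-mean stationary AR(1) latent process observed with white noise (the parameters `rho`, `sigma2` and `tau2` are hypothetical) and shows that a single linear-time filtering pass reproduces the dense Gaussian log-likelihood without approximation.

```python
import numpy as np

def kalman_loglik(y, rho, sigma2, tau2):
    """Exact log-likelihood in O(n) for y_t = x_t + noise, x_t a stationary AR(1).

    Assumed model: x_t = rho * x_{t-1} + w_t, Var(x_t) = sigma2,
    observation noise variance tau2, zero mean.
    """
    m, P = 0.0, sigma2  # stationary prior for the first latent state
    ll = 0.0
    for t, obs in enumerate(y):
        if t > 0:
            # Predict step: propagate the latent mean and variance.
            m = rho * m
            P = rho**2 * P + sigma2 * (1.0 - rho**2)
        S = P + tau2              # predictive variance of the observation
        resid = obs - m
        ll += -0.5 * (np.log(2.0 * np.pi * S) + resid**2 / S)
        K = P / S                 # Kalman gain
        m = m + K * resid         # update step
        P = (1.0 - K) * P
    return ll

def dense_loglik(y, rho, sigma2, tau2):
    """Same log-likelihood via the full n x n covariance, O(n^3)."""
    n = len(y)
    idx = np.arange(n)
    cov = sigma2 * rho ** np.abs(idx[:, None] - idx[None, :]) + tau2 * np.eye(n)
    _, logdet = np.linalg.slogdet(cov)
    quad = y @ np.linalg.solve(cov, y)
    return -0.5 * (n * np.log(2.0 * np.pi) + logdet + quad)

rng = np.random.default_rng(0)
y_demo = rng.normal(size=50)
ll_fast = kalman_loglik(y_demo, rho=0.8, sigma2=1.5, tau2=0.3)
ll_dense = dense_loglik(y_demo, rho=0.8, sigma2=1.5, tau2=0.3)
```

The two values agree to numerical precision; the filtering version is the kind of building block that a segment-wise changepoint method can reuse for each candidate subsequence.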

Preprint, 2022, arXiv

Efficient force field and energy emulation through partition of permutationally equivalent atoms

Gaussian process (GP) emulators have been used as surrogate models for predicting force fields and molecular potentials, to overcome the computational bottleneck of molecular dynamics simulation. Integrating both atomic force and energy in predictions was found to be more accurate than using energy alone, yet it requires $O((NM)^3)$ computational operations for computing the likelihood function and making predictions, where $N$ is the number of atoms and $M$ is the number of simulated configurations in the training sample, due to the inversion of a large covariance matrix. The large computational need limits its applications to emulating simulations of small molecules. The computational challenge of using both gradient information and function values in GPs was recently noticed in the statistics and machine learning communities, where conventional approximation methods, such as low-rank decomposition or sparse approximation, may not work well. Here we introduce a new approach, the atomized force field (AFF) model, that integrates both force and energy in the emulator with many fewer computational operations. The drastic reduction in computation is achieved by utilizing the naturally sparse structure of the covariance satisfying the constraints of energy conservation and permutation symmetry of atoms. The efficient machine learning algorithm extends the limits of its applications to larger molecules under the same computational budget, with nearly no loss of predictive accuracy. Furthermore, our approach contains uncertainty assessment of predictions of atomic forces and potentials, useful for developing a sequential design over the chemical input space, with almost no increase in computational cost.

Preprint, 2022, arXiv

Uncertainty quantification and estimation in differential dynamic microscopy

Differential dynamic microscopy (DDM) is a form of video image analysis that combines the sensitivity of scattering and the direct visualization benefits of microscopy. DDM is broadly useful in determining dynamical properties including the intermediate scattering function for many spatiotemporally correlated systems. Despite its straightforward analysis, DDM has not been fully adopted as a routine characterization tool, largely due to computational cost and lack of algorithmic robustness. We present a statistical analysis that quantifies the noise, reduces the computational order and enhances the robustness of DDM analysis. We propagate the image noise through the Fourier analysis, which allows us to comprehensively study the bias in different estimators of model parameters, and we derive a different way to detect whether the bias is negligible. Furthermore, through use of Gaussian process regression (GPR), we find that predictive samples of the image structure function require only around 0.5%-5% of the Fourier transforms of the observed quantities. This vastly reduces computational cost, while preserving information of the quantities of interest, such as quantiles of the image scattering function, for subsequent analysis. The approach, which we call DDM with uncertainty quantification (DDM-UQ), is validated using both simulations and experiments with respect to accuracy and computational efficiency, as compared with conventional DDM and multiple particle tracking. Overall, we propose that DDM-UQ lays the foundation for important new applications of DDM, as well as for high-throughput characterization. We implement the fast computation tool in a new, publicly available MATLAB software package.
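The image structure function at the center of DDM can be sketched in a few lines. This is a minimal illustration of the conventional definition (the time-averaged squared modulus of the Fourier transform of frame differences), written in Python rather than the MATLAB package the abstract mentions; it does not include the noise propagation or the Gaussian process step of DDM-UQ, and the function name and array layout are hypothetical.

```python
import numpy as np

def image_structure_function(frames, lag):
    """Conventional DDM image structure function at one time lag.

    frames: array of shape (T, H, W) holding a grayscale image stack.
    Returns an (H, W) array indexed by 2D wavevector, averaged over start times.
    """
    frames = np.asarray(frames, dtype=float)
    # Intensity differences between frames separated by `lag`.
    diffs = frames[lag:] - frames[:-lag]
    # Squared modulus of the 2D Fourier transform of each difference image.
    power = np.abs(np.fft.fft2(diffs, axes=(-2, -1))) ** 2
    return power.mean(axis=0)

rng = np.random.default_rng(1)
stack = rng.normal(size=(6, 4, 5))
dqt = image_structure_function(stack, lag=2)
```

Computing this quantity for every lag and wavevector requires many Fourier transforms, which is the cost the abstract reports reducing to a small fraction of the transforms.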

Preprint, 2020, arXiv

A theoretical framework of the scaled Gaussian stochastic process in prediction and calibration

Model calibration or data inversion is one of the fundamental tasks in uncertainty quantification. In this work, we study the theoretical properties of the scaled Gaussian stochastic process (S-GaSP), used to model the discrepancy between reality and imperfect mathematical models. We establish the explicit connection between Gaussian stochastic process (GaSP) and S-GaSP through the orthogonal series representation. The predictive mean estimator in the S-GaSP calibration model converges to the reality at the same rate as the GaSP with a suitable choice of the regularization and scaling parameters. We also show the calibrated mathematical model in the S-GaSP calibration converges to the one that minimizes the $L_2$ loss between the reality and mathematical model, whereas the GaSP model with other widely used covariance functions does not have this property. Numerical examples confirm the excellent finite sample performance of our approaches compared to a few recent approaches.
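The $L_2$ convergence target mentioned in this abstract can be written out explicitly. The symbols below ($y^R$ for reality, $f^M$ for the mathematical model, $\theta \in \Theta$ for the calibration parameters, $\mathcal{X}$ for the input domain) are notation assumed here for illustration and do not appear in the abstract:

```latex
\theta_{L_2} = \operatorname*{argmin}_{\theta \in \Theta}
  \int_{\mathcal{X}} \left( y^R(x) - f^M(x, \theta) \right)^2 \, dx
```

Under this reading, the abstract's claim is that the S-GaSP calibrated parameters converge to $\theta_{L_2}$.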