Source author record

Khai Nguyen

Khai Nguyen appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Machine Learning, astro-ph.GA, astro-ph.HE, gr-qc, Computation, Computer Vision, Graphics, Methodology, physics.class-ph, quant-ph

Catalog footprint

What is connected

9 works

10 topics

4 close collaborators


Preprint, 2026, arXiv

Bayesian Multiple Multivariate Density-Density Regression

We propose the first approach for multiple multivariate density-density regression (MDDR), making it possible to consider the regression of a multivariate density-valued response on multiple multivariate density-valued predictors. The core idea is to define a fitted distribution using a sliced Wasserstein barycenter (SWB) of push-forwards of the predictors and to quantify deviations from the observed response using the sliced Wasserstein (SW) distance. Regression functions, which map predictors' supports to the response support, and barycenter weights are inferred within a generalized Bayes framework, enabling principled uncertainty quantification without requiring a fully specified likelihood. The inference process can be seen as an instance of an inverse SWB problem. We establish theoretical guarantees, including the stability of the SWB under perturbations of marginals and barycenter weights, sample complexity of the generalized likelihood, and posterior consistency. For practical inference, we introduce a differentiable approximation of the SWB and a smooth reparameterization to handle the simplex constraint on barycenter weights, allowing efficient gradient-based MCMC sampling. We demonstrate MDDR in an application to inference for population-scale single-cell data. Posterior analysis under the MDDR model in this example includes inference on communication between multiple source/sender cell types and a target/receiver cell type. The proposed approach provides accurate fits, reliable predictions, and interpretable posterior estimates of barycenter weights, which can be used to construct sparse cell-cell communication networks.

Preprint, 2023, arXiv

Energy-Based Sliced Wasserstein Distance

The sliced Wasserstein (SW) distance has been widely recognized as a statistically effective and computationally efficient metric between two probability measures. A key component of the SW distance is the slicing distribution. There are two existing approaches for choosing this distribution. The first uses a fixed prior distribution. The second optimizes for the best distribution within a parametric family, namely the one that maximizes the expected distance. However, both approaches have limitations. A fixed prior distribution is non-informative in that it does not highlight projecting directions that can discriminate between two general probability measures. Optimizing for the best distribution is often expensive and unstable. Moreover, the parametric family of candidate distributions can easily be misspecified. To address these issues, we propose to design the slicing distribution as an energy-based distribution that is parameter-free and has density proportional to an energy function of the projected one-dimensional Wasserstein distance. We then derive a novel sliced Wasserstein metric, the energy-based sliced Wasserstein (EBSW) distance, and investigate its topological, statistical, and computational properties via importance sampling, sampling importance resampling, and Markov chain methods. Finally, we conduct experiments on point-cloud gradient flow, color transfer, and point-cloud reconstruction to show the favorable performance of the EBSW.
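The contrast between a fixed uniform slicing distribution and an energy-based one can be illustrated with a small Monte Carlo sketch. This is not the authors' implementation: it assumes equal-size point clouds, an exponential energy function, and importance sampling with a uniform proposal, and the function names (`random_direction`, `projected_wasserstein`, `sw_and_ebsw`) are hypothetical.

```python
import math
import random


def random_direction(dim, rng):
    """Draw a direction uniformly on the unit sphere by normalizing a Gaussian vector."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def projected_wasserstein(xs, ys, theta, p=2):
    """W_p^p between the 1-D projections of two equal-size point clouds.

    In one dimension the optimal coupling matches sorted samples.
    """
    px = sorted(sum(a * t for a, t in zip(x, theta)) for x in xs)
    py = sorted(sum(a * t for a, t in zip(y, theta)) for y in ys)
    return sum(abs(a - b) ** p for a, b in zip(px, py)) / len(px)


def sw_and_ebsw(xs, ys, n_proj=100, p=2, seed=0):
    """Return (SW estimate, importance-sampling EBSW estimate).

    Assumption: the energy function is exp(.) applied to the projected W_p^p,
    and the proposal over directions is uniform on the sphere.
    """
    rng = random.Random(seed)
    dim = len(xs[0])
    w = [projected_wasserstein(xs, ys, random_direction(dim, rng), p)
         for _ in range(n_proj)]

    # Plain SW: every direction gets the same weight 1 / n_proj.
    sw = (sum(w) / n_proj) ** (1.0 / p)

    # Energy-based weights: normalized exp(w), shifted by max(w) for stability.
    m = max(w)
    energy = [math.exp(v - m) for v in w]
    z = sum(energy)
    ebsw = sum(e / z * v for e, v in zip(energy, w)) ** (1.0 / p)
    return sw, ebsw


if __name__ == "__main__":
    cloud_a = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    cloud_b = [[3.0, 0.0], [4.0, 0.0], [3.0, 1.0]]  # cloud_a shifted along x
    print(sw_and_ebsw(cloud_a, cloud_b, n_proj=200, seed=1))
```

Because the weights grow with the projected distance, the weighted estimate in this sketch is never smaller than the uniform average, which reflects the idea of emphasizing discriminative directions without fitting any parameters.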

Preprint, 2023, arXiv

Markovian Sliced Wasserstein Distances: Beyond Independent Projections

Sliced Wasserstein (SW) distance suffers from redundant projections due to independent uniform random projecting directions. To partially overcome the issue, max K sliced Wasserstein (Max-K-SW) distance ($K\geq 1$) seeks the best discriminative orthogonal projecting directions. Despite being able to reduce the number of projections, the metricity of Max-K-SW cannot be guaranteed in practice due to the non-optimality of the optimization. Moreover, the orthogonality constraint is also computationally expensive and might not be effective. To address the problem, we introduce a new family of SW distances, named Markovian sliced Wasserstein (MSW) distance, which imposes a first-order Markov structure on projecting directions. We discuss various members of MSW by specifying the Markov structure, including the prior distribution, the transition distribution, and the burn-in and thinning techniques. Moreover, we investigate the theoretical properties of MSW including topological properties (metricity, weak convergence, and connection to other distances), statistical properties (sample complexity and Monte Carlo estimation error), and computational properties (computational complexity and memory complexity). Finally, we compare MSW distances with previous SW variants in various applications such as gradient flows, color transfer, and deep generative modeling to demonstrate the favorable performance of MSW.

Preprint, 2022, arXiv

Improving Mini-batch Optimal Transport via Partial Transportation

Mini-batch optimal transport (m-OT) has been widely used recently to deal with the memory issue of OT in large-scale applications. Despite its practicality, m-OT suffers from misspecified mappings, namely, mappings that are optimal on the mini-batch level but are partially wrong in comparison with the optimal transportation plan between the original measures. Motivated by the misspecified mappings issue, we propose a novel mini-batch method by using partial optimal transport (POT) between mini-batch empirical measures, which we refer to as mini-batch partial optimal transport (m-POT). Leveraging the insight from the partial transportation, we explain the source of misspecified mappings from the m-OT and motivate why limiting the amount of transported masses among mini-batches via POT can alleviate the incorrect mappings. Finally, we carry out extensive experiments on various applications such as deep domain adaptation, partial domain adaptation, deep generative modeling, color transfer, and gradient flow to demonstrate the favorable performance of m-POT compared to current mini-batch methods.

Preprint, 2022, arXiv

On Transportation of Mini-batches: A Hierarchical Approach

Mini-batch optimal transport (m-OT) has been successfully used in practical applications that involve probability measures with a very high number of supports. The m-OT solves several smaller optimal transport problems and then returns the average of their costs and transportation plans. Despite its scalability advantage, the m-OT does not consider the relationship between mini-batches, which leads to undesirable estimation. Moreover, the m-OT does not approximate a proper metric between probability measures since the identity property is not satisfied. To address these problems, we propose a novel mini-batch scheme for optimal transport, named Batch of Mini-batches Optimal Transport (BoMb-OT), that finds the optimal coupling between mini-batches and can be seen as an approximation to a well-defined distance on the space of probability measures. Furthermore, we show that the m-OT is a limit of the entropic regularized version of the BoMb-OT when the regularization parameter goes to infinity. Finally, we carry out experiments on deep generative models, deep domain adaptation, approximate Bayesian computation, color transfer, and gradient flow to show that the BoMb-OT is widely applicable and performs well.
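The identity-property failure mentioned above can be seen in a one-dimensional toy sketch. The code is illustrative only and makes several assumptions: it uses the 1-D Wasserstein-1 cost (solved exactly by sorting), samples mini-batches without replacement, and averages costs over independently drawn batch pairs; `w1_sorted` and `minibatch_ot` are hypothetical helper names, not functions from the paper.

```python
import random


def w1_sorted(a, b):
    """Exact 1-D Wasserstein-1 cost between equal-size samples (sort and match)."""
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


def minibatch_ot(xs, ys, batch_size, n_pairs, seed=0):
    """Average the OT cost over randomly drawn pairs of mini-batches."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_pairs):
        batch_x = rng.sample(xs, batch_size)  # mini-batch from the first measure
        batch_y = rng.sample(ys, batch_size)  # mini-batch from the second measure
        total += w1_sorted(batch_x, batch_y)
    return total / n_pairs


if __name__ == "__main__":
    data = [float(i) for i in range(100)]
    full_cost = w1_sorted(data, data)  # comparing a measure with itself
    mini_cost = minibatch_ot(data, data, batch_size=8, n_pairs=50)
    print("full OT cost:", full_cost)
    print("mini-batch OT cost:", mini_cost)
```

Comparing a dataset with itself gives zero cost for full OT, while the averaged mini-batch cost stays positive because the two batches in each pair rarely coincide. This is the kind of gap that motivates coupling the mini-batches themselves rather than treating each pair in isolation.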

Preprint, 2022, arXiv

Transformer with Fourier Integral Attentions

Multi-head attention empowers the recent success of transformers, the state-of-the-art models that have achieved remarkable success in sequence modeling and beyond. These attention mechanisms compute the pairwise dot products between the queries and keys, which results from the use of unnormalized Gaussian kernels with the assumption that the queries follow a Gaussian mixture distribution. There is no guarantee that this assumption is valid in practice. In response, we first interpret attention in transformers as a nonparametric kernel regression. We then propose the FourierFormer, a new class of transformers in which the dot-product kernels are replaced by the novel generalized Fourier integral kernels. Unlike the dot-product kernels, where we need to choose a good covariance matrix to capture the dependency of the features of data, the generalized Fourier integral kernels can automatically capture such dependency and remove the need to tune the covariance matrix. We theoretically prove that our proposed Fourier integral kernels can efficiently approximate any key and query distributions. Compared to the conventional transformers with dot-product attention, FourierFormers attain better accuracy and reduce the redundancy between attention heads. We empirically corroborate the advantages of FourierFormers over the baseline transformers in a variety of practical applications including language modeling and image classification.
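The kernel-regression reading of attention can be checked numerically in a small sketch. It is a simplified illustration, not the FourierFormer code: it assumes scalar values, keys of equal norm, unit kernel bandwidth, and no 1/sqrt(d) scaling, and the function names are hypothetical. Under those assumptions, softmax dot-product attention coincides with a Nadaraya-Watson estimator using a Gaussian kernel, because the terms depending only on the query norm or the (constant) key norm cancel in the normalization.

```python
import math


def softmax_attention(query, keys, values):
    """Softmax dot-product attention output for a single query (scalar values)."""
    scores = [sum(a * b for a, b in zip(query, k)) for k in keys]
    m = max(scores)  # shift for numerical stability
    weights = [math.exp(s - m) for s in scores]
    z = sum(weights)
    return sum(w / z * v for w, v in zip(weights, values))


def nadaraya_watson(query, keys, values):
    """Nadaraya-Watson regression with a unit-bandwidth Gaussian kernel."""
    weights = [math.exp(-0.5 * sum((a - b) ** 2 for a, b in zip(query, k)))
               for k in keys]
    z = sum(weights)
    return sum(w / z * v for w, v in zip(weights, values))


if __name__ == "__main__":
    # Assumption for the illustration: every key has unit Euclidean norm.
    keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.6, 0.8]]
    values = [1.0, 2.0, 3.0, 4.0]
    query = [0.3, -0.2]
    print(softmax_attention(query, keys, values))
    print(nadaraya_watson(query, keys, values))
```

When the keys do not share a norm the two estimators differ, which is one way to see that the dot-product form encodes a specific kernel choice.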

Preprint, 2020, arXiv

Emission Signatures from Sub-parsec Binary Supermassive Black Holes III: Comparison of Models with Observations

We present a method for comparing the H$\beta$ emission-line profiles of observed supermassive black hole binary (SBHB) candidates and models of sub-parsec SBHBs in circumbinary disks. Using the approach based on principal component analysis, we infer the values of the binary parameters for the spectroscopic SBHB candidates and evaluate the parameter degeneracies, representative of the uncertainties intrinsic to such measurements. We find that as a population, the SBHB candidates favor the average value of the semimajor axis corresponding to $\log(a/M) \approx 4.20\pm 0.42$ and comparable mass ratios, $q>0.5$. If the SBHB candidates considered are true binaries, this result would suggest that there is a physical process that allows initially unequal mass systems to evolve toward comparable mass ratios (e.g., accretion that occurs preferentially onto the smaller of the black holes) or point to some, yet unspecified, selection bias. Our method also indicates that the SBHB candidates equally favor configurations in which the mini-disks are coplanar or misaligned with the binary orbital plane. If confirmed for true SBHBs, this finding would indicate the presence of a physical mechanism that maintains misalignment of the mini-disks down to sub-parsec binary separations (e.g., precession driven by gravitational torques). The probability distributions of the SBHB parameters inferred for the observed SBHB candidates and our control group of AGNs are statistically indistinguishable, implying that this method can in principle be used to interpret the observed emission-line profiles once a sample of confirmed SBHBs is available but cannot be used as a conclusive test of binarity.

Preprint, 2016, arXiv

Emission Signatures from Sub-parsec Binary Supermassive Black Holes I: Diagnostic Power of Broad Emission Lines

Motivated by advances in observational searches for sub-parsec supermassive black hole binaries (SBHBs) made in the past few years, we develop a semi-analytic model to describe spectral emission line signatures of these systems. The goal of this study is to aid the interpretation of spectroscopic searches for binaries and help test one of the leading models of binary accretion flows in the literature: SBHB in a circumbinary disk. In this work we present the methodology and a comparison of the preliminary model with the data. We model SBHB accretion flows as a set of three accretion disks: two mini-disks that are gravitationally bound to the individual black holes and a circumbinary disk. Given a physically motivated parameter space occupied by sub-parsec SBHBs, we calculate a synthetic database of nearly 15 million broad optical emission line profiles and explore the dependence of the profile shapes on characteristic properties of SBHBs. We find that the modeled profiles show distinct statistical properties as a function of the semi-major axis, mass ratio, eccentricity of the binary, and the degree of alignment of the triple disk system. This suggests that the broad emission line profiles from SBHB systems can in principle be used to infer the distribution of these parameters and as such merit further investigation. Calculated profiles are more morphologically heterogeneous than the broad emission lines in observed SBHB candidates, and we discuss improved treatment of radiative transfer effects which will allow direct statistical comparison of the two groups.

Preprint, 2011, arXiv

Derivation of the Aharanov Bohm phase shift using classical forces

In 1959 Aharonov and Bohm suggested that an electron passing around a long solenoid would pick up a phase shift dependent on the magnetic field of the solenoid, even though the electron itself passes through a region of space which has zero magnetic field. It has long been held that this result is purely quantum, and it is derived in many well-known quantum mechanics textbooks using the Schrödinger equation and vector potential. Here the same phase shift is derived from a purely classical force, but relativistic transformations are taken into account. The force is in the direction of motion of the electron (or opposite), leading to a phase advance (or lag), and we obtain precisely the phase shift thought previously to be purely quantum. The only quantum result used here is the de Broglie wavelength of the particle, in order to get two-slit-like interference and the phase shift. We employ a stack of dipoles as the solenoid and note the same force on the electron in two different frames of reference. We shall consider the solenoid stationary and the electron moving, and then consider the electron rest frame and consider the solenoid moving in the opposite direction.
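For reference, the phase shift in question is usually written in the standard textbook form below (SI units). This is the conventional vector-potential expression, not the classical-force derivation of the paper; the symbols $q$ (particle charge, $q=-e$ for an electron), $\mathbf{A}$ (vector potential), and $\Phi_B$ (magnetic flux enclosed by the two paths) are notation introduced here for illustration.

```latex
\Delta\varphi \;=\; \frac{q}{\hbar}\oint \mathbf{A}\cdot d\boldsymbol{\ell}
\;=\; \frac{q}{\hbar}\int_{S} \mathbf{B}\cdot d\mathbf{S}
\;=\; \frac{q\,\Phi_B}{\hbar}
```

The second equality is Stokes' theorem applied to $\mathbf{B}=\nabla\times\mathbf{A}$, so the phase depends only on the enclosed flux even when the field vanishes along both paths.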
