Researcher profile

Minh-Ngoc Tran

Minh-Ngoc Tran contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 (Emerging) · Verification L1 · Unclaimed author
11 works
0 followers
6 topics
4 close collaborators

Published work

11 published items

preprint · 2022 · arXiv

A Statistical Recurrent Stochastic Volatility Model for Stock Markets

The Stochastic Volatility (SV) model and its variants are widely used in the financial sector, while recurrent neural network (RNN) models are successfully used in many large-scale industrial applications of Deep Learning. Our article combines these two methods in a non-trivial way and proposes a model, which we call the Statistical Recurrent Stochastic Volatility (SR-SV) model, to capture the dynamics of stochastic volatility. The proposed model is able to capture complex volatility effects (e.g., non-linearity and long-memory auto-dependence) overlooked by the conventional SV models, is statistically interpretable and has an impressive out-of-sample forecast performance. These properties are carefully discussed and illustrated through extensive simulation studies and applications to five international stock index datasets: the German stock index DAX30, the Hong Kong stock index HSI50, the French market index CAC40, the US stock market index SP500 and the Canadian market index TSX250. A user-friendly software package, together with the examples reported in the paper, is available at https://github.com/vbayeslab.
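For orientation, the conventional SV model that the abstract takes as its baseline can be sketched as follows. This is only the textbook form (log-variance following a Gaussian AR(1) process), not the SR-SV model itself, which adds a recurrent component not described in enough detail here to reproduce. The function name `simulate_sv` and the parameter values are illustrative assumptions.

```python
import numpy as np

def simulate_sv(n, mu=-1.0, phi=0.95, sigma=0.2, seed=0):
    """Simulate a textbook SV model (illustrative baseline, not SR-SV).

    y_t = exp(z_t / 2) * eps_t,             eps_t ~ N(0, 1)
    z_t = mu + phi * (z_{t-1} - mu) + sigma * eta_t,  eta_t ~ N(0, 1)
    """
    rng = np.random.default_rng(seed)
    z = np.empty(n)
    # Draw the first log-variance from the stationary distribution of the AR(1).
    z[0] = mu + sigma / np.sqrt(1.0 - phi ** 2) * rng.normal()
    for t in range(1, n):
        z[t] = mu + phi * (z[t - 1] - mu) + sigma * rng.normal()
    # Returns are Gaussian noise scaled by the time-varying volatility.
    y = np.exp(0.5 * z) * rng.normal(size=n)
    return y, z

if __name__ == "__main__":
    returns, log_var = simulate_sv(1000)
    print(returns[:5], log_var[:5])
```

With `phi` close to 1 the simulated log-variance is persistent, which produces the volatility clustering that SV models are designed to capture.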

preprint · 2022 · arXiv

An Introduction to Quantum Computing for Statisticians and Data Scientists

Quantum computers promise to surpass the most powerful classical supercomputers when it comes to solving many critically important practical problems, such as pharmaceutical and fertilizer design, supply chain and traffic optimization, or optimization for machine learning tasks. Because quantum computers function fundamentally differently from classical computers, the emergence of quantum computing technology will lead to a new evolutionary branch of statistical and data analytics methodologies. This review provides an introduction to quantum computing designed to be accessible to statisticians and data scientists, aiming to equip them with an overarching framework of quantum computing, the basic language and building blocks of quantum algorithms, and an overview of existing quantum applications in statistics and data analysis. Our goal is to enable statisticians and data scientists to follow quantum computing literature relevant to their fields, to collaborate with quantum algorithm designers, and, ultimately, to bring forth the next generation of statistical and data analytics tools.

preprint · 2022 · arXiv

Quantum Speedup of Natural Gradient for Variational Bayes

Variational Bayes (VB) is a critical method in machine learning and statistics, underpinning the recent success of Bayesian deep learning. The natural gradient is an essential component of efficient VB estimation, but it is prohibitively computationally expensive in high dimensions. We propose a computationally efficient regression-based method for natural gradient estimation, with convergence guarantees under standard assumptions. The method enables the use of quantum matrix inversion to further speed up VB. We demonstrate that the problem setup fulfills the conditions required for quantum matrix inversion to deliver computational efficiency. The method works with a broad range of statistical models and does not require special-purpose or simplified variational distributions.
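As background, the natural gradient referred to here is usually defined as the ordinary gradient of the variational lower bound preconditioned by the inverse Fisher information matrix of the variational distribution. The notation below is an assumption for illustration ($\lambda$ for the variational parameter of dimension $d$, $\mathcal{L}$ for the lower bound, $q_\lambda$ for the variational distribution); it states the standard definition and is not the paper's regression-based estimator.

```latex
\tilde{\nabla}_{\lambda}\,\mathcal{L}(\lambda)
  = I_F(\lambda)^{-1}\,\nabla_{\lambda}\mathcal{L}(\lambda),
\qquad
I_F(\lambda)
  = \mathrm{Cov}_{q_\lambda}\!\bigl(\nabla_{\lambda}\log q_{\lambda}(\theta)\bigr)
```

Because $I_F(\lambda)$ is a $d \times d$ matrix, solving the linear system directly costs on the order of $d^3$ operations, which is the high-dimensional bottleneck the abstract describes.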

preprint · 2021 · arXiv

A practical tutorial on Variational Bayes

This tutorial gives a quick introduction to Variational Bayes (VB), also called Variational Inference or Variational Approximation, from a practical point of view. The paper covers a range of commonly used VB methods and an attempt is made to keep the materials accessible to the wide community of data analysis practitioners. The aim is that the reader can quickly derive and implement their first VB algorithm for Bayesian inference with their data analysis problem. An end-user software package in Matlab together with the documentation can be found at https://vbayeslab.github.io/VBLabDocs/

preprint · 2020 · arXiv

Identifying relationships between cognitive processes across tasks, contexts, and time

It is commonly assumed that a specific testing occasion (task, design, procedure, etc.) provides insights that generalise beyond that occasion. This assumption is rarely tested carefully against data. We develop a statistically principled method to directly estimate the correlation between latent components of cognitive processing across tasks, contexts, and time. This method simultaneously estimates individual-participant parameters of a cognitive model at each testing occasion, group-level parameters representing across-participant parameter averages and variances, and across-task correlations. The approach provides a natural way to "borrow" strength across testing occasions, which can increase the precision of parameter estimates across all testing occasions. Two example applications demonstrate that the method is practical in standard designs. The examples, and a simulation study, also provide evidence about the reliability and validity of parameter estimates from the linear ballistic accumulator model. We conclude by highlighting the potential of the parameter-correlation method to provide an "assumption-light" tool for estimating the relatedness of cognitive processes across tasks, contexts, and time.

preprint · 2020 · arXiv

New Estimation Approaches for the Hierarchical Linear Ballistic Accumulator Model

The Linear Ballistic Accumulator (Brown & Heathcote, 2008) model is used as a measurement tool to answer questions about applied psychology. The analyses based on this model depend upon the model selected and its estimated parameters. Modern approaches use hierarchical Bayesian models and Markov chain Monte Carlo (MCMC) methods to estimate the posterior distribution of the parameters. Although there are several approaches available for model selection, they are all based on the posterior samples produced via MCMC, which means that the model selection inference inherits the properties of the MCMC sampler. To improve on current approaches to LBA inference, we propose two methods that are based on recent advances in particle MCMC methodology; they are qualitatively different from existing approaches as well as from each other. The first approach is particle Metropolis-within-Gibbs; the second approach is density tempered sequential Monte Carlo. Both new approaches provide very efficient sampling and can be applied to estimate the marginal likelihood, which provides Bayes factors for model selection. The first approach is usually faster. The second approach provides a direct estimate of the marginal likelihood, uses the first approach in its Markov move step, and parallelizes very efficiently on high-performance computers. The new methods are illustrated by applying them to simulated and real data, and through pseudocode. The code implementing the methods is freely available.

preprint · 2020 · arXiv

Spectral Subsampling MCMC for Stationary Time Series

Bayesian inference using Markov Chain Monte Carlo (MCMC) on large datasets has developed rapidly in recent years. However, the underlying methods are generally limited to relatively simple settings where the data have specific forms of independence. We propose a novel technique for speeding up MCMC for time series data by efficient data subsampling in the frequency domain. For several challenging time series models, we demonstrate a speedup of up to two orders of magnitude while incurring negligible bias compared to MCMC on the full dataset. We also propose alternative control variates for variance reduction based on data grouping and coreset constructions.

preprint · 2020 · arXiv

Subsampling Sequential Monte Carlo for Static Bayesian Models

We show how to speed up Sequential Monte Carlo (SMC) for Bayesian inference in large data problems by data subsampling. SMC sequentially updates a cloud of particles through a sequence of distributions, beginning with a distribution that is easy to sample from such as the prior and ending with the posterior distribution. Each update of the particle cloud consists of three steps: reweighting, resampling, and moving. In the move step, each particle is moved using a Markov kernel; this is typically the most computationally expensive part, particularly when the dataset is large. It is crucial to have an efficient move step to ensure particle diversity. Our article makes two important contributions. First, in order to speed up the SMC computation, we use an approximately unbiased and efficient annealed likelihood estimator based on data subsampling. The subsampling approach is more memory efficient than the corresponding full data SMC, which is an advantage for parallel computation. Second, we use a Metropolis within Gibbs kernel with two conditional updates. A Hamiltonian Monte Carlo update makes distant moves for the model parameters, and a block pseudo-marginal proposal is used for the particles corresponding to the auxiliary variables for the data subsampling. We demonstrate both the usefulness and limitations of the methodology for estimating four generalized linear models and a generalized additive model with large datasets.
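The three-step particle update described above (reweighting, resampling, moving) can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the paper's method: it uses the full-data likelihood rather than the subsampled annealed likelihood estimator, a random-walk Metropolis move rather than the Hamiltonian Monte Carlo and block pseudo-marginal kernel, and a hypothetical model (Gaussian data with unit variance and a N(0, 10^2) prior on the mean). All function names and tuning values are illustrative.

```python
import numpy as np

PRIOR_SD = 10.0  # assumed prior: theta ~ N(0, PRIOR_SD^2)

def log_likelihood(theta, y):
    # Full-data Gaussian log-likelihood (unit variance, up to a constant),
    # evaluated for every particle at once.
    return -0.5 * ((y[None, :] - theta[:, None]) ** 2).sum(axis=1)

def log_prior(theta):
    return -0.5 * (theta / PRIOR_SD) ** 2

def tempered_smc(y, n_particles=2000, n_temps=20, n_moves=3, seed=0):
    rng = np.random.default_rng(seed)
    # Start from the prior, which is easy to sample from.
    theta = rng.normal(0.0, PRIOR_SD, size=n_particles)
    temps = np.linspace(0.0, 1.0, n_temps + 1)
    for a_prev, a_next in zip(temps[:-1], temps[1:]):
        # 1. Reweight: incremental weight is likelihood^(a_next - a_prev).
        log_w = (a_next - a_prev) * log_likelihood(theta, y)
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        # 2. Resample: multinomial resampling proportional to the weights.
        theta = theta[rng.choice(n_particles, size=n_particles, p=w)]
        # 3. Move: random-walk Metropolis targeting prior * likelihood^a_next.
        step = 2.0 * theta.std() + 1e-8
        for _ in range(n_moves):
            prop = theta + step * rng.normal(size=n_particles)
            log_ratio = (log_prior(prop) + a_next * log_likelihood(prop, y)
                         - log_prior(theta) - a_next * log_likelihood(theta, y))
            accept = np.log(rng.uniform(size=n_particles)) < log_ratio
            theta = np.where(accept, prop, theta)
    return theta  # approximate draws from the posterior

if __name__ == "__main__":
    data = np.random.default_rng(1).normal(1.5, 1.0, size=100)
    draws = tempered_smc(data)
    print(draws.mean(), draws.std())
```

In this sketch the move step evaluates the likelihood on all observations for every particle, which is the cost that the abstract's data subsampling is meant to reduce.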

preprint · 2020 · arXiv

The block-Poisson estimator for optimally tuned exact subsampling MCMC

Speeding up Markov Chain Monte Carlo (MCMC) for datasets with many observations by data subsampling has recently received considerable attention. A pseudo-marginal MCMC method is proposed that estimates the likelihood by data subsampling using a block-Poisson estimator. The estimator is a product of Poisson estimators, allowing us to update a single block of subsample indicators in each MCMC iteration so that a desired correlation is achieved between the logs of successive likelihood estimates. This is important since pseudo-marginal MCMC with positively correlated likelihood estimates can use substantially smaller subsamples without adversely affecting the sampling efficiency. The block-Poisson estimator is unbiased but not necessarily positive, so the algorithm runs the MCMC on the absolute value of the likelihood estimator and uses an importance sampling correction to obtain consistent estimates of the posterior mean of any function of the parameters. Our article derives guidelines to select the optimal tuning parameters for our method and shows that it compares very favourably to regular MCMC without subsampling, and to two other recently proposed exact subsampling approaches in the literature.

preprint · 2010 · arXiv

Model Selection with the Loss Rank Principle

A key issue in statistics and machine learning is to automatically select the "right" model complexity, e.g., the number of neighbors to be averaged over in k nearest neighbor (kNN) regression or the polynomial degree in regression with polynomials. We suggest a novel principle - the Loss Rank Principle (LoRP) - for model selection in regression and classification. It is based on the loss rank, which counts how many other (fictitious) data would be fitted better. LoRP selects the model that has minimal loss rank. Unlike most penalized maximum likelihood variants (AIC, BIC, MDL), LoRP depends only on the regression functions and the loss function. It works without a stochastic noise model, and is directly applicable to any non-parametric regressor, like kNN.

preprint · 2010 · arXiv

The Predictive Lasso

We propose a shrinkage procedure for simultaneous variable selection and estimation in generalized linear models (GLMs) with an explicit predictive motivation. The procedure estimates the coefficients by minimizing the Kullback-Leibler divergence of a set of predictive distributions to the corresponding predictive distributions for the full model, subject to an $l_1$ constraint on the coefficient vector. This results in selection of a parsimonious model with similar predictive performance to the full model. Thanks to its similar form to the original lasso problem for GLMs, our procedure can benefit from available $l_1$-regularization path algorithms. Simulation studies and real-data examples confirm the efficiency of our method in terms of predictive performance on future observations.