Source author record

Jinglai Li

Jinglai Li appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Computation math.NA Methodology Machine Learning math.OC math.ST Statistics Theory Artificial Intelligence Computation and Language math.PR Multiagent Systems Numerical Analysis

Catalog footprint

What is connected

17 works

12 topics

4 close collaborators


preprint, 2026, arXiv

Segmenting Human-LLM Co-authored Text via Change Point Detection

The rise of large language models (LLMs) has created an urgent need to distinguish between human-written and LLM-generated text to ensure authenticity and societal trust. Existing detectors typically provide a binary classification for an entire passage; however, this is insufficient for human-LLM co-authored text, where the objective is to localize specific segments authored by humans or LLMs. To bridge this gap, we propose algorithms to segment text into human- and LLM-authored pieces. Our key observation is that such a segmentation task is conceptually similar to classical change point detection in time-series analysis. Leveraging this analogy, we adapt change point detection to LLM-generated text detection, develop a weighted algorithm and a generalized algorithm to accommodate heterogeneous detection score variability, and establish the minimax optimality of our procedure. Empirically, we demonstrate the strong performance of our approach against a wide range of existing baselines.
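The analogy with classical change point detection can be illustrated with a minimal sketch. It is not the weighted or generalized algorithm proposed in the paper; it is the textbook CUSUM statistic for a single mean shift, applied to a sequence of detection scores. The function name `cusum_change_point` and the assumption that each token or sentence already carries a scalar detection score are hypothetical choices made for illustration.

```python
import numpy as np

def cusum_change_point(scores):
    """Locate a single mean shift in a 1-D score sequence (classical CUSUM).

    `scores` is assumed to hold one detector score per text unit
    (hypothetical input; the catalog entry does not specify a format).
    Returns (t, stat): the split index t, meaning scores[:t] and scores[t:]
    are the two segments, and the value of the CUSUM statistic at t.
    """
    x = np.asarray(scores, dtype=float)
    n = x.size
    if n < 2:
        raise ValueError("need at least two scores")
    t = np.arange(1, n)                      # candidate split points 1..n-1
    left_sum = np.cumsum(x)[:-1]             # sum of scores[:t] for each t
    total = x.sum()
    left_mean = left_sum / t
    right_mean = (total - left_sum) / (n - t)
    # Scaled difference of segment means; large values suggest a change.
    stat = np.sqrt(t * (n - t) / n) * np.abs(left_mean - right_mean)
    best = int(np.argmax(stat))
    return int(t[best]), float(stat[best])
```

In practice the statistic would be compared against a threshold before declaring a change, and multiple change points would need a recursive or penalized extension; both are left out of this sketch.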

preprint, 2026, arXiv

The Practicality of Normalizing Flow Test-Time Training in Bayesian Inference for Agent-Based Models

Agent-Based Models (ABMs) are gaining popularity in economics and social science because of their flexibility in describing realistic and heterogeneous decisions and interaction rules between individual agents. In this work, we investigate for the first time the practicality of test-time training (TTT) of deep models, such as normalizing flows, for posterior estimation of ABM parameters. We propose several practical TTT strategies for fine-tuning the normalizing flow against distribution shifts. Our numerical study demonstrates that TTT schemes are remarkably effective, enabling real-time adjustment of flow-based inference for ABM parameters.

preprint, 2022, arXiv

Ensemble Kalman filter based Sequential Monte Carlo Sampler for sequential Bayesian inference

Many real-world problems require one to estimate parameters of interest, in a Bayesian framework, from data that are collected sequentially in time. Conventional methods for sampling from posterior distributions, such as Markov chain Monte Carlo, cannot efficiently address such problems as they do not take advantage of the data's sequential structure. To this end, sequential methods, which seek to update the posterior distribution whenever a new collection of data becomes available, are often used to solve these types of problems. Two popular choices of sequential method are the Ensemble Kalman filter (EnKF) and the sequential Monte Carlo sampler (SMCS). While EnKF only computes a Gaussian approximation of the posterior distribution, SMCS can draw samples directly from the posterior. Its performance, however, depends critically upon the kernels that are used. In this work, we present a method that constructs the kernels of SMCS using an EnKF formulation, and we demonstrate the performance of the method with numerical examples.

preprint, 2022, arXiv

On multilevel Monte Carlo methods for deterministic and uncertain hyperbolic systems

In this paper, we evaluate the performance of the multilevel Monte Carlo method (MLMC) for deterministic and uncertain hyperbolic systems, where randomness is introduced either in the modeling parameters or in the approximation algorithms. MLMC is a well-known variance reduction method widely used to accelerate Monte Carlo (MC) sampling. However, we demonstrate in this paper that for hyperbolic systems, whether MLMC can achieve a real boost turns out to be delicate. The computational costs of MLMC and MC depend on the interplay among the accuracy (bias) and the computational cost of the numerical method for a single sample, as well as the variances of the sampled MLMC corrections or MC solutions. We characterize three regimes for the MLMC and MC performances using those parameters, and show that MLMC may not accelerate MC and can even have a higher cost when the variances of MC solutions and MLMC corrections are of the same order. Our studies are carried out on a few prototype hyperbolic systems: a linear scalar equation, the Euler and shallow water equations, and a linear relaxation model. The above statements are proved analytically in some cases, and demonstrated numerically for stochastic hyperbolic equations driven by white noise parameters and for Glimm's random choice method applied to deterministic hyperbolic equations.

preprint, 2021, arXiv

Inverse Gaussian Process regression for likelihood-free inference

In this work we consider Bayesian inference problems with intractable likelihood functions. We present a method to compute an approximation of the posterior with a limited number of model simulations. The method features an inverse Gaussian Process regression (IGPR), i.e., one from the output of a simulation model to its input. Within the method, we provide an adaptive algorithm with a tempering procedure to construct the approximations of the marginal posterior distributions. With examples we demonstrate that IGPR performs competitively against some commonly used algorithms, especially in terms of statistical stability and computational efficiency, while the price to pay is that it can only compute a weighted Gaussian approximation of the marginal posteriors.

preprint, 2020, arXiv

An approximate KLD based experimental design for models with intractable likelihoods

Data collection is a critical step in statistical inference and data science, and the goal of statistical experimental design (ED) is to find the data collection setup that can provide the most information for the inference. In this work we consider a special type of ED problem where the likelihoods are not available in closed form. In this case, the popular information-theoretic Kullback-Leibler divergence (KLD) based design criterion cannot be used directly, as it requires evaluating the likelihood function. To address the issue, we derive a new utility function, which is a lower bound of the original KLD utility. This lower bound is expressed in terms of the summation of two or more entropies in the data space, and thus can be evaluated efficiently via entropy estimation methods. We provide several numerical examples to demonstrate the performance of the proposed method.

preprint, 2020, arXiv

Bayesian optimization with local search

Global optimization finds applications in a wide range of real-world problems. Multi-start methods are a popular class of global optimization techniques, which are based on conducting local searches at multiple starting points. In this work we propose a new multi-start algorithm where the starting points are determined in a Bayesian optimization framework. Specifically, the method can be understood as constructing a new function by conducting local searches of the original objective function, where the new function attains the same global optima as the original one. Bayesian optimization is then applied to find the global optima of the new function defined by local search.

preprint, 2020, arXiv

Maximum conditional entropy Hamiltonian Monte Carlo sampler

The performance of the Hamiltonian Monte Carlo (HMC) sampler depends critically on some algorithm parameters such as the total integration time and the numerical integration stepsize. The parameter tuning is particularly challenging when the mass matrix of the HMC sampler is adapted. We propose in this work a Kolmogorov-Sinai entropy (KSE) based design criterion to optimize these algorithm parameters, which can avoid some potential issues in the often-used jumping-distance based measures. For near-Gaussian distributions, we are able to derive the optimal algorithm parameters with respect to the KSE criterion analytically. As a byproduct, the KSE criterion also provides a theoretical justification for the need to adapt the mass matrix in the HMC sampler. Based on the results, we propose an adaptive HMC algorithm, and we then demonstrate the performance of the proposed algorithm with numerical examples.

preprint, 2016, arXiv

A Derivative-Free Trust-Region Algorithm for Reliability-Based Optimization

In this note, we present a derivative-free trust-region (TR) algorithm for reliability-based optimization (RBO) problems. The proposed algorithm consists of solving a set of subproblems, in which simple surrogate models of the reliability constraints are constructed and used in solving the subproblems. Taking advantage of the special structure of the RBO problems, we employ a sample reweighting method to evaluate the failure probabilities, which constructs the surrogate for the reliability constraints by performing only a single full reliability evaluation in each iteration. With numerical experiments, we illustrate that the proposed algorithm is competitive against existing methods.

preprint, 2016, arXiv

A hybrid adaptive MCMC algorithm in function spaces

The preconditioned Crank-Nicolson (pCN) method is a Markov Chain Monte Carlo (MCMC) scheme specifically designed to perform Bayesian inference in function spaces. Unlike many standard MCMC algorithms, the pCN method can preserve the sampling efficiency under mesh refinement, a property referred to as being dimension independent. In this work we consider an adaptive strategy to further improve the efficiency of pCN. In particular we develop a hybrid adaptive MCMC method: the algorithm performs an adaptive Metropolis scheme in a chosen finite-dimensional subspace, and a standard pCN algorithm in the complement space of the chosen subspace. We show that the proposed algorithm satisfies certain important ergodicity conditions. Finally, with numerical examples we demonstrate that the proposed method has competitive performance with existing adaptive algorithms.
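For orientation, the standard pCN step mentioned in the abstract can be sketched as follows. This covers only the plain, non-adaptive pCN algorithm on a discretized problem, not the hybrid adaptive scheme of the paper. It assumes a centered Gaussian prior N(0, C); the names `pcn_sample`, `phi` (the negative log-likelihood), `cov_sqrt` (a matrix square root of C) and `beta` (the step size in (0, 1]) are hypothetical.

```python
import numpy as np

def pcn_sample(phi, cov_sqrt, x0, beta=0.2, n_steps=1000, seed=0):
    """Plain pCN sampler for a posterior proportional to exp(-phi(x)) N(0, C).

    Assumptions (illustrative only): the prior is a centered Gaussian,
    cov_sqrt @ z with z ~ N(0, I) draws a prior sample, and phi returns
    a finite scalar. Returns the chain and the acceptance rate.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    phi_x = phi(x)
    chain = np.empty((n_steps, x.size))
    accepted = 0
    for i in range(n_steps):
        xi = cov_sqrt @ rng.standard_normal(x.size)      # prior draw
        prop = np.sqrt(1.0 - beta**2) * x + beta * xi    # pCN proposal
        phi_prop = phi(prop)
        # The acceptance ratio involves only the likelihood term,
        # because the proposal is reversible with respect to the prior.
        if np.log(rng.random()) < phi_x - phi_prop:
            x, phi_x = prop, phi_prop
            accepted += 1
        chain[i] = x
    return chain, accepted / n_steps
```

Because the prior never enters the acceptance ratio, refining the discretization does not by itself degrade the acceptance rate, which is the dimension-independence property the abstract refers to.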

preprint, 2016, arXiv

A surrogate accelerated multicanonical Monte Carlo method for uncertainty quantification

In this work we consider a class of uncertainty quantification problems where the system performance or reliability is characterized by a scalar parameter $y$. The performance parameter $y$ is random due to the presence of various sources of uncertainty in the system, and our goal is to estimate the probability density function (PDF) of $y$. We propose to use the multicanonical Monte Carlo (MMC) method, a special type of adaptive importance sampling algorithm, to compute the PDF of interest. Moreover, we develop an adaptive algorithm to construct local Gaussian process surrogates to further accelerate the MMC iterations. With numerical examples we demonstrate that the proposed method can achieve several orders of magnitude of speedup over the standard Monte Carlo method.

preprint, 2016, arXiv

A TV-Gaussian prior for infinite-dimensional Bayesian inverse problems and its numerical implementations

Many scientific and engineering problems require performing Bayesian inference in function spaces, in which the unknowns are of infinite dimension. In such problems, choosing an appropriate prior distribution is an important task. In particular we consider problems where the function to infer is subject to sharp jumps, which render the commonly used Gaussian measures unsuitable. On the other hand, the so-called total variation (TV) prior can only be defined in a finite-dimensional setting, and does not lead to a well-defined posterior measure in function spaces. In this work we present a TV-Gaussian (TG) prior to address such problems, where the TV term is used to detect sharp jumps of the function, and the Gaussian distribution is used as a reference measure so that it results in a well-defined posterior measure in the function space. We also present an efficient Markov Chain Monte Carlo (MCMC) algorithm to draw samples from the posterior distribution of the TG prior. With numerical examples we demonstrate the performance of the TG prior and the efficiency of the proposed MCMC algorithm.

preprint, 2016, arXiv

An adaptive independence sampler MCMC algorithm for infinite dimensional Bayesian inferences

Many scientific and engineering problems require performing Bayesian inference in function spaces, in which the unknowns are of infinite dimension. In such problems, many standard Markov Chain Monte Carlo (MCMC) algorithms become arbitrarily slow under mesh refinement, which is referred to as being dimension dependent. In this work we develop an independence sampler based MCMC method for infinite-dimensional Bayesian inference. We represent the proposal distribution as a mixture of a finite number of specially parametrized Gaussian measures. We show that under the chosen parametrization, the resulting MCMC algorithm is dimension independent. We also design an efficient adaptive algorithm to adjust the parameter values of the mixtures from the previous samples. Finally we provide numerical examples to demonstrate the efficiency and robustness of the proposed method, even for problems with multimodal posterior distributions.

preprint, 2016, arXiv

On an adaptive preconditioned Crank-Nicolson MCMC algorithm for infinite dimensional Bayesian inferences

Many scientific and engineering problems require performing Bayesian inference for unknowns of infinite dimension. In such problems, many standard Markov Chain Monte Carlo (MCMC) algorithms become arbitrarily slow under mesh refinement, which is referred to as being dimension dependent. To this end, a family of dimension-independent MCMC algorithms, known as the preconditioned Crank-Nicolson (pCN) methods, was proposed to sample the infinite-dimensional parameters. In this work we develop an adaptive version of the pCN algorithm, where the covariance operator of the proposal distribution is adjusted based on sampling history to improve the simulation efficiency. We show that the proposed algorithm satisfies an important ergodicity condition under some mild assumptions. Finally we provide numerical examples to demonstrate the performance of the proposed method.

preprint, 2015, arXiv

Gaussian process surrogates for failure detection: a Bayesian experimental design approach

An important task of uncertainty quantification is to identify the probability of undesired events, in particular, system failures, caused by various sources of uncertainties. In this work we consider the construction of Gaussian process surrogates for failure detection and failure probability estimation. In particular, we consider the situation where the underlying computer models are extremely expensive, and in this setting, determining the sampling points in the state space is of essential importance. We formulate the problem as an optimal experimental design for Bayesian inferences of the limit state (i.e., the failure boundary) and propose an efficient numerical scheme to solve the resulting optimization problem. In particular, the proposed limit-state inference method is capable of determining multiple sampling points at a time, and thus it is well suited for problems where multiple computer simulations can be performed in parallel. The accuracy and performance of the proposed method are demonstrated by both academic and practical examples.

preprint, 2014, arXiv

A note on the Karhunen-Loève expansions for infinite-dimensional Bayesian inverse problems

In this note, we consider the truncated Karhunen-Loève expansion for approximating solutions to infinite dimensional inverse problems. We show that, under certain conditions, the bound of the error between a solution and its finite-dimensional approximation can be estimated without the knowledge of the solution.

preprint, 2013, arXiv

Adaptive construction of surrogates for the Bayesian solution of inverse problems

The Bayesian approach to inverse problems typically relies on posterior sampling approaches, such as Markov chain Monte Carlo, for which the generation of each sample requires one or more evaluations of the parameter-to-observable map or forward model. When these evaluations are computationally intensive, approximations of the forward model are essential to accelerating sample-based inference. Yet the construction of globally accurate approximations for nonlinear forward models can be computationally prohibitive and in fact unnecessary, as the posterior distribution typically concentrates on a small fraction of the support of the prior distribution. We present a new approach that uses stochastic optimization to construct polynomial approximations over a sequence of measures adaptively determined from the data, eventually concentrating on the posterior distribution. The approach yields substantial gains in efficiency and accuracy over prior-based surrogates, as demonstrated via application to inverse problems in partial differential equations.
