Source author record

Henry Lam

Henry Lam appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Machine Learning, math.PR, Methodology, math.OC, math.ST, Statistics Theory, Robotics, Computation, Data Structures and Algorithms, Discrete Mathematics, q-fin.PR, Social and Information Networks

Catalog footprint

What is connected

28 works

12 topics

4 close collaborators

preprint 2026 arXiv

Subsampled Ensemble Can Improve Generalization Tail Exponentially

Ensemble learning is a popular technique to improve the accuracy of machine learning models. It traditionally hinges on the rationale that aggregating multiple weak models can lead to better models with lower variance and hence higher stability, especially for discontinuous base learners. In this paper, we provide a new perspective on ensembling. By selecting the most frequently generated model from the base learner when repeatedly applied to subsamples, we can attain exponentially decaying tails for the excess risk, even if the base learner suffers from slow (i.e., polynomial) decay rates. This tail enhancement power of ensembling applies to base learners that have reasonable predictive power to begin with and is stronger than variance reduction in the sense of exhibiting rate improvement. We demonstrate how our ensemble methods can substantially improve out-of-sample performances in a range of numerical examples involving heavy-tailed data or intrinsically slow rates.
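
The selection rule described above (run the base learner on many subsamples and keep the model it outputs most often) can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the names `majority_vote_ensemble` and `pick_lower_mean_arm`, the toy two-arm base learner, and the choice of subsampling without replacement are all assumptions made for the example.

```python
import random
from collections import Counter


def majority_vote_ensemble(data, base_learner, num_subsamples, subsample_size, rng):
    """Apply base_learner to random subsamples and return the most frequent output.

    base_learner must return a hashable "model" so identical outputs can be counted.
    Returns (winning_model, vote_counts).
    """
    votes = Counter()
    for _ in range(num_subsamples):
        # Hypothetical design choice: subsample without replacement.
        subsample = rng.sample(data, subsample_size)
        votes[base_learner(subsample)] += 1
    winner, _count = votes.most_common(1)[0]
    return winner, votes


def pick_lower_mean_arm(sample):
    """Toy base learner: each data point is a (cost_arm0, cost_arm1) pair.

    Returns the index of the arm with the lower average cost on the sample.
    """
    mean0 = sum(point[0] for point in sample) / len(sample)
    mean1 = sum(point[1] for point in sample) / len(sample)
    return 0 if mean0 <= mean1 else 1


if __name__ == "__main__":
    rng = random.Random(0)
    # Arm 0 has heavy-tailed Pareto(1.5) costs (true mean 3); arm 1 costs a constant 3.5.
    data = [(rng.paretovariate(1.5), 3.5) for _ in range(500)]
    winner, votes = majority_vote_ensemble(data, pick_lower_mean_arm, 50, 100, rng)
    print("selected arm:", winner, "votes:", dict(votes))
```

Because identical outputs are counted, the sketch only covers base learners with a discrete set of possible models.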

preprint 2023 arXiv

A Distributionally Robust Optimization Framework for Extreme Event Estimation

Conventional methods for extreme event estimation rely on well-chosen parametric models asymptotically justified from extreme value theory (EVT). These methods, while powerful and theoretically grounded, can encounter a difficult bias-variance tradeoff that worsens when the data size is small, degrading the reliability of the tail estimation. In this paper, we study a framework based on the rapidly growing literature on distributionally robust optimization. This approach can be viewed as a nonparametric alternative to conventional EVT: it imposes a general shape belief on the tail instead of a parametric assumption and uses worst-case optimization to handle the nonparametric uncertainty. We explain how this approach bypasses the bias-variance tradeoff in EVT. On the other hand, we face a conservativeness-variance tradeoff, which we describe how to tackle. We also demonstrate computational tools for the involved optimization problems and compare our performance with conventional EVT across a range of numerical examples.

preprint 2023 arXiv

Propagation of Input Tail Uncertainty in Rare-Event Estimation: A Light versus Heavy Tail Dichotomy

We consider the estimation of small probabilities or other risk quantities associated with rare but catastrophic events. In the model-based literature, much of the focus has been on efficient Monte Carlo computation or analytical approximation assuming the model is accurately specified. In this paper, we study a distinct direction: the propagation of model uncertainty and how it impacts the reliability of rare-event estimates. Specifically, we consider the basic setup of the exceedance of an i.i.d. sum, and investigate how the lack of tail information about each input summand can affect the output probability. We argue that heavy-tailed problems are much more vulnerable to input uncertainty than light-tailed problems, as reasoned through their large deviations behaviors and numerical evidence. We also investigate some approaches to quantify model errors in this problem using a combination of the bootstrap and extreme value theory, showing some positive outcomes but also uncovering some statistical challenges.

preprint 2022 arXiv

A Cheap Bootstrap Method for Fast Inference

The bootstrap is a versatile inference method that has proven powerful in many statistical problems. However, when applied to modern large-scale models, it could face substantial computation demand from repeated data resampling and model fitting. We present a bootstrap methodology that uses minimal computation, namely with a resample effort as low as one Monte Carlo replication, while maintaining desirable statistical guarantees. We present the theory of this method that uses a twisted perspective from the standard bootstrap principle. We also present generalizations of this method to nested sampling problems and to a range of subsampling variants, and illustrate how it can be used for fast inference across different estimation problems.
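
To make the idea of inference from very few resamples concrete, here is a hedged sketch. The abstract does not state the interval formula; the form used below (deviations of B resample estimates measured around the original estimate, scaled by a Student-t critical value with B degrees of freedom) is an assumption of this example, as are the names `cheap_bootstrap_ci` and `T_975`.

```python
import math
import random
import statistics

# Standard two-sided 95% Student-t critical values t_{B, 0.975} for B = 1..5.
T_975 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571}


def cheap_bootstrap_ci(data, estimator, num_resamples, rng):
    """Return an approximate 95% interval (lower, upper) from very few resamples.

    Assumed construction: half-width = t_{B, 0.975} * sqrt(mean squared deviation
    of the B resample estimates from the original point estimate).
    """
    if num_resamples not in T_975:
        raise ValueError("this sketch only tabulates B = 1..5")
    point = estimator(data)
    n = len(data)
    squared_deviation = 0.0
    for _ in range(num_resamples):
        resample = rng.choices(data, k=n)  # sample n points with replacement
        squared_deviation += (estimator(resample) - point) ** 2
    spread = math.sqrt(squared_deviation / num_resamples)
    half_width = T_975[num_resamples] * spread
    return point - half_width, point + half_width


if __name__ == "__main__":
    rng = random.Random(0)
    sample = [rng.gauss(10.0, 2.0) for _ in range(1000)]
    # A single resample (B = 1) already yields an interval, at the cost of extra width.
    print(cheap_bootstrap_ci(sample, statistics.mean, 1, rng))
```

With one resample the critical value is large (12.706), so the interval is wide; increasing B within the tabulated range trades computation for a shorter interval.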

preprint 2022 arXiv

Evaluating Aleatoric Uncertainty via Conditional Generative Models

Aleatoric uncertainty quantification seeks distributional knowledge of random responses, which is important for reliability analysis and robustness improvement in machine learning applications. Previous research on aleatoric uncertainty estimation mainly targets closed-form conditional densities or variances, which requires strong restrictions on the data distribution or dimensionality. To overcome these restrictions, we study conditional generative models for aleatoric uncertainty estimation. We introduce two metrics to measure the discrepancy between two conditional distributions that suit these models. Both metrics can be easily and unbiasedly computed via Monte Carlo simulation of the conditional generative models, thus facilitating their evaluation and training. We demonstrate numerically how our metrics provide correct measurements of conditional distributional discrepancies and can be used to train conditional models competitive against existing benchmarks.

preprint 2022 arXiv

General Feasibility Bounds for Sample Average Approximation via Vapnik-Chervonenkis Dimension

We investigate the feasibility of sample average approximation (SAA) for general stochastic optimization problems, including two-stage stochastic programming without the relatively complete recourse assumption. Instead of analyzing problems with specific structures, we utilize results from the Vapnik-Chervonenkis (VC) dimension and Probably Approximately Correct learning to provide a general framework that offers explicit feasibility bounds for SAA solutions under minimal structural or distributional assumptions. We show that, as long as the hypothesis class formed by the feasible region has a finite VC dimension, the infeasibility of SAA solutions decreases exponentially with computable rates and explicitly identifiable accompanying constants. We demonstrate how our bounds apply more generally and competitively compared to existing results.

preprint 2022 arXiv

Prediction Intervals for Simulation Metamodeling

Simulation metamodeling refers to the construction of lower-fidelity models to represent input-output relations using few simulation runs. Stochastic kriging, which is based on Gaussian processes, is a versatile and common technique for such a task. However, this approach relies on specific model assumptions and could encounter scalability challenges. In this paper, we study an alternative metamodeling approach using prediction intervals to capture the uncertainty of simulation outputs. We cast the metamodeling task as an empirical constrained optimization framework to train prediction intervals that attain accurate prediction coverage and narrow width. We specifically use neural networks to represent these intervals and discuss procedures to approximately solve this optimization problem. We also present an adaptation of conformal prediction tools as another approach to construct prediction intervals for metamodeling. Lastly, we present a validation machinery and show how our method can enjoy a distribution-free finite-sample guarantee on the prediction performance. We demonstrate and compare our proposed approaches with existing methods including stochastic kriging through numerical examples.
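
The abstract mentions conformal prediction as one route to prediction intervals. The sketch below shows the standard split conformal construction on absolute residuals, which is not necessarily the adaptation developed in the paper. The function names and the assumption that a fitted point predictor and a held-out calibration set are already available are hypothetical choices for this example.

```python
import math


def split_conformal_halfwidth(calibration_residuals, alpha):
    """Return the half-width q such that [prediction - q, prediction + q]
    targets coverage of at least 1 - alpha under exchangeability.

    calibration_residuals: y_i minus the model's prediction on held-out points.
    Uses the ceil((n + 1) * (1 - alpha))-th smallest absolute residual.
    """
    n = len(calibration_residuals)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for the requested level: interval is unbounded.
        return math.inf
    return sorted(abs(r) for r in calibration_residuals)[rank - 1]


def prediction_interval(point_prediction, halfwidth):
    """Symmetric interval around a point prediction."""
    return point_prediction - halfwidth, point_prediction + halfwidth


if __name__ == "__main__":
    residuals = [0.3, -0.1, 0.8, -0.5, 0.2, -0.9, 0.4, 0.05, -0.6]
    q = split_conformal_halfwidth(residuals, alpha=0.2)
    print(prediction_interval(2.0, q))
```

The half-width is constant across inputs here; intervals whose width varies with the input, as trained in the optimization approach of the abstract, need a different construction.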

preprint 2022 arXiv

Test Against High-Dimensional Uncertainties: Accelerated Evaluation of Autonomous Vehicles with Deep Importance Sampling

Evaluating the performance of autonomous vehicles (AV) and their complex subsystems to high precision under naturalistic circumstances remains a challenge, especially when failure or dangerous cases are rare. Rarity not only requires an enormous sample size for a naive method to achieve a high-confidence estimate, but also causes dangerous underestimation of the true failure rate that is extremely hard to detect. Meanwhile, the state-of-the-art approach that comes with a correctness guarantee can only compute an upper bound for the failure rate under certain conditions, which could limit its practical uses. In this work, we present the Deep Importance Sampling (Deep IS) framework, which utilizes a deep neural network to obtain an efficient IS that is on par with the state-of-the-art, capable of reducing the sample size required to achieve 10% relative error by a factor of 43 relative to the naive sampling method, while producing an estimate that is much less conservative. Our high-dimensional experiment estimating the misclassification rate of one of the state-of-the-art traffic sign classifiers further reveals that this efficiency still holds true even when the target is very small, achieving over 600 times efficiency boost. This highlights the potential of Deep IS in providing a precise estimate even against high-dimensional uncertainties.

preprint 2021 arXiv

Adaptive Importance Sampling for Efficient Stochastic Root Finding and Quantile Estimation

In solving simulation-based stochastic root-finding or optimization problems that involve rare events, such as in extreme quantile estimation, running crude Monte Carlo can be prohibitively inefficient. To address this issue, importance sampling can be employed to drive down the sampling error to a desirable level. However, selecting a good importance sampler requires knowledge of the solution to the problem at hand, which is the goal to begin with and thus forms a circular challenge. We investigate the use of adaptive importance sampling to untie this circularity. Our procedure sequentially updates the importance sampler to reach the optimal sampler and the optimal solution simultaneously, and can be embedded in both sample average approximation and stochastic approximation-type algorithms. Our theoretical analysis establishes strong consistency and asymptotic normality of the resulting estimators. We also demonstrate, via a minimax perspective, the key role of using adaptivity in controlling asymptotic errors. Finally, we illustrate the effectiveness of our approach via numerical experiments.
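
As background for why crude Monte Carlo struggles with rare events, the sketch below shows plain (non-adaptive) importance sampling for a Gaussian tail probability. It is a textbook illustration, not the adaptive procedure of the paper: the mean-shifted normal proposal, the function name `tail_prob_importance_sampling`, and the choice of shift are assumptions of this example. It also makes the circularity visible, since a good shift depends on the threshold being studied.

```python
import math
import random


def tail_prob_importance_sampling(threshold, shift, num_samples, rng):
    """Estimate P(Z > threshold) for Z ~ N(0, 1) by sampling X ~ N(shift, 1).

    Each sample that exceeds the threshold is weighted by the likelihood ratio
    phi(x) / phi(x - shift) = exp(-shift * x + shift**2 / 2).
    """
    total = 0.0
    for _ in range(num_samples):
        x = rng.gauss(shift, 1.0)
        if x > threshold:
            total += math.exp(-shift * x + 0.5 * shift * shift)
    return total / num_samples


if __name__ == "__main__":
    rng = random.Random(1)
    estimate = tail_prob_importance_sampling(4.0, 4.0, 20000, rng)
    exact = 0.5 * math.erfc(4.0 / math.sqrt(2.0))
    print("importance sampling estimate:", estimate)
    print("exact tail probability:      ", exact)
```

With the same number of draws from N(0, 1), crude Monte Carlo would typically observe at most a handful of exceedances of 4, so its relative error would be far larger.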

preprint 2021 arXiv

Deep Probabilistic Accelerated Evaluation: A Robust Certifiable Rare-Event Simulation Methodology for Black-Box Safety-Critical Systems

Evaluating the reliability of intelligent physical systems against rare safety-critical events poses a huge testing burden for real-world applications. Simulation provides a useful platform to evaluate the extremal risks of these systems before their deployment. Importance Sampling (IS), while proven to be powerful for rare-event simulation, faces challenges in handling these learning-based systems due to their black-box nature, which fundamentally undermines its efficiency guarantee and can lead to underestimation that goes undetected by diagnostics. We propose a framework called Deep Probabilistic Accelerated Evaluation (Deep-PrAE) to design statistically guaranteed IS, by converting black-box samplers that are versatile but could lack guarantees into one with what we call a relaxed efficiency certificate that allows accurate estimation of bounds on the safety-critical event probability. We present the theory of Deep-PrAE that combines the dominating point concept with rare-event set learning via deep neural network classifiers, and demonstrate its effectiveness in numerical examples including the safety-testing of an intelligent driving algorithm.

preprint 2021 arXiv

Learning Prediction Intervals for Regression: Generalization and Calibration

We study the generation of prediction intervals in regression for uncertainty quantification. This task can be formalized as an empirical constrained optimization problem that minimizes the average interval width while maintaining the coverage accuracy across data. We strengthen the existing literature by studying two aspects of this empirical optimization. First is a general learning theory to characterize the optimality-feasibility tradeoff that encompasses Lipschitz continuity and VC-subgraph classes, which are exemplified in regression trees and neural networks. Second is a calibration machinery and the corresponding statistical theory to optimally select the regularization parameter that manages this tradeoff, which bypasses the overfitting issues in previous approaches in coverage attainment. We empirically demonstrate the strengths of our interval generation and calibration algorithms in terms of testing performances compared to existing benchmarks.

preprint 2020 arXiv

A Distributionally Robust Optimization Approach to the NASA Langley Uncertainty Quantification Challenge

We study a methodology to tackle the NASA Langley Uncertainty Quantification Challenge problem, based on an integration of robust optimization, more specifically a recent line of research known as distributionally robust optimization, and importance sampling in Monte Carlo simulation. The main computation machinery in this integrated methodology boils down to solving sampled linear programs. We will illustrate both our numerical performances and theoretical statistical guarantees obtained via connections to nonparametric hypothesis testing.

preprint 2020 arXiv

Learning-based Robust Optimization: Procedures and Statistical Guarantees

Robust optimization (RO) is a common approach to tractably obtain safeguarding solutions for optimization problems with uncertain constraints. In this paper, we study a statistical framework to integrate data into RO, based on learning a prediction set using (combinations of) geometric shapes that are compatible with established RO tools, and a simple data-splitting validation step that achieves finite-sample nonparametric statistical guarantees on feasibility. We demonstrate how our required sample size to achieve feasibility at a given confidence level is independent of the dimensions of both the decision space and the probability space governing the stochasticity, and discuss some approaches to improve the objective performances while maintaining these dimension-free statistical feasibility guarantees.

preprint 2020 arXiv

Parametric Scenario Optimization under Limited Data: A Distributionally Robust Optimization View

We consider optimization problems with uncertain constraints that need to be satisfied probabilistically. When data are available, a common method to obtain feasible solutions for such problems is to impose sampled constraints, following the so-called scenario optimization approach. However, when the data size is small, the sampled constraints may not statistically support a feasibility guarantee on the obtained solution. This paper studies how to leverage parametric information and the power of Monte Carlo simulation to obtain feasible solutions for small-data situations. Our approach makes use of a distributionally robust optimization (DRO) formulation that translates the data size requirement into a Monte Carlo sample size requirement drawn from what we call a generating distribution. We show that, while the optimal choice of this generating distribution is the one eliciting the data or the baseline distribution in a nonparametric divergence-based DRO, it is not necessarily so in the parametric case. Correspondingly, we develop procedures to obtain generating distributions that improve upon these basic choices. We support our findings with several numerical examples.

preprint 2020 arXiv

Robust Importance Weighting for Covariate Shift

In many learning problems, the training and testing data follow different distributions, and a particularly common situation is covariate shift. To correct for sampling biases, most approaches, including the popular kernel mean matching (KMM), focus on estimating the importance weights between the two distributions. Reweighting-based methods, however, are exposed to high variance when the distributional discrepancy is large and the weights are poorly estimated. On the other hand, the alternate approach of using nonparametric regression (NR) incurs high bias when the training size is limited. In this paper, we propose and analyze a new estimator that systematically integrates the residuals of NR with KMM reweighting, based on a control-variate perspective. The proposed estimator can be shown to either strictly outperform or match the best-known existing rates for both KMM and NR, and thus is a robust combination of both estimators. The experiments show that the estimator works well in practice.

preprint 2016 arXiv

Accelerated Evaluation of Automated Vehicles Safety in Lane Change Scenarios Based on Importance Sampling Techniques

Automated vehicles (AVs) must be evaluated thoroughly before their release and deployment. A widely used evaluation approach is the Naturalistic-Field Operational Test (N-FOT), which tests prototype vehicles directly on public roads. Due to the low exposure to safety-critical scenarios, N-FOTs are time-consuming and expensive to conduct. In this paper, we propose an accelerated evaluation approach for AVs. The results can be used to generate motions of the primary other vehicles to accelerate the verification of AVs in simulations and controlled experiments. Frontal collision due to unsafe cut-ins is the target crash type of this paper. Human-controlled vehicles making unsafe lane changes are modeled as the primary disturbance to AVs based on data collected by the University of Michigan Safety Pilot Model Deployment Program. The cut-in scenarios are generated based on skewed statistics of collected human driver behaviors, which generate risky testing scenarios while preserving the statistical information so that the safety benefits of AVs in non-accelerated cases can be accurately estimated. The Cross Entropy method is used to recursively search for the optimal skewing parameters. The frequencies of occurrence of conflicts, crashes and injuries are estimated for a modeled automated vehicle, and the achieved accelerated rate is around 2,000 to 20,000. In other words, in the accelerated simulations, driving for 1,000 miles will expose the AV to challenging scenarios that would take about 2 to 20 million miles of real-world driving to encounter. This technique thus has the potential to greatly reduce the development and validation time for AVs.

preprint 2016 arXiv

Recovering Best Statistical Guarantees via the Empirical Divergence-based Distributionally Robust Optimization

We investigate the use of distributionally robust optimization (DRO) as a tractable tool to recover the asymptotic statistical guarantees provided by the Central Limit Theorem, for maintaining the feasibility of an expected value constraint under ambiguous probability distributions. We show that using empirically defined Burg-entropy divergence balls to construct the DRO can attain such guarantees. These balls, however, are not reasoned from the standard data-driven DRO framework since by themselves they can have low or even zero probability of covering the true distribution. Rather, their superior statistical performances are endowed by linking the resulting DRO with empirical likelihood and empirical processes. We show that the sizes of these balls can be optimally calibrated using chi-square process excursion. We conduct numerical experiments to support our theoretical findings.

preprint 2016 arXiv

Sensitivity to Serial Dependency of Input Processes: A Robust Approach

Procedures for assessing the impact of serial dependency on performance analysis are usually built on parametrically specified models. In this paper, we propose a robust, nonparametric approach to carry out this assessment, by computing the worst-case deviation of the performance measure due to arbitrary dependence. The approach is based on optimizations, posited on the model space, that have constraints specifying the level of dependency measured by a nonparametric distance to some nominal i.i.d. input model. We study approximation methods for these optimizations via simulation and analysis-of-variance (ANOVA). Numerical experiments demonstrate how the proposed approach can discover the hidden impacts of dependency beyond those revealed by conventional parametric modeling and correlation studies.

preprint 2016 arXiv

The Empirical Likelihood Approach to Quantifying Uncertainty in Sample Average Approximation

We study the empirical likelihood approach to construct confidence intervals for the optimal value and the optimality gap of a given solution, and hence to quantify the statistical uncertainty of sample average approximation, for optimization problems with expected value objectives and constraints where the underlying probability distributions are observed via limited data. This approach relies on two distributionally robust optimization problems posited over the uncertain distribution, with a divergence-based uncertainty set that is suitably calibrated to provide asymptotic statistical guarantees.

preprint 2015 arXiv

A Bayesian Approach for Online Classifier Ensemble

We propose a Bayesian approach for recursively estimating the classifier weights in online learning of a classifier ensemble. In contrast with past methods, such as stochastic gradient descent or online boosting, our approach estimates the weights by recursively updating their posterior distribution. For a specified class of loss functions, we show that it is possible to formulate a suitably defined likelihood function and hence use the posterior distribution as an approximation to the global empirical loss minimizer. If the stream of training data is sampled from a stationary process, we can also show that our approach admits a superior rate of convergence to the expected loss minimizer than is possible with standard stochastic gradient descent. In experiments with real-world datasets, our formulation often performs better than state-of-the-art stochastic gradient descent and online boosting algorithms.

preprint 2015 arXiv

Robust Sensitivity Analysis for Stochastic Systems

We study a worst-case approach to measure the sensitivity to model misspecification in the performance analysis of stochastic systems. The situation of interest is when only minimal parametric information is available on the form of the true model. Under this setting, we pose optimization programs that compute the worst-case performance measures, subject to constraints on the amount of model misspecification measured by Kullback-Leibler (KL) divergence. Our main contribution is the development of infinitesimal approximations for these programs, resulting in asymptotic expansions of their optimal values in terms of the divergence. The coefficients of these expansions can be computed via simulation, and are mathematically derived from the representation of the worst-case models as changes of measure that satisfy a well-defined class of functional fixed point equations.

preprint 2014 arXiv

From Black-Scholes to Online Learning: Dynamic Hedging under Adversarial Environments

We consider a non-stochastic online learning approach to price financial options by modeling the market dynamic as a repeated game between nature (the adversary) and the investor. We demonstrate that such a framework yields a structure analogous to that of the Black-Scholes model, the widely popular option pricing model in stochastic finance, for both European and American options with convex payoffs. In the case of non-convex options, we construct approximate pricing algorithms, and demonstrate that their efficiency can be analyzed through the introduction of an artificial probability measure, in parallel to the so-called risk-neutral measure in the finance literature, even though our framework is completely adversarial. Continuous-time convergence results and extensions to incorporate price jumps are also presented.

preprint 2013 arXiv

Learning about social learning in MOOCs: From statistical analysis to generative model

We study user behavior in the courses offered by a major Massive Online Open Course (MOOC) provider during the summer of 2013. Since social learning is a key element of scalable education in MOOCs and is done via online discussion forums, our main focus is in understanding forum activities. Two salient features of MOOC forum activities drive our research: 1. High decline rate: for all courses studied, the volume of discussions in the forum declines continuously throughout the duration of the course. 2. High-volume, noisy discussions: at least 30% of the courses produce new discussion threads at rates that are infeasible for students or teaching staff to read through. Furthermore, a substantial portion of the discussions are not directly course-related. We investigate factors that correlate with the decline of activity in the online discussion forums and find effective strategies to classify threads and rank their relevance. Specifically, we use linear regression models to analyze the time series of the count data for the forum activities and make a number of observations, e.g., the teaching staff's active participation in the discussion increases the discussion volume but does not slow down the decline rate. We then propose a unified generative model for the discussion threads, which allows us both to choose efficient thread classifiers and design an effective algorithm for ranking thread relevance. Our ranking algorithm is further compared against two baseline algorithms, using human evaluation from Amazon Mechanical Turk. The authors on this paper are listed in alphabetical order. For media and press coverage, please refer to us collectively, as "researchers from the EDGE Lab at Princeton University, together with collaborators at Boston University and Microsoft Corporation."

preprint 2012 arXiv

Chernoff-Hoeffding Bounds for Markov Chains: Generalized and Simplified

We prove the first Chernoff-Hoeffding bounds for general nonreversible finite-state Markov chains based on the standard L_1 (variation distance) mixing-time of the chain. Specifically, consider an ergodic Markov chain M and a weight function f: [n] -> [0,1] on the state space [n] of M with mean mu = E_{v <- pi}[f(v)], where pi is the stationary distribution of M. A t-step random walk (v_1,...,v_t) on M starting from the stationary distribution pi has expected total weight E[X] = mu t, where X = sum_{i=1}^t f(v_i). Let T be the L_1 mixing-time of M. We show that the probability of X deviating from its mean by a multiplicative factor of delta, i.e., Pr [ |X - mu t| >= delta mu t ], is at most exp(-Omega(delta^2 mu t / T)) for 0 <= delta <= 1, and exp(-Omega(delta mu t / T)) for delta > 1. In fact, the bounds hold even if the weight functions f_i's for i in [t] are distinct, provided that all of them have the same mean mu. We also obtain a simplified proof for the Chernoff-Hoeffding bounds based on the spectral expansion lambda of M, which is the square root of the second largest eigenvalue (in absolute value) of M tilde{M}, where tilde{M} is the time-reversal Markov chain of M. We show that the probability Pr [ |X - mu t| >= delta mu t ] is at most exp(-Omega(delta^2 (1-lambda) mu t)) for 0 <= delta <= 1, and exp(-Omega(delta (1-lambda) mu t)) for delta > 1. Both of our results extend to continuous-time Markov chains, and to the case where the walk starts from an arbitrary distribution x, at a price of a multiplicative factor depending on the distribution x in the concentration bounds.
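
For readability, the two families of bounds stated in the abstract above can be typeset as follows. This is only a restatement in LaTeX using the abstract's own symbols ($X$, $\mu$, $t$, $\delta$, the mixing time $T$ and the spectral expansion $\lambda$); no constants beyond those in the abstract are implied.

```latex
% Mixing-time version (T is the L_1 mixing time of M):
\[
\Pr\bigl[\,|X - \mu t| \ge \delta \mu t\,\bigr] \le
\begin{cases}
\exp\bigl(-\Omega(\delta^{2} \mu t / T)\bigr), & 0 \le \delta \le 1,\\[2pt]
\exp\bigl(-\Omega(\delta \mu t / T)\bigr), & \delta > 1.
\end{cases}
\]

% Spectral version (\lambda is the spectral expansion of M):
\[
\Pr\bigl[\,|X - \mu t| \ge \delta \mu t\,\bigr] \le
\begin{cases}
\exp\bigl(-\Omega(\delta^{2} (1-\lambda) \mu t)\bigr), & 0 \le \delta \le 1,\\[2pt]
\exp\bigl(-\Omega(\delta (1-\lambda) \mu t)\bigr), & \delta > 1.
\end{cases}
\]
```

In both versions the exponent is quadratic in $\delta$ for small deviations and linear in $\delta$ for large ones, with $1/T$ and $1-\lambda$ playing the same role of discounting the effective number of steps.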

preprint 2012 arXiv

Efficient Rare-event Simulation for Perpetuities

We consider perpetuities of the form $D = B_1 \exp(Y_1) + B_2 \exp(Y_1 + Y_2) + \cdots$, where the $Y_j$'s and $B_j$'s might be i.i.d. or jointly driven by a suitable Markov chain. We assume that the $Y_j$'s satisfy the so-called Cramér condition with associated root $\theta_{\ast} \in (0,\infty)$ and that the tails of the $B_j$'s are appropriately behaved so that $D$ is regularly varying with index $\theta_{\ast}$. We illustrate by means of an example that the natural state-independent importance sampling estimator obtained by exponentially tilting the $Y_j$'s according to $\theta_{\ast}$ fails to provide an efficient estimator (in the sense of appropriately controlling the relative mean squared error as the tail probability of interest gets smaller). Then, we construct estimators based on state-dependent importance sampling that are rigorously shown to be efficient.

preprint 2012 arXiv

Rare-Event Simulation for Many-Server Queues

We develop rare-event simulation methodology for the analysis of loss events in a many-server loss system under the quality-driven regime, focusing on the steady-state loss probability (i.e., the fraction of lost customers over arrivals) and the behavior of the whole system leading to loss events. The analysis of these events requires working with the full measure-valued process describing the system. This is the first algorithm that is shown to be asymptotically optimal, in the rare-event simulation context, under the setting of many-server queues involving a full measure-valued descriptor.

preprint 2011 arXiv

Corrections to the Central Limit Theorem for Heavy-Tailed Probability Densities

Classical Edgeworth expansions provide asymptotic correction terms to the Central Limit Theorem (CLT) up to an order that depends on the number of moments available. In this paper, we provide subsequent correction terms beyond those given by a standard Edgeworth expansion in the general case of regularly varying distributions with diverging moments (beyond the second). The subsequent terms can be expressed in a simple closed form in terms of certain special functions (Dawson's integral and parabolic cylinder functions), and there are qualitative differences depending on whether the number of moments available is even, odd or not an integer, and whether the distributions are symmetric or not. If the increments have an even number of moments, then additional logarithmic corrections must also be incorporated in the expansion parameter. An interesting feature of our correction terms for the CLT is that they become dominant outside the central region and blend naturally with known large-deviation asymptotics when these are applied formally to the spatial scales of the CLT.

preprint 2011 arXiv

Information Dissemination via Random Walks in d-Dimensional Space

We study a natural information dissemination problem for multiple mobile agents in a bounded Euclidean space. Agents are placed uniformly at random in the $d$-dimensional space $\{-n, ..., n\}^d$ at time zero, and one of the agents holds a piece of information to be disseminated. All the agents then perform independent random walks over the space, and the information is transmitted from one agent to another if the two agents are sufficiently close. We wish to bound the total time before all agents receive the information (with high probability). Our work extends Pettarin et al.'s work (Infectious random walks, arXiv:1007.1604v2, 2011), which solved the problem for $d \leq 2$. We present tight bounds up to polylogarithmic factors for the case $d = 3$. (While our results extend to higher dimensions, for space and readability considerations we provide only the case $d=3$ here.) Our results show the behavior when $d \geq 3$ is qualitatively different from the case $d \leq 2$. In particular, as the ratio between the volume of the space and the number of agents varies, we show an interesting phase transition for three dimensions that does not occur in one or two dimensions.
