Source author record

Erick Delage

Erick Delage appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

math.OC · Machine Learning · Computer Science and Game Theory · math.CO · q-fin.PR

Catalog footprint

What is connected

5 works

5 topics

4 close collaborators

Preprint · 2026 · arXiv

Actor-Critic Algorithm for Dynamic Expectile and CVaR

Optimizing dynamic risk with stochastic policies is challenging in both policy updates and value learning. The former typically requires transition perturbation, while the latter may rely on model-based approaches. To address these challenges, we propose a surrogate policy gradient without transition perturbation under softmax policy parameterization. We further develop model-free value learning methods for dynamic expectile and conditional value-at-risk by leveraging elicitability. Finally, inspired by Expected SARSA and Expected Policy Gradient, we construct a model-free off-policy actor-critic algorithm. Empirical results in domains with verifiable risk-averse behavior show that our algorithm can learn risk-averse policies and consistently outperforms existing methods.
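The abstract relies on elicitability: a statistic is elicitable when it minimizes an expected loss, which is what makes sample-based, model-free estimation possible. As a minimal static illustration (not the paper's dynamic actor-critic algorithm), the sketch below computes a sample expectile as the minimizer of an asymmetric squared loss. The function names, the fixed-point iteration, and the convention that larger `tau` puts more weight on large values are assumptions made for this example.

```python
def expectile_loss(e, xs, tau):
    """Asymmetric squared loss; its minimizer over e is the tau-expectile of xs."""
    total = 0.0
    for x in xs:
        w = tau if x > e else 1.0 - tau
        total += w * (x - e) ** 2
    return total / len(xs)


def expectile(xs, tau, tol=1e-12, max_iter=1000):
    """Sample tau-expectile via the weighted-mean fixed point.

    The first-order condition of expectile_loss is sum_i w_i * (x_i - e) = 0,
    i.e. e = sum_i w_i * x_i / sum_i w_i, where w_i depends on the side of e
    on which x_i falls. We iterate that identity until it stabilizes.
    """
    e = sum(xs) / len(xs)  # tau = 0.5 recovers the mean, so start there
    for _ in range(max_iter):
        num = den = 0.0
        for x in xs:
            w = tau if x > e else 1.0 - tau
            num += w * x
            den += w
        e_new = num / den
        if abs(e_new - e) <= tol:
            return e_new
        e = e_new
    return e
```

For `tau = 0.5` the weights are symmetric and the expectile coincides with the sample mean; moving `tau` toward 1 shifts the estimate toward the upper tail.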

Preprint · 2026 · arXiv

Mitigating optimistic bias in entropic risk estimation and optimization

The entropic risk measure is widely used in high-stakes decision-making across economics, management science, finance, and safety-critical control systems because it captures tail risks associated with uncertain losses. However, when data are limited, the empirical entropic risk estimator, formed by replacing the expectation in the risk measure with a sample average, underestimates true risk. We show that this negative bias grows superlinearly with the standard deviation of the loss for distributions with unbounded right tails. We further demonstrate that several existing bias reduction techniques developed for empirical risk either continue to underestimate entropic risk or substantially overestimate it, potentially leading to overly risky or overly conservative decisions. To address this issue, we develop a parametric bootstrap procedure that is strongly asymptotically consistent and provides a controlled overestimation of entropic risk under mild assumptions. The method first fits a distribution to the data and then estimates the empirical estimator's bias via bootstrapping. We show that the fitted distribution must satisfy only weak regularity conditions, and Gaussian mixture models offer a convenient and flexible choice within this class. As an application, we introduce a distributionally robust optimization model for an insurance contract design problem that incorporates correlations in household losses. We show that selecting regularization parameters using standard cross-validation can lead to substantially higher out-of-sample risk for the insurer if the validation bias is not corrected. Our approach improves performance by recommending higher and more accurate premiums, thereby better reflecting the underlying tail risk.
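The two-step procedure described above (fit a distribution, then estimate the empirical estimator's bias by bootstrapping) can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's procedure: it fits a single Gaussian rather than a Gaussian mixture, assumes a risk-aversion parameter `alpha > 0`, and uses hypothetical function names. The entropic risk of a loss X is taken to be (1/alpha) log E[exp(alpha X)], which for a Gaussian loss has the closed form mu + alpha sigma^2 / 2.

```python
import math
import random
import statistics


def empirical_entropic_risk(losses, alpha):
    """(1/alpha) * log(sample mean of exp(alpha * loss)), computed stably."""
    m = max(alpha * x for x in losses)
    s = sum(math.exp(alpha * x - m) for x in losses)
    return (m + math.log(s / len(losses))) / alpha


def gaussian_entropic_risk(mu, sigma, alpha):
    """Closed-form entropic risk of a N(mu, sigma^2) loss."""
    return mu + 0.5 * alpha * sigma ** 2


def bootstrap_corrected_risk(losses, alpha, n_boot=500, seed=0):
    """Empirical entropic risk minus a parametric-bootstrap bias estimate."""
    rng = random.Random(seed)
    n = len(losses)
    # Step 1: fit a distribution to the data (a single Gaussian, by assumption).
    mu = statistics.fmean(losses)
    sigma = statistics.stdev(losses)
    fitted_risk = gaussian_entropic_risk(mu, sigma, alpha)
    # Step 2: draw samples of the same size n from the fitted model and
    # measure how far the empirical estimator falls from the fitted risk.
    boot = [
        empirical_entropic_risk([rng.gauss(mu, sigma) for _ in range(n)], alpha)
        for _ in range(n_boot)
    ]
    bias = statistics.fmean(boot) - fitted_risk  # negative on average (Jensen)
    return empirical_entropic_risk(losses, alpha) - bias
```

Because the logarithm is concave, the expected value of the empirical estimator lies below the true risk, so the estimated bias is typically negative and the corrected value sits above the plain empirical estimate.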

Preprint · 2022 · arXiv

A Double-oracle, Logic-based Benders decomposition approach to solve the K-adaptability problem

We propose a novel approach to solve K-adaptability problems with convex objective and constraints and integer first-stage decisions. A logic-based Benders decomposition handles the first-stage decisions in a master problem, so that the sub-problem becomes a min-max-min robust combinatorial optimization problem. This sub-problem is solved via a double-oracle algorithm that iteratively generates adverse scenarios and recourse decisions, and assigns scenarios to K subsets of the decisions by solving p-center problems. Extensions of the proposed approach to handle parameter uncertainty in both the first-stage objective and the second-stage constraints are also provided. We show that the proposed algorithm converges to an optimal solution and terminates in a finite number of iterations. Numerical results obtained from experiments on benchmark instances of the adaptive shortest path problem, the regular knapsack problem, and a generic K-adaptability problem demonstrate the performance advantage of the proposed approach when compared to state-of-the-art methods in the literature.
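To make the min-max-min structure concrete, the sketch below evaluates it for a fixed set of K recourse decisions over a finite list of scenarios: each scenario is served by its cheapest recourse decision, and the objective is the worst such cost. This only illustrates the objective and the resulting scenario assignment; it is not the paper's double-oracle algorithm, which generates scenarios and decisions iteratively and assigns them by solving p-center problems. The cost-table layout `costs[k][s]` and the function name are assumptions for this example.

```python
def worst_case_value(costs):
    """Evaluate max over scenarios of min over recourse decisions.

    costs[k][s] is the cost of recourse decision k under scenario s.
    Returns (value, assignment), where assignment[s] is the index of the
    recourse decision that serves scenario s.
    """
    n_decisions = len(costs)
    n_scenarios = len(costs[0])
    worst = float("-inf")
    assignment = []
    for s in range(n_scenarios):
        # Inner min: the decision maker picks the best of the K decisions.
        best_k = min(range(n_decisions), key=lambda k: costs[k][s])
        assignment.append(best_k)
        # Outer max: the adversary picks the scenario that hurts most.
        worst = max(worst, costs[best_k][s])
    return worst, assignment


# Two recourse decisions (K = 2) and three scenarios.
value, assignment = worst_case_value([[1, 5, 3], [4, 2, 6]])
```

In this small instance the first decision serves scenarios 0 and 2 and the second serves scenario 1, so the worst-case cost is 3, whereas committing to either decision alone would give a worst case of 5 or 6.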

Preprint · 2020 · arXiv

Equal Risk Pricing and Hedging of Financial Derivatives with Convex Risk Measures

In this paper, we consider the problem of equal risk pricing and hedging in which the fair price of an option is the price that exposes both sides of the contract to the same level of risk. Focusing for the first time on the context where risk is measured according to convex risk measures, we establish that the problem reduces to solving independently the writer's and the buyer's hedging problems with zero initial capital. By further imposing that the risk measures decompose in a way that satisfies a Markovian property, we provide dynamic programming equations that can be used to solve the hedging problems for both European and American options. All of our results are general enough to accommodate situations where the risk is measured according to a worst-case risk measure, as is typically done in robust optimization. Our numerical study illustrates the advantages of equal risk pricing over schemes that only account for a single party, pricing based on quadratic hedging (i.e., $\epsilon$-arbitrage pricing), or pricing based on a fixed equivalent martingale measure (i.e., Black-Scholes pricing). In particular, the numerical results confirm that when employing an equal risk price, both the writer and the buyer end up being exposed to risks that are more similar and on average smaller than what they would experience with the other approaches.
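The reduction to two zero-capital hedging problems can be seen from a short derivation. The notation here is introduced for illustration and is not taken from the paper: $F$ is the option payoff, $G(\xi)$ the trading gains of a hedging strategy $\xi$, $p$ the price, and $\rho^w$, $\rho^b$ the writer's and buyer's risk measures applied to losses. The sketch assumes zero interest rates and uses only translation invariance, $\rho(X + c) = \rho(X) + c$ for a constant $c$.

```latex
\begin{align*}
\varrho^w(p) &= \inf_{\xi}\, \rho^w\bigl(F - G(\xi) - p\bigr) = \varrho^w(0) - p
  && \text{(writer receives $p$, owes $F$)} \\
\varrho^b(p) &= \inf_{\xi}\, \rho^b\bigl(-F - G(\xi) + p\bigr) = \varrho^b(0) + p
  && \text{(buyer pays $p$, receives $F$)} \\
\varrho^w(p^\ast) = \varrho^b(p^\ast)
  &\;\Longrightarrow\; p^\ast = \tfrac{1}{2}\bigl(\varrho^w(0) - \varrho^b(0)\bigr)
\end{align*}
```

Under these assumptions, and provided both zero-capital risks are finite, the price only shifts each party's minimal risk by a constant, so the two hedging problems can be solved separately and combined afterwards.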

Preprint · 2020 · arXiv

The value of randomized strategies in distributionally robust risk averse network interdiction games

Conditional Value at Risk (CVaR) is widely used to account for the preferences of a risk-averse agent in extreme loss scenarios. To study the effectiveness of randomization in interdiction games with an interdictor that is both risk and ambiguity averse, we introduce a distributionally robust network interdiction game where the interdictor randomizes over the feasible interdiction plans in order to minimize the worst-case CVaR of the flow with respect to both the unknown distribution of the capacity of the arcs and his mixed strategy over interdicted arcs. The flow player, on the contrary, maximizes the total flow in the network. By using the budgeted uncertainty set, we control the degree of conservatism in the model and reformulate the interdictor's non-linear problem as a bi-convex optimization problem. To solve this problem to any given optimality level, we devise a spatial branch and bound algorithm that uses the McCormick inequalities and the reduced reformulation linearization technique (RRLT) to obtain a convex relaxation of the problem. We also develop a column generation algorithm to identify the optimal support of the convex relaxation, which is then used in a coordinate descent algorithm to determine the upper bounds. Numerical experiments establish the efficiency and convergence of the spatial branch and bound algorithm. Further, they show that randomized strategies can have significantly better in-sample and out-of-sample performance than optimal deterministic ones.
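The McCormick inequalities mentioned above relax a bilinear term w = x * y over a box into four linear inequalities, which is what makes the relaxation convex. The sketch below evaluates the resulting lower and upper envelopes at a point; it illustrates the standard inequalities only, not the paper's branch and bound scheme or RRLT, and the function name and argument order are assumptions for this example.

```python
def mccormick_bounds(x, y, x_lo, x_hi, y_lo, y_hi):
    """Lower and upper McCormick envelopes of x * y on a box.

    For x in [x_lo, x_hi] and y in [y_lo, y_hi], the product w = x * y satisfies
        w >= x_lo * y + x * y_lo - x_lo * y_lo
        w >= x_hi * y + x * y_hi - x_hi * y_hi
        w <= x_hi * y + x * y_lo - x_hi * y_lo
        w <= x_lo * y + x * y_hi - x_lo * y_hi
    Each follows from expanding a product of two nonnegative factors,
    e.g. (x - x_lo) * (y - y_lo) >= 0 gives the first inequality.
    """
    lower = max(
        x_lo * y + x * y_lo - x_lo * y_lo,
        x_hi * y + x * y_hi - x_hi * y_hi,
    )
    upper = min(
        x_hi * y + x * y_lo - x_hi * y_lo,
        x_lo * y + x * y_hi - x_lo * y_hi,
    )
    return lower, upper
```

The envelopes are exact at the corners of the box and loosest in its interior, which is why spatial branch and bound tightens the relaxation by splitting the box into smaller ones.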

Institution

Affiliation not imported yet

This author record came from a source that does not expose affiliation metadata. Once the author claims the profile or we enrich the record from another provider, this section will link to the concrete institution.

Topic footprint