Source author record

Tien Mai

Tien Mai appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

math.OC, Machine Learning, econ.EM, econ.GN, q-fin.EC, Applications, Artificial Intelligence, Computer Science and Game Theory, eess.SY, Systems and Control

Catalog footprint

What is connected

11 works

10 topics

4 close collaborators

preprint, 2026, arXiv

Joint Binary-Continuous Fractional Programming: Solution Methods and Applications

In this paper, we investigate a class of non-convex sum-of-ratios programs relevant to decision-making in key areas such as product assortment and pricing, and facility location and cost planning. These optimization problems, characterized by both continuous and binary decision variables, are highly non-convex and challenging to solve. To the best of our knowledge, no existing methods can efficiently solve these problems to near-optimality with arbitrary precision. To address this challenge, we propose an innovative approach based on logarithmic transformations and piecewise linear approximation (PWLA) to approximate the nonlinear fractional program as a mixed-integer convex program with arbitrary precision, which can be efficiently solved using cutting plane (CP) or Branch-and-Cut (B&C) procedures. Our method offers several advantages: it allows for a shared set of binary variables to approximate nonlinear terms and employs an optimal set of breakpoints to approximate other non-convex terms in the reformulation, resulting in an approximate model that is minimal in size. Furthermore, we provide a theoretical analysis of the approximation errors associated with the solutions derived from the approximated problem. We demonstrate the applicability of our approach to constrained competitive joint facility location and cost optimization, as well as constrained product assortment and pricing problems. Extensive experiments on instances of varying sizes, comparing our method with several alternatives, including general-purpose solvers and more direct PWLA-based approximations, show that our approach consistently outperforms all baselines, particularly on large-scale instances.

preprint, 2025, arXiv

Constrained Assortment and Price Optimization under Generalized Nested Logit Models

We study assortment and price optimization under the generalized nested logit (GNL) model, one of the most general and flexible modeling frameworks in discrete choice modeling. Despite its modeling advantages, optimization under GNL is highly challenging: even the pure assortment problem is NP-hard, and existing approaches rely on approximation schemes or are limited to simple cardinality constraints. In this paper, we develop the first exact and near-exact algorithms for constrained assortment and joint assortment-pricing optimization (JAP) under GNL. Our approach reformulates the problem into bilinear and exponential-cone convex programs and exploits convexity, concavity, and submodularity properties to generate strong cutting planes within a Branch-and-Cut (B&C) framework. We further extend this framework to the mixed GNL (MGNL) model, capturing heterogeneous customer segments, and to JAP with discrete prices. For the continuous pricing case, we propose a near-exact algorithm based on piecewise-linear approximation (PWLA) that achieves arbitrarily high precision under general linear constraints. Extensive computational experiments demonstrate that our methods substantially outperform state-of-the-art approximation approaches in both solution quality and scalability. In particular, we are able to solve large-scale instances with up to 1000 products and 20 nests, and to obtain near-optimal solutions for continuous pricing problems with negligible optimality gaps. To the best of our knowledge, this work resolves several open problems in assortment and price optimization under GNL.

preprint, 2022, arXiv

Estimation of Recursive Route Choice Models with Incomplete Trip Observations

This work concerns the estimation of recursive route choice models when the trip observations are incomplete, i.e., there are unconnected links (or nodes) in the observations. A direct approach to handling this issue would be intractable because enumerating all paths between unconnected links (or nodes) in a real network is typically not possible. We exploit an expectation-maximization (EM) method that deals with the missing-data issue by alternately performing two steps: sampling the missing segments in the observations and solving maximum likelihood estimation problems. Moreover, observing that the EM method would be expensive, we propose a new estimation method based on the idea that the choice probabilities of unconnected link observations can be exactly computed by solving systems of linear equations. We further design a new algorithm, called decomposition-composition (DC), that helps reduce the number of systems of linear equations to be solved and speeds up the estimation. We compare our proposed algorithms with some standard baselines using a dataset from a real network and show that the DC algorithm outperforms the other approaches in recovering missing information in the observations. Our methods work with most of the recursive route choice models proposed in the literature, including the recursive logit, nested recursive logit, or discounted recursive models.
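The idea that probabilities between unconnected observations reduce to linear systems can be illustrated with a small sketch. It assumes the basic recursive logit setting, where the exponentiated value function z satisfies z = Mz + b, and it computes the probability of reaching a later observed state by solving a second linear system. The toy network, the utilities, and the function names are all hypothetical, and this is not necessarily the formulation or the DC algorithm used in the paper.

```python
import math


def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def link_choice_probabilities(utility, dest):
    """Recursive-logit transition probabilities P[k][a].

    utility[k][a] is the (negative) utility of moving from k to a,
    or None when the move is not allowed. `dest` is the absorbing state.
    """
    n = len(utility)
    M = [[math.exp(u) if u is not None else 0.0 for u in row] for row in utility]
    eye = identity(n)
    A = [[eye[k][a] - M[k][a] for a in range(n)] for k in range(n)]
    b = [1.0 if k == dest else 0.0 for k in range(n)]
    z = solve_linear(A, b)  # z[k] = exp(value function at k), from (I - M) z = b
    return [[M[k][a] * z[a] / z[k] for a in range(n)] for k in range(n)]


def reach_probability(P, target):
    """Probability of ever visiting `target` from each state, via one linear solve."""
    n = len(P)
    eye = identity(n)
    A = [[eye[k][a] - P[k][a] for a in range(n)] for k in range(n)]
    A[target] = eye[target][:]  # boundary condition: h[target] = 1
    b = [1.0 if k == target else 0.0 for k in range(n)]
    return solve_linear(A, b)


# Hypothetical 4-state network: 0 -> {1, 2}, 1 -> {2, 3}, 2 -> {3}, 3 = destination.
utility = [
    [None, -1.0, -1.5, None],
    [None, None, -0.5, -2.0],
    [None, None, None, -1.0],
    [None, None, None, None],
]
P = link_choice_probabilities(utility, dest=3)
reach = reach_probability(P, target=2)
print("P(observe state 2 after state 0) =", reach[0])
```

On this acyclic toy network the result can be checked by hand, since the only ways to reach state 2 from state 0 are the direct move and the detour through state 1. On a real network with cycles, path enumeration is not possible, which is where the linear-system view matters.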

preprint, 2022, arXiv

Safe Delivery of Critical Services in Areas with Volatile Security Situation via a Stackelberg Game Approach

Vaccine delivery in under-resourced locations with security risks is not just challenging but also life-threatening. The current COVID pandemic and the need to vaccinate have added even more urgency to this issue. Motivated by this problem, we propose a general framework to set up limited temporary (vaccination) centers that balance physical security and desired (vaccine) service coverage with limited resources. We set up the problem as a Stackelberg game between the center operator (defender) and an adversary, where the set of centers is not fixed a priori but is part of the decision output. This results in a mixed combinatorial and continuous optimization problem. As part of our scalable approximation of this problem, we provide a fundamental contribution by identifying general duality conditions for switching max and min when both discrete and continuous variables are involved. We perform detailed experiments to show that the proposed solution is scalable in practice.

preprint, 2022, arXiv

Scalable Distributional Robustness in a Class of Non Convex Optimization with Guarantees

Distributionally robust optimization (DRO) has shown a lot of promise in providing robustness in learning as well as sample-based optimization problems. We endeavor to provide DRO solutions for a class of non-convex sum-of-fractionals optimization problems used for decision-making in prominent areas such as facility location and security games. In contrast to previous work, we find it more tractable to optimize the equivalent variance-regularized form of DRO rather than the minimax form. We transform the variance-regularized form into a mixed-integer second-order cone program (MISOCP), which, while guaranteeing near-global optimality, does not scale enough to solve problems with real-world datasets. We further propose two abstraction approaches based on clustering and stratified sampling to increase scalability, which we then use for real-world datasets. Importantly, we provide near-global optimality guarantees for our approach and show experimentally that our solution quality is better than the locally optimal ones achieved by state-of-the-art gradient-based methods. We experimentally compare our different approaches and baselines, and reveal nuanced properties of a DRO solution.

preprint, 2022, arXiv

Weighted Maximum Entropy Inverse Reinforcement Learning

We study inverse reinforcement learning (IRL) and imitation learning (IM), the problems of recovering a reward or policy function from an expert's demonstrated trajectories. We propose a new way to improve the learning process by adding a weight function to the maximum entropy framework, with the motivation of being able to learn and recover the stochasticity (or the bounded rationality) of the expert policy. Our framework and algorithms allow learning both a reward (or policy) function and the structure of the entropy terms added to the Markov Decision Processes, thus enhancing the learning procedure. Our numerical experiments using human and simulated demonstrations and with discrete and continuous IRL/IM tasks show that our approach outperforms prior algorithms.

preprint, 2021, arXiv

Robust Entropy-regularized Markov Decision Processes

Stochastic and soft optimal policies resulting from entropy-regularized Markov decision processes (ER-MDP) are desirable for exploration and imitation learning applications. Motivated by the fact that such policies are sensitive with respect to the state transition probabilities, and the estimation of these probabilities may be inaccurate, we study a robust version of the ER-MDP model, where the stochastic optimal policies are required to be robust with respect to the ambiguity in the underlying transition probabilities. Our work is at the crossroads of two important schemes in reinforcement learning (RL), namely, robust MDP and entropy regularized MDP. We show that essential properties that hold for the non-robust ER-MDP and robust unregularized MDP models also hold in our settings, making the robust ER-MDP problem tractable. We show how our framework and results can be integrated into different algorithmic schemes including value or (modified) policy iteration, which would lead to new robust RL and inverse RL algorithms to handle uncertainties. Analyses on computational complexity and error propagation under conventional uncertainty settings are also provided.
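A minimal sketch of how value iteration might look when entropy regularization is combined with robustness to transition ambiguity. It assumes a finite state and action space and a finite, (s, a)-rectangular uncertainty set given as a list of candidate next-state distributions per state-action pair. These modeling choices, the temperature `tau`, and all names are assumptions made for illustration, not the paper's algorithm.

```python
import math


def soft_max(values, tau):
    """Numerically stable tau * log(sum(exp(v / tau)))."""
    m = max(values)
    return m + tau * math.log(sum(math.exp((v - m) / tau) for v in values))


def robust_soft_value_iteration(rewards, kernels, gamma=0.9, tau=1.0,
                                tol=1e-10, max_iter=10_000):
    """Worst-case soft Bellman iteration.

    rewards[s][a]  : immediate reward.
    kernels[s][a]  : list of candidate next-state distributions (assumed
                     finite uncertainty set); the worst one is used.
    Returns the soft value function and the induced stochastic policy.
    """
    n = len(rewards)
    V = [0.0] * n
    for _ in range(max_iter):
        Q = [
            [
                rewards[s][a]
                + gamma * min(sum(p[t] * V[t] for t in range(n))
                              for p in kernels[s][a])
                for a in range(len(rewards[s]))
            ]
            for s in range(n)
        ]
        new_V = [soft_max(Q[s], tau) for s in range(n)]
        converged = max(abs(x - y) for x, y in zip(new_V, V)) < tol
        V = new_V
        if converged:
            break
    # Soft-optimal policy: pi(a|s) proportional to exp(Q(s,a) / tau).
    policy = [[math.exp((q - V[s]) / tau) for q in Q[s]] for s in range(n)]
    return V, policy


# Hypothetical 2-state, 2-action example with ambiguous transitions.
rewards = [[1.0, 0.0], [0.0, 0.5]]
kernels = [
    [[[0.9, 0.1], [0.6, 0.4]], [[0.2, 0.8]]],
    [[[0.5, 0.5]], [[0.1, 0.9], [0.3, 0.7]]],
]
V_robust, pi_robust = robust_soft_value_iteration(rewards, kernels)
print("robust soft values:", V_robust)
print("robust soft policy:", pi_robust)
```

Because the worst case is taken over a set that contains the nominal kernel, the robust soft values can never exceed the values obtained from the nominal kernel alone, which gives a quick sanity check on any implementation.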

preprint, 2021, arXiv

Submodularity and Local Search Approaches for Maximum Capture Problems under Generalized Extreme Value Models

We study the maximum capture problem in facility location under random utility models, i.e., the problem of seeking to locate new facilities in a competitive market such that the captured user demand is maximized, assuming that each customer chooses among all available facilities according to a random utility maximization model. We employ the generalized extreme value (GEV) family of discrete choice models and show that the objective function in this context is monotonic and submodular. This finding implies that a simple greedy heuristic can always guarantee a (1-1/e) approximation solution. We further develop a new algorithm combining a greedy heuristic, a gradient-based local search and an exchanging procedure to efficiently solve the problem. We conduct experiments using instances of different sizes and under different discrete choice models, and we show that our approach significantly outperforms prior approaches in terms of both returned objective value and CPU time. Our algorithm and theoretical findings can be applied to the maximum capture problems under various random utility models in the literature, including the popular multinomial logit, nested logit, cross nested logit, and the mixed logit models.
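The greedy step behind the (1-1/e) guarantee can be sketched as follows. The sketch assumes the multinomial logit special case, a cardinality budget, and a toy instance with two customer zones and three candidate locations. All data and function names are hypothetical, and the gradient-based local search and exchange steps of the paper's full algorithm are not shown.

```python
def captured_demand(selected, demand, competitor, attraction):
    """Expected captured demand under a multinomial logit (MNL) model.

    demand[i]        : demand of customer zone i.
    competitor[i]    : total exponentiated utility of competitors for zone i.
    attraction[i][j] : exponentiated utility of candidate location j for zone i.
    """
    total = 0.0
    for i, q in enumerate(demand):
        ours = sum(attraction[i][j] for j in selected)
        total += q * ours / (competitor[i] + ours)
    return total


def greedy_max_capture(demand, competitor, attraction, budget):
    """Open up to `budget` locations, adding the largest marginal gain each time."""
    n_locations = len(attraction[0])
    selected = []
    for _ in range(budget):
        current = captured_demand(selected, demand, competitor, attraction)
        best_j, best_gain = None, 0.0
        for j in range(n_locations):
            if j in selected:
                continue
            gain = captured_demand(selected + [j], demand, competitor, attraction) - current
            if gain > best_gain:
                best_j, best_gain = j, gain
        if best_j is None:  # no location adds value
            break
        selected.append(best_j)
    return selected


# Hypothetical toy instance: 2 customer zones, 3 candidate locations.
demand = [100.0, 60.0]
competitor = [1.0, 1.0]
attraction = [
    [3.0, 1.0, 0.0],
    [0.0, 1.0, 3.0],
]
chosen = greedy_max_capture(demand, competitor, attraction, budget=2)
value = captured_demand(chosen, demand, competitor, attraction)
print(chosen, value)  # [1, 0] 110.0
```

On this instance greedy first opens location 1 (capturing 80) and then location 0, for a total of 110, while the best pair {0, 2} captures 120. The greedy value is not optimal here but stays well above (1-1/e) of the optimum, which is the kind of gap that a subsequent local search or exchange step is meant to close.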

preprint, 2020, arXiv

A Relation Analysis of Markov Decision Process Frameworks

We study the relation between different Markov Decision Process (MDP) frameworks in the machine learning and econometrics literatures, including the standard MDP, the entropy and general regularized MDP, and stochastic MDP, where the latter is based on the assumption that the reward function is stochastic and follows a given distribution. We show that the entropy-regularized MDP is equivalent to a stochastic MDP model, and is strictly subsumed by the general regularized MDP. Moreover, we propose a distributional stochastic MDP framework by assuming that the distribution of the reward function is ambiguous. We further show that the distributional stochastic MDP is equivalent to the regularized MDP, in the sense that they always yield the same optimal policies. We also provide a connection between stochastic/regularized MDP and constrained MDP. Our work gives a unified view on several important MDP frameworks, which would lead to new ways to interpret the (entropy/general) regularized MDP frameworks through the lens of stochastic rewards and vice versa. Given the recent popularity of regularized MDP in (deep) reinforcement learning, our work brings a new understanding of how such algorithmic schemes work and suggests ideas to develop new ones.

preprint, 2020, arXiv

Generalized Maximum Causal Entropy for Inverse Reinforcement Learning

We consider the problem of learning from demonstrated trajectories with inverse reinforcement learning (IRL). Motivated by a limitation of the classical maximum entropy model in capturing the structure of the network of states, we propose an IRL model based on a generalized version of the causal entropy maximization problem, which allows us to generate a class of maximum entropy IRL models. Our generalized model has the advantage of being able to recover, in addition to a reward function, another expert's function that would (partially) capture the impact of the connecting structure of the states on experts' decisions. Empirical evaluation on a real-world dataset and a grid-world dataset shows that our generalized model outperforms the classical ones in terms of recovering reward functions and demonstrated trajectories.

preprint, 2020, arXiv

Modeling Route Choice with Real-Time Information: Comparing the Recursive and Non-Recursive Models

We study the routing policy choice problems in a stochastic time-dependent (STD) network. A routing policy is defined as a decision rule applied at the end of each link that maps the realized traffic condition to the decision on the link to take next. Two types of routing policy choice models are formulated with perfect online information (POI): recursive logit model and non-recursive logit model. In the non-recursive model, a choice set of routing policies between an origin-destination (OD) pair is generated, and a probabilistic choice is modeled at the origin, while the choice of the next link at each link is a deterministic execution of the chosen routing policy. In the recursive model, the probabilistic choice of the next link is modeled at each link, following the framework of dynamic discrete choice models. The two models are further compared in terms of computational efficiency in estimation and prediction, and flexibility in systematic utility specification and modeling correlation.