Source author record

Eugene A. Feinberg

Eugene A. Feinberg appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


math.OC, math.PR, math.GN, Artificial Intelligence, Hardware Architecture, math.CA, math.FA

Catalog footprint

What is connected

22 works

7 topics

4 close collaborators


preprint, 2023, arXiv

Semi-Uniform Feller Stochastic Kernels

This paper studies transition probabilities from a Borel subset of a Polish space to a product of two Borel subsets of Polish spaces. For such transition probabilities it introduces and studies the property of semi-uniform Feller continuity. This paper provides several equivalent definitions of semi-uniform Feller continuity and establishes its preservation under integration. The motivation for this study came from the theory of Markov decision processes with incomplete information, and this paper provides fundamental results useful for this theory.

preprint, 2022, arXiv

Continuity of Discounted Values and the Structure of Optimal Policies for Periodic-Review Inventory Control with Setup Costs

This paper proves continuity of value functions in discounted periodic-review single-commodity total-cost inventory control problems with continuous inventory levels, fixed ordering costs, possibly bounded inventory storage capacity, and possibly bounded order sizes for finite and infinite horizons. In each of these constrained models, the finite and infinite-horizon value functions are continuous, there exist deterministic Markov optimal finite-horizon policies, and there exist stationary deterministic Markov optimal infinite-horizon policies. For models with bounded inventory storage and unbounded order sizes, this paper also characterizes the conditions under which $(s_t, S_t)$ policies are optimal in the finite horizon and an $(s,S)$ policy is optimal in the infinite horizon.
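
As an illustration of the $(s,S)$ rule named in this abstract, here is a minimal sketch: order up to $S$ whenever the inventory level is below $s$, otherwise order nothing. The function names, the strict inequality `x < s`, and the backordering dynamics are assumptions made for the example; they are not taken from the paper, which treats capacity and order-size constraints not modeled here.

```python
from typing import List


def order_quantity(x: float, s: float, S: float) -> float:
    """Order size under an (s, S) rule: order up to S when x is below s."""
    if s > S:
        raise ValueError("an (s, S) policy requires s <= S")
    return S - x if x < s else 0.0


def simulate(x0: float, demands: List[float], s: float, S: float) -> List[float]:
    """End-of-period inventory levels; unmet demand is backordered (assumption)."""
    levels = []
    x = x0
    for d in demands:
        # Order first, then subtract the demand observed during the period.
        x = x + order_quantity(x, s, S) - d
        levels.append(x)
    return levels
```

With `s = 3` and `S = 10`, an inventory level of 2 triggers an order of 8, while a level of 5 triggers no order.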

preprint, 2022, arXiv

Epi-Convergence of Expectation Functions under Varying Measures and Integrands

For expectation functions on metric spaces, we provide sufficient conditions for epi-convergence under varying probability measures and integrands, and examine applications in the area of sieve estimators, mollifier smoothing, PDE-constrained optimization, and stochastic optimization with expectation constraints. As a stepping stone to epi-convergence of independent interest, we develop parametric Fatou's lemmas under mild integrability assumptions. In the setting of Suslin metric spaces, the assumptions are expressed in terms of Pasch-Hausdorff envelopes. For general metric spaces, the assumptions shift to semicontinuity of integrands also on the sample space, which then is assumed to be a metric space.

preprint, 2022, arXiv

Markov Decision Processes with Incomplete Information and Semi-Uniform Feller Transition Probabilities

This paper deals with control of partially observable discrete-time stochastic systems. It introduces and studies Markov Decision Processes with Incomplete Information and with semi-uniform Feller transition probabilities. The important feature of these models is that their classic reduction to Completely Observable Markov Decision Processes with belief states preserves semi-uniform Feller continuity of transition probabilities. Under mild assumptions on cost functions, optimal policies exist, optimality equations hold, and value iterations converge to optimal values for these models. In particular, for Partially Observable Markov Decision Processes the results of this paper imply new and generalize several known sufficient conditions on transition and observation probabilities for weak continuity of transition probabilities for Markov Decision Processes with belief states, the existence of optimal policies, validity of optimality equations defining optimal policies, and convergence of value iterations to optimal values.

preprint, 2020, arXiv

Strong Polynomiality of the Value Iteration Algorithm for Computing Nearly Optimal Policies for Discounted Dynamic Programming

This note provides upper bounds on the number of operations required to compute by value iterations a nearly optimal policy for an infinite-horizon discounted Markov decision process with a finite number of states and actions. For a given discount factor, magnitude of the reward function, and desired closeness to optimality, these upper bounds are strongly polynomial in the number of state-action pairs, and one of the provided upper bounds has the property that it is a non-decreasing function of the value of the discount factor.
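
For readers unfamiliar with the algorithm, a minimal sketch of textbook value iteration for a finite discounted MDP follows. The array layout (`P[a, s, t]`, `r[s, a]`), the function names, and the stopping rule (the standard sup-norm test under which the greedy policy is eps-optimal) are assumptions for illustration. They are not taken from the note, whose contribution is the bounds on the number of operations.

```python
import numpy as np


def bellman_q(P, r, beta, v):
    """q[s, a] = r[s, a] + beta * sum_t P[a, s, t] * v[t]."""
    return r + beta * np.einsum("ast,t->sa", P, v)


def value_iteration(P, r, beta, eps):
    """Return (v, policy, iterations) for a finite MDP with 0 <= beta < 1.

    P has shape (A, S, S), r has shape (S, A). Iteration stops once
    2 * beta * ||v_new - v||_inf <= eps * (1 - beta), which is the usual
    sufficient condition for the greedy policy to be eps-optimal.
    """
    v = np.zeros(r.shape[0])
    iterations = 0
    while True:
        v_new = bellman_q(P, r, beta, v).max(axis=1)
        iterations += 1
        done = 2.0 * beta * np.max(np.abs(v_new - v)) <= eps * (1.0 - beta)
        v = v_new
        if done:
            break
    # Greedy policy with respect to the final value estimate.
    policy = bellman_q(P, r, beta, v).argmax(axis=1)
    return v, policy, iterations


# Hypothetical two-state, two-action example.
P = np.array([[[1.0, 0.0], [0.0, 1.0]],   # action 0: stay in place
              [[0.0, 1.0], [0.0, 1.0]]])  # action 1: move to state 1
r = np.array([[0.0, 1.0],
              [2.0, 0.0]])
v, policy, iterations = value_iteration(P, r, beta=0.9, eps=1e-3)
```

In this example the optimal values are 19 and 20, and the returned policy picks action 1 in state 0 and action 0 in state 1.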

preprint, 2020, arXiv

Sufficiency of Markov Policies for Continuous-Time Jump Markov Decision Processes

This paper extends to Continuous-Time Jump Markov Decision Processes (CTJMDP) the classic result for Markov Decision Processes stating that, for a given initial state distribution, for every policy there is a (randomized) Markov policy, which can be defined in a natural way, such that at each time instance the marginal distributions of state-action pairs for these two policies coincide. It is shown in this paper that this equality takes place for a CTJMDP if the corresponding Markov policy defines a nonexplosive jump Markov process. If this Markov process is explosive, then at each time instance the marginal probability that a state-action pair belongs to a measurable set of state-action pairs is not greater for the described Markov policy than the same probability for the original policy. These results are used in this paper to prove that for expected discounted total costs and for average costs per unit time, for a given initial state distribution, for each policy for a CTJMDP the described Markov policy has the same or better performance.

preprint, 2016, arXiv

Kolmogorov's Equations for Jump Markov Processes with Unbounded Jump Rates

As is well known, transition probabilities of jump Markov processes satisfy Kolmogorov's backward and forward equations. In his seminal 1940 paper, William Feller investigated solutions of Kolmogorov's equations for jump Markov processes. Recently the authors solved the problem studied by Feller and showed that the minimal solution of Kolmogorov's backward and forward equations is the transition probability of the corresponding jump Markov process if the transition rate at each state is bounded. This paper presents more general results. For Kolmogorov's backward equation, the sufficient condition for the described property of the minimal solution is that the transition rate at each state is locally integrable, and for Kolmogorov's forward equation the corresponding sufficient condition is that the transition rate at each state is locally bounded.

preprint, 2016, arXiv

On the Optimality Equation for Average Cost Markov Decision Processes and its Validity for Inventory Control

As is well known, average-cost optimality inequalities imply the existence of stationary optimal policies for Markov Decision Processes with average costs per unit time, and these inequalities hold under broad natural conditions. This paper provides sufficient conditions for the validity of the average-cost optimality equation for an infinite state problem with weakly continuous transition probabilities and with possibly unbounded one-step costs and noncompact action sets. These conditions also imply the convergence of sequences of discounted relative value functions to average-cost relative value functions and the continuity of average-cost relative value functions. As shown in the paper, the classic periodic-review inventory control problem satisfies these conditions. Therefore, the optimality inequality holds in the form of an equality with a continuous average-cost relative value function for this problem. In addition, the $K$-convexity of discounted relative value functions and their convergence to average-cost relative value functions, when the discount factor increases to 1, imply the $K$-convexity of average-cost relative value functions. This implies that average-cost optimal $(s,S)$ policies for the inventory control problem can be derived from the average-cost optimality equation.
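
For orientation, the optimality inequality and equation mentioned in this abstract are usually written as below. The symbols are assumptions chosen for the sketch, not the paper's notation: $X$ is the state space, $A(x)$ the set of actions available at $x$, $c$ the one-step cost, $q$ the transition probability, $w$ the optimal average cost, and $u$ a relative value function.

```latex
% Average-cost optimality inequality (ACOI)
w + u(x) \;\ge\; \min_{a \in A(x)} \Big[ c(x,a) + \int_X u(y)\, q(dy \mid x,a) \Big],
\qquad x \in X.

% Average-cost optimality equation (ACOE): the same relation with equality
w + u(x) \;=\; \min_{a \in A(x)} \Big[ c(x,a) + \int_X u(y)\, q(dy \mid x,a) \Big],
\qquad x \in X.
```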

preprint, 2016, arXiv

Optimality Conditions for Inventory Control

This tutorial describes recently developed general optimality conditions for Markov Decision Processes that have significant applications to inventory control. In particular, these conditions imply the validity of optimality equations and inequalities. They also imply the convergence of value iteration algorithms. For total discounted-cost problems only two mild conditions on the continuity of transition probabilities and lower semi-continuity of one-step costs are needed. For average-cost problems, a single additional assumption on the finiteness of relative values is required. The general results are applied to periodic-review inventory control problems with discounted and average-cost criteria without any assumptions on demand distributions. The case of partially observable states is also discussed.

preprint, 2015, arXiv

Uniform Fatou's Lemma

Fatou's lemma is a classic fact in real analysis that states that the limit inferior of integrals of functions is greater than or equal to the integral of the inferior limit. This paper introduces a stronger inequality that holds uniformly for integrals on measurable subsets of a measurable space. The necessary and sufficient condition, under which this inequality holds for a sequence of finite measures converging in total variation, is provided. This statement is called the uniform Fatou's lemma, and it holds under the minor assumption that all the integrals are well-defined. The uniform Fatou's lemma improves the classic Fatou's lemma in the following directions: the uniform Fatou's lemma states a more precise inequality, it provides the necessary and sufficient condition, and it deals with variable measures. Various corollaries of the uniform Fatou's lemma are formulated. The examples in this paper demonstrate that: (a) the uniform Fatou's lemma may indeed provide a more accurate inequality than the classic Fatou's lemma; (b) the uniform Fatou's lemma does not hold if convergence of measures in total variation is relaxed to setwise convergence.
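
For reference, the classic inequality that this abstract starts from can be written as follows, for a sequence of nonnegative measurable functions $f_n$ on a measure space $(S, \Sigma, \mu)$. The symbols are chosen for this sketch. The uniform version introduced in the paper is not reproduced here.

```latex
\int_S \liminf_{n \to \infty} f_n(s)\, \mu(ds)
\;\le\;
\liminf_{n \to \infty} \int_S f_n(s)\, \mu(ds).
```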

preprint, 2014, arXiv

Continuity of Minima: Local Results

This paper compares and generalizes Berge's maximum theorem for noncompact image sets established in Feinberg, Kasyanov and Voorneveld (2014) and the local maximum theorem established in Bonnans and Shapiro (2000).

preprint, 2014, arXiv

Convergence of Probability Measures and Markov Decision Models with Incomplete Information

This paper deals with three major types of convergence of probability measures on metric spaces: weak convergence, setwise convergence, and convergence in the total variation. First, it describes and compares necessary and sufficient conditions for these types of convergence, some of which are well-known, in terms of convergence of probabilities of open and closed sets and, for the probabilities on the real line, in terms of convergence of distribution functions. Second, it provides convenient criteria for weak and setwise convergence of probability measures and continuity of stochastic kernels in terms of convergence of probabilities defined on the base of the topology generated by the metric. Third, it provides applications to control of Partially Observable Markov Decision Processes and, in particular, to Markov Decision Models with incomplete information.
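
The three modes of convergence compared in this abstract have the following standard definitions, stated here as a reminder for probability measures $\mu_n$ and $\mu$ on a metric space $S$ with Borel $\sigma$-field $\mathcal{B}(S)$. The notation is chosen for this sketch.

```latex
% Weak convergence
\int_S f\, d\mu_n \to \int_S f\, d\mu
\quad \text{for every bounded continuous } f : S \to \mathbb{R}.

% Setwise convergence
\mu_n(B) \to \mu(B) \quad \text{for every } B \in \mathcal{B}(S).

% Convergence in total variation
\sup_{B \in \mathcal{B}(S)} \lvert \mu_n(B) - \mu(B) \rvert \to 0.
```

Convergence in total variation implies setwise convergence, which in turn implies weak convergence.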

preprint, 2014, arXiv

Examples Concerning Abelian and Cesaro Limits

This note provides examples of all possible equality and strict inequality relations between upper and lower Abelian and Cesaro limits of sequences bounded above or below.
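
As background, the lower and upper Cesaro and Abelian limits of a sequence $(u_i)_{i \ge 0}$ can be written as below, together with the classical chain of inequalities between them for sequences bounded above or below. The notation is an assumption made for this sketch and need not match the note.

```latex
\underline{C} = \liminf_{n \to \infty} \frac{1}{n} \sum_{i=0}^{n-1} u_i,
\qquad
\overline{C} = \limsup_{n \to \infty} \frac{1}{n} \sum_{i=0}^{n-1} u_i,

\underline{A} = \liminf_{\alpha \uparrow 1} \, (1-\alpha) \sum_{i=0}^{\infty} \alpha^i u_i,
\qquad
\overline{A} = \limsup_{\alpha \uparrow 1} \, (1-\alpha) \sum_{i=0}^{\infty} \alpha^i u_i,

\underline{C} \;\le\; \underline{A} \;\le\; \overline{A} \;\le\; \overline{C}.
```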

preprint, 2014, arXiv

Partially Observable Total-Cost Markov Decision Processes with Weakly Continuous Transition Probabilities

This paper describes sufficient conditions for the existence of optimal policies for Partially Observable Markov Decision Processes (POMDPs) with Borel state, observation, and action sets and with the expected total costs. Action sets may not be compact and one-step cost functions may be unbounded. The introduced conditions are also sufficient for the validity of optimality equations, semi-continuity of value functions, and convergence of value iterations to optimal values. Since POMDPs can be reduced to Completely Observable Markov Decision Processes (COMDPs), whose states are posterior state distributions, this paper focuses on the validity of the above mentioned optimality properties for COMDPs. The central question is whether transition probabilities for a COMDP are weakly continuous. We introduce sufficient conditions for this and show that the transition probabilities for a COMDP are weakly continuous, if transition probabilities of the underlying Markov Decision Process are weakly continuous and observation probabilities for the POMDP are continuous in the total variation. Moreover, the continuity in the total variation of the observation probabilities cannot be weakened to setwise continuity. The results are illustrated with counterexamples and examples.
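
The reduction to a COMDP mentioned in this abstract uses posterior state distributions (beliefs) as states. For a finite POMDP, one belief transition is a Bayes update, sketched below. The finite setting, the function name, and the matrix layout are assumptions for illustration; the paper works with Borel state, observation, and action sets.

```python
import numpy as np


def belief_update(b, P_a, Q_a, y):
    """Posterior over the next state given prior belief b and observation y.

    b   : shape (S,), prior distribution over current states
    P_a : shape (S, S), P_a[s, t] = Prob(next state t | state s, action a)
    Q_a : shape (S, Y), Q_a[t, y] = Prob(observation y | next state t, action a)
    """
    predicted = b @ P_a                    # distribution of the next state
    unnormalized = predicted * Q_a[:, y]   # weight by the observation likelihood
    total = unnormalized.sum()
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return unnormalized / total


# Hypothetical two-state example: the state does not move, observations are noisy.
b0 = np.array([0.5, 0.5])
P_a = np.eye(2)
Q_a = np.array([[0.9, 0.1],
                [0.2, 0.8]])
b1 = belief_update(b0, P_a, Q_a, y=0)
```

Here observing `y = 0` shifts the belief toward state 0, giving the posterior `[9/11, 2/11]`.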

preprint, 2013, arXiv

Berge's Maximum Theorem for Noncompact Image Sets

This note generalizes Berge's maximum theorem to noncompact image sets. It also clarifies the results from E.A. Feinberg, P.O. Kasyanov, N.V. Zadoianchuk, "Berge's theorem for noncompact image sets," J. Math. Anal. Appl. 397(1)(2013), pp. 255-259 on the extension to noncompact image sets of another of Berge's theorems, which states semi-continuity of value functions. Here we explain that the notion of a $\mathbb{K}$-inf-compact function introduced there is applicable to metrizable topological spaces and to more general compactly generated topological spaces. For Hausdorff topological spaces we introduce the notion of a $\mathbb{KN}$-inf-compact function ($\mathbb{N}$ stands for "nets" in $\mathbb{K}$-inf-compactness), which coincides with $\mathbb{K}$-inf-compactness for compactly generated and, in particular, for metrizable topological spaces.

preprint, 2013, arXiv

Fatou's Lemma for Weakly Converging Probabilities

Fatou's lemma states under appropriate conditions that the integral of the lower limit of a sequence of functions is not greater than the lower limit of the integrals. This note describes similar inequalities when, instead of a single measure, the functions are integrated with respect to different measures that form a weakly convergent sequence.

preprint, 2013, arXiv

On solutions of Kolmogorov's equations for jump Markov processes

This paper studies three ways to construct a nonhomogeneous jump Markov process: (i) via a compensator of the random measure of a multivariate point process, (ii) as a minimal solution of the backward Kolmogorov equation, and (iii) as a minimal solution of the forward Kolmogorov equation. The main conclusion of this paper is that, for a given measurable transition intensity, commonly called a Q-function, all these constructions define the same transition function. If this transition function is regular, that is, the probability of accumulation of jumps is zero, then this transition function is the unique solution of the backward and forward Kolmogorov equations. For continuous Q-functions, Kolmogorov equations were studied in Feller's seminal paper. In particular, this paper extends Feller's results for continuous Q-functions to measurable Q-functions and provides additional results.

preprint, 2013, arXiv

The Value Iteration Algorithm is Not Strongly Polynomial for Discounted Dynamic Programming

This note provides a simple example demonstrating that, if exact computations are allowed, the number of iterations required for the value iteration algorithm to find an optimal policy for discounted dynamic programming problems may grow arbitrarily quickly with the size of the problem. In particular, the number of iterations can be exponential in the number of actions. Thus, unlike policy iterations, the value iteration algorithm is not strongly polynomial for discounted dynamic programming.

preprint, 2012, arXiv

Average-Cost Markov Decision Processes with Weakly Continuous Transition Probabilities

This paper presents sufficient conditions for the existence of stationary optimal policies for average-cost Markov Decision Processes with Borel state and action sets and with weakly continuous transition probabilities. The one-step cost functions may be unbounded, and action sets may be noncompact. The main contributions of this paper are: (i) general sufficient conditions for the existence of stationary discount-optimal and average-cost optimal policies and descriptions of properties of value functions and sets of optimal actions, (ii) a sufficient condition for the average-cost optimality of a stationary policy in the form of optimality inequalities, and (iii) approximations of average-cost optimal actions by discount-optimal actions.

preprint, 2012, arXiv

Berge's Theorem for Noncompact Image Sets

For an upper semi-continuous set-valued mapping from one topological space to another and for a lower semi-continuous function defined on the product of these spaces, Berge's theorem states lower semi-continuity of the minimum of this function taken over the image sets. It assumes that the image sets are compact. For Hausdorff topological spaces, this paper extends Berge's theorem to set-valued mappings with possible noncompact image sets and studies relevant properties of minima.

preprint, 2012, arXiv

The Multi-Armed Bandit, with Constraints

The early sections of this paper present an analysis of a Markov decision model that is known as the multi-armed bandit under the assumption that the utility function of the decision maker is either linear or exponential. The analysis includes efficient procedures for computing the expected utility associated with the use of a priority policy and for identifying a priority policy that is optimal. The methodology in these sections is novel, building on the use of elementary row operations. In the later sections of this paper, the analysis is adapted to accommodate constraints that link the bandits.

preprint, 2007, arXiv

Buffer Insertion for Bridges and Optimal Buffer Sizing for Communication Sub-System of Systems-on-Chip

We have presented an optimal buffer sizing and buffer insertion methodology which uses stochastic models of the architecture and Continuous Time Markov Decision Processes (CTMDPs). Such a methodology is useful in managing the scarce buffer resources available on chip, as compared to network-based data communication, which can have large buffer space. Modeling this problem in a CTMDP framework leads to a nonlinear formulation due to the use of bridges in the bus architecture. We present a methodology to split the problem into several smaller, linear systems, and we then solve these subsystems.
