Decentralized Control of Two Agents with Nested Accessible Information
In this paper, we investigate a decentralized stochastic control problem with two agents, where a part of the memory of the second agent is also available to the first agent at each instant of time. We derive a structural form for optimal control strategies that allows us to restrict their domain to a set that does not grow in size with time. We also present a dynamic programming (DP) decomposition that uses these results to derive optimal strategies for arbitrarily long time horizons. Because solving this DP decomposition is computationally intensive, we present two potential resolutions: simplified strategies obtained by imposing additional conditions on our model, and an approximation technique that can be used to implement our results with a bounded loss of optimality.