Researcher profile

Srinivasa M Salapaka

Srinivasa M Salapaka contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 13 - UnverifiedVerification L1Unclaimed author
2works
0followers
3topics
3close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

2 published item(s)

preprint2022arXiv

Parameterized MDPs and Reinforcement Learning Problems -- A Maximum Entropy Principle Based Framework

We present a framework to address a class of sequential decision making problems. Our framework features learning the optimal control policy with robustness to noisy data, determining the unknown state and action parameters, and performing sensitivity analysis with respect to problem parameters. We consider two broad categories of sequential decision making problems modelled as infinite horizon Markov Decision Processes (MDPs) with (and without) an absorbing state. The central idea underlying our framework is to quantify exploration in terms of the Shannon Entropy of the trajectories under the MDP and determine the stochastic policy that maximizes it while guaranteeing a low value of the expected cost along a trajectory. This resulting policy enhances the quality of exploration early on in the learning process, and consequently allows faster convergence rates and robust solutions even in the presence of noisy data as demonstrated in our comparisons to popular algorithms such as Q-learning, Double Q-learning and entropy regularized Soft Q-learning. The framework extends to the class of parameterized MDP and RL problems, where states and actions are parameter dependent, and the objective is to determine the optimal parameters along with the corresponding optimal policy. Here, the associated cost function can possibly be non-convex with multiple poor local minima. Simulation results applied to a 5G small cell network problem demonstrate successful determination of communication routes and the small cell locations. We also obtain sensitivity measures to problem parameters and robustness to noisy environment data.

preprint2020arXiv

Inequality Constraints in Facility Location and Other Similar Optimization Problems: An Entropy Based Approach

In this paper we propose an annealing based framework to incorporate inequality constraints in optimization problems such as facility location, simultaneous facility location with path optimization, and the last mile delivery problem. These inequality constraints are used to model several application specific size and capacity limitations on the corresponding facilities, transportation paths and the service vehicles. We design our algorithms in such a way that it allows to (possibly) violate the constraints during the initial stages of the algorithm, so as to facilitate a thorough exploration of the solution space; as the algorithm proceeds, this violation (controlled through the annealing parameter) is gradually lowered till the solution converges in the feasible region of the optimization problem. We present simulations on various datasets that demonstrate the efficacy of our algorithm.