Researcher profile

Arpan Chattopadhyay

Arpan Chattopadhyay contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging · Verification L1 · Unclaimed author
10 works
0 followers
12 topics
4 close collaborators

Published work

10 published items

preprint · 2023 · arXiv

Inverse Reinforcement Learning With Constraint Recovery

In this work, we propose a novel inverse reinforcement learning (IRL) algorithm for constrained Markov decision process (CMDP) problems. In standard IRL problems, the inverse learner or agent seeks to recover the reward function of the MDP, given a set of trajectory demonstrations for the optimal policy. Here, we seek to infer not only the reward functions of the CMDP, but also the constraints. Using the principle of maximum entropy, we show that the IRL with constraint recovery (IRL-CR) problem can be cast as a constrained non-convex optimization problem. We reduce it to an alternating constrained optimization problem whose sub-problems are convex, and solve it using an exponentiated gradient descent algorithm. Finally, we demonstrate the efficacy of our algorithm on a grid world environment.
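
As a rough illustration of the exponentiated gradient step mentioned in the abstract, the sketch below runs a generic exponentiated gradient descent over the probability simplex. It is not the IRL-CR algorithm itself: the function name `exponentiated_gradient`, the step size, and the toy quadratic objective are all assumptions made for illustration.

```python
import numpy as np


def exponentiated_gradient(grad, x0, step=0.5, iters=500):
    """Minimize a convex function over the probability simplex.

    grad: callable returning the gradient at x (hypothetical interface).
    x0:   strictly positive starting point; it is normalized to sum to 1.
    """
    x = np.asarray(x0, dtype=float)
    x = x / x.sum()
    for _ in range(iters):
        g = grad(x)
        # Multiplicative update; subtracting g.min() only rescales the
        # weights, which the normalization below cancels out.
        w = x * np.exp(-step * (g - g.min()))
        x = w / w.sum()
    return x


# Toy objective (an assumption): f(x) = 0.5 * ||x - target||^2,
# whose minimizer on the simplex is `target` itself.
target = np.array([0.2, 0.3, 0.5])
solution = exponentiated_gradient(lambda x: x - target, np.ones(3))
```

The multiplicative form keeps every iterate strictly positive and normalized, so no explicit projection onto the simplex is needed.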

preprint · 2022 · arXiv

Online Reinforcement Learning for Periodic MDP

We study learning in periodic Markov decision processes (MDPs), a special type of non-stationary MDP where both the state transition probabilities and reward functions vary periodically, under the average reward maximization setting. We formulate the problem as a stationary MDP by augmenting the state space with the period index, and propose a periodic upper confidence bound reinforcement learning-2 (PUCRL2) algorithm. We show that the regret of PUCRL2 grows linearly with the period and sub-linearly with the horizon length. Numerical results demonstrate the efficacy of PUCRL2.
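
The state augmentation described in the abstract can be sketched as follows. This is a minimal sketch of the augmentation only, not of PUCRL2; the function name `augment_periodic_mdp`, the array layouts, and the indexing convention (augmented state `n * S + s` for phase `n` and state `s`) are assumptions.

```python
import numpy as np


def augment_periodic_mdp(P, R):
    """Turn a periodic MDP into a stationary one on pairs (state, phase).

    P: array of shape (N, S, A, S); P[n, s, a, s2] is the probability of
       moving from s to s2 under action a at phase n of the period.
    R: array of shape (N, S, A) with the per-phase rewards.
    """
    N, S, A, _ = P.shape
    P_aug = np.zeros((N * S, A, N * S))
    R_aug = np.zeros((N * S, A))
    for n in range(N):
        nxt = (n + 1) % N  # the phase advances deterministically
        P_aug[n * S:(n + 1) * S, :, nxt * S:(nxt + 1) * S] = P[n]
        R_aug[n * S:(n + 1) * S, :] = R[n]
    return P_aug, R_aug


# Small random example: period 3, 2 states, 2 actions.
rng = np.random.default_rng(0)
P = rng.random((3, 2, 2, 2))
P /= P.sum(axis=-1, keepdims=True)
R = rng.random((3, 2, 2))
P_aug, R_aug = augment_periodic_mdp(P, R)
```

The augmented state space has N * S states, which is consistent with a regret bound that grows with the period.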

preprint · 2022 · arXiv

OptM3Sec: Optimizing Multicast IRS-Aided Multiantenna DFRC Secrecy Channel with Multiple Eavesdroppers

With the use of common signaling methods for dual-function radar-communications (DFRC) systems, the susceptibility of messages aimed at legitimate users to eavesdropping has worsened. For DFRC systems, the radar target may act as an eavesdropper (ED) that receives a high-energy signal, thereby leading to additional challenges. Unlike prior works, we consider a multicast multi-antenna DFRC system with multiple EDs. We then propose a physical layer design approach to maximize the secrecy rate by installing intelligent reflecting surfaces in the radar channels. Our optimization of multiple ED multicast multi-antenna DFRC secrecy rate (OptM3Sec) approach solves this highly nonconvex problem with respect to the precoding matrices. Our numerical experiments demonstrate the feasibility of our algorithm in maximizing the secrecy rate in this DFRC setup.

preprint · 2022 · arXiv

Quickest Bayesian and non-Bayesian detection of false data injection attack in remote state estimation

In this paper, quickest detection of false data injection attack on remote state estimation is considered. A set of $N$ sensors make noisy linear observations of a discrete-time linear process with Gaussian noise, and report the observations to a remote estimator. The challenge is the presence of a few potentially malicious sensors which can start strategically manipulating their observations at a random time in order to skew the estimates. The quickest attack detection problem for a known linear attack scheme in the Bayesian setting with a geometric prior on the attack initiation instant is posed as a constrained Markov decision process (MDP), in order to minimize the expected detection delay subject to a false alarm constraint, with the state involving the probability belief at the estimator that the system is under attack. State transition probabilities are derived in terms of system parameters, and the structure of the optimal policy is derived analytically. It turns out that the optimal policy amounts to checking whether the probability belief exceeds a threshold. Next, a generalized CUSUM based attack detection algorithm is proposed for the non-Bayesian setting where the attacker chooses the attack initiation instant in a particularly adversarial manner. It turns out that computing the statistic for the generalized CUSUM test in this setting relies on the same techniques developed to compute the state transition probabilities of the MDP. Numerical results demonstrate significant performance gain under the proposed algorithms against competing algorithms.
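
For readers unfamiliar with CUSUM-style tests, the sketch below shows the classical CUSUM recursion for a mean shift in independent Gaussian observations. It is a textbook simplification, not the generalized CUSUM statistic of the paper; the function name `cusum_alarm_time`, the scalar observation model, and all parameter values are assumptions.

```python
def cusum_alarm_time(observations, mu0, mu1, sigma, threshold):
    """Return the first index at which the CUSUM statistic reaches the
    threshold, or None if no alarm is raised.

    Pre-change samples are modeled as N(mu0, sigma^2) and post-change
    samples as N(mu1, sigma^2) (an assumed toy model).
    """
    stat = 0.0
    for t, x in enumerate(observations):
        # Log-likelihood ratio of post-change versus pre-change density.
        llr = (mu1 - mu0) / sigma**2 * (x - (mu0 + mu1) / 2.0)
        # Reset at zero so that pre-change samples cannot build up credit.
        stat = max(0.0, stat + llr)
        if stat >= threshold:
            return t
    return None


# Deterministic toy data: the mean jumps from 0 to 2 at index 20.
data = [0.0] * 20 + [2.0] * 20
alarm = cusum_alarm_time(data, mu0=0.0, mu1=1.0, sigma=1.0, threshold=5.0)
```

Raising the threshold lowers the false alarm rate at the cost of a longer detection delay, which is the trade-off the abstract formalizes.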

preprint · 2021 · arXiv

Design of false data injection attack on distributed process estimation

Herein, design of false data injection attack on a distributed cyber-physical system is considered. A stochastic process with linear dynamics and Gaussian noise is measured by multiple agent nodes, each equipped with multiple sensors. The agent nodes form a multi-hop network among themselves. Each agent node computes an estimate of the process by using its sensor observation and messages obtained from neighboring nodes, via Kalman-consensus filtering. An external attacker, capable of arbitrarily manipulating the sensor observations of some or all agent nodes, injects errors into those sensor observations. The goal of the attacker is to steer the estimates at the agent nodes as close as possible to a pre-specified value, while respecting a constraint on the attack detection probability. To this end, a constrained optimization problem is formulated to find the optimal parameter values of a certain class of linear attacks. The parameters of the linear attack are learnt online via a combination of a stochastic approximation based update of a Lagrange multiplier, and an optimization technique involving either the Karush-Kuhn-Tucker (KKT) conditions or online stochastic gradient descent. The problem turns out to be convex for some special cases. The desired convergence of the proposed algorithms is proved by exploiting the convexity and properties of stochastic approximation algorithms. Finally, numerical results demonstrate the efficacy of the attack.

preprint · 2020 · arXiv

Distributed Relay Selection in Presence of Dynamic Obstacles in Millimeter Wave D2D Communication

Millimeter wave (mmWave) device-to-device (D2D) communication is highly susceptible to obstacles due to severe penetration losses and requires an almost line-of-sight (LOS) communication path. The D2D channel condition is local to devices/user equipments (UEs) and hence is not directly visible to the base station (BS). Thus the quality of the D2D channel needs to be propagated to the BS by UEs, which may incur some delay. Hence the solution provided by the BS to UEs using this gathered channel information might become less useful for establishing communication due to moving obstacles. These types of obstacles might not be known in advance and hence may cause unpredictable fluctuations in the D2D channel quality. We therefore seek to learn the D2D channels using the finite horizon partially observable Markov decision process (POMDP) framework to model the uncertainty in such network environments with dynamic obstacles. The objective is to minimize delay when channel quality deteriorates, by making UEs choose locally the best possible decision between (i) continuing on the current relay link on which communication is taking place or (ii) switching to another good relay by exploring other possible UEs in its locality. We derive an optimal threshold policy which tells the UE to take the appropriate decision locally. We then give a simplified and easy-to-implement stationary threshold policy which counts the number of successive acknowledgement failures, based on which the UE makes the appropriate decision locally. Through extensive simulation, we demonstrate that our approach outperforms recent algorithms.
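
The simplified stationary policy mentioned at the end of the abstract, which counts successive acknowledgement failures, might look like the sketch below. The function name `relay_decisions`, the decision labels, the reset of the counter after a switch, and the threshold value are assumptions; the abstract does not specify them.

```python
def relay_decisions(acks, threshold):
    """Decide per transmission whether to keep the current relay.

    acks:      iterable of booleans, True if an acknowledgement arrived.
    threshold: number of successive failures that triggers a switch
               (a hypothetical tuning parameter).
    """
    failures = 0
    decisions = []
    for ack in acks:
        # Any success resets the run of consecutive failures.
        failures = 0 if ack else failures + 1
        if failures >= threshold:
            decisions.append("switch")
            failures = 0  # assumed: start counting afresh on the new relay
        else:
            decisions.append("continue")
    return decisions


acks = [True, False, False, True, False, False, False, True]
decisions = relay_decisions(acks, threshold=3)
```

Because the rule depends only on local acknowledgement feedback, a UE can apply it without waiting for channel information from the BS.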

preprint · 2020 · arXiv

Efficient detection of adversarial images

In this paper, detection of deception attack on deep neural network (DNN) based image classification in autonomous and cyber-physical systems is considered. Several studies have shown the vulnerability of DNNs to malicious deception attacks. In such attacks, some or all pixel values of an image are modified by an external attacker, so that the change is almost invisible to the human eye but significant enough for a DNN-based classifier to misclassify it. This paper first proposes a novel pre-processing technique that facilitates the detection of such modified images under any DNN-based image classifier and any attacker model. The proposed pre-processing algorithm involves a certain combination of principal component analysis (PCA)-based decomposition of the image, and random perturbation based detection to reduce computational complexity. Next, an adaptive version of this algorithm is proposed where a random number of perturbations are chosen adaptively using a double-threshold policy, and the threshold values are learnt via stochastic approximation in order to minimize the expected number of perturbations subject to constraints on the false alarm and missed detection probabilities. Numerical experiments show that the proposed detection scheme outperforms a competing algorithm while achieving reasonably low computational complexity.

preprint · 2020 · arXiv

Optimal deception attack on networked vehicular cyber physical systems

Herein, design of false data injection attack on a distributed cyber-physical system is considered. A stochastic process with linear dynamics and Gaussian noise is measured by multiple agent nodes, each equipped with multiple sensors. The agent nodes form a multi-hop network among themselves. Each agent node computes an estimate of the process by using its sensor observation and messages obtained from neighboring nodes, via Kalman-consensus filtering. An external attacker, capable of arbitrarily manipulating the sensor observations of some or all agent nodes, injects errors into those sensor observations. The goal of the attacker is to steer the estimates at the agent nodes as close as possible to a pre-specified value, while respecting a constraint on the attack detection probability. To this end, a constrained optimization problem is formulated to find the optimal parameter values of a certain class of linear attacks. The parameters of the linear attack are learnt online via a combination of stochastic approximation and online stochastic gradient descent. Numerical results demonstrate the efficacy of the attack.

preprint · 2018 · arXiv

Location Aware Opportunistic Bandwidth Sharing between Static and Mobile Users with Stochastic Learning in Cellular Networks

We consider location-dependent opportunistic bandwidth sharing between static and mobile downlink users in a cellular network. Each cell has some fixed number of static users. Mobile users enter the cell, move inside the cell for some time and then leave the cell. In order to provide a higher data rate to mobile users, we propose to provide higher bandwidth to the mobile users at favourable times and locations, and higher bandwidth to the static users at other times. We formulate the problem as a long run average reward Markov decision process (MDP) where the per-step reward is a linear combination of instantaneous data volumes received by static and mobile users, and find the optimal policy. The transition structure of this MDP is not known in general. To alleviate this issue, we propose a learning algorithm based on single timescale stochastic approximation. Also, noting that the unconstrained MDP can be used to solve a constrained problem, we provide a learning algorithm based on multi-timescale stochastic approximation. The results are extended to address the issue of fair bandwidth sharing between the two classes of users. Numerical results demonstrate the performance improvement achieved by our scheme, and also the trade-off between performance gain and fairness.

preprint · 2016 · arXiv

Cell planning for mobility management in heterogeneous cellular networks

In small cell networks, high mobility of users results in frequent handoff and thus severely restricts the data rate for mobile users. To alleviate this problem, we propose to use a heterogeneous, two-tier network structure where static users are served by both macro and micro base stations, whereas the mobile (i.e., moving) users are served only by macro base stations having larger cells; the idea is to prevent frequent data outage for mobile users due to handoff. We use the classical two-tier Poisson network model with different transmit powers (cf. [1]), and assume an independent Poisson process of static users and a doubly stochastic Poisson process of mobile users moving at a constant speed along infinite straight lines generated by a Poisson line process. Using stochastic geometry, we calculate the average downlink data rate of the typical static and mobile (i.e., moving) users, the latter accounting for handoff outage periods. We also consider the average throughput of these two types of users, defined as their average data rates divided by the mean total number of users co-served by the same base station. We find that if the density of a homogeneous network and/or the speed of mobile users is high, it is advantageous to let the mobile users connect only to some optimal fraction of BSs to reduce the frequency of handoffs during which the connection is not assured. If a heterogeneous structure of the network is allowed, one can further jointly optimize the mean throughput of mobile and static users by appropriately tuning the powers of micro and macro base stations subject to some aggregate power constraint ensuring unchanged mean data rates of static users via the network equivalence property (see [2]).