Source author record

Manjesh K. Hanawal

Manjesh K. Hanawal appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher | Unclaimed source record

Machine Learning; Artificial Intelligence; eess.SP; Networking and Internet Architecture; Information Theory; math.IT; Cryptography and Security; Applications; Computational Geometry; Computer Science and Game Theory; math.OC

Catalog footprint

What is connected

16 works

11 topics

4 close collaborators

preprint, 2026, arXiv

Recurrent Transformer-Based Near- and Far-Field THz Wideband Channel Estimation for UM-MIMO

The integration of terahertz communications and ultra-massive multiple-input multiple-output (UM-MIMO) systems in 6G networks is motivated by their ability to enable unprecedented data rates, mitigate spectrum congestion, and enhance overall network performance. However, the enlarged antenna apertures and higher carrier frequencies in these systems increase the Rayleigh distance, causing users to span both the near-field and conventional far-field regions. Accurate spatial precoding thus requires exact channel estimation at the base station - a task made more challenging by the hybrid coexistence of near- and far-field effects and the limited number of digital chains available in hybrid beamforming architectures. In this paper, we propose a block recurrent transformer model to address this challenge. We demonstrate that a single transformer block equipped with state memory can be trained once and then iteratively applied for hybrid-field channel estimation. Furthermore, we train the model such that it generalizes to wireless channels with varying scatterer distances, different numbers of propagation paths, and wideband operation. Simulation results show that the proposed method achieves performance gains of approximately 5 dB and 7.5 dB in normalized mean squared error (NMSE) over state-of-the-art solutions in narrowband and wideband scenarios, respectively.
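The NMSE figures quoted above can be made concrete with a small sketch. It is not taken from the paper: the function name `nmse_db` and the toy channel vectors are assumptions for illustration, and the definition used is the common one (squared estimation error divided by the energy of the true channel, expressed in dB).

```python
import math


def nmse_db(h_true, h_est):
    """Normalized mean squared error between two channel vectors, in dB.

    Hypothetical helper: both arguments are equal-length sequences of
    complex channel coefficients.
    """
    if len(h_true) != len(h_est):
        raise ValueError("vectors must have the same length")
    err = sum(abs(a - b) ** 2 for a, b in zip(h_true, h_est))
    ref = sum(abs(a) ** 2 for a in h_true)
    if ref == 0:
        raise ValueError("reference channel has zero energy")
    if err == 0:
        return float("-inf")  # perfect estimate
    return 10.0 * math.log10(err / ref)


# Toy example: a two-tap channel and a slightly perturbed estimate.
h = [1 + 0j, 0 + 1j]
h_hat = [1.1 + 0j, 0 + 1j]
print(round(nmse_db(h, h_hat), 2))
```

On this scale, a 5 dB gain corresponds to an error power that is smaller by a factor of about 3.16 (that is, 10^0.5).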

preprint, 2022, arXiv

Challenges in Adapting ECH in TLS for Privacy Enhancement over the Internet

Security and privacy are crucial in modern Internet services. Transport Layer Security (TLS) has largely addressed the issue of security. However, information about the type of service being accessed goes in plain text in the initial handshakes of vanilla TLS, thus potentially revealing the activity of users and compromising privacy. The "Encrypted ClientHello" (ECH) extension overcomes this issue by extending TLS 1.3 so that all of the information that can potentially reveal the service type is masked, thus addressing the privacy issues in TLS 1.3. However, we notice that Internet services tend to use different versions of TLS for application data (primary connection/channel) and supporting data (side channels), such as scheduling information, during the active session. Although many Internet services have migrated to TLS 1.3, we notice that this is true only for the primary connections, which do benefit from TLS 1.3, while the side channels continue to use a lower version of TLS (e.g., 1.2), which does not support ECH, and continue to leak the type of service accessed. We demonstrate that privacy information leaked from the side channels can be used to affect the performance on the primary channels, for example by blocking or throttling a specific service on the Internet. Our work demonstrates that adapting ECH on primary channels alone is not sufficient to prevent privacy leaks and attacks on primary channels. Further, we demonstrate that it is necessary for all of the associated side channels also to migrate to TLS 1.3 and adapt the ECH extension in order to offer complete privacy preservation.

preprint, 2022, arXiv

Exploiting Side Information for Improved Online Learning Algorithms in Wireless Networks

In wireless networks, the rate achieved depends on factors like level of interference, hardware impairments, and channel gain. Often, instantaneous values of some of these factors can be measured, and they provide useful information about the instantaneous rate achieved. For example, higher interference implies a lower rate. In this work, we treat any such measurable quantity that has a non-zero correlation with the rate achieved as side-information and study how it can be exploited to quickly learn the channel that offers higher throughput (reward). When the mean value of the side-information is known, using control variate theory we develop algorithms that require fewer samples to learn the parameters and can improve the learning rate compared to cases where side-information is ignored. Specifically, we incorporate side-information in the classical Upper Confidence Bound (UCB) algorithm and quantify the gain achieved in the regret performance. We show that the gain is proportional to the amount of correlation between the reward and the associated side-information. We discuss in detail various side-information that can be exploited in cognitive radio and air-to-ground communication in the $L$-band. We demonstrate that correlation between the reward and side-information is often strong in practice and exploiting it improves the throughput significantly.
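The role of correlation in this abstract can be illustrated with the textbook control-variate identity. This is a standard derivation, not one quoted from the paper; it assumes a single side-information variable $W$ with known mean $\omega$, a reward $X$, and correlation coefficient $\rho$ between them.

```latex
\begin{align*}
\tilde{X} &= X - \beta\,(W - \omega),
  & \mathbb{E}[\tilde{X}] &= \mathbb{E}[X], \\
\operatorname{Var}(\tilde{X})
  &= \operatorname{Var}(X) - 2\beta\,\operatorname{Cov}(X, W)
     + \beta^{2}\operatorname{Var}(W), \\
\beta^{*} &= \frac{\operatorname{Cov}(X, W)}{\operatorname{Var}(W)}
  & \Longrightarrow \quad
  \operatorname{Var}(\tilde{X}) &= (1 - \rho^{2})\,\operatorname{Var}(X).
\end{align*}
```

The corrected sample keeps the same mean but has its variance reduced by the factor $1 - \rho^{2}$, so stronger correlation means fewer samples are needed for the same confidence width.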

preprint, 2022, arXiv

Learning Optimal Phase-Shifts of Holographic Metasurface Transceivers

Holographic metasurface transceivers (HMTs) are an emerging technology for enhancing the coverage and rate of wireless communication systems. However, acquiring accurate channel state information in HMT-assisted wireless communication systems is critical for achieving these goals. In this paper, we propose an algorithm for learning the optimal phase-shifts at an HMT for the far-field channel model. Our proposed algorithm exploits the structure of the channel gains in the far-field regions and learns the optimal phase-shifts in the presence of noise in the received signals. We prove that the probability that the optimal phase-shifts estimated by our proposed algorithm deviate from the true values decays exponentially in the number of pilot signals. Extensive numerical simulations validate the theoretical guarantees and also demonstrate significant gains as compared to the state-of-the-art policies.

preprint, 2022, arXiv

Stochastic Multi-Armed Bandits with Control Variates

This paper studies a new variant of the stochastic multi-armed bandits problem where auxiliary information about the arm rewards is available in the form of control variates. In many applications like queuing and wireless networks, the arm rewards are functions of some exogenous variables. The mean values of these variables are known a priori from historical data and can be used as control variates. Leveraging the theory of control variates, we obtain mean estimates with smaller variance and tighter confidence bounds. We develop an upper confidence bound based algorithm named UCB-CV and characterize the regret bounds in terms of the correlation between rewards and control variates when they follow a multivariate normal distribution. We also extend UCB-CV to other distributions using resampling methods like Jackknifing and Splitting. Experiments on synthetic problem instances validate performance guarantees of the proposed algorithms.
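A minimal sketch of the control-variate mean estimate that this abstract builds on is given below. It is the generic single-control-variate estimator, not the UCB-CV algorithm itself; the names `cv_mean` and `compare_spread`, and the synthetic Gaussian reward model, are assumptions made for illustration.

```python
import random
import statistics


def cv_mean(rewards, controls, control_mean):
    """Estimate the reward mean using one control variate with known mean."""
    x_bar = statistics.fmean(rewards)
    w_bar = statistics.fmean(controls)
    s_ww = sum((w - w_bar) ** 2 for w in controls)
    if s_ww == 0.0:
        return x_bar  # the control never varied, so it carries no information
    s_xw = sum((x - x_bar) * (w - w_bar) for x, w in zip(rewards, controls))
    beta = s_xw / s_ww  # sample estimate of Cov(X, W) / Var(W)
    return x_bar - beta * (w_bar - control_mean)


def compare_spread(trials=300, n=30, seed=1):
    """Variance of the plain mean vs. the corrected mean over many trials."""
    rng = random.Random(seed)
    plain, corrected = [], []
    for _ in range(trials):
        # Assumed toy model: reward = 1 + 0.9 * control + small noise.
        controls = [rng.gauss(0.0, 1.0) for _ in range(n)]
        rewards = [1.0 + 0.9 * w + rng.gauss(0.0, 0.3) for w in controls]
        plain.append(statistics.fmean(rewards))
        corrected.append(cv_mean(rewards, controls, control_mean=0.0))
    return statistics.pvariance(plain), statistics.pvariance(corrected)


plain_var, cv_var = compare_spread()
print(f"plain: {plain_var:.5f}  with control variate: {cv_var:.5f}")
```

In this toy model the reward and the control are strongly correlated, so the corrected estimate fluctuates much less than the plain sample mean, which is what allows tighter confidence bounds in a UCB-style index.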

preprint, 2022, arXiv

UB3: Best Beam Identification in Millimeter Wave Systems via Pure Exploration Unimodal Bandits

Millimeter wave (mmWave) communications have a broad spectrum and can support data rates on the order of gigabits per second, as envisioned in 5G systems. However, they cannot be used for long distances due to their sensitivity to attenuation loss. Their use in 5G networks requires that the transmission energy be focused in sharp pencil beams. As any misalignment between the transmitter and receiver beam pair can reduce the data rate significantly, it is important that they are aligned as much as possible. To find the best transmit-receive beam pair, recent beam alignment (BA) techniques examine the entire beam space, which might result in a large amount of BA latency. Recent works propose to adaptively select the beams such that the cumulative reward measured in terms of received signal strength or throughput is maximized. In this paper, we develop an algorithm that exploits the unimodal structure of the received signal strengths of the beams to identify the best beam in a finite time using pure exploration strategies. Strategies that identify the best beam in a fixed time slot are more suitable for wireless network protocol design than cumulative reward maximization strategies that continuously perform exploration and exploitation. Our algorithm is named Unimodal Bandit for Best Beam (UB3) and identifies the best beam with a high probability in a few rounds. We prove that the error exponent in the probability does not depend on the number of beams and show that this is indeed the case by establishing a lower bound for the unimodal bandits. We demonstrate that UB3 outperforms the state-of-the-art algorithms through extensive simulations. Moreover, our algorithm is simple to implement and has lower computational complexity.

preprint, 2021, arXiv

Masking Host Identity on Internet: Encrypted TLS/SSL Handshake

Network middle-boxes often classify the traffic flows on the Internet to perform traffic management or to discriminate between traffic types. As the widespread adoption of the HTTPS protocol has made it difficult to classify traffic by inspecting the content field, one of the fields the middle-boxes look for is the Server Name Indication (SNI), which goes in plain text. The SNI field contains information about the host and can, in turn, reveal the type of traffic. This paper presents a method to mask the server host identity by encrypting the SNI. We develop a simple method that completes the SSL/TLS connection establishment over two handshakes: the first handshake establishes a secure channel without sharing SNI information, and the second handshake shares the encrypted SNI. Our method makes it mandatory for fronting servers to always accept the handshake request without the SNI and respond with a valid SSL certificate. As there is no modification to the already proven SSL/TLS encryption mechanism or to the processing of handshake messages, the new method enjoys all security benefits of existing secure channel establishment and needs no modification in existing routers/middle-boxes. Using a customized client and server over the live Internet, we demonstrate the feasibility of our method. Moreover, the impact analysis shows that the method adheres to almost all SSL/TLS related Internet standards requirements.
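To see why a plain-text SNI reveals the host, the following sketch builds and reads the `server_name` extension as laid out in RFC 6066. It only illustrates the wire format that a middle-box can inspect; it is not the two-handshake method described in the abstract, and the function names `build_sni_extension` and `read_sni` are hypothetical.

```python
import struct
from typing import Optional

SNI_EXTENSION_TYPE = 0x0000  # "server_name" extension type (RFC 6066)
HOST_NAME_TYPE = 0           # name_type value for host_name


def build_sni_extension(hostname: str) -> bytes:
    """Encode a server_name extension carrying a single host name."""
    name = hostname.encode("ascii")
    entry = struct.pack("!BH", HOST_NAME_TYPE, len(name)) + name
    name_list = struct.pack("!H", len(entry)) + entry
    return struct.pack("!HH", SNI_EXTENSION_TYPE, len(name_list)) + name_list


def read_sni(extension: bytes) -> Optional[str]:
    """Return the host name from a server_name extension, if present."""
    ext_type, _ext_len = struct.unpack_from("!HH", extension, 0)
    if ext_type != SNI_EXTENSION_TYPE:
        return None
    (list_len,) = struct.unpack_from("!H", extension, 4)
    pos, end = 6, 6 + list_len
    while pos + 3 <= end:
        name_type, name_len = struct.unpack_from("!BH", extension, pos)
        pos += 3
        if name_type == HOST_NAME_TYPE:
            return extension[pos:pos + name_len].decode("ascii")
        pos += name_len
    return None


ext = build_sni_extension("example.org")
print(read_sni(ext))
```

The host name appears as raw ASCII bytes inside the extension, so any on-path device that can parse an unencrypted ClientHello can read it without breaking the TLS encryption.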

preprint, 2020, arXiv

Distributed Learning in Ad-Hoc Networks: A Multi-player Multi-armed Bandit Framework

Next-generation networks are expected to be ultra-dense with a very high peak rate but relatively lower expected traffic per user. For such a scenario, existing central controller based resource allocation may incur substantial signaling (control communications), leading to a negative effect on the quality of service (e.g., dropped calls), energy and spectrum efficiency. To overcome this problem, cognitive ad-hoc networks (CAHN) that share spectrum with other networks are being envisioned. They allow some users to identify and communicate in 'free slots', thereby reducing signaling load and allowing a higher number of users per base station (dense networks). Such networks open up many interesting challenges such as resource identification, coordination, and dynamic and context-aware adaptation, for which Machine Learning and Artificial Intelligence frameworks offer novel solutions. In this paper, we discuss state-of-the-art multi-armed multi-player bandit based distributed learning algorithms that allow users to adapt to the environment and coordinate with other players/users. We also discuss various open research problems for feasible realization of CAHN and interesting applications in other domains such as energy harvesting, Internet of Things, and Smart grids.

preprint, 2020, arXiv

Learning and Fairness in Energy Harvesting: A Maximin Multi-Armed Bandits Approach

Recent advances in wireless radio frequency (RF) energy harvesting allow sensor nodes to increase their lifespan by remotely charging their batteries. The amount of energy harvested by the nodes varies depending on their ambient environment and proximity to the source. The lifespan of the sensor network depends on the minimum amount of energy a node can harvest in the network. It is thus important to learn the least amount of energy harvested by nodes so that the source can transmit on a frequency band that maximizes this amount. We model this learning problem as a novel stochastic Maximin Multi-Armed Bandits (Maximin MAB) problem and propose an Upper Confidence Bound (UCB) based algorithm named Maximin UCB. Maximin MAB is a generalization of standard MAB and enjoys the same performance guarantee as that of the UCB1 algorithm. Experimental results validate the performance guarantees of our algorithm.

preprint, 2020, arXiv

Learning to Optimize Energy Efficiency in Energy Harvesting Wireless Sensor Networks

We study wireless power transmission by an energy source to multiple energy harvesting nodes with the aim to maximize the energy efficiency. The source transmits energy to the nodes using one of the available power levels in each time slot and the nodes transmit information back to the energy source using the harvested energy. The source does not have any channel state information and it only knows whether a received codeword from a given node was successfully decoded or not. With this limited information, the source has to learn the optimal power level that maximizes the energy efficiency of the network. We model the problem as a stochastic Multi-Armed Bandits problem and develop an Upper Confidence Bound based algorithm, which learns the optimal transmit power of the energy source that maximizes the energy efficiency. Numerical results validate the performance guarantees of the proposed algorithm and show significant gains compared to the benchmark schemes.
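The learning loop described above can be sketched with the standard UCB1 index, treating each power level as an arm. This is a generic illustration rather than the paper's algorithm: the function name `ucb1`, the Bernoulli reward model (1 for a decoded codeword, 0 otherwise), and the success probabilities are all assumptions, and the paper's reward is energy efficiency rather than raw decoding success.

```python
import math
import random


def ucb1(pull, n_arms, horizon):
    """Run UCB1 for `horizon` rounds; `pull(arm)` returns a reward in [0, 1].

    Returns the number of times each arm was played.
    """
    counts = [0] * n_arms
    sums = [0.0] * n_arms

    def index(arm, t):
        # Empirical mean plus an exploration bonus that shrinks with plays.
        return sums[arm] / counts[arm] + math.sqrt(2.0 * math.log(t) / counts[arm])

    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1  # play every arm once to initialize the estimates
        else:
            arm = max(range(n_arms), key=lambda a: index(a, t))
        reward = pull(arm)
        counts[arm] += 1
        sums[arm] += reward
    return counts


# Hypothetical environment: three power levels with unknown success rates.
rng = random.Random(0)
success_prob = [0.2, 0.5, 0.8]
plays = ucb1(lambda a: 1.0 if rng.random() < success_prob[a] else 0.0, 3, 5000)
print(plays)
```

The source only needs the one-bit decoding feedback per round, which matches the limited-information setting in the abstract; the play counts concentrate on the arm with the highest mean reward as the horizon grows.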

preprint, 2020, arXiv

Stochastic Network Utility Maximization with Unknown Utilities: Multi-Armed Bandits Approach

In this paper, we study a novel Stochastic Network Utility Maximization (NUM) problem where the utilities of agents are unknown. The utility of each agent depends on the amount of resource it receives from a network operator/controller. The operator seeks a resource allocation that maximizes the expected total utility of the network. We consider threshold type utility functions where each agent gets non-zero utility if the amount of resource it receives is higher than a certain threshold. Otherwise, its utility is zero (hard real-time). We pose this NUM setup with unknown utilities as a regret minimization problem. Our goal is to identify a policy that performs as well as an oracle policy that knows the utilities of agents. We model this problem setting as a bandit setting where feedback obtained in each round depends on the resource allocated to the agents. We propose algorithms for this novel setting using ideas from Multiple-Play Multi-Armed Bandits and Combinatorial Semi-Bandits. We show that the proposed algorithm is optimal when all agents have the same utility. We validate the performance guarantees of our proposed algorithms through numerical experiments.

preprint, 2020, arXiv

Thompson Sampling for Unsupervised Sequential Selection

Thompson Sampling has generated significant interest due to its better empirical performance than upper confidence bound based algorithms. In this paper, we study a Thompson Sampling based algorithm for the Unsupervised Sequential Selection (USS) problem. The USS problem is a variant of the stochastic multi-armed bandits problem, where the loss of an arm cannot be inferred from the observed feedback. In the USS setup, arms are associated with fixed costs and are ordered, forming a cascade. In each round, the learner selects an arm and observes the feedback from arms up to the selected arm. The learner's goal is to find the arm that minimizes the expected total loss. The total loss is the sum of the cost incurred for selecting the arm and the stochastic loss associated with the selected arm. The problem is challenging because, without knowing the mean loss, one cannot compute the total loss for the selected arm. Clearly, learning is feasible only if the optimal arm can be inferred from the problem structure. As shown in prior work, learning is possible when the problem instance satisfies the so-called 'Weak Dominance' (WD) property. Under WD, we show that our Thompson Sampling based algorithm for the USS problem achieves near optimal regret and has better numerical performance than existing algorithms.
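For readers unfamiliar with the base technique, here is a minimal sketch of classical Beta-Bernoulli Thompson Sampling. It assumes the reward of the played arm is observed directly, which is exactly what the USS setting lacks, so it shows the sampling idea only and not the paper's algorithm. The function name `thompson_bernoulli` and the arm probabilities are hypothetical.

```python
import random


def thompson_bernoulli(pull, n_arms, horizon, rng):
    """Beta-Bernoulli Thompson Sampling; `pull(arm)` returns True or False.

    Returns the number of times each arm was played.
    """
    successes = [0] * n_arms
    failures = [0] * n_arms
    for _ in range(horizon):
        # Draw one sample per arm from its Beta posterior (uniform prior).
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        arm = max(range(n_arms), key=samples.__getitem__)
        if pull(arm):
            successes[arm] += 1
        else:
            failures[arm] += 1
    return [s + f for s, f in zip(successes, failures)]


# Hypothetical environment with three Bernoulli arms.
env_rng = random.Random(7)
probs = [0.3, 0.5, 0.7]
plays = thompson_bernoulli(
    lambda a: env_rng.random() < probs[a], 3, 3000, random.Random(11)
)
print(plays)
```

Exploration here comes from posterior randomness rather than an explicit confidence bonus: arms with few observations have wide posteriors and are still sampled occasionally.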

preprint, 2019, arXiv

Censored Semi-Bandits: A Framework for Resource Allocation with Censored Feedback

In this paper, we study censored Semi-Bandits, a novel variant of the semi-bandits problem. The learner is assumed to have a fixed amount of resources, which it allocates to the arms at each time step. The loss observed from an arm is random and depends on the amount of resources allocated to it. More specifically, the loss equals zero if the allocation for the arm exceeds a constant (but unknown) threshold that can be dependent on the arm. Our goal is to learn a feasible allocation that minimizes the expected loss. The problem is challenging because the loss distribution and threshold value of each arm are unknown. We study this novel setting by establishing its 'equivalence' to Multiple-Play Multi-Armed Bandits (MP-MAB) and Combinatorial Semi-Bandits. Exploiting these equivalences, we derive optimal algorithms for our setting using existing algorithms for MP-MAB and Combinatorial Semi-Bandits. Experiments on synthetically generated data validate performance guarantees of the proposed algorithms.

preprint, 2019, arXiv

Unsupervised Online Feature Selection for Cost-Sensitive Medical Diagnosis

In medical diagnosis, physicians predict the state of a patient by checking measurements (features) obtained from a sequence of tests, e.g., blood test, urine test, followed by invasive tests. As tests are often costly, one would like to obtain only those features (tests) that can establish the presence or absence of the state conclusively. Another aspect of medical diagnosis is that we are often faced with unsupervised prediction tasks as the true state of the patients may not be known. Motivated by such medical diagnosis problems, we consider a Cost-Sensitive Medical Diagnosis (CSMD) problem, where the true state of patients is unknown. We formulate the CSMD problem as a feature selection problem where each test gives a feature that can be used in a prediction model. Our objective is to learn strategies for selecting the features that give the best trade-off between accuracy and costs. We exploit the 'Weak Dominance' property of the problem to develop online algorithms that identify a set of features which provides an 'optimal' trade-off between cost and accuracy of prediction without requiring knowledge of the true state of the medical condition. Our empirical results validate the performance of our algorithms on problem instances generated from real-world datasets.

preprint, 2015, arXiv

Algorithms for Linear Bandits on Polyhedral Sets

We study the stochastic linear optimization problem with bandit feedback. The arms take values in an $N$-dimensional space and belong to a bounded polyhedron described by finitely many linear inequalities. We provide a lower bound for the expected regret that scales as $\Omega(N \log T)$. We then provide a nearly optimal algorithm and show that its expected regret scales as $O(N \log^{1+\epsilon}(T))$ for an arbitrarily small $\epsilon > 0$. The algorithm alternates between exploration and exploitation intervals sequentially, where a deterministic set of arms is played in the exploration intervals and a greedily selected arm is played in the exploitation intervals. We also develop an algorithm that achieves the optimal regret when the sub-Gaussianity parameter of the noise term is known. Our key insight is that for a polyhedron the optimal arm is robust to small perturbations in the reward function. Consequently, a greedily selected arm is guaranteed to be optimal when the estimation error falls below some suitable threshold. Our solution resolves a question posed by Rusmevichientong and Tsitsiklis (2011) that left open the possibility of efficient algorithms with asymptotic logarithmic regret bounds. We also show that the regret upper bounds hold with probability $1$. Our numerical investigations show that, while the theoretical results are asymptotic, the performance of our algorithms compares favorably to state-of-the-art algorithms in finite time as well.

preprint, 2013, arXiv

The Coalitional Switch off Game of Service Providers

This paper studies a significant problem in green networking, namely switching off base stations among cooperating service providers, by means of stochastic geometric and coalitional game tools. The coalitional game considered here is played by service providers who cooperate in switching off base stations. When they cooperate, any mobile is associated with the nearest base station (BS) of any service provider. Given a Poisson point process deployment model of nodes over an area and switching off base stations with some probability, it is proved that the distribution of the signal to interference plus noise ratio remains unchanged when the transmission power is increased to the level that preserves the quality of service. The coalitional game behavior of a typical player is called hedonic if the gain of any player depends solely on the members of the coalition to which the player belongs; thus, the coalitions form as a result of the preferences of the players over their possible coalitions' set. We also introduce a novel concept called the Nash-stable core, which contains those gain allocation methods that result in Nash-stable partitions. In this way, we always guarantee Nash stability. We study the non-emptiness of the Nash-stable core. Assuming that only one player chooses a coalition at a time, we prove that the Nash-stable core is non-empty if a player choosing its coalition in its turn gains zero utility whenever it has visited the chosen coalition before.
