Researcher profile

R. Michael Buehrer

R. Michael Buehrer contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
17 works
0 followers
10 topics
4 close collaborators

Published work

17 published items

preprint | 2023 | arXiv

Design of Rim-Located Reconfigurable Reflectarrays for Interference Mitigation in Reflector Antennas

Radio telescopes are susceptible to interference arriving through their sidelobes. If a reflector antenna could be retrofitted with an adaptive null steering system, it could potentially mitigate this interference. The design of a reflectarray which can be used to reconfigure a radio telescope's radiation pattern by driving a null to the angle of incoming interference is presented. The reflectarray occupies only a portion of the rim of the original reflector and lies conformal to the paraboloid within this region. The conformal reflectarray contains unit cells with 1-bit reconfigurability stemming from two symmetrically placed PIN diodes. It is found that the dielectric and switch losses introduced by the reflectarray do not significantly affect the radio telescope's efficiency since the reflectarray is placed only along the outer rim of the reflector, which is weakly illuminated. Simulation results of an L-band reconfigurable reflectarray for an 18 m prime-focus-fed parabola are presented.

preprint | 2022 | arXiv

Decentralized Bandits with Feedback for Cognitive Radar Networks

Completely decentralized Multi-Player Bandit models have demonstrated high localization accuracy at the cost of long convergence times in cognitive radar networks. Rather than model each radar node as an independent learner, entirely unable to swap information with other nodes in a network, in this work we construct a "central coordinator" to facilitate the exchange of information between radar nodes. We show that in interference-limited spectrum, where the signal-to-interference-plus-noise ratio (SINR) for the available bands may vary by location, a cognitive radar network (CRN) is able to use information from a central coordinator to reduce the number of time steps required to attain a given localization error. Importantly, each node is still able to learn separately. We provide a description of a network which has hybrid cognition in both a central coordinator and in each of the cognitive radar nodes, and examine the online machine learning algorithms which can be implemented in this structure.

preprint | 2022 | arXiv

Distributed Online Learning for Coexistence in Cognitive Radar Networks

This work addresses the coexistence problem for radar networks. Specifically, we model a network of cooperative, independent, and non-communicating radar nodes which must share resources within the network as well as with non-cooperative nearby emitters. We approach this problem using online Machine Learning (ML) techniques. Online learning approaches are specifically preferred due to the fact that each radar node has no prior knowledge of the environment nor of the positions of the other radar nodes, and due to the sequential nature of the problem. For this task we specifically select the multi-player multi-armed bandit (MMAB) model, which poses the problem as a sequential game, where each radar node in a network makes independent selections of center frequency and waveform with the same goal of improving tracking performance for the network as a whole. For accurate tracking, each radar node communicates observations to a fusion center on set intervals. The fusion center has knowledge of the radar node placement, but cannot communicate to the individual nodes fast enough for waveform control. Every radar node in the network must learn the behavior of the environment, which includes rewards, interferer behavior, and target behavior. Our contributions include a mathematical description of the MMAB framework adapted to the radar network scenario. We conclude with a simulation study of several different network configurations. Experimental results show that iterative, online learning using MMAB outperforms the more traditional sense-and-avoid (SAA) and fixed-allocation approaches.

preprint | 2022 | arXiv

Linear Jamming Bandits: Sample-Efficient Learning for Non-Coherent Digital Jamming

It has been shown (Amuru et al. 2015) that online learning algorithms can be effectively used to select optimal physical layer parameters for jamming against digital modulation schemes without a priori knowledge of the victim's transmission strategy. However, this learning problem involves solving a multi-armed bandit problem with a mixed action space that can grow very large. As a result, convergence to the optimal jamming strategy can be slow, especially when the victim and jammer's symbols are not perfectly synchronized. In this work, we remedy the sample efficiency issues by introducing a linear bandit algorithm that accounts for inherent similarities between actions. Further, we propose context features which are well-suited for the statistical features of the non-coherent jamming problem and demonstrate significantly improved convergence behavior compared to the prior art. Additionally, we show how prior knowledge about the victim's transmissions can be seamlessly integrated into the learning framework. We finally discuss limitations in the asymptotic regime.

preprint | 2022 | arXiv

Model Order Estimation in the Presence of multipath Interference using Residual Convolutional Neural Networks

Model order estimation (MOE) is often a prerequisite for Direction of Arrival (DoA) estimation. Due to limits imposed by array geometry, it is typically not possible to estimate spatial parameters for an arbitrary number of sources; an estimate of the signal model is usually required. MOE is the process of selecting the most likely signal model from several candidates. While classic methods fail at MOE in the presence of coherent multipath interference, data-driven supervised learning models can solve this problem. Instead of the classic MLP (Multilayer Perceptron) or CNN (Convolutional Neural Network) architectures, we propose the application of Residual Convolutional Neural Networks (RCNN), with grouped symmetric kernel filters to deliver state-of-the-art estimation accuracy of up to 95.2% in the presence of coherent multipath, and a weighted loss function to eliminate underestimation error of the model order. We show the benefit of the approach by demonstrating its impact on an overall signal processing flow that determines the number of total signals received by the array, the number of independent sources, and the association of each of the paths with those sources. Moreover, we show that the proposed estimator provides accurate performance over a variety of array types, can identify the overloaded scenario, and ultimately provides strong DoA estimation and signal association performance.

preprint | 2022 | arXiv

Multi-Band Wi-Fi Sensing with Matched Feature Granularity

Complementary to the fine-grained channel state information (CSI) from the physical layer and coarse-grained received signal strength indicator (RSSI) measurements, the mid-grained spatial beam attributes (e.g., beam SNR) that are available at millimeter-wave (mmWave) bands during the mandatory beam training phase can be repurposed for Wi-Fi sensing applications. In this paper, we propose a multi-band Wi-Fi fusion method for Wi-Fi sensing that hierarchically fuses the features from both the fine-grained CSI at sub-6 GHz and the mid-grained beam SNR at 60 GHz in a granularity matching framework. The granularity matching is realized by pairing two feature maps from the CSI and beam SNR at different granularity levels and linearly combining all paired feature maps into a fused feature map with learnable weights. To further address the issue of limited labeled training data, we propose an autoencoder-based multi-band Wi-Fi fusion network that can be pre-trained in an unsupervised fashion. Once the autoencoder-based fusion network is pre-trained, we detach the decoders and append multi-task sensing heads to the fused feature map by fine-tuning the fusion block and re-training the multi-task heads from scratch. The multi-band Wi-Fi fusion framework is thoroughly validated by in-house experimental Wi-Fi sensing datasets spanning three tasks: 1) pose recognition; 2) occupancy sensing; and 3) indoor localization. Comparison to four baseline methods (i.e., CSI-only, beam SNR-only, input fusion, and feature fusion) demonstrates that granularity matching improves the multi-task sensing performance. Quantitative performance is evaluated as a function of the amount of labeled training data, latent space dimension, and fine-tuning learning rates.

preprint | 2022 | arXiv

Universal Learning Waveform Selection Strategies for Adaptive Target Tracking

Online selection of optimal waveforms for target tracking with active sensors has long been a problem of interest. Many conventional solutions utilize an estimation-theoretic interpretation, in which a waveform-specific Cramér-Rao lower bound on measurement error is used to select the optimal waveform for each tracking step. However, this approach is only valid in the high SNR regime, and requires a rather restrictive set of assumptions regarding the target motion and measurement models. Further, due to computational concerns, many traditional approaches are limited to near-term, or myopic, optimization, even though radar scenes exhibit strong temporal correlation. More recently, reinforcement learning has been proposed for waveform selection, in which the problem is framed as a Markov decision process (MDP), allowing for long-term planning. However, a major limitation of reinforcement learning is that the memory length of the underlying Markov process is often unknown for realistic target and channel dynamics, and a more general framework is desirable. This work develops a universal sequential waveform selection scheme which asymptotically achieves Bellman optimality in any radar scene which can be modeled as a $U^{\text{th}}$ order Markov process for a finite, but unknown, integer $U$. Our approach is based on well-established tools from the field of universal source coding, where a stationary source is parsed into variable length phrases in order to build a context-tree, which is used as a probabilistic model for the scene's behavior. We show that an algorithm based on a multi-alphabet version of the Context-Tree Weighting (CTW) method can be used to optimally solve a broad class of waveform-agile tracking problems while making minimal assumptions about the environment's behavior.

preprint | 2022 | arXiv

Weight Selection for Pattern Control of Paraboloidal Reflector Antennas with Reconfigurable Rim Scattering

It has recently been demonstrated that modifying the rim scattering of a paraboloidal reflector antenna through the use of reconfigurable elements along the rim facilitates sidelobe modification, including cancelling sidelobes. In this work we investigate techniques for determining the unit-magnitude weights (i.e., weights which modify the phase of the scattered signals) to accomplish sidelobe cancellation at arbitrary angles from the reflector axis. Specifically, it is shown that despite the large search space and the non-convexity of the cost function, weights can be found with reasonable complexity which provide significant cancellation capability. First, the optimal weights without any magnitude constraints are found. Afterwards, algorithms are developed for determining the unit-modulus weights with both quantized and unquantized phases. Further, it is shown that weights can be obtained that cancel sidelobes while providing a constant main lobe gain. A primary finding is that sufficiently deep nulls are possible with essentially no change in the main lobe with practical (binary or quaternary) phase-only weights.
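
As a rough illustration of what binary or quaternary phase-only weights look like, the sketch below projects a set of unconstrained complex weights onto unit-modulus weights with quantized phases. This naive nearest-phase projection is an assumption made for illustration; it is not the weight-selection algorithm developed in the paper, and the function name and input values are hypothetical.

```python
import cmath
import math


def quantize_to_unit_modulus(weights, bits):
    """Project complex weights onto unit-modulus weights with 2**bits phase states."""
    levels = 2 ** bits
    step = 2 * math.pi / levels
    quantized = []
    for w in weights:
        # Discard the magnitude, then snap the phase to the nearest allowed state.
        k = round(cmath.phase(w) / step) % levels
        quantized.append(cmath.exp(1j * k * step))
    return quantized


# Hypothetical unconstrained weights (illustrative values only).
unconstrained = [0.9 + 0.2j, -0.3 + 1.1j, -0.8 - 0.1j, 0.1 - 0.7j]
binary = quantize_to_unit_modulus(unconstrained, bits=1)      # phases in {0, pi}
quaternary = quantize_to_unit_modulus(unconstrained, bits=2)  # phases in {0, pi/2, pi, 3pi/2}
```

Every output has magnitude one, so only the phase of each rim element changes. A direct projection like this generally loses null depth compared with a search over the quantized weights.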

preprint | 2021 | arXiv

Constrained Online Learning to Mitigate Distortion Effects in Pulse-Agile Cognitive Radar

Pulse-agile radar systems have demonstrated favorable performance in dynamic electromagnetic scenarios. However, the use of non-identical waveforms within a radar's coherent processing interval may lead to harmful distortion effects when pulse-Doppler processing is used. This paper presents an online learning framework to optimize detection performance while mitigating harmful sidelobe levels. The radar waveform selection process is formulated as a linear contextual bandit problem, within which waveform adaptations which exceed a tolerable level of expected distortion are eliminated. The constrained online learning approach is effective and computationally feasible, evidenced by simulations in a radar-communication coexistence scenario and in the presence of intentional adaptive jamming. This approach is applied to both stochastic and adversarial contextual bandit learning models and the detection performance in dynamic scenarios is evaluated.
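
The elimination step described above can be sketched as a filter applied before the bandit's usual selection rule. The example assumes a linear reward estimate (a parameter vector dotted with each candidate's features) and a precomputed expected distortion per candidate. The field names, threshold, and numbers are hypothetical and are not taken from the paper.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def select_waveform(theta, candidates, max_distortion):
    """Return the feasible candidate with the highest estimated reward, or None."""
    # Eliminate candidates whose expected distortion exceeds the tolerance.
    feasible = [c for c in candidates if c["expected_distortion"] <= max_distortion]
    if not feasible:
        return None
    # Greedy choice under a linear reward model (exploration is omitted here).
    return max(feasible, key=lambda c: dot(theta, c["features"]))


# Hypothetical reward parameters and candidate waveform adaptations.
theta = [0.5, 1.0]
candidates = [
    {"name": "A", "features": [1.0, 0.2], "expected_distortion": 0.1},
    {"name": "B", "features": [0.4, 1.0], "expected_distortion": 0.6},
    {"name": "C", "features": [0.8, 0.5], "expected_distortion": 0.3},
]
choice = select_waveform(theta, candidates, max_distortion=0.4)
```

Candidate B has the highest estimated reward but is excluded by the distortion limit, so the selection falls to C.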

preprint | 2021 | arXiv

Multi-player Bandits for Distributed Cognitive Radar

With new applications for radar networks such as automotive control or indoor localization, the need for spectrum sharing and general interoperability is expected to rise. This paper describes the application of multi-player bandit algorithms for waveform selection to a distributed cognitive radar network that must coexist with a communications system. Specifically, we make the assumption that radar nodes in the network have no dedicated communication channel. As we will discuss later, nodes can communicate indirectly by taking actions which intentionally interfere with other nodes and observing the resulting collisions. The radar nodes attempt to optimize their own spectrum utilization while avoiding collisions, not only with each other, but with the communications system. The communications system is assumed to statically occupy some subset of the bands available to the radar network. First, we examine models that assume each node experiences equivalent channel conditions, and later examine a model that relaxes this assumption.

preprint | 2020 | arXiv

Centimeter-Level Indoor Localization using Channel State Information with Recurrent Neural Networks

Modern techniques in the Internet of Things or autonomous driving require more accurate positioning than ever. Classic localization techniques are mainly suited to outdoor scenarios and do not meet the requirements of indoor cases with multiple paths. Meanwhile, as a feature robust to noise and time variations, Channel State Information (CSI) has shown its advantages over the Received Signal Strength Indicator (RSSI) for more accurate positioning. To this end, this paper proposes a neural network method for centimeter-level indoor positioning with real CSI data collected from linear antennas. It uses the amplitude of the channel response or a correlation matrix as the input, which can greatly reduce the data size and suppress the noise. It also makes use of the consistency in the user motion trajectory via a Recurrent Neural Network (RNN) and signal-to-noise ratio (SNR) information, which can further improve the estimation accuracy, especially when learning from small datasets. These contributions all benefit the efficiency of the neural network, as shown by comparison with other classic supervised learning methods.

preprint | 2020 | arXiv

Deep Reinforcement Learning Control for Radar Detection and Tracking in Congested Spectral Environments

In this paper, dynamic non-cooperative coexistence between a cognitive pulsed radar and a nearby communications system is addressed by applying nonlinear value function approximation via deep reinforcement learning (Deep RL) to develop a policy for optimal radar performance. The radar learns to vary the bandwidth and center frequency of its linear frequency modulated (LFM) waveforms to mitigate mutual interference with other systems and improve target detection performance while also maintaining sufficient utilization of the available frequency bands required for a fine range resolution. We demonstrate that our approach, based on the Deep Q-Learning (DQL) algorithm, enhances important radar metrics, including SINR and bandwidth utilization, more effectively than policy iteration or sense-and-avoid (SAA) approaches in a variety of realistic coexistence environments. We also extend the DQL-based approach to incorporate Double Q-learning and a recurrent neural network to form a Double Deep Recurrent Q-Network (DDRQN). We demonstrate the DDRQN results in favorable performance and stability compared to DQL and policy iteration. Finally, we demonstrate the practicality of our proposed approach through a discussion of experiments performed on a software defined radar (SDRadar) prototype system. Our experimental results indicate that the proposed Deep RL approach significantly improves radar detection performance in congested spectral environments when compared to policy iteration and SAA.
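
For readers unfamiliar with Q-learning, the sketch below shows the tabular update rule that Deep Q-Learning approximates with a neural network. It is a simplification: the paper uses nonlinear function approximation, not a table, and the states, actions, reward, and hyperparameters here are hypothetical.

```python
def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value."""
    # Bootstrapped target: immediate reward plus discounted best next-state value.
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]


# Two hypothetical states with two actions each (e.g. two candidate bands).
q = [[0.0, 0.0], [0.0, 1.0]]
new_value = q_update(q, state=0, action=1, reward=0.5, next_state=1)
```

Here the target is 0.5 + 0.9 * 1.0 = 1.4, so the entry moves one tenth of the way from 0.0 toward 1.4.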

preprint | 2020 | arXiv

Direction of Arrival Estimation for a Vector Sensor Using Deep Neural Networks

A vector sensor, a type of sensor array with six collocated antennas to measure all electromagnetic field components of incident waves, has been shown to be advantageous in estimating the angle of arrival and polarization of the incident sources. While angle estimation with machine learning for linear arrays has been well studied, there has not been a similar solution for the vector sensor. In this paper, we propose neural networks to determine the number of the sources and estimate the angle of arrival of each source, based on the covariance matrix extracted from received data. Also, we provide a solution for matching output angles to corresponding sources and examine the error distributions with this method. The results show that neural networks can achieve reasonably accurate estimation with up to 5 sources, especially if the field-of-view is limited.

preprint | 2020 | arXiv

Efficient Online Learning for Cognitive Radar-Cellular Coexistence via Contextual Thompson Sampling

This paper describes a sequential, or online, learning scheme for adaptive radar transmissions that facilitate spectrum sharing with a non-cooperative cellular network. First, the interference channel between the radar and a spatially distant cellular network is modeled. Then, a linear Contextual Bandit (CB) learning framework is applied to drive the radar's behavior. The fundamental trade-off between exploration and exploitation is balanced by a proposed Thompson Sampling (TS) algorithm, a pseudo-Bayesian approach which selects waveform parameters based on the posterior probability that a specific waveform is optimal, given discounted channel information as context. It is shown that the contextual TS approach converges more rapidly to behavior that minimizes mutual interference and maximizes spectrum utilization than comparable contextual bandit algorithms. Additionally, we show that the TS learning scheme results in a favorable SINR distribution compared to other online learning algorithms. Finally, the proposed TS algorithm is compared to a deep reinforcement learning model. We show that the TS algorithm maintains competitive performance with a more complex Deep Q-Network (DQN).
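
The posterior-sampling idea behind Thompson Sampling can be sketched in its simplest form, a Beta-Bernoulli bandit without context. This is a deliberate simplification of the linear contextual TS scheme in the abstract: it has no context vector and no discounting, and the arm success probabilities are hypothetical stand-ins for how often each waveform avoids interference.

```python
import random


def thompson_sampling(success_probs, rounds, seed=0):
    """Run Beta-Bernoulli Thompson sampling and return the pull count per arm."""
    rng = random.Random(seed)
    n_arms = len(success_probs)
    wins = [0] * n_arms    # observed successes per arm
    losses = [0] * n_arms  # observed failures per arm
    pulls = [0] * n_arms
    for _ in range(rounds):
        # Draw one sample from each arm's Beta posterior (uniform prior).
        samples = [rng.betavariate(wins[a] + 1, losses[a] + 1) for a in range(n_arms)]
        # Play the arm whose sampled success probability is largest.
        arm = max(range(n_arms), key=lambda a: samples[a])
        reward = 1 if rng.random() < success_probs[arm] else 0
        wins[arm] += reward
        losses[arm] += 1 - reward
        pulls[arm] += 1
    return pulls


# Hypothetical per-waveform success probabilities.
pulls = thompson_sampling([0.2, 0.5, 0.8], rounds=2000)
```

Because each arm is chosen in proportion to the posterior probability that it is best, exploration fades on its own as evidence accumulates and most pulls concentrate on the strongest arm.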

preprint | 2020 | arXiv

Experimental Analysis of Reinforcement Learning Techniques for Spectrum Sharing Radar

In this work, we first describe a framework for the application of Reinforcement Learning (RL) control to a radar system that operates in a congested spectral setting. We then compare the utility of several RL algorithms through a discussion of experiments performed on commercial off-the-shelf (COTS) hardware. Each RL technique is evaluated in terms of convergence, radar detection performance achieved in a congested spectral environment, and the ability to share 100 MHz of spectrum with an uncooperative communications system. We examine policy iteration, which solves an environment posed as a Markov Decision Process (MDP) by directly solving for a stochastic mapping between environmental states and radar waveforms, as well as Deep RL techniques, which utilize a form of Q-Learning to approximate a parameterized function that is used by the radar to select optimal actions. We show that RL techniques are beneficial over a Sense-and-Avoid (SAA) scheme and discuss the conditions under which each approach is most effective.

preprint | 2020 | arXiv

Interference Classification Using Deep Neural Networks

The recent success in implementing supervised learning to classify modulation types suggests that other problems akin to modulation classification would eventually benefit from that implementation. One of these problems is classifying the interference type added to a signal-of-interest, also known as interference classification. In this paper, we propose an interference classification method using a deep neural network. We generate five distinct types of interfering signals then use both the power-spectral density (PSD) and the cyclic spectrum of the received signal as input features to the network. The computer experiments reveal that using the received signal PSD outperforms using its cyclic spectrum in terms of accuracy. In addition, the same experiments show that the feed-forward networks yield better accuracy than classic methods. The proposed classifier aids the subsequent stage in the receiver chain with choosing the appropriate mitigation algorithm and also can coexist with modulation-classification methods to further improve the classifier accuracy.

preprint | 2020 | arXiv

Predicting Bit Error Rate from Meta Information using Random Forests

With the increasing power of machine learning-based reasoning, the use of meta-information (e.g., digital signal modulation parameters, channel conditions, etc.) to predict the performance of various signal processing techniques has become feasible. One such problem of practical interest is choosing a proper interference mitigation method based on the meta-information of the received signal. Since heuristic table-based methods suffer from limited prediction capability for unseen cases, we propose a recommendation system based on the use of Random Forests (RF). Specifically, RF is used to predict the Bit-Error-Rate (BER) of all mitigation approaches so as to determine the approach with the best performance. We found that RF can predict BER with high accuracy, and its importance factor demonstrates which input attributes matter most. These BER prediction results can also benefit other functions such as adaptive modulation, channel sensing, beaming selection, etc.