Source author record

Rong Zheng

Rong Zheng appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Information Theory; math.IT; Networking and Internet Architecture; Distributed, Parallel, and Cluster Computing; physics.soc-ph; Social and Information Networks; Artificial Intelligence; Computer Vision; Databases; Discrete Mathematics; eess.AS; eess.SP; Machine Learning; Robotics; Sound

Catalog footprint

What is connected

20 works

15 topics

4 close collaborators


preprint, 2022, arXiv

Efficient Action Recognition Using Confidence Distillation

Modern neural networks are powerful predictive models. However, when it comes to recognizing that they may be wrong about their predictions, they perform poorly. For example, for one of the most common activation functions, the ReLU and its variants, even a well-calibrated model can produce incorrect but high-confidence predictions. In the related task of action recognition, most current classification methods are based on clip-level classifiers that densely sample a given video for non-overlapping, same-sized clips and aggregate the results using an aggregation function - typically averaging - to achieve video-level predictions. While this approach has been shown to be effective, it is sub-optimal in recognition accuracy and has a high computational overhead. To mitigate both these issues, we propose the confidence distillation framework to teach a representation of uncertainty of the teacher to the student sampler and divide the task of full video prediction between the student and the teacher models. We conduct extensive experiments on three action recognition datasets and demonstrate that our framework achieves significant improvements in action recognition accuracy (up to 20%) and computational efficiency (more than 40%).
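The clip-level baseline mentioned in this abstract (score every clip, then average) can be sketched as follows. This is only an illustration of the averaging step, not the confidence distillation framework; the function names `aggregate_clips` and `predict_video` and the toy scores are hypothetical.

```python
from typing import List, Sequence


def aggregate_clips(clip_probs: Sequence[Sequence[float]]) -> List[float]:
    """Average per-clip class probabilities into one video-level distribution."""
    if not clip_probs:
        raise ValueError("need at least one clip")
    num_classes = len(clip_probs[0])
    totals = [0.0] * num_classes
    for probs in clip_probs:
        if len(probs) != num_classes:
            raise ValueError("all clips must score the same classes")
        for c, p in enumerate(probs):
            totals[c] += p
    return [t / len(clip_probs) for t in totals]


def predict_video(clip_probs: Sequence[Sequence[float]]) -> int:
    """Return the index of the class with the highest averaged probability."""
    video_probs = aggregate_clips(clip_probs)
    return max(range(len(video_probs)), key=video_probs.__getitem__)


# Hypothetical scores for three clips over two classes.
clips = [[0.6, 0.4], [0.2, 0.8], [0.3, 0.7]]
print(predict_video(clips))
```

Every clip is scored before averaging, which is where the computational overhead noted in the abstract comes from.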

preprint, 2022, arXiv

Individualizing Head-Related Transfer Functions for Binaural Acoustic Applications

A Head-Related Transfer Function (HRTF) characterizes how a human ear receives sounds from a point in space, and depends on the shapes of one's head, pinna, and torso. Accurate estimations of HRTFs for human subjects are crucial in enabling binaural acoustic applications such as sound localization and 3D sound spatialization. Unfortunately, conventional approaches for HRTF estimation rely on specialized devices or lengthy measurement processes. This work proposes a novel lightweight method for HRTF individualization that can be implemented using commercial-off-the-shelf components and performed by average users in home settings. The proposed method has two key components: a generative neural network model that can be individualized to predict HRTFs of new subjects from sparse measurements, and a lightweight measurement procedure that collects HRTF data from spatial locations. Extensive experiments using a public dataset and in-house measurement data from 10 subjects of different ages and genders show that the individualized models significantly outperform a baseline model in the accuracy of predicted HRTFs. To further demonstrate the advantages of individualized HRTFs, we implement two prototype applications for binaural localization and acoustic spatialization. We find that the performance of a localization model is improved by 15 degrees after being trained with individualized HRTFs. Furthermore, in hearing tests, the success rate of correctly identifying the azimuth direction of incoming sounds increases by 183% after individualization.

preprint, 2020, arXiv

CacheNet: A Model Caching Framework for Deep Learning Inference on the Edge

The success of deep neural networks (DNN) in machine perception applications such as image classification and speech recognition comes at the cost of high computation and storage complexity. Inference of uncompressed large scale DNN models can only run in the cloud with extra communication latency back and forth between cloud and end devices, while compressed DNN models achieve real-time inference on end devices at the price of lower predictive accuracy. In order to have the best of both worlds (latency and accuracy), we propose CacheNet, a model caching framework. CacheNet caches low-complexity models on end devices and high-complexity (or full) models on edge or cloud servers. By exploiting temporal locality in streaming data, a high cache hit rate and consequently shorter latency can be achieved with no or only marginal decrease in prediction accuracy. Experiments on CIFAR-10 and FVG have shown CacheNet is 58-217% faster than baseline approaches that run inference tasks on end devices or edge servers alone.

preprint, 2020, arXiv

Informative Path Planning for Mobile Sensing with Reinforcement Learning

Large-scale spatial data such as air quality, thermal conditions and location signatures play a vital role in a variety of applications. Collecting such data manually can be tedious and labour intensive. With the advancement of robotic technologies, it is feasible to automate such tasks using mobile robots with sensing and navigation capabilities. However, due to limited battery lifetime and scarcity of charging stations, it is important to plan paths for the robots that maximize the utility of data collection, also known as the informative path planning (IPP) problem. In this paper, we propose a novel IPP algorithm using reinforcement learning (RL). A constrained exploration and exploitation strategy is designed to address the unique challenges of IPP, and is shown to have fast convergence and better optimality than a classical reinforcement learning approach. Extensive experiments using real-world measurement data demonstrate that the proposed algorithm outperforms state-of-the-art algorithms in most test cases. Interestingly, unlike existing solutions that have to be re-executed when any input parameter changes, our RL-based solution allows a degree of transferability across different problem instances.

preprint, 2020, arXiv

RECOME: a New Density-Based Clustering Algorithm Using Relative KNN Kernel Density

Discovering clusters from a dataset with different shapes, densities, and scales is a known challenging problem in data clustering. In this paper, we propose the RElative COre MErge (RECOME) clustering algorithm. The core of RECOME is a novel density measure, i.e., Relative $K$ Nearest Neighbor Kernel Density (RNKD). RECOME identifies core objects with unit RNKD, and partitions non-core objects into atom clusters by successively following higher-density neighbor relations toward core objects. Core objects and their corresponding atom clusters are then merged through $α$-reachable paths on a KNN graph. We discover that the number of clusters computed by RECOME is a step function of the $α$ parameter with jump discontinuity on a small collection of values. A fast jump discontinuity discovery (FJDD) method is proposed based on graph theory. RECOME is evaluated on both synthetic datasets and real datasets. Experimental results indicate that RECOME is able to discover clusters with different shapes, densities, and scales. It outperforms six baseline methods on both synthetic datasets and real datasets. Moreover, FJDD is shown to be effective in extracting the jump discontinuity set of parameter $α$ for all tested datasets, which can ease the task of data exploration and parameter tuning.

preprint, 2015, arXiv

Device Fingerprinting in Wireless Networks: Challenges and Opportunities

Node forgery or impersonation, in which legitimate cryptographic credentials are captured by an adversary, constitutes one major security threat facing wireless networks. The fact that mobile devices are prone to being compromised and reverse engineered significantly increases the risk of such attacks in which adversaries can obtain secret keys on trusted nodes and impersonate the legitimate node. One promising approach toward thwarting these attacks is through the extraction of unique fingerprints that can provide a reliable and robust means for device identification. These fingerprints can be extracted from the transmitted signal by analyzing information across the protocol stack. In this paper, the first unified and comprehensive tutorial in the area of wireless device fingerprinting for security applications is presented. In particular, we aim to provide a detailed treatment on developing novel wireless security solutions using device fingerprinting techniques. The objectives are three-fold: (i) to introduce a comprehensive taxonomy of wireless features that can be used in fingerprinting, (ii) to provide a systematic review on fingerprinting algorithms including both white-list based and unsupervised learning approaches, and (iii) to identify key open research problems in the area of device fingerprinting and feature extraction, as applied to wireless security.

preprint, 2014, arXiv

Update-Efficient Error-Correcting Product-Matrix Codes

Regenerating codes provide an efficient way to recover data at failed nodes in distributed storage systems. It has been shown that regenerating codes can be designed to minimize the per-node storage (called MSR) or minimize the communication overhead for regeneration (called MBR). In this work, we propose new encoding schemes for $[n,d]$ error-correcting MSR and MBR codes that generalize our earlier work on error-correcting regenerating codes. We show that by choosing a suitable diagonal matrix, any generator matrix of the $[n,α]$ Reed-Solomon (RS) code can be integrated into the encoding matrix. Hence, MSR codes with the least update complexity can be found. By using the coefficients of generator polynomials of $[n,k]$ and $[n,d]$ RS codes, we present a least-update-complexity encoding scheme for MBR codes. A decoding scheme is proposed that utilizes the $[n,α]$ RS code to perform data reconstruction for MSR codes. The proposed decoding scheme has better error correction capability and incurs the least number of node accesses when errors are present. A new decoding scheme is also proposed for MBR codes that can correct more error-patterns.
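The two design points named in this abstract can be made concrete with the standard storage/repair-bandwidth formulas for regenerating codes. These formulas come from the general regenerating-code model and are not stated in the abstract. The sketch assumes a file of `B` symbols that is recoverable from any `k` nodes, with a failed node repaired by contacting `d` helper nodes. The function names and the example parameters are hypothetical.

```python
from fractions import Fraction
from typing import Tuple


def msr_point(B: int, k: int, d: int) -> Tuple[Fraction, Fraction]:
    """Per-node storage alpha and total repair bandwidth gamma at the MSR point."""
    alpha = Fraction(B, k)
    gamma = Fraction(B * d, k * (d - k + 1))
    return alpha, gamma


def mbr_point(B: int, k: int, d: int) -> Tuple[Fraction, Fraction]:
    """Per-node storage alpha and total repair bandwidth gamma at the MBR point."""
    alpha = Fraction(2 * B * d, k * (2 * d - k + 1))
    return alpha, alpha  # at MBR a repair downloads exactly what the node stores


# Hypothetical parameters: B = 12 symbols, k = 2, d = 3.
print(msr_point(12, 2, 3))
print(mbr_point(12, 2, 3))
```

For these parameters, MSR stores less per node while MBR moves less data during repair, which is the trade-off the abstract refers to.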

preprint, 2013, arXiv

A Data-driven Study of Influences in Twitter Communities

This paper presents a quantitative study of Twitter, one of the most popular micro-blogging services, from the perspective of user influence. We crawl several datasets from the most active communities on Twitter and obtain 20.5 million user profiles, along with 420.2 million directed relations and 105 million tweets among the users. User influence scores are obtained from influence measurement services, Klout and PeerIndex. Our analysis reveals interesting findings, including non-power-law influence distribution, strong reciprocity among users in a community, the existence of homophily and hierarchical relationships in social influences. Most importantly, we observe that whether a user retweets a message is strongly influenced by the first of his followees who posted that message. To capture such an effect, we propose the first influencer (FI) information diffusion model and show through extensive evaluation that compared to the widely adopted independent cascade model, the FI model is more stable and more accurate in predicting influence spreads in Twitter communities.

preprint, 2013, arXiv

On Budgeted Influence Maximization in Social Networks

Given a budget and arbitrary cost for selecting each node, the budgeted influence maximization (BIM) problem concerns selecting a set of seed nodes to disseminate some information that maximizes the total number of nodes influenced (termed as influence spread) in social networks at a total cost no more than the budget. Our proposed seed selection algorithm for the BIM problem guarantees an approximation ratio of $(1 - 1/\sqrt{e})$. The seed selection algorithm needs to calculate the influence spread of candidate seed sets, which is known to be #P-hard. Identifying the linkage between the computation of marginal probabilities in Bayesian networks and the influence spread, we devise efficient heuristic algorithms for the latter problem. Experiments using both large-scale social networks and synthetically generated networks demonstrate superior performance of the proposed algorithm with moderate computation costs. Moreover, synthetic datasets allow us to vary the network parameters and gain important insights on the impact of graph structures on the performance of different algorithms.
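To illustrate the problem setting (node costs, a budget, and a spread to maximize), here is a generic cost-benefit greedy heuristic. It is not the authors' algorithm and carries no approximation guarantee. As a simplifying assumption, influence spread is replaced by plain reachability in a directed graph, and all costs are assumed positive. All names and the toy graph are hypothetical.

```python
from typing import Dict, List, Set


def spread(graph: Dict[str, List[str]], seeds: Set[str]) -> int:
    """Count nodes reachable from the seeds (deterministic stand-in for influence spread)."""
    seen = set(seeds)
    stack = list(seeds)
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen)


def budgeted_greedy(graph: Dict[str, List[str]],
                    cost: Dict[str, float],
                    budget: float) -> Set[str]:
    """Repeatedly add the affordable node with the best marginal spread per unit cost."""
    seeds: Set[str] = set()
    remaining = budget
    while True:
        base = spread(graph, seeds)
        best, best_ratio = None, 0.0
        for node in sorted(cost):  # sorted for a deterministic tie-break
            if node in seeds or cost[node] > remaining:
                continue
            ratio = (spread(graph, seeds | {node}) - base) / cost[node]
            if ratio > best_ratio:
                best, best_ratio = node, ratio
        if best is None:  # nothing affordable adds any spread
            return seeds
        seeds.add(best)
        remaining -= cost[best]


graph = {"a": ["b", "c", "d"], "b": ["c"], "e": ["f"]}
cost = {"a": 3, "b": 1, "c": 1, "d": 1, "e": 1, "f": 1}
print(sorted(budgeted_greedy(graph, cost, budget=4)))
```

In this toy instance the cheap nodes win over the expensive hub `a`, which shows how per-node costs change seed selection compared with the unit-cost setting.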

preprint, 2013, arXiv

On Quality of Monitoring for Multi-channel Wireless Infrastructure Networks

Passive monitoring utilizing distributed wireless sniffers is an effective technique to monitor activities in wireless infrastructure networks for fault diagnosis, resource management and critical path analysis. In this paper, we introduce a quality of monitoring (QoM) metric defined by the expected number of active users monitored, and investigate the problem of maximizing QoM by judiciously assigning sniffers to channels based on the knowledge of user activities in a multi-channel wireless network. Two types of capture models are considered. The user-centric model assumes frame-level capturing capability of sniffers such that the activities of different users can be distinguished while the sniffer-centric model only utilizes the binary channel information (active or not) at a sniffer. For the user-centric model, we show that the implied optimization problem is NP-hard, but a constant approximation ratio can be attained via polynomial complexity algorithms. For the sniffer-centric model, we devise stochastic inference schemes to transform the problem into the user-centric domain, where we are able to apply our polynomial approximation algorithms. The effectiveness of our proposed schemes and algorithms is further evaluated using both synthetic data and real-world traces from an operational WLAN.
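A small sketch of the QoM idea under the user-centric model follows. It makes several assumptions that the abstract does not spell out: each user operates on one channel with a known activity probability, each sniffer hears a fixed set of users, and a user counts as monitored when at least one sniffer in range is tuned to that user's channel. The exhaustive search is for illustration only and is not the paper's polynomial-time approximation algorithm. All names and numbers are hypothetical.

```python
from itertools import product
from typing import Dict, List, Tuple


def qom(assignment: Dict[str, int],
        in_range: Dict[str, List[str]],
        user_channel: Dict[str, int],
        activity: Dict[str, float]) -> float:
    """Expected number of active users monitored under a sniffer-to-channel assignment."""
    covered = set()
    for sniffer, channel in assignment.items():
        for user in in_range[sniffer]:
            if user_channel[user] == channel:
                covered.add(user)
    return sum(activity[u] for u in covered)


def best_assignment(sniffers: List[str], channels: List[int],
                    in_range: Dict[str, List[str]],
                    user_channel: Dict[str, int],
                    activity: Dict[str, float]) -> Tuple[Dict[str, int], float]:
    """Try every assignment (exponential in the number of sniffers) and keep the best."""
    best, best_q = {}, -1.0
    for combo in product(channels, repeat=len(sniffers)):
        assignment = dict(zip(sniffers, combo))
        q = qom(assignment, in_range, user_channel, activity)
        if q > best_q:
            best, best_q = assignment, q
    return best, best_q


in_range = {"s1": ["u1", "u2"], "s2": ["u2", "u3"]}
user_channel = {"u1": 1, "u2": 2, "u3": 2}
activity = {"u1": 0.5, "u2": 0.25, "u3": 0.25}
print(best_assignment(["s1", "s2"], [1, 2], in_range, user_channel, activity))
```

The search space grows as the number of channels raised to the number of sniffers, which is consistent with the NP-hardness result and motivates approximation algorithms.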

preprint, 2013, arXiv

Update-Efficient Regenerating Codes with Minimum Per-Node Storage

Regenerating codes provide an efficient way to recover data at failed nodes in distributed storage systems. It has been shown that regenerating codes can be designed to minimize the per-node storage (called MSR) or minimize the communication overhead for regeneration (called MBR). In this work, we propose a new encoding scheme for [n,d] error-correcting MSR codes that generalizes our earlier work on error-correcting regenerating codes. We show that by choosing a suitable diagonal matrix, any generator matrix of the [n,α] Reed-Solomon (RS) code can be integrated into the encoding matrix. Hence, MSR codes with the least update complexity can be found. An efficient decoding scheme is also proposed that utilizes the [n,α] RS code to perform data reconstruction. The proposed decoding scheme has better error correction capability and incurs the least number of node accesses when errors are present.

preprint, 2012, arXiv

A Robust Relay Placement Framework for 60GHz mmWave Wireless Personal Area Networks

Multimedia streaming applications with stringent QoS requirements in 60GHz mmWave wireless personal area networks (WPANs) demand high rate and low latency data transfer as well as low service disruption. In this paper, we consider the problem of robust relay placement in 60GHz WPANs. Relays forward traffic from transmitter devices to receiver devices, facilitating i) the primary communication path for non-line-of-sight (NLOS) transceiver pairs, and ii) the secondary (backup) communication path for line-of-sight (LOS) transceiver pairs. We formulate the robust minimum relay placement problem and the robust maximum utility relay placement problem with the objectives of minimizing the number of relays deployed and maximizing the network utility, respectively. Efficient algorithms are developed to solve both problems and have been shown to incur less service disruption in the presence of moving subjects that may block the LOS paths in the environment.

preprint, 2012, arXiv

Binary is Good: A Binary Inference Framework for Primary User Separation in Cognitive Radio Networks

Primary user (PU) separation concerns the issues of distinguishing and characterizing primary users in cognitive radio (CR) networks. We argue the need for PU separation in the context of collaborative spectrum sensing and monitor selection. In this paper, we model the observations of monitors as boolean OR mixtures of underlying binary latent sources for PUs, and devise a novel binary inference algorithm for PU separation. Simulation results show that without prior knowledge regarding PUs' activities, the algorithm achieves high inference accuracy. An interesting implication of the proposed algorithm is the ability to effectively represent n independent binary sources via (correlated) binary vectors of logarithmic length.

preprint, 2012, arXiv

Coalitional Games in Partition Form for Joint Spectrum Sensing and Access in Cognitive Radio Networks

Unlicensed secondary users (SUs) in cognitive radio networks are subject to an inherent tradeoff between spectrum sensing and spectrum access. Although each SU has an incentive to sense the primary user (PU) channels for locating spectrum holes, this exploration of the spectrum can come at the expense of a shorter transmission time, and, hence, a possibly smaller capacity for data transmission. This paper investigates the impact of this tradeoff on the cooperative strategies of a network of SUs that seek to cooperate in order to improve their view of the spectrum (sensing), reduce the possibility of interference among each other, and improve their transmission capacity (access). The problem is modeled as a coalitional game in partition form and an algorithm for coalition formation is proposed. Using the proposed algorithm, the SUs can make individual distributed decisions to join or leave a coalition while maximizing their utilities which capture the average time spent for sensing as well as the capacity achieved while accessing the spectrum. It is shown that, by using the proposed algorithm, the SUs can self-organize into a network partition composed of disjoint coalitions, with the members of each coalition cooperating to jointly optimize their sensing and access performance. Simulation results show the performance improvement that the proposed algorithm yields with respect to the non-cooperative case. The results also show how the algorithm allows the SUs to self-adapt to changes in the environment such as the change in the traffic of the PUs, or slow mobility.

preprint, 2011, arXiv

Exact Regenerating Codes for Byzantine Fault Tolerance in Distributed Storage

Due to the use of commodity software and hardware, crash-stop and Byzantine failures are likely to be more prevalent in today's large-scale distributed storage systems. Regenerating codes have been shown in the literature to be a more efficient way to disperse information across multiple nodes and recover from crash-stop failures. In this paper, we present a design of regenerating codes with integrity checks that allows exact regeneration of failed nodes and data reconstruction in the presence of Byzantine failures. A progressive decoding mechanism is incorporated in both procedures to leverage computation performed thus far. The fault-tolerance and security properties of the schemes are also analyzed.

preprint, 2011, arXiv

Maximum Lifetime for Data Regeneration in Wireless Sensor Networks

Robust distributed storage systems dedicated to wireless sensor networks utilize several nodes to redundantly store sensed data so that when some storage nodes fail, the sensed data can still be reconstructed. For the same level of redundancy, erasure coding based approaches are known to require less data storage space than replication methods. To maintain the same level of redundancy when one storage node fails, erasure coded data can be restored onto some other storage node by having this node download respective pieces from other live storage nodes. Previous works showed that the benefits of using erasure coding for robust storage over replication are made unappealing by the complication in regenerating lost data. More recent work has, however, proposed regenerating coding, which further reduces the bandwidth needed to regenerate erasure coded data and makes erasure codes again desirable for robust data storage. But none of these works on regenerating coding consider how these codes will perform for data regeneration in wireless sensor networks. We therefore propose an analytical model to quantify the network lifetime gains of regenerating coding over classical schemes. We also propose a distributed algorithm, TROY, that determines which nodes and routes to use for data regeneration. Our analytical studies show that for certain topologies, TROY achieves maximum network lifetime. Our evaluation studies in real sensor network traces show that TROY achieves near optimal lifetime and performs better than baseline algorithms.

preprint, 2010, arXiv

Binary Independent Component Analysis with OR Mixtures

Independent component analysis (ICA) is a computational method for separating a multivariate signal into subcomponents assuming the mutual statistical independence of the non-Gaussian source signals. The classical ICA framework usually assumes linear combinations of independent sources over the field of real-valued numbers R. In this paper, we investigate binary ICA for OR mixtures (bICA), which can find applications in many domains including medical diagnosis, multi-cluster assignment, Internet tomography and network resource management. We prove that bICA is uniquely identifiable under the disjunctive generation model, and propose a deterministic iterative algorithm to determine the distribution of the latent random variables and the mixing matrix. The inverse problem of inferring the values of latent variables is also considered, along with noisy measurements. We conduct an extensive simulation study to verify the effectiveness of the proposed algorithm and present examples of real-world applications where bICA can be applied.
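The disjunctive (OR) generation model can be illustrated in a few lines: each observation is the boolean OR of the sources selected by one row of a binary mixing matrix. This sketch covers only the forward model, not the inference algorithm from the paper. The names `or_mixture` and `linear_mixture` and the example matrix are hypothetical.

```python
from typing import List, Sequence


def or_mixture(mixing: Sequence[Sequence[int]], sources: Sequence[int]) -> List[int]:
    """Observation i is 1 if any source j with mixing[i][j] == 1 is active."""
    return [int(any(g and y for g, y in zip(row, sources))) for row in mixing]


def linear_mixture(mixing: Sequence[Sequence[int]], sources: Sequence[int]) -> List[int]:
    """Classical linear mixing over the reals, shown for contrast."""
    return [sum(g * y for g, y in zip(row, sources)) for row in mixing]


# Hypothetical 3x3 binary mixing matrix: rows are observers, columns are sources.
G = [[1, 0, 1],
     [0, 1, 0],
     [1, 1, 0]]
y = [1, 1, 0]
print(or_mixture(G, y))      # OR saturates at 1
print(linear_mixture(G, y))  # linear mixing counts overlapping sources
```

Under OR mixing, the third observer cannot tell whether one or two of its sources are active, whereas linear mixing would report the count. This loss of information is what makes the binary inference problem distinct from classical ICA.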

preprint, 2010, arXiv

Binary Inference for Primary User Separation in Cognitive Radio Networks

Spectrum sensing has received much attention recently in cognitive radio (CR) network research, i.e., secondary users (SUs) constantly monitor channel conditions to detect the presence of the primary users (PUs). In this paper, we go beyond spectrum sensing and introduce the PU separation problem, which concerns the issues of distinguishing and characterizing PUs in the context of collaborative spectrum sensing and monitor selection. The observations of monitors are modeled as boolean OR mixtures of underlying binary sources for PUs. We first justify the use of the binary OR mixture model as opposed to the traditional linear mixture model through simulation studies. Then we devise a novel binary inference algorithm for PU separation. Not only are PU-SU relationships revealed, but PUs' transmission statistics and activities at each time slot can also be inferred. Simulation results show that without any prior knowledge regarding PUs' activities, the algorithm achieves high inference accuracy even in the presence of noisy measurements.

preprint, 2010, arXiv

Progressive Decoding for Data Availability and Reliability in Distributed Networked Storage

To harness the ever growing capacity and decreasing cost of storage, providing an abstraction of dependable storage in the presence of crash-stop and Byzantine failures is compulsory. We propose a decentralized Reed-Solomon coding mechanism with minimum communication overhead. Using a progressive data retrieval scheme, a data collector contacts only the necessary number of storage nodes needed to guarantee data integrity. The scheme gracefully adapts the cost of successful data retrieval to the number of storage node failures. Moreover, by leveraging the Welch-Berlekamp algorithm, it avoids unnecessary computations. Compared to the state-of-the-art decoding scheme, the implementation and evaluation results show that our progressive data retrieval scheme has up to 35 times better computation performance for low Byzantine node rates. Additionally, the communication cost in data retrieval is derived analytically and corroborated by Monte-Carlo simulation results. Our implementation is flexible in that the level of redundancy it provides is independent of the number of data generating nodes, a requirement for distributed storage systems.

preprint, 2007, arXiv

Order-Optimal Data Aggregation in Wireless Sensor Networks - Part I: Regular Networks

The predominant traffic patterns in a wireless sensor network are many-to-one and one-to-many communication. Hence, the performance of wireless sensor networks is characterized by the rate at which data can be disseminated from or aggregated to a data sink. In this paper, we consider the data aggregation problem. We demonstrate that a data aggregation rate of O(log(n)/n) is optimal and that this rate can be achieved in wireless sensor networks using a generalization of cooperative beamforming called cooperative time-reversal communication.
