Source author record

Xun Gong

Xun Gong appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, Unclaimed source record

Cryptography and Security, Computer Vision, Information Theory, math.IT, eess.AS, Sound, eess.SP, eess.SY, Networking and Internet Architecture, quant-ph, Systems and Control

Catalog footprint

What is connected

13 works

11 topics

4 close collaborators

preprint, 2022, arXiv

An Online Data-Driven Method for Microgrid Secondary Voltage and Frequency Control with Ensemble Koopman Modeling

Low inertia, nonlinearity, and a high level of uncertainty (varying topologies and operating conditions) pose challenges to microgrid (MG) system-wide operation. This paper proposes an online adaptive Koopman operator optimal control (AKOOC) method for MG secondary voltage and frequency control. Unlike typical data-driven methods, which are data-hungry and lack guaranteed stability, the proposed AKOOC requires no warm-up training, yet guarantees bounded-input-bounded-output (BIBO) stability and even asymptotic stability under some mild conditions. The proposed AKOOC is built on ensemble Koopman state-space modeling with full basis functions, which combines linear and nonlinear bases without the need for event detection or switching. An iterative learning method is also developed to exploit model parameters, ensuring the effectiveness and adaptiveness of the designed control. Simulation studies on the 4-bus MG system (with detailed inner-loop control) and the 34-bus MG system showed improved modeling accuracy and control, verifying the effectiveness of the proposed method under various changes in operating conditions, even with time delay, measurement noise, and missing measurements.

preprint, 2022, arXiv

Knowledge Transfer and Distillation from Autoregressive to Non-Autoregressive Speech Recognition

Modern non-autoregressive (NAR) speech recognition systems aim to accelerate inference; however, they suffer from performance degradation compared with autoregressive (AR) models, as well as from large model sizes. We propose a novel knowledge transfer and distillation architecture that leverages knowledge from AR models to improve NAR performance while reducing the model's size. Frame-level and sequence-level objectives are designed for transfer learning. To further boost NAR performance, a beam search method on Mask-CTC is developed to enlarge the search space during the inference stage. Experiments show that the proposed NAR beam search achieves a relative CER reduction of over 5% on the AISHELL-1 benchmark with a tolerable real-time-factor (RTF) increment. With knowledge transfer, the NAR student, which has the same size as the AR teacher, obtains relative CER reductions of 8%/16% on the AISHELL-1 dev/test sets, and over 25% relative WER reductions on the LibriSpeech test-clean/other sets. Moreover, the ~9x smaller NAR models achieve ~25% relative CER/WER reductions on both the AISHELL-1 and LibriSpeech benchmarks with the proposed knowledge transfer and distillation.

preprint, 2022, arXiv

Layer-wise Fast Adaptation for End-to-End Multi-Accent Speech Recognition

Accent variability poses a huge challenge to automatic speech recognition (ASR) modeling. Although adaptation systems based on one-hot accent vectors are commonly used, they require prior knowledge about the target accent and cannot handle unseen accents. Furthermore, simply concatenating accent embeddings does not make good use of accent knowledge and yields limited improvements. In this work, we aim to tackle these problems with a novel layer-wise adaptation structure injected into the E2E ASR model encoder. The adapter layer encodes an arbitrary accent in the accent space and assists the ASR model in recognizing accented speech. Given an utterance, the adaptation structure extracts the corresponding accent information and transforms the input acoustic feature into an accent-related feature through a linear combination of all accent bases. We further explore the injection position of the adaptation layer, the number of accent bases, and different types of accent bases to achieve better accent adaptation. Experimental results show that the proposed adaptation structure brings 12% and 10% relative word error rate (WER) reductions on the AESRC2020 accent dataset and the Librispeech dataset, respectively, compared to the baseline.

preprint, 2022, arXiv

ReplaceBlock: An improved regularization method based on background information

The attention mechanism, which is frequently used to train networks for better feature representations, can effectively disentangle the target object from irrelevant objects in the background. Given an arbitrary image, we find that the irrelevant objects in the background are the most likely to occlude the target object. Based on this finding, we propose ReplaceBlock, which simulates situations in which the target object is partially occluded by objects that are deemed background. Specifically, ReplaceBlock erases the target object in the image and then uses the model to generate a feature map containing only the irrelevant objects and background. Finally, some regions of this background feature map are used to replace some regions of the target object in the feature map of the original image. In this way, ReplaceBlock can effectively simulate the feature map of an occluded image. The experimental results show that ReplaceBlock works better than DropBlock in regularizing convolutional networks.

preprint, 2022, arXiv

Self-Supervised Implicit Attention: Guided Attention by The Model Itself

We propose Self-Supervised Implicit Attention (SSIA), a new approach that adaptively guides deep neural network models to gain attention by exploiting the properties of the models themselves. SSIA is a novel attention mechanism that, in contrast to existing attention mechanisms, requires no extra parameters, computation, or memory access costs during inference. In short, by considering attention weights as higher-level semantic information, we reconsider the implementation of existing attention mechanisms and propose generating supervisory signals from higher network layers to guide parameter updates in lower network layers. We achieve this by building a self-supervised learning task from the hierarchical features of the network itself, which operates only at the training stage. To verify the effectiveness of SSIA, we implemented one particular instance (called an SSIA block) in convolutional neural network models and validated it on several image classification datasets. The experimental results show that an SSIA block can significantly improve model performance and even outperforms many popular attention methods that require additional parameters and computation costs, such as Squeeze-and-Excitation and the Convolutional Block Attention Module. Our implementation will be available on GitHub.

preprint, 2022, arXiv

The Fixed Sub-Center: A Better Way to Capture Data Complexity

Representing each class with a single center may fail to capture the complexity of the data distribution. Using multiple sub-centers is an alternative way to address this problem. However, existing multi-subclass methods have three typical issues that need to be addressed: highly correlated sub-classes, classifier parameters that grow linearly with the number of classes, and a lack of intra-class compactness. To this end, we propose the Fixed Sub-Center (F-SC), which allows the model to create more discrepant sub-centers while saving memory and cutting computational costs considerably. Specifically, F-SC first samples a class center Ui for each class from a uniform distribution and then generates a normal distribution for each class whose mean is equal to Ui. Finally, the sub-centers are sampled from the normal distribution corresponding to each class and are kept fixed during training, avoiding the overhead of gradient calculation. Moreover, F-SC penalizes the Euclidean distance between the samples and their corresponding sub-centers, which helps maintain intra-class compactness. The experimental results show that F-SC significantly improves the accuracy of both image classification and fine-grained recognition tasks.
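The sampling procedure described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the uniform range [-1, 1], the standard deviation `spread`, and the choice of the nearest sub-center as the "corresponding" one are all assumptions made for the example.

```python
import math
import random


def make_fixed_sub_centers(num_classes, subs_per_class, dim, spread=0.1, seed=0):
    """Sample sub-centers once; they are never updated afterwards.

    The range [-1, 1] and the value of `spread` are illustrative assumptions.
    """
    rng = random.Random(seed)
    all_sub_centers = []
    for _ in range(num_classes):
        # Step 1: class center U_i drawn from a uniform distribution.
        class_center = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        # Step 2: sub-centers drawn from a normal distribution with mean U_i.
        subs = [
            [rng.gauss(mu, spread) for mu in class_center]
            for _ in range(subs_per_class)
        ]
        all_sub_centers.append(subs)
    return all_sub_centers


def compactness_penalty(feature, class_sub_centers):
    """Euclidean distance from a feature to its nearest sub-center.

    Picking the nearest sub-center is an assumption of this sketch.
    """
    return min(math.dist(feature, c) for c in class_sub_centers)


centers = make_fixed_sub_centers(num_classes=3, subs_per_class=4, dim=2)
print(compactness_penalty([0.0, 0.0], centers[0]))
```

Because the sub-centers are plain constants, no gradient needs to be computed for them during training; only the feature extractor would be updated to reduce the penalty.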

preprint, 2021, arXiv

Cyclic three-level-pulse-area theorem for enantioselective state transfer of chiral molecules

We derive a pulse-area theorem for a cyclic three-level system, an archetypal model for exploring enantioselective state transfer (ESST) in chiral molecules driven by three linearly polarized microwave pulses. By dividing the closed-loop excitation into two separate stages, we obtain both the amplitude and phase conditions that the three control fields must satisfy to achieve high-fidelity ESST. As a proof of principle, we apply this pulse-area theorem to cyclohexylmethanol molecules ($\text{C}_{7}\text{H}_{14}\text{O}$), for which three rotational states are connected by the $a$-type, $b$-type, and $c$-type components of the transition dipole moments, under both center-frequency resonant and detuned conditions. Our results show that two enantiomers with opposite handedness can be transferred to different target states by designing three microwave pulses that satisfy the amplitude and phase conditions at the transition frequencies. The corresponding control schemes are robust against time delays between the two stages. We suggest that the two control fields used in the second stage should be applied simultaneously in practical applications. This work contributes an alternative pulse-area theorem to the field of quantum control, which has the potential to determine the chirality of enantiomers in a mixture.

preprint, 2020, arXiv

Achievable Rates of Opportunistic Cognitive Radio Systems Using Reconfigurable Antennas with Imperfect Sensing and Channel Estimation

We consider an opportunistic cognitive radio (CR) system in which the secondary transmitter (SUtx) is equipped with a reconfigurable antenna (RA). Utilizing the beam-steering capability of the RA, we study a design framework for integrated sector-based spectrum sensing and data communication. In this framework, SUtx senses the spectrum and detects the beam corresponding to the active primary user's (PU) location. SUtx also sends training symbols (prior to data symbols) to enable channel estimation at the secondary receiver (SUrx) and selection of the strongest beam between SUtx and SUrx for data transmission. We establish a lower bound on the achievable rates of the SUtx-SUrx link in the presence of spectrum sensing and channel estimation errors, as well as errors due to incorrect detection of the beam corresponding to the PU's location and incorrect selection of the strongest beam for data transmission. We formulate a novel constrained optimization problem that maximizes the derived achievable-rate lower bound subject to average transmit and interference power constraints. We optimize the durations of spatial spectrum sensing and channel training as well as the data symbol transmission power. Our numerical results demonstrate that, between optimizing the spectrum sensing duration and the channel training duration, the latter is more important for providing higher achievable rates.

preprint, 2014, arXiv

Quantifying the Information Leakage in Timing Side Channels in Deterministic Work-Conserving Schedulers

When multiple job processes are served by a single scheduler, the queueing delays of one process are often affected by the others, resulting in a timing side channel that leaks the arrival pattern of one process to the others. In this work, we study such a timing side channel between a regular user and a malicious attacker. Using Shannon's mutual information as a measure of information leakage between the user and the attacker, we analyze the privacy-preserving behavior of common work-conserving schedulers. We find that the attacker can always perfectly learn the user's arrival process in a longest-queue-first (LQF) scheduler. When the user's job arrival rate is very low (near zero), first-come-first-serve (FCFS) and round-robin schedulers both completely reveal the user's arrival pattern. This near-complete information leakage in the low-rate traffic region is proven to be reduced by half in a work-conserving version of the TDMA (WC-TDMA) scheduler, which turns out to be privacy-optimal in the class of deterministic work-conserving (det-WC) schedulers, according to a universal lower bound on information leakage that we derive for all det-WC schedulers.
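As a rough illustration of the leakage measure named in this abstract, Shannon's mutual information can be computed from a joint distribution over two discrete variables. The sketch below is a generic toy calculation, not the paper's scheduler model; the function name and the dictionary-based representation of the joint distribution are assumptions made for the example.

```python
import math
from collections import defaultdict


def mutual_information_bits(joint):
    """Mutual information I(X;Y) in bits.

    `joint` maps (x, y) pairs to probabilities that sum to 1.
    """
    px = defaultdict(float)
    py = defaultdict(float)
    for (x, y), p in joint.items():
        px[x] += p
        py[y] += p
    total = 0.0
    for (x, y), p in joint.items():
        if p > 0.0:  # zero-probability pairs contribute nothing
            total += p * math.log2(p / (px[x] * py[y]))
    return total


# X: whether the user issued a job in a slot; Y: what the attacker observes.
fully_leaking = {(0, 0): 0.5, (1, 1): 0.5}  # observation reveals X exactly
independent = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}

print(mutual_information_bits(fully_leaking))
print(mutual_information_bits(independent))
```

In this toy setting, a channel that reveals the user's behavior exactly gives a mutual information equal to the entropy of X, while an observation that is independent of X gives zero leakage.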

preprint, 2013, arXiv

An Information Theoretic Study of Timing Side Channels in Two-user Schedulers

Timing side channels in two-user schedulers are studied. When two users share a scheduler, one user may learn the other user's behavior from patterns of service timings. We measure the information leakage of the resulting timing side channel in schedulers serving a legitimate user and a malicious attacker, using a privacy metric defined as the Shannon equivocation of the user's job density. We show that the commonly used first-come-first-serve (FCFS) scheduler provides no privacy, as the attacker is able to learn the user's job pattern completely. Furthermore, we introduce a scheduling policy, the accumulate-and-serve scheduler, which services jobs from the user and attacker in batches after buffering them. The information leakage in this scheduler is mitigated at the price of service delays, and maximum privacy is achievable when large delays are added.

preprint, 2013, arXiv

Invisible Flow Watermarks for Channels with Dependent Substitution, Deletion, and Bursty Insertion Errors

Flow watermarks efficiently link packet flows in a network in order to thwart various attacks such as stepping stones. We study the problem of designing good flow watermarks. Earlier flow watermarking schemes mostly considered substitution errors, neglecting the effects of packet insertions and deletions that commonly happen within a network. More recent schemes consider packet deletions but often at the expense of the watermark visibility. We present an invisible flow watermarking scheme capable of enduring a large number of packet losses and insertions. To maintain invisibility, our scheme uses quantization index modulation (QIM) to embed the watermark into inter-packet delays, as opposed to time intervals including many packets. As the watermark is injected within individual packets, packet losses and insertions may lead to watermark desynchronization and substitution errors. To address this issue, we add a layer of error-correction coding to our scheme. Experimental results on both synthetic and real network traces demonstrate that our scheme is robust to network jitter, packet drops and splits, while remaining invisible to an attacker.
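The idea of embedding a watermark into inter-packet delays with quantization index modulation (QIM) can be sketched as follows. This is a generic textbook-style QIM example under stated assumptions, not the scheme from the paper: the function names, the step size, and the use of rounding (which may shorten a delay, something a real network element could not do) are illustrative choices, and the error-correction layer mentioned in the abstract is omitted.

```python
def qim_embed(delay, bit, step):
    """Move a delay onto the quantizer lattice that encodes `bit`.

    Bit 0 uses multiples of `step`; bit 1 uses the same lattice
    shifted by step / 2.
    """
    offset = bit * step / 2.0
    return round((delay - offset) / step) * step + offset


def qim_decode(delay, step):
    """Return the bit whose lattice lies closest to the observed delay."""
    r = delay % step
    # Close to 0 or to `step` means lattice 0; close to step / 2 means lattice 1.
    return 1 if step / 4.0 <= r < 3.0 * step / 4.0 else 0


step = 0.01  # seconds; a hypothetical quantization step
bits = [0, 1, 1, 0]
delays = [0.013, 0.027, 0.041, 0.008]
jitter = [0.001, -0.001, 0.002, -0.002]  # stays below step / 4

marked = [qim_embed(d, b, step) for d, b in zip(delays, bits)]
received = [m + j for m, j in zip(marked, jitter)]
print([qim_decode(d, step) for d in received])
```

Decoding in this sketch survives jitter smaller than a quarter of the step size. A lost or inserted packet, however, shifts every later delay by one position, which is the desynchronization problem that motivates adding error-correction coding.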

preprint, 2012, arXiv

CensorSpoofer: Asymmetric Communication with IP Spoofing for Censorship-Resistant Web Browsing

A key challenge in censorship-resistant web browsing is being able to direct legitimate users to redirection proxies while preventing censors, posing as insiders, from discovering their addresses and blocking them. We propose a new framework for censorship-resistant web browsing called CensorSpoofer that addresses this challenge by exploiting the asymmetric nature of web browsing traffic and making use of IP spoofing. CensorSpoofer de-couples the upstream and downstream channels, using a low-bandwidth indirect channel for delivering outbound requests (URLs) and a high-bandwidth direct channel for downloading web content. The upstream channel hides the request contents using steganographic encoding within email or instant messages, whereas the downstream channel uses IP address spoofing so that the real address of the proxies is not revealed either to legitimate users or censors. We built a proof-of-concept prototype that uses encrypted VoIP for this downstream channel and demonstrated the feasibility of using the CensorSpoofer framework in a realistic environment.

preprint, 2011, arXiv

Website Detection Using Remote Traffic Analysis

Recent work in traffic analysis has shown that traffic patterns leaked through side channels can be used to recover important semantic information. For instance, attackers can find out which website, or which page on a website, a user is accessing simply by monitoring the packet size distribution. We show that traffic analysis is an even greater threat to privacy than previously thought by introducing a new attack that can be carried out remotely. In particular, we show that, to perform traffic analysis, adversaries do not need to directly observe the traffic patterns. Instead, they can gain sufficient information by sending probes from a far-off vantage point that exploits a queuing side channel in routers. To demonstrate the threat of such remote traffic analysis, we study a remote website detection attack that works against home broadband users. Because the remotely observed traffic patterns are noisier than those obtained using previous schemes based on direct local traffic monitoring, we take a dynamic time warping (DTW) based approach to detecting fingerprints from the same website. As a new twist on website fingerprinting, we consider a website detection attack, in which the attacker aims to find out whether a user browses a particular website, and its privacy implications. We show experimentally that, although the success of the attack is highly variable and depends on the target site, it achieves very low error rates for some sites. We also show how such website detection can be used to deanonymize message board users.
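The dynamic time warping (DTW) comparison mentioned in this abstract can be illustrated with the classic dynamic-programming formulation. This is a generic sketch, not the attack pipeline from the paper: the function name, the absolute-difference cost, and the example sequences are assumptions made for illustration.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW distance with absolute-difference cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the best alignment cost of a[:i] against b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step_cost = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step_cost + min(
                cost[i - 1][j],      # a[i-1] is matched again (stretch b)
                cost[i][j - 1],      # b[j-1] is matched again (stretch a)
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


# Hypothetical traffic traces: the second is a time-stretched copy of the first.
fingerprint = [1.0, 2.0, 3.0]
stretched = [1.0, 2.0, 2.0, 3.0]
print(dtw_distance(fingerprint, stretched))
```

Because DTW allows one sample to align with several samples in the other sequence, two traces with the same shape but different timing still score as close, which is useful when remote measurements are noisy and irregularly spaced.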
