Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
81 works
0 followers
40 topics
4 close collaborators

Published work

81 published item(s)

preprint 2026 arXiv

A Bayesian Approach for Selecting Relevant External Data (BASE): Application to a study of Long-Term Outcomes in a Hemophilia Gene Therapy Trial

Gene therapies aim to address the root causes of diseases, particularly those stemming from rare genetic defects that can be life-threatening or severely debilitating. Although an increasing number of gene therapies have received regulatory approvals in recent years, understanding their long-term efficacy in trials with limited follow-up time remains challenging. To address this critical question, we propose a novel Bayesian framework designed to selectively integrate relevant external data with internal trial data to improve the inference of the durability of long-term efficacy. We prove that the proposed method has desirable theoretical properties, such as identifying and favoring external subsets deemed relevant, where relevance is defined as the similarity, induced by the marginal likelihood, between the generating mechanisms of the internal data and the selected external data. We also conduct comprehensive simulations to evaluate its performance under various scenarios. Furthermore, we apply this method to predict and infer the endogenous factor IX (FIX) levels of patients who receive Etranacogene dezaparvovec over the long term. Our estimated long-term FIX levels, validated by recent trial data, indicate that Etranacogene dezaparvovec induces sustained FIX production. Together, the theoretical findings, simulation results, and successful application of this framework underscore its potential to address similar long-term effectiveness estimation and inference questions in real-world applications.

preprint 2026 arXiv

Beyond Physical Labels: Redefining Domains for Robust WiFi-based Gesture Recognition

In this paper, we propose GesFi, a novel WiFi-based gesture recognition system that introduces WiFi latent domain mining to redefine domains directly from the data itself. GesFi first processes raw sensing data collected from WiFi receivers using CSI-ratio denoising, Short-Time Fast Fourier Transform, and visualization techniques to generate standardized input representations. It then employs class-wise adversarial learning to suppress gesture semantics and leverages unsupervised clustering to automatically uncover latent domain factors responsible for distributional shifts. These latent domains are then aligned through adversarial learning to support robust cross-domain generalization. Finally, the system is applied to the target environment for robust gesture inference. We deployed GesFi under both single-pair and multi-pair settings using commodity WiFi transceivers, and evaluated it across multiple public datasets and real-world environments. Compared to state-of-the-art baselines, GesFi achieves up to 78% and 50% performance improvements over existing adversarial methods, and consistently outperforms prior generalization approaches across most cross-domain tasks.

preprint 2026 arXiv

Breaking Coordinate Overfitting: Geometry-Aware WiFi Sensing for Cross-Layout 3D Pose Estimation

WiFi-based 3D human pose estimation offers a low-cost and privacy-preserving alternative to vision-based systems for smart interaction. However, existing approaches rely on visual 3D poses as supervision and directly regress CSI to a camera-based coordinate system. We find that this practice leads to coordinate overfitting: models memorize deployment-specific WiFi transceiver layouts rather than only learning activity-relevant representations, resulting in severe generalization failures. To address this challenge, we present PerceptAlign, the first geometry-conditioned framework for WiFi-based cross-layout pose estimation. PerceptAlign introduces a lightweight coordinate unification procedure that aligns WiFi and vision measurements in a shared 3D space using only two checkerboards and a few photos. Within this unified space, it encodes calibrated transceiver positions into high-dimensional embeddings and fuses them with CSI features, making the model explicitly aware of device geometry as a conditional variable. This design forces the network to disentangle human motion from deployment layouts, enabling robust and, for the first time, layout-invariant WiFi pose estimation. To support systematic evaluation, we construct the largest cross-domain 3D WiFi pose estimation dataset to date, comprising 21 subjects, 5 scenes, 18 actions, and 7 device layouts. Experiments show that PerceptAlign reduces in-domain error by 12.3% and cross-domain error by more than 60% compared to state-of-the-art baselines. These results establish geometry-conditioned learning as a viable path toward scalable and practical WiFi sensing.

preprint 2026 arXiv

Cross-Subject Generalization for EEG Decoding: A Survey of Deep Learning Methods

Deep learning for cross-subject EEG decoding is hindered by high inter-subject variability, which introduces a severe domain shift between training and unseen test subjects. This survey presents a comprehensive review of deep learning methodologies specifically engineered to address this cross-subject generalization challenge. To ground this analysis, we formalize the cross-subject setting as a multi-source domain problem and delineate the rigorous, subject-independent evaluation protocols required for valid assessment. Central to this survey is a systematic taxonomy of the current literature into discrete methodological families, including feature alignment, adversarial learning, feature disentanglement, and contrastive learning. We conclude by examining three critical elements for advancing robust, real-world decoding: the theoretical limitations of current methodologies, the structural value of subject identity, and the emergence of EEG foundation models.

preprint 2026 arXiv

DexH2R: Task-oriented Dexterous Manipulation from Human to Robots

Dexterous manipulation is a critical aspect of human capability, enabling interaction with a wide variety of objects. Recent advancements in learning from human demonstrations and teleoperation have advanced this ability in robots. However, these approaches either require complex data collection such as costly human effort for eye-robot contact, or suffer from poor generalization when faced with novel scenarios. To solve both challenges, we propose a framework, DexH2R, that combines human hand motion retargeting with a task-oriented residual action policy, improving task performance by bridging the embodiment gap between human and robotic dexterous hands. Specifically, DexH2R learns the residual policy directly from retargeted primitive actions and task-oriented rewards, eliminating the need for labor-intensive teleoperation systems. Moreover, we incorporate test-time guidance for novel scenarios by taking in desired trajectories of human hands and objects, allowing the dexterous hand to acquire new skills with high generalizability. Extensive experiments in both simulation and real-world environments demonstrate the effectiveness of our work, outperforming prior state-of-the-art methods by 40% across various settings.

preprint 2026 arXiv

DIMoE-Adapters: Dynamic Expert Evolution for Continual Learning in Vision-Language Models

Continual learning enables vision-language models to accumulate knowledge and adapt to evolving tasks without retraining from scratch. However, in multi-domain task-incremental learning, large domain shifts intensify the stability-plasticity dilemma. Most existing methods rely on fixed architectures with statically allocated parameters, which limits adaptation to new domains and aggravates catastrophic forgetting. To address these challenges, we propose DIMoE-Adapters, a Dynamic Incremental Mixture-of-Experts Adapters framework that introduces a dynamic expert evolution paradigm to balance stability and plasticity. This paradigm is implemented through two collaborative components: Self-Calibrated Expert Evolution (SCEE) and Prototype-Guided Expert Selection (PGES). SCEE constructs and evolves a sparse expert pool through expert optimization dynamics, improving plasticity while reducing redundant capacity. PGES controls expert utilization based on the pool shaped by SCEE, improving stability across both previously encountered and unseen tasks. Extensive experiments show that DIMoE-Adapters outperforms previous state-of-the-art methods across various settings.

preprint 2026 arXiv

FORESTLLM: Large Language Models Make Random Forest Great on Few-shot Tabular Learning

Tabular data underpins high-stakes, critical decision-making in domains such as finance, healthcare, and scientific discovery. Yet, learning effectively from tabular data in few-shot settings, where labeled examples are scarce, remains a fundamental challenge. Traditional tree-based methods often falter in these regimes due to their reliance on statistical purity metrics, which become unstable and prone to overfitting with limited supervision. At the same time, direct applications of large language models (LLMs) often overlook the inherent structure of tabular data, leading to suboptimal performance. To overcome these limitations, we propose FORESTLLM, a novel framework that unifies the structural inductive biases of decision forests with the semantic reasoning capabilities of LLMs. Crucially, FORESTLLM leverages the LLM only during training, treating it as an offline model designer that encodes rich, contextual knowledge into a lightweight, interpretable forest model, eliminating the need for LLM inference at test time. Our method is two-fold. First, we introduce a semantic splitting criterion in which the LLM evaluates candidate partitions based on their coherence over both labeled and unlabeled data, enabling the induction of more robust and generalizable tree structures under few-shot supervision. Second, we propose a one-time in-context inference mechanism for leaf node stabilization, where the LLM distills the decision path and its supporting examples into a concise, deterministic prediction, replacing noisy empirical estimates with semantically informed outputs. Across a diverse suite of few-shot classification and regression benchmarks, FORESTLLM achieves state-of-the-art performance.

preprint 2026 arXiv

Guardians of the Hair: Rescuing Soft Boundaries in Depth, Stereo, and Novel Views

Soft boundaries, like thin hairs, are commonly observed in natural and computer-generated imagery, but they remain challenging for 3D vision due to the ambiguous mixing of foreground and background cues. This paper introduces Guardians of the Hair (HairGuard), a framework designed to recover fine-grained soft boundary details in 3D vision tasks. Specifically, we first propose a novel data curation pipeline that leverages image matting datasets for training and design a depth fixer network to automatically identify soft boundary regions. With a gated residual module, the depth fixer refines depth precisely around soft boundaries while maintaining global depth quality, allowing plug-and-play integration with state-of-the-art depth models. For view synthesis, we perform depth-based forward warping to retain high-fidelity textures, followed by a generative scene painter that fills disoccluded regions and eliminates redundant background artifacts within soft boundaries. Finally, a color fuser adaptively combines warped and inpainted results to produce novel views with consistent geometry and fine-grained details. Extensive experiments demonstrate that HairGuard achieves state-of-the-art performance across monocular depth estimation, stereo image/video conversion, and novel view synthesis, with significant improvements in soft boundary regions.

preprint 2026 arXiv

Hierarchical Dual-Subspace Decoupling for Continual Learning in Vision-Language Models

Class-incremental learning aims to continuously acquire new knowledge while preserving previously learned information, thereby mitigating catastrophic forgetting. Existing methods primarily restrict parameter updates but often overlook their structural properties in high-dimensional spaces. From a subspace perspective, updates induced by different tasks tend to lie in multiple overlapping low-rank subspaces, leading to cross-task subspace interference and severe forgetting. To address this issue, we propose HDSD, a Hierarchical Dual-Subspace Decoupling framework for continual learning in vision-language models. Specifically, we introduce a lightweight Feature Modulation Module (FMM) that explicitly decomposes the parameter space into general and task-specific subspaces. Building on this design, we develop two complementary components. First, a General Fusion Module (GFM) evaluates relative parameter changes across tasks and uses an adaptive threshold to capture stable and transferable knowledge. Second, a Hierarchical Learning Module (HLM) performs structured parameter decomposition via Singular Value Decomposition (SVD) and uses a scaling mechanism to constrain updates within distinct subspace scales. Together, these designs reduce subspace interference and parameter drift. Extensive experiments on conventional benchmarks show that HDSD achieves state-of-the-art results.

preprint 2026 arXiv

Hierarchical Secure Aggregation with Heterogeneous Security Constraints and Arbitrary User Collusion

In hierarchical secure aggregation (HSA), a server communicates with clustered users through an intermediate layer of relays to compute the sum of users' inputs under two security requirements: server security and relay security. Server security requires that the server learns nothing beyond the desired sum even when colluding with a subset of users, while relay security requires that each relay remains oblivious to the users' inputs under collusion. Existing work on HSA enforces homogeneous security where all inputs must be protected against any subset of potential colluding users with sizes up to a predefined threshold. Such a homogeneous formulation cannot capture scenarios with heterogeneous security requirements where different users may demand various levels of protection. In this paper, we study hierarchical secure aggregation (HSA) with heterogeneous security requirements and arbitrary user collusion. Specifically, we consider scenarios where the inputs of certain groups of users must remain information-theoretically secure against inference by the server or any relay, even if the server or any relay colludes with an arbitrary subset of other users. Under server security, the server learns nothing about these protected inputs beyond the prescribed aggregate sum, despite any such collusion. Under relay security, each relay similarly obtains no information about the protected inputs under the same collusion model. We characterize the optimal communication rates achievable across all layers for all parameter regimes. Furthermore, we study the minimum source keys required at the users to ensure security. For this source key requirement, we provide tight characterizations in two broad regimes determined by the security and collusion constraints, and establish a general information-theoretic lower bound together with a bounded-gap achievable scheme for the remaining regime.

preprint 2026 arXiv

Optimal Communication and Key Rate Region for Hierarchical Secure Aggregation with User Collusion

Secure aggregation is concerned with the task of securely uploading the inputs of multiple users to an aggregation server without letting the server know the inputs beyond their summation. It finds broad applications in distributed machine learning paradigms such as federated learning (FL) where multiple clients, each having access to a proprietary dataset, periodically upload their locally trained models (abstracted as inputs) to a parameter server which then generates an aggregate (e.g., averaged) model that is sent back to the clients as an initializing point for a new round of local training. To enhance the data privacy of the clients, secure aggregation protocols are developed using techniques from cryptography to ensure that the server infers no more information of the users' inputs beyond the desired aggregated input, even if the server can collude with some users. Although laying the ground for understanding the fundamental utility-security trade-off in secure aggregation, the simple star client-server architecture cannot capture more complex network architectures used in practical systems. Motivated by hierarchical federated learning, we investigate the secure aggregation problem in a $3$-layer hierarchical network consisting of clustered users connecting to an aggregation server through an intermediate layer of relays. Besides the conventional server security which requires that the server learns nothing beyond the desired sum of inputs, relay security is also imposed so that the relays infer nothing about the users' inputs and remain oblivious. For such a hierarchical secure aggregation (HSA) problem, we characterize the optimal multifaceted trade-off between communication (in terms of user-to-relay and relay-to-server communication rates) and secret key generation efficiency (in terms of individual key and source key rates).
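To make the basic goal in the abstract above concrete (the server recovers the sum of the inputs but not the individual inputs), here is a minimal sketch of additive masking in the simple star setting. It is a textbook-style illustration and not the hierarchical scheme from the paper: it ignores relays and user collusion, and it assumes a trusted dealer hands out masks that sum to zero. The modulus `Q` and all function names are illustrative assumptions.

```python
import secrets

# Illustrative prime modulus; any sufficiently large modulus would do (assumption).
Q = 2**31 - 1


def zero_sum_masks(n: int, q: int = Q) -> list[int]:
    """Draw n masks that are uniform subject to summing to 0 mod q.

    A trusted dealer is assumed here; key distribution is out of scope.
    """
    masks = [secrets.randbelow(q) for _ in range(n - 1)]
    masks.append((-sum(masks)) % q)
    return masks


def mask_inputs(inputs: list[int], q: int = Q) -> list[int]:
    """Each user uploads its input plus its own mask, reduced mod q."""
    masks = zero_sum_masks(len(inputs), q)
    return [(x + z) % q for x, z in zip(inputs, masks)]


def aggregate(messages: list[int], q: int = Q) -> int:
    """The server adds the masked uploads; the masks cancel in the sum."""
    return sum(messages) % q


if __name__ == "__main__":
    inputs = [5, 17, 42]
    uploads = mask_inputs(inputs)
    print("recovered sum:", aggregate(uploads))
```

Each individual upload is masked by a random value, while the sum of the uploads equals the sum of the inputs modulo `Q`. Handling collusion and the relay layer requires the more careful key designs that the paper studies.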

preprint 2026 arXiv

Optimal Rate Region for Multi-server Secure Aggregation with User Collusion

Secure aggregation is a fundamental primitive in privacy-preserving distributed learning systems, where an aggregator aims to compute the sum of users' inputs without revealing individual data. In this paper, we study a multi-server secure aggregation problem in a two-hop network consisting of multiple aggregation servers and multiple users per server, under the presence of user collusion. Each user communicates only with its associated server, while the servers exchange messages to jointly recover the global sum. We adopt an information-theoretic security framework, allowing up to $T$ users to collude with any server. We characterize the complete optimal rate region in terms of user-to-server communication rate, server-to-server communication rate, individual key rate, and source key rate. Our main result shows that the minimum communication and individual key rates are all one symbol per input symbol, while the optimal source key rate is given by $\min\{U+V+T-2,\, UV-1\}$, where $U$ denotes the number of servers and $V$ the number of users per server. The achievability is established via a linear key construction that ensures correctness and security against colluding users, while the converse proof relies on tight entropy bounds derived from correctness and security constraints. The results reveal a fundamental tradeoff between security and key efficiency and demonstrate that the multi-server architecture can significantly reduce the required key randomness compared to single-server secure aggregation. Our findings provide a complete information-theoretic characterization of secure aggregation in multi-server systems with user collusion.
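As a quick worked example of the source key rate quoted in the abstract above, the sketch below evaluates $\min\{U+V+T-2,\, UV-1\}$ for a few parameter choices. The function name, the input checks, and the sample values are illustrative assumptions and are not taken from the paper.

```python
def optimal_source_key_rate(U: int, V: int, T: int) -> int:
    """Evaluate min{U + V + T - 2, U*V - 1} as quoted in the abstract.

    U: number of servers, V: users per server, T: max number of colluding users.
    The name and the input validation are illustrative assumptions.
    """
    if U < 1 or V < 1 or T < 0:
        raise ValueError("expected U >= 1, V >= 1, T >= 0")
    return min(U + V + T - 2, U * V - 1)


if __name__ == "__main__":
    for U, V, T in [(3, 4, 1), (3, 4, 6), (2, 2, 0)]:
        rate = optimal_source_key_rate(U, V, T)
        print(f"U={U}, V={V}, T={T} -> source key rate {rate}")
```

With U = 3 and V = 4, the first term grows with T until it reaches UV - 1 = 11 at T = 6; beyond that point the second term caps the expression.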

preprint 2026 arXiv

Peak-Detector: Explainable Peak Detection via Instruction-Tuned Large Language Models in Physiological Sign

Accurate peak detection across diverse cardiac physiological signals, including the Electrocardiogram (ECG), Photoplethysmogram (PPG), Ballistocardiogram (BCG), and Bodyseismography (BSG), is fundamental for cardiovascular monitoring but is often hindered by artifacts and signal variability. Conventional algorithms are typically engineered with expert knowledge for a single signal modality, limiting their generalizability. Conversely, deep learning-based methods often lack interpretability, limiting transparency for expert verification and hindering expert-computer interaction. To address these limitations, we introduce Peak-Detector, a novel framework that leverages instruction-tuned Large Language Models (LLMs) for robust, cross-modal, and explainable peak detection. A core innovation of our framework is a "peak-representation" technique that transforms time-series data into a condensed format, preserving critical event information while significantly reducing signal length. This representation provides a crucial inductive bias, guiding the LLM to reason over physiologically meaningful events rather than raw, noisy data. The model is optimized through a two-stage process: supervised fine-tuning (SFT) followed by reinforcement learning (RL) with a multi-objective reward function. The model's self-explanation capabilities are cultivated by fine-tuning on a custom-built Peak-Explanation dataset. Across four modalities (ECG, PPG, BCG, and BSG) spanning seven datasets (six public benchmarks plus one real-world cohort), Peak-Detector demonstrates strong cross-modal performance, achieving best or tied-best detection under clinically relevant temporal tolerance. Beyond accuracy, the generated rationales surface failure modes and support verification and error analysis.

preprint 2026 arXiv

Rotation Control Unlearning: Quantifying and Controlling Continuous Unlearning for LLM with The Cognitive Rotation Space

As Large Language Models (LLMs) become increasingly prevalent, their security vulnerabilities have drawn attention. Machine unlearning has been introduced to mitigate these risks by removing the influence of undesirable data. However, existing methods not only rely on a retained dataset to preserve model utility, but also suffer from cumulative catastrophic utility loss under continuous unlearning requests. To solve this dilemma, we propose a novel method, called Rotation Control Unlearning (RCU), which leverages the rotational salience weight of RCU to quantify and control the unlearning degree in the continuous unlearning process. A skew-symmetric loss is designed to construct the cognitive rotation space, where changes in the rotational angle can simulate the continuous unlearning process. Furthermore, we design an orthogonal rotation axes regularization to enforce mutually perpendicular rotation directions for continuous unlearning requests, effectively minimizing interference and addressing cumulative catastrophic utility loss. Experiments on multiple datasets confirm that our method achieves SOTA performance without a retained dataset.

preprint 2026 arXiv

TFEC: Multivariate Time-Series Clustering via Temporal-Frequency Enhanced Contrastive Learning

Multivariate Time-Series (MTS) clustering is crucial for signal processing and data analysis. Although deep learning approaches, particularly those leveraging Contrastive Learning (CL), are prominent for MTS representation, existing CL-based models face two key limitations: 1) neglecting clustering information during positive/negative sample pair construction, and 2) introducing unreasonable inductive biases, e.g., destroying time dependence and periodicity through augmentation strategies, compromising representation quality. This paper, therefore, proposes a Temporal-Frequency Enhanced Contrastive (TFEC) learning framework. To preserve temporal structure while generating low-distortion representations, a temporal-frequency Co-EnHancement (CoEH) mechanism is introduced. Accordingly, a synergistic dual-path representation and cluster distribution learning framework is designed to jointly optimize cluster structure and representation fidelity. Experiments on six real-world benchmark datasets demonstrate TFEC's superiority, achieving 4.48% average NMI gains over SOTA methods, with ablation studies validating the design. The code of the paper is available at: https://github.com/yueliangy/TFEC.

preprint 2026 arXiv

UniFixer: A Universal Reference-Guided Fixer for Diffusion-Based View Synthesis

With the recent surge of generative models, diffusion-based approaches have become mainstream for view synthesis tasks, either in an explicit depth-warp-inpaint or in an implicit end-to-end manner. Despite their success, both paradigms often suffer from noticeable quality degradation, e.g., blurred details and distorted structures, caused by pixel-to-latent compression and diffusion hallucination. In this paper, we investigate diffusion degradation from three key dimensions (i.e., spatial, temporal, and backbone-related) and propose UniFixer, a universal reference-guided framework that fixes diverse degradation artifacts via a coarse-to-fine strategy. Specifically, a reference pre-alignment module is first designed to perform coarse alignment between the reference view and the degraded novel view. A global structure anchoring mechanism then rectifies geometric distortions to ensure structural fidelity, followed by a local detail injection module that recovers fine-grained texture details for high-quality view synthesis. Our UniFixer serves as a plug-and-play refiner that achieves zero-shot fixing across different types of diffusion degradation, and extensive experiments verify our state-of-the-art performance on novel view synthesis and stereo conversion.

preprint 2026 arXiv

VulTriage: Triple-Path Context Augmentation for LLM-Based Vulnerability Detection

Automated vulnerability detection is a fundamental task in software security, yet existing learning-based methods still struggle to capture the structural dependencies, domain-specific vulnerability knowledge, and complex program semantics required for accurate detection. Recent Large Language Models (LLMs) have shown strong code understanding ability, but directly prompting them with raw source code often leads to missed vulnerabilities or false alarms, especially when vulnerable and benign functions differ only in subtle semantic details. To address this, we propose VulTriage, a triple-path context augmentation framework for LLM-based vulnerability detection. VulTriage enhances the LLM input through three complementary paths: a Control Path that extracts and verbalizes AST, CFG, and DFG information to expose control and data dependencies; a Knowledge Path that retrieves relevant CWE-derived vulnerability patterns and examples through hybrid dense-sparse retrieval; and a Semantic Path that summarizes the functional behavior of the code before the final judgment. These contexts are integrated into a unified instruction to guide the LLM toward more reliable vulnerability reasoning. Experiments on the PrimeVul pair test set show that VulTriage achieves state-of-the-art performance, outperforming existing deep learning and LLM-based baselines on key pair-wise and classification metrics. Further ablation studies verify the effectiveness of each path, and additional experiments on the Kotlin dataset demonstrate the generalization ability of VulTriage under low-resource and class-imbalanced settings. Our code is available at https://github.com/vinsontang1/VulTriage

preprint 2026 arXiv

When to Think, When to Speak: Learning Disclosure Policies for LLM Reasoning

In single-stream autoregressive interfaces, the same tokens both update the model state and constitute an irreversible public commitment. This coupling creates a silence tax: additional deliberation postpones the first task-relevant content, while naive early streaming risks premature commitments that bias subsequent generations. We introduce Side-by-Side (SxS) Interleaved Reasoning, which makes disclosure timing a controllable decision within standard autoregressive generation. SxS interleaves partial disclosures with continued private reasoning in the same context, but releases content only when it is supported by the reasoning so far. To learn such pacing without incentivizing filler, we construct entailment-aligned interleaved trajectories by matching answer prefixes to supporting reasoning prefixes, then train with SFT to acquire the dual-action semantics and RL to recover reasoning performance under the new format. Across two Qwen3 architectures/scales (MoE Qwen3-30B-A3B, dense Qwen3-4B) and both in-domain (AIME25) and out-of-domain (GPQA-Diamond) benchmarks, SxS improves accuracy--content-latency Pareto trade-offs under token-level proxies such as inter-update waiting.

preprint 2023 arXiv

A portable sub Hertz ultra-stable laser over 1700km highway transportation

We present a sub-Hz-linewidth portable ultrastable laser with a mass of 40 kg and a volume of 400 mm × 280 mm × 450 mm that meets the requirements of automatic frequency locking and road transportation. A dynamic analytical model of the physical parts of the ultrastable laser is established, and the first-order resonance frequency is determined by FEA and agrees well with the experimentally measured result. To verify the transport performance of the portable ultrastable laser, it is tested with 100 km of actual road transportation and 60 min of continuous vibration, corresponding to 1700 km of road transportation. The success of the test demonstrates that the portable ultrastable laser is very robust. Meanwhile, the median of the laser's linewidth distribution is approximately 0.78 Hz, and the fractional frequency instability is less than 3E-15 at 1 to 10 s averaging time. This value approaches the total noise of 2.0E-15, including thermal noise and residual amplitude modulation. This robustness suggests that the portable ultrastable laser may be a good candidate for applications such as optical frequency transfer and metrological systems.

preprint 2022 arXiv

An EEG-Based Multi-Modal Emotion Database with Both Posed and Authentic Facial Actions for Emotion Analysis

Emotion is an experience associated with a particular pattern of physiological activity along with different physiological, behavioral and cognitive changes. One behavioral change is facial expression, which has been studied extensively over the past few decades. Facial behavior varies with a person's emotion according to differences in terms of culture, personality, age, context, and environment. In recent years, physiological activities have been used to study emotional responses. A typical signal is the electroencephalogram (EEG), which measures brain activity. Most existing EEG-based emotion analysis has overlooked the role of facial expression changes. There exists little research on the relationship between facial behavior and brain signals due to the lack of datasets measuring both EEG and facial action signals simultaneously. To address this problem, we propose to develop a new database by collecting facial expressions, action units, and EEGs simultaneously. We recorded the EEGs and face videos of both posed facial actions and spontaneous expressions from 29 participants with different ages, genders, and ethnic backgrounds. Differing from existing approaches, we designed a protocol to capture the EEG signals by evoking participants' individual action units explicitly. We also investigated the relation between the EEG signals and facial action units. As a baseline, the database has been evaluated through experiments on both posed and spontaneous emotion recognition with images alone, EEG alone, and EEG fused with images, respectively. The database will be released to the research community to advance the state of the art for automatic emotion recognition.

preprint 2022 arXiv

AutoGCL: Automated Graph Contrastive Learning via Learnable View Generators

Contrastive learning has been widely applied to graph representation learning, where the view generators play a vital role in generating effective contrastive samples. Most of the existing contrastive learning methods employ pre-defined view generation methods, e.g., node drop or edge perturbation, which usually cannot adapt to input data or preserve the original semantic structures well. To address this issue, we propose a novel framework named Automated Graph Contrastive Learning (AutoGCL) in this paper. Specifically, AutoGCL employs a set of learnable graph view generators orchestrated by an auto augmentation strategy, where every graph view generator learns a probability distribution of graphs conditioned on the input. While the graph view generators in AutoGCL preserve the most representative structures of the original graph in the generation of every contrastive sample, the auto augmentation learns policies to introduce adequate augmentation variances in the whole contrastive learning procedure. Furthermore, AutoGCL adopts a joint training strategy to train the learnable view generators, the graph encoder, and the classifier in an end-to-end manner, resulting in topological heterogeneity yet semantic similarity in the generation of contrastive samples. Extensive experiments on semi-supervised learning, unsupervised learning, and transfer learning demonstrate the superiority of our AutoGCL framework over the state of the art in graph contrastive learning. In addition, the visualization results further confirm that the learnable view generators can deliver more compact and semantically meaningful contrastive samples compared with the existing view generation methods.

preprint2022arXiv

BM-NAS: Bilevel Multimodal Neural Architecture Search

Deep neural networks (DNNs) have shown superior performance on various multimodal learning problems. However, it often requires huge effort to adapt DNNs to individual multimodal tasks by manually engineering unimodal features and designing multimodal feature fusion strategies. This paper proposes the Bilevel Multimodal Neural Architecture Search (BM-NAS) framework, which makes the architecture of multimodal fusion models fully searchable via a bilevel searching scheme. At the upper level, BM-NAS selects the inter/intra-modal feature pairs from the pretrained unimodal backbones. At the lower level, BM-NAS learns the fusion strategy for each feature pair, which is a combination of predefined primitive operations. The primitive operations are elaborately designed, and they can be flexibly combined to accommodate various effective feature fusion modules such as multi-head attention (Transformer) and Attention on Attention (AoA). Experimental results on three multimodal tasks demonstrate the effectiveness and efficiency of the proposed BM-NAS framework. BM-NAS achieves competitive performance with much less search time and fewer model parameters in comparison with the existing generalized multimodal NAS methods.

preprint2022arXiv

Experimental Test of Contextuality based on State Discrimination with a Single Qubit

Exploring quantum phenomena beyond predictions of any classical model has fundamental importance to understand the boundary of classical and quantum descriptions of nature. As a typical property that a quantum system behaves distinctively from a classical counterpart, contextuality has been studied extensively and verified experimentally in systems composed of at least three levels (qutrit). Here we extend the scope of experimental test of contextuality to a minimal quantum system of only two states (qubit) by implementing the minimum error state discrimination on a single $^{171}$Yb$^+$ ion. We observe a substantial violation of a no-go inequality derived by assuming non-contextuality, and firmly conclude that the measured results of state discrimination cannot be reconciled with any non-contextual description. We also quantify the contextual advantage of state discrimination and the tolerance against quantum noises.

preprint2022arXiv

Exploring Edge Disentanglement for Node Classification

Edges in real-world graphs are typically formed by a variety of factors and carry diverse relation semantics. For example, connections in a social network could indicate friendship, being colleagues, or living in the same neighborhood. However, these latent factors are usually concealed behind mere edge existence due to the data collection and graph formation processes. Despite rapid developments in graph learning over these years, most models take a holistic approach and treat all edges as equal. One major difficulty in disentangling edges is the lack of explicit supervision. In this work, with close examination of edge patterns, we propose three heuristics and design three corresponding pretext tasks to guide the automatic edge disentanglement. Concretely, these self-supervision tasks are enforced on a designed edge disentanglement module that is trained jointly with the downstream node classification task to encourage automatic edge disentanglement. Channels of the disentanglement module are expected to capture distinguishable relations and neighborhood interactions, and outputs from them are aggregated as node representations. The proposed DisGNN is easy to incorporate into various neural architectures, and we conduct experiments on 6 real-world datasets. Empirical results show that it can achieve significant performance gains.

preprint2022arXiv

GaLactic and Extragalactic All-sky Murchison Widefield Array survey eXtended (GLEAM-X) I: Survey Description and Initial Data Release

We describe a new low-frequency wideband radio survey of the southern sky. Observations covering 72 - 231 MHz and Declinations south of $+30^\circ$ have been performed with the Murchison Widefield Array "extended" Phase II configuration over 2018 - 2020 and will be processed to form data products including continuum and polarisation images and mosaics, multi-frequency catalogues, transient search data, and ionospheric measurements. From a pilot field described in this work, we publish an initial data release covering 1,447 sq. deg over 4h < RA < 13h, -32.7deg < Dec < -20.7deg. We process twenty frequency bands sampling 72 - 231 MHz, with a resolution of $2'$ - $45''$, and produce a wideband source-finding image across 170 - 231 MHz with a root-mean-square noise of $1.27\pm0.15$ mJy/beam. Source-finding yields 78,967 components, of which 71,320 are fitted spectrally. The catalogue has a completeness of 98% at $\sim50$ mJy, and a reliability of 98.2% at $5\sigma$ rising to 99.7% at $7\sigma$. A catalogue is available from Vizier; images are made available on AAO Data Central, SkyView, and the PASA Datastore. This is the first in a series of data releases from the GLEAM-X survey.

preprint2022arXiv

Graph-Guided Network for Irregularly Sampled Multivariate Time Series

In many domains, including healthcare, biology, and climate science, time series are irregularly sampled with varying time intervals between successive readouts and different subsets of variables (sensors) observed at different time points. Here, we introduce RAINDROP, a graph neural network that embeds irregularly sampled and multivariate time series while also learning the dynamics of sensors purely from observational data. RAINDROP represents every sample as a separate sensor graph and models time-varying dependencies between sensors with a novel message passing operator. It estimates the latent sensor graph structure and leverages the structure together with nearby observations to predict misaligned readouts. This model can be interpreted as a graph neural network that sends messages over graphs that are optimized for capturing time-varying dependencies among sensors. We use RAINDROP to classify time series and interpret temporal dynamics on three healthcare and human activity datasets. RAINDROP outperforms state-of-the-art methods by up to 11.4% (absolute F1-score points), including techniques that deal with irregular sampling using fixed discretization and set functions. RAINDROP shows superiority in diverse setups, including challenging leave-sensor-out settings.

preprint2022arXiv

High throughput data-driven design of laser crystallized 2D MoS2 chemical sensors

High throughput characterization and processing techniques are becoming increasingly necessary to navigate multivariable, data-driven design challenges for sensors and electronic devices. For two-dimensional materials, device performance is highly dependent upon a vast array of material properties including number of layers, lattice strain, carrier concentration, defect density, and grain structure. In this work, laser-crystallization was used to locally pattern and transform hundreds of regions of amorphous MoS2 thin films into 2D 2H-MoS2. A high throughput Raman spectroscopy approach was subsequently used to assess the process-dependent structural and compositional variations for each illuminated region, yielding over 5500 distinct non-resonant, resonant, and polarized Raman spectra. The rapid generation of a comprehensive library of structural and compositional data elucidated important trends between structure-property-processing relationships involving laser-crystallized MoS2, including the relationships between grain size, grain orientation, and intrinsic strain. Moreover, extensive analysis of structure/property relationships allowed for intelligent design, and evaluation of major contributions to, device performance in MoS2 chemical sensors. In particular, it is found that sensor performance is strongly dependent on the orientation of the MoS2 grains relative to the crystal plane.

preprint2022arXiv

Improved Sensitivity for Space Domain Awareness Observations with the Murchison Widefield Array

Our previously reported survey of the Low Earth Orbit (LEO) environment using the Murchison Widefield Array (MWA) detected over 70 unique Resident Space Objects (RSOs) over multiple passes, from 20 hours of observations in passive radar mode. In this paper, we extend this work by demonstrating two methods that improve the detection sensitivity of the system. The first method, called shift-stacking, increases the statistical significance of faint RSO signals through the spatially coherent integration of the reflected signal along the RSO's trajectory across the sky. This method was tested on the observations used during our previous blind survey, and we obtained a 75% increase in the total number of detections. The second method re-focuses the MWA to the near-field RSO's position (post-observation), by applying a complex phase correction to each visibility to account for the curved wave-front. The method was tested successfully on an MWA extended array observation of an ISS pass. However, the method is currently limited by signal de-coherence on the long baselines (due to the hardware constraints of the current correlator). We discuss the sensitivity improvement for RSO detections we expect from the MWA Phase 3 correlator upgrade. We conclude the paper by briefly commenting on future dedicated Space Domain Awareness (SDA) systems that will incorporate MWA technologies.
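
The shift-stacking idea described above can be illustrated with a toy image-domain sketch. This is only a simplified illustration, assuming the object's pixel position in each frame is already known; the function name `shift_stack` and the list-of-lists frame format are hypothetical and are not taken from the paper's pipeline.

```python
def shift_stack(frames, positions):
    """Average frames after aligning each one on the object's known position.

    frames    : list of 2D lists (rows x cols) of pixel values
    positions : list of (row, col) object positions, one per frame
    The first frame is used as the reference; every other frame is
    shifted so that its object position lands on the reference position.
    """
    rows, cols = len(frames[0]), len(frames[0][0])
    ref_r, ref_c = positions[0]
    total = [[0.0] * cols for _ in range(rows)]
    count = [[0] * cols for _ in range(rows)]
    for frame, (r, c) in zip(frames, positions):
        dr, dc = r - ref_r, c - ref_c  # offset of this frame's object from the reference
        for i in range(rows):
            for j in range(cols):
                si, sj = i + dr, j + dc  # source pixel that maps onto (i, j)
                if 0 <= si < rows and 0 <= sj < cols:
                    total[i][j] += frame[si][sj]
                    count[i][j] += 1
    # Average only over the frames that actually covered each pixel.
    return [[total[i][j] / count[i][j] if count[i][j] else 0.0
             for j in range(cols)] for i in range(rows)]


def make_frame(size, pos):
    """Toy frame: zeros everywhere except a unit signal at pos (hypothetical helper)."""
    frame = [[0.0] * size for _ in range(size)]
    frame[pos[0]][pos[1]] = 1.0
    return frame


track = [(0, 0), (1, 1), (2, 2)]            # assumed object trajectory in pixels
frames = [make_frame(4, p) for p in track]
stacked = shift_stack(frames, track)
```

Without the alignment step, a plain average would spread the moving signal over three pixels at one third of its amplitude each; aligning first keeps the signal at full amplitude in one pixel while uncorrelated noise would average down.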

preprint2022arXiv

Knowledge-Spreader: Learning Facial Action Unit Dynamics with Extremely Limited Labels

Recent studies on the automatic detection of facial action units (AUs) have relied extensively on large-scale annotations. However, manual AU labeling is difficult, time-consuming, and costly. Most existing semi-supervised works ignore the informative cues from the temporal domain and are highly dependent on densely annotated videos, making the learning process less efficient. To alleviate these problems, we propose a deep semi-supervised framework, Knowledge-Spreader (KS), which differs from conventional methods in two aspects. First, rather than only encoding human knowledge as constraints, KS also learns the Spatial-Temporal AU correlation knowledge in order to strengthen its out-of-distribution generalization ability. Second, we approach KS by applying consistency regularization and pseudo-labeling in multiple student networks alternately and dynamically. It spreads the spatial knowledge from labeled frames to unlabeled data, and completes the temporal information of partially labeled video clips. Thus, the design allows KS to learn AU dynamics from video clips with only one label allocated, which significantly reduces the annotation requirements. Extensive experiments demonstrate that the proposed KS achieves competitive performance compared to the state of the art when using only 2% of the labels on BP4D and 5% of the labels on DISFA. In addition, we test it on our newly developed large-scale comprehensive emotion database, which contains considerable samples across well-synchronized and aligned sensor modalities for easing the scarcity issue of annotations and identities in human affective computing. The new database will be released to the research community.

preprint2022arXiv

Link Prediction on Heterophilic Graphs via Disentangled Representation Learning

Link prediction is an important task that has wide applications in various domains. However, the majority of existing link prediction approaches assume the given graph follows the homophily assumption, and design similarity-based heuristics or representation learning approaches to predict links. Many real-world graphs, however, are heterophilic graphs, where the homophily assumption does not hold, which challenges existing link prediction methods. Generally, in heterophilic graphs, there are many latent factors causing the link formation, and two linked nodes tend to be similar in one or two factors but might be dissimilar in other factors, leading to low overall similarity. Thus, one way is to learn a disentangled representation for each node, with each vector capturing the latent representation of a node on one factor, which paves the way to model the link formation in heterophilic graphs, resulting in better node representation learning and link prediction performance. However, the work on this is rather limited. Therefore, in this paper, we study a novel problem of exploring disentangled representation learning for link prediction on heterophilic graphs. We propose a novel framework, DisenLink, which can learn disentangled representations by modeling the link formation and perform factor-aware message-passing to facilitate link prediction. Extensive experiments on 13 real-world datasets demonstrate the effectiveness of DisenLink for link prediction on both heterophilic and homophilic graphs. Our code is available at https://github.com/sjz5202/DisenLink

preprint2022arXiv

Multiple-Photon Resonance Enabled Quantum Interference in Emission Spectroscopy of N_2^+

Quantum interference occurs frequently in the interaction of laser radiation with materials, leading to a series of fascinating effects such as lasing without inversion, electromagnetically induced transparency, Fano resonance, etc. Such quantum interference effects are mostly enabled by single-photon resonance with transitions in the matter, regardless of how many optical frequencies are involved. Here, we demonstrate quantum interference driven by multiple photons in the emission spectroscopy of nitrogen ions that are resonantly pumped by ultrafast infrared laser pulses. In the spectral domain, Fano resonance is observed in the emission spectrum, where a laser-assisted dynamic Stark effect creates the continuum. In the time domain, the fast-evolving emission is measured, revealing the nature of free-induction decay (FID) arising from quantum radiation and molecular cooperativity. These findings clarify the mechanism of coherent emission of nitrogen ions pumped with MIR pump laser and are likely to be universal. The present work opens a route to explore the important role of quantum interference during the interaction of intense laser pulses with materials near multiple photon resonance.

preprint2022arXiv

New criterions on nonexistence of periodic orbits of planar dynamical systems and their applications

Characterizing the existence or nonexistence of periodic orbits is a classical problem of both theoretical importance and practical relevance. In this paper, several new criterions on the nonexistence of periodic orbits of the planar dynamical system $\dot x=y,~\dot y=-g(x)-f(x,y)y$ are obtained, and examples show that these criterions are applicable in cases where the known ones are not. Based on these criterions, we further characterize the local topological structures of its equilibrium, which also shows that one of the classical results by A.F. Andreev [Amer. Math. Soc. Transl. 8 (1958), 183--207] on the local topological classification of the degenerate equilibrium is incomplete. Finally, as another application of these results, we classify the global phase portraits of a planar differential system, which comes from the third question in the list of the 33 questions posed by A. Gasull and also from a mechanical oscillator under suitable restrictions on its parameters.

preprint2022arXiv

Offline-Online Learning of Deformation Model for Cable Manipulation with Graph Neural Networks

Robotic manipulation of deformable linear objects has a wide range of applications, e.g., in manufacturing and medical surgery. To complete such tasks, an accurate dynamics model for predicting the deformation is critical for robust control. In this work, we address this challenge by proposing a hybrid offline-online method to learn the dynamics of cables in a robust and data-efficient manner. In the offline phase, we adopt a Graph Neural Network (GNN) to learn the deformation dynamics purely from simulation data. Then a linear residual model is learned in real time to bridge the sim-to-real gap. The learned model is then used as the dynamics constraint of a trust-region-based Model Predictive Controller (MPC) to calculate the optimal robot movements. The online learning and MPC run in a closed-loop manner to robustly accomplish the task. Finally, comparative results with existing methods are provided to quantitatively show the effectiveness and robustness.

preprint2022arXiv

Omni-directional Pathloss Measurement Based on Virtual Antenna Array with Directional Antennas

Omni-directional pathloss, which refers to the pathloss when omni-directional antennas are used at the link ends, is essential for system design and evaluation. In the millimeter-wave (mm-Wave) and beyond bands, high-gain directional antennas are widely used for channel measurements due to the significant signal attenuation. Conventional methods for omni-directional pathloss estimation are based on a directional scanning sounding (DSS) system, i.e., a single directional antenna placed at the center of a rotator capturing signals from different rotation angles. The omni-directional pathloss is obtained by either summing up all the powers above the noise level or just summing up the powers of detected propagation paths. However, both methods are problematic with the relatively wide main beams and high side-lobes of the directional antennas. In this letter, a directional-antenna-based virtual antenna array (VAA) system is implemented for omni-directional pathloss estimation. The VAA scheme uses the same measurement system as the DSS, yet it offers high angular resolution (i.e., a narrow main beam) and low side-lobes, which are essential for achieving accurate multipath detection in the power angular delay profiles (PADPs) and thereby obtaining accurate omni-directional pathloss. A measurement campaign was designed and conducted in an indoor corridor at 28-30 GHz to verify the effectiveness of the proposed method.
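
The first conventional DSS approach mentioned above (summing all powers above the noise level) can be sketched as follows. This is a minimal illustration under stated assumptions, not the letter's procedure: the function names, the dBm inputs, and the single `total_gain_db` term that lumps the antenna gains together are all hypothetical.

```python
import math


def dbm_to_mw(power_dbm):
    """Convert a power in dBm to milliwatts."""
    return 10 ** (power_dbm / 10)


def omni_pathloss_db(tx_power_dbm, rx_powers_dbm, noise_floor_dbm, total_gain_db=0.0):
    """Estimate omni-directional pathloss from a directional scan.

    rx_powers_dbm   : received power at each rotation angle (dBm)
    noise_floor_dbm : samples at or below this level are discarded
    total_gain_db   : assumed combined antenna gain to add back (hypothetical)
    Powers are summed in the linear domain, never in dB.
    """
    kept = [dbm_to_mw(p) for p in rx_powers_dbm if p > noise_floor_dbm]
    if not kept:
        raise ValueError("no sample above the noise floor")
    total_rx_dbm = 10 * math.log10(sum(kept))
    return tx_power_dbm + total_gain_db - total_rx_dbm


# Two angles at -70 dBm are kept; the -100 dBm sample is below the -90 dBm floor.
loss = omni_pathloss_db(10.0, [-70.0, -70.0, -100.0], noise_floor_dbm=-90.0)
```

The sketch also makes the stated weakness visible: with a wide main beam, the same propagation path is captured at several neighboring angles, so a plain sum counts its power more than once.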

preprint2022arXiv

On the Equity of Nuclear Norm Maximization in Unsupervised Domain Adaptation

Nuclear norm maximization has been shown empirically to enhance the transferability of unsupervised domain adaptation (UDA) models. In this paper, we identify a new property termed equity, which indicates the degree of balance among predicted classes, to explain the efficacy of nuclear norm maximization for UDA theoretically. With this in mind, we offer a new discriminability-and-equity maximization paradigm built on squares loss, such that predictions are equalized explicitly. To verify its feasibility and flexibility, two new losses, termed Class Weighted Squares Maximization (CWSM) and Normalized Squares Maximization (NSM), are proposed to maximize both predictive discriminability and equity, at the class level and the sample level, respectively. Importantly, we theoretically relate these two novel losses (i.e., CWSM and NSM) to equity maximization under mild conditions, and empirically suggest the importance of predictive equity in UDA. Moreover, it is very efficient to realize the equity constraints in both losses. Experiments on cross-domain image classification on three popular benchmark datasets show that both CWSM and NSM outperform their corresponding counterparts.

preprint2022arXiv

Online Graph Learning in Dynamic Environments

Inferring the underlying graph topology that characterizes structured data is pivotal to many graph-based models when pre-defined graphs are not available. This paper focuses on learning graphs in the case of sequential data in dynamic environments. For sequential data, we develop an online version of the classic batch graph learning method. To better track graphs in dynamic environments, we assume graphs evolve in certain patterns such that dynamic priors can be embedded in the online graph learning framework. When the information about these hidden patterns is not available, we use historical data to predict the evolution of graphs. Furthermore, a dynamic regret analysis of the proposed method is performed and shows that our online graph learning algorithms can reach sublinear dynamic regret. Experimental results indicate that our method is superior to the state-of-the-art methods.

preprint2022arXiv

Properties and device performance of BN thin films grown on GaN by pulsed laser deposition

Wide and ultrawide-bandgap semiconductors lie at the heart of next-generation high-power, high-frequency electronics. Here, we report the growth of ultrawide-bandgap boron nitride (BN) thin films on wide-bandgap gallium nitride (GaN) by pulsed laser deposition. Comprehensive spectroscopic (core level and valence band XPS, FTIR, Raman) and microscopic (AFM and STEM) characterizations confirm the growth of BN thin films on GaN. Optically, we observed that the BN/GaN heterostructure is second-harmonic generation active. Moreover, we fabricated a BN/GaN heterostructure-based Schottky diode that demonstrates rectifying characteristics, a lower turn-on voltage, and an improved breakdown capability (234 V) as compared to GaN (168 V), owing to the higher breakdown electric field of BN. Our approach is an early step towards bridging the gap between wide and ultrawide-bandgap materials for potential optoelectronics as well as next-generation high-power electronics.

preprint2022arXiv

Stability of Oxygenated Groups on Pristine and Defective Diamond Surfaces

The surface functionalization of diamond has been extensively studied through a variety of techniques, such as oxidation. Several oxygen groups have been correspondingly detected on the oxidized diamond, such as COC (ether), CO (ketonic), and COH (hydroxyl). However, the composition and relative concentration of these groups on diamond surfaces can be affected by the type of oxygenation treatment and the diamond surface quality. To investigate the stability of the oxygenated groups at specific diamond surfaces, we evaluated, through fully atomistic reactive molecular mechanics (FARMM) simulations using the ReaxFF force field, the formation energies of CO, COC, and COH groups on pristine and defective (110), (111), and (311) diamond surfaces. According to our findings, the COH group has the lowest formation energy on a perfect (110) surface, while the COC group is favored on a defective surface. As for the (111) surface, the COC group is the most stable for both pristine and defective surfaces. Similarly, the COC group is also the most stable one on both the defective and perfect (311) surfaces. Our results thus suggest that if the (110) surface is the major exposed facet in a diamond film, the most adsorbed oxygen group could be either COH or COC, with the prevalence of COC depending on the level of surface defects.

preprint2022arXiv

Text Spotting Transformers

In this paper, we present TExt Spotting TRansformers (TESTR), a generic end-to-end text spotting framework using Transformers for text detection and recognition in the wild. TESTR builds upon a single encoder and dual decoders for joint text-box control point regression and character recognition. Unlike most existing methods, ours is free from Region-of-Interest operations and heuristics-driven post-processing procedures; TESTR is particularly effective when dealing with curved text-boxes, where special care is needed to adapt the traditional bounding-box representations. We show that our canonical representation of control points is suitable for text instances in both Bezier curve and polygon annotations. In addition, we design a bounding-box guided polygon detection (box-to-polygon) process. Experiments on curved and arbitrarily shaped datasets demonstrate state-of-the-art performance of the proposed TESTR algorithm.

preprint2022arXiv

Time-varying Graph Learning Under Structured Temporal Priors

This paper endeavors to learn time-varying graphs by using structured temporal priors that assume underlying relations between any two graphs in the graph sequence. Unlike many existing chain-structure-based methods, in which priors such as temporal homogeneity can only describe the variations between two consecutive graphs, we propose a structure named temporal graph to characterize the underlying real temporal relations. Under this framework, the chain structure is a special case of our temporal graph. We further propose a distributed algorithm based on the Alternating Direction Method of Multipliers (ADMM) to solve the induced optimization problem. Numerical experiments demonstrate the superiority of our method.

preprint2022arXiv

Unifying Motion Deblurring and Frame Interpolation with Events

Slow shutter speed and long exposure time of frame-based cameras often cause visual blur and loss of inter-frame information, degenerating the overall quality of captured videos. To this end, we present a unified framework of event-based motion deblurring and frame interpolation for blurry video enhancement, where the extremely low latency of events is leveraged to alleviate motion blur and facilitate intermediate frame prediction. Specifically, the mapping relation between blurry frames and sharp latent images is first predicted by a learnable double integral network, and a fusion network is then proposed to refine the coarse results via utilizing the information from consecutive blurry inputs and the concurrent events. By exploring the mutual constraints among blurry frames, latent images, and event streams, we further propose a self-supervised learning framework to enable network training with real-world blurry videos and events. Extensive experiments demonstrate that our method compares favorably against the state-of-the-art approaches and achieves remarkable performance on both synthetic and real-world datasets.

preprint2021arXiv

A New Design of Cache-aided Multiuser Private Information Retrieval with Uncoded Prefetching

In the problem of cache-aided multiuser private information retrieval (MuPIR), a set of $K_{\rm u}$ cache-equipped users wish to privately download a set of messages from $N$ distributed databases, each holding a library of $K$ messages. The system works in two phases: the cache placement (prefetching) phase, in which the users fill up their cache memory, and the private delivery phase, in which the users' demands are revealed and they download an answer from each database so that their desired messages can be recovered while each individual database learns nothing about the identities of the requested messages. The goal is to design the placement and the private delivery phases such that the load, which is defined as the total number of downloaded bits normalized by the message size, is minimized given any user memory size. This paper considers the MuPIR problem with two messages and an arbitrary number of users and databases, where uncoded prefetching is assumed, i.e., the users directly copy some bits from the library as their cached contents. We propose a novel MuPIR scheme inspired by the Maddah-Ali and Niesen (MAN) coded caching scheme. The proposed scheme achieves a lower load than any existing scheme, especially the product design (PD), and is shown to be optimal within a factor of $8$ in general and exactly optimal in the very high or low memory regime.

preprint2021arXiv

A Reactive Molecular Dynamics Study of Hydrogenation on Diamond Surfaces

Hydrogenated diamond has been regarded as a promising material in electronic device applications, especially in field-effect transistors (FETs). However, the quality of diamond hydrogenation has not yet been established, nor has the specific orientation that would provide the optimum hydrogen coverage. In addition, most theoretical work in the literature uses models with 100% hydrogenated diamond surfaces to study electronic properties, which is far from the experimentally observed hydrogen coverage. In this work, we have carried out a detailed study using fully atomistic reactive molecular dynamics (MD) simulations on low-index diamond surfaces, i.e., (001), (013), (110), (113) and (111), to evaluate the quality and hydrogenation thresholds on different diamond surfaces and their possible effects on electronic properties. Our simulation results indicate that 100% surface hydrogenation of these surfaces is hard to achieve because of the steric repulsion between the terminated hydrogen atoms. Among all the considered surfaces, the (001), (110), and (113) surfaces incorporate a larger number of hydrogen atoms and passivate the surface dangling bonds. Our results on hydrogen stability also suggest that these surfaces with optimum hydrogen coverage are robust under extreme conditions and could provide homogeneous p-type surface conductivity in the diamond surfaces, a key requirement for high-field, high-frequency device applications.

preprint2021arXiv

A Two-stream Neural Network for Pose-based Hand Gesture Recognition

Pose-based hand gesture recognition has been widely studied in recent years. Compared with full-body action recognition, hand gestures involve joints that are more closely distributed in space and collaborate more strongly. This nature requires a different approach from action recognition to capture the complex spatial features. Many gesture categories, such as "Grab" and "Pinch", have very similar motion or temporal patterns, posing a challenge for temporal processing. To address these challenges, this paper proposes a two-stream neural network with one stream being a self-attention based graph convolutional network (SAGCN) extracting the short-term temporal information and hierarchical spatial information, and the other being a residual-connection enhanced bidirectional Independently Recurrent Neural Network (RBi-IndRNN) for extracting long-term temporal information. The self-attention based graph convolutional network has a dynamic self-attention mechanism to adaptively exploit the relationships of all hand joints in addition to the fixed topology and local feature extraction in the GCN. On the other hand, the residual-connection enhanced Bi-IndRNN extends an IndRNN with the capability of bidirectional processing for temporal modelling. The two streams are fused together for recognition. The Dynamic Hand Gesture dataset and First-Person Hand Action dataset are used to validate its effectiveness, and our method achieves state-of-the-art performance.

preprint2021arXiv

Feedback-based Digital Higher-order Terminal Sliding Mode for 6-DOF Industrial Manipulators

The precise motion control of a multi-degree-of-freedom (DOF) robot manipulator is always challenging due to its nonlinear dynamics, disturbances, and uncertainties. Because most manipulators are controlled by digital signals, a novel higher-order sliding mode controller in the discrete-time form with time delay estimation is proposed in this paper. The dynamic model of the manipulator used in the design allows proper handling of nonlinearities, uncertainties and disturbances involved in the problem. Specifically, parametric uncertainties and disturbances are handled by the time delay estimation and the nonlinearity of the manipulator is addressed by the feedback structure of the controller. The combination of terminal sliding mode surface and higher-order control scheme in the controller guarantees a fast response with a small chattering amplitude. Moreover, the controller is designed with a modified sliding mode surface and variable-gain structure, so that the performance of the controller is further enhanced. We also analyze the condition to guarantee the stability of the closed-loop system in this paper. Finally, the simulation and experimental results prove that the proposed control scheme has a precise performance in a robot manipulator system.

preprint2021arXiv

Information retrieval and eigenstates coalescence in a non-Hermitian quantum system with anti-$\mathcal{PT}$ symmetry

Non-Hermitian systems with parity-time reversal ($\mathcal{PT}$) or anti-$\mathcal{PT}$ symmetry have attracted a wide range of interest owing to their unique characteristics and counterintuitive phenomena. One of the most extraordinary features is the presence of an exceptional point (EP), across which a phase transition with spontaneously broken $\mathcal{PT}$ symmetry takes place. We implement a Floquet Hamiltonian of a single qubit with anti-$\mathcal{PT}$ symmetry by periodically driving a dissipative quantum system of a single trapped ion. With stroboscopic emission and quantum state tomography, we obtain the time evolution of the density matrix for an arbitrary initial state, and directly demonstrate information retrieval, eigenstates coalescence, and topological energy spectra as unique features of non-Hermitian systems.

preprint2021arXiv

Learning Variable Impedance Control via Inverse Reinforcement Learning for Force-Related Tasks

Many manipulation tasks require robots to interact with unknown environments. In such applications, the ability to adapt the impedance according to different task phases and environment constraints is crucial for safety and performance. Although many approaches based on deep reinforcement learning (RL) and learning from demonstration (LfD) have been proposed to obtain variable impedance skills on contact-rich manipulation tasks, these skills are typically task-specific and could be sensitive to changes in task settings. This paper proposes an inverse reinforcement learning (IRL) based approach to recover both the variable impedance policy and reward function from expert demonstrations. We explore different action space of the reward functions to achieve a more general representation of expert variable impedance skills. Experiments on two variable impedance tasks (Peg-in-Hole and Cup-on-Plate) were conducted in both simulations and on a real FANUC LR Mate 200iD/7L industrial robot. The comparison results with behavior cloning and force-based IRL proved that the learned reward function in the gain action space has better transferability than in the force space. Experiment videos are available at https://msc.berkeley.edu/research/impedance-irl.html.

preprint2021arXiv

Uncoordinated Spectrum Sharing in Millimeter Wave Networks Using Carrier Sensing

We propose using Carrier Sensing (CS) for distributed interference management in millimeter-wave (mmWave) cellular networks where spectrum is shared by multiple operators that do not coordinate among themselves. In addition, even the base station sites can be shared by the operators. We describe important challenges in using traditional CS in this setting and propose enhanced CS protocols to address these challenges. Using stochastic geometry, we develop a general framework for downlink coverage probability analysis of our shared mmWave network in the presence of CS and derive the downlink coverage probability expressions for several CS protocols. To the best of our knowledge, our work is the first to investigate and analyze (using stochastic geometry) CS for mmWave networks with spectrum and BS sites shared among non-coordinating operators. We evaluate the downlink coverage probability of our shared mmWave network using simulations as well as numerical examples based on our analysis. Our evaluations show that our proposed enhancements lead to an improvement in downlink coverage probability, compared to the downlink coverage probability with no CS, for higher values of signal-to-interference and noise ratio (SINR). Interestingly, our evaluations also reveal that for lower values of SINR, not using any CS is the best strategy in terms of the downlink coverage probability.

preprint2020arXiv

A Multi-view CNN-based Acoustic Classification System for Automatic Animal Species Identification

Automatic identification of animal species by their vocalization is an important and challenging task. Although many kinds of audio monitoring systems have been proposed in the literature, they suffer from several disadvantages such as non-trivial feature selection, accuracy degradation because of environmental noise, or intensive local computation. In this paper, we propose a deep learning based acoustic classification framework for Wireless Acoustic Sensor Networks (WASNs). The proposed framework is based on a cloud architecture which relaxes the computational burden on the wireless sensor node. To improve the recognition accuracy, we design a multi-view Convolutional Neural Network (CNN) to extract the short-, middle-, and long-term dependencies in parallel. The evaluation on two real datasets shows that the proposed architecture can achieve high accuracy and outperforms traditional classification systems significantly when the environmental noise dominates the audio signal (low SNR). Moreover, we implement and deploy the proposed system on a testbed and analyse the system performance in real-world environments. Both simulation and real-world evaluation demonstrate the accuracy and robustness of the proposed acoustic classification system in distinguishing species of animals.

preprint2020arXiv

A New Design Framework on Device-to-Device Coded Caching with Optimal Rate and Significantly Less Subpacketizations

In this paper, we propose a new design framework on Device-to-Device (D2D) coded caching networks with optimal rate but significantly less file subpacketization compared to that of the well-known D2D coded caching scheme proposed by Ji, Caire and Molisch (JCM). The proposed design framework is referred to as the Packet Type-based (PTB) design, where D2D users are first partitioned into multiple groups, which leads to a so-called raw packet saving gain. Then the corresponding multicasting group types and packet types are specified based on the prescribed node partition. By a careful selection of transmitters within each multicasting group, a so-called further splitting ratio gain can also be achieved. By the joint effect of the raw packet saving gain and the further splitting ratio gain, an order-wise subpacketization reduction can be achieved compared to the JCM scheme while preserving the optimal rate for large system parameter regimes. In addition, to our knowledge for the first time in the literature, we find that unequal subpacketization is key to achieving a subpacketization gain when the number of users is odd. As a by-product, instead of directly translating shared link caching schemes to D2D caching schemes, at least for the sake of subpacketization, a new design framework is indeed needed.

preprint2020arXiv

Adversarial Imitation Attack

Deep learning models are known to be vulnerable to adversarial examples. A practical adversarial attack should require as little knowledge of the attacked models as possible. Current substitute attacks need pre-trained models to generate adversarial examples, and their attack success rates heavily rely on the transferability of adversarial examples. Current score-based and decision-based attacks require many queries to the attacked models. In this study, we propose a novel adversarial imitation attack. First, it produces a replica of the attacked model through a two-player game similar to generative adversarial networks (GANs). The objective of the generative model is to generate examples that lead the imitation model to return outputs different from those of the attacked model. The objective of the imitation model is to output the same labels as the attacked model under the same inputs. Then, the adversarial examples generated by the imitation model are utilized to fool the attacked model. Compared with the current substitute attacks, imitation attacks can use less training data to produce a replica of the attacked model and improve the transferability of adversarial examples. Experiments demonstrate that our imitation attack requires less training data than the black-box substitute attacks, but achieves an attack success rate close to the white-box attack on unseen data with no query.

preprint2020arXiv

Adversarial Representation Learning for Robust Patient-Independent Epileptic Seizure Detection

Objective: Epilepsy is a chronic neurological disorder characterized by the occurrence of spontaneous seizures, which affects about one percent of the world's population. Most of the current seizure detection approaches strongly rely on patient history records and thus fail in the patient-independent setting, where seizures must be detected for new patients. To overcome this limitation, we propose a robust and explainable epileptic seizure detection model that effectively learns from seizure states while eliminating the inter-patient noise. Methods: A complex deep neural network model is proposed to learn the pure seizure-specific representation from the raw non-invasive electroencephalography (EEG) signals through adversarial training. Furthermore, to enhance the explainability, we develop an attention mechanism to automatically learn the importance of each EEG channel in the seizure diagnosis procedure. Results: The proposed approach is evaluated over the Temple University Hospital EEG (TUH EEG) database. The experimental results illustrate that our model outperforms the competitive state-of-the-art baselines with low latency. Moreover, the designed attention mechanism is shown to be able to provide fine-grained information for pathological analysis. Conclusion and significance: We propose an effective and efficient patient-independent diagnosis approach of epileptic seizure based on raw EEG signals without manual feature engineering, which is a step toward the development of large-scale deployment for real-life use.

preprint2020arXiv

Automatic Image Labelling at Pixel Level

The performance of deep networks for semantic image segmentation largely depends on the availability of large-scale training images which are labelled at the pixel level. Typically, such pixel-level image labellings are obtained manually by a labour-intensive process. To alleviate the burden of manual image labelling, we propose an interesting learning approach to generate pixel-level image labellings automatically. A Guided Filter Network (GFN) is first developed to learn the segmentation knowledge from a source domain, and such GFN then transfers such segmentation knowledge to generate coarse object masks in the target domain. Such coarse object masks are treated as pseudo labels and they are further integrated to optimize/refine the GFN iteratively in the target domain. Our experiments on six image sets have demonstrated that our proposed approach can generate fine-grained object masks (i.e., pixel-level object labellings), whose quality is very comparable to the manually-labelled ones. Our proposed approach can also achieve better performance on semantic image segmentation than most existing weakly-supervised approaches.

preprint2020arXiv

Berry curvature memory through electrically driven stacking transitions

In two-dimensional layered quantum materials, the stacking order of the layers determines both the crystalline symmetry and electronic properties such as the Berry curvature, topology and electron correlation. Electrical stimuli can influence quasiparticle interactions and the free-energy landscape, making it possible to dynamically modify the stacking order and reveal hidden structures that host different quantum properties. Here we demonstrate electrically driven stacking transitions that can be applied to design nonvolatile memory based on Berry curvature in few-layer WTe$_2$. The interplay of out-of-plane electric fields and electrostatic doping controls in-plane interlayer sliding and creates multiple polar and centrosymmetric stacking orders. In situ nonlinear Hall transport reveals such stacking rearrangements result in a layer-parity-selective Berry curvature memory in momentum space, where the sign reversal of the Berry curvature and its dipole only occurs in odd-layer crystals. Our findings open an avenue towards exploring coupling between topology, electron correlations, and ferroelectricity in hidden stacking orders and demonstrate a new low-energy-cost, electrically controlled topological memory in the atomically thin limit.

preprint2020arXiv

BeSense: Leveraging WiFi Channel Data and Computational Intelligence for Behavior Analysis

Ever-evolving informatics technology has gradually bound humans and computers closely together. Understanding user behavior becomes a key enabler in many fields such as sedentary-related healthcare, human-computer interaction (HCI) and affective computing. Traditional sensor-based and vision-based user behavior analysis approaches are obtrusive in general, hindering their usage in the real world. Therefore, in this article, we first introduce the WiFi signal as a new source, instead of sensors and vision, for unobtrusive user behavior analysis. Then we design BeSense, a contactless behavior analysis system leveraging signal processing and computational intelligence over WiFi channel state information (CSI). We prototype BeSense on commodity low-cost WiFi devices and evaluate its performance in real-world environments. Experimental results have verified its effectiveness in recognizing user behaviors.

preprint2020arXiv

Brain2Object: Printing Your Mind from Brain Signals with Spatial Correlation Embedding

Electroencephalography (EEG) signals are known to manifest differential patterns when individuals visually concentrate on different objects. In this work, we present an end-to-end digital fabrication system, Brain2Object, to print the 3D object that an individual is observing by decoding visually-evoked brain signals. We propose a unified training framework that combines multi-class Common Spatial Pattern and Convolutional Neural Networks to support the backend computation. We learn the dynamical graph representations of brain signals to accurately capture the structural information among EEG channels. A user-friendly interface is developed as the system front end. Brain2Object presents a streamlined end-to-end workflow that can serve as a template for deeper integration of BCI technologies to assist with our routine activities. The proposed system is evaluated extensively using offline experiments and through an online demonstrator. The experimental results show that our approach can achieve the recognition accuracy of 92.58% on a benchmark dataset and 75.23% on a locally collected dataset. Moreover, our method consistently outperforms a wide range of baseline and state-of-the-art approaches. The proof-of-concept corroborates the practicality of our approach and illustrates the ease with which such a system could be deployed.

preprint2020arXiv

Cache-aided Interference Management using Hypercube Combinatorial Cache Design with Reduced Subpacketizations and Order Optimal Sum-Degrees of Freedom

We consider a cache-aided interference network which consists of a library of $N$ files, $K_T$ transmitters and $K_R$ receivers (users), each equipped with a local cache of size $M_T$ and $M_R$ files respectively, and connected via a discrete-time additive white Gaussian noise (AWGN) channel. Each receiver requests an arbitrary file from the library. The objective is to design a cache placement without knowing the receivers' requests and a communication scheme such that the sum Degrees of Freedom (sum-DoF) of the delivery is maximized. This network model with one-shot transmission was first investigated by Naderializadeh et al., who proposed a scheme that achieves a one-shot sum-DoF of $\min\{\frac{M_TK_T+K_RM_R}{N}, K_R\}$, which is optimal within a constant of $2$. One of the biggest limitations of this scheme is the requirement of high subpacketization level. This paper attempts to design new algorithms to reduce the file subpacketization in such a network without hurting the sum-DoF. In particular, we propose a new approach for both prefetching and linearly coded delivery based on a combinatorial design called hypercube. The proposed approach reduces the subpacketization exponentially in terms of $K_R M/N$ and achieves the identical one-shot sum DoF when $\frac{M_TK_T+K_RM_R}{N} \leq K_R$.
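
The one-shot sum-DoF expression quoted in this abstract, $\min\{\frac{M_TK_T+K_RM_R}{N}, K_R\}$, can be evaluated directly. The following is a minimal sketch that only computes that formula; the function name and argument names are illustrative assumptions and are not taken from the paper, and the sketch does not implement any caching or delivery scheme.

```python
from fractions import Fraction


def one_shot_sum_dof(n_files, k_t, m_t, k_r, m_r):
    """Evaluate min{(M_T*K_T + K_R*M_R)/N, K_R} exactly.

    n_files: library size N
    k_t, m_t: number of transmitters K_T and their cache size M_T (in files)
    k_r, m_r: number of receivers K_R and their cache size M_R (in files)
    The names are hypothetical; only the formula comes from the abstract.
    """
    # Aggregate cache size in the network, normalized by the library size.
    aggregate = Fraction(m_t * k_t + k_r * m_r, n_files)
    # The sum-DoF can never exceed the number of receivers.
    return min(aggregate, Fraction(k_r))
```

For example, with N = 4, K_T = 2, M_T = 2, K_R = 4 and M_R = 1, the aggregate term is (4 + 4)/4 = 2, which is below K_R = 4, so the expression evaluates to 2.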

preprint2020arXiv

Cache-aided Interference Management Using Hypercube Combinatorial Cache Designs

We consider a cache-aided interference network which consists of a library of $N$ files, $K_T$ transmitters and $K_R$ receivers (users), each equipped with a local cache of size $M_T$ and $M_R$ files respectively, and connected via a discrete-time additive white Gaussian noise channel. Each receiver requests an arbitrary file from the library. The objective is to design a cache placement without knowing the receivers' requests and a communication scheme such that the sum Degrees of Freedom (sum-DoF) of the delivery is maximized. This network model has been investigated by Naderializadeh et al., who proposed prefetching and delivery schemes that achieve a sum-DoF of $\min\{\frac{M_TK_T+K_RM_R}{N}, K_R\}$. One of the biggest limitations of this scheme is the requirement of high subpacketization level. This paper is the first attempt in the literature (to our knowledge) to reduce the file subpacketization in such a network. In particular, we propose a new approach for both prefetching and linear delivery schemes based on a combinatorial design called hypercube. We show that the required number of packets per file can be exponentially reduced compared to the state-of-the-art scheme proposed by Naderializadeh et al., or the NMA scheme. When $M_TK_T+K_RM_R \geq K_R$, the achievable one-shot sum-DoF using this approach is $\frac{M_TK_T+K_RM_R}{N}$, which shows that 1) the one-shot sum-DoF scales linearly with the aggregate cache size in the network and 2) it is within a factor of $2$ to the information-theoretic optimum. Surprisingly, the identical and near optimal sum-DoF performance can be achieved using the hypercube approach with much less file subpacketization.

preprint2020arXiv

Deep Neural Network Hyperparameter Optimization with Orthogonal Array Tuning

Deep learning algorithms have achieved excellent performance lately in a wide range of fields (e.g., computer vision). However, a severe challenge faced by deep learning is the high dependency on hyper-parameters. The algorithm results may fluctuate dramatically under different configurations of hyper-parameters. To address this issue, this paper presents an efficient Orthogonal Array Tuning Method (OATM) for deep learning hyper-parameter tuning. We describe the OATM approach in five detailed steps and elaborate on it using two widely used deep neural network structures (Recurrent Neural Networks and Convolutional Neural Networks). The proposed method is compared to the state-of-the-art hyper-parameter tuning methods, including manual (e.g., grid search and random search) and automatic (e.g., Bayesian Optimization) ones. The experiment results show that OATM can significantly save tuning time compared to the state-of-the-art methods while preserving satisfactory performance. The code is available on GitHub (https://github.com/xiangzhang1015/OATM).
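
To illustrate why an orthogonal array can save tuning time relative to a full grid, the sketch below builds the standard L9 array (9 runs, 4 factors, 3 levels each) and maps it onto a hypothetical hyper-parameter grid. This is only an assumption-laden illustration of orthogonal arrays in general: the abstract does not list OATM's five steps, and the factor names and level values here are invented for the example.

```python
from itertools import combinations, product


def l9_orthogonal_array():
    """Build the L9(3^4) array: 9 runs covering 4 three-level factors."""
    # Columns 3 and 4 are linear combinations of the first two, modulo 3.
    return [(a, b, (a + b) % 3, (a + 2 * b) % 3)
            for a, b in product(range(3), repeat=2)]


def is_strength_two(rows, levels=3):
    """Check that every pair of columns contains each level pair equally often."""
    for i, j in combinations(range(len(rows[0])), 2):
        pairs = [(row[i], row[j]) for row in rows]
        counts = {pair: pairs.count(pair) for pair in set(pairs)}
        if len(counts) != levels ** 2 or len(set(counts.values())) != 1:
            return False
    return True


def configs_from_array(rows, grid):
    """Turn level indices into concrete hyper-parameter settings."""
    names = list(grid)
    return [{name: grid[name][level] for name, level in zip(names, row)}
            for row in rows]


# Hypothetical search space (not from the paper): 3 levels per factor.
GRID = {
    "learning_rate": [1e-4, 1e-3, 1e-2],
    "batch_size": [32, 64, 128],
    "hidden_units": [64, 128, 256],
    "dropout": [0.1, 0.3, 0.5],
}

trials = configs_from_array(l9_orthogonal_array(), GRID)
```

Under these assumptions the array yields 9 configurations, whereas an exhaustive grid over the same four factors would need 3^4 = 81 runs; the balance check confirms that each pair of factors is still explored over all 9 level combinations.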

preprint2020arXiv

Entity Profiling in Knowledge Graphs

Knowledge Graphs (KGs) are graph-structured knowledge bases storing factual information about real-world entities. Understanding the uniqueness of each entity is crucial to the analyzing, sharing, and reusing of KGs. Traditional profiling technologies encompass a vast array of methods to find distinctive features in various applications, which can help to differentiate entities in the process of human understanding of KGs. In this work, we present a novel profiling approach to identify distinctive entity features. The distinctiveness of features is carefully measured by a HAS model, which is a scalable representation learning model to produce a multi-pattern entity embedding. We fully evaluate the quality of entity profiles generated from real KGs. The results show that our approach facilitates human understanding of entities in KGs.

preprint2020arXiv

Graph Computing based Distributed State Estimation with PMUs

Power system state estimation plays a fundamental and critical role in the energy management system (EMS). To achieve high-performance and accurate system state estimation, a graph computing based distributed state estimation approach is proposed in this paper. Firstly, a power system network is divided into multiple areas. Reference buses are selected for each area, with PMUs installed at these buses. Then, the system network is converted into multiple independent areas. In this way, the power system state estimation can be conducted in parallel for each area, and the estimated system states are obtained without compromising accuracy. The IEEE 118-bus system and the MP 10790-bus system are employed to verify the accuracy of the results and to demonstrate the promising computational performance.

preprint2020arXiv

High-performance frequency stabilization of ultraviolet diode lasers by using dichroic atomic vapor spectroscopy and transfer cavity

Ultraviolet (UV) diode lasers are widely used in many photonics applications. However, their frequency stabilization schemes are not as mature as those for frequency-doubling lasers, mainly due to some limitations in the UV spectral region. Here we developed a high-performance UV frequency stabilization technique implemented directly on UV diode lasers by combining the dichroic atomic vapor laser lock and the resonant transfer cavity lock. As an example, we demonstrate a stable locking with frequency standard deviations of approximately 200 kHz and 300 kHz for 399 nm and 370 nm diode lasers in 20 minutes. We achieve a long-term frequency drift of no more than 1 MHz for the target 370 nm laser within an hour, which was further verified with fluorescence count rates of a single trapped $^{171}$Yb$^+$ ion. We also find strong linear correlations between lock points and environmental factors such as temperature and atmospheric pressure.

preprint2020arXiv

Improving Spiking Sparse Recovery via Non-Convex Penalties

Compared with digital methods, sparse recovery based on spiking neural networks has great advantages like high computational efficiency and low power-consumption. However, current spiking algorithms cannot guarantee more accurate estimates since they are usually designed to solve the classical optimization with convex penalties, especially the $\ell_{1}$-norm. In fact, convex penalties are observed to underestimate the true solution in practice, while non-convex ones can avoid the underestimation. Inspired by this, we propose an adaptive version of spiking sparse recovery algorithm to solve the non-convex regularized optimization, and provide an analysis on its global asymptotic convergence. Through experiments, the accuracy is greatly improved under different adaptive ways.
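
The underestimation attributed to convex penalties above can be seen in the scalar case. The sketch below compares soft thresholding (the proximal operator of the $\ell_{1}$ penalty) with hard thresholding (associated with a non-convex $\ell_0$-type penalty). It is a generic textbook illustration, assumed here for clarity; it is not the adaptive spiking algorithm proposed in the paper, and the function names are illustrative.

```python
def soft_threshold(x, lam):
    """Proximal operator of lam * |x|: shrinks every surviving value by lam."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def hard_threshold(x, lam):
    """Keep values above the threshold unchanged, zero out the rest."""
    return x if abs(x) > lam else 0.0
```

For a true coefficient of 3.0 and a threshold of 0.5, soft thresholding returns 2.5 (biased toward zero by the threshold), while hard thresholding returns 3.0; both map a small value such as 0.2 to zero.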

preprint2020arXiv

Improving Unsupervised Domain Adaptation by Reducing Bi-level Feature Redundancy

Reducing feature redundancy has shown beneficial effects for improving the accuracy of deep learning models, thus it is also indispensable for the models of unsupervised domain adaptation (UDA). Nevertheless, most recent efforts in the field of UDA ignore this point. Moreover, the main schemes that realize this independently of UDA involve only a single domain, and thus might not be effective for cross-domain tasks. In this paper, we emphasize the significance of reducing feature redundancy for improving UDA in a bi-level way. For the first level, we try to ensure compact domain-specific features with a transferable decorrelated normalization module, which preserves specific domain information whilst easing the side effect of feature redundancy on the sequel domain-invariance. In the second level, domain-invariant feature redundancy caused by domain-shared representation is further mitigated via an alternative brand orthogonality for better generalization. These two novel aspects can be easily plugged into any BN-based backbone neural networks. Specifically, simply applying them to ResNet50 has achieved performance competitive with the state of the art on five popular benchmarks. Our code will be available at https://github.com/dreamkily/gUDA.

preprint2020arXiv

Linear Model based Geometry Coding for Lidar Acquired Point Clouds

In this paper, we propose a new geometry coding method for point cloud compression (PCC), where the points can be fitted and represented by straight lines. The encoding of the linear model can be expressed in two parts: the principal component along the line direction and the offsets from the line. Compact representation and high-efficiency coding methods are presented by encoding the parameters of the linear model with appropriate quantization step-sizes (QS). To maximize the coding performance, encoder optimization techniques are employed to find the optimal trade-off between coding bits and errors, involving the Lagrangian multiplier method, where the rate-distortion behavior in terms of QS and multiplier is analyzed. We implement our method on top of the MPEG G-PCC reference software, and the results have shown that the proposed method is effective in coding point clouds with explicit line structures, such as the Lidar acquired data for autonomous driving. About 20% coding gains can be achieved on lossy geometry coding.

preprint2020arXiv

Literature Triage on Genomic Variation Publications by Knowledge-enhanced Multi-channel CNN

Background: To investigate the correlation between genomic variation and certain diseases or phenotypes, the fundamental task is to screen out the concerning publications from massive literature, which is called literature triage. Some knowledge bases, including UniProtKB/Swiss-Prot and NHGRI-EBI GWAS Catalog, are created for collecting concerning publications. These publications are manually curated by experts, which is time-consuming. Moreover, the manual curation of information from literature is not scalable due to the rapidly increasing number of publications. In order to cut down the cost of literature triage, machine-learning models were adopted to automatically identify biomedical publications. Methods: Compared to previous studies utilizing machine-learning models for literature triage, we adopt a multi-channel convolutional network to utilize rich textual information and meanwhile bridge the semantic gaps from different corpora. In addition, knowledge embeddings learned from UMLS are also used to provide extra medical knowledge beyond textual features in the process of triage. Results: We demonstrate that our model outperforms the state-of-the-art models over 5 datasets with the help of knowledge embedding and multiple channels. Our model improves the accuracy of biomedical literature triage results. Conclusions: Multiple channels and knowledge embeddings enhance the performance of the CNN model in the task of biomedical literature triage. Keywords: Literature Triage; Knowledge Embedding; Multi-channel Convolutional Network

preprint2020arXiv

Magnitude and Spatial Distribution Control of the Supercurrent in Bi2O2Se-Based Josephson Junction

Many proposals in exploring topological quantum computation are based on superconducting quantum devices constructed on materials with strong spin-orbit coupling (SOC). For these devices, a full control on both the magnitude and the spatial distribution of the supercurrent would be highly demanded, but has been elusive up to now. We constructed proximity-type Josephson junction on nanoplates of Bi2O2Se, a new emerging semiconductor with strong SOC. Through electrical gating, we show that the supercurrent can be fully turned ON and OFF, and its real-space pathways can be configured either through the bulk or along the edges. Our work demonstrates Bi2O2Se as a promising platform for constructing multifunctional hybrid superconducting devices as well as for searching for topological superconductivity.

preprint2020arXiv

Multi-task Generative Adversarial Learning on Geometrical Shape Reconstruction from EEG Brain Signals

Synthesizing geometrical shapes from human brain activities is an interesting and meaningful but very challenging topic. Recently, the advancements of deep generative models like Generative Adversarial Networks (GANs) have supported the object generation from neurological signals. However, electroencephalography (EEG)-based shape generation still suffers from the low realism problem. In particular, the generated geometrical shapes lack clear edges and fail to contain necessary details. In light of this, we propose a novel multi-task generative adversarial network to convert the individual's EEG signals evoked by geometrical shapes to the original geometry. First, we adopt a Convolutional Neural Network (CNN) to learn highly informative latent representation for the raw EEG signals, which is vital for the subsequent shape reconstruction. Next, we build the discriminator based on multi-task learning to distinguish and classify fake samples simultaneously, where the mutual promotion between different tasks improves the quality of the recovered shapes. Then, we propose a semantic alignment constraint in order to force the synthesized samples to approach the real ones at the pixel level, thus producing more compelling shapes. The proposed approach is evaluated over a local dataset and the results show that our model outperforms the competitive state-of-the-art baselines.

preprint2020arXiv

Precision measurements with cold atoms and trapped ions

Recent progress in the quantum control of cold atoms and trapped ions, in both scientific and technological aspects, has greatly advanced applications in precision measurement. Thanks to the exceptional controllability and versatility of these massive quantum systems, unprecedented sensitivity has been achieved in clocks, magnetometers and interferometers based on cold atoms and ions. Besides, these systems also feature many characteristics that can be employed to facilitate the applications in different scenarios. In this review, we briefly introduce the principles of optical clocks, cold atom magnetometers and atom interferometers used for precision measurement of time, magnetic field, and inertial forces. The main content is then devoted to summarizing some recent experimental and theoretical progress in these three applications, with special attention being paid to the new designs and possibilities towards better performance. The purpose of this review is by no means to give a complete overview of all important works in this fast developing field, but to draw a rough sketch of the frontiers and show the fascinating future lying ahead.

preprint2020arXiv

ProbaNet: Proposal-balanced Network for Object Detection

Candidate object proposals generated by object detectors based on convolutional neural networks (CNNs) suffer from an imbalance between easy and hard samples, which can affect overall performance. In this study, we propose a Proposal-balanced Network (ProbaNet) for alleviating the imbalance problem. Firstly, ProbaNet increases the probability of choosing hard samples for training by discarding easy samples through threshold truncation. Secondly, ProbaNet emphasizes foreground proposals by increasing their weights. To evaluate the effectiveness of ProbaNet, we train models based on different benchmarks. The mean Average Precision (mAP) of the model using ProbaNet is 1.2% higher than the baseline on PASCAL VOC 2007. Furthermore, it is compatible with existing two-stage detectors and adds very little computational cost.
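
The two steps named in the abstract (threshold truncation of easy samples, then up-weighting of foreground proposals) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Proposal` type, the use of a per-proposal loss as the measure of difficulty, the threshold value, and the foreground weight are all hypothetical choices made for illustration only.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Proposal:
    loss: float          # assumed difficulty measure: higher loss = harder sample
    is_foreground: bool  # True if the proposal is matched to an object


def balance_proposals(
    proposals: List[Proposal],
    easy_loss_threshold: float = 0.1,  # hypothetical truncation threshold
    foreground_weight: float = 2.0,    # hypothetical foreground emphasis
) -> Tuple[List[Proposal], List[float]]:
    """Drop easy proposals, then assign a larger weight to foreground ones."""
    # Step 1: threshold truncation. Proposals below the threshold are
    # treated as easy and discarded, so hard samples dominate training.
    kept = [p for p in proposals if p.loss >= easy_loss_threshold]
    # Step 2: foreground emphasis. Foreground proposals get a larger weight.
    weights = [foreground_weight if p.is_foreground else 1.0 for p in kept]
    return kept, weights


def weighted_loss(kept: List[Proposal], weights: List[float]) -> float:
    """Weight-normalized average loss over the retained proposals."""
    total_weight = sum(weights)
    if total_weight == 0:
        return 0.0
    return sum(w * p.loss for p, w in zip(kept, weights)) / total_weight


if __name__ == "__main__":
    batch = [Proposal(0.05, False), Proposal(0.5, True), Proposal(0.3, False)]
    kept, weights = balance_proposals(batch)
    print(len(kept), weights, weighted_loss(kept, weights))
```

In this sketch the easy background proposal is removed before the loss is computed, and the remaining foreground proposal counts twice as much as the remaining background one.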

preprint2020arXiv

The Development of Non-coherent Passive Radar Techniques for Space Situational Awareness with the Murchison Widefield Array

The number of active and inactive satellites in Earth orbit has dramatically increased in recent decades, requiring the development of novel surveillance techniques to monitor and track them. In this paper, we build upon previous non-coherent passive radar space surveillance demonstrations undertaken using the Murchison Widefield Array (MWA). We develop the concept of the Dynamic Signal to Noise Ratio Spectrum (DSNRS) in order to isolate signals of interest (reflections of FM transmissions from objects in orbit) and efficiently differentiate them from direct path reception events. We detect and track Alouette-2, ALOS, UKube-1, the International Space Station, and Duchifat-1 in this manner. We also identify out-of-band transmissions from Duchifat-1 and UKube-1 using these techniques, demonstrating the MWA's capability to look for spurious transmissions from satellites. We identify an offset from the locations predicted by the cataloged orbital parameters for some of the satellites, demonstrating the potential of using the MWA for satellite catalog maintenance. These results demonstrate the capability of the MWA for Space Situational Awareness, and we describe future work in this area.

preprint2020arXiv

Transparent Metamaterial Absorber with Broadband RCS Reduction for Solar Arrays

Solar arrays are the primary energy source of a satellite. In this paper, a metamaterial absorber for solar arrays with simultaneous high optical transparency and broadband microwave absorption is presented. By tailoring the reflection response of meta-atoms, 85% absorption from 6.8 GHz to 18 GHz is obtained. At the same time, by employing transparent substrates, including indium tin oxide (ITO) film and anti-reflection glass, a maximum light transmittance of 87% is achieved. The absorptivity of the proposed metamaterial absorber is simulated and measured experimentally. Light transmittance and the effect of the transparent metamaterial absorber on the conversion efficiency of the solar array have also been measured. These results demonstrate the reliability of our design for solar arrays, which also meets the requirements of structural strength, atomic oxygen erosion resistance, weight limitation, etc.

preprint2020arXiv

Unidirectional Pumping of Phonons by Magnetization Dynamics

We propose a method to control surface phonon transport by weak magnetic fields based on the pumping of surface acoustic waves (SAWs) by magnetostriction. We predict that the magnetization dynamics of a nanowire on top of a dielectric film injects SAWs with opposite angular momenta into opposite directions. Two parallel nanowires form a phononic cavity that at magnetic resonances pumps a unidirectional SAW current into half of the substrate.

preprint2020arXiv

Unlocking the Power of Deep PICO Extraction: Step-wise Medical NER Identification

The PICO framework (Population, Intervention, Comparison, and Outcome) is commonly used to formulate evidence in the medical domain. The major task of PICO extraction is to extract sentences from medical literature and classify them into each class. However, in most circumstances, an extracted sentence contains more than one piece of evidence even if it has been assigned to a certain class. To address this problem, we propose a step-wise disease Named Entity Recognition (DNER) extraction and PICO identification method. With our method, sentences in the paper title and abstract are first classified into the different PICO classes, and medical entities are then identified and classified into P and O. Different kinds of deep learning frameworks are used, and experimental results show that our method achieves high performance and fine-grained extraction results compared with conventional PICO extraction works.

preprint2019arXiv

Chiral coupling of magnons in waveguides

We theoretically investigate the collective excitation of multiple (sub)millimeter-sized ferromagnets mediated by waveguide photons. Through the position of the magnets in the waveguide, the magnon-photon coupling can be tuned to be chiral, i.e., magnons only couple with photons propagating in one direction, leading to asymmetric transfer of angular momentum and energy between the magnets. A large imbalance in the magnon number distribution over the magnets can be achieved with a long chain of magnets, with the magnons concentrating at one edge. The chain also supports standing waves with low radiation efficiency that are inert to the chirality.

preprint2019arXiv

Microscopic mechanism of level attraction

The emerging level attraction from dissipative light-matter coupling converges the typical Rabi-splitting feature from coherent coupling and exhibits potential in topological information processing. However, the underlying microscopic quantum mechanism of dissipative coupling remains unclear, which makes it difficult to quantify and manipulate the coherence-dissipation competition and thereby to flexibly control level attraction. Here, by coupling magnons to a cavity supporting both standing and travelling waves, we identify the travelling-wave state to be responsible for magnon-photon dissipative coupling. By characterizing the radiative broadening of the magnon linewidth, we quantify the coherent and dissipative coupling strengths and their competition. The effective magnon-photon coupling strength, as a net result of the competition, is analytically presented in quantum theory and shows good agreement with measurements. In this manner, we extend the control dimension of level attraction by tuning the field torque on the magnetization or the global cavity geometry. Our finding opens new routes to engineering coupled harmonic oscillator systems.

preprint2019arXiv

Modular Quantum Computation in a Trapped Ion System

Modern computation relies crucially on modular architectures, breaking a complex algorithm into self-contained subroutines. A client can then call upon a remote server to implement parts of the computation independently via an application programming interface (API). Present APIs relay only classical information. Here we implement a quantum API that enables a client to estimate the absolute value of the trace of a server-provided unitary $U$. We demonstrate that the algorithm functions correctly irrespective of what unitary $U$ the server implements or how the server specifically realizes $U$. Our experiment involves pioneering techniques to coherently swap qubits encoded within the motional states of a trapped Yb ion, controlled on its hyperfine state. This constitutes the first demonstration of modular computation in the quantum regime, providing a step towards scalable parallelization of quantum computation.

preprint2019arXiv

Observation of Rydberg exciton polaritons and their condensate in a perovskite cavity

The condensation of half-light half-matter exciton polaritons in semiconductor optical cavities is a striking example of macroscopic quantum coherence in a solid state platform. Quantum coherence is possible only when there are strong interactions between the exciton polaritons provided by their excitonic constituents. Rydberg excitons with high principal value exhibit strong dipole-dipole interactions in cold atoms. However, polaritons with the excitonic constituent that is an excited state, namely Rydberg exciton polaritons (REPs), have not yet been experimentally observed. Here, for the first time, we observe the formation of REPs in a single crystal CsPbBr3 perovskite cavity without any external fields. These polaritons exhibit strong nonlinear behavior that leads to a coherent polariton condensate with a prominent blue shift. Furthermore, the REPs in CsPbBr3 are highly anisotropic and have a large extinction ratio, arising from the perovskite's orthorhombic crystal structure. Our observation not only sheds light on the importance of many-body physics in coherent polariton systems involving higher-order excited states, but also paves the way for exploring these coherent interactions for solid state quantum optical information processing.

preprint2018arXiv

Observation of acoustic spin

Unlike optical waves, acoustic waves in fluids are described by scalar pressure fields, and therefore are considered spinless. Here, we demonstrate experimentally the existence of spin in acoustics. In the interference of two acoustic waves propagating perpendicularly to each other, we observed the spin angular momentum in free space as a result of the rotation of local particle velocity. We successfully measured the acoustic spin and the spin-induced torque acting on a lossy acoustic meta-atom that results from absorption of the spin angular momentum. The acoustic spin is also observed in the evanescent field of a guided mode traveling along a metamaterial waveguide. We found spin-momentum locking in acoustic waves whose propagation direction is determined by the sign of spin. The observed acoustic spin could open a new door in acoustics and its applications for the control of wave propagation and particle rotation.

preprint2016arXiv

Ultrafast fluorescent decay induced by metal-mediated dipole-dipole interaction in two-dimensional molecular aggregates

A two-dimensional molecular aggregate (2DMA), a thin sheet of strongly interacting dipole molecules self-assembled at close distance on an ordered lattice, is a fascinating fluorescent material. It is distinctively different from the single or colloidal dye molecules or quantum dots in most previous research. In this paper, we verify for the first time that when a 2DMA is placed at a nanometric distance from a metallic substrate, the strong and coherent interaction between the dipoles inside the 2DMA dominates its fluorescent decay at the picosecond timescale. Our streak-camera lifetime measurement and interacting lattice-dipole calculation reveal that the metal-mediated dipole-dipole interaction shortens the fluorescent lifetime to about one half and increases the energy dissipation rate tenfold compared with that expected from the noninteracting single-dipole picture. Our finding can enrich our understanding of nanoscale energy transfer in molecular excitonic systems and may point to a new direction for developing fast and efficient optoelectronic devices.