Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
39 works
0 followers
33 topics
4 close collaborators


Published work

39 published item(s)

preprint · 2026 · arXiv

Hybrid Quantum-MambaVision: A Quantum-enhanced State Space Model for Calibrated Mixed-type Wafer Defect Detection

Extracting actionable knowledge from industrial visual data is fundamentally bottlenecked by extreme class imbalance and the prohibitive computational complexity of modern foundation models. In semiconductor manufacturing, identifying multi-label wafer defects is a complex spatial data mining task where overlapping patterns obscure critical root-cause signals. While Vision Transformers (ViTs) excel at global dependency extraction, their quadratic scaling renders them inefficient for high-throughput, real-time anomaly detection. To overcome these computational barriers, this paper introduces Hybrid Quantum-MambaVision, a highly efficient architecture tailored for spatial knowledge discovery. We integrate a linear-complexity State-Space Model (SSM) backbone with a Parameterized Quantum Context Adapter (QCA) and Low-Rank Adaptation (LoRA). The Mamba backbone efficiently captures long-range spatial dependencies, while the quantum adapter maps compressed latent features into a high-dimensional Hilbert space to disentangle complex, overlapping signatures. On the highly imbalanced MixedWM38 dataset, Hybrid Quantum-MambaVision achieves exceptional multi-label classification performance, significantly reducing the error rate on complex multi-defect topologies compared to classical baselines. The quantum regularizer acts as a profound uncertainty calibrator, substantially reducing Maximum Calibration Error (MCE) and minimizing expected false-positive costs. This work establishes a scalable Quantum-Classical hybrid paradigm for efficient representation learning in industrial data mining.

preprint · 2026 · arXiv

MAGE: Safeguarding LLM Agents against Long-Horizon Threats via Shadow Memory

As large language model (LLM)-powered agents are increasingly deployed to perform complex, real-world tasks, they face a growing class of attacks that exploit extended user-agent-environment interactions to pursue malicious objectives improbable in single-turn settings. Such long-horizon threats pose significant risks to the safe deployment of LLM agents in critical domains. In this paper, we present MAGE (Memory As Guardrail Enforcement), a novel defensive framework designed to counter a wide range of long-horizon threats. Inspired by the "shadow stack" abstraction in systems security, MAGE maintains a dedicated, safety-focused agentic memory that distills and retains safety-critical context across the agent's full execution trajectory, leveraging this shadow memory to proactively assess the risk of pending actions prior to their execution. Extensive evaluation demonstrates that MAGE substantially outperforms existing defenses across diverse long-horizon threats in detection accuracy, achieves early-stage detection for the majority of attacks, and introduces only negligible overhead to agent utility. To the best of our knowledge, MAGE represents the first framework to detect and mitigate long-horizon threats using an agentic memory approach, establishing a new paradigm for this critical challenge and opening promising directions for future research.

preprint · 2025 · arXiv

Mitigating Traffic Oscillations in Mixed Traffic Flow with Scalable Deep Koopman Predictive Control

Mitigating traffic oscillations in mixed flows of connected automated vehicles (CAVs) and human-driven vehicles (HDVs) is critical for enhancing traffic stability. A key challenge lies in modeling the nonlinear, heterogeneous behaviors of HDVs within computationally tractable predictive control frameworks. This study proposes an adaptive deep Koopman predictive control framework (AdapKoopPC) to address this issue. The framework features a novel deep Koopman network, AdapKoopnet, which represents complex HDV car-following dynamics as a linear system in a high-dimensional space by adaptively learning from naturalistic data. This learned linear representation is then embedded into a Model Predictive Control (MPC) scheme, enabling real-time, scalable, and optimal control of CAVs. We validate our framework using the HighD dataset and extensive numerical simulations. Results demonstrate that AdapKoopnet achieves superior trajectory prediction accuracy over baseline models. Furthermore, the complete AdapKoopPC controller significantly dampens traffic oscillations with lower computational cost, exhibiting strong performance even at low CAV penetration rates. The proposed framework offers a scalable and data-driven solution for enhancing stability in realistic mixed traffic environments. The code is made publicly available.

preprint · 2025 · arXiv

New Exam Security Questions in the AI Era: Comparing AI-Generated Item Similarity Between Naive and Detail-Guided Prompting Approaches

Large language models (LLMs) have emerged as powerful tools for generating domain-specific multiple-choice questions (MCQs), offering efficiency gains for certification boards but raising new concerns about examination security. This study investigated whether LLM-generated items created with proprietary guidance differ meaningfully from those generated using only publicly available resources. Four representative clinical activities from the American Board of Family Medicine (ABFM) blueprint were mapped to corresponding Entrustable Professional Activities (EPAs), and three LLMs (GPT-4o, Claude 4 Sonnet, Gemini 2.5 Flash) produced items under a naive strategy using only public EPA descriptors, while GPT-4o additionally produced items under a guided strategy that incorporated proprietary blueprints, item-writing guidelines, and exemplar items, yielding 160 total items. Question stems and options were encoded using PubMedBERT and BioBERT, and intra- and inter-strategy cosine similarity coefficients were calculated. Results showed high internal consistency within each prompting strategy, while cross-strategy similarity was lower overall. However, several domain-model pairs, particularly in narrowly defined areas such as viral pneumonia and hypertension, exceeded the 0.65 threshold, indicating convergence between naive and guided pipelines. These findings suggest that while proprietary resources impart distinctiveness, LLMs prompted only with public information can still generate items closely resembling guided outputs in constrained clinical domains, thereby heightening risks of item exposure. Safeguarding the integrity of high-stakes examinations will require human-first, AI-assisted item development, strict separation of formative and summative item pools, and systematic similarity surveillance to balance innovation with security.
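
The similarity check described in this abstract can be illustrated with a minimal sketch. It assumes item embeddings are already available as plain vectors; the toy 3-dimensional vectors, the item identifiers and the function names below are hypothetical and are not taken from the study, which used PubMedBERT and BioBERT encodings. Only the 0.65 threshold comes from the abstract.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def flag_similar_pairs(naive, guided, threshold=0.65):
    """Return (naive_id, guided_id, similarity) for cross-strategy pairs
    whose cosine similarity exceeds the threshold."""
    flagged = []
    for n_id, n_vec in naive.items():
        for g_id, g_vec in guided.items():
            sim = cosine_similarity(n_vec, g_vec)
            if sim > threshold:
                flagged.append((n_id, g_id, round(sim, 3)))
    return flagged


# Hypothetical toy embeddings standing in for encoded question stems.
naive_items = {"naive_1": [1.0, 0.0, 0.0], "naive_2": [0.0, 1.0, 0.0]}
guided_items = {"guided_1": [0.9, 0.1, 0.0], "guided_2": [0.0, 0.0, 1.0]}

flagged = flag_similar_pairs(naive_items, guided_items)
print(flagged)
```

With these toy vectors only the pair (naive_1, guided_1) is flagged, because the two vectors point in nearly the same direction. In a surveillance setting the same loop would run over real embeddings of the naive and guided item pools.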

preprint · 2024 · arXiv

AdvSQLi: Generating Adversarial SQL Injections against Real-world WAF-as-a-service

As the first defensive layer that attacks would hit, the web application firewall (WAF) plays an indispensable role in defending against malicious web attacks like SQL injection (SQLi). With the development of cloud computing, WAF-as-a-service, as one kind of Security-as-a-service, has been proposed to facilitate the deployment, configuration, and update of WAFs in the cloud. Despite its tremendous popularity, the security vulnerabilities of WAF-as-a-service are still largely unknown, which is highly concerning given its massive usage. In this paper, we propose a general and extendable attack framework, namely AdvSQLi, in which a minimal series of transformations are performed on the hierarchical tree representation of the original SQLi payload, such that the generated SQLi payloads can not only bypass WAF-as-a-service under black-box settings but also keep the same functionality and maliciousness as the original payload. With AdvSQLi, we make it feasible to inspect and understand the security vulnerabilities of WAFs automatically, helping vendors make products more secure. To evaluate the attack effectiveness and efficiency of AdvSQLi, we first employ two public datasets to generate adversarial SQLi payloads, leading to a maximum attack success rate of 100% against state-of-the-art ML-based SQLi detectors. Furthermore, to demonstrate the immediate security threats caused by AdvSQLi, we evaluate the attack effectiveness against 7 WAF-as-a-service solutions from mainstream vendors and find all of them are vulnerable to AdvSQLi. For instance, AdvSQLi achieves an attack success rate of over 79% against the F5 WAF. Through in-depth analysis of the evaluation results, we further condense out several general yet severe flaws of these vendors that cannot be easily patched.

preprint · 2023 · arXiv

Graph based Environment Representation for Vision-and-Language Navigation in Continuous Environments

Vision-and-Language Navigation in Continuous Environments (VLN-CE) is a navigation task that requires an agent to follow a language instruction in a realistic environment. Understanding the environment is a crucial part of the VLN-CE task, but existing methods understand the environment in a relatively simple and direct way, without delving into the relationship between language instructions and visual environments. We therefore propose a new environment representation to address these problems. First, we propose an Environment Representation Graph (ERG), built through object detection, to express the environment at the semantic level. This operation strengthens the relationship between language and environment. Then, the object-object and object-agent relational representations in the ERG are learned through a GCN, so as to obtain a continuous expression of the ERG. Subsequently, we combine the ERG expression with object label embeddings to obtain the environment representation. Finally, a new cross-modal attention navigation framework is proposed, incorporating our environment representation and a special loss function dedicated to training the ERG. Experimental results show that our method achieves satisfactory performance in terms of success rate on VLN-CE tasks. Further analysis shows that our method attains better cross-modal matching and strong generalization ability.

preprint · 2022 · arXiv

A Comprehensive Survey of Few-shot Learning: Evolution, Applications, Challenges, and Opportunities

Few-shot learning (FSL) has emerged as an effective learning method and shows great potential. Despite the recent creative works in tackling FSL tasks, learning valid information rapidly from just a few or even zero samples still remains a serious challenge. In this context, we extensively investigated 200+ latest papers on FSL published in the past three years, aiming to present a timely and comprehensive overview of the most recent advances in FSL along with impartial comparisons of the strengths and weaknesses of the existing works. For the sake of avoiding conceptual confusion, we first elaborate and compare a set of similar concepts including few-shot learning, transfer learning, and meta-learning. Furthermore, we propose a novel taxonomy to classify the existing work according to the level of abstraction of knowledge in accordance with the challenges of FSL. To enrich this survey, in each subsection we provide in-depth analysis and insightful discussion about recent advances on these topics. Moreover, taking computer vision as an example, we highlight the important application of FSL, covering various research hotspots. Finally, we conclude the survey with unique insights into the technology evolution trends together with potential future research opportunities in the hope of providing guidance to follow-up research.

preprint · 2022 · arXiv

Confidence Matters: Inspecting Backdoors in Deep Neural Networks via Distribution Transfer

Backdoor attacks have been shown to be a serious security threat against deep learning models, and detecting whether a given model has been backdoored becomes a crucial task. Existing defenses are mainly built upon the observation that the backdoor trigger is usually of small size or affects the activation of only a few neurons. However, the above observations are violated in many cases especially for advanced backdoor attacks, hindering the performance and applicability of the existing defenses. In this paper, we propose a backdoor defense DTInspector built upon a new observation. That is, an effective backdoor attack usually requires high prediction confidence on the poisoned training samples, so as to ensure that the trained model exhibits the targeted behavior with a high probability. Based on this observation, DTInspector first learns a patch that could change the predictions of most high-confidence data, and then decides the existence of backdoor by checking the ratio of prediction changes after applying the learned patch on the low-confidence data. Extensive evaluations on five backdoor attacks, four datasets, and three advanced attacking types demonstrate the effectiveness of the proposed defense.

preprint · 2022 · arXiv

Construction of Strong-Uniform Fuzzy Partitions of Arbitrary Dimensions

Strong-uniform fuzzy partitions are necessary for the accuracy of fuzzy partition-based histograms. Most previous research has focused on constructing one-dimensional strong-uniform fuzzy partitions, while, to the best of our knowledge, few results have been reported for high-dimensional cases. To fill this theoretical gap, this paper proves the existence of high-dimensional strong-uniform fuzzy partitions by proposing an analytic formula that constructs strong-uniform fuzzy partitions of arbitrary dimensions.

preprint · 2022 · arXiv

Convergence Rate of Inertial Forward-Backward Algorithms Based on the Local Error Bound Condition

The inertial forward-backward algorithm (IFB) is a powerful tool for convex nonsmooth minimization problems. It gives rise to the well-known fast iterative shrinkage-thresholding algorithm (FISTA), which enjoys an $O\left(\frac{1}{k^2}\right)$ global convergence rate of function values, although no convergence of the iterates has been proved. With a small modification, an accelerated IFB called FISTA_CD improves the convergence rate of function values to $o\left(\frac{1}{k^2}\right)$ and shows weak convergence of the iterates. The local error bound condition is extremely useful in analyzing the convergence rates of a host of iterative methods for solving optimization problems, and in practice a large number of problems with special structure satisfy the error bound condition. It is therefore natural to use the local error bound condition to derive or improve the convergence rate of IFB. In this paper, based on the local error bound condition, we introduce a new assumption on the important parameter $t_k$ in IFB, and establish an improved convergence rate of function values and strong convergence of the iterates generated by the IFB algorithms in Hilbert space, for six choices of $t_k$ satisfying this assumption. Remarkably, under the local error bound condition, we establish strong convergence of the iterates generated by the original FISTA, prove that the convergence rate of function values for FISTA_CD is actually related to the value of the parameter $a$, and show that the IFB algorithms with some of the $t_k$ mentioned above can achieve a sublinear convergence rate $o\left(\frac{1}{k^p}\right)$ for any positive integer $p>1$. Numerical experiments are conducted to illustrate our results.
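
For readers unfamiliar with the algorithm family discussed in this abstract, the following is a minimal sketch of the original FISTA iteration with its standard rule $t_{k+1} = \frac{1 + \sqrt{1 + 4 t_k^2}}{2}$. The test problem (an $\ell_1$-regularized least-squares objective), the helper names and the requirement that the caller supplies a Lipschitz constant are assumptions made for illustration; the paper's alternative choices of $t_k$ and its error bound analysis are not reproduced here.

```python
import math


def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (the backward step)."""
    return [math.copysign(max(abs(x) - tau, 0.0), x) for x in v]


def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def grad_least_squares(A, x, b):
    """Gradient of 0.5 * ||Ax - b||^2, namely A^T (Ax - b)."""
    r = [ax - bi for ax, bi in zip(matvec(A, x), b)]
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(x))]


def fista(A, b, lam, lipschitz, iters=200):
    """Minimize 0.5 * ||Ax - b||^2 + lam * ||x||_1 with standard FISTA.

    lipschitz must be an upper bound on the largest eigenvalue of A^T A.
    """
    n = len(A[0])
    x = [0.0] * n
    y = x[:]
    t = 1.0
    step = 1.0 / lipschitz
    for _ in range(iters):
        g = grad_least_squares(A, y, b)
        # Forward (gradient) step on the smooth part, backward (prox) step on the l1 part.
        x_new = soft_threshold([yi - step * gi for yi, gi in zip(y, g)], step * lam)
        # Inertial parameter update and extrapolation.
        t_new = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = [xn + ((t - 1.0) / t_new) * (xn - xo) for xn, xo in zip(x_new, x)]
        x, t = x_new, t_new
    return x


# With A equal to the identity, the minimizer is the soft-thresholding of b.
identity = [[1.0, 0.0], [0.0, 1.0]]
solution = fista(identity, [3.0, -0.5], lam=1.0, lipschitz=1.0)
print(solution)
```

The identity-matrix case is chosen because its minimizer is known in closed form: each coordinate of b is shrunk toward zero by lam, so the first coordinate becomes 2 and the second becomes 0.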

preprint · 2022 · arXiv

Eliminating Backdoor Triggers for Deep Neural Networks Using Attention Relation Graph Distillation

Due to the prosperity of Artificial Intelligence (AI) techniques, more and more backdoors are designed by adversaries to attack Deep Neural Networks (DNNs). Although the state-of-the-art method Neural Attention Distillation (NAD) can effectively erase backdoor triggers from DNNs, it still suffers from a non-negligible Attack Success Rate (ASR) together with lowered classification ACCuracy (ACC), since NAD focuses on backdoor defense using attention features (i.e., attention maps) of the same order. In this paper, we introduce a novel backdoor defense framework named Attention Relation Graph Distillation (ARGD), which fully explores the correlation among attention features of different orders using our proposed Attention Relation Graphs (ARGs). Based on the alignment of ARGs between the teacher and student models during knowledge distillation, ARGD can eradicate more backdoor triggers than NAD. Comprehensive experimental results show that, against six recent backdoor attacks, ARGD outperforms NAD by up to a 94.85% reduction in ASR, while ACC can be improved by up to 3.23%.

preprint · 2022 · arXiv

FedEntropy: Efficient Device Grouping for Federated Learning Using Maximum Entropy Judgment

Along with the popularity of Artificial Intelligence (AI) and the Internet of Things (IoT), Federated Learning (FL) has attracted steadily increasing attention as a promising distributed machine learning paradigm, which enables the training of a central model for numerous decentralized devices without exposing their private data. However, due to the biased data distributions on the involved devices, FL inherently suffers from low classification accuracy in non-IID scenarios. Although various device grouping methods have been proposed to address this problem, most of them neglect both i) the distinct data distribution characteristics of heterogeneous devices, and ii) the contributions and hazards of local models, which are extremely important in determining the quality of global model aggregation. In this paper, we present an effective FL method named FedEntropy with a novel dynamic device grouping scheme, which makes full use of the above two factors based on our proposed maximum entropy judgement heuristic. Unlike existing FL methods that directly aggregate the local models returned from all selected devices, in each FL round FedEntropy first makes a judgement based on the pre-collected soft labels of the selected devices and then aggregates only the local models that maximize the overall entropy of these soft labels. By not collecting local models that are harmful for aggregation, FedEntropy can effectively improve global model accuracy while reducing the overall communication overhead. Comprehensive experimental results on well-known benchmarks show that FedEntropy not only outperforms state-of-the-art FL methods in terms of model accuracy and communication overhead, but can also be integrated into them to enhance their classification performance.
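
The idea of keeping only the devices whose soft labels maximize overall entropy can be sketched as follows. This is one possible reading of the abstract, not the FedEntropy algorithm itself: the greedy removal loop, the use of the mean soft label as the "overall" distribution, the function names and the toy soft labels are all assumptions made for illustration.

```python
import math


def entropy(p):
    """Shannon entropy (natural log) of a probability vector."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


def mean_soft_label(soft_labels, ids):
    """Average the soft labels of the given device ids, class by class."""
    k = len(soft_labels[ids[0]])
    return [sum(soft_labels[i][c] for i in ids) / len(ids) for c in range(k)]


def select_devices(soft_labels):
    """Greedily drop devices while doing so raises the entropy of the mean soft label.

    Hypothetical heuristic: start from all devices and repeatedly remove the
    single device whose removal gives the largest entropy, stopping as soon as
    no removal improves on the current entropy.
    """
    selected = sorted(soft_labels)
    best = entropy(mean_soft_label(soft_labels, selected))
    while len(selected) > 1:
        candidates = []
        for d in selected:
            rest = [i for i in selected if i != d]
            candidates.append((entropy(mean_soft_label(soft_labels, rest)), d))
        cand_entropy, drop = max(candidates)
        if cand_entropy <= best:
            break
        selected.remove(drop)
        best = cand_entropy
    return selected, best


# Toy two-class soft labels for three hypothetical devices.
device_soft_labels = {0: [0.9, 0.1], 1: [0.1, 0.9], 2: [0.95, 0.05]}
selected, best_entropy = select_devices(device_soft_labels)
print(selected, best_entropy)
```

In this toy case devices 0 and 2 are both skewed toward the first class, so dropping device 2 leaves a balanced pair whose mean soft label is uniform. The server would then request local models only from the selected devices, which is where the communication saving described in the abstract would come from.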

preprint · 2022 · arXiv

Gate-Tunable Spin-Orbit Coupling in a Germanium Hole Double Quantum Dot

Hole spins confined in semiconductor quantum dot systems have gained considerable interest for their strong spin-orbit interactions (SOIs) and relatively weak hyperfine interactions. Here we experimentally demonstrate a tunable SOI in a double quantum dot in a germanium (Ge) hut wire (HW), which could help enable fast all-electric spin manipulations while suppressing unwanted decoherence. Specifically, we measure the transport spectra in the Pauli spin blockade regime in the double quantum dot device. By adjusting the interdot tunnel coupling, we obtain an electric-field-tuned spin-orbit length $l_{so}$ of 2.0 to 48.9 nm. This tunability of the SOI could pave the way toward the realization of high-fidelity qubits in Ge HW systems.

preprint · 2022 · arXiv

Hybrid integration of deterministic quantum dots-based single-photon sources with CMOS-compatible silicon carbide photonics

Thin-film 4H-silicon carbide (4H-SiC) is emerging as a contender for realizing large-scale optical quantum circuits due to its high CMOS technology compatibility and large optical nonlinearities. However, challenges remain in producing wafer-scale 4H-SiC thin film on insulator (4H-SiCOI) for dense integration of photonic circuits, and in efficiently coupling the deterministic quantum emitters that are essential for scalable quantum photonics. Here we demonstrate hybrid integration of single-photon sources (SPSs) based on self-assembled InGaAs quantum dots (QDs) with wafer-scale 4H-SiC photonic chips prepared by the ion slicing technique. By designing a bilayer vertical coupler, we realize generation and highly efficient routing of single-photon emission in the hybrid quantum photonic chip. Furthermore, we realize a chip-integrated beamsplitter operation for triggered single photons by fabricating a 1x2 multi-mode interferometer (MMI) with a symmetric power splitting ratio of 50:50. The successful demonstration of heterogeneously integrating QD-based SPSs on a 4H-SiC photonic chip prepared by the ion slicing technique constitutes an important step toward CMOS-compatible, fast reconfigurable quantum photonic circuits with deterministic SPSs.

preprint · 2022 · arXiv

Machine Learning Empowered Intelligent Data Center Networking: A Survey

To support the needs of ever-growing cloud-based services, the number of servers and network devices in data centers is increasing exponentially, which in turn results in high complexity and difficulty in network optimization. To address these challenges, both academia and industry have turned to artificial intelligence technology to realize network intelligence, and a considerable number of novel and creative machine learning-based (ML-based) research works have been put forward in recent years. Nevertheless, the intelligent optimization of data center networks (DCNs) still faces enormous challenges, especially in the scenario of online, real-time, dynamic processing of massive heterogeneous services and traffic data. To the best of our knowledge, there is a lack of systematic, original and comprehensive investigations with in-depth analysis of intelligent DCNs. In this paper, we therefore comprehensively investigate the application of machine learning to data center networking, and provide a general overview and in-depth analysis of recent works, covering flow prediction, flow classification, load balancing, resource management, routing optimization, and congestion control. To provide a multi-dimensional and multi-perspective comparison of the various solutions, we design a set of quality assessment criteria called REBEL-3S to impartially measure the strengths and weaknesses of these research works. Moreover, we present unique insights into the technology evolution of the fusion of data center networks and machine learning, together with some challenges and potential future research opportunities.

preprint · 2022 · arXiv

Manu: A Cloud Native Vector Database Management System

With the development of learning-based embedding models, embedding vectors are widely used for analyzing and searching unstructured data. As vector collections exceed billion-scale, fully managed and horizontally scalable vector databases are necessary. In the past three years, through interaction with our 1200+ industry users, we have sketched a vision for the features that next-generation vector databases should have, which include long-term evolvability, tunable consistency, good elasticity, and high performance. We present Manu, a cloud native vector database that implements these features. It is difficult to integrate all these features if we follow traditional DBMS design rules. As most vector data applications do not require complex data models and strong data consistency, our design philosophy is to relax the data model and consistency constraints in exchange for the aforementioned features. Specifically, Manu firstly exposes the write-ahead log (WAL) and binlog as backbone services. Secondly, write components are designed as log publishers while all read-only analytic and search components are designed as independent subscribers to the log services. Finally, we utilize multi-version concurrency control (MVCC) and a delta consistency model to simplify the communication and cooperation among the system components. These designs achieve a low coupling among the system components, which is essential for elasticity and evolution. We also extensively optimize Manu for performance and usability with hardware-aware implementations and support for complex search semantics.

preprint · 2022 · arXiv

Model-Contrastive Learning for Backdoor Defense

Due to the popularity of Artificial Intelligence (AI) techniques, we are witnessing an increasing number of backdoor injection attacks that are designed to maliciously threaten Deep Neural Networks (DNNs) causing misclassification. Although there exist various defense methods that can effectively erase backdoors from DNNs, they greatly suffer from both high Attack Success Rate (ASR) and a non-negligible loss in Benign Accuracy (BA). Inspired by the observation that a backdoored DNN tends to form a new cluster in its feature spaces for poisoned data, in this paper we propose a novel two-stage backdoor defense method, named MCLDef, based on Model-Contrastive Learning (MCL). In the first stage, our approach performs trigger inversion based on trigger synthesis, where the resultant trigger can be used to generate poisoned data. In the second stage, under the guidance of MCL and our defined positive and negative pairs, MCLDef can purify the backdoored model by pulling the feature representations of poisoned data towards those of their clean data counterparts. Due to the shrunken cluster of poisoned data, the backdoor formed by end-to-end supervised learning is eliminated. Comprehensive experimental results show that, with only 5% of clean data, MCLDef significantly outperforms state-of-the-art defense methods by up to 95.79% reduction in ASR, while in most cases the BA degradation can be controlled within less than 2%. Our code is available at https://github.com/WeCanShow/MCL.

preprint · 2022 · arXiv

Monolithic Integration of Embedded III-V Lasers on SOI

Silicon photonic integration has achieved great success in many application fields owing to its excellent optical device properties and complementary metal-oxide-semiconductor (CMOS) compatibility. Monolithic integration of III-V lasers and silicon photonic components on a single silicon wafer is recognized as a long-standing obstacle for ultra-dense photonic integration. Such integration could provide economical, energy-efficient and foundry-scalable on-chip light sources, but it has not been reported yet. Here, we demonstrate embedded InAs/GaAs quantum dot (QD) lasers directly grown on a trenched silicon-on-insulator (SOI) substrate, enabling monolithic integration with butt-coupled silicon waveguides. By utilizing patterned grating structures inside pre-defined SOI trenches and a unique epitaxial method via molecular beam epitaxy (MBE), high-performance embedded InAs QD lasers with out-coupled silicon waveguides are achieved on this template. By resolving the epitaxy and fabrication challenges of this monolithically integrated architecture, embedded III-V lasers on SOI with continuous-wave lasing up to 85 °C are obtained. A maximum output power of 6.8 mW is measured from the end tip of the butt-coupled silicon waveguides, with an estimated coupling efficiency of approximately -7.35 dB. The results presented here provide a scalable and low-cost epitaxial method for realizing on-chip light sources that couple directly to silicon photonic components for future high-density photonic integration.

preprint · 2022 · arXiv

Multi-agent Reinforcement Learning for Dynamic Resource Management in 6G in-X Subnetworks

The 6G network enables a subnetwork-wide evolution, resulting in a "network of subnetworks". However, due to the dynamic mobility of wireless subnetworks, intra-subnetwork and inter-subnetwork data transmissions will inevitably interfere with each other, which poses a great challenge to radio resource management. Moreover, most existing approaches require the instantaneous channel gains between subnetworks, which are usually difficult to collect. To tackle these issues, in this paper we propose a novel and effective intelligent radio resource management method using multi-agent deep reinforcement learning (MARL), which needs only the sum of received power on each channel, known as the received signal strength indicator (RSSI), instead of channel gains. However, directly separating individual interference from the RSSI is almost impossible. To this end, we further propose a novel MARL architecture, named GA-Net, which integrates a hard attention layer to model the importance distribution of inter-subnetwork relationships based on RSSI and exclude the impact of unrelated subnetworks, and employs a graph attention network with a multi-head attention layer to extract the features and calculate the weights that affect individual throughput. Experimental results show that our proposed framework significantly outperforms both traditional and MARL-based methods in various aspects.

preprint · 2022 · arXiv

Multi-Document Scientific Summarization from a Knowledge Graph-Centric View

Multi-Document Scientific Summarization (MDSS) aims to produce coherent and concise summaries for clusters of topic-relevant scientific papers. This task requires precise understanding of paper content and accurate modeling of cross-paper relationships. Knowledge graphs convey compact and interpretable structured information for documents, which makes them ideal for content modeling and relationship modeling. In this paper, we present KGSum, an MDSS model centred on knowledge graphs during both the encoding and decoding process. Specifically, in the encoding process, two graph-based modules are proposed to incorporate knowledge graph information into paper encoding, while in the decoding process, we propose a two-stage decoder by first generating knowledge graph information of summary in the form of descriptive sentences, followed by generating the final summary. Empirical results show that the proposed architecture brings substantial improvements over baselines on the Multi-Xscience dataset.

preprint · 2022 · arXiv

Over-the-Air Federated Learning via Second-Order Optimization

Federated learning (FL) is a promising learning paradigm that can tackle the increasingly prominent isolated data islands problem while keeping users' data locally with privacy and security guarantees. However, FL could result in task-oriented data traffic flows over wireless networks with limited radio resources. To design communication-efficient FL, most of the existing studies employ the first-order federated optimization approach that has a slow convergence rate. This however results in excessive communication rounds for local model updates between the edge devices and edge server. To address this issue, in this paper, we instead propose a novel over-the-air second-order federated optimization algorithm to simultaneously reduce the communication rounds and enable low-latency global model aggregation. This is achieved by exploiting the waveform superposition property of a multi-access channel to implement the distributed second-order optimization algorithm over wireless networks. The convergence behavior of the proposed algorithm is further characterized, which reveals a linear-quadratic convergence rate with an accumulative error term in each iteration. We thus propose a system optimization approach to minimize the accumulated error gap by joint device selection and beamforming design. Numerical results demonstrate the system and communication efficiency compared with the state-of-the-art approaches.

preprint2022arXiv

The Variable Volatility Elasticity Model from Commodity Markets

In this paper, we propose and study a novel continuous-time model, based on the well-known constant elasticity of variance (CEV) model, to describe the asset price process. The basic idea is that the volatility elasticity of the CEV model cannot be treated as a constant from the perspective of stochastic analysis. To address this issue, we deduce the price process of assets from the perspective of volatility elasticity, propose the constant volatility elasticity (CVE) model, and further derive a more general variable volatility elasticity (VVE) model. Moreover, our model can describe the positive correlation between volatility and asset prices existing in the commodity markets, while the CEV model can only describe the negative correlation. Empirical research on the financial market shows that many assets, especially commodities, often exhibit this positive correlation in some time periods, which indicates that our model has strong practical value. Finally, we provide an explicit pricing formula for European options based on our model. This formula has an elegant form that is convenient to calculate, is similar to the renowned Black-Scholes formula, and is of great significance to research on the derivatives market.

preprint2022arXiv

To Simulate the Spread of Infectious Diseases by the Random Matrix

The main aim of building models capable of simulating the spread of infectious diseases is to control them. The key to finding the optimal strategy for disease control is to obtain a large number of simulations of disease transitions under different scenarios. Therefore, models that are efficient and can simulate the spread of diseases under scenarios closer to reality are preferred. In realistic social networks, random contact, including contact between people in public places and on public transit, is an important route for the spread of infectious diseases. In this paper, a model that can efficiently simulate the spread of infectious diseases under random contacts is proposed. In this approach, the random contact between people is characterized by a random matrix with randomly generated elements, and the spread of the disease is simulated by a Markov process. We report an interesting property of the proposed model: the main indicators of the spread of the disease, such as the death rate, are invariant to the size of the population. Therefore, representative simulations can be conducted on models consisting of a small population. The main advantage of this model is that it can easily simulate the spread of diseases under more realistic scenarios and can thus provide the large number of simulations needed to search for the optimal control strategy. Building on this work, reinforcement learning will be introduced in follow-up work to find the optimal control strategy.
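A minimal sketch of the idea described above, under assumptions that are not taken from the paper: a four-state chain (susceptible, infected, recovered, dead), a fresh Bernoulli contact matrix at every step, and hypothetical parameter names and values (`p_contact`, `p_transmit`, `p_recover`, `p_die`). Each person's next state depends only on the current states and the freshly drawn contacts, which is what makes the process Markov.

```python
import random

S, I, R, D = "S", "I", "R", "D"


def simulate(n=200, steps=100, p_contact=0.02, p_transmit=0.3,
             p_recover=0.1, p_die=0.01, initial_infected=5, seed=0):
    """Simulate disease spread with a random contact matrix per time step."""
    rng = random.Random(seed)
    state = [I] * initial_infected + [S] * (n - initial_infected)
    for _ in range(steps):
        infected = [k for k, s in enumerate(state) if s == I]
        if not infected:
            break  # no infected people left, the epidemic has ended
        new_state = list(state)
        for i, s in enumerate(state):
            if s == S:
                # Sample only the contact-matrix entries linking person i to
                # infected people; other entries cannot change i's state.
                for _j in infected:
                    if rng.random() < p_contact and rng.random() < p_transmit:
                        new_state[i] = I
                        break
            elif s == I:
                u = rng.random()
                if u < p_die:
                    new_state[i] = D
                elif u < p_die + p_recover:
                    new_state[i] = R
        state = new_state
    return {label: state.count(label) for label in (S, I, R, D)}


if __name__ == "__main__":
    print(simulate())
    print(simulate(n=2000, initial_infected=50))
```

Running the sketch for different values of `n` gives a rough way to explore how indicators such as the fraction of deaths behave as the population grows; the sketch itself makes no claim about the invariance property reported in the abstract.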

preprint2022arXiv

Towards Fast and Accurate Federated Learning with non-IID Data for Cloud-Based IoT Applications

As a promising method of central model training on decentralized device data while securing user privacy, Federated Learning (FL) is becoming popular in Internet of Things (IoT) design. However, when the data collected by IoT devices are highly skewed in a non-independent and identically distributed (non-IID) manner, the accuracy of the vanilla FL method cannot be guaranteed. Although there exist various solutions that try to address the bottleneck of FL with non-IID data, most of them suffer from intolerable extra communication overhead and low model accuracy. To enable fast and accurate FL, this paper proposes a novel data-based device grouping approach that can effectively reduce the disadvantages of weight divergence during training on non-IID data. However, since our grouping method is based on the similarity of extracted feature maps from IoT devices, it may incur additional risks of privacy exposure. To solve this problem, we propose an improved version by exploiting similarity information using the Locality-Sensitive Hashing (LSH) algorithm without exposing extracted feature maps. Comprehensive experimental results on well-known benchmarks show that our approach can not only accelerate the convergence rate, but also improve the prediction accuracy for FL with non-IID data.
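The grouping step can be sketched as follows. The abstract does not say which LSH family the paper uses, so this example swaps in random-hyperplane hashing (a common choice for cosine similarity) as an assumption; the helper names `make_hyperplanes`, `lsh_signature`, and `group_devices` are hypothetical. Each device shares only a short bit signature of its feature vector, and devices with matching signatures land in the same group.

```python
import random
from collections import defaultdict


def make_hyperplanes(dim, num_bits, seed=0):
    """Draw random hyperplanes shared by all devices (assumed pre-agreed)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def lsh_signature(features, hyperplanes):
    """Return one bit per hyperplane: which side the feature vector lies on."""
    bits = []
    for plane in hyperplanes:
        dot = sum(f * w for f, w in zip(features, plane))
        bits.append("1" if dot >= 0 else "0")
    return "".join(bits)


def group_devices(device_features, hyperplanes):
    """Group device ids by signature; raw features never leave this function."""
    groups = defaultdict(list)
    for device_id, features in device_features.items():
        groups[lsh_signature(features, hyperplanes)].append(device_id)
    return dict(groups)


if __name__ == "__main__":
    planes = make_hyperplanes(dim=3, num_bits=8)
    devices = {
        "d1": [1.0, 2.0, 3.0],
        "d2": [2.0, 4.0, 6.0],     # same direction as d1
        "d3": [-1.0, -2.0, -3.0],  # opposite direction
    }
    print(group_devices(devices, planes))
```

More bits give finer groups at the cost of a longer signature; a real system would also have to decide how devices agree on the hyperplanes, which this sketch leaves out.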

preprint2022arXiv

Transfer Attacks Revisited: A Large-Scale Empirical Study in Real Computer Vision Settings

One intriguing property of adversarial attacks is their "transferability" -- an adversarial example crafted with respect to one deep neural network (DNN) model is often found effective against other DNNs as well. Intensive research has been conducted on this phenomenon under simplistic controlled conditions. Yet, thus far, there is still a lack of comprehensive understanding about transferability-based attacks ("transfer attacks") in real-world environments. To bridge this critical gap, we conduct the first large-scale systematic empirical study of transfer attacks against major cloud-based MLaaS platforms, taking the components of a real transfer attack into account. The study leads to a number of interesting findings which are inconsistent with the existing ones, including: (1) Simple surrogates do not necessarily improve real transfer attacks. (2) No dominant surrogate architecture is found in real transfer attacks. (3) It is the gap between posteriors (outputs of the softmax layer) rather than the gap between logits (the so-called $\kappa$ value) that increases transferability. Moreover, by comparing with prior works, we demonstrate that transfer attacks possess many previously unknown properties in real-world environments, such as (1) Model similarity is not a well-defined concept. (2) The $L_2$ norm of the perturbation can generate high transferability without usage of gradient and is a more powerful source than the $L_\infty$ norm. We believe this work sheds light on the vulnerabilities of popular MLaaS platforms and points to a few promising research directions.

preprint2022arXiv

Ultrafast coherent control of a hole spin qubit in a germanium quantum dot

Operation speed and coherence time are two core measures for the viability of a qubit. Strong spin-orbit interaction (SOI) and relatively weak hyperfine interaction make holes in germanium (Ge) intriguing candidates for spin qubits with rapid, all-electrical coherent control. Here we report ultrafast single-spin manipulation in a hole-based double quantum dot in a germanium hut wire (GHW). Mediated by the strong SOI, a Rabi frequency exceeding 540 MHz is observed at a magnetic field of 100 mT, setting a record for ultrafast spin qubit control in semiconductor systems. We demonstrate that the strong SOI of heavy holes (HHs) in our GHW, characterized by a very short spin-orbit length of 1.5 nm, enables the rapid gate operations we accomplish. Our results demonstrate the potential of ultrafast coherent control of hole spin qubits to meet the requirement of DiVincenzo's criteria for a scalable quantum information processor.

preprint2021arXiv

i-Algebra: Towards Interactive Interpretability of Deep Neural Networks

Providing explanations for deep neural networks (DNNs) is essential for their use in domains wherein the interpretability of decisions is a critical prerequisite. Despite the plethora of work on interpreting DNNs, most existing solutions offer interpretability in an ad hoc, one-shot, and static manner, without accounting for the perception, understanding, or response of end-users, resulting in their poor usability in practice. In this paper, we argue that DNN interpretability should be implemented as the interactions between users and models. We present i-Algebra, a first-of-its-kind interactive framework for interpreting DNNs. At its core is a library of atomic, composable operators, which explain model behaviors at varying input granularity, during different inference stages, and from distinct interpretation perspectives. Leveraging a declarative query language, users are enabled to build various analysis tools (e.g., "drill-down", "comparative", "what-if" analysis) via flexibly composing such operators. We prototype i-Algebra and conduct user studies in a set of representative analysis tasks, including inspecting adversarial inputs, resolving model inconsistency, and cleansing contaminated data, all demonstrating its promising usability.

preprint2021arXiv

Position-dependent chiral coupling between single quantum dots and cross waveguides

Chiral light-matter interaction between photonic nanostructures with quantum emitters shows great potential to implement spin-photon interfaces for quantum information processing. Position-dependent spin momentum locking of the quantum emitter is important for these chiral coupled nanostructures. Here, we report the position-dependent chiral coupling between quantum dots (QDs) and cross waveguides both numerically and experimentally. Four quantum dots distributed at different positions in the cross section are selected to characterize the chiral properties of the device. Directional emission is achieved in a single waveguide as well as in both two waveguides simultaneously. In addition, the QD position can be determined with the chiral contrasts from four outputs. Therefore, the cross waveguide can function as a one-way unidirectional waveguide and a circularly polarized beam splitter by placing the QD in a rational position, which has potential applications in spin-to-path encoding for complex quantum optical networks at the single-photon level.

preprint2021arXiv

UAV-Assisted Over-the-Air Computation

Over-the-air computation (AirComp) provides a promising way to support ultrafast aggregation of distributed data. However, its performance cannot be guaranteed in long-distance transmission due to the distortion induced by the channel fading and noise. To unleash the full potential of AirComp, this paper proposes to use a low-cost unmanned aerial vehicle (UAV) acting as a mobile base station to assist AirComp systems. Specifically, due to its controllable high-mobility and high-altitude, the UAV can move sufficiently close to the sensors to enable line-of-sight transmission and adaptively adjust all the links' distances, thereby enhancing the signal magnitude alignment and noise suppression. Our goal is to minimize the time-averaging mean-square error for AirComp by jointly optimizing the UAV trajectory, the scaling factor at the UAV, and the transmit power at the sensors, under constraints on the UAV's predetermined locations and flying speed, sensors' average and peak power limits. However, due to the highly coupled optimization variables and time-dependent constraints, the resulting problem is non-convex and challenging. We thus propose an efficient iterative algorithm by applying the block coordinate descent and successive convex optimization techniques. Simulation results verify the convergence of the proposed algorithm and demonstrate the performance gains and robustness of the proposed design compared with benchmarks.

preprint2021arXiv

Visual Perception Generalization for Vision-and-Language Navigation via Meta-Learning

Vision-and-language navigation (VLN) is a challenging task that requires an agent to navigate in real-world environments by understanding natural language instructions and visual information received in real-time. Prior works have implemented VLN tasks on continuous environments or physical robots, all of which use a fixed camera configuration due to the limitations of datasets, such as 1.5 meters height, 90 degrees horizontal field of view (HFOV), etc. However, real-life robots with different purposes have multiple camera configurations, and the huge gap in visual information makes it difficult to directly transfer the learned navigation model between various robots. In this paper, we propose a visual perception generalization strategy based on meta-learning, which enables the agent to adapt quickly to a new camera configuration with a few shots. In the training phase, we first localize the generalization problem in the visual perception module, and then compare two meta-learning algorithms for better generalization in seen and unseen environments. One of them uses the Model-Agnostic Meta-Learning (MAML) algorithm, which requires few-shot adaptation, and the other is a metric-based meta-learning method with a feature-wise affine transformation layer. The experimental results show that our strategy successfully adapts the learned navigation model to a new camera configuration, and the two algorithms show their advantages in seen and unseen environments respectively.

preprint2020arXiv

AdvMind: Inferring Adversary Intent of Black-Box Attacks

Deep neural networks (DNNs) are inherently susceptible to adversarial attacks even under black-box settings, in which the adversary only has query access to the target models. In practice, while it may be possible to effectively detect such attacks (e.g., observing massive similar but non-identical queries), it is often challenging to exactly infer the adversary intent (e.g., the target class of the adversarial example the adversary attempts to craft) especially during early stages of the attacks, which is crucial for performing effective deterrence and remediation of the threats in many scenarios. In this paper, we present AdvMind, a new class of estimation models that infer the adversary intent of black-box adversarial attacks in a robust and prompt manner. Specifically, to achieve robust detection, AdvMind accounts for the adversary adaptiveness such that her attempt to conceal the target will significantly increase the attack cost (e.g., in terms of the number of queries); to achieve prompt detection, AdvMind proactively synthesizes plausible query results to solicit subsequent queries from the adversary that maximally expose her intent. Through extensive empirical evaluation on benchmark datasets and state-of-the-art black-box attacks, we demonstrate that on average AdvMind detects the adversary intent with over 75% accuracy after observing less than 3 query batches and meanwhile increases the cost of adaptive attacks by over 60%. We further discuss the possible synergy between AdvMind and other defense methods against black-box adversarial attacks, pointing to several promising research directions.

preprint2020arXiv

Dipole coupling of a tunable hole double quantum dot in germanium hut wire to a microwave resonator

The germanium (Ge) hut wire system has strong spin-orbit coupling, a long coherence time due to a very large heavy-light hole splitting, and the advantage of site-controlled large-scale hut wire positioning. These properties make the Ge hut wire a promising candidate for the realization of strong coupling of spin to superconducting resonators and scalability for multiple qubit coupling. We have coupled a reflection line resonator to a hole double quantum dot (DQD) formed in Ge hut wire. The amplitude and phase responses of the microwave resonator revealed that the charge stability diagrams of the DQD are in good agreement with those obtained from transport measurements. The DQD interdot tunneling rate is shown to be tunable from 6.2 GHz to 8.5 GHz, which demonstrates the ability to adjust the frequency detuning between the qubit and the resonator. Furthermore, we achieved a hole-resonator coupling strength of up to 15 MHz, with a charge qubit decoherence rate of 0.28 GHz. Meanwhile the hole spin-resonator coupling rate was estimated to be 3 MHz. These results suggest that holes of a DQD in a Ge hut wire are dipole coupled to microwave photons, potentially enabling tunable hole spin-photon interactions in Ge with an inherent spin-orbit coupling.

preprint2020arXiv

Hole spin in tunable Ge hut wire double quantum dot

Holes in germanium (Ge) exhibit strong spin-orbit interaction, which can be exploited for fast and all-electrical manipulation of spin states. Here, we report transport experiments in a tunable Ge hut wire hole double quantum dot. We observe the signatures of Pauli spin blockade (PSB) with a large singlet-triplet energy splitting of ~1.1 meV and extract the g factor. By analyzing the PSB leakage current, we obtain a spin-orbit length l_so of ~40-100 nm. Furthermore, we demonstrate the electric dipole spin resonance. These results lay a solid foundation for implementing high quality tunable hole spin-orbit qubits.

preprint2020arXiv

PoisHygiene: Detecting and Mitigating Poisoning Attacks in Neural Networks

The black-box nature of deep neural networks (DNNs) makes it easier for attackers to manipulate the behavior of a DNN through data poisoning. Being able to detect and mitigate poisoning attacks, typically categorized into backdoor and adversarial poisoning (AP), is critical in enabling safe adoption of DNNs in many application domains. Although recent works demonstrate encouraging results on detection of certain backdoor attacks, they exhibit inherent limitations which may significantly constrain their applicability. Indeed, no technique exists for detecting AP attacks, which represent a harder challenge given that such attacks exhibit no common and explicit rules while backdoor attacks do (i.e., embedding backdoor triggers into poisoned data). We believe the key to detecting and mitigating AP attacks is the capability of observing and leveraging essential poisoning-induced properties within an infected DNN model. In this paper, we present PoisHygiene, the first effective and robust detection and mitigation framework against AP attacks. PoisHygiene is fundamentally motivated by the story of Dr. Ernest Rutherford (the 1908 Nobel Prize winner) observing the structure of the atom through random electron sampling.

preprint2020arXiv

Portably parallel construction of a CI wave function from a matrix-product state using the Charm++ framework

The construction of configuration interaction (CI) expansions from a matrix-product state (MPS) involves numerous matrix operations and the skillful sampling of important configurations in a huge Hilbert space. In this work, we present an efficient procedure for constructing CI expansions from MPS using the Charm++ parallel programming framework, upon which automatic load balancing and object migration facilities can be employed. This procedure was employed in the MPS-to-CI utility (Moritz et al., J. Chem. Phys. 2007, 126, 224109), the sampling-reconstructed complete active space algorithm (SR-CAS, Boguslawski et al., J. Chem. Phys. 2011, 134, 224101) and the entanglement-driven genetic algorithm (EDGA, Luo et al., J. Chem. Theory Comput. 2017, 13, 4699-4710). It enhances productivity and allows the sampling programs to evolve into their population-expansion versions (e.g., EDGA with population expansion [PE-EDGA]). Examples of 1,2-dioxetanone and firefly dioxetanone anion (FDO-) molecules demonstrated that 1) the procedure could be flexibly employed among various multi-core architectures; 2) the parallel efficiencies could be persistently improved simply by increasing the proportion of asynchronous executions; 3) PE-EDGA could construct a CAS-type CI wave function from a huge Hilbert space, with 0.9952 CI completeness and 96.7% correlation energy via ~1.66x10^6 configurations (only 0.0000028% of the total configurations) of a bi-radical state of the FDO- molecule using the full valence active space within a few hours.

preprint2020arXiv

Score-based Tests for Explaining Upper-Level Heterogeneity in Linear Mixed Models

Cross-level interactions among fixed effects in linear mixed models (also known as multilevel models) are often complicated by the variances stemming from random effects and residuals. When these variances change across clusters, tests of fixed effects (including cross-level interaction terms) are subject to inflated Type I or Type II error. While the impact of variance change/heterogeneity has been noticed in the literature, few methods have been proposed to detect this heterogeneity in a simple, systematic way. In addition, when heterogeneity among clusters is detected, researchers often wish to know which clusters' variances differed from the others. In this study, we utilize a recently-proposed family of score-based tests to distinguish between cross-level interactions and heterogeneity in variance components, also providing information about specific clusters that exhibit heterogeneity. These score-based tests only require estimation of the null model (when variance homogeneity is assumed to hold), and they have been previously applied to psychometric models. We extend the tests to linear mixed models here, detailing their implementation and performance when the data generating model is known. We also include an empirical example illustrating the tests' use in practice.

preprint2020arXiv

Self-controlled growth of highly uniform Ge/Si hut wires for scalable qubit devices

Semiconductor nanowires have been playing a crucial role in the development of nanoscale devices for the realization of spin qubits, Majorana fermions, single photon emitters, nanoprocessors, etc. The monolithic growth of site-controlled nanowires is a prerequisite towards the next generation of devices that will require addressability and scalability. Here, combining top-down nanofabrication and bottom-up self-assembly, we report on the growth of Ge wires on pre-patterned Si (001) substrates with controllable position, distance, length and structure. This is achieved by a novel growth process which uses a SiGe strain-relaxation template and can be generalized to other material combinations. Transport measurements show an electrically tunable spin-orbit coupling, with a spin-orbit length similar to that of III-V materials. Also, capacitive coupling between closely spaced wires is observed, which underlines their potential as a host for implementing two qubit gates. The reported results open a path towards scalable qubit devices with Si compatibility.

preprint2020arXiv

Stochastic Gradient Descent for Semilinear Elliptic Equations with Uncertainties

Randomness is ubiquitous in modern engineering. The uncertainty is often modeled as random coefficients in the differential equations that describe the underlying physics. In this work, we describe a two-step framework for numerically solving semilinear elliptic partial differential equations with random coefficients: 1) reformulate the problem as a functional minimization problem based on the direct method of the calculus of variations; 2) solve the minimization problem using the stochastic gradient descent method. We provide the convergence criterion for the resulting stochastic gradient descent algorithm and discuss some useful techniques to overcome the issues of ill-conditioning and large variance. The accuracy and efficiency of the algorithm are demonstrated by numerical experiments.
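The two-step framework can be made concrete on a toy problem. The sketch below is an illustration under assumptions not taken from the paper: a one-dimensional equation -(a u')' + u^3 = f on (0, 1) with zero boundary values, a random coefficient a that is constant in space and uniform on [0.5, 1.5], f = 1, a finite-difference grid, and a fixed step size. Step 1 replaces the equation by minimizing the expected discrete energy sum_i a (u_{i+1} - u_i)^2 / (2h) + h sum_i (u_i^4 / 4 - f_i u_i); step 2 runs stochastic gradient descent, drawing one sample of a per iteration. The function name `sgd_semilinear` and all parameter values are hypothetical.

```python
import random


def sgd_semilinear(n=9, steps=2000, lr=0.01, seed=0):
    """SGD on the discretized energy of -(a u')' + u^3 = f, u(0) = u(1) = 0."""
    rng = random.Random(seed)
    h = 1.0 / (n + 1)  # grid spacing for n interior nodes
    f = [1.0] * n      # assumed right-hand side
    u = [0.0] * n      # initial guess
    for _ in range(steps):
        # Draw one realization of the random coefficient (toy assumption).
        a = 1.0 + 0.5 * rng.uniform(-1.0, 1.0)
        grad = []
        for i in range(n):
            left = u[i - 1] if i > 0 else 0.0       # zero boundary value
            right = u[i + 1] if i < n - 1 else 0.0  # zero boundary value
            # Gradient of the sampled energy with respect to u[i].
            grad.append(a * (2.0 * u[i] - left - right) / h
                        + h * (u[i] ** 3 - f[i]))
        u = [ui - lr * gi for ui, gi in zip(u, grad)]
    return u


if __name__ == "__main__":
    solution = sgd_semilinear()
    print([round(value, 4) for value in solution])
```

With a fixed step size the iterates keep fluctuating slightly around the minimizer; a decaying step size or averaging is the usual way to reduce that variance, and the ill-conditioning that comes with finer grids is not addressed here.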

preprint2019arXiv

Accelerated scale bridging with sparsely approximated Gaussian learning

Multiscale modeling is a systematic approach to describe the behavior of complex systems by coupling models from different scales. The approach has been demonstrated to be very effective in areas of science as diverse as materials science, climate modeling and chemistry. However, routine use of multiscale simulations is often hindered by the very high cost of individual at-scale models. Approaches aiming to alleviate that cost by means of Gaussian process regression based surrogate models have been proposed. Yet, many of these surrogate models are expensive to construct, especially when the number of data needed is large. In this article, we employ a hierarchical sparse Cholesky decomposition to develop a sparse Gaussian process regression method and apply the method to approximate the equation of state of an energetic material in a multiscale model of dynamic deformation. We demonstrate that the method provides a substantial reduction both in computational cost and solution error as compared with previous methods.