Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
46 works
0 followers
25 topics
4 close collaborators

Published work

46 published items

preprint 2026 arXiv

Bgolearn: a Unified Bayesian Optimization Framework for Accelerating Materials Discovery

Efficient exploration of vast compositional and processing spaces is essential for accelerated materials discovery. Bayesian optimization (BO) provides a principled strategy for identifying optimal materials with minimal experiments, yet its adoption in materials science is hindered by implementation complexity and limited domain-specific tools. Here, we present Bgolearn, a comprehensive Python framework that makes BO accessible and practical for materials research through an intuitive interface, robust algorithms, and materials-oriented workflows. Bgolearn supports both single-objective and multi-objective Bayesian optimization with multiple acquisition functions (e.g., expected improvement, upper confidence bound, probability of improvement, and expected hypervolume improvement), diverse surrogate models (including Gaussian processes, random forests, and gradient boosting), and bootstrap-based uncertainty quantification. Benchmark studies show that Bgolearn reduces the number of required experiments by 40-60% compared with random search, grid search, and genetic algorithms, while maintaining comparable or superior solution quality. Its effectiveness is demonstrated not only through the studies presented in this paper, such as the identification of maximum-elastic-modulus triply periodic minimal surface structures, ultra-high-hardness high-entropy alloys, and high-strength, high-ductility medium-Mn steels, but also by numerous publications that have proven its impact in materials discovery. With a modular architecture that integrates seamlessly into existing materials workflows and a graphical user interface (BgoFace) that removes programming barriers, Bgolearn establishes a practical and reliable platform for Bayesian optimization in materials science, and is openly available at https://github.com/Bin-Cao/Bgolearn.
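To make the acquisition-function idea concrete, the following is a minimal sketch of expected improvement for a maximization problem, written with the Python standard library only. It does not use or reproduce the Bgolearn API; the function names, the candidate names, and the predicted means and standard deviations are all hypothetical and stand in for whatever a surrogate model would return.

```python
import math


def normal_pdf(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mean: float, std: float, best: float, xi: float = 0.0) -> float:
    """Expected improvement over `best` for a maximization problem."""
    gain = mean - best - xi
    if std <= 0.0:
        # No predictive uncertainty: the improvement is deterministic.
        return max(gain, 0.0)
    z = gain / std
    return gain * normal_cdf(z) + std * normal_pdf(z)


def pick_next(candidates, best):
    """Return the candidate name with the highest expected improvement."""
    return max(candidates, key=lambda name: expected_improvement(*candidates[name], best))


# Hypothetical surrogate predictions: name -> (predicted mean, predicted std).
candidates = {
    "alloy_A": (9.8, 0.1),
    "alloy_B": (9.5, 1.5),
    "alloy_C": (8.0, 0.2),
}
next_experiment = pick_next(candidates, best=10.0)
```

In this toy setting the uncertain candidate is preferred over the one whose mean is closest to the incumbent, which is the exploration behaviour that lets BO reach good candidates with fewer experiments than random or grid search.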

preprint 2026 arXiv

Kagome goldene with flat bands and Dirac nodal line fermions via line-graph epitaxy

The kagome lattice has emerged as a promising platform for investigating exotic quantum phases. However, achieving a single-atomic-layer kagome lattice in elemental materials remains a significant challenge. Here, we introduce line-graph epitaxy, a novel approach that enables the atomic-scale synthesis of goldene, a monolayer of elemental gold atoms arranged in a kagome lattice. Through scanning tunneling microscopy/spectroscopy (STM/STS) and density functional theory (DFT) calculations, we demonstrate the formation of kagome goldene, featuring a flat band with a van Hove singularity approximately 1.1 eV below the Fermi level, signaling strong electron correlation effects. Notably, the flat band is disrupted at the zigzag edges of goldene nanoflakes, revealing substantial edge effects. Furthermore, our calculations show that weak interlayer interactions between goldene and the underlying Au2Ge substrate generate dual Dirac nodal lines through a proximity effect. These findings offer not only a novel strategy for constructing elemental kagome lattices, but also a generalizable framework for fabricating and controlling line-graph materials. This research advances the exploration of quantum phases driven by strong correlations and the design of materials for next-generation quantum technologies.

preprint 2026 arXiv

Muon-OGD: Muon-based Spectral Orthogonal Gradient Projection for LLM Continual Learning

A central challenge in continual learning for large language models (LLMs) is catastrophic forgetting, where adapting to new tasks can substantially degrade performance on previously learned ones. Existing projection-based methods mitigate such interference by restricting parameter updates to subspaces that are orthogonal to directions associated with past tasks. However, these methods are typically formulated under Euclidean parameter geometry, with update magnitudes and projections governed by the Frobenius norm. The recent empirical success of the Muon optimizer, which applies orthogonalized matrix updates and admits a spectral-norm interpretation, suggests that Frobenius geometry may not be the most effective choice for matrix-valued LLM parameters. Motivated by this observation, we propose Muon-OGD, a spectral-norm-aware continual learning framework that integrates Muon-style operator-norm geometry with orthogonal projection constraints. Our method formulates each update as a spectral-norm-constrained optimization problem with linear non-interference constraints, and solves it efficiently through dual iterations and Newton-Schulz matrix-sign approximations. By applying orthogonalized momentum updates that avoid protected directions associated with prior tasks, Muon-OGD aims to improve the stability-plasticity trade-off in sequential LLM adaptation. We evaluate the proposed method on standard continual learning benchmarks, TRACE, and domain-specific Coding-Math-Medical curricula using both encoder-decoder and decoder-only architectures. Empirically, Muon-OGD consistently improves over sequential fine-tuning and competitive orthogonal-gradient baselines, while remaining computationally scalable. These results suggest that spectral-norm-aware update geometry provides a practical and effective alternative to Frobenius-norm projection for continual learning in LLMs.
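As a rough illustration of the orthogonalization step mentioned in this abstract, the sketch below applies the classical cubic Newton-Schulz iteration, X <- 1.5 X - 0.5 X X^T X, to a small matrix in pure Python. This is an assumption-laden stand-in: it is not the Muon-OGD algorithm, it omits the dual iterations and the non-interference constraints, and Muon itself is usually described with a tuned higher-order polynomial rather than this cubic form. All function names are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def newton_schulz_orthogonalize(g, steps=20):
    """Approximate the orthogonal polar factor of g with the cubic Newton-Schulz iteration."""
    fro = math.sqrt(sum(v * v for row in g for v in row))
    # Dividing by the Frobenius norm puts every singular value at or below 1,
    # which keeps the iteration inside its convergence region.
    x = [[v / fro for v in row] for row in g]
    for _ in range(steps):
        xxtx = matmul(matmul(x, transpose(x)), x)
        x = [[1.5 * x[i][j] - 0.5 * xxtx[i][j] for j in range(len(x[0]))]
             for i in range(len(x))]
    return x


# Hypothetical 2x2 "gradient"; a full-rank matrix is assumed.
grad = [[3.0, 1.0], [1.0, 2.0]]
update = newton_schulz_orthogonalize(grad)
gram = matmul(update, transpose(update))  # should be close to the identity
```

The iteration pushes every nonzero singular value toward 1 while leaving the singular vectors unchanged, so the resulting update has unit spectral norm regardless of how large the raw gradient was.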

preprint 2026 arXiv

No Attack Required: Semantic Fuzzing for Specification Violations in Agent Skills

LLM-powered agents can silently delete documents, leak credentials, or transfer funds on a routine user request, not because the agent was attacked, but because the skill it invoked broke its own declared safety rules. We call these specification violations: benign inputs cause a skill to breach the natural-language guardrails in its own specification, typically because the guardrail's semantics are undefined for autonomous execution, or because the implementation silently ignores the documented constraint. These violations are invisible to static analyzers, traditional fuzzers, and prompt-injection defenses alike, yet they undermine the very contract a user trusts when installing a skill. We present Sefz, a goal-directed semantic fuzzing framework that automatically discovers specification violations in agent skills. Sefz translates each guardrail into a reachability goal over an annotated execution trace, reducing violation checking to a deterministic graph query. An LLM-based mutator generates benign inputs whose traces progressively approach the violation patterns, guided by a multi-armed bandit that uses goal-proximity as its reward signal. On 402 real-world skills from the largest public agent-skill marketplace, Sefz finds specification violations in 120 (29.9%), including 26 previously unknown exploitable guardrail violations in deployed skills. Six recurring specification pitfalls explain the bulk of the failures, suggesting concrete principles for safer skill design.

preprint 2026 arXiv

Options, Not Clicks: Lattice Refinement for Consent-Driven MCP Authorization

As Model Context Protocol (MCP) adoption grows, securing tool invocations through meaningful user consent has become a critical challenge. Existing methods, whether broad "always allow" toggles or opaque LLM-based decisions, fail to account for dangerous call arguments and often lead to consent fatigue. In this work, we present Conleash, a client-side middleware that enforces boundary-scoped authorization by utilizing a risk lattice to auto-permit safe calls within known boundaries while escalating risks, a policy engine for user-defined invariants, and a refinement loop that converts user decisions into reusable rules. Evaluated on 984 real-world traces, Conleash achieved 98.2% accuracy, caught 99.4% of escalations, and added only 8.2 ms of overhead for policy verification. Furthermore, in a user study (N=16), participants significantly preferred Conleash's scoped permissions over traditional methods, citing higher trust and reduced prompting.

preprint 2026 arXiv

SDEval: Safety Dynamic Evaluation for Multimodal Large Language Models

In the rapidly evolving landscape of Multimodal Large Language Models (MLLMs), the safety concerns of their outputs have earned significant attention. Although numerous datasets have been proposed, they may become outdated with MLLM advancements and are susceptible to data contamination issues. To address these problems, we propose SDEval, the first safety dynamic evaluation framework that controllably adjusts the distribution and complexity of safety benchmarks. Specifically, SDEval mainly adopts three dynamic strategies: text, image, and text-image dynamics to generate new samples from original benchmarks. We first explore the individual effects of text and image dynamics on model safety. Then, we find that injecting text dynamics into images can further impact safety, and conversely, injecting image dynamics into text also leads to safety risks. SDEval is general enough to be applied to various existing safety and even capability benchmarks. Experiments across safety benchmarks, MLLMGuard and VLSBench, and capability benchmarks, MMBench and MMVet, show that SDEval significantly influences safety evaluation, mitigates data contamination, and exposes safety limitations of MLLMs. Code is available at https://github.com/hq-King/SDEval

preprint 2026 arXiv

Semia: Auditing Agent Skills via Constraint-Guided Representation Synthesis

An agent skill is a configuration package that equips an LLM-driven agent with a concrete capability, such as reading email, executing shell commands, or signing blockchain transactions. Each skill is a hybrid artifact: a structured half declares executable interfaces, while a prose half dictates when and how those interfaces fire, and the prose is reinterpreted probabilistically on every invocation. Conventional static analyzers parse the structured half but ignore the prose; LLM-based tools read the prose but cannot reproducibly prove that a tainted input reaches a high-impact sink. We present Semia, a static auditor for agent skills. Semia lifts each skill into the Skill Description Language (SDL), a Datalog fact base that captures LLM-triggered actions, prose-defined conditions, and human-in-the-loop checkpoints. Synthesizing a fact base that is both structurally sound and semantically faithful to the original prose is the central challenge; we address it with Constraint-Guided Representation Synthesis (CGRS), a propose-verify-evaluate loop that refines LLM candidates until convergence. Security properties over an agent skill (e.g., indirect injection, secret leakage, confused deputies, and unguarded sinks) can then be reduced to Datalog reachability queries. We evaluate Semia on 13,728 real-world skills from public marketplaces. Semia renders all of them auditable and finds that more than half carry at least one critical semantic risk. On a stratified sample of 541 expert-labeled skills, Semia achieves 97.7% recall and an F1 of 90.6%, substantially outperforming signature-based scanners and LLM baselines.

preprint 2026 arXiv

Skill Drift Is Contract Violation: Proactive Maintenance for LLM Agent Skill Libraries

LLM agents increasingly rely on reusable skill libraries, but these skills silently decay as the external services, packages, APIs, and configurations they reference evolve. Existing monitors detect such changes at the wrong granularity: they observe values, not the role those values play in a skill. A version string in a comment is noise; the same string in a pinned dependency is an operational obligation. We formulate skill drift as contract violation and introduce \sgname{}, which extracts executable environment contracts from skill documents and validates only those role-bearing assumptions against known or live conditions. This distinction turns noisy monitoring into a precision-first maintenance signal. Contract-free CI probes produce 40% false positives, while \sgname{} raises zero false alarms over 599 no-drift and hard-negative cases (Wilson 95% CI [0, 0.6]%). In known-drift verification, \sgname{} achieves 100% precision and 76% recall with the strongest backbone; in a pre-registered study over 49 real skills, it discovers live drift with 86% conservative precision. Violated contracts also make repair actionable, improving one-round success from 10% without localization to 78%. We release \dbname{}, an 880-pair benchmark for skill degradation.
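The Wilson interval quoted in this abstract can be checked with a few lines of arithmetic. The sketch below implements the standard Wilson score interval for a binomial proportion and applies it to zero false alarms in 599 cases; the function name and the hard-coded 95% z-value are choices made here for illustration, not something taken from the paper.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.959964):
    """Wilson score interval for a binomial proportion (z = 1.959964 gives 95%)."""
    p = successes / trials
    denom = 1.0 + z * z / trials
    center = (p + z * z / (2.0 * trials)) / denom
    half = z * math.sqrt(p * (1.0 - p) / trials + z * z / (4.0 * trials * trials)) / denom
    # Clamp to [0, 1] so rounding error cannot push a bound outside the valid range.
    return max(0.0, center - half), min(1.0, center + half)


# Zero false alarms observed in 599 no-drift and hard-negative cases.
low, high = wilson_interval(0, 599)
print(f"[{100 * low:.1f}, {100 * high:.1f}]%")
```

With zero observed events the upper bound simplifies to z^2 / (n + z^2), roughly 3.84 / 602.84, or about 0.64%, which agrees with the reported [0, 0.6]% after rounding to one decimal place.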

preprint 2026 arXiv

The Granularity Mismatch in Agent Security: Argument-Level Provenance Solves Enforcement and Isolates the LLM Reasoning Bottleneck

Tool-using LLM agents must act on untrusted webpages, emails, files, and API outputs while issuing privileged tool calls. Existing defenses often mediate trust at the granularity of an entire tool invocation, forcing a brittle choice in mixed-trust workflows: allow external content to influence a call and risk hijacked destinations or commands, or quarantine the call and block benign retrieval-then-act behavior. The key observation behind this paper is that indirect prompt injection becomes dangerous not when untrusted content appears in context, but when it determines an authority-bearing argument. We present PACT (Provenance-Aware Capability Contracts), a runtime monitor that assigns semantic roles to tool arguments, tracks value provenance across replanning steps, and checks whether each argument's origin satisfies its role-specific trust contract. Under oracle provenance, PACT achieves 100% utility and 100% security on mixed-trust diagnostic suites, while flat invocation-level monitors incur false positives or false negatives. In full AgentDojo deployments across five models, PACT reaches 100% security on the three strongest models while recovering 38.1-46.4% utility, 8-16 percentage points above CaMeL at the same security level. Ablations show that both semantic roles and cross-step provenance are necessary. PACT reframes agent security as authority binding, and isolates the remaining deployment bottleneck to provenance inference and contract synthesis.
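The argument-level idea in this abstract can be sketched as a small check that compares each argument's origin against what its role permits. Everything below is hypothetical and only illustrates the concept: the role names, the origin labels, the `ROLE_CONTRACT` table, and the `Arg` type are invented here and are not PACT's actual data model or interface.

```python
from dataclasses import dataclass

# Hypothetical trust contract: which origins each argument role accepts.
ROLE_CONTRACT = {
    "destination": {"user"},             # authority-bearing: must come from the user
    "command": {"user"},                 # authority-bearing: must come from the user
    "content": {"user", "tool_output"},  # payload: untrusted data is acceptable
}


@dataclass(frozen=True)
class Arg:
    name: str
    role: str
    value: str
    origin: str  # where the value came from, e.g. "user" or "tool_output"


def violations(args):
    """Return the names of arguments whose origin breaks their role's contract."""
    return [a.name for a in args if a.origin not in ROLE_CONTRACT[a.role]]


# Retrieval-then-act: untrusted text fills the body, the user chose the recipient.
benign = [
    Arg("to", "destination", "alice@example.com", "user"),
    Arg("body", "content", "summary of fetched page", "tool_output"),
]
# Injection: untrusted text has determined the recipient.
hijacked = [
    Arg("to", "destination", "attacker@example.com", "tool_output"),
    Arg("body", "content", "summary of fetched page", "tool_output"),
]
```

An invocation-level monitor would have to treat both calls the same way because both contain untrusted data, whereas the per-argument check allows the first and flags only the `to` argument of the second.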

preprint 2025 arXiv

Flowing from Reasoning to Motion: Learning 3D Hand Trajectory Prediction from Egocentric Human Interaction Videos

Prior works on 3D hand trajectory prediction are constrained by datasets that decouple motion from semantic supervision and by models that weakly link reasoning and action. To address these, we first present the EgoMAN dataset, a large-scale egocentric dataset for interaction stage-aware 3D hand trajectory prediction with 219K 6DoF trajectories and 3M structured QA pairs for semantic, spatial, and motion reasoning. We then introduce the EgoMAN model, a reasoning-to-motion framework that links vision-language reasoning and motion generation via a trajectory-token interface. Trained progressively to align reasoning with motion dynamics, our approach yields accurate and stage-aware trajectories with generalization across real-world scenes.

preprint 2024 arXiv

Interactive Text-to-SQL Generation via Editable Step-by-Step Explanations

Relational databases play an important role in business, science, and more. However, many users cannot fully unleash the analytical power of relational databases, because they are not familiar with database languages such as SQL. Many techniques have been proposed to automatically generate SQL from natural language, but they suffer from two issues: (1) they still make many mistakes, particularly for complex queries, and (2) they do not provide a flexible way for non-expert users to validate and refine incorrect queries. To address these issues, we introduce a new interaction mechanism that allows users to directly edit a step-by-step explanation of a query to fix errors. Our experiments on multiple datasets, as well as a user study with 24 participants, demonstrate that our approach can achieve better performance than multiple SOTA approaches. Our code and datasets are available at https://github.com/magic-YuanTian/STEPS.

preprint 2023 arXiv

MVAM: Multi-variant Attacks on Memory for IoT Trust Computing

With the significant development of the Internet of Things and low-cost cloud services, the sensory and data processing requirements of IoT systems are continually going up. TrustZone is a hardware-protected Trusted Execution Environment (TEE) for ARM processors specifically designed for IoT handheld systems. It provides memory isolation techniques to protect trusted application data from being exploited by malicious entities. In this work, we focus on identifying different vulnerabilities of the TrustZone extension of ARM Cortex-M processors. We then design and implement a threat model to execute those attacks. We have found that TrustZone is vulnerable to buffer overflow-based attacks. We have used this to create an attack called MOFlow and successfully leaked the data of another trusted app. This is done by intentionally overflowing the memory of one app to access the encrypted memory of other apps inside the secure world. We have also found that, by not validating the input parameters in the entry function, TrustZone has exposed a security weakness. We call this Achilles heel and present an attack model showing how to exploit this weakness too. Our proposed novel attacks are implemented and successfully tested on two recent ARM Cortex-M processors available on the market (M23 and M33).

preprint 2022 arXiv

A First Look at Duplicate and Near-duplicate Self-admitted Technical Debt Comments

Self-admitted technical debt (SATD) refers to technical debt that is intentionally introduced by developers and explicitly documented in code comments or other software artifacts (e.g., issue reports) to annotate sub-optimal decisions made by developers in the software development process. In this work, we take the first look at the existence and characteristics of duplicate and near-duplicate SATD comments in five popular Apache OSS projects, i.e., JSPWiki, Helix, Jackrabbit, Archiva, and SystemML. We design a method to automatically identify groups of duplicate and near-duplicate SATD comments and track their evolution in the software system by mining the commit history of a software project. Leveraging the proposed method, we identified 3,520 duplicate and near-duplicate SATD comments from the target projects, which belong to 1,141 groups. We manually analyze the content and context of a sample of 1,505 SATD comments (by sampling 100 groups for each project) and identify if they annotate the same root cause. We also investigate whether duplicate SATD comments exist in code clones, whether they co-exist in the same file, and whether they are introduced and removed simultaneously. Our preliminary study reveals several surprising findings that would shed light on future studies aiming to improve the management of duplicate SATD comments. For instance, only 48.5% of duplicate SATD comment groups with the same root cause exist in regular code clones, and only 33.9% of the duplicate SATD comment pairs are introduced in the same commit.

preprint 2022 arXiv

A Prescriptive Dirichlet Power Allocation Policy with Deep Reinforcement Learning

Prescribing optimal operation based on the condition of the system and, thereby, potentially prolonging the remaining useful lifetime has a large potential for actively managing the availability, maintenance and costs of complex systems. Reinforcement learning (RL) algorithms are particularly suitable for this type of problem given their learning capabilities. A special case of a prescriptive operation is the power allocation task, which can be considered as a sequential allocation problem, where the action space is bounded by a simplex constraint. A general continuous action-space solution of such sequential allocation problems has remained an open research question for RL algorithms. In continuous action-space, the standard Gaussian policy applied in reinforcement learning does not support simplex constraints, while the Gaussian-softmax policy introduces a bias during training. In this work, we propose the Dirichlet policy for continuous allocation tasks and analyze the bias and variance of its policy gradients. We demonstrate that the Dirichlet policy is bias-free and provides significantly faster convergence, better performance and better hyperparameter robustness than the Gaussian-softmax policy. Moreover, we demonstrate the applicability of the proposed algorithm on a prescriptive operation case, where we propose the Dirichlet power allocation policy and evaluate the performance on a case study of a set of multiple lithium-ion (Li-I) battery systems. The experimental results show the potential to prescribe optimal operation and to improve the efficiency and sustainability of multi-power source systems.
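The simplex constraint discussed in this abstract is easy to see in a few lines. The sketch below draws one allocation from a Dirichlet distribution (by normalizing gamma draws, a standard construction) and one from a Gaussian-softmax baseline. It only illustrates how the two policies produce actions on the simplex; it contains no policy network, gradient, or training loop, and the concentration parameters, means, and function names are hypothetical.

```python
import math
import random


def dirichlet_sample(alpha, rng):
    """Draw one allocation from Dirichlet(alpha) by normalizing gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def gaussian_softmax_sample(mean, std, rng):
    """Baseline: sample an unconstrained Gaussian, then squash it with softmax."""
    logits = [rng.gauss(m, std) for m in mean]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
# Hypothetical policy outputs for splitting a power demand across three batteries.
allocation = dirichlet_sample([2.0, 5.0, 1.0], rng)
baseline = gaussian_softmax_sample([0.0, 1.0, -1.0], 0.5, rng)
```

Both samplers return non-negative shares that sum to one, so either can express a power split. The difference the abstract points to lies in the policy gradient: the Dirichlet density is defined directly on the simplex, while the softmax route passes a Gaussian sample through a many-to-one mapping.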

preprint 2022 arXiv

CNN-Augmented Visual-Inertial SLAM with Planar Constraints

We present a robust visual-inertial SLAM system that combines the benefits of Convolutional Neural Networks (CNNs) and planar constraints. Our system leverages a CNN to predict the depth map and the corresponding uncertainty map for each image. The CNN depth effectively bootstraps the back-end optimization of SLAM and meanwhile the CNN uncertainty adaptively weighs the contribution of each feature point to the back-end optimization. Given the gravity direction from the inertial sensor, we further present a fast plane detection method that detects horizontal planes via one-point RANSAC and vertical planes via two-point RANSAC. Those stably detected planes are in turn used to regularize the back-end optimization of SLAM. We evaluate our system on a public dataset, i.e., EuRoC, and demonstrate improved results over a state-of-the-art SLAM system, i.e., ORB-SLAM3.
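The reason a single point suffices for horizontal planes is that gravity already fixes the plane normal, leaving only the offset along that normal unknown. The sketch below shows that idea on synthetic 3D points; it is a generic one-point RANSAC written for illustration, not the paper's implementation, and the threshold, iteration count, and function names are assumptions.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def one_point_ransac_horizontal(points, gravity, threshold=0.02, iterations=50, seed=0):
    """Find the best horizontal plane n.p = d, where n is the unit gravity direction."""
    norm = dot(gravity, gravity) ** 0.5
    normal = [g / norm for g in gravity]
    rng = random.Random(seed)
    best_offset, best_inliers = None, []
    for _ in range(iterations):
        # The normal is known, so one sampled point determines the whole plane.
        offset = dot(normal, rng.choice(points))
        inliers = [p for p in points if abs(dot(normal, p) - offset) < threshold]
        if len(inliers) > len(best_inliers):
            best_offset, best_inliers = offset, inliers
    return best_offset, best_inliers


# Synthetic scene: a floor at z = 0 plus a few points off the plane.
floor = [(x * 0.1, y * 0.1, 0.0) for x in range(5) for y in range(5)]
clutter = [(0.2, 0.3, 0.7), (0.1, 0.4, 1.3), (0.5, 0.5, 0.4)]
offset, inliers = one_point_ransac_horizontal(floor + clutter, gravity=(0.0, 0.0, -9.81))
```

A vertical plane has one more degree of freedom, its heading around the gravity axis, which is consistent with the abstract's use of a two-point sample for that case.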

preprint 2022 arXiv

Direct observation of moiré flat-band breakdown at the edge of magic-angle twisted bilayer graphene

Low-energy moiré flat bands in magic-angle twisted bilayer graphene (tBG) have demonstrated incredible potential to exhibit rich exotic quantum phenomena. Theoretically, the moiré flat bands of tBG are based on the extended structures, i.e., the moiré patterns with periodic boundary conditions. However, a fundamental question of whether the flat bands can exist in the graphene moiré patterns with a reduced structure symmetry, such as sample edges, remains unanswered. Here, via scanning tunneling microscopy and spectroscopy, we study the local electronic properties of a magic-angle tBG near the sample terminated edge and report a direct observation of breakdown of the moiré flat bands. We show that the moiré electronic structures, including the low-energy flat bands, can sufficiently exist in a complete moiré spot, i.e., a moiré supercell, right at the edge even though the translational symmetry of the moiré patterns is broken in one direction. However, the flat-band characteristic is obviously absent in the incomplete moiré spots that are partly terminated by the edge. Our results indicate that a whole moiré spot is sufficient and indispensable for the generation of the effective moiré flat bands in tBG.

preprint 2022 arXiv

EAN: Event Adaptive Network for Enhanced Action Recognition

Efficiently modeling spatial-temporal information in videos is crucial for action recognition. To achieve this goal, state-of-the-art methods typically employ the convolution operator and the dense interaction modules such as non-local blocks. However, these methods cannot accurately fit the diverse events in videos. On the one hand, the adopted convolutions have fixed scales, thus struggling with events of various scales. On the other hand, the dense interaction modeling paradigm only achieves sub-optimal performance as action-irrelevant parts bring additional noise for the final prediction. In this paper, we propose a unified action recognition framework to investigate the dynamic nature of video content by introducing the following designs. First, when extracting local cues, we generate the spatial-temporal kernels of dynamic-scale to adaptively fit the diverse events. Second, to accurately aggregate these cues into a global video representation, we propose to mine the interactions only among a few selected foreground objects by a Transformer, which yields a sparse paradigm. We call the proposed framework the Event Adaptive Network (EAN) because both key designs are adaptive to the input video content. To exploit the short-term motions within local segments, we propose a novel and efficient Latent Motion Code (LMC) module, further improving the performance of the framework. Extensive experiments on several large-scale video datasets, e.g., Something-Something V1 & V2, Kinetics, and Diving48, verify that our models achieve state-of-the-art or competitive performances at low FLOPs. Codes are available at: https://github.com/tianyuan168326/EAN-Pytorch.

preprint 2022 arXiv

FisheyeDistill: Self-Supervised Monocular Depth Estimation with Ordinal Distillation for Fisheye Cameras

In this paper, we deal with the problem of monocular depth estimation for fisheye cameras in a self-supervised manner. A known issue of self-supervised depth estimation is that it suffers in low-light/over-exposure conditions and in large homogeneous regions. To tackle this issue, we propose a novel ordinal distillation loss that distills the ordinal information from a large teacher model. Such a teacher model, having been trained on a large amount of diverse data, can capture the depth ordering information well, but is weaker at preserving accurate scene geometry. Combined with self-supervised losses, we show that our model can not only generate reasonable depth maps in challenging environments but also better recover the scene geometry. We further leverage the fisheye cameras of an AR-Glasses device to collect an indoor dataset to facilitate evaluation.

preprint 2022 arXiv

Jointly Modeling Hierarchical and Horizontal Features for Relational Triple Extraction

Recent works on relational triple extraction have shown the superiority of jointly extracting entities and relations over the pipelined extraction manner. However, most existing joint models fail to balance the modeling of entity features and the joint decoding strategy, and thus the interactions between the entity level and triple level are not fully investigated. In this work, we first introduce the hierarchical dependency and horizontal commonality between the two levels, and then propose an entity-enhanced dual tagging framework that enables the triple extraction (TE) task to utilize such interactions with self-learned entity features through an auxiliary entity extraction (EE) task, without breaking the joint decoding of relational triples. Specifically, we align the EE and TE tasks in a position-wise manner by formulating them as two sequence labeling problems with identical encoder-decoder structure. Moreover, the two tasks are organized in a carefully designed parameter sharing setting so that the learned entity features could be naturally shared via multi-task learning. Empirical experiments on the NYT benchmark demonstrate the effectiveness of the proposed framework compared to the state-of-the-art methods.

preprint 2022 arXiv

MPInspector: A Systematic and Automatic Approach for Evaluating the Security of IoT Messaging Protocols

Facilitated by messaging protocols (MP), many home devices are connected to the Internet, bringing convenience and accessibility to customers. However, most deployed MPs on IoT platforms are fragmented and are not implemented carefully to support secure communication. To the best of our knowledge, there is no systematic solution to perform automatic security checks on MP implementations yet. To bridge the gap, we present MPInspector, the first automatic and systematic solution for vetting the security of MP implementations. MPInspector combines model learning with formal analysis and operates in three stages: (a) using parameter semantics extraction and interaction logic extraction to automatically infer the state machine of an MP implementation, (b) generating security properties based on meta properties and the state machine, and (c) applying automatic property based formal verification to identify property violations. We evaluate MPInspector on three popular MPs, including MQTT, CoAP and AMQP, implemented on nine leading IoT platforms. It identifies 252 property violations, leveraging which we further identify eleven types of attacks under two realistic attack scenarios. In addition, we demonstrate that MPInspector is lightweight (the average overhead of end-to-end analysis is ~4.5 hours) and effective with a precision of 100% in identifying property violations.

preprint 2022 arXiv

Multi-agent Actor-Critic with Time Dynamical Opponent Model

In multi-agent reinforcement learning, multiple agents learn simultaneously while interacting with a common environment and each other. Since the agents adapt their policies during learning, not only the behavior of a single agent becomes non-stationary, but also the environment as perceived by the agent. This renders it particularly challenging to perform policy improvement. In this paper, we propose to exploit the fact that the agents seek to improve their expected cumulative reward and introduce a novel Time Dynamical Opponent Model (TDOM) to encode the knowledge that the opponent policies tend to improve over time. We motivate TDOM theoretically by deriving a lower bound of the log objective of an individual agent and further propose Multi-Agent Actor-Critic with Time Dynamical Opponent Model (TDOM-AC). We evaluate the proposed TDOM-AC on a differential game and the Multi-agent Particle Environment. We show empirically that TDOM achieves superior opponent behavior prediction during test time. The proposed TDOM-AC methodology outperforms state-of-the-art Actor-Critic methods on the performed experiments in cooperative and especially in mixed cooperative-competitive environments. TDOM-AC results in a more stable training and a faster convergence.

preprint 2022 arXiv

One Bad Apple Spoils the Barrel: Understanding the Security Risks Introduced by Third-Party Components in IoT Firmware

Currently, the development of IoT firmware heavily depends on third-party components (TPCs) to improve development efficiency. Nevertheless, TPCs are not secure, and the vulnerabilities in TPCs will influence the security of IoT firmware. Existing works pay less attention to the vulnerabilities caused by TPCs, and we still lack a comprehensive understanding of the security impact of TPC vulnerabilities on firmware. To fill in the knowledge gap, we design and implement FirmSec, which leverages syntactical features and control-flow graph features to detect the TPCs in firmware, and then recognizes the corresponding vulnerabilities. Based on FirmSec, we present the first large-scale analysis of the security risks raised by TPCs on 34,136 firmware images. We successfully detect 584 TPCs and identify 128,757 vulnerabilities caused by 429 CVEs. Our in-depth analysis reveals the diversity of security risks in firmware and discovers that some well-known vulnerabilities are still rooted in firmware. Besides, we explore the geographical distribution of vulnerable devices and confirm that the security situation of devices in different regions varies. Our analysis also indicates that vulnerabilities caused by TPCs in firmware keep growing with the boom of the IoT ecosystem. Further analysis shows 2,478 commercial firmware images have potentially violated GPL/AGPL licensing terms.

preprint2022arXiv

Origami-controlled strain engineering of tunable flat bands and correlated states in folded graphene

Flat electronic bands with tunable structures offer opportunities for the exploitation and manipulation of exotic interacting quantum states. Here, we present a controllable route to construct easily tunable flat bands in folded graphene, by nano origami-controlled strain engineering, and discover correlated states in this system. By tearing and folding a graphene monolayer at arbitrary step edges with scanning tunneling microscope manipulation, we create strain-induced pseudo-magnetic fields as well as the resulting flat electronic bands in the curved edges of folded graphene. We show that the intensity of the pseudo-magnetic field can be readily tuned by changing the width of the folding edge due to the edge-width-dependent lattice deformation, which makes the geometry of flat bands in folded graphene highly adjustable. Furthermore, by creating expected dispersionless flat bands using this technique, the correlation-induced splits of flat bands are successfully observed in the density of states when these bands are partially filled. Our experiment provides a feasible and effective pathway to engineer the system with tunable flat band structures, and establishes a new platform that can be used to realize devisable strain and interaction induced quantum phases.

preprint2022arXiv

Perceptron Synthesis Network: Rethinking the Action Scale Variances in Videos

Video action recognition has been partially addressed by CNNs that stack fixed-size 3D kernels. However, these methods may under-perform because they capture only rigid spatial-temporal patterns in single-scale spaces, while neglecting the scale variances across different action primitives. To overcome this limitation, we propose to learn the optimal-scale kernels from the data. More specifically, an action perceptron synthesizer is proposed to generate the kernels from a bag of fixed-size kernels that interact through dense routing paths. To guarantee the interaction richness and the information capacity of the paths, we design the novel optimized feature fusion layer. This layer establishes, for the first time, a principled universal paradigm that suffices to cover most of the current feature fusion techniques (e.g., channel shuffling and channel dropout). By inserting the synthesizer, our method can easily adapt the traditional 2D CNNs to video understanding tasks such as action recognition with marginal additional computation cost. The proposed method is thoroughly evaluated on several challenging datasets (i.e., Something-Something, Kinetics and Diving48) that heavily rely on temporal reasoning or appearance discrimination, achieving new state-of-the-art results. In particular, our low-resolution model outperforms the recent strong baseline methods, i.e., TSM and GST, with less than 30% of their computation cost.

preprint2022arXiv

Towards Return Parity in Markov Decision Processes

Algorithmic decisions made by machine learning models in high-stakes domains may have lasting impacts over time. However, naive applications of standard fairness criteria in static settings over temporal domains may lead to delayed and adverse effects. To understand the dynamics of performance disparity, we study a fairness problem in Markov decision processes (MDPs). Specifically, we propose return parity, a fairness notion that requires MDPs from different demographic groups that share the same state and action spaces to achieve approximately the same expected time-discounted rewards. We first provide a decomposition theorem for return disparity, which decomposes the return disparity of any two MDPs sharing the same state and action spaces into the distance between group-wise reward functions, the discrepancy of group policies, and the discrepancy between state visitation distributions induced by the group policies. Motivated by our decomposition theorem, we propose algorithms to mitigate return disparity via learning a shared group policy with state visitation distributional alignment using integral probability metrics. We conduct experiments to corroborate our results, showing that the proposed algorithm can successfully close the disparity gap while maintaining the performance of policies on two real-world recommender system benchmark datasets.
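
As a reading aid, the return parity notion described in this abstract can be written out roughly as follows. This is only a sketch under assumed notation: the group indices 1 and 2, the policies, the group-wise rewards, the discount factor and the tolerance are symbols introduced here for illustration, not taken from the paper.

```latex
% Assumed notation (not from the abstract): groups 1 and 2, policies \pi_1, \pi_2,
% group-wise rewards r_1, r_2, discount \gamma \in [0, 1), tolerance \epsilon \ge 0.
J_g(\pi_g) = \mathbb{E}_{\pi_g}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, r_g(s_t, a_t)\right],
\qquad g \in \{1, 2\},
\qquad
\bigl|\, J_1(\pi_1) - J_2(\pi_2) \,\bigr| \le \epsilon .
```

Here each group's expected time-discounted reward is computed in its own MDP, and parity holds when the two returns differ by at most the chosen tolerance.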

preprint2022arXiv

Weakly Supervised Online Action Detection for Infant General Movements

To enable earlier medical intervention for infants' cerebral palsy (CP), early diagnosis of brain damage is critical. Although general movements assessment (GMA) has shown promising results in early CP detection, it is laborious. Most existing works take videos as input to classify fidgety movements (FMs) for GMA automation. Those methods require a complete observation of videos and cannot localize video frames containing normal FMs. Therefore, we propose a novel approach named WO-GMA to perform FMs localization in the weakly supervised online setting. Infant body keypoints are first extracted as the inputs to WO-GMA. Then WO-GMA performs local spatio-temporal extraction followed by two network branches to generate pseudo clip labels and model online actions. With the clip-level pseudo labels, the action modeling branch learns to detect FMs in an online fashion. Experimental results on a dataset with 757 videos of different infants show that WO-GMA achieves state-of-the-art video-level classification and clip-level detection results. Moreover, only the first 20% of the video duration is needed to get classification results as good as with full observation, implying a significantly shortened FMs diagnosis time. Code is available at: https://github.com/scofiedluo/WO-GMA.

preprint2021arXiv

Curse or Redemption? How Data Heterogeneity Affects the Robustness of Federated Learning

Data heterogeneity has been identified as one of the key features in federated learning but is often overlooked through the lens of robustness to adversarial attacks. This paper focuses on characterizing and understanding its impact on backdooring attacks in federated learning through comprehensive experiments using synthetic and the LEAF benchmarks. The initial impression driven by our experimental results suggests that data heterogeneity is the dominant factor in the effectiveness of attacks and it may be a redemption for defending against backdooring as it makes the attack less efficient, makes effective attack strategies more challenging to design, and makes the attack result less predictable. However, with further investigations, we found data heterogeneity is more of a curse than a redemption as the attack effectiveness can be significantly boosted by simply adjusting the client-side backdooring timing. More importantly, data heterogeneity may result in overfitting at the local training of benign clients, which can be utilized by attackers to disguise themselves and fool skewed-feature based defenses. In addition, effective attack strategies can be made by adjusting attack data distribution. Finally, we discuss the potential directions of defending the curses brought by data heterogeneity. The results and lessons learned from our extensive experiments and analysis offer new insights for designing robust federated learning methods and systems.

preprint2021arXiv

Meta Federated Learning

Due to its distributed methodology alongside its privacy-preserving features, Federated Learning (FL) is vulnerable to training time adversarial attacks. In this study, our focus is on backdoor attacks in which the adversary's goal is to cause targeted misclassifications for inputs embedded with an adversarial trigger while maintaining an acceptable performance on the main learning task at hand. Contemporary defenses against backdoor attacks in federated learning require direct access to each individual client's update, which is not feasible in recent FL settings where Secure Aggregation is deployed. In this study, we seek to answer the following question: is it possible to defend against backdoor attacks when secure aggregation is in place? This question has not been addressed by prior art. To this end, we propose Meta Federated Learning (Meta-FL), a novel variant of federated learning which not only is compatible with the secure aggregation protocol but also facilitates defense against backdoor attacks. We perform a systematic evaluation of Meta-FL on two classification datasets: SVHN and GTSRB. The results show that Meta-FL not only achieves better utility than classic FL, but also enhances the performance of contemporary defenses in terms of robustness against adversarial attacks.

preprint2021arXiv

Using Prior Knowledge to Guide BERT's Attention in Semantic Textual Matching Tasks

We study the problem of incorporating prior knowledge into a deep Transformer-based model, i.e., Bidirectional Encoder Representations from Transformers (BERT), to enhance its performance on semantic textual matching tasks. By probing and analyzing what BERT already knows when solving this task, we obtain a better understanding of what task-specific knowledge BERT needs the most and where it is most needed. The analysis further motivates us to take a different approach than most existing works. Instead of using prior knowledge to create a new training task for fine-tuning BERT, we directly inject knowledge into BERT's multi-head attention mechanism. This leads us to a simple yet effective approach that enjoys a fast training stage as it saves the model from training on additional data or tasks other than the main task. Extensive experiments demonstrate that the proposed knowledge-enhanced BERT is able to consistently improve semantic textual matching performance over the original BERT model, and the performance benefit is most salient when training data is scarce.

preprint2020arXiv

$H_\infty$ Model-free Reinforcement Learning with Robust Stability Guarantee

Reinforcement learning is showing great potential in robotics applications, including autonomous driving, robot manipulation and locomotion. However, with complex uncertainties in the real-world environment, it is difficult to guarantee the successful generalization and sim-to-real transfer of learned policies theoretically. In this paper, we introduce and extend the idea of robust stability and $H_\infty$ control to design policies with both stability and robustness guarantees. Specifically, a sample-based approach for analyzing the Lyapunov stability and performance robustness of a learning-based control system is proposed. Based on the theoretical results, a maximum entropy algorithm is developed for searching for a Lyapunov function and designing a policy with a provable robust stability guarantee. Without any specific domain knowledge, our method can find a policy that is robust to various uncertainties and generalizes well to different test environments. In our experiments, we show that our method achieves better robustness to both large impulsive disturbances and parametric variations in the environment than the state-of-the-art results in both robust and generic RL, as well as classic control. Anonymous code is available to reproduce the experimental results at https://github.com/RobustStabilityGuaranteeRL/RobustStabilityGuaranteeRL.

preprint2020arXiv

A Bayesian semi-parametric hybrid model for spatial extremes with unknown dependence structure

The max-stable process is an asymptotically justified model for spatial extremes. In particular, we focus on the hierarchical extreme-value process (HEVP), which is a particular max-stable process that is conducive to Bayesian computing. The HEVP and all max-stable process models are parametric and impose strong assumptions including that all marginal distributions belong to the generalized extreme value family and that nearby sites are asymptotically dependent. We generalize the HEVP by relaxing these assumptions to provide a wider class of marginal distributions via a Dirichlet process prior for the spatial random effects distribution. In addition, we present a hybrid max-mixture model that combines the strengths of the parametric and semi-parametric models. We show that this versatile max-mixture model accommodates both asymptotic independence and dependence and can be fit using standard Markov chain Monte Carlo algorithms. The utility of our model is evaluated in Monte Carlo simulation studies and application to Netherlands wind gust data.

preprint2020arXiv

A First Look at the Deprecation of RESTful APIs: An Empirical Study

REpresentational State Transfer (REST) is considered a standard software architectural style for building web APIs that can integrate software systems over the internet. However, while connecting systems, RESTful APIs might also break the dependent applications that rely on their services when they introduce breaking changes, e.g., an older version of the API is no longer supported. To warn developers promptly and thus prevent critical impact on downstream applications, a deprecated-removed model should be followed, and deprecation-related information such as alternative approaches should also be listed. While API deprecation analysis as a theme is not new, most existing work focuses on non-web APIs, such as the ones provided by Java and Android. To investigate RESTful API deprecation, we propose a framework called RADA (RESTful API Deprecation Analyzer). RADA is capable of automatically identifying deprecated API elements and analyzing impacted operations from an OpenAPI specification, a machine-readable profile for describing RESTful web services. We apply RADA on 2,224 OpenAPI specifications of 1,368 RESTful APIs collected from APIs.guru, the largest directory of OpenAPI specifications. Based on the data mined by RADA, we perform an empirical study to investigate how the deprecated-removed protocol is followed in RESTful APIs and characterize practices in RESTful API deprecation. The results of our study reveal several severe deprecation-related problems in existing RESTful APIs. Our implementation of RADA and detailed empirical results are publicly available for future intelligent tools that could automatically identify and migrate usage of deprecated RESTful API operations in client code.

preprint2020arXiv

A Novel Cascade Binary Tagging Framework for Relational Triple Extraction

Extracting relational triples from unstructured text is crucial for large-scale knowledge graph construction. However, few existing works excel in solving the overlapping triple problem where multiple relational triples in the same sentence share the same entities. In this work, we introduce a fresh perspective to revisit the relational triple extraction task and propose a novel cascade binary tagging framework (CasRel) derived from a principled problem formulation. Instead of treating relations as discrete labels as in previous works, our new framework models relations as functions that map subjects to objects in a sentence, which naturally handles the overlapping problem. Experiments show that the CasRel framework already outperforms state-of-the-art methods even when its encoder module uses a randomly initialized BERT encoder, showing the power of the new tagging framework. It enjoys further performance boost when employing a pre-trained BERT encoder, outperforming the strongest baseline by 17.5 and 30.2 absolute gain in F1-score on two public datasets NYT and WebNLG, respectively. In-depth analysis on different scenarios of overlapping triples shows that the method delivers consistent performance gain across all these scenarios. The source code and data are released online.

preprint2020arXiv

AstroSeis -- A 3D Boundary element modeling code for seismic wavefields in irregular asteroids and bodies

We developed a 3-D elastic Boundary Element Method (BEM) computer code, called AstroSeis, to model seismic wavefields in a body with an arbitrary shape, such as an asteroid. Besides handling arbitrary surface topography, AstroSeis can deal with a liquid core in an asteroid model. Both the solid and liquid domains are homogeneous in our current code. For seismic sources, we can use single forces or moment tensors. AstroSeis is implemented in the frequency domain, and a frequency-dependent Q can be readily incorporated. The code is in MATLAB, and it is straightforward to set up a model and run it. The frequency-domain calculation is advantageous for studying the long-term elastic response of a celestial body to a cyclic force such as the tidal force, with none of the numerical dispersion issues suffered by many other methods that require volume meshing. AstroSeis has been benchmarked against other methods such as normal modes summation and the direct solution method (DSM). This open-source code will be a useful tool to study the interior and surface processes of asteroids.

preprint2020arXiv

Dual-path CNN with Max Gated block for Text-Based Person Re-identification

Text-based person re-identification (Re-id) is an important task in video surveillance, which consists of retrieving the corresponding person's image given a textual description from a large gallery of images. It is difficult to directly match visual contents with the textual descriptions due to the modality heterogeneity. On the one hand, the textual embeddings are not discriminative enough, which originates from the high abstraction of the textual descriptions. On the other hand, global average pooling (GAP) is commonly utilized to extract more general or smoothed features implicitly but ignores salient local features, which are more important for the cross-modal matching problem. With that in mind, a novel Dual-path CNN with Max Gated block (DCMG) is proposed to extract discriminative word embeddings and make the visual-textual association focus more on the salient features of both modalities. The proposed framework is based on two deep residual CNNs jointly optimized with cross-modal projection matching (CMPM) loss and cross-modal projection classification (CMPC) loss to embed the two modalities into a joint feature space. First, the pre-trained language model, BERT, is combined with the convolutional neural network (CNN) to learn better word embeddings in the text-to-image matching domain. Second, the global max pooling (GMP) layer is applied to make the visual-textual features focus more on the salient part. To further alleviate the noise of the max-pooled features, the gated block (GB) is proposed to produce an attention map that focuses on meaningful features of both modalities. Finally, extensive experiments are conducted on the benchmark dataset, CUHK-PEDES, in which our approach achieves the rank-1 score of 55.81% and outperforms the state-of-the-art method by 1.3%.

preprint2020arXiv

Efficient Estimation of Material Property Curves and Surfaces via Active Learning

The relationship between material properties and independent variables such as temperature, external field or time is usually represented by a curve or surface in a multi-dimensional space. Determining such a curve or surface requires a series of experiments or calculations which are often time-consuming and costly. A general strategy uses an appropriate utility function to sample the space to recommend the next optimal experiment or calculation within an active learning loop. However, knowing which sampling strategy to use to minimize the number of experiments is an outstanding problem. We compare a number of strategies based on directed exploration on several materials problems of varying complexity using a Kriging-based model. These include one-dimensional curves such as the fatigue life curve for 304L stainless steel and the liquidus line of the Fe-C phase diagram, surfaces such as the Hartmann 3 function in 3D space and the fitted intermolecular potential for Ar-SH, and a four-dimensional data set of experimental measurements for BaTiO3-based ceramics. We also consider the effects of experimental noise on the Hartmann 3 function. We find that directed exploration guided by maximum variance provides better performance overall, converging faster across several data sets. However, for certain problems, the trade-off methods incorporating exploitation can perform at least as well as, if not better than, maximum variance. Thus, we discuss how the choice of the utility function depends on the distribution of the data, the model performance and uncertainties, additive noise as well as the budget.
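
To make the maximum-variance rule mentioned in this abstract concrete, here is a minimal pure-Python sketch. It assumes a noise-free Kriging (Gaussian process) model with a squared-exponential kernel on a 1D input; the kernel, the length scale, the candidate grid and all function names are illustrative choices, not details taken from the paper.

```python
import math

def rbf(a, b, length_scale=0.2):
    """Squared-exponential kernel between two scalar inputs (assumed kernel)."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def predictive_variance(x_new, x_seen, jitter=1e-8):
    """Kriging posterior variance at x_new given the already sampled inputs."""
    K = [[rbf(a, b) + (jitter if i == j else 0.0)
          for j, b in enumerate(x_seen)] for i, a in enumerate(x_seen)]
    k_star = [rbf(x_new, a) for a in x_seen]
    alpha = solve(K, k_star)  # K^{-1} k_star
    return rbf(x_new, x_new) - sum(k * a for k, a in zip(k_star, alpha))

def next_experiment(candidates, x_seen):
    """Maximum-variance rule: pick the candidate the model is least sure about."""
    return max(candidates, key=lambda x: predictive_variance(x, x_seen))

candidates = [i / 10 for i in range(11)]  # hypothetical grid on [0, 1]
x_seen = [0.0, 0.1, 0.2]                  # measurements clustered on the left
chosen = next_experiment(candidates, x_seen)
print(chosen)  # the point farthest from the sampled cluster
```

In this noise-free setting the posterior variance depends only on where measurements were taken, not on the measured values, which is why pure exploration needs no property data to rank candidates. The trade-off methods mentioned in the abstract would additionally use the predicted mean.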

preprint2020arXiv

Exploring an Application of Virtual Reality for Early Detection of Dementia

To address the severe global dementia problem, we conducted an exploration using virtual reality (VR) technology. This report lays a technical foundation for the further research project "Early Detection of Dementia Using Testing Tools in VR Environment" and illustrates the process of developing a VR application using Unity 3D software on Oculus Go. This preliminary exploration is composed of three steps: 3D virtual scene construction, VR interaction design and monitoring. The exploration was recorded to provide basic technical guidance and a detailed method for subsequent research.

preprint2020arXiv

Object Detection in the Context of Mobile Augmented Reality

In the past few years, numerous Deep Neural Network (DNN) models and frameworks have been developed to tackle the problem of real-time object detection from RGB images. Ordinary object detection approaches process information from the images only, and they are oblivious to the camera pose with regard to the environment and the scale of the environment. On the other hand, mobile Augmented Reality (AR) frameworks can continuously track a camera's pose within the scene and can estimate the correct scale of the environment by using Visual-Inertial Odometry (VIO). In this paper, we propose a novel approach that combines the geometric information from VIO with semantic information from object detectors to improve the performance of object detection on mobile devices. Our approach includes three components: (1) an image orientation correction method, (2) a scale-based filtering approach, and (3) an online semantic map. Each component takes advantage of the different characteristics of the VIO-based AR framework. We implemented the AR-enhanced features using ARCore and the SSD Mobilenet model on Android phones. To validate our approach, we manually labeled objects in image sequences taken from 12 room-scale AR sessions. The results show that our approach can improve on the accuracy of generic object detectors by 12% on our dataset.
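
One way to picture the scale-based filtering component named in this abstract is the sketch below. It is an assumption-laden illustration, not the authors' method: it uses the standard pinhole-camera relation (physical width = pixel width × depth / focal length) and a hypothetical table of plausible object sizes, with the depth assumed to come from the AR framework's tracking.

```python
from dataclasses import dataclass

# Hypothetical plausible physical widths in metres per class (illustrative values only).
SIZE_RANGE_M = {"cup": (0.05, 0.15), "chair": (0.35, 1.0)}

@dataclass
class Detection:
    label: str
    box_width_px: float  # bounding-box width in pixels
    depth_m: float       # distance to the object, assumed to come from AR tracking

def physical_width_m(det, focal_length_px):
    """Pinhole-camera estimate: real width = pixel width * depth / focal length."""
    return det.box_width_px * det.depth_m / focal_length_px

def scale_filter(detections, focal_length_px):
    """Keep detections whose estimated physical size is plausible for their label."""
    kept = []
    for det in detections:
        # Labels without a known size range are passed through unchanged.
        low, high = SIZE_RANGE_M.get(det.label, (0.0, float("inf")))
        if low <= physical_width_m(det, focal_length_px) <= high:
            kept.append(det)
    return kept

detections = [
    Detection("cup", box_width_px=50, depth_m=1.0),   # about 0.10 m wide: plausible
    Detection("cup", box_width_px=400, depth_m=2.0),  # about 1.6 m wide: rejected
]
kept = scale_filter(detections, focal_length_px=500.0)
print([d.box_width_px for d in kept])
```

An image-only detector cannot apply such a check because it has no metric depth, which is the kind of geometric information the abstract says VIO provides.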

preprint2020arXiv

Off-Policy Reinforcement Learning for Efficient and Effective GAN Architecture Search

In this paper, we introduce a new reinforcement learning (RL) based neural architecture search (NAS) methodology for effective and efficient generative adversarial network (GAN) architecture search. The key idea is to formulate the GAN architecture search problem as a Markov decision process (MDP) for smoother architecture sampling, which enables a more effective RL-based search algorithm by targeting the potential global optimal architecture. To improve efficiency, we exploit an off-policy GAN architecture search algorithm that makes efficient use of the samples generated by previous policies. Evaluation on two standard benchmark datasets (i.e., CIFAR-10 and STL-10) demonstrates that the proposed method is able to discover highly competitive architectures for generally better image generation results with a considerably reduced computational burden: 7 GPU hours. Our code is available at https://github.com/Yuantian013/E2GAN.

preprint2020arXiv

Real-Time Model Calibration with Deep Reinforcement Learning

The dynamic, real-time, and accurate inference of model parameters from empirical data is of great importance in many scientific and engineering disciplines that use computational models (such as a digital twin) for the analysis and prediction of complex physical processes. However, fast and accurate inference for processes with large and high dimensional datasets cannot easily be achieved with state-of-the-art methods under noisy real-world conditions. The primary reason is that the inference of model parameters with traditional techniques based on optimisation or sampling often suffers from computational and statistical challenges, resulting in a trade-off between accuracy and deployment time. In this paper, we propose a novel framework for inference of model parameters based on reinforcement learning. The contribution of the paper is twofold: 1) We reformulate the inference problem as a tracking problem with the objective of learning a policy that forces the response of the physics-based model to follow the observations; 2) We propose the constrained Lyapunov-based actor-critic (CLAC) algorithm to enable the robust and accurate inference of physics-based model parameters in real time under noisy real-world conditions. The proposed methodology is demonstrated and evaluated on two model-based diagnostics test cases utilizing two different physics-based models of turbofan engines. The performance of the methodology is compared to that of two alternative approaches: a state update method (unscented Kalman filter) and a supervised end-to-end mapping with deep neural networks. The experimental results demonstrate that the proposed methodology outperforms all other tested methods in terms of speed and robustness, with high inference accuracy.

preprint2020arXiv

Relativistic Mean-Field Approach in Nuclear Systems

A new scheme to study the properties of finite nuclei is proposed based on the Dirac-Brueckner-Hartree-Fock (DBHF) approach starting from a bare nucleon-nucleon interaction. The relativistic structure of the nucleon self-energies in nuclear matter, depending on density, momentum and isospin asymmetry, is determined through a subtracted T-matrix technique and parameterized, which makes the self-energies easily accessible for general use. The scalar and vector potentials of a single particle in nuclei are generated via a local density approximation (LDA). The surface effect of finite nuclei can be taken into account by an improved LDA (ILDA), which has successfully been applied in microscopic derivations of the optical model potential for nucleon-nucleus scattering. The bulk properties of nuclei can be determined in a self-consistent scheme for nuclei all over the nuclear mass table. Calculated binding energies agree very well with the empirical data, while the predicted values for radii and spin-orbit splitting of single-particle energies are about 10% smaller than the experimental data. Basic features of more sophisticated DBHF calculations for finite nuclei are reproduced.

preprint2020arXiv

Site testing campaign for the Large Optical/infrared Telescope of China: Overview

The Large Optical/infrared Telescope (LOT) is a ground-based 12m diameter optical/infrared telescope which is proposed to be built in the western part of China in the next decade. Based on satellite remote sensing data, along with geographical, logistical and political considerations, three candidate sites were chosen for ground-based astronomical performance monitoring. These sites include: Ali in Tibet, Daocheng in Sichuan, and Muztagh Ata in Xinjiang. Up until now, all three sites have continuously collected data for two years. In this paper, we will introduce this site testing campaign, and present its monitoring results obtained during the period between March 2017 and March 2019.

preprint2020arXiv

Toward Autonomous Robotic Micro-Suturing using Optical Coherence Tomography Calibration and Path Planning

Robotic automation has the potential to assist human surgeons in performing suturing tasks in microsurgery, and in order to do so a robot must be able to guide a needle with sub-millimeter precision through soft tissue. This paper presents a robotic suturing system that uses a 3D optical coherence tomography (OCT) system for imaging feedback. Calibration of the robot-OCT and robot-needle transforms, wound detection, keypoint identification, and path planning are all performed automatically. The calibration method handles pose uncertainty when the needle is grasped using a variant of iterative closest points. The path planner uses the identified wound shape to calculate needle entry and exit points to yield an evenly-matched wound shape after closure. Experiments on tissue phantoms and animal tissue demonstrate that the system can pass a suture needle through wounds with 0.27 mm overall accuracy in achieving the planned entry and exit points.

preprint2019arXiv

Dimensional hierarchy of higher-order topology in three-dimensional sonic crystals

Topological phases of matter have been extensively studied for their intriguing bulk and edge properties. Recently, higher-order topological insulators with boundary states that are two or more dimensions lower than the bulk states have been proposed and investigated as novel states of matter. Previous implementations of higher-order topological insulators were based on two-dimensional (2D) systems in which 1D gapped edge states and 0D localized corner states were observed. Here we theoretically design and experimentally realize a 3D higher-order topological insulator in a sonic crystal with a large topological band gap. We observe the coexistence of third-, second- and first-order topological boundary states with codimension three, two and one, respectively, indicating a dimensional hierarchy of higher-order topological phenomena in 3D crystals. Our acoustic metamaterial goes beyond the description of tight-binding models and possesses a band structure which automatically breaks the chiral symmetry, leading to the separation of bulk, surface, hinge and corner states. Our study opens a new route toward higher-order topological phenomena in three dimensions and paves the way for topological wave trapping and manipulation in a hierarchy of dimensions in a single system.

preprint2019arXiv

Rapid falling of an orbiting moon to its parent planet due to tidal-seismic resonance

Tidal force plays an important role in the evolution of the planet-moon system. The tidal force of a moon can excite seismic waves in the planet it is orbiting. A tidal-seismic resonance is expected when a tidal force frequency matches a free-oscillation frequency of the planet. Here we show that when the moon is close to the planet, the tidal-seismic resonance can cause large-amplitude seismic waves, which can change the shape of the planet and in turn exert a negative torque on the moon to cause it to fall rapidly toward the planet. We postulate that the tidal-seismic resonance may be an important mechanism which can accelerate the planet accretion process. On the other hand, the tidal-seismic resonance effect can also be used to probe the planet's interior through long-term tracking of the orbital change of the moon.

preprint2019arXiv

Symmetry-protected hierarchy of anomalous multipole topological band gaps in nonsymmorphic metacrystals

Symmetry and topology are two fundamental aspects of many quantum states of matter. Recently, new topological materials, higher-order topological insulators, were discovered, featuring, e.g., bulk-edge-corner correspondence that goes beyond the conventional topological paradigms. Here, we discover experimentally that the nonsymmorphic $p4g$ acoustic metacrystals host a symmetry-protected hierarchy of topological multipoles: the lowest band gap has a quantized Wannier dipole and can mimic the quantum spin Hall effect, while the second band gap exhibits quadrupole topology with anomalous Wannier bands. Such a topological hierarchy allows us to observe experimentally distinct, multiplexing topological phenomena and to reveal a topological transition triggered by the geometry-transition from the $p4g$ group to the $C_{4v}$ group which demonstrates elegantly the fundamental interplay between symmetry and topology. Our study demonstrates an instance that classical systems with controllable geometry can serve as powerful simulators for the discovery of novel topological states of matter and their phase transitions.