Researcher profile

Zhengyi Li

Zhengyi Li contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
8 works
0 followers
10 topics
4 close collaborators


Published work

8 published items

Preprint, 2026, arXiv

On the (In-)Security of the Shuffling Defense in the Transformer Secure Inference

For Transformer models, cryptographically secure inference ensures that the client learns only the final output, while the server learns nothing about the client's input. However, securely computing nonlinear layers remains a major efficiency bottleneck due to the substantial communication rounds and data transmission required. To address this issue, prior works reveal intermediate activations to the client, allowing nonlinear operations to be computed in plaintext. Although this approach significantly improves efficiency, exposing activations enables adversaries to extract model weights. To mitigate this risk, existing works employ a shuffling defense that reveals only randomly permuted activations to the client. In this work, we show that the shuffling defense is not as robust as previously claimed. We propose an attack that aligns differently shuffled activations to a common permutation and subsequently exploits them to extract model weights. Experiments on Pythia-70m and GPT-2 demonstrate that the proposed attack can align shuffled activations with mean squared errors ranging from $10^{-9}$ to $10^{-6}$. With a query cost of approximately \$1, the adversary can recover model weights with L1-norm differences ranging from $10^{-4}$ to $10^{-2}$ compared to the oracle weights.
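The shuffling defense described in this abstract relies on the fact that an elementwise nonlinearity commutes with a permutation. The following is a minimal plaintext sketch of that idea only, not the protocol or the attack from the paper: the function names are hypothetical, GELU is an assumed choice of nonlinearity, and the secret sharing used in real secure inference is omitted.

```python
import math
import random


def gelu(x: float) -> float:
    # Exact GELU expressed through the Gaussian error function.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def shuffle(values, perm):
    # Output position i holds the input that was at position perm[i].
    return [values[p] for p in perm]


def unshuffle(values, perm):
    # Inverse of shuffle: put each value back at its original position.
    out = [0.0] * len(values)
    for i, p in enumerate(perm):
        out[p] = values[i]
    return out


def shuffled_nonlinear(activations, rng):
    """Toy round trip: server shuffles, client applies GELU, server unshuffles."""
    perm = list(range(len(activations)))
    rng.shuffle(perm)                          # secret permutation (server side)
    revealed = shuffle(activations, perm)      # what the client gets to see
    client_out = [gelu(v) for v in revealed]   # plaintext nonlinearity (client side)
    return unshuffle(client_out, perm), revealed
```

In this sketch the client sees every activation value and loses only the positions, so the multiset of values is still exposed. That is the kind of leakage the abstract's alignment attack builds on.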

Preprint, 2022, arXiv

Block-Skim: Efficient Question Answering for Transformer

Transformer models have achieved promising results on natural language processing (NLP) tasks including extractive question answering (QA). Common Transformer encoders used in NLP tasks process the hidden states of all input tokens in the context paragraph throughout all layers. However, unlike other tasks such as sequence classification, answering the question does not necessarily need all the tokens in the context paragraph. Following this motivation, we propose Block-Skim, which learns to skim unnecessary context in higher hidden layers to improve and accelerate Transformer performance. The key idea of Block-Skim is to identify the context that must be further processed and the context that can be safely discarded early on during inference. Critically, we find that such information can be sufficiently derived from the self-attention weights inside the Transformer model. We further prune the hidden states corresponding to the unnecessary positions early in lower layers, achieving significant inference-time speedup. To our surprise, we observe that models pruned in this way outperform their full-size counterparts. Block-Skim improves QA models' accuracy on different datasets and achieves a 3x speedup on the BERT-base model.
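The abstract says block relevance can be derived from self-attention weights. As a rough illustration only, the sketch below swaps the learned predictor for a fixed heuristic, the mean attention each block receives, and keeps the highest-scoring blocks. The function names, `block_size`, and `keep_ratio` are hypothetical and do not come from the paper.

```python
import math


def block_scores(attn, block_size):
    # attn[i][j] is the attention weight from query token i to key token j.
    n = len(attn)
    # Average attention each key position receives across all queries.
    received = [sum(attn[i][j] for i in range(n)) / n for j in range(n)]
    scores = []
    for start in range(0, n, block_size):
        block = received[start:start + block_size]
        scores.append(sum(block) / len(block))
    return scores


def keep_positions(attn, block_size, keep_ratio):
    """Return the token positions in the highest-scoring blocks, in order."""
    scores = block_scores(attn, block_size)
    n_keep = max(1, math.ceil(keep_ratio * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda b: scores[b], reverse=True)
    kept_blocks = sorted(ranked[:n_keep])
    n = len(attn)
    return [
        j
        for b in kept_blocks
        for j in range(b * block_size, min((b + 1) * block_size, n))
    ]
```

For example, with four tokens that all attend mostly to the last two, `keep_positions(attn, block_size=2, keep_ratio=0.5)` keeps positions 2 and 3. The hidden states at the remaining positions would be dropped before later layers.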

Preprint, 2022, arXiv

Holistic Transformer: A Joint Neural Network for Trajectory Prediction and Decision-Making of Autonomous Vehicles

Trajectory prediction and behavioral decision-making are two important tasks for autonomous vehicles that require a good understanding of the environmental context; behavioral decisions are better made by referring to the outputs of trajectory predictions. However, most current solutions perform these two tasks separately. Therefore, a joint neural network that combines multiple cues, named the holistic transformer, is proposed to predict trajectories and make behavioral decisions simultaneously. To better explore the intrinsic relationships between cues, the network uses existing knowledge and adopts three kinds of attention mechanisms: the sparse multi-head type for reducing noise impact, the feature selection sparse type for optimally using partial prior knowledge, and the multi-head with sigmoid activation type for optimally using posteriori knowledge. Compared with other trajectory prediction models, the proposed model has better comprehensive performance and good interpretability. Perceptual noise robustness experiments demonstrate that the proposed model has good noise robustness. Thus, simultaneous trajectory prediction and behavioral decision-making combining multiple cues can reduce computational costs and enhance semantic relationships between scenes and agents.

Preprint, 2022, arXiv

Lalaine: Measuring and Characterizing Non-Compliance of Apple Privacy Labels at Scale

As a key supplement to privacy policies that are known to be lengthy and difficult to read, Apple has launched app privacy labels, which purportedly help users more easily understand an app's privacy practices. However, false and misleading privacy labels can dupe privacy-conscious consumers into downloading data-intensive apps, ultimately eroding the credibility and integrity of the labels. Although Apple releases requirements and guidelines for app developers to create privacy labels, little is known about whether and to what extent the privacy labels in the wild are correct and compliant, reflecting the actual data practices of iOS apps. This paper presents the first systematic study, based on our new methodology named Lalaine, to evaluate data-flow to privacy-label (flow-to-label) consistency. Lalaine analyzed the privacy labels and binaries of 5,102 iOS apps, shedding light on the prevalence and seriousness of privacy-label non-compliance. We provide detailed case studies and analyze root causes of privacy-label non-compliance that complement prior understandings. This has led to new insights for improving privacy-label design and compliance requirements, so app developers, platform stakeholders, and policy-makers can better achieve their privacy and accountability goals. Lalaine is thoroughly evaluated for its high effectiveness and efficiency. We are responsibly reporting the results to stakeholders.

Preprint, 2022, arXiv

Learning invariance preserving moment closure model for Boltzmann-BGK equation

As one of the main governing equations in kinetic theory, the Boltzmann equation is widely utilized in aerospace, microscopic flow, etc. Its high-resolution simulation is crucial in these related areas. However, due to the high dimensionality of the Boltzmann equation, high-resolution simulations are often difficult to achieve numerically. The moment method, which was first proposed by Grad, is among the popular numerical methods to achieve efficient high-resolution simulations. We can derive the governing equations in the moment method by taking moments on both sides of the Boltzmann equation, which effectively reduces the dimensionality of the problem. However, one of the main challenges is that it leads to an unclosed moment system, and closure is needed to obtain a closed moment system. Designing closures for moment systems is truly an art and has been a significant research field in kinetic theory. Other than the traditional human designs of closures, the machine learning-based approach has attracted much attention lately in Han et al. and Huang et al. In this work, we propose a machine learning-based method to derive a moment closure model for the Boltzmann-BGK equation. In particular, the closure relation is approximated by a carefully designed deep neural network that possesses desirable physical invariances, i.e., the Galilean invariance, reflecting invariance, and scaling invariance, inherited from the original Boltzmann-BGK equation and playing an important role in the correct simulation of the Boltzmann equation. Numerical simulations on the 1D-1D examples including the smooth and discontinuous initial condition problems, Sod shock tube problem, the shock structure problems, and the 1D-3D examples including the smooth and discontinuous problems demonstrate satisfactory numerical performances of the proposed invariance preserving neural closure method.
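The closure problem mentioned in this abstract can be written out in a simplified setting. The notation below is assumed for illustration and is not taken from the paper: one space and one velocity dimension, monomial moments $m_k$, relaxation time $\tau$, local Maxwellian $f^{M}$, and a learned closure $\mathcal{N}_\theta$.

$$
\partial_t f + v\,\partial_x f = \frac{1}{\tau}\bigl(f^{M} - f\bigr),
\qquad
m_k(x,t) = \int_{\mathbb{R}} v^{k} f(x,v,t)\,dv
$$

Multiplying the equation by $v^{k}$ and integrating over $v$ gives

$$
\partial_t m_k + \partial_x m_{k+1} = \frac{1}{\tau}\bigl(m_k^{M} - m_k\bigr),
\qquad k = 0, 1, \dots, N,
$$

so the equation for the highest retained moment $m_N$ involves $m_{N+1}$ and the system is unclosed. A closure supplies the missing moment as a function of the retained ones, for example $m_{N+1} \approx \mathcal{N}_\theta(m_0, \dots, m_N)$. The abstract's neural closure plays this role, with the added requirement that it respect the stated invariances.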

Preprint, 2022, arXiv

Transkimmer: Transformer Learns to Layer-wise Skim

The Transformer architecture has become the de facto model for many machine learning tasks, from natural language processing to computer vision. As such, improving its computational efficiency becomes paramount. One of the major computational inefficiencies of Transformer-based models is that they spend an identical amount of computation throughout all layers. Prior works have proposed to augment the Transformer model with the capability of skimming tokens to improve its computational efficiency. However, they lack effective, end-to-end optimization of the discrete skimming predictor. To address these limitations, we propose the Transkimmer architecture, which learns to identify hidden state tokens that are not required by each layer. The skimmed tokens are then forwarded directly to the final output, thus reducing the computation of the successive layers. The key idea in Transkimmer is to add a parameterized predictor before each layer that learns to make the skimming decision. We also propose to adopt the reparameterization trick and add a skim loss for the end-to-end training of Transkimmer. Transkimmer achieves a 10.97x average speedup on the GLUE benchmark compared with the vanilla BERT-base baseline, with less than 1% accuracy degradation.
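The abstract mentions a reparameterization trick for training the discrete skim decision but does not name a specific one. A common choice for discrete decisions is the Gumbel-softmax relaxation, which the sketch below assumes. The function names, the two-way `[skim, keep]` logit layout, and the temperature `tau` are hypothetical.

```python
import math
import random


def gumbel_softmax(logits, tau, rng):
    """Draw one relaxed (soft) sample over the categories given by logits."""
    noisy = []
    for logit in logits:
        # Clamp u away from 0 and 1 so both logarithms stay finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))  # standard Gumbel(0, 1) noise
        noisy.append((logit + gumbel) / tau)
    # Numerically stable softmax over the perturbed logits.
    top = max(noisy)
    exps = [math.exp(v - top) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def skim_decision(logits, tau, rng):
    # logits = [skim_logit, keep_logit]; returns True when the token is kept.
    probs = gumbel_softmax(logits, tau, rng)
    return probs[1] >= probs[0]
```

In a differentiable framework the soft sample would carry gradients back to the predictor, while the hard decision selects which tokens the layer processes. This pure-Python version shows only the sampling step.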

Preprint, 2022, arXiv

Tunable magnetically induced transparency spectra in magnon-magnon coupled Y3Fe5O12/permalloy bilayers

Hybrid magnonic systems host a variety of characteristic quantum phenomena such as the magnetically-induced transparency (MIT) and Purcell effect, which are considered useful for future coherent quantum information processing. In this work, we experimentally demonstrate a tunable MIT effect in the Y3Fe5O12(YIG)/Permalloy(Py) magnon-magnon coupled system via changing the magnetic field orientations. By probing the magneto-optic effects of Py and YIG, we identify clear features of MIT spectra induced by the mode hybridization between the uniform mode of Py and the perpendicular standing spin-wave modes of YIG. By changing the external magnetic field orientations, we observe a tunable coupling strength between the YIG's spin-wave modes and the Py's uniform mode, upon the application of an out-of-plane magnetic field. This observation is theoretically interpreted by a geometrical consideration of the Py and YIG magnetization under the oblique magnetic field even at a constant interfacial exchange coupling. Our findings show high promise for investigating tunable coherent phenomena with hybrid magnonic platforms.

Preprint, 2020, arXiv

The Nelson-Seiberg theorem generalized with nonpolynomial superpotentials

The Nelson-Seiberg theorem relates R-symmetries to F-term supersymmetry breaking, and provides a guiding rule for new physics model building beyond the Standard Model. A revision of the theorem gives a necessary and sufficient condition for supersymmetry breaking in models with polynomial superpotentials. This work revisits the theorem to include models with nonpolynomial superpotentials. With a generic R-symmetric superpotential, a singularity at the origin of the field space implies both R-symmetry breaking and supersymmetry breaking. We give a generalized necessary and sufficient condition for supersymmetry breaking which applies to both perturbative and nonperturbative models.