Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
13 works
0 followers
21 topics
4 close collaborators

Published work

13 published items

preprint, 2026, arXiv

Mistletoe: Stealthy Acceleration-Collapse Attacks on Speculative Decoding

Speculative decoding has become a widely adopted technique for accelerating large language model (LLM) inference by drafting multiple candidate tokens and verifying them with a target model in parallel. Its efficiency, however, critically depends on the average accepted length $τ$, i.e., how many draft tokens survive each verification step. In this work, we identify a new mechanism-level vulnerability in model-based speculative decoding: the drafter is trained to approximate the target model distribution, but this approximation is inevitably imperfect. Such a drafter-target mismatch creates a hidden attack surface where small perturbations can preserve the target model's visible behavior while substantially reducing draft-token acceptability. We propose Mistletoe, a stealthy acceleration-collapse attack against speculative decoding. Mistletoe directly targets the acceptance mechanism of speculative decoding. It jointly optimizes a degradation objective that decreases drafter-target agreement and a semantic-preservation objective that constrains the target model's output distribution. To resolve the conflict between these objectives, we introduce a null-space projection mechanism, where degradation gradients are projected away from the local semantic-preserving direction, suppressing draft acceptance while minimizing semantic drift. Experiments on various speculative decoding systems show that Mistletoe substantially reduces average accepted length $τ$, collapses speedup, and lowers averaged token throughput, while preserving output quality and perplexity. Our work highlights that speculative decoding introduces a mechanism-level attack surface beyond existing output robustness, calling for more robust designs of LLM acceleration systems.
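The null-space projection mentioned in the abstract can be illustrated with a minimal sketch. It assumes the semantic-preserving direction is a single gradient vector and that "projecting away" means subtracting the component of the degradation gradient along that vector. The function name `project_out` and the toy vectors are hypothetical and are not taken from the paper.

```python
def project_out(grad, direction, eps=1e-12):
    """Return `grad` minus its component along `direction`.

    Assumption: a single-vector projection; the paper's actual
    mechanism may use a different subspace or normalization.
    """
    norm_sq = sum(d * d for d in direction)
    if norm_sq < eps:
        # Degenerate direction: nothing to project out.
        return list(grad)
    scale = sum(g * d for g, d in zip(grad, direction)) / norm_sq
    return [g - scale * d for g, d in zip(grad, direction)]


# Toy vectors (hypothetical values for illustration only).
degradation_grad = [1.0, 2.0, 3.0]
semantic_grad = [0.0, 1.0, 0.0]

projected = project_out(degradation_grad, semantic_grad)
# Inner product with the semantic direction; zero after projection.
overlap = sum(p * s for p, s in zip(projected, semantic_grad))
print(projected, overlap)
```

After the projection, the update has no first-order component along the semantic gradient, which is one way to read "suppressing draft acceptance while minimizing semantic drift".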

preprint, 2023, arXiv

A Novel Estimation Method for Temperature of Magnetic Nanoparticles Dominated by Brownian Relaxation Based on Magnetic Particle Spectroscopy

This paper presents a novel method for estimating the temperature of magnetic nanoparticles (MNPs) based on AC magnetization harmonics of MNPs dominated by Brownian relaxation. The difference in the AC magnetization response and magnetization harmonic between the Fokker-Planck equation and the Langevin function was analyzed, and we studied the relationship between the magnetization harmonic and the key factors, such as Brownian relaxation time, temperature, magnetic field strength, core size and hydrodynamic size of MNPs, excitation frequency, and so on. We proposed a compensation function for AC magnetization harmonic with consideration of the key factors and the difference between the Fokker-Planck equation and the Langevin function. Then a temperature estimation model based on the compensation function and the Langevin function was established. By employing the least squares algorithm, the temperature was successfully calculated. The experimental results show that the temperature error is less than 0.035 K in the temperature range from 310 K to 320 K. The temperature estimation model is expected to improve the performance of the magnetic nanoparticle thermometer and be applied to magnetic nanoparticle-mediated hyperthermia.
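The least-squares step can be sketched on a much simpler model than the one in the abstract. The example below fits a single temperature to noise-free synthetic data generated from the Langevin function alone; it omits the paper's compensation function and harmonic analysis. The constant `a`, the field values, and the search interval are hypothetical.

```python
import math


def langevin(x):
    """Langevin function L(x) = coth(x) - 1/x, with a series fallback near 0."""
    if abs(x) < 1e-4:
        return x / 3.0
    return 1.0 / math.tanh(x) - 1.0 / x


def model(field, temperature, a=100.0):
    """Normalized magnetization L(a * H / T); `a` is a hypothetical constant."""
    return langevin(a * field / temperature)


def sse(temperature, fields, measured):
    """Sum of squared residuals between the model and the measurements."""
    return sum((model(h, temperature) - m) ** 2 for h, m in zip(fields, measured))


def fit_temperature(fields, measured, lo=300.0, hi=330.0, iters=200):
    """One-parameter least squares via ternary search on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if sse(m1, fields, measured) < sse(m2, fields, measured):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


# Synthetic, noise-free measurements at an assumed true temperature.
fields = [1.0, 2.0, 4.0, 6.0, 8.0, 10.0]
true_temperature = 315.0
measured = [model(h, true_temperature) for h in fields]
estimate = fit_temperature(fields, measured)
print(estimate)
```

Ternary search is used only because this toy problem has one unknown and a single minimum on the interval; a multi-parameter fit would need a general least-squares solver.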

preprint, 2022, arXiv

A microstructure estimation Transformer inspired by sparse representation for diffusion MRI

Diffusion magnetic resonance imaging (dMRI) is an important tool in characterizing tissue microstructure based on biophysical models, which are complex and highly non-linear. Resolving microstructures with optimization techniques is prone to estimation errors and requires dense sampling in the q-space. Deep learning based approaches have been proposed to overcome these limitations. Motivated by the superior performance of the Transformer, in this work, we present a learning-based framework based on the Transformer, namely, a Microstructure Estimation Transformer with Sparse Coding (METSC) for dMRI-based microstructure estimation with downsampled q-space data. To take advantage of the Transformer while addressing its requirement for large training data, we explicitly introduce an inductive bias (model bias) into the Transformer using a sparse coding technique to facilitate the training process. Thus, the METSC is composed of three stages: an embedding stage, a sparse representation stage, and a mapping stage. The embedding stage is a Transformer-based structure that encodes the signal to ensure the voxel is represented effectively. In the sparse representation stage, a dictionary is constructed by solving a sparse reconstruction problem that unfolds the Iterative Hard Thresholding (IHT) process. The mapping stage is essentially a decoder that computes the microstructural parameters from the output of the second stage, based on the weighted sum of normalized dictionary coefficients where the weights are also learned. We tested our framework on two dMRI models with downsampled q-space data, including the intravoxel incoherent motion (IVIM) model and the neurite orientation dispersion and density imaging (NODDI) model. The proposed method achieved up to 11.25-fold acceleration in scan time and outperformed the other state-of-the-art learning-based methods.
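For readers unfamiliar with Iterative Hard Thresholding, the plain (non-learned) iteration that the sparse representation stage unfolds can be sketched as follows. This is the textbook update, not the METSC network; the toy dictionary `D`, the sparsity level, and the step size are assumptions chosen so the example is easy to check by hand.

```python
def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * v for a, v in zip(row, vec)) for row in matrix]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def hard_threshold(vec, sparsity):
    """Keep the `sparsity` largest-magnitude entries and zero the rest."""
    order = sorted(range(len(vec)), key=lambda i: abs(vec[i]), reverse=True)
    keep = set(order[:sparsity])
    return [v if i in keep else 0.0 for i, v in enumerate(vec)]


def iht(dictionary, signal, sparsity, step=1.0, iters=20):
    """Iterate x <- H_s(x + step * D^T (y - D x)) starting from zero."""
    d_t = transpose(dictionary)
    x = [0.0] * len(dictionary[0])
    for _ in range(iters):
        residual = [y - p for y, p in zip(signal, matvec(dictionary, x))]
        grad_step = matvec(d_t, residual)
        x = hard_threshold([xi + step * g for xi, g in zip(x, grad_step)], sparsity)
    return x


# Toy orthogonal dictionary (scaled 4x4 Hadamard matrix), chosen so that
# step=1.0 is safe; a learned dictionary would generally need a smaller step.
D = [[0.5, 0.5, 0.5, 0.5],
     [0.5, -0.5, 0.5, -0.5],
     [0.5, 0.5, -0.5, -0.5],
     [0.5, -0.5, -0.5, 0.5]]
x_true = [0.0, 3.0, 0.0, -2.0]
y = matvec(D, x_true)
x_hat = iht(D, y, sparsity=2)
print(x_hat)
```

Unfolding, as the abstract uses the term, would turn each loop iteration into a network layer with learnable parameters; that part is not shown here.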

preprint, 2022, arXiv

Complex analysis of divergent perturbation theory at finite temperature

We investigate the convergence properties of finite-temperature perturbation theory by considering the mathematical structure of thermodynamic potentials using complex analysis. We discover that zeros of the partition function lead to poles in the internal energy and logarithmic singularities in the Helmholtz free energy which create divergent expansions in the canonical ensemble. Analysing these zeros reveals that the radius of convergence increases for higher temperatures. In contrast, when the reference state is degenerate, these poles in the internal energy create a zero radius of convergence in the zero-temperature limit. Finally, by showing that the poles in the internal energy reduce to exceptional points in the zero-temperature limit, we unify the two main mathematical representations of quantum phase transitions.
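The link between zeros of the partition function and singularities of the thermodynamic potentials follows from the standard canonical-ensemble relations. In the sketch below, $\lambda$ denotes an assumed complex perturbation parameter and $\lambda_0$ a zero of $Z$; this notation is illustrative and not taken from the abstract.

```latex
\begin{align}
  Z(\beta,\lambda) &= \sum_n e^{-\beta E_n(\lambda)}, \\
  F(\beta,\lambda) &= -\frac{1}{\beta}\,\ln Z(\beta,\lambda), \\
  U(\beta,\lambda) &= -\frac{\partial \ln Z}{\partial \beta}
                    = -\frac{1}{Z}\,\frac{\partial Z}{\partial \beta}.
\end{align}
% Near a zero, Z(\beta,\lambda_0) = 0:
%   F contains \ln Z, so it has a logarithmic singularity at \lambda_0;
%   U contains 1/Z, so it has a pole there when \partial_\beta Z \neq 0.
% A Taylor expansion about \lambda = 0 then converges only for
%   |\lambda| < |\lambda_0|, with \lambda_0 the zero closest to the origin.
```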

preprint, 2022, arXiv

TransGrasp: Grasp Pose Estimation of a Category of Objects by Transferring Grasps from Only One Labeled Instance

Grasp pose estimation is an important issue for robots to interact with the real world. However, most existing methods require exact 3D object models to be available beforehand or a large amount of grasp annotations for training. To avoid these problems, we propose TransGrasp, a category-level grasp pose estimation method that predicts grasp poses of a category of objects by labeling only one object instance. Specifically, we perform grasp pose transfer across a category of objects based on their shape correspondences and propose a grasp pose refinement module to further fine-tune the grasp poses of grippers so as to ensure successful grasps. Experiments demonstrate the effectiveness of our method in achieving high-quality grasps with the transferred grasp poses. Our code is available at https://github.com/yanjh97/TransGrasp.

preprint, 2022, arXiv

Using Natural Sentences for Understanding Biases in Language Models

Evaluation of biases in language models is often limited to synthetically generated datasets. This dependence traces back to the need for a prompt-style dataset to trigger specific behaviors of language models. In this paper, we address this gap by creating a prompt dataset with respect to occupations collected from real-world natural sentences present in Wikipedia. We aim to understand the differences between using template-based prompts and natural sentence prompts when studying gender-occupation biases in language models. We find bias evaluations are very sensitive to the design choices of template prompts, and we propose using natural sentence prompts for systematic evaluations to step away from design choices that could introduce bias in the observations.

preprint, 2021, arXiv

Likelihood landscape and maximum likelihood estimation for the discrete orbit recovery model

We study the non-convex optimization landscape for maximum likelihood estimation in the discrete orbit recovery model with Gaussian noise. This model is motivated by applications in molecular microscopy and image processing, where each measurement of an unknown object is subject to an independent random rotation from a rotational group. Equivalently, it is a Gaussian mixture model where the mixture centers belong to a group orbit. We show that fundamental properties of the likelihood landscape depend on the signal-to-noise ratio and the group structure. At low noise, this landscape is "benign" for any discrete group, possessing no spurious local optima and only strict saddle points. At high noise, this landscape may develop spurious local optima, depending on the specific group. We discuss several positive and negative examples, and provide a general condition that ensures a globally benign landscape. For cyclic permutations of coordinates on $\mathbb{R}^d$ (multi-reference alignment), there may be spurious local optima when $d \geq 6$, and we establish a correspondence between these local optima and those of a surrogate function of the phase variables in the Fourier domain. We show that the Fisher information matrix transitions from resembling that of a single Gaussian in low noise to having a graded eigenvalue structure in high noise, which is determined by the graded algebra of invariant polynomials under the group action. In a local neighborhood of the true object, the likelihood landscape is strongly convex in a reparametrized system of variables given by a transcendence basis of this polynomial algebra. We discuss implications for optimization algorithms, including slow convergence of expectation-maximization, and possible advantages of momentum-based acceleration and variable reparametrization for first- and second-order descent methods.
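The equivalence with a Gaussian mixture over a group orbit can be made concrete for multi-reference alignment. The sketch below evaluates the average log-likelihood (up to an additive constant) of a candidate signal when each observation is a uniformly random cyclic shift of the signal plus Gaussian noise. The toy signal, noise level, and sample size are hypothetical.

```python
import math
import random


def cyclic_shift(vec, k):
    """Shift coordinates cyclically by k positions."""
    k %= len(vec)
    return vec[-k:] + vec[:-k] if k else list(vec)


def log_likelihood(candidate, observations, sigma):
    """Average log-likelihood (up to an additive constant) of a Gaussian
    mixture whose centers are the cyclic shifts of `candidate`, with
    uniform weights over shifts."""
    d = len(candidate)
    orbit = [cyclic_shift(candidate, k) for k in range(d)]
    total = 0.0
    for y in observations:
        exponents = [
            -sum((yi - ci) ** 2 for yi, ci in zip(y, center)) / (2.0 * sigma ** 2)
            for center in orbit
        ]
        # log-sum-exp for numerical stability
        top = max(exponents)
        total += top + math.log(sum(math.exp(e - top) for e in exponents)) - math.log(d)
    return total / len(observations)


# Synthetic observations: random cyclic shift of the signal plus noise.
rng = random.Random(0)
signal = [1.0, 0.0, -1.0, 2.0]
sigma = 0.5
observations = [
    [v + rng.gauss(0.0, sigma) for v in cyclic_shift(signal, rng.randrange(len(signal)))]
    for _ in range(50)
]

ll_signal = log_likelihood(signal, observations, sigma)
ll_shifted = log_likelihood(cyclic_shift(signal, 2), observations, sigma)
print(ll_signal, ll_shifted)
```

The two printed values agree up to floating-point rounding because the likelihood depends only on the orbit of the candidate, which is why the signal can be recovered only up to a group element.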

preprint, 2020, arXiv

Multi-Agent Deep Reinforcement Learning for HVAC Control in Commercial Buildings

In commercial buildings, about 40%-50% of the total electricity consumption is attributed to Heating, Ventilation, and Air Conditioning (HVAC) systems, which places an economic burden on building operators. In this paper, we intend to minimize the energy cost of an HVAC system in a multi-zone commercial building under dynamic pricing with the consideration of random zone occupancy, thermal comfort, and indoor air quality comfort. Due to the existence of unknown thermal dynamics models, parameter uncertainties (e.g., outdoor temperature, electricity price, and number of occupants), spatially and temporally coupled constraints associated with indoor temperature and CO2 concentration, a large discrete solution space, and a non-convex and non-separable objective function, it is very challenging to achieve the above aim. To this end, the above energy cost minimization problem is reformulated as a Markov game. Then, an HVAC control algorithm is proposed to solve the Markov game based on multi-agent deep reinforcement learning with attention mechanism. The proposed algorithm does not require any prior knowledge of uncertain parameters and can operate without knowing building thermal dynamics models. Simulation results based on real-world traces show the effectiveness, robustness and scalability of the proposed algorithm.

preprint, 2020, arXiv

Multi-objective Ranking via Constrained Optimization

In this paper, we introduce an Augmented Lagrangian based method to incorporate the multiple objectives (MO) in a search ranking algorithm. Optimizing MOs is an essential and realistic requirement for building ranking models in production. The proposed method formulates MO in constrained optimization and solves the problem in the popular Boosting framework -- a novel contribution of our work. Furthermore, we propose a procedure to set up all optimization parameters in the problem. The experimental results show that the method successfully achieves MO criteria much more efficiently than existing methods.
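The general augmented Lagrangian idea can be sketched on a toy scalar problem: minimize $(x-2)^2$ subject to $x \le 1$. This is not the paper's ranking or Boosting formulation; the objective, constraint, penalty parameter `rho`, and the ternary-search inner solver are all assumptions made for illustration.

```python
def objective(x):
    return (x - 2.0) ** 2


def constraint(x):
    """Inequality constraint c(x) <= 0, here x <= 1."""
    return x - 1.0


def augmented_lagrangian(x, lam, rho):
    """Standard augmented Lagrangian for a single inequality constraint."""
    shifted = max(0.0, lam + rho * constraint(x))
    return objective(x) + (shifted ** 2 - lam ** 2) / (2.0 * rho)


def minimize_1d(func, lo=-10.0, hi=10.0, iters=200):
    """Ternary search; valid here because the function is convex in x."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if func(m1) < func(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


def solve(rho=10.0, outer_iters=30):
    """Alternate an inner minimization over x with a multiplier update."""
    lam = 0.0
    x = 0.0
    for _ in range(outer_iters):
        x = minimize_1d(lambda z: augmented_lagrangian(z, lam, rho))
        lam = max(0.0, lam + rho * constraint(x))
    return x, lam


x_opt, multiplier = solve()
print(x_opt, multiplier)
```

For this toy problem the iterates approach the constrained optimum $x = 1$ with multiplier $2$. In a production ranking model the inner minimization would be replaced by the model's own training procedure.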

preprint, 2020, arXiv

On a class of new nonlocal traffic flow models with look-ahead rules

This paper presents a new class of one-dimensional (1D) traffic models with look-ahead rules that take into account two effects: a nonlocal slow-down effect and a right-skewed non-concave asymmetry in the fundamental diagram. The proposed 1D cellular automata (CA) models with Arrhenius-type look-ahead interactions implement stochastic rules for cars' movement following the configuration of the traffic ahead of each car. In particular, we take two different look-ahead rules: one is based on the distance from the car under consideration to the car in front of it; the other one depends on the car density ahead. Both rules feature a novel idea of multiple moves, which plays a key role in recovering the non-concave flux in the macroscopic dynamics. Through a semi-discrete mesoscopic stochastic process, we derive the coarse-grained macroscopic dynamics of the CA model. We also design a numerical scheme to simulate the proposed CA models with an efficient list-based kinetic Monte Carlo (KMC) algorithm. Our results show that the fluxes of the KMC simulations agree with the coarse-grained macroscopic averaged fluxes for the different look-ahead rules under various parameter settings.

preprint, 2020, arXiv

Potential quality improvement of stochastic optical localization nanoscopy images obtained by frame by frame localization algorithms

A data movie of stochastic optical localization nanoscopy contains spatial and temporal correlations, both providing information about emitter locations. The majority of localization algorithms in the literature estimate emitter locations by frame-by-frame localization (FFL), which exploits only the spatial correlation and leaves the temporal correlation in the FFL nanoscopy images. The temporal correlation contained in the FFL images, if exploited, can improve the localization accuracy and the image quality. In this paper, we analyze the properties of the FFL images in terms of root mean square minimum distance (RMSMD) and root mean square error (RMSE). It is shown that RMSMD and RMSE can potentially be reduced by a maximum fold equal to the square root of the average number of activations per emitter. We also analyze several statistical properties of RMSMD and RMSE and their relationship with respect to a large number of data frames, bias and variance of localization errors, small localization errors, sample drift, and the worst FFL image. Numerical examples confirm the predictions of the analysis. Ideas about how to develop an algorithm to exploit the temporal correlation of FFL images are also briefly discussed. The results suggest developing two kinds of localization algorithms: algorithms that can exploit the temporal correlation of FFL images, and unbiased localization algorithms.
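The square-root bound has a simple intuition, sketched below under idealized assumptions that are not stated in the abstract: each of the $n$ activations of one emitter yields an independent, unbiased location estimate $\hat{x}_i$ with the same variance $\sigma^2$, and the estimates are averaged.

```latex
\begin{align}
  \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n}\hat{x}_i\right)
    &= \frac{1}{n^2}\sum_{i=1}^{n}\sigma^2 = \frac{\sigma^2}{n}, \\
  \mathrm{RMSE}\!\left(\frac{1}{n}\sum_{i=1}^{n}\hat{x}_i\right)
    &= \frac{\sigma}{\sqrt{n}}.
\end{align}
% Per-frame estimates have RMSE \sigma, so averaging n activations
% reduces the error by a factor of \sqrt{n} in this idealized case.
```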

preprint, 2020, arXiv

Principal components in linear mixed models with general bulk

We study the principal components of covariance estimators in multivariate mixed-effects linear models. We show that, in high dimensions, the principal eigenvalues and eigenvectors may exhibit bias and aliasing effects that are not present in low-dimensional settings. We derive the first-order limits of the principal eigenvalue locations and eigenvector projections in a high-dimensional asymptotic framework, allowing for general population spectral distributions for the random effects and extending previous results from a more restrictive spiked model. Our analysis uses free probability techniques, and we develop two general tools of independent interest: strong asymptotic freeness of GOE and deterministic matrices, and a free deterministic equivalent approximation for bilinear forms of resolvents.

preprint, 2019, arXiv

Sem-LSD: A Learning-based Semantic Line Segment Detector

In this paper, we introduce a new type of line-shaped image representation, named semantic line segment (Sem-LS), and focus on solving its detection problem. Sem-LS contains high-level semantics and is a compact scene representation where only visually salient line segments with stable semantics are preserved. Combined with high-level semantics, Sem-LS is more robust in cluttered environments than existing line-shaped representations. The compactness of Sem-LS facilitates its use in large-scale applications, such as city-scale SLAM (simultaneous localization and mapping) and LCD (loop closure detection). Sem-LS detection is a challenging task due to its significantly different appearance from existing learning-based image representations such as wireframes and objects. For further investigation, we first label Sem-LS on two well-known datasets, KITTI and KAIST URBAN, as new benchmarks. Then, we propose a learning-based Sem-LS detector (Sem-LSD) and devise a new module as well as metrics to address unique challenges in Sem-LS detection. Experimental results have shown both the efficacy and efficiency of Sem-LSD. Finally, the effectiveness of the proposed Sem-LS is supported by two experiments on detector repeatability and a city-scale LCD problem. Labeled datasets and code will be released shortly.