Trust snapshot

Quick read

Trust 21 - Emerging · Verification L1 · Unclaimed author
26 works
0 followers
21 topics
4 close collaborators

Published work

26 published item(s)

preprint · 2026 · arXiv

Confidence-Aware Alignment Makes Reasoning LLMs More Reliable

Large reasoning models often reach correct answers through flawed intermediate steps, creating a gap between final accuracy and reasoning reliability. Existing alignment strategies address this with external verifiers or massive sampling, limiting scalability. In this work, we introduce CASPO (Confidence-Aware Step-wise Preference Optimization), a framework that aligns token-level confidence with step-wise logical correctness through iterative Direct Preference Optimization, without training a separate reward model. During inference, we propose Confidence-aware Thought (CaT), which leverages this calibrated confidence to dynamically prune uncertain reasoning branches with negligible O(V) latency. Experiments across ten benchmarks and multiple model families show that CASPO consistently improves reasoning reliability and inference efficiency. CASPO scales to Qwen3-8B-Base and surpasses tree-search baselines on AIME'24 and AIME'25 without using reward-model data. We also release a step-wise dataset with confidence annotations to support fine-grained analysis of reasoning reliability. Code is available at https://github.com/Thecommonirin/CASPO.
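
To make the pruning idea concrete, here is a minimal sketch of confidence-based branch pruning. It is not the CASPO or CaT implementation: the confidence score (the maximum softmax probability of each token, averaged over a reasoning step), the threshold value, and all function names are assumptions made for illustration. It only shows that a per-token confidence of this kind needs O(V) work over the V vocabulary logits.

```python
import math

def token_confidence(logits):
    """Max softmax probability for one token; O(V) work over the logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]  # subtract the max for numerical stability
    return max(exps) / sum(exps)

def step_confidence(step_logits):
    """Average token confidence over one reasoning step (assumed aggregation)."""
    return sum(token_confidence(tok) for tok in step_logits) / len(step_logits)

def prune_branches(branches, threshold=0.6):
    """Keep only branches whose step confidence reaches the hypothetical threshold."""
    return [name for name, step_logits in branches
            if step_confidence(step_logits) >= threshold]

# Two toy branches, each with two tokens over a 3-word vocabulary.
branches = [
    ("confident", [[4.0, 0.0, 0.0], [3.0, 0.0, 0.0]]),
    ("uncertain", [[0.0, 0.0, 0.0], [0.1, 0.0, 0.0]]),
]
kept = prune_branches(branches)
```

In this toy input the branch with peaked logits survives and the branch with nearly flat logits is dropped.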

preprint · 2022 · arXiv

A vector magnetometer based on a single spin-orbit torque anomalous Hall device

In many applications, the ability to measure the vector information of a magnetic field with high spatial resolution and low cost is essential, but it is still a challenge for existing magnetometers composed of multiple sensors. Here, we report a single-device vector magnetometer, enabled by spin-orbit torque, that measures a vector magnetic field using the harmonic Hall resistances of a superparamagnetic ferromagnet (FM)/heavy metal (HM) bilayer. Under an ac driving current, the first and second harmonic Hall resistances of the FM/HM bilayer show a linear relationship with the vertical and longitudinal (along the current direction) components of the magnetic field, respectively. By employing an L-shaped Hall device with two orthogonal arms, we can measure all three field components simultaneously and thereby detect both the amplitude and direction of a magnetic field in three-dimensional space. As a proof of concept, we demonstrate both angular position sensing on the three coordinate planes and vector mapping of the magnetic field generated by a permanent magnet, both of which are in good agreement with the simulation results. Crosstalk between vertical and longitudinal field components at large field is discussed using theoretical models.

preprint · 2022 · arXiv

Learning Mixtures of Permutations: Groups of Pairwise Comparisons and Combinatorial Method of Moments

In applications such as rank aggregation, mixture models for permutations are frequently used when the population exhibits heterogeneity. In this work, we study the widely used Mallows mixture model. In the high-dimensional setting, we propose a polynomial-time algorithm that learns a Mallows mixture of permutations on $n$ elements with the optimal sample complexity that is proportional to $\log n$, improving upon previous results that scale polynomially with $n$. In the high-noise regime, we characterize the optimal dependency of the sample complexity on the noise parameter. Both objectives are accomplished by first studying demixing permutations under a noiseless query model using groups of pairwise comparisons, which can be viewed as moments of the mixing distribution, and then extending these results to the noisy Mallows model by simulating the noiseless oracle.

preprint · 2022 · arXiv

Lightweight Object-level Topological Semantic Mapping and Long-term Global Localization based on Graph Matching

Mapping and localization are two essential tasks for mobile robots in real-world applications. However, large-scale and dynamic scenes challenge the accuracy and robustness of most current mature solutions. This situation becomes even worse when computational resources are limited. In this paper, we present a novel lightweight object-level mapping and localization method with high accuracy and robustness. Unlike previous methods, our method does not need a precise geometric map constructed in advance, which greatly relieves the storage burden, especially for large-scale navigation. We use object-level features with both semantic and geometric information to model landmarks in the environment. In particular, a learning topological primitive is first proposed to efficiently obtain and organize the object-level landmarks. On this basis, we use a robot-centric mapping framework to represent the environment as a semantic topology graph while relaxing the burden of maintaining global consistency. In addition, a hierarchical memory management mechanism is introduced to improve the efficiency of online mapping with limited computational resources. Based on the proposed map, robust localization is achieved by constructing a novel local semantic scene graph descriptor and performing multi-constraint graph matching to compare scene similarity. Finally, we test our method on a low-cost embedded platform to demonstrate its advantages. Experimental results in a large-scale, multi-session real-world environment show that the proposed method outperforms the state of the art in terms of lightweight design and robustness.

preprint · 2022 · arXiv

Optimal prediction of Markov chains with and without spectral gap

We study the following learning problem with dependent data: Observing a trajectory of length $n$ from a stationary Markov chain with $k$ states, the goal is to predict the next state. For $3 \leq k \leq O(\sqrt{n})$, using techniques from universal compression, the optimal prediction risk in Kullback-Leibler divergence is shown to be $Θ(\frac{k^2}{n}\log \frac{n}{k^2})$, in contrast to the optimal rate of $Θ(\frac{\log \log n}{n})$ for $k=2$ previously shown in Falahatgar et al. (2016). These rates, slower than the parametric rate of $O(\frac{k^2}{n})$, can be attributed to the memory in the data, as the spectral gap of the Markov chain can be arbitrarily small. To quantify the memory effect, we study irreducible reversible chains with a prescribed spectral gap. In addition to characterizing the optimal prediction risk for two states, we show that, as long as the spectral gap is not excessively small, the prediction risk in the Markov model is $O(\frac{k^2}{n})$, which coincides with that of an iid model with the same number of parameters. Extensions to higher-order Markov chains are also obtained.
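
As a point of reference for the prediction task described above, the following sketch computes an add-one (Laplace) smoothed estimate of the next-state distribution from a single trajectory. This is a standard baseline chosen for illustration, not the optimal predictor analyzed in the paper; the function name and the convention that states are labeled 0 to k-1 are assumptions.

```python
from collections import Counter

def add_one_next_state(trajectory, k):
    """Predictive distribution of the next state from add-one smoothed transition counts."""
    current = trajectory[-1]
    # Count observed transitions current -> j along the trajectory.
    counts = Counter(
        nxt for prev, nxt in zip(trajectory, trajectory[1:]) if prev == current
    )
    total = sum(counts.values())
    # Add one pseudo-count per state so that unseen transitions keep positive probability.
    return [(counts[j] + 1) / (total + k) for j in range(k)]

probs = add_one_next_state([0, 1, 0, 1, 0], k=2)
```

Keeping every probability positive matters here because the risk is measured in Kullback-Leibler divergence, which is infinite if the predictor assigns zero probability to a state that can occur.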

preprint · 2022 · arXiv

Random Graph Matching in Geometric Models: the Case of Complete Graphs

This paper studies the problem of matching two complete graphs with edge weights correlated through latent geometries, extending a recent line of research on random graph matching with independent edge weights to geometric models. Specifically, given a random permutation $π^*$ on $[n]$ and $n$ iid pairs of correlated Gaussian vectors $\{X_{π^*(i)}, Y_i\}$ in $\mathbb{R}^d$ with noise parameter $σ$, the edge weights are given by $A_{ij}=κ(X_i,X_j)$ and $B_{ij}=κ(Y_i,Y_j)$ for some link function $κ$. The goal is to recover the hidden vertex correspondence $π^*$ based on the observation of $A$ and $B$. We focus on the dot-product model with $κ(x,y)=\langle x, y \rangle$ and Euclidean distance model with $κ(x,y)=\|x-y\|^2$, in the low-dimensional regime of $d=o(\log n)$ wherein the underlying geometric structures are most evident. We derive an approximate maximum likelihood estimator, which provably achieves, with high probability, perfect recovery of $π^*$ when $σ=o(n^{-2/d})$ and almost perfect recovery with a vanishing fraction of errors when $σ=o(n^{-1/d})$. Furthermore, these conditions are shown to be information-theoretically optimal even when the latent coordinates $\{X_i\}$ and $\{Y_i\}$ are observed, complementing the recent results of [DCK19] and [KNW22] in geometric models of the planted bipartite matching problem. As a side discovery, we show that the celebrated spectral algorithm of [Ume88] emerges as a further approximation to the maximum likelihood in the geometric model.

preprint · 2022 · arXiv

Settling the Sharp Reconstruction Thresholds of Random Graph Matching

This paper studies the problem of recovering the hidden vertex correspondence between two edge-correlated random graphs. We focus on the Gaussian model where the two graphs are complete graphs with correlated Gaussian weights and the Erdős-Rényi model where the two graphs are subsampled from a common parent Erdős-Rényi graph $\mathcal{G}(n,p)$. For dense graphs with $p=n^{-o(1)}$, we prove that there exists a sharp threshold, above which one can correctly match all but a vanishing fraction of vertices and below which correctly matching any positive fraction is impossible, a phenomenon known as the "all-or-nothing" phase transition. Even more strikingly, in the Gaussian setting, above the threshold all vertices can be exactly matched with high probability. In contrast, for sparse Erdős-Rényi graphs with $p=n^{-Θ(1)}$, we show that the all-or-nothing phenomenon no longer holds and we determine the thresholds up to a constant factor. Along the way, we also derive the sharp threshold for exact recovery, sharpening the existing results in Erdős-Rényi graphs. The proof of the negative results builds upon a tight characterization of the mutual information based on the truncated second-moment computation and an "area theorem" that relates the mutual information to the integral of the reconstruction error. The positive results follow from a tight analysis of the maximum likelihood estimator that takes into account the cycle structure of the induced permutation on the edges.

preprint · 2022 · arXiv

Testing network correlation efficiently via counting trees

We propose a new procedure for testing whether two networks are edge-correlated through some latent vertex correspondence. The test statistic is based on counting the co-occurrences of signed trees for a family of non-isomorphic trees. When the two networks are Erdős-Rényi random graphs $\mathcal{G}(n,q)$ that are either independent or correlated with correlation coefficient $ρ$, our test runs in $n^{2+o(1)}$ time and succeeds with high probability as $n\to\infty$, provided that $n\min\{q,1-q\} \ge n^{-o(1)}$ and $ρ^2>α\approx 0.338$, where $α$ is Otter's constant so that the number of unlabeled trees with $K$ edges grows as $(1/α)^K$. This significantly improves the prior work in terms of statistical accuracy, running time, and graph sparsity.

preprint · 2022 · arXiv

VisImages: A Fine-Grained Expert-Annotated Visualization Dataset

Images in visualization publications contain rich information, e.g., novel visualization designs and implicit design patterns of visualizations. A systematic collection of these images can contribute to the community in many aspects, such as literature analysis and automated tasks for visualization. In this paper, we build and make public a dataset, VisImages, which collects 12,267 images with captions from 1,397 papers in IEEE InfoVis and VAST. Built upon a comprehensive visualization taxonomy, the dataset includes 35,096 visualizations and their bounding boxes in the images. We demonstrate the usefulness of VisImages through three use cases: 1) investigating the use of visualizations in the publications with VisImages Explorer, 2) training and benchmarking models for visualization classification, and 3) localizing visualizations in the visual analytics systems automatically.

preprint · 2021 · arXiv

Dualizing Le Cam's method for functional estimation, with applications to estimating the unseens

Le Cam's method (or the two-point method) is a commonly used tool for obtaining statistical lower bounds and is especially popular for functional estimation problems. This work aims to explain and give conditions for the tightness of Le Cam's lower bound in functional estimation from the perspective of convex duality. Under a variety of settings it is shown that the maximization problem that searches for the best two-point lower bound, upon dualizing, becomes a minimization problem that optimizes the bias-variance tradeoff among a family of estimators. For estimating linear functionals of a distribution our work strengthens prior results of Donoho-Liu \cite{DL91} (for quadratic loss) by dropping the Hölderian assumption on the modulus of continuity. For exponential families our results extend those of Juditsky-Nemirovski \cite{JN09} by characterizing the minimax risk for the quadratic loss under weaker assumptions on the exponential family. We also provide an extension to the high-dimensional setting for estimating separable functionals. Notably, coupled with tools from complex analysis, this method is particularly effective for characterizing the ``elbow effect'' -- the phase transition from parametric to nonparametric rates. As the main application we derive sharp minimax rates in the Distinct elements problem (given a fraction $p$ of colored balls from an urn containing $d$ balls, the optimal error of estimating the number of distinct colors is $\tilde Θ(d^{-\frac{1}{2}\min\{\frac{p}{1-p},1\}})$) and Fisher's species problem (given $n$ iid observations from an unknown distribution, the optimal prediction error of the number of unseen symbols in the next (unobserved) $r \cdot n$ observations is $\tilde Θ(n^{-\min\{\frac{1}{r+1},\frac{1}{2}\}})$).

preprint · 2021 · arXiv

Likelihood landscape and maximum likelihood estimation for the discrete orbit recovery model

We study the non-convex optimization landscape for maximum likelihood estimation in the discrete orbit recovery model with Gaussian noise. This model is motivated by applications in molecular microscopy and image processing, where each measurement of an unknown object is subject to an independent random rotation from a rotational group. Equivalently, it is a Gaussian mixture model where the mixture centers belong to a group orbit. We show that fundamental properties of the likelihood landscape depend on the signal-to-noise ratio and the group structure. At low noise, this landscape is "benign" for any discrete group, possessing no spurious local optima and only strict saddle points. At high noise, this landscape may develop spurious local optima, depending on the specific group. We discuss several positive and negative examples, and provide a general condition that ensures a globally benign landscape. For cyclic permutations of coordinates on $\mathbb{R}^d$ (multi-reference alignment), there may be spurious local optima when $d \geq 6$, and we establish a correspondence between these local optima and those of a surrogate function of the phase variables in the Fourier domain. We show that the Fisher information matrix transitions from resembling that of a single Gaussian in low noise to having a graded eigenvalue structure in high noise, which is determined by the graded algebra of invariant polynomials under the group action. In a local neighborhood of the true object, the likelihood landscape is strongly convex in a reparametrized system of variables given by a transcendence basis of this polynomial algebra. We discuss implications for optimization algorithms, including slow convergence of expectation-maximization, and possible advantages of momentum-based acceleration and variable reparametrization for first- and second-order descent methods.

preprint · 2021 · arXiv

Testing correlation of unlabeled random graphs

We study the problem of detecting the edge correlation between two random graphs with $n$ unlabeled nodes. This is formalized as a hypothesis testing problem, where under the null hypothesis, the two graphs are independently generated; under the alternative, the two graphs are edge-correlated under some latent node correspondence, but have the same marginal distributions as the null. For both Gaussian-weighted complete graphs and dense Erdős-Rényi graphs (with edge probability $n^{-o(1)}$), we determine the sharp threshold at which the optimal testing error probability exhibits a phase transition from zero to one as $n\to \infty$. For sparse Erdős-Rényi graphs with edge probability $n^{-Ω(1)}$, we determine the threshold within a constant factor. The proof of the impossibility results is an application of the conditional second-moment method, where we bound the truncated second moment of the likelihood ratio by carefully conditioning on the typical behavior of the intersection graph (consisting of edges in both observed graphs) and taking into account the cycle structure of the induced random permutation on the edges. Notably, in the sparse regime, this is accomplished by leveraging the pseudoforest structure of subcritical Erdős-Rényi graphs and a careful enumeration of subpseudoforests that can be assembled from short orbits of the edge permutation.

preprint · 2020 · arXiv

Application of information-percolation method to reconstruction problems on graphs

In this paper we propose a method of proving impossibility results based on applying strong data-processing inequalities to estimate mutual information between sets of variables forming certain Markov random fields. The end result is that the mutual information between two "far away" (as measured by the graph distance) variables is bounded by the probability of the existence of an open path in a bond-percolation problem on the same graph. Furthermore, stronger bounds can be obtained by establishing mutual information comparison results with an erasure model on the same graph, with erasure probabilities given by the contraction coefficients. As applications, we show that our method gives a sharp threshold for partially recovering a rank-one perturbation of a random Gaussian matrix (spiked Wigner model), yields the best known upper bound on the noise level for group synchronization (obtained concurrently by Abbe and Boix), and establishes a new impossibility result for community detection on the stochastic block model with $k$ communities.

preprint · 2020 · arXiv

Eddy Current Testing of Metal Cracks Using Spin Hall Magnetoresistance Sensor and Machine Learning

Recently we have developed a spin Hall magnetoresistance (SMR) sensor which operates under AC bias and sense currents. Here we demonstrate both theoretically and experimentally that the SMR sensor is uniquely suited for eddy current testing applications because both the coil and sensor utilize AC current as the excitation source. The use of the SMR sensor effectively eliminates the necessity of any demodulation or lock-in technique for detecting the eddy current, which greatly simplifies the detection system. Furthermore, we show that the combination of principal component analysis and a decision tree model is effective in classifying the metal cracks. The relatively clean signals obtained by the SMR sensor greatly facilitate the subsequent signal analysis and ensure high accuracy in the classification of different types of crack features.

preprint · 2020 · arXiv

Efficient random graph matching via degree profiles

Random graph matching refers to recovering the underlying vertex correspondence between two random graphs with correlated edges; a prominent example is when the two random graphs are given by Erdős-Rényi graphs $G(n,\frac{d}{n})$. This can be viewed as an average-case and noisy version of the graph isomorphism problem. Under this model, the maximum likelihood estimator is equivalent to solving the intractable quadratic assignment problem. This work develops an $\tilde{O}(n d^2+n^2)$-time algorithm which perfectly recovers the true vertex correspondence with high probability, provided that the average degree is at least $d = Ω(\log^2 n)$ and the two graphs differ by at most $δ= O( \log^{-2}(n) )$ fraction of edges. For dense graphs and sparse graphs, this can be improved to $δ= O( \log^{-2/3}(n) )$ and $δ= O( \log^{-2}(d) )$ respectively, both in polynomial time. The methodology is based on appropriately chosen distance statistics of the degree profiles (empirical distribution of the degrees of neighbors). Before this work, the best known result achieves $δ=O(1)$ and $n^{o(1)} \leq d \leq n^c$ for some constant $c$ with an $n^{O(\log n)}$-time algorithm \cite{barak2018nearly} and $δ=\tilde O((d/n)^4)$ and $d = \tildeΩ(n^{4/5})$ with a polynomial-time algorithm \cite{dai2018performance}.
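
The degree profile mentioned above (the empirical distribution of the degrees of a vertex's neighbors) can be computed as in the sketch below. The adjacency-dictionary representation and the function name are assumptions for illustration; the distance statistics and the matching step of the paper's algorithm are not reproduced here.

```python
def degree_profiles(adj):
    """Map each vertex to the sorted degrees of its neighbors (its degree profile)."""
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    return {v: sorted(degree[u] for u in nbrs) for v, nbrs in adj.items()}

# Path graph 0 - 1 - 2 as an adjacency dictionary.
path = {0: [1], 1: [0, 2], 2: [1]}
profiles = degree_profiles(path)
```

Each sorted list is one way to store an empirical distribution. Two graphs could then be compared vertex by vertex through some distance between these lists, which is the role the abstract assigns to its "appropriately chosen distance statistics".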

preprint · 2020 · arXiv

Evidence for a Coronal Shock Wave Origin for Relativistic Protons Producing Solar Gamma-Rays and Observed by Neutron Monitors at Earth

We study the solar eruptive event on 2017 September 10 that produced long-lasting $>$100 MeV $γ$-ray emission and a ground level enhancement (GLE72). The origin of the high-energy ions producing late-phase gamma-ray emission (LPGRE) is still an open question, but a possible explanation is proton acceleration at coronal shocks produced by coronal mass ejections. We examine a common shock acceleration origin for both the LPGRE and GLE72. The $γ$-ray emission observed by the Fermi-Large Area Telescope exhibits a weak impulsive phase, consistent with that observed in hard X- and $γ$-ray line flare emissions, and what appear to be two distinct stages of LPGRE. From a detailed modeling of the shock wave, we derive the 3D distribution and temporal evolution of the shock parameters, and we examine the shock wave magnetic connection with the visible solar disk. The evolution of shock parameters on field lines returning to the visible disk mirrors the two stages of LPGRE. We find good agreement between the time history of $>$100 MeV $γ$-rays and one produced by a basic shock acceleration model. The time history of shock parameters magnetically mapped to Earth agrees with the rates observed by the Fort Smith neutron monitor during the first hour of the GLE72 if we include a 30% contribution of flare-accelerated protons during the first 10 minutes, having a release time following the time history of nuclear $γ$-rays. Our analysis provides compelling evidence for a common shock origin for protons producing the LPGRE and most of the particles observed in GLE72.

preprint · 2020 · arXiv

Extrapolating the profile of a finite population

We study a prototypical problem in empirical Bayes. Namely, consider a population consisting of $k$ individuals each belonging to one of $k$ types (some types can be empty). Without any structural restrictions, it is impossible to learn the composition of the full population having observed only a small (random) subsample of size $m = o(k)$. Nevertheless, we show that in the sublinear regime of $m =ω(k/\log k)$, it is possible to consistently estimate in total variation the \emph{profile} of the population, defined as the empirical distribution of the sizes of each type, which determines many symmetric properties of the population. We also prove that in the linear regime of $m=c k$ for any constant $c$ the optimal rate is $Θ(1/\log k)$. Our estimator is based on Wolfowitz's minimum distance method, which entails solving a linear program (LP) of size $k$. We show that there is a single infinite-dimensional LP whose value simultaneously characterizes the risk of the minimum distance estimator and certifies its minimax optimality. The sharp convergence rate is obtained by evaluating this LP using complex-analytic techniques.
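
The profile defined above can be illustrated with a short sketch that computes it for a fully observed population. This is only the definition in code, not the minimum distance estimator from the paper; the function name, the labeling of types as integers 0 to k-1, and the normalization by the number of types k are assumptions.

```python
from collections import Counter

def profile(type_labels, k):
    """Fraction of the k types that have each size j (empty types count as size 0)."""
    sizes = Counter(type_labels)           # size of each non-empty type
    size_counts = Counter(sizes.values())  # how many types have each size
    size_counts[0] = k - len(sizes)        # types with no individuals
    return {j: c / k for j, c in sorted(size_counts.items()) if c > 0}

# Four individuals, four possible types: type 0 has size 2, types 1 and 3 have size 1.
pi = profile([0, 0, 1, 3], k=4)
```

The result depends only on how many types have each size and not on which types they are, which is why the profile determines symmetric properties of the population.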

preprint · 2020 · arXiv

Note on approximating the Laplace transform of a Gaussian on a complex disk

In this short note we study how well a Gaussian distribution can be approximated by distributions supported on $[-a,a]$. Perhaps the natural conjecture is that for large $a$ the almost optimal choice is given by truncating the Gaussian to $[-a,a]$. Indeed, such an approximation achieves the optimal rate of $e^{-Θ(a^2)}$ in terms of the $L_\infty$-distance between characteristic functions. However, if we consider the $L_\infty$-distance between Laplace transforms on a complex disk, the optimal rate is $e^{-Θ(a^2 \log a)}$, while truncation still only attains $e^{-Θ(a^2)}$. The optimal rate can be attained by the Gauss-Hermite quadrature. As a corollary, we also construct a ``super-flat'' Gaussian mixture of $Θ(a^2)$ components with means in $[-a,a]$ and whose density has all derivatives bounded by $e^{-Ω(a^2 \log(a))}$ in the $O(1)$-neighborhood of the origin.

preprint · 2020 · arXiv

Relating streamer flows to density and magnetic structures at the Parker Solar Probe

The physical mechanisms that produce the slow solar wind are still highly debated. Parker Solar Probe's (PSP's) second solar encounter provided a new opportunity to relate in situ measurements of the nascent slow solar wind with white-light images of streamer flows. We exploit data taken by the Solar and Heliospheric Observatory (SOHO), the Solar TErrestrial RElations Observatory (STEREO) and the Wide Imager on Solar Probe to reveal for the first time a close link between imaged streamer flows and the high-density plasma measured by the Solar Wind Electrons Alphas and Protons (SWEAP) experiment. We identify different types of slow winds measured by PSP that we relate to the spacecraft's magnetic connectivity (or not) to streamer flows. SWEAP measured high-density and highly variable plasma when PSP was well connected to streamers but more tenuous wind with much weaker density variations when it exited streamer flows. STEREO imaging of the release and propagation of small transients from the Sun to PSP reveals that the spacecraft was continually impacted by the southern edge of streamer transients. The impact of specific density structures is marked by a higher occurrence of magnetic field reversals measured by the FIELDS magnetometers. Magnetic reversals originating from the streamers are associated with larger density variations compared with reversals originating outside streamers. We tentatively interpret these findings in terms of magnetic reconnection between open magnetic fields and coronal loops with different properties, providing support for the formation of a subset of the slow wind by magnetic reconnection.

preprint · 2020 · arXiv

Sample complexity of population recovery

The problem of population recovery refers to estimating a distribution based on incomplete or corrupted samples. Consider a random poll of sample size $n$ conducted on a population of individuals, where each pollee is asked to answer $d$ binary questions. We consider one of the two polling impediments: (a) in lossy population recovery, a pollee may skip each question with probability $ε$, (b) in noisy population recovery, a pollee may lie on each question with probability $ε$. Given $n$ lossy or noisy samples, the goal is to estimate the probabilities of all $2^d$ binary vectors simultaneously within accuracy $δ$ with high probability. This paper settles the sample complexity of population recovery. For the lossy model, the optimal sample complexity is $\tildeΘ(δ^{-2\max\{\fracε{1-ε},1\}})$, improving the state of the art by Moitra and Saks in several ways: a lower bound is established, the upper bound is improved and the result depends at most on the logarithm of the dimension. Surprisingly, the sample complexity undergoes a phase transition from parametric to nonparametric rate when $ε$ exceeds $1/2$. For noisy population recovery, the sharp sample complexity turns out to be more sensitive to dimension and scales as $\exp(Θ(d^{1/3} \log^{2/3}(1/δ)))$ except for the trivial cases of $ε=0,1/2$ or $1$. For both models, our estimators simply compute the empirical mean of a certain function, which is found by pre-solving a linear program (LP). Curiously, the dual LP can be understood as Le Cam's method for lower-bounding the minimax risk, thus establishing the statistical optimality of the proposed estimators. The value of the LP is determined by complex-analytic methods.
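
The two corruption models and the lossy-model rate can be made concrete with the sketch below. It simulates how one pollee's answers are corrupted and evaluates the exponent $2\max\{\frac{ε}{1-ε},1\}$ from the stated sample complexity, ignoring logarithmic factors. The function names and the use of `None` for a skipped answer are assumptions for illustration; the LP-based estimators from the paper are not shown.

```python
import random

def lossy_sample(answers, eps, rng):
    """Each answer is skipped (replaced by None) independently with probability eps."""
    return [None if rng.random() < eps else a for a in answers]

def noisy_sample(answers, eps, rng):
    """Each binary answer is flipped independently with probability eps."""
    return [1 - a if rng.random() < eps else a for a in answers]

def lossy_exponent(eps):
    """Exponent e in the delta**(-e) lossy rate, log factors ignored; requires eps < 1."""
    return 2 * max(eps / (1 - eps), 1)

rng = random.Random(0)  # seeded generator so runs are reproducible
truth = [1, 0, 1, 1]
skipped = lossy_sample(truth, 0.5, rng)
flipped = noisy_sample(truth, 0.5, rng)
```

For $ε \le 1/2$ the exponent stays at 2, the parametric rate. Beyond that it grows, for example to 6 at $ε = 0.75$, which is the phase transition the abstract describes.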

preprint · 2020 · arXiv

Self-regularizing Property of Nonparametric Maximum Likelihood Estimator in Mixture Models

Introduced by Kiefer and Wolfowitz \cite{KW56}, the nonparametric maximum likelihood estimator (NPMLE) is a widely used methodology for learning mixture models and empirical Bayes estimation. Sidestepping the non-convexity in mixture likelihood, the NPMLE estimates the mixing distribution by maximizing the total likelihood over the space of probability measures, which can be viewed as an extreme form of overparameterization. In this paper we discover a surprising property of the NPMLE solution. Consider, for example, a Gaussian mixture model on the real line with a subgaussian mixing distribution. Leveraging complex-analytic techniques, we show that with high probability the NPMLE based on a sample of size $n$ has $O(\log n)$ atoms (mass points), significantly improving the deterministic upper bound of $n$ due to Lindsay \cite{lindsay1983geometry1}. Notably, any such Gaussian mixture is statistically indistinguishable from a finite one with $O(\log n)$ components (and this is tight for certain mixtures). Thus, absent any explicit form of model selection, NPMLE automatically chooses the right model complexity, a property we term \emph{self-regularization}. Extensions to other exponential families are given. As a statistical application, we show that this structural property can be harnessed to bootstrap the existing Hellinger risk bound of the (parametric) MLE for finite Gaussian mixtures to the NPMLE for general Gaussian mixtures, recovering a result of Zhang \cite{zhang2009generalized}.

preprint · 2020 · arXiv

Spin torque gate magnetic field sensor

Spin-orbit torque provides an efficient pathway to manipulate the magnetic state and magnetization dynamics of magnetic materials, which is crucial for energy-efficient operation of a variety of spintronic devices such as magnetic memory, logic, oscillator, and neuromorphic computing. Here, we describe and experimentally demonstrate a strategy for the realization of a spin torque gate magnetic field sensor with an extremely simple structure by exploiting the longitudinal field dependence of the spin torque driven magnetization switching. Unlike most magnetoresistance sensors, which require a delicate magnetic bias to achieve a linear response to the external field, the spin torque gate sensor can achieve the same without any magnetic bias, which greatly simplifies the sensor structure. Furthermore, by driving the sensor using an ac current, the dc offset is automatically suppressed, which eliminates the need for a bridge or compensation circuit. We verify the concept using the newly developed WTe2/Ti/CoFeB trilayer and demonstrate that the sensor can work linearly in the range of 3-10 Oe with negligible dc offset.

preprint · 2020 · arXiv

Terahertz Emission From an Exchange-Coupled Synthetic Antiferromagnet

We report on terahertz emission from FeMnPt/Ru/FeMnPt and Pt/CoFeB/Ru/CoFeB/Pt synthetic antiferromagnet (SAF) structures upon irradiation by a femtosecond laser; the former is via the anomalous Hall effect, whereas the latter is through the inverse spin Hall effect. The antiparallel alignment of the two ferromagnetic layers leads to a terahertz emission peak amplitude that is almost double that for a corresponding single-layer or bilayer emitter with the same equivalent thickness. In addition, we demonstrate by both simulation and experiment that terahertz emission provides a powerful tool to probe the magnetization reversal processes of individual ferromagnetic layers in a SAF structure, as the terahertz signal is proportional to the vector difference of the magnetizations of the two ferromagnetic layers.

preprint · 2019 · arXiv

Intrinsic skyrmions in monolayer Janus magnets

Skyrmions are localized solitonic spin textures with protected topology, which are promising as information carriers in ultra-dense and energy-efficient logic and memory devices. Recently, magnetic skyrmions have been observed in magnetic thin films, and are stabilized by the extrinsic interfacial Dzyaloshinskii-Moriya interaction (DMI) and/or external magnetic fields. The specific effects in magnetic monolayer materials have not been thoroughly studied. Here, we investigate the intrinsic magnetic skyrmions in a family of monolayer Janus van der Waals magnets, MnSTe, MnSeTe, VSeTe, and MnSSe, by first-principles calculations combined with micromagnetic simulations. The monolayer Janus MnSTe, MnSeTe, and VSeTe with out-of-plane geometric asymmetry and strong spin-orbit coupling (SOC) have a large intrinsic DMI, which could stabilize sub-50 nm intrinsic skyrmions in monolayer MnSTe and MnSeTe at zero magnetic field. In contrast, monolayer VSeTe, with an in-plane easy axis, forms magnetic domains rather than skyrmions. Moreover, the size and shape of skyrmions can be tuned by an external magnetic field. Therefore, our work motivates a new vista for seeking intrinsic skyrmions in atomic-scale magnets.

preprint · 2019 · arXiv

Spin Hall magnetoresistance sensor using Au$_x$Pt$_{1-x}$ as the spin-orbit torque biasing layer

We report an investigation of a spin Hall magnetoresistance sensor based on NiFe/Au$_x$Pt$_{1-x}$ bilayers. Compared to NiFe/Pt, the NiFe/Au$_x$Pt$_{1-x}$ sensor exhibits a much lower power consumption (reduced by about 57%), due to an 80% enhancement of the spin-orbit torque efficiency of Au$_x$Pt$_{1-x}$ at an optimum composition of x = 0.19 as compared to pure Pt. The enhanced spin-orbit torque efficiency allows the NiFe thickness to be increased from 1.8 nm to 2.5 nm without significantly increasing the power consumption. We show that, by increasing the NiFe thickness, we were able to improve the working field range (0.86 Oe), operation temperature range (150 °C) and detectivity (0.71 nT/sqrt(Hz) at 1 Hz) of the sensor, which is important for practical applications.

preprint · 2019 · arXiv

Terahertz emission from anomalous Hall effect in a single-layer ferromagnet

We report on terahertz emission from a single-layer ferromagnet, which involves the generation of a backflow nonthermal charge current from the ferromagnet/dielectric interface by femtosecond laser excitation and the subsequent conversion of the charge current to a transverse transient charge current via the anomalous Hall effect, thereby generating the THz radiation. The THz emission can be either enhanced or suppressed, or even the polarity can be reversed, by introducing a magnetization gradient in the thickness direction of the ferromagnet. Unlike spintronic THz emitters reported previously, it does not require an additional non-magnetic layer or Rashba interface.