Source author record

Xi He

Xi He appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Topics: Cryptography and Security, cond-mat.str-el, cond-mat.supr-con, gr-qc, Databases, hep-th, Machine Learning, quant-ph, Artificial Intelligence, astro-ph.CO, Biological Physics, cond-mat.mtrl-sci, Data Structures and Algorithms, Human-Computer Interaction, physics.med-ph, Populations and Evolution

Catalog footprint

What is connected

19 works

16 topics

4 close collaborators


preprint · 2026 · arXiv

On Privacy Leakage in Tabular Diffusion Models: Influential Factors, Attacker Knowledge, and Metrics

Tabular data plays an important role in many fields and industries, including those with elevated privacy considerations and risks. As such, there is a rising interest in generating high-quality synthetic proxies for real tabular data as a means of reducing privacy risk and proprietary data exposure. With tabular diffusion models (TDMs) demonstrating leading performance in synthesizing such data, understanding and measuring the privacy risks associated with these models is imperative. Leveraging state-of-the-art membership inference attacks for TDMs in both black- and white-box settings, this work quantifies the impact of training setup, synthesis choices, and attacker knowledge on privacy leakage. Moreover, the results demonstrate that adversaries need not have perfect knowledge of the training setup, identical data distributions, or massive compute resources to construct successful attacks. Finally, the pitfalls associated with applying heuristic privacy metrics, such as distance-to-closest record, are revealed.
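For reference, the distance-to-closest-record (DCR) heuristic named in the abstract is commonly computed as the minimum distance from each synthetic row to any training row. The sketch below is a minimal illustration, assuming purely numeric, already-scaled rows and Euclidean distance; the function name and these choices are assumptions, not the paper's setup.

```python
import math

def distance_to_closest_record(synthetic, real):
    """For each synthetic row, return the Euclidean distance to its nearest real row.

    Assumption: rows are equal-length sequences of numbers on comparable scales.
    """
    if not real:
        raise ValueError("real must contain at least one row")
    return [min(math.dist(s, r) for r in real) for s in synthetic]

# Hypothetical toy data: the first synthetic row copies a training row exactly.
real_rows = [(0.0, 0.0), (3.0, 4.0)]
synthetic_rows = [(0.0, 0.0), (3.0, 0.0)]
dcr = distance_to_closest_record(synthetic_rows, real_rows)
```

A DCR of zero flags an exact copy. The abstract cautions that heuristics of this kind have pitfalls as privacy metrics, so a large DCR should not be read as a privacy guarantee.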

preprint · 2022 · arXiv

Effect of radiation-induced defects on the superfluid density and optical conductivity of overdoped La$_{2-x}$Sr$_x$CuO$_4$

Using a combination of time-domain THz spectroscopy (TDTS) and mutual inductance measurements, we have investigated the low-energy electrodynamic response of overdoped La$_{2-x}$Sr$_x$CuO$_4$ films that have been exposed to ion irradiation. Films went through three rounds of irradiation (2, 4, and 6 $\times 10^{13}$ ions/cm$^2$) and mutual inductance and TDTS experiments were performed between each step. Together with the as-grown film, this gives four different levels of disorder. The transport scattering rate that is measured directly in the THz experiments is an approximately linear function of the radiation dose at all temperatures. This is consistent with a proportionate increase in elastic scattering. In the superconducting state we find that the relation between $T_c$, the superfluid density, and the scattering rates is quantitatively at odds with the predictions based on the extant theory of Abrikosov-Gorkov-like pair breaking in a dirty $d$-wave superconductor. Increasing disorder causes only a small change in the superconducting transition temperature for the overdoped films, but the changes to the $ω\sim 0$ superfluid density are much larger.

preprint · 2022 · arXiv

Visualizing Privacy-Utility Trade-Offs in Differentially Private Data Releases

Organizations often collect private data and release aggregate statistics for the public's benefit. If no steps toward preserving privacy are taken, adversaries may use released statistics to deduce unauthorized information about the individuals described in the private dataset. Differentially private algorithms address this challenge by slightly perturbing underlying statistics with noise, thereby mathematically limiting the amount of information that may be deduced from each data release. Properly calibrating these algorithms -- and in turn the disclosure risk for people described in the dataset -- requires a data curator to choose a value for a privacy budget parameter, $ε$. However, there is little formal guidance for choosing $ε$, a task that requires reasoning about the probabilistic privacy-utility trade-off. Furthermore, choosing $ε$ in the context of statistical inference requires reasoning about accuracy trade-offs in the presence of both measurement error and differential privacy (DP) noise. We present Visualizing Privacy (ViP), an interactive interface that visualizes relationships between $ε$, accuracy, and disclosure risk to support setting and splitting $ε$ among queries. As a user adjusts $ε$, ViP dynamically updates visualizations depicting expected accuracy and risk. ViP also has an inference setting, allowing a user to reason about the impact of DP noise on statistical inferences. Finally, we present results of a study where 16 research practitioners with little to no DP background completed a set of tasks related to setting $ε$ using both ViP and a control. We find that ViP helps participants more correctly answer questions related to judging the probability of where a DP-noised release is likely to fall and comparing between DP-noised and non-private confidence intervals.
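To make the role of $ε$ concrete, here is a minimal sketch of the textbook Laplace mechanism applied to a count. It is an illustration only and not ViP's implementation: the function names are hypothetical, and the sensitivity of 1 assumes that one individual changes the count by at most one.

```python
import random

def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_count(true_count, epsilon, rng, sensitivity=1.0):
    """Release a count with Laplace noise of scale sensitivity / epsilon."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return true_count + laplace_noise(sensitivity / epsilon, rng)

def expected_abs_error(epsilon, sensitivity=1.0):
    # For Laplace(0, b) noise the expected absolute error equals b.
    return sensitivity / epsilon
```

Halving $ε$ doubles the expected absolute error of the release, which is the kind of accuracy versus risk trade-off a curator has to weigh when choosing $ε$.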

preprint · 2021 · arXiv

Scanning SQUID characterization of extremely overdoped La$_{2-x}$Sr$_{x}$CuO$_{4}$

Recently, advances in film synthesis methods have enabled a study of extremely overdoped La$_{2-x}$Sr$_{x}$CuO$_{4}$. This has revealed a surprising behavior of the superfluid density as a function of doping and temperature, the explanation of which is vigorously debated. One popular class of models posits electronic phase separation, where the superconducting phase fraction decreases with doping, while some competing phase (e.g. ferromagnetic) progressively takes over. A problem with this scenario is that all the way up to the dome edge the superconducting transition remains sharp, according to mutual inductance measurements. However, the physically relevant scale is the Pearl penetration depth, $Λ_{P}$, and this technique probes the sample on a length scale $L$ that is much larger than $Λ_{P}$. In the present paper, we use local scanning SQUID measurements that probe the susceptibility of the sample on the scale $L \ll Λ_{P}$. Our SQUID maps show uniform landscapes of susceptibility and excellent overall agreement of the local penetration depth data with the bulk measurements. These results contribute an important piece to the puzzle of how high-temperature superconductivity vanishes on the overdoped side of the cuprate phase diagram.

preprint · 2020 · arXiv

CHD Risk Minimization through Lifestyle Control: Machine Learning Gateway

Studies on the influence of a modern lifestyle in abetting Coronary Heart Diseases (CHD) have mostly focused on deterrent health factors, like smoking, alcohol intake, cheese consumption and average systolic blood pressure (SBP), largely disregarding the impact of a healthy lifestyle in mitigating CHD risk. In this study, 30+ years of World Health Organization (WHO) data have been analyzed, using a wide array of advanced Machine Learning techniques, to quantify how regulated reliance on positive health indicators, e.g. fruits/vegetables and cereals, can offset CHD risk factors over a period of time. Our research ranks the impact of the negative outliers on CHD and then quantifies the impact of the positive health factors in mitigating the negative risk factors. Our research outcomes, presented through simple mathematical equations, outline the best CHD prevention strategy using lifestyle control only. We show that a 20% increase in the intake of fruit/vegetable leads to a 3-6% decrease in SBP; or, a 10% increase in cereal intake lowers SBP by 3%; a simultaneous increase of 10% in fruit-vegetable can further offset the effects of SBP by 6%. Our analysis establishes gender independence of lifestyle on CHD, refuting long-held assumptions and unqualified beliefs. We show that CHD risk can be lowered with incremental changes in lifestyle and diet, e.g. fruit-vegetable intake ameliorating effects of alcohol-smoking-fatty food. Our multivariate data model also estimates functional relationships amongst lifestyle factors that can potentially redefine the diagnostics of Framingham score-based CHD-prediction.

preprint · 2020 · arXiv

Computing Local Sensitivities of Counting Queries with Joins

Local sensitivity of a query Q given a database instance D, i.e. how much the output Q(D) changes when a tuple is added to D or deleted from D, has many applications including query analysis, outlier detection, and differential privacy. However, it is NP-hard to find the local sensitivity of a conjunctive query in terms of the size of the query, even for the class of acyclic queries. Although the complexity is polynomial when the query size is fixed, the naive algorithms are not efficient for large databases and queries involving multiple joins. In this paper, we present a novel approach to compute the local sensitivity of counting queries involving join operations by tracking and summarizing tuple sensitivities -- the maximum change a tuple can cause in the query result when it is added or removed. We give algorithms for the sensitivity problem for full acyclic join queries using join trees that run in polynomial time in both the size of the database and the query for an interesting sub-class of queries, which we call 'doubly acyclic queries' and which includes path queries, and in polynomial time in combined complexity when the maximum degree in the join tree is bounded. Our algorithms can be extended to certain non-acyclic queries using generalized hypertree decompositions. We evaluate our approach experimentally, and show applications of our algorithms to obtain better results for differential privacy by orders of magnitude.
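As a small illustration of tuple sensitivity, consider the simplest path query, a count over a two-relation join R(a, b) ⋈ S(b, c). Adding or removing an R-tuple with join value v changes the count by the number of S-tuples carrying v, and symmetrically for S. The sketch below assumes bag semantics and an unrestricted domain for the join attribute; the function names are hypothetical, and this is the direct frequency argument for two relations, not the paper's join-tree algorithm.

```python
from collections import Counter

def join_count(r, s):
    """Count of R(a, b) JOIN S(b, c) on b, under bag semantics."""
    s_freq = Counter(b for b, _ in s)
    return sum(s_freq[b] for _, b in r)

def local_sensitivity(r, s):
    """Largest change in join_count caused by adding or removing one tuple.

    A new R-tuple may carry any join value, so the worst case is the most
    frequent join value in S (and vice versa for a new S-tuple).
    """
    r_freq = Counter(b for _, b in r)
    s_freq = Counter(b for b, _ in s)
    return max(max(r_freq.values(), default=0),
               max(s_freq.values(), default=0))

# Hypothetical toy instance.
R = [(1, "x"), (2, "x"), (3, "y")]
S = [("x", 10), ("y", 20), ("y", 21), ("y", 22)]
base = join_count(R, S)        # 2*1 + 1*3 = 5
ls = local_sensitivity(R, S)   # adding an R-tuple with b="y" adds 3 results
```

For longer joins the same idea requires combining frequencies along the join tree, which is where the efficient algorithms described in the abstract come in.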

preprint · 2020 · arXiv

Crypt$ε$: Crypto-Assisted Differential Privacy on Untrusted Servers

Differential privacy (DP) has steadily become the de-facto standard for achieving privacy in data analysis, which is typically implemented either in the "central" or "local" model. The local model has been more popular for commercial deployments as it does not require a trusted data collector. This increased privacy, however, comes at a cost of utility and algorithmic expressibility as compared to the central model. In this work, we propose Crypt$ε$, a system and programming framework that (1) achieves the accuracy guarantees and algorithmic expressibility of the central model (2) without any trusted data collector, as in the local model. Crypt$ε$ achieves the "best of both worlds" by employing two non-colluding untrusted servers that run DP programs on encrypted data from the data owners. Although straightforward implementations of DP programs using secure computation tools can achieve the above goal theoretically, in practice they are beset with many challenges such as poor performance and tricky security proofs. To this end, Crypt$ε$ allows data analysts to author logical DP programs that are automatically translated to secure protocols that work on encrypted data. These protocols ensure that the untrusted servers learn nothing more than the noisy outputs, thereby guaranteeing DP (for computationally bounded adversaries) for all Crypt$ε$ programs. Crypt$ε$ supports a rich class of DP programs that can be expressed via a small set of transformation and measurement operators followed by arbitrary post-processing. Further, we propose performance optimizations leveraging the fact that the output is noisy. We demonstrate Crypt$ε$'s feasibility for practical DP analysis with extensive empirical evaluations on real datasets.

preprint · 2020 · arXiv

DP-Cryptography: Marrying Differential Privacy and Cryptography in Emerging Applications

Differential privacy (DP) has arisen as the state-of-the-art metric for quantifying individual privacy when sensitive data are analyzed, and it is starting to see practical deployment in organizations such as the US Census Bureau, Apple, Google, etc. There are two popular models for deploying differential privacy: standard differential privacy (SDP), where a trusted server aggregates all the data and runs the DP mechanisms, and local differential privacy (LDP), where each user perturbs their own data and the perturbed data is analyzed. Due to security concerns arising from aggregating raw data at a single server, several real-world deployments in industry have embraced the LDP model. However, systems based on the LDP model tend to have poor utility, i.e. "a gap" in the utility achieved as compared to systems based on the SDP model. In this work, we survey and synthesize emerging directions of research at the intersection of differential privacy and cryptography. First, we survey solutions that combine cryptographic primitives like secure computation and anonymous communication with differential privacy to give alternatives to the LDP model that avoid a trusted server as in SDP but close the gap in accuracy. These primitives introduce performance bottlenecks and necessitate efficient alternatives. Second, we synthesize work in an area we call "DP-Cryptography": cryptographic primitives that are allowed to leak differentially private outputs. These primitives have orders of magnitude better performance than standard cryptographic primitives. DP-cryptographic primitives are perfectly suited for implementing alternatives to LDP, but are also applicable to scenarios where standard cryptographic primitives do not have practical implementations. Through this unique lens of research taxonomy, we survey ongoing research in these directions while also providing novel directions for future research.

preprint · 2020 · arXiv

Efficient Distributed Hessian Free Algorithm for Large-scale Empirical Risk Minimization via Accumulating Sample Strategy

In this paper, we propose a Distributed Accumulated Newton Conjugate gradiEnt (DANCE) method in which the sample size is gradually increased to quickly obtain a solution whose empirical loss is within satisfactory statistical accuracy. Our proposed method is multistage: the solution of one stage serves as a warm start for the next stage, which contains more samples (including the samples in the previous stage). The proposed multistage algorithm reduces the number of passes over the data needed to achieve the statistical accuracy of the full training set. Moreover, our algorithm is by nature easy to distribute and has the strong scaling property, indicating that acceleration is always expected when using more computing nodes. Various iteration complexity results regarding descent direction computation, communication efficiency and stopping criteria are analyzed in the convex setting. Our numerical results illustrate that the proposed method outperforms other comparable methods for solving learning problems including neural networks.

preprint · 2020 · arXiv

Linear and Range Counting under Metric-based Local Differential Privacy

Local differential privacy (LDP) enables private data sharing and analytics without the need for a trusted data collector. Error-optimal primitives (for, e.g., estimating means and item frequencies) under LDP have been well studied. For analytical tasks such as range queries, however, the best known error bound is dependent on the domain size of private data, which is potentially prohibitive. This deficiency is inherent as LDP protects the same level of indistinguishability between any pair of private data values for each data owner. In this paper, we utilize an extension of $ε$-LDP called Metric-LDP or $E$-LDP, where a metric $E$ defines heterogeneous privacy guarantees for different pairs of private data values and thus provides a more flexible knob than $ε$ does to relax LDP and tune utility-privacy trade-offs. We show that, under such privacy relaxations, for analytical workloads such as linear counting, multi-dimensional range counting queries, and quantile queries, we can achieve significant gains in utility. In particular, for range queries under $E$-LDP where the metric $E$ is the $L^1$-distance function scaled by $ε$, we design mechanisms with errors independent of the domain sizes; instead, their errors depend on the metric $E$, which specifies at what granularity the private data is protected. We believe that the primitives we design for $E$-LDP will be useful in developing mechanisms for other analytical tasks, and encourage the adoption of LDP in practice.
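One way to see what a metric-scaled guarantee means: assume the usual metric-privacy formulation, in which the output densities for inputs $x$ and $x'$ differ by a factor of at most $e^{E(x,x')}$, with $E(x,x') = ε|x - x'|$. Adding Laplace noise of scale $1/ε$ to a numeric value then meets that bound by the triangle inequality, with a noise scale that does not depend on the domain size. The sketch below checks this numerically; it is a textbook mechanism chosen for illustration, not one of the paper's mechanisms, and the function names are hypothetical.

```python
import math

def laplace_pdf(y, center, scale):
    """Density at y of the Laplace distribution centered at `center`."""
    return math.exp(-abs(y - center) / scale) / (2.0 * scale)

def max_density_ratio(x1, x2, epsilon, outputs):
    """Largest ratio p(y | x1) / p(y | x2) over the probed outputs,
    for the mechanism y = x + Laplace(0, 1/epsilon)."""
    scale = 1.0 / epsilon
    return max(laplace_pdf(y, x1, scale) / laplace_pdf(y, x2, scale)
               for y in outputs)

epsilon = 0.5
grid = [i * 0.5 for i in range(-20, 41)]          # probe outputs in [-10, 20]
near = max_density_ratio(3.0, 4.0, epsilon, grid)   # inputs 1 apart
far = max_density_ratio(0.0, 10.0, epsilon, grid)   # inputs 10 apart
```

Nearby values stay within a factor of $e^{ε}$ of each other, while distant values are only protected up to $e^{ε \cdot 10}$. This is the heterogeneous guarantee the abstract describes, and it is weaker than plain $ε$-LDP for distant pairs.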

preprint · 2020 · arXiv

Quantum locally linear embedding for nonlinear dimensionality reduction

Reducing the dimension of nonlinear data is crucial in data processing and visualization. The locally linear embedding algorithm (LLE) is a representative nonlinear dimensionality reduction method that preserves the original manifold structure well. In this paper, we present two implementations of the quantum locally linear embedding algorithm (QLLE) to perform nonlinear dimensionality reduction on quantum devices. One implementation, the linear-algebra-based QLLE algorithm, utilizes quantum linear algebra subroutines to reduce the dimension of the given data. The other implementation, the variational quantum locally linear embedding algorithm (VQLLE), utilizes a variational hybrid quantum-classical procedure to acquire the low-dimensional data. The classical LLE algorithm has time complexity polynomial in $N$, where $N$ is the total number of original high-dimensional data points. Compared with the classical LLE, the linear-algebra-based QLLE achieves a quadratic speedup in the number and dimension of the given data. The VQLLE can be implemented on near-term quantum devices in two different designs. In addition, numerical experiments are presented to demonstrate that both implementations can carry out the locally linear embedding procedure.

preprint · 2020 · arXiv

Quantum transfer component analysis for domain adaptation

Domain adaptation, a crucial sub-field of transfer learning, aims to utilize knowledge of one data set to accomplish tasks on another data set. In this paper, we perform one of the most representative domain adaptation algorithms, transfer component analysis (TCA), on quantum devices. Two different quantum implementations of this transfer learning algorithm are presented: the linear-algebra-based quantum TCA algorithm and the variational quantum TCA algorithm. The algorithmic complexity of the linear-algebra-based quantum TCA algorithm is $O(\mathrm{poly}(\log (n_{s} + n_{t})))$, where $n_{s}$ and $n_{t}$ are the input sample sizes. Compared with the corresponding classical algorithm, the linear-algebra-based quantum TCA can be performed on a universal quantum computer with exponential speedup in the number of given samples. Finally, the variational quantum TCA algorithm, which is based on a quantum-classical hybrid procedure and can be implemented on near-term quantum devices, is proposed.

preprint · 2020 · arXiv

Tunneling spectroscopy of c-axis epitaxial cuprate junctions

Atomically precise epitaxial structures are unique systems for tunneling spectroscopy that minimize extrinsic effects of disorder. We present a systematic tunneling spectroscopy study, over a broad doping, temperature, and bias range, in epitaxial c-axis La$_{2-x}$Sr$_{x}$CuO$_{4}$/La$_{2}$CuO$_{4}$/La$_{2-x}$Sr$_{x}$CuO$_{4}$ heterostructures. The behavior of these superconductor/insulator/superconductor (SIS) devices is unusual. Down to 20 mK there is complete suppression of c-axis Josephson critical current with a barrier of only 2 nm of La$_{2}$CuO$_{4}$, and the zero-bias conductance remains at 20-30% of the normal-state conductance, implying a substantial population of in-gap states. Tunneling spectra show greatly suppressed coherence peaks. As the temperature is raised, the superconducting gap fills in rather than closing at $T_{c}$. For all doping levels, the spectra show an inelastic tunneling feature at $\sim$ 80 meV, suppressed as $T$ exceeds $T_{c}$. These nominally simple epitaxial cuprate junctions deviate markedly from expectations based on the standard Bardeen-Cooper-Schrieffer (BCS) theory.

preprint · 2016 · arXiv

The role of double TiO2 layers at the interface of FeSe/SrTiO3 superconductors

We determine the surface reconstruction of SrTiO3 used to achieve superconducting FeSe films in experiments, which is different from the 1x1 TiO2 terminated SrTiO3 assumed by most previous theoretical studies. In particular, we identify the existence of a double TiO2 layer at the SrTiO3-FeSe interface that plays two important roles. First, it facilitates the epitaxial growth of FeSe. Second, ab initio calculations reveal a strong tendency for electrons to transfer from an oxygen deficient SrTiO3 surface to FeSe when the double TiO2 layer is present. As a better electron donor than previously proposed interfacial structures, the double layer helps to remove the hole pocket in the FeSe at the Γ point of the Brillouin zone and leads to a band structure characteristic of superconducting samples. The characterization of the interface structure presented here is a key step towards the resolution of many open questions about this novel superconductor.

preprint · 2014 · arXiv

Blowfish Privacy: Tuning Privacy-Utility Trade-offs using Policies

Privacy definitions provide ways for trading off the privacy of individuals in a statistical database for the utility of downstream analysis of the data. In this paper, we present Blowfish, a class of privacy definitions inspired by the Pufferfish framework, that provides a rich interface for this trade-off. In particular, we allow data publishers to extend differential privacy using a policy, which specifies (a) secrets, or information that must be kept secret, and (b) constraints that may be known about the data. While the secret specification allows increased utility by lessening protection for certain individual properties, the constraint specification provides added protection against an adversary who knows correlations in the data (arising from constraints). We formalize policies and present novel algorithms that can handle general specifications of sensitive information and certain count constraints. We show that there are reasonable policies under which our privacy mechanisms for k-means clustering, histograms and range queries introduce significantly less noise than their differentially private counterparts. We quantify the privacy-utility trade-offs for various policies analytically and empirically on real datasets.

preprint · 2012 · arXiv

Constraints of the equation of state of dark energy from current and future observational data by piecewise parametrizations

The model-independent piecewise parametrizations (0-spline, linear-spline and cubic-spline) are used to estimate constraints on the equation of state of dark energy ($w_{de}$) from current observational data (including SNIa, BAO and Hubble parameter) and simulated future data. A combination of the fitting results for $w_{de}$ from these three spline methods reveals essential properties of the real equation of state $w_{de}$. It is shown that $w_{de}$ beyond redshift $z\sim0.5$ is poorly constrained by current data, and the mock future $\sim2300$ supernovae data give poor constraints on $w_{de}$ beyond $z\sim1$. The fitting results also indicate that there might exist a rapid transition of $w_{de}$ around $z\sim0.5$. The differences between the three spline methods in reconstructing and constraining $w_{de}$ are also discussed.
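To illustrate what the two simplest parametrizations look like, the sketch below evaluates a piecewise-constant (0-spline) and a piecewise-linear (linear-spline) $w_{de}(z)$ from values given on redshift bins or knots. The bin edges and values are made-up placeholders, not fitted results, and the function names are hypothetical.

```python
from bisect import bisect_right

def w_zero_spline(z, edges, values):
    """Piecewise-constant w(z): values[i] applies on [edges[i], edges[i+1])."""
    if len(values) != len(edges) - 1:
        raise ValueError("need one value per bin")
    if not edges[0] <= z <= edges[-1]:
        raise ValueError("z outside the parametrized range")
    i = min(bisect_right(edges, z) - 1, len(values) - 1)
    return values[i]

def w_linear_spline(z, knots, values):
    """Piecewise-linear w(z): linear interpolation between knot values."""
    if len(values) != len(knots):
        raise ValueError("need one value per knot")
    if not knots[0] <= z <= knots[-1]:
        raise ValueError("z outside the parametrized range")
    i = min(bisect_right(knots, z) - 1, len(knots) - 2)
    t = (z - knots[i]) / (knots[i + 1] - knots[i])
    return values[i] + t * (values[i + 1] - values[i])

# Placeholder redshift grid and equation-of-state values (illustrative only).
z_grid = [0.0, 0.5, 1.0]
w_const = w_zero_spline(0.25, z_grid, [-1.0, -0.8])
w_lin = w_linear_spline(0.25, z_grid, [-1.0, -0.8, -0.6])
```

In a fit, the bin or knot values would be the free parameters constrained by the data; the cubic-spline variant replaces the linear interpolation with a smooth interpolant through the same knots.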

preprint · 2011 · arXiv

Phase transitions in AdS soliton spacetime through marginally stable modes

We investigate the marginally stable modes of the scalar (vector) perturbations in the AdS soliton background coupled to an electric field. In the probe limit, we find that the marginally stable modes can reveal the onset of the phase transitions of this model. The critical chemical potentials obtained from this approach are in good agreement with previous numerical and analytical results.

preprint · 2010 · arXiv

Robust isothermal electric switching of interface magnetization: A route to voltage-controlled spintronics

Roughness-insensitive and electrically controllable magnetization at the (0001) surface of antiferromagnetic chromia is observed using magnetometry and spin-resolved photoemission measurements and explained by the interplay of surface termination and magnetic ordering. Further, this surface is placed in proximity with a ferromagnetic Co/Pd multilayer film. Exchange coupling across the interface between chromia and Co/Pd induces an electrically controllable exchange bias in the Co/Pd film, which enables a reversible isothermal (at room temperature) shift of the global magnetic hysteresis loop of the Co/Pd film along the magnetic field axis between negative and positive values. These results reveal the potential of magnetoelectric chromia for spintronic applications requiring non-volatile electric control of magnetization.

preprint · 2010 · arXiv

Signature of the black hole phase transition in quasinormal modes

We study the perturbation of the scalar field interacting with the Maxwell field in the background of the d-dimensional charged AdS black hole and AdS soliton. Unlike the single classical field perturbation, which always has a decaying mode in the black hole background, we observe a possible growing mode when the perturbation of the scalar field couples strongly to the Maxwell field. Our results disclose the signature of how the phase transition happens when the interaction among classical fields is strong. The sudden change of the perturbation to a growing mode is also observed in the AdS soliton with electric potential. However, in the magnetically charged AdS soliton background, we observe consistent perturbation behavior when the interaction between the scalar field and the Maxwell field is considered. This implies that for the magnetically charged AdS soliton configuration, unlike the situation with electric potential, there is no scalar field condensation to cause the phase change.
