Source author record

Li Xing

Li Xing appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Applications, Methodology, Computation, cond-mat.mes-hall, Machine Learning, physics.app-ph, physics.class-ph, quant-ph

Catalog footprint

What is connected

5 works

8 topics

4 close collaborators

Preprint, 2024, arXiv

Minor Issues Escalated to Critical Levels in Large Samples: A Permutation-Based Fix

In the big data era, the need to reevaluate traditional statistical methods is paramount due to the challenges posed by vast datasets. While larger samples theoretically enhance accuracy and hypothesis testing power without increasing false positives, practical concerns about inflated Type I errors persist. The prevalent belief is that larger samples can uncover subtle effects, necessitating dual consideration of p-value and effect size. Yet, the reliability of p-values from large samples remains debated. This paper warns that larger samples can exacerbate minor issues into significant errors, leading to false conclusions. Through our simulation study, we demonstrate how growing sample sizes amplify issues arising from two commonly encountered violations of model assumptions in real-world data and lead to incorrect decisions. This underscores the need for vigilant analytical approaches in the era of big data. In response, we introduce a permutation-based test to counterbalance the effects of sample size and assumption discrepancies by neutralizing them between actual and permuted data. We demonstrate that this approach effectively stabilizes nominal Type I error rates across various sample sizes, thereby ensuring robust statistical inferences even amidst breached conventional assumptions in big data. For reproducibility, our R code is publicly available at https://github.com/ubcxzhang/bigDataIssue
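As a rough illustration of the general idea behind a permutation test, the sketch below computes a two-sided permutation p-value for a difference in group means. It is a generic textbook procedure, not the authors' test (their R code lives in the repository linked above). The function name `permutation_p_value`, the choice of statistic, and the default of 2000 permutations are assumptions made for this example.

```python
import random


def permutation_p_value(x, y, n_permutations=2000, seed=0):
    """Two-sided permutation p-value for a difference in group means.

    Generic illustration only: the statistic and defaults are assumptions,
    not the procedure proposed in the paper.
    """
    rng = random.Random(seed)
    n_x, n_y = len(x), len(y)
    observed = abs(sum(x) / n_x - sum(y) / n_y)

    pooled = list(x) + list(y)
    extreme = 0
    for _ in range(n_permutations):
        # Shuffling the pooled data breaks any real link between
        # group label and value, which simulates the null hypothesis.
        rng.shuffle(pooled)
        stat = abs(sum(pooled[:n_x]) / n_x - sum(pooled[n_x:]) / n_y)
        if stat >= observed:
            extreme += 1

    # Add-one correction keeps the p-value strictly positive.
    return (extreme + 1) / (n_permutations + 1)


if __name__ == "__main__":
    control = [float(i) for i in range(20)]
    shifted = [v + 10.0 for v in control]
    print(permutation_p_value(control, shifted))
```

Because the reference distribution is built from the data themselves, the same data quirks are present in both the observed and the permuted statistics, which is the intuition the abstract describes as neutralizing them between actual and permuted data.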

Preprint, 2022, arXiv

Handling highly correlated genes in prediction analysis of genomic studies

Background: Selecting feature genes to predict phenotypes is one of the typical tasks in analyzing genomics data. Though many general-purpose algorithms have been developed for prediction, dealing with highly correlated genes in the prediction model is still not well addressed. High correlation among genes introduces technical problems, such as multicollinearity, leading to unreliable prediction models. Furthermore, when a causal gene (whose variants have an actual biological effect on a phenotype) is highly correlated with other genes, most algorithms select the feature gene from the correlated group in a purely data-driven manner. Since the correlation structure among genes can change substantially when conditions change, a prediction model built on incorrectly selected feature genes is unreliable. Therefore, we aim to keep the causal biological signal in the prediction process and build a more robust prediction model. Method: We propose a grouping algorithm, which treats highly correlated genes as a group and uses their common pattern to represent the group's biological signal in feature selection. Our novel grouping algorithm can be integrated into existing prediction algorithms to enhance their prediction performance. Our proposed grouping method has two advantages. First, using the gene group's common patterns makes the prediction more robust and reliable under changing conditions. Second, it reports whole correlated gene groups as discovered biomarkers for prediction tasks, allowing researchers to conduct follow-up studies to identify causal genes within the identified groups. Result: Using real benchmark scRNA-seq datasets with simulated cell phenotypes, we demonstrate that our novel method significantly outperforms standard models in both (1) prediction of cell phenotypes and (2) feature gene selection.
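The grouping idea in the Method paragraph can be sketched in a few lines. The abstract does not specify how groups are formed or how the common pattern is computed, so everything here is an assumption for illustration: a greedy pass that compares each gene with the first member of each existing group, a correlation threshold of 0.9, and the mean of standardized profiles as the shared pattern. The function names are hypothetical, and genes are assumed to be non-constant across cells.

```python
from math import sqrt


def standardize(values):
    """Center to mean 0 and scale to unit (population) standard deviation."""
    n = len(values)
    mu = sum(values) / n
    sd = sqrt(sum((v - mu) ** 2 for v in values) / n)
    return [(v - mu) / sd for v in values]  # assumes sd > 0


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    za, zb = standardize(a), standardize(b)
    return sum(p * q for p, q in zip(za, zb)) / len(za)


def group_correlated(expr, threshold=0.9):
    """Greedily group genes that correlate strongly with a group's first gene.

    expr maps gene name -> expression values across cells.
    The greedy rule and the threshold are illustrative assumptions.
    """
    groups = []
    for gene in expr:
        for group in groups:
            if pearson(expr[gene], expr[group[0]]) >= threshold:
                group.append(gene)
                break
        else:
            # No existing group matched, so this gene starts a new one.
            groups.append([gene])
    return groups


def group_pattern(expr, group):
    """Average the standardized profiles of a group as its shared pattern."""
    profiles = [standardize(expr[g]) for g in group]
    return [sum(col) / len(col) for col in zip(*profiles)]


if __name__ == "__main__":
    expr = {
        "g1": [1.0, 2.0, 3.0, 4.0, 5.0],
        "g2": [2.0, 4.0, 6.0, 8.0, 10.0],
        "g3": [5.0, 1.0, 4.0, 2.0, 3.0],
    }
    for group in group_correlated(expr):
        print(group, group_pattern(expr, group))
```

In a sketch like this, each group pattern would replace its member genes as a single feature in whatever prediction model follows, and the full group membership is kept so it can be reported alongside the selected feature.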

Preprint, 2022, arXiv

Microwave heating effect on diamond sample of NV centers

Diamond samples containing negatively charged nitrogen-vacancy (NV) centers are promising solid-state spin sensors, suitable for quantum information processing and for highly sensitive nanoscale measurements of magnetic, electric and thermal fields. The NV center is unique for its robust, temperature-dependent zero-field splitting Dgs of the triplet ground state. This property enables optical readout of electron spin states through manipulation of the ground triplet state using microwave resonance with Dgs from 100 K to about 600 K. Thus, shielding Dgs from unwanted external thermal disturbances is crucial for accurate measurement with diamond NV sensors. Our observations demonstrate a prominent microwave heating effect on diamond samples with NV centers. The effect inevitably shifts Dgs and causes measurement errors. The temperature increase caused by the effect depends monotonically on the power and duration of the microwave irradiation. The effect is pronounced under continuous-mode microwave irradiation and some pulse-sequence modes, but is negligible for the quantum lock-in XY8-N method.

Preprint, 2021, arXiv

A measurement method of transverse light-shift in atomic spin co-magnetometer

We present a method to obtain the transverse light-shift along the probe light of a single-axis alkali metal-noble gas co-magnetometer. The relationship between the transverse compensating field and the light-shift is derived from the steady-state solution of the Bloch equations. The probe light intensity is varied to obtain the residual magnetic field, and step modulation tests are applied to acquire the total spin-relaxation rate of the electron spins and the self-compensation point. Finally, the transverse light-shift is reduced from -0.115 nT to -0.039 nT by optimizing the probe light wavelength, and the value of the calibration coefficient can be increased simultaneously.

Preprint, 2020, arXiv

Optimal Study Design for Reducing Variances of Coefficient Estimators in Change-Point Models

In longitudinal studies, we observe measurements of the same variables at different time points to track the changes in their pattern over time. In such studies, scheduling of the data collection waves (i.e. the times of participants' visits) is often pre-determined to accommodate ease of project management and compliance. Hence, it is common to schedule those visits at equally spaced time intervals. However, recent publications based on simulated experiments indicate that the power of studies and the precision of model parameter estimators are related to the participants' visiting schemes. In this paper, we consider longitudinal studies that investigate the changing pattern of a disease outcome (e.g. the accelerated cognitive decline of senior adults). Such studies are often analyzed by the broken-stick model, consisting of two segments of linear models connected at an unknown change-point. We formulate this design problem as a high-dimensional optimization problem and derive its analytical solution. Based on this solution, we propose an optimal design of the visiting scheme that maximizes the power (i.e. reduces the variance of estimators) to identify the onset of accelerated decline. Using both simulation studies and evidence from real data, we demonstrate that our optimal design outperforms the standard equally spaced design. By applying our novel design to plan longitudinal studies, researchers can improve the power of detecting pattern change without collecting extra data.
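To make the link between visit times and estimator variance concrete, the sketch below writes the broken-stick mean as b0 + b1*t + b2*(t - tau)_+ and evaluates the least-squares variance of the slope change b2 for a given visiting scheme. It is a simplification and not the authors' derivation: the change-point tau is treated as known (the abstract treats it as unknown), errors are assumed independent with common variance sigma2, and all function and parameter names are hypothetical.

```python
def hinge(t, tau):
    """Positive part (t - tau)_+ used by the broken-stick model."""
    return max(t - tau, 0.0)


def broken_stick_mean(t, b0, b1, b2, tau):
    """Mean outcome: slope b1 before tau, slope b1 + b2 after tau."""
    return b0 + b1 * t + b2 * hinge(t, tau)


def slope_change_variance(visit_times, tau, sigma2=1.0):
    """Variance of the least-squares estimate of b2, with tau assumed known.

    Uses Var(beta_hat) = sigma2 * (X'X)^{-1}, where each design row is
    (1, t_i, (t_i - tau)_+). Needs visits on both sides of tau.
    """
    rows = [(1.0, t, hinge(t, tau)) for t in visit_times]
    # Gram matrix X'X (3 x 3, symmetric).
    g = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    det = (
        g[0][0] * (g[1][1] * g[2][2] - g[1][2] * g[2][1])
        - g[0][1] * (g[1][0] * g[2][2] - g[1][2] * g[2][0])
        + g[0][2] * (g[1][0] * g[2][1] - g[1][1] * g[2][0])
    )
    # The (3,3) entry of the inverse is the cofactor of g[2][2] over det.
    cofactor = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return sigma2 * cofactor / det


if __name__ == "__main__":
    equally_spaced = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
    print(slope_change_variance(equally_spaced, tau=5.0))
```

Calling `slope_change_variance` with different visit schedules for the same number of visits shows how the schedule alone changes the precision of the slope-change estimate, which is the quantity a design of this kind tries to reduce.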