Source author record

Lin Yang

Lin Yang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Catalog footprint

What is connected

53 works

37 topics

4 close collaborators

preprint, 2026, arXiv

A Large and Precise All-Sky Photometric Standard Star Dataset Across More Than 200 Passbands

High-precision photometric standard stars play a key role in enabling accurate photometric calibration and advancing various fields of astronomy. However, due to limitations in calibration methods and the limited availability and underuse of high-precision reference data, existing photometric standard stars may suffer from insufficient numbers, systematic errors exceeding 10 millimagnitudes (mmag), limited photometric band coverage, or incomplete sky coverage, among other issues. To overcome these limitations, we have constructed the largest (over 200 million stars, 1000 times the widely recognized Landolt standards in the same magnitude range), most precise (better than 10 mmag), and most comprehensive (over 200 bands, nearly 40 times the coverage of traditional standards) set of all-sky standard stars. Based on these standards, we have calibrated multiple survey datasets to mmag precision, and subsequently developed a complete sky distribution of stars for the Pan-STARRS system. This database, the BEst STars Database (BEST), is expected to pave the way for achieving mmag-level, or even higher, photometric precision in large-scale surveys, and to play a central role in shaping a high-precision astronomical measurement framework.

preprint, 2026, arXiv

A Production-Ready RL Framework for Personalized Utility Tuning with Pareto Sweeping in Pinterest Recommender Systems

Large-scale recommenders encode multi-objective trade-offs by combining multiple predicted outcomes into a single utility score. Although this utility layer can be updated independently of the ranker, weight tuning remains largely manual, globally applied, slow to adapt to changing environments and business needs, and hard to govern as priorities shift. We propose PRL-PUTS, a Production-ready, ranker-independent RL framework for Personalized Utility-weight Tuning with Pareto Sweeping. We cast utility tuning as a one-step, value-based RL problem: given the request context, an agent selects a utility-weight vector that re-weights ranker predictions to maximize request-level engagement rewards. To visualize performance across the trade-off spectrum and allow decision makers to update the deployed operating policy instantly, we adopt inference-time Pareto frontier sweeping via a scalarization parameter, producing a family of policies and an empirical Pareto frontier used as a governance artifact for operating policy selection. PRL-PUTS runs in parallel with ranking inference without adding serving latency. We validate PRL-PUTS with offline analysis using unbiased exploration logs and online experiments on Pinterest Homefeed, where PRL-PUTS showed significant increases in engagement compared to the baseline, such as a +0.13% increase in successful sessions, a core metric for user engagement.
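The utility layer described in this abstract can be illustrated with a minimal sketch, assuming a linear utility (a weighted sum of predicted outcomes). The function names, the outcome names, and the two-objective scalarization are hypothetical and chosen only for illustration; the abstract does not specify the form PRL-PUTS actually uses.

```python
from typing import Dict, List


def utility_score(predictions: Dict[str, float], weights: Dict[str, float]) -> float:
    """Combine predicted outcomes into one utility score via a weighted sum (assumed form)."""
    return sum(weights[name] * predictions[name] for name in weights)


def rank_by_utility(candidates: List[Dict[str, float]], weights: Dict[str, float]) -> List[int]:
    """Return candidate indices sorted by descending utility score."""
    scores = [utility_score(c, weights) for c in candidates]
    return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)


def scalarized_reward(reward_a: float, reward_b: float, lam: float) -> float:
    """Hypothetical two-objective scalarization; lam in [0, 1] sets the trade-off."""
    return lam * reward_a + (1.0 - lam) * reward_b
```

Sweeping `lam` from 0 to 1 and recording the resulting pair of objective values for each setting would trace out an empirical trade-off curve of the kind the abstract calls a Pareto frontier.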

preprint, 2024, arXiv

Multi-modal Learning with Missing Modality in Predicting Axillary Lymph Node Metastasis

Multi-modal learning has attracted widespread attention in medical image analysis. Using multi-modal data, namely whole slide images (WSIs) and clinical information, can improve the performance of deep learning models in the diagnosis of axillary lymph node metastasis. However, clinical information is not easy to collect in clinical practice due to privacy concerns, limited resources, lack of interoperability, etc. Although patient selection can ensure that the training set has multi-modal data for model development, the clinical-information modality can be missing at test time. This normally leads to performance degradation, which limits the use of multi-modal models in the clinic. To alleviate this problem, we propose a bidirectional distillation framework consisting of a multi-modal branch and a single-modal branch. The single-modal branch acquires the complete multi-modal knowledge from the multi-modal branch, while the multi-modal branch learns the robust features of WSIs from the single-modal branch. We conduct experiments on a public dataset of Lymph Node Metastasis in Early Breast Cancer to validate the method. Our approach not only achieves state-of-the-art performance with an AUC of 0.861 on the test set without missing data, but also yields an AUC of 0.842 when the rate of missing modality is 80%. This shows the effectiveness of the approach in dealing with multi-modal data and missing modality. Such a model has the potential to improve treatment decision-making for early breast cancer patients who have axillary lymph node metastatic status.

preprint, 2024, arXiv

On the Model-Misspecification in Reinforcement Learning

The success of reinforcement learning (RL) crucially depends on effective function approximation when dealing with complex ground-truth models. Existing sample-efficient RL algorithms primarily employ three approaches to function approximation: policy-based, value-based, and model-based methods. However, in the face of model misspecification (a disparity between the ground-truth and optimal function approximators), it is shown that policy-based approaches can be robust even when the policy function approximation is under a large locally-bounded misspecification error, with which the function class may exhibit an $\Omega(1)$ approximation error in specific states and actions, but remains small on average within a policy-induced state distribution. Yet it remains an open question whether similar robustness can be achieved with value-based and model-based approaches, especially with general function approximation. To bridge this gap, in this paper we present a unified theoretical framework for addressing model misspecification in RL. We demonstrate that, through meticulous algorithm design and sophisticated analysis, value-based and model-based methods employing general function approximation can achieve robustness under local misspecification error bounds. In particular, they can attain a regret bound of $\widetilde{O}\left(\text{poly}(d H)(\sqrt{K} + K\zeta) \right)$, where $d$ represents the complexity of the function class, $H$ is the episode length, $K$ is the total number of episodes, and $\zeta$ denotes the local bound for misspecification error. Furthermore, we propose an algorithmic framework that can achieve the same order of regret bound without prior knowledge of $\zeta$, thereby enhancing its practical applicability.

preprint, 2023, arXiv

Semantic Data Sourcing for 6G Edge Intelligence

As a new function of 6G networks, edge intelligence refers to the ubiquitous deployment of machine learning and artificial intelligence (AI) algorithms at the network edge to empower many emerging applications ranging from sensing to auto-pilot. The relevant use cases, including sensing, edge learning, and edge inference, all require transmission of high-dimensional data or AI models over the air. To overcome this communication bottleneck, we propose a novel framework of SEMantic DAta Sourcing (SEMDAS) for locating semantically matched data sources to efficiently enable edge-intelligence operations. The comprehensive framework comprises a new architecture, protocol, semantic matching techniques, and design principles for task-oriented wireless techniques. As the key component of SEMDAS, we discuss a set of machine learning based semantic matching techniques targeting different edge-intelligence use cases. Moreover, for designing task-oriented wireless techniques, we discuss different tradeoffs in SEMDAS systems, propose the new concept of joint semantics-and-channel matching, and point to a number of research opportunities. The SEMDAS framework not only overcomes the said communication bottleneck but also addresses other networking issues including long-distance transmission, sparse connectivity, high-speed mobility, link disruptions, and security. In addition, experimental results using a real dataset are presented to demonstrate the performance gain of SEMDAS.

preprint, 2022, arXiv

3-D Markerless Tracking of Human Gait by Geometric Trilateration of Multiple Kinects

In this paper, we develop an integrated markerless gait tracking system with three Kinect v2 sensors. A geometric principle-based trilateration method is proposed for optimizing the accuracy of the measured gait data. To tackle the data synchronization problem among the Kinect clients and the server, a synchronization mechanism based on NTP (Network Time Protocol) is designed for synchronizing the server and Kinect clients' clocks. Furthermore, a time schedule is designed for timing each Kinect client's data transmission. In the experiment, participants are asked to perform a 60 s walk while the proposed tracking system obtains the participant's gait data. Six joints (including left hip, right hip, left knee, right knee, left ankle and right ankle) of the participants are tracked where the obtained gait data are described as 6000 movements of joint positions (1000 movements for each joint). The results show that the trilateration tracking result by the three Kinect sensors has a much higher accuracy compared with the accuracy measured by a single Kinect sensor. Within a randomly sampled time period (67.726 s in the experiment), 98.37% of the frames generated by the gait tracking system have timing errors less than 1 ms, which is much better than the default NTP service embedded in the Windows 8.1 operating system. The accuracy of the proposed system is quantitatively evaluated and verified by a comparison with a commercial medical system (Delsys Trigno Smart Sensor System).

preprint, 2022, arXiv

A Novel Physics-Regularized Interpretable Machine Learning Model for Grain Growth

Experimental grain growth observations often deviate from grain growth simulations, revealing that the governing rules for grain boundary motion are not fully understood. A novel deep learning model was developed to capture grain growth behavior from training data without making assumptions about the underlying physics. The Physics-Regularized Interpretable Machine Learning Microstructure Evolution (PRIMME) model consists of a multi-layer neural network that predicts the likelihood of a point changing to a neighboring grain. Here, we demonstrate PRIMME's ability to replicate two-dimensional normal grain growth by training it with Monte Carlo Potts simulations. The trained PRIMME model's grain growth predictions in several test cases show good agreement with analytical models, phase-field simulations, Monte Carlo Potts simulations, and results from the literature. Additionally, PRIMME's adaptability to investigate irregular grain growth behavior is shown. Important aspects of PRIMME like interpretability, regularization, extrapolation, and overfitting are also discussed.

preprint, 2022, arXiv

Benchmarking the Robustness of Deep Neural Networks to Common Corruptions in Digital Pathology

When designing a diagnostic model for a clinical application, it is crucial to guarantee the robustness of the model with respect to a wide range of image corruptions. Herein, an easy-to-use benchmark is established to evaluate how deep neural networks perform on corrupted pathology images. Specifically, corrupted images are generated by injecting nine types of common corruptions into validation images. In addition, two classification metrics and one ranking metric are designed to evaluate the prediction and confidence performance under corruption. Evaluating on the two resulting benchmark datasets, we find that (1) a variety of deep neural network models suffer from a significant accuracy decrease (double the error on clean images) and unreliable confidence estimation on corrupted images; and (2) there is a low correlation between the validation and test errors, while replacing the validation set with our benchmark can increase the correlation. Our code is available at https://github.com/superjamessyx/robustness_benchmark.

preprint, 2022, arXiv

ChrSNet: Chromosome Straightening using Self-attention Guided Networks

Karyotyping is an important procedure to assess the possible existence of chromosomal abnormalities. However, because of their non-rigid nature, chromosomes are usually heavily curved in microscopic images, and such deformed shapes hinder chromosome analysis by cytogeneticists. In this paper, we present a self-attention guided framework to erase the curvature of chromosomes. The proposed framework extracts spatial information and local textures to preserve banding patterns in a regression module. With complementary information from the bent chromosome, a refinement module is designed to further improve fine details. In addition, we propose two dedicated geometric constraints to maintain the length and restore the distortion of chromosomes. To train our framework, we create a synthetic dataset where curved chromosomes are generated from real-world straight chromosomes by grid deformation. Quantitative and qualitative experiments are conducted on synthetic and real-world data. Experimental results show that our proposed method can effectively straighten bent chromosomes while keeping banding details and length.

preprint, 2022, arXiv

Consecutive topological phase transitions and colossal magnetoresistance in a magnetic topological semimetal

The combination of magnetic symmetries and electronic band topology provides a promising route for realizing topologically nontrivial quasiparticles, and the manipulation of magnetic structures may enable the switching between topological phases, with the potential for achieving functional physical properties. Here, we report measurements of the electrical resistivity of EuCd$_2$As$_2$ under pressure, which show an intriguing insulating dome at pressures between $p_{\rm c1}\sim1.0$ GPa and $p_{\rm c2}\sim2.0$ GPa, situated between two regimes with metallic transport. The insulating state can be fully suppressed by a small magnetic field, leading to a colossal negative magnetoresistance on the order of $10^5$%, accessible via a modest field of $\sim0.2$ T. First-principles calculations reveal that the dramatic evolution of the resistivity under pressure is due to consecutive transitions of EuCd$_2$As$_2$ from a magnetic topological insulator to a trivial insulator, and then to a Weyl semimetal, with the latter resulting from a pressure-induced change in the magnetic ground state. Similarly, the colossal magnetoresistance results from a field-induced polarization of the magnetic moments, transforming EuCd$_2$As$_2$ from a trivial insulator to a Weyl semimetal. These findings underscore weak magnetic exchange couplings and spin anisotropy as ingredients for discovering tunable magnetic topological materials with desirable functionalities.

preprint, 2022, arXiv

Discovery-and-Selection: Towards Optimal Multiple Instance Learning for Weakly Supervised Object Detection

Weakly supervised object detection (WSOD) is a challenging task that requires simultaneously learning object classifiers and estimating object locations under the supervision of image category labels. A major line of WSOD methods is rooted in multiple instance learning, which regards images as bags of instances and selects positive instances from each bag to learn the detector. However, a grand challenge emerges when the detector tends to converge to discriminative parts of objects rather than whole objects. In this paper, under the hypothesis that optimal solutions are included in local minima, we propose a discovery-and-selection approach fused with multiple instance learning (DS-MIL), which finds rich local minima and selects the optimal solution from multiple local minima. To implement DS-MIL, an attention module is proposed so that more context information can be captured by feature maps and more valuable proposals can be collected during training. With proposal candidates, a selection module is proposed to select informative instances for the object detector. Experimental results on commonly used benchmarks show that our proposed DS-MIL approach can consistently improve the baselines, reporting state-of-the-art performance.

preprint, 2022, arXiv

Distributed Bandits with Heterogeneous Agents

This paper tackles a multi-agent bandit setting where $M$ agents cooperate to solve the same instance of a $K$-armed stochastic bandit problem. The agents are heterogeneous: each agent has limited access to a local subset of arms, and the agents are asynchronous, with different gaps between decision-making rounds. The goal for each agent is to find its optimal local arm, and agents can cooperate by sharing their observations with others. While cooperation between agents improves the performance of learning, it comes with the additional complexity of communication between agents. For this heterogeneous multi-agent setting, we propose two learning algorithms, CO-UCB and CO-AAE. We prove that both algorithms achieve order-optimal regret, which is $O\left(\sum_{i:\tilde{\Delta}_i>0} \log T/\tilde{\Delta}_i\right)$, where $\tilde{\Delta}_i$ is the minimum suboptimality gap between the reward mean of arm $i$ and any local optimal arm. In addition, through a careful selection of the valuable information for cooperation, CO-AAE achieves a low communication complexity of $O(\log T)$. Last, numerical experiments verify the efficiency of both algorithms.
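For readers unfamiliar with the setting, the sketch below shows the standard single-agent UCB1 index restricted to an agent's local subset of arms. It is a textbook rule used here only for orientation, not the cooperative algorithms proposed in the paper; the function names and the way statistics are passed in are assumptions.

```python
import math
from typing import Sequence


def ucb_index(mean_reward: float, pulls: int, t: int) -> float:
    """Standard UCB1 index: empirical mean plus an exploration bonus."""
    if pulls == 0:
        return math.inf  # unexplored arms are tried first
    return mean_reward + math.sqrt(2.0 * math.log(t) / pulls)


def select_arm(local_arms: Sequence[int], means: Sequence[float],
               pulls: Sequence[int], t: int) -> int:
    """Pick the arm with the largest index among the arms this agent can access."""
    return max(local_arms, key=lambda a: ucb_index(means[a], pulls[a], t))
```

In a cooperative variant, `means` and `pulls` could also incorporate observations shared by other agents, which is where the communication cost discussed in the abstract would arise.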

preprint, 2022, arXiv

End-to-end cell recognition by point annotation

Reliable quantitative analysis of immunohistochemical staining images requires accurate and robust cell detection and classification. Recent weakly-supervised methods usually estimate probability density maps for cell recognition. However, in dense cell scenarios, their performance can be limited by pre- and post-processing, as it is impossible to find a universal parameter setting. In this paper, we introduce an end-to-end framework that applies direct regression and classification for preset anchor points. Specifically, we propose a pyramidal feature aggregation strategy to combine low-level features and high-level semantics simultaneously, which provides accurate cell recognition for our purely point-based model. In addition, an optimized cost function is designed to adapt our multi-task learning framework by matching ground truth and predicted points. The experimental results demonstrate the superior accuracy and efficiency of the proposed method, which shows its high potential for assisting pathologist assessments.

preprint, 2022, arXiv

Harmonizing Pathological and Normal Pixels for Pseudo-healthy Synthesis

Synthesizing a subject-specific pathology-free image from a pathological image is valuable for algorithm development and clinical practice. In recent years, several approaches based on the Generative Adversarial Network (GAN) have achieved promising results in pseudo-healthy synthesis. However, the discriminator (i.e., a classifier) in the GAN cannot accurately identify lesions, which hampers the generation of high-quality pseudo-healthy images. To address this problem, we present a new type of discriminator, the segmentor, to accurately locate the lesions and improve the visual quality of pseudo-healthy images. Then, we apply the generated images to medical image enhancement and utilize the enhanced results to cope with the low contrast problem existing in medical image segmentation. Furthermore, a reliable metric is proposed by utilizing two attributes of label noise to measure the health of synthetic images. Comprehensive experiments on the T2 modality of BraTS demonstrate that the proposed method substantially outperforms the state-of-the-art methods. The method achieves better performance than the existing methods with only 30% of the training data. The effectiveness of the proposed method is also demonstrated on the LiTS and the T1 modality of BraTS. The code and the pre-trained model of this study are publicly available at https://github.com/Au3C2/Generator-Versus-Segmentor.

preprint, 2022, arXiv

Hybrid Curriculum Learning for Emotion Recognition in Conversation

Emotion recognition in conversation (ERC) aims to detect the emotion label for each utterance. Motivated by recent studies which have proven that feeding training examples in a meaningful order rather than considering them randomly can boost the performance of models, we propose an ERC-oriented hybrid curriculum learning framework. Our framework consists of two curricula: (1) conversation-level curriculum (CC); and (2) utterance-level curriculum (UC). In CC, we construct a difficulty measurer based on "emotion shift" frequency within a conversation, then the conversations are scheduled in an "easy to hard" schema according to the difficulty score returned by the difficulty measurer. For UC, it is implemented from an emotion-similarity perspective, which progressively strengthens the model's ability in identifying the confusing emotions. With the proposed model-agnostic hybrid curriculum learning strategy, we observe significant performance boosts over a wide range of existing ERC models and we are able to achieve new state-of-the-art results on four public ERC datasets.

preprint, 2022, arXiv

Invariant Content Synergistic Learning for Domain Generalization of Medical Image Segmentation

While achieving remarkable success in medical image segmentation, deep convolutional neural networks (DCNNs) often fail to maintain their robustness when confronting test data with a novel distribution. To address such a drawback, the inductive bias of DCNNs has recently been well recognized. Specifically, DCNNs exhibit an inductive bias towards image style (e.g., superficial texture) rather than invariant content (e.g., object shapes). In this paper, we propose a method, named Invariant Content Synergistic Learning (ICSL), to improve the generalization ability of DCNNs on unseen datasets by controlling the inductive bias. First, ICSL mixes the style of training instances to perturb the training distribution. That is to say, more diverse domains or styles would be made available for training DCNNs. Based on the perturbed distribution, we carefully design a dual-branch invariant content synergistic learning strategy to prevent style-biased predictions and focus more on the invariant content. Extensive experimental results on two typical medical image segmentation tasks show that our approach performs better than state-of-the-art domain generalization methods.

preprint, 2022, arXiv

Low-Latency Online Speaker Diarization with Graph-Based Label Generation

This paper introduces an online speaker diarization system that can handle long-time audio with low latency. We enable Agglomerative Hierarchical Clustering (AHC) to work in an online fashion by introducing a label matching algorithm. This algorithm solves the inconsistency between output labels and hidden labels that are generated each turn. To ensure low latency in the online setting, we introduce a variant of AHC, namely chkpt-AHC, to cluster the speakers. In addition, we propose a speaker embedding graph to exploit a graph-based re-clustering method, further improving the performance. In the experiment, we evaluate our systems on both DIHARD3 and VoxConverse datasets. The experimental results show that our proposed online systems have better performance than our baseline online system and have comparable performance to our offline systems. We find that the framework combining the chkpt-AHC method and the label matching algorithm works well in the online setting. Moreover, the chkpt-AHC method greatly reduces the time cost, while the graph-based re-clustering method helps improve the performance.

preprint, 2022, arXiv

Magnetic interlayer coupling between ferromagnetic SrRuO$_3$ layers through a SrIrO$_3$ spacer

A key element to tailor the properties of magnetic multilayers is the coupling between the individual magnetic layers. In the case of skyrmion-hosting multilayers, coupling of skyrmions across the magnetic layers is highly desirable. Here the magnetic interlayer coupling was studied in epitaxial all-oxide heterostructures of ferromagnetic perovskite SrRuO$_3$ layers separated by spacers of the strong spin-orbit coupling oxide SrIrO$_3$. This combination of oxide layers is being discussed as a potential candidate system to host Néel skyrmions. First order reversal curve (FORC) measurements were performed in order to distinguish between magnetic switching processes of the individual layers and to disentangle the signal of soft magnetic impurities from the samples' signal. Additionally, FORC investigations made it possible to determine whether the coupling between the magnetic layers is ferromagnetic or antiferromagnetic. The observed interlayer coupling strength was weak for all the heterostructures, with SrIrO$_3$ spacers between 2 monolayers and 12 monolayers thick.

preprint, 2022, arXiv

Mechanism, measurement, and quantification of stress in decision process: a model based systematic-review protocol

Every human action begins with decision-making. Stress is a significant source of biases that can influence human decision-making. In order to understand the relationship between stress and decision-making, stress quantification is fundamental. Different methods of measuring and quantifying stress in decision-making have been described in the literature while an up-to-date systematic review of the existing methods is lacking. Moreover, mental stress, mental effort, cognitive workload, and workload are often used interchangeably but should be distinguished to enable in-depth investigations of decision-mechanisms. Our objectives are to clarify stress related concepts and review the measurement, quantification, and application of stress during decision making activities.

preprint, 2022, arXiv

The low-entropy hydration shell at the binding site of spike RBD determines the contagiousness of SARS-CoV-2 variants

The infectivity of SARS-CoV-2 depends on the binding affinity of the receptor-binding domain (RBD) of the spike protein with the angiotensin converting enzyme 2 (ACE2) receptor. The calculated RBD-ACE2 binding energies indicate that the difference in transmission efficiency of SARS-CoV-2 variants cannot be fully explained by electrostatic interactions, hydrogen-bond interactions, van der Waals interactions, internal energy, and nonpolar solvation energies. Here, we demonstrate that low-entropy regions of hydration shells around proteins drive hydrophobic attraction between shape-matched low-entropy regions of the hydration shells, which essentially coordinates protein-protein binding in rotational-configurational space of mutual orientations and determines the binding affinity. An innovative method was used to identify the low-entropy regions of the hydration shells of the RBDs of multiple SARS-CoV-2 variants and the ACE2. We observed integral low-entropy regions of hydration shells covering the binding sites of the RBDs and matching in shape to the low-entropy region of hydration shell at the binding site of the ACE2. The RBD-ACE2 binding is thus found to be guided by hydrophobic collapse between the shape-matched low-entropy regions of the hydration shells. A measure of the low-entropy of the hydration shells can be obtained by counting the number of hydrophilic groups expressing hydrophilicity within the binding sites. The low-entropy level of hydration shells at the binding site of a spike protein is found to be an important indicator of the contagiousness of the coronavirus.

preprint, 2022, arXiv

Towards Lightweight Applications: Asymmetric Enroll-Verify Structure for Speaker Verification

With the development of deep learning, automatic speaker verification has made considerable progress over the past few years. However, to design a lightweight and robust system with limited computational resources is still a challenging problem. Traditionally, a speaker verification system is symmetrical, indicating that the same embedding extraction model is applied for both enrollment and verification in inference. In this paper, we come up with an innovative asymmetric structure, which takes the large-scale ECAPA-TDNN model for enrollment and the small-scale ECAPA-TDNNLite model for verification. As a symmetrical system, our proposed ECAPA-TDNNLite model achieves an EER of 3.07% on the Voxceleb1 original test set with only 11.6M FLOPS. Moreover, the asymmetric structure further reduces the EER to 2.31%, without increasing any computational costs during verification.
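The equal error rate (EER) quoted in this abstract is the operating point at which the false acceptance rate equals the false rejection rate. A minimal sketch of how it can be approximated from trial scores follows; the function name and the simple threshold sweep are assumptions for illustration, and the paper's own scoring and evaluation pipeline is not described in the abstract.

```python
from typing import Sequence


def equal_error_rate(genuine: Sequence[float], impostor: Sequence[float]) -> float:
    """Approximate the EER by sweeping every observed score as a threshold.

    A trial is accepted when its score is >= threshold. The returned value is
    the mean of FAR and FRR at the threshold where the two are closest.
    """
    best_gap = float("inf")
    eer = 1.0
    for threshold in sorted(set(genuine) | set(impostor)):
        # False rejection rate: same-speaker trials scored below the threshold.
        frr = sum(s < threshold for s in genuine) / len(genuine)
        # False acceptance rate: different-speaker trials scored at or above it.
        far = sum(s >= threshold for s in impostor) / len(impostor)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap = gap
            eer = (far + frr) / 2.0
    return eer
```

With only a handful of trials the estimate is coarse; evaluation toolkits typically interpolate between thresholds instead of picking the closest observed one.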

preprint, 2022, arXiv

Weakly Supervised Learning for cell recognition in immunohistochemical cytoplasm staining images

Cell classification and counting in immunohistochemical cytoplasm staining images play a pivotal role in cancer diagnosis. Weakly supervised learning is a potential method to deal with labor-intensive labeling. However, the inconstant cell morphology and subtle differences between classes also bring challenges. To this end, we present a novel cell recognition framework based on multi-task learning, which utilizes two additional auxiliary tasks to guide robust representation learning of the main task. To deal with misclassification, the tissue prior learning branch is introduced to capture the spatial representation of tumor cells without additional tissue annotation. Moreover, dynamic masks and consistency learning are adopted to learn the invariance of cell scale and shape. We have evaluated our framework on immunohistochemical cytoplasm staining images, and the results demonstrate that our method outperforms recent cell recognition approaches. Besides, we have also done some ablation studies to show significant improvements after adding the auxiliary branches.

preprint, 2021, arXiv

A Thermodynamics Model for Mechanochemical Synthesis of Gold Nanoparticles: Implications for Solvent-free Nanoparticle Production

Mechanochemistry is becoming an established method for the sustainable, solid-phase synthesis of scores of nano-materials and molecules, ranging from active pharmaceutical ingredients to materials for cleantech. Yet we are still lacking a good model to rationalize experimental observations and develop a mechanistic understanding of the factors at play during mechanically assisted, solid-phase nanoparticle synthesis. We propose herein a structural-phase-field-crystal (XPFC) model with a ballistic driving force to describe such a process, with the specific example of the growth of gold nanoparticles in a two-component mixture. The reaction path is described in the context of the free energy landscape of the model, and dynamical simulations are performed based on phenomenological model parameters closely corresponding to the experimental conditions, so as to draw conclusions on nanoparticle growth dynamics. It is shown that the ballistic term lowers the activation energy barrier of the reaction, enabling the reaction in a temperature regime compatible with experimental observations. The model also explains the mechanism of precipitated grain size reduction that is consistent with experimental observations. Our simulation results afford novel mechanistic insights into mechanosynthesis with implications for nanoparticle production and beyond.

preprint, 2021, arXiv

Correction to the photometric magnitudes of the Gaia Early Data Release 3

In this letter, we have carried out an independent validation of the Gaia EDR3 photometry using about 10,000 Landolt standard stars from Clem & Landolt (2013). Using a machine learning technique, the UBVRI magnitudes are converted into the Gaia magnitudes and colors and then compared to those in the EDR3, with the effect of metallicity incorporated. Our result confirms the significant improvements in the calibration process of the Gaia EDR3. Yet modest trends up to 10 mmag with G magnitude are found for all the magnitudes and colors for the 10 < G < 19 mag range, particularly for the bright and faint ends. With the aid of synthetic magnitudes computed on the CALSPEC spectra with the Gaia EDR3 passbands, absolute corrections are further obtained, paving the way for optimal usage of the Gaia EDR3 photometry in high accuracy investigations.

preprint2021arXiv

Hydrophobic interaction determines docking affinity of SARS CoV 2 variants with antibodies

Preliminary epidemiologic, phylogenetic and clinical findings suggest that several novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) variants show increased transmissibility and reduce the efficacy of several existing vaccines. Four mutations in the receptor-binding domain (RBD) of the spike protein are reported to contribute to increased transmission. Understanding the physical mechanism responsible for the affinity enhancement between the SARS-CoV-2 variants and ACE2 is the "urgent challenge" for developing blockers, vaccines and therapeutic antibodies against the coronavirus disease 2019 (COVID-19) pandemic. Based on a hydrophobic-interaction-based protein docking mechanism, this study reveals that the mutation N501Y clearly increases the hydrophobic attraction and decreases the hydrophilic repulsion between the RBD and ACE2, which most likely caused the increased transmissibility of the variants. By analyzing the mutation-induced hydrophobic surface changes in the attraction and repulsion at the binding site of the complexes of the SARS-CoV-2 variants and antibodies, we found that the mutations N501Y, E484K, K417N and L452R can all selectively decrease or increase the binding affinity with some antibodies.

preprint2021arXiv

J-PLUS: Stellar Parameters, C, N, Mg, Ca and [α/Fe] Abundances for Two Million Stars from DR1

Context. The Javalambre Photometric Local Universe Survey (J-PLUS) has obtained precise photometry in twelve specially designed filters for large numbers of Galactic stars. Deriving their precise stellar atmospheric parameters and individual elemental abundances is crucial for studies of Galactic structure, and the assembly history and chemical evolution of our Galaxy. Aims. Our goal is to estimate not only stellar parameters (effective temperature, Teff, surface gravity, log g, and metallicity, [Fe/H]), but also [α/Fe] and four elemental abundances ([C/Fe], [N/Fe], [Mg/Fe], and [Ca/Fe]) using data from J-PLUS DR1. Methods. By combining recalibrated photometric data from J-PLUS DR1, Gaia DR2, and spectroscopic labels from LAMOST, we design and train a set of cost-sensitive neural networks, the CSNet, to learn the non-linear mapping from stellar colors to their labels. Results. We have achieved precisions of δTeff ∼ 55 K, δlog g ∼ 0.15 dex, and δ[Fe/H] ∼ 0.07 dex, respectively, over a wide range of temperature, surface gravity, and metallicity. The uncertainties of the abundance estimates for [α/Fe] and the four individual elements are in the range 0.04-0.08 dex. We compare our parameter and abundance estimates with those from other spectroscopic catalogs such as APOGEE and GALAH, and find an overall good agreement. Conclusions. Our results demonstrate the potential of well-designed, high-quality photometric data for determinations of stellar parameters as well as individual elemental abundances. Applying the method to J-PLUS DR1, we have obtained the aforementioned parameters for about two million stars, providing an outstanding data set for chemo-dynamic analyses of the Milky Way. The catalog of the estimated parameters is publicly accessible.

preprint2021arXiv

Stellar Loci V: Photometric Metallicities of 27 Million FGK Stars based on Gaia Early Data Release 3

We combine LAMOST DR7 spectroscopic data and Gaia EDR3 photometric data to construct high-quality giant (0.7 $< (BP-RP) <$ 1.4) and dwarf (0.5 $< (BP-RP) <$ 1.5) samples in the high Galactic latitude region, with precise corrections for magnitude-dependent systematic errors in the Gaia photometry and careful reddening corrections using empirically determined color- and reddening-dependent coefficients. We use the two samples to build metallicity-dependent stellar loci of Gaia colors for giants and dwarfs, respectively. For a given $(BP-RP)$ color, a one dex change in [Fe/H] results in about a 5 mmag change in $(BP-G)$ color for solar-type stars. These relations are used to determine metallicity estimates from EDR3 colors. Despite the weak sensitivity, the exquisite data quality of these colors enables a typical precision of about $δ$[Fe/H] = 0.2 dex. Our method is valid for FGK stars with $G \leq 16$, [Fe/H] $\geq -2.5$, and $E(B-V) \leq 0.5$. Stars with fainter $G$ magnitudes, lower metallicities, or larger reddening suffer from higher metallicity uncertainties. With the enormous data volume of Gaia, we have obtained metallicity estimates for about 27 million stars with 10 $< G \leq 16$ across almost the entire sky, including over 6 million giants and 20 million dwarfs, which can be used for a number of studies. These include investigations of Galactic formation and evolution, the identification of candidate stars for subsequent high-resolution spectroscopic follow-up, the identification of wide binaries, and the determination of stellar metallicities for asteroseismology and exoplanet research.

preprint2021arXiv

The DKU-Duke-Lenovo System Description for the Third DIHARD Speech Diarization Challenge

In this paper, we present the submitted system for the third DIHARD Speech Diarization Challenge from the DKU-Duke-Lenovo team. Our system consists of several modules: voice activity detection (VAD), segmentation, speaker embedding extraction, attentive similarity scoring, and agglomerative hierarchical clustering. In addition, the target speaker VAD (TSVAD) is used for the phone call data to further improve the performance. Our final submitted system achieves a DER of 15.43% for the core evaluation set and 13.39% for the full evaluation set on task 1, and a DER of 21.63% for the core evaluation set and 18.90% for the full evaluation set on task 2.

preprint2020arXiv

A hydrophobic-interaction-based mechanism trigger docking between the SARS CoV 2 spike and angiotensin-converting enzyme 2

A recent experimental study found that the binding affinity between the cellular receptor human angiotensin converting enzyme 2 (ACE2) and the receptor-binding domain (RBD) in the spike (S) protein of the novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is more than 10-fold higher than that of the original severe acute respiratory syndrome coronavirus (SARS-CoV). However, the main-chain structures of the SARS-CoV-2 RBD are almost the same as those of the SARS-CoV RBD. Understanding the physical mechanism responsible for the outstanding affinity between the SARS-CoV-2 S and ACE2 is the "urgent challenge" for developing blockers, vaccines and therapeutic antibodies against the coronavirus disease 2019 (COVID-19) pandemic. Considering the mechanisms of hydrophobic interaction, hydration shell, surface tension, and the shielding effect of water molecules, this study reveals a hydrophobic-interaction-based mechanism by means of which SARS-CoV-2 S and ACE2 bind together in an aqueous environment. The hydrophobic interaction between the SARS-CoV-2 S and ACE2 protein is found to be significantly greater than that between SARS-CoV S and ACE2. At the docking site, the hydrophobic portions of the hydrophilic side chains of SARS-CoV-2 S are found to be involved in the hydrophobic interaction between SARS-CoV-2 S and ACE2. We propose a method to design live attenuated viruses by mutating several key amino acid residues of the spike protein to decrease the hydrophobic surface areas at the docking site. Mutation of a small number of residues can greatly reduce the hydrophobic binding of the coronavirus to the receptor, which may significantly reduce the infectivity and transmissibility of the virus.

preprint2020arXiv

Acoustic Word Embedding System for Code-Switching Query-by-example Spoken Term Detection

In this paper, we propose a deep convolutional neural network-based acoustic word embedding system for code-switching query-by-example spoken term detection. Unlike previous configurations, we combine audio data in two languages for training instead of using only a single language. We transform the acoustic features of keyword templates and searching content into fixed-dimensional vectors and calculate the distances between keyword segments and searching content segments obtained in a sliding manner. An auxiliary variability-invariant loss is also applied to training data of the same word spoken by different speakers. This strategy is used to prevent the extractor from encoding undesired speaker- or accent-related information into the acoustic word embeddings. Experimental results show that our proposed system produces promising searching results in the code-switching test scenario. With the increased number of templates and the employment of variability-invariant loss, the searching performance is further enhanced.

preprint2020arXiv

DIHARD II is Still Hard: Experimental Results and Discussions from the DKU-LENOVO Team

In this paper, we present the submitted system for the second DIHARD Speech Diarization Challenge from the DKU-LENOVO team. Our diarization system includes multiple modules, namely voice activity detection (VAD), segmentation, speaker embedding extraction, similarity scoring, clustering, resegmentation and overlap detection. For each module, we explore different techniques to enhance performance. Our final submission employs the ResNet-LSTM based VAD, the Deep ResNet based speaker embedding, the LSTM based similarity scoring and spectral clustering. Variational Bayes (VB) diarization is applied in the resegmentation stage, and overlap detection also brings a slight improvement. Our proposed system achieves 18.84% DER in Track 1 and 27.90% DER in Track 2. Although our systems reduce the DERs by 27.5% and 31.7% relative to the official baselines, we believe that the diarization task is still very difficult.

preprint2020arXiv

IROS 2019 Lifelong Robotic Vision Challenge -- Lifelong Object Recognition Report

This report summarizes IROS 2019-Lifelong Robotic Vision Competition (Lifelong Object Recognition Challenge) with methods and results from the top 8 finalists (out of over 150 teams). The competition dataset (L)ifel(O)ng (R)obotic V(IS)ion (OpenLORIS) - Object Recognition (OpenLORIS-object) is designed for driving lifelong/continual learning research and application in the robotic vision domain, with everyday objects in home, office, campus, and mall scenarios. The dataset explicitly quantifies the variants of illumination, object occlusion, object size, camera-object distance/angles, and clutter information. Rules are designed to quantify the learning capability of the robotic vision system when faced with the objects appearing in the dynamic environments in the contest. Individual reports, dataset information, rules, and released source code can be found at the project homepage: "https://lifelong-robotic-vision.github.io/competition/".

preprint2020arXiv

Mask Detection and Breath Monitoring from Speech: on Data Augmentation, Feature Representation and Modeling

This paper introduces our approaches for the Mask and Breathing Sub-Challenge in the Interspeech COMPARE Challenge 2020. For the mask detection task, we train deep convolutional neural networks with filter-bank energies, gender-aware features, and speaker-aware features. Support vector machines follow as the back-end classifiers for binary prediction on the extracted deep embeddings. Several data augmentation schemes are used to increase the quantity of training data and improve our models' robustness, including speed perturbation, SpecAugment, and random erasing. For the speech breath monitoring task, we investigate different bottleneck features based on the Bi-LSTM structure. Experimental results show that our proposed methods outperform the baselines and achieve 0.746 PCC and 78.8% UAR on the Breathing and Mask evaluation sets, respectively.

preprint2020arXiv

Multi-modal Sentiment Analysis using Super Characters Method on Low-power CNN Accelerator Device

In recent years, NLP research has witnessed record-breaking accuracy improvements from DNN models. However, power consumption is one of the practical concerns for deploying NLP systems. Most of the current state-of-the-art algorithms are implemented on GPUs, which are not power-efficient and have a very high deployment cost. On the other hand, the CNN Domain Specific Accelerator (CNN-DSA) has been in mass production, providing low-power and low-cost computation. In this paper, we implement the Super Characters method on the CNN-DSA. In addition, we modify the Super Characters method to utilize multi-modal data, i.e. text plus tabular data in the CL-Aff shared task.

preprint2020arXiv

Nearly Linear Row Sampling Algorithm for Quantile Regression

We give a row sampling algorithm for the quantile loss function with sample complexity nearly linear in the dimensionality of the data, improving upon the previous best algorithm whose sampling complexity has at least cubic dependence on the dimensionality. Based upon our row sampling algorithm, we give the fastest known algorithm for quantile regression and a graph sparsification algorithm for balanced directed graphs. Our main technical contribution is to show that Lewis weights sampling, which has been used in row sampling algorithms for $\ell_p$ norms, can also be applied in row sampling algorithms for a variety of loss functions. We complement our theoretical results by experiments to demonstrate the practicality of our approach.
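For readers unfamiliar with the loss named in the abstract above, the following is a minimal sketch of the quantile (pinball) loss, assuming its standard textbook definition. The function name `quantile_loss` and its arguments are illustrative assumptions, not taken from the paper, and the sketch does not implement the paper's row sampling or Lewis weights machinery.

```python
def quantile_loss(residuals, tau):
    """Sum of pinball losses for residuals r = y - prediction.

    Assumes the standard definition: each residual contributes
    tau * r when r >= 0 and (tau - 1) * r when r < 0.
    """
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    # max(tau * r, (tau - 1) * r) selects the correct branch for either sign of r.
    return sum(max(tau * r, (tau - 1.0) * r) for r in residuals)


# With tau = 0.5 the loss reduces to half the sum of absolute residuals.
median_case = quantile_loss([2.0, -2.0], 0.5)
# With tau = 0.9, under-prediction (r > 0) costs more than over-prediction.
under = quantile_loss([1.0], 0.9)
over = quantile_loss([-1.0], 0.9)
```

The asymmetry between `under` and `over` is what makes the minimizer a quantile of the data rather than its mean.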

preprint2020arXiv

Origin of the hump anomalies in the Hall resistance loops of ultrathin SrRuO$_3$/SrIrO$_3$ multilayers

The proposal that very small Néel skyrmions can form in SrRuO$_3$/SrIrO$_3$ epitaxial bilayers and that the electric field-effect can be used to manipulate these skyrmions in gated devices strongly stimulated the recent research of SrRuO$_3$ heterostructures. A strong interfacial Dzyaloshinskii-Moriya interaction, combined with the breaking of inversion symmetry, was considered as the driving force for the formation of skyrmions in SrRuO$_3$/SrIrO$_3$ bilayers. Here, we investigated nominally symmetric heterostructures in which an ultrathin ferromagnetic SrRuO$_3$ layer is sandwiched between large spin-orbit coupling SrIrO$_3$ layers, for which the conditions are not favorable for the emergence of a net interfacial Dzyaloshinskii-Moriya interaction. Previously the formation of skyrmions in the asymmetric SrRuO$_3$/SrIrO$_3$ bilayers was inferred from anomalous Hall resistance loops showing humplike features that resembled topological Hall effect contributions. Symmetric SrIrO$_3$/SrRuO$_3$/SrIrO$_3$ trilayers do not show hump anomalies in the Hall loops. However, the anomalous Hall resistance loops of symmetric multilayers, in which the trilayer is stacked several times, do exhibit the humplike structures, similar to the asymmetric SrRuO$_3$/SrIrO$_3$ bilayers. The origin of the Hall effect loop anomalies likely resides in unavoidable differences in the electronic and magnetic properties of the individual SrRuO$_3$ layers rather than in the formation of skyrmions.

preprint2020arXiv

Spin Excitations and Spin Wave Gap in the Ferromagnetic Weyl Semimetal Co$_3$Sn$_2$S$_2$

We report a comprehensive neutron scattering study on the spin excitations in the magnetic Weyl semimetal Co$_3$Sn$_2$S$_2$ with quasi-two-dimensional structure. Both in-plane and out-of-plane dispersions of the spin waves are revealed in the ferromagnetic state; similarly dispersive but damped spin excitations persist into the paramagnetic state. The effective exchange interactions have been estimated by a semi-classical Heisenberg model to consistently reproduce the experimental $T_C$ and spin stiffness. However, a full spin wave gap below $E_g=2.3$ meV is observed at $T=4$ K, much larger than the estimated magnetic anisotropy energy ($\sim0.6$ meV), while its temperature dependence indicates a significant contribution from the Weyl fermions. These results suggest that Co$_3$Sn$_2$S$_2$ is a three-dimensional correlated system with large spin stiffness, and the low-energy spin dynamics could interplay with the topological electron states.

preprint2020arXiv

The role of hydrophobic interactions in folding of $β$-sheets

Exploring the protein-folding problem has been a long-standing challenge in molecular biology. Protein folding is highly dependent on the folding of secondary structures as the way to pave a native folding pathway. Here, we demonstrate that a large hydrophobic surface area covering most side-chains on one side or the other of adjacent $β$-strands of a $β$-sheet is prevalent in almost all experimentally determined $β$-sheets, indicating that folding of $β$-sheets is most likely triggered by multistage hydrophobic interactions among neighboring side-chains of unfolded polypeptides, enabling $β$-sheets to fold reproducibly following explicit physical folding codes in aqueous environments. $β$-turns often contain five types of residues characterized by relatively small exposed hydrophobic proportions of their side-chains, which is explained by the ability of these residues to block the hydrophobic effect among neighboring side-chains in sequence. The temperature dependence of the folding of $β$-sheets is thus attributed to the temperature dependence of the strength of the hydrophobicity. The hydrophobic-effect-based mechanism responsible for $β$-sheet folding is verified by bioinformatics analyses of thousands of results available from experiments. The folding codes in the amino acid sequence that dictate formation of a $β$-hairpin can be deciphered by evaluating the hydrophobic interaction among side-chains of an unfolded polypeptide from a $β$-strand-like thermodynamic metastable state.

preprint2018arXiv

Spatially-correlated Site Occupancy in the Nonstoichiometric Meta-stable ε-Al60Sm11 Phase during Devitrification of Al-10.2 at.% Sm Glasses

A metastable ε-Al60Sm11 phase appears during the initial devitrification of as-quenched Al-10.2 at.% Sm glasses. The ε phase is nonstoichiometric in nature since Al occupation is observed on the 16f Sm lattice sites. Scanning transmission electron microscopy (STEM) images reveal profound spatial correlation of Sm content on these sites, which cannot be explained by the "average crystal" description from Rietveld analysis of diffraction data. Thermodynamically favorable configurations, established by Monte Carlo (MC) simulations based on a cluster-expansion model, also give qualitatively different correlation functions from experimental observations. On the other hand, molecular dynamics simulations of the growth of ε-Al60Sm11 in undercooled liquid show that when the diffusion range of Sm is limited to ~ 4 Å, the correlation function of the as-grown crystal structure agrees well with that of the STEM images. Our results show that kinetic effects, especially the limited diffusivity of Sm atoms, play the fundamental role in determining the nonstoichiometric site occupancies of the ε-Al60Sm11 phase during the crystallization process.

preprint2017arXiv

A self-contained algorithm for determination of solid-liquid equilibria in an alloy system

We describe a self-contained procedure to evaluate the free energy of liquid and solid phases of an alloy system. The free energy of a single-element solid phase is calculated with thermodynamic integration using the Einstein crystal as the reference system. Then, the free energy difference between the solid and liquid phases is calculated by Gibbs-Duhem integration. The central part of our method is the construction of a reversible alchemical path connecting a pure liquid and a liquid alloy to calculate the mixing enthalpy and entropy. We have applied the method to calculate the free energy of solid and liquid phases in the Al-Sm system. The driving force for fcc-Al nucleation in Al-Sm liquid and the melting curves for fcc-Al and Al3Sm are also calculated.

preprint2016arXiv

Decreased aneurysmal subarachnoid hemorrhage incidence rate in elderly population than in middle aged population: a retrospective analysis of 8,144 cases in Mainland China

Purpose: Rupture of an intracranial aneurysm is the most common cause of subarachnoid haemorrhage (SAH), which is a life-threatening acute cerebrovascular event that typically affects working-age people. This study aims to compare the aneurysmal SAH incidence rate in the elderly population with that in the middle-aged population in China. Materials and methods: Aneurysmal SAH cases were collected retrospectively from the archives of 21 hospitals in Mainland China. All the cases collected were from September 2016 and backward consecutively for a period of up to 8 years. SAH was initially diagnosed by brain computed tomography, followed by CT angiography (CTA) or digital subtraction angiography (DSA), and SAH was confirmed to be due to cerebral aneurysm. For cases in which multiple bleeds occurred, the age at the first SAH was used in this study. The total incidence from all hospitals at each age was summed for females and males, then adjusted by the total population number at each age for females and males. The total population data were from the 2010 population census of the People's Republic of China. Results: In total there were 8,144 cases, with 4,861 females and 3,283 males. Our analysis shows that for both females and males the relative aneurysmal SAH rate started to decrease after around 65 years old. For males, the relative aneurysmal SAH rate might have started to decrease after around 55 years old. Conclusion: In contrast to previous reports, our data demonstrate a lower aneurysmal subarachnoid hemorrhage incidence rate in the elderly population than in the middle-aged population. Our data therefore support the hypothesis that aneurysms do not grow progressively once they form but probably either rupture or stabilize, and that very elderly patients are at a reduced risk of rupture compared with younger patients with same-sized aneurysms.

preprint2016arXiv

Forward Backward Similarity Search in Knowledge Networks

Similarity search is a fundamental problem in social and knowledge networks like GitHub, DBLP, Wikipedia, etc. Existing network similarity measures are limited because they only consider similarity from the perspective of the query node. However, due to the complicated topology of real-world networks, ignoring the preferences of target nodes often results in odd or unintuitive performance. In this work, we propose a dual perspective similarity metric called Forward Backward Similarity (FBS) that efficiently computes topological similarity from the perspective of both the query node and the perspective of candidate nodes. The effectiveness of our method is evaluated by traditional quantitative ranking metrics and large-scale human judgement on four large real world networks. The proposed method matches human preference and outperforms other similarity search algorithms on community overlap and link prediction. Finally, we demonstrate top-5 rankings for five famous researchers on an academic collaboration network to illustrate how our approach captures semantics more intuitively than other approaches.

preprint2016arXiv

Microscopic Muscle Image Enhancement

We propose a robust image enhancement algorithm dedicated to muscle fiber specimen images captured by optical microscopes. Blur or out-of-focus problems are prevalent in muscle images during the image acquisition stage. Traditional image deconvolution methods do not work since they assume the blur kernels are known and also produce ring artifacts. We provide a compact framework which involves a novel spatially non-uniform blind deblurring approach specialized to muscle images, which automatically detects and alleviates degraded regions. Ring artifact problems are addressed, and a kernel propagation strategy is proposed to speed up the algorithm and deal with the high non-uniformity of the blur kernels on muscle images. Experiments show that the proposed framework performs well on muscle images taken with modern advanced optical microscopes. Our framework is free of laborious parameter settings and is computationally efficient.

preprint2016arXiv

Online Offering Strategies for Storage-Assisted Renewable Power Producer in Hour-Ahead Market

A promising approach to hedge against the inherent uncertainty of renewable generation is to equip the renewable plants with energy storage systems. This paper focuses on designing profit-maximizing offering strategies, i.e., the strategies that determine the offering price and volume, for a storage-assisted renewable power producer that participates in the hour-ahead electricity market. Designing the strategies is challenging since (i) the underlying problem is coupled across time due to the evolution of the storage level, and (ii) inputs to the problem including the renewable output and market clearing price are unknown when submitting offers. Following the competitive online algorithm design approach, we first study a basic setting where the renewable output and the clearing price are known for the next hour. We propose sOffer, a simple online offering strategy that achieves the best possible competitive ratio of O(log θ), where $θ$ is the ratio between the maximum and the minimum clearing prices. Then, we consider the case where the clearing price is unknown. By exploiting the idea of submitting multiple offers to combat price uncertainty, we propose mOffer, and demonstrate that the competitive ratio of mOffer converges to that of sOffer as the number of offers grows. Finally, we extend our approach to the scenario where the renewable output has forecasting error. We propose gOffer as the generalized offering strategy and characterize its competitive ratio as a function of the forecasting error. Our trace-driven experiments demonstrate that our algorithms achieve performance close to the offline optimal and outperform a baseline alternative significantly.

preprint2016arXiv

SemiContour: A Semi-supervised Learning Approach for Contour Detection

Supervised contour detection methods usually require many labeled training images to obtain satisfactory performance. However, a large set of annotated data might be unavailable or extremely labor intensive. In this paper, we investigate the usage of semi-supervised learning (SSL) to obtain competitive detection accuracy with very limited training data (three labeled images). Specifically, we propose a semi-supervised structured ensemble learning approach for contour detection built on structured random forests (SRF). To allow SRF to be applicable to unlabeled data, we present an effective sparse representation approach to capture inherent structure in image patches by finding a compact and discriminative low-dimensional subspace representation in an unsupervised manner, enabling the incorporation of abundant unlabeled patches with their estimated structured labels to help SRF perform better node splitting. We re-examine the role of sparsity and propose a novel and fast sparse coding algorithm to boost the overall learning efficiency. To the best of our knowledge, this is the first attempt to apply SSL for contour detection. Extensive experiments on the BSDS500 segmentation dataset and the NYU Depth dataset demonstrate the superiority of the proposed method.

preprint2015arXiv

On a Poissonian Change-Point Model with Variable Jump Size

A model of Poissonian observation having a jump (change-point) in the intensity function is considered. Two cases are studied. The first one corresponds to the situation when the jump size converges to a non-zero limit, while in the second one the limit is zero. The limiting likelihood ratios in these two cases are quite different. In the first case, like in the case of a fixed jump size, the normalized likelihood ratio converges to a log Poisson process. In the second case, the normalized likelihood ratio converges to a log Wiener process, and so the statistical problems of parameter estimation and hypothesis testing are asymptotically equivalent in this case to the well-known problems of change-point estimation and testing for the model of a signal in white Gaussian noise. The properties of the maximum likelihood and Bayesian estimators, as well as those of the general likelihood ratio, Wald's and Bayesian tests, are deduced from the convergence of normalized likelihood ratios. The convergence of the moments of the estimators is also established. The obtained theoretical results are illustrated by numerical simulations.

preprint2015arXiv

On Hypothesis Testing for Poisson Processes. Regular Case

We consider the problem of hypothesis testing in the situation when the first hypothesis is simple and the second one is local one-sided composite. We describe the choice of the thresholds and the power functions of the Score Function test, of the General Likelihood Ratio test, of the Wald test and of two Bayes tests in the situation when the intensity function of the observed inhomogeneous Poisson process is smooth with respect to the parameter. It is shown that almost all these tests are asymptotically uniformly most powerful. The results of numerical simulations are presented.

preprint2015arXiv

On Hypothesis Testing for Poisson Processes. Singular Cases

We consider the problem of hypothesis testing in the situation where the first hypothesis is simple and the second one is local one-sided composite. We describe the choice of the thresholds and the power functions of different tests when the intensity function of the observed inhomogeneous Poisson process has two different types of singularity: cusp and discontinuity. The asymptotic results are illustrated by numerical simulations.

preprint2014arXiv

Dimensional evolution between one- and two-dimensional topological phases

Dimensional evolution between one- ($1D$) and two-dimensional ($2D$) topological phases is investigated systematically. The crossover from a $2D$ topological insulator to its $1D$ limit shows oscillating behavior between a $1D$ ordinary insulator and a $1D$ topological insulator. By constructing a $2D$ topological system from a $1D$ topological insulator, it is shown that weak topological phases may exist in $2D$ time-reversal invariant band insulators, one of which can be realized in anisotropic systems. The topological invariant of the phase is $Z_{2}=0$. However, edge states may appear along specific boundaries. The phase can be interpreted as an arrangement of $1D$ topological phases and has the same symmetry-protected nature as the corresponding $1D$ topological phase. Robust edge states can exist under specific conditions. These results provide further understanding of $2D$ time-reversal invariant insulators, and can be realized experimentally.

preprint2014arXiv

Global Spatio-temporal Patterns of Influenza in the Post-pandemic Era

We study the global spatio-temporal patterns of influenza dynamics. This is achieved by analysing and modelling weekly laboratory confirmed cases of influenza A and B from 138 countries between January 2006 and May 2014. The data were obtained from FluNet, the surveillance network compiled by the World Health Organization. We report a pattern of skip-and-resurgence behavior between the years 2011 and 2013 for influenza H1N1/09, the strain responsible for the 2009 pandemic, in Europe and Eastern Asia. In particular, the expected H1N1/09 epidemic outbreak in 2011 failed to occur (or "skipped") in many countries across the globe, although an outbreak occurred in the following year. We also report a pattern of a well-synchronized 2010 winter wave of H1N1/09 in the Northern Hemisphere countries, and a pattern of replacement of strain H1N1/77 by H1N1/09 between the 2009 and 2012 influenza seasons. Using both a statistical and a mechanistic mathematical model, and through fitting the data of 108 countries (108 countries in a statistical model and 10 large populations with a mechanistic model), we discuss the mechanisms that are likely to generate these events, taking into account the role of multi-strain dynamics. A basic understanding of these patterns has important public health implications and scientific significance.

preprint 2011 arXiv

The Optical Counterpart of NGC 1313 X-1

We identify the optical counterpart of the ultraluminous X-ray source (ULX) NGC 1313 X-1 and discuss constraints on its physical nature from multiband optical spectra. There is a single object on Hubble Space Telescope (HST) images within the aspect-corrected Chandra X-ray error circle; a fainter, possibly extended, feature lies near the edge of the error circle. The brighter object showed prominent variation in the F555W band, but was constant in the F814W band. The spectrum was consistent with a single power-law on 2003 Nov 17, but deviated from this on 2004 Jul 17, suggestive of more than one emission component. Based on the location, magnitudes, spectral shape, and variability of the bright object, it is likely the ULX counterpart. The red wing of the spectrum around F814W may be due to emission from the companion star, and the blue wing is likely from disk emission. The stellar population around X-1 has an age older than 30 Myr, without very blue stars or young clusters. This places a constraint on the companion mass of the ULX as no more than 10 solar masses.

preprint 2009 arXiv

200km Decoy-state quantum key distribution with photon polarization

We demonstrate decoy-state quantum key distribution over 200 km of optical fiber with photon polarization, using a superconducting single-photon detector with a repetition rate of 320 MHz and a dark count rate lower than 1 Hz. Because polarization coding is used, the synchronization pulses can run at a low frequency. The final key rate is 14.1 Hz. The experiment lasted 3089 seconds and produced 43555 final bits in total.

preprint 2009 arXiv

Decoy-state quantum key distribution with both source errors and statistical fluctuations

We show how to faithfully calculate the fraction of single-photon counts in 3-intensity decoy-state quantum cryptography in the presence of both statistical fluctuations and source errors. Our results rely only on the bound values of a few parameters of the states of pulses.

53 published items

A Large and Precise All-Sky Photometric Standard Star Dataset Across More Than 200 Passbands

A Production-Ready RL Framework for Personalized Utility Tuning with Pareto Sweeping in Pinterest Recommender Systems

Multi-modal Learning with Missing Modality in Predicting Axillary Lymph Node Metastasis

On the Model-Misspecification in Reinforcement Learning

Semantic Data Sourcing for 6G Edge Intelligence

3-D Markerless Tracking of Human Gait by Geometric Trilateration of Multiple Kinects

A Novel Physics-Regularized Interpretable Machine Learning Model for Grain Growth

Benchmarking the Robustness of Deep Neural Networks to Common Corruptions in Digital Pathology

ChrSNet: Chromosome Straightening using Self-attention Guided Networks

Consecutive topological phase transitions and colossal magnetoresistance in a magnetic topological semimetal

Discovery-and-Selection: Towards Optimal Multiple Instance Learning for Weakly Supervised Object Detection

Distributed Bandits with Heterogeneous Agents

End-to-end cell recognition by point annotation

Harmonizing Pathological and Normal Pixels for Pseudo-healthy Synthesis

Hybrid Curriculum Learning for Emotion Recognition in Conversation

Invariant Content Synergistic Learning for Domain Generalization of Medical Image Segmentation

Low-Latency Online Speaker Diarization with Graph-Based Label Generation

Magnetic interlayer coupling between ferromagnetic SrRuO$_3$ layers through a SrIrO$_3$ spacer

Mechanism, measurement, and quantification of stress in decision process: a model based systematic-review protocol

The low-entropy hydration shell at the binding site of spike RBD determines the contagiousness of SARS-CoV-2 variants

Towards Lightweight Applications: Asymmetric Enroll-Verify Structure for Speaker Verification

Weakly Supervised Learning for cell recognition in immunohistochemical cytoplasm staining images

A Thermodynamics Model for Mechanochemical Synthesis of Gold Nanoparticles: Implications for Solvent-free Nanoparticle Production

Correction to the photometric magnitudes of the Gaia Early Data Release 3

Hydrophobic interaction determines docking affinity of SARS CoV 2 variants with antibodies

J-PLUS: Stellar Parameters, C, N, Mg, Ca and [α/Fe] Abundances for Two Million Stars from DR1

Stellar Loci V: Photometric Metallicities of 27 Million FGK Stars based on Gaia Early Data Release 3

The DKU-Duke-Lenovo System Description for the Third DIHARD Speech Diarization Challenge

A hydrophobic-interaction-based mechanism trigger docking between the SARS CoV 2 spike and angiotensin-converting enzyme 2

Acoustic Word Embedding System for Code-Switching Query-by-example Spoken Term Detection

DIHARD II is Still Hard: Experimental Results and Discussions from the DKU-LENOVO Team

IROS 2019 Lifelong Robotic Vision Challenge -- Lifelong Object Recognition Report

Mask Detection and Breath Monitoring from Speech: on Data Augmentation, Feature Representation and Modeling

Multi-modal Sentiment Analysis using Super Characters Method on Low-power CNN Accelerator Device

Nearly Linear Row Sampling Algorithm for Quantile Regression

Origin of the hump anomalies in the Hall resistance loops of ultrathin SrRuO$_3$/SrIrO$_3$ multilayers

Spin Excitations and Spin Wave Gap in the Ferromagnetic Weyl Semimetal Co$_3$Sn$_2$S$_2$

The role of hydrophobic interactions in folding of $β$-sheets

Spatially-correlated Site Occupancy in the Nonstoichiometric Meta-stable ε-Al60Sm11 Phase during Devitrification of Al-10.2 at.% Sm Glasses

A self-contained algorithm for determination of solid-liquid equilibria in an alloy system

Decreased aneurysmal subarachnoid hemorrhage incidence rate in elderly population than in middle aged population: a retrospective analysis of 8,144 cases in Mainland China

Forward Backward Similarity Search in Knowledge Networks

Microscopic Muscle Image Enhancement

Online Offering Strategies for Storage-Assisted Renewable Power Producer in Hour-Ahead Market

SemiContour: A Semi-supervised Learning Approach for Contour Detection

On a Poissonian Change-Point Model with Variable Jump Size

On Hypothesis Testing for Poisson Processes. Regular Case

On Hypothesis Testing for Poisson Processes. Singular Cases

Dimensional evolution between one- and two-dimensional topological phases

Global Spatio-temporal Patterns of Influenza in the Post-pandemic Era

The Optical Counterpart of NGC 1313 X-1

200km Decoy-state quantum key distribution with photon polarization

Decoy-state quantum key distribution with both source errors and statistical fluctuations