Source author record

Yu Li

Yu Li appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Catalog footprint

What is connected

74 works

28 topics

4 close collaborators

preprint, 2026, arXiv

Simply Stabilizing the Loop via Fully Looped Transformer

Scaling model performance typically requires increasing model size. Looped Transformer offers a compelling alternative by iteratively reusing the same Transformer blocks, trading additional computation for improved performance without increasing parameter count or context length. Because the number of loop iterations can be adjusted at inference, it also provides a natural mechanism for balancing performance and test-time compute. However, Looped Transformer still suffers from training instability when the number of loop iterations increases. Our analysis reveals that this instability stems from two sources: gradient oscillation and residual explosion. To address these two problems, we propose the Fully Looped Transformer, which introduces two parameter-free modifications: (1) Fully Looped Architecture, which distributes inter-loop signals across all layers to mitigate residual explosion; (2) Attention Injection, which reuses the existing attention block to suppress gradient oscillation. These modifications stabilize training dynamics, enabling the Fully Looped Transformer to be trained stably up to 12 loop iterations, whereas other baseline looped models collapse in this regime. In milder settings where Looped Transformer does not collapse, Fully Looped Transformer still improves average downstream-task performance by up to 13.2%. Overall, our experiments demonstrate that Fully Looped Transformer improves training stability, enhances downstream performance, and provides preliminary adaptability under different test-time compute budgets by varying loop iterations at inference.

preprint, 2023, arXiv

Drug Synergistic Combinations Predictions via Large-Scale Pre-Training and Graph Structure Learning

Drug combination therapy is a well-established strategy for disease treatment with better effectiveness and less safety degradation. However, identifying novel drug combinations through wet-lab experiments is resource intensive due to the vast combinatorial search space. Recently, computational approaches, specifically deep learning models, have emerged as an efficient way to discover synergistic combinations. While previous methods reported fair performance, their models usually do not take advantage of multi-modal data and are unable to handle new drugs or cell lines. In this study, we collected data from multiple datasets covering various drug-related aspects. We then take advantage of large-scale pre-training models to generate informative representations and features for drugs, proteins, and diseases. On top of these, a message-passing graph is built to propagate information, with the added flexibility of graph structure learning. This is first introduced in biological networks and enables us to generate pseudo-relations in the graph. Our framework achieves state-of-the-art results in comparison with other deep learning-based methods on synergistic prediction benchmark datasets. It can also run inference on new drug combination data: in a test on an independent set released by AstraZeneca, a 10% improvement over previous methods is observed. In addition, it is robust to unseen drugs and surpasses the second-best model by almost 15% in AUROC. We believe our framework contributes to both the future wet-lab discovery of novel drugs and the building of promising guidance for precise combination medicine.

preprint, 2023, arXiv

Heat kernel on Ricci shrinkers (II)

This paper is the sequel to our study of heat kernels on Ricci shrinkers in [LW20]. In this paper, we improve many estimates in [LW20] and extend the recent progress of Bamler [Bam20a]. In particular, we drop the compactness and curvature boundedness assumptions and show that the theory of $\mathbb{F}$-convergence holds naturally on any Ricci flows induced by Ricci shrinkers.

preprint, 2022, arXiv

"Think Before You Speak": Improving Multi-Action Dialog Policy by Planning Single-Action Dialogs

Multi-action dialog policy (MADP), which generates multiple atomic dialog actions per turn, has been widely applied in task-oriented dialog systems to provide expressive and efficient system responses. Existing MADP models usually imitate action combinations from the labeled multi-action dialog samples. Due to data limitations, they generalize poorly toward unseen dialog flows. While interactive learning and reinforcement learning algorithms can be applied to incorporate external data sources of real users and user simulators, they take significant manual effort to build and suffer from instability. To address these issues, we propose Planning Enhanced Dialog Policy (PEDP), a novel multi-task learning framework that learns single-action dialog dynamics to enhance multi-action prediction. Our PEDP method employs model-based planning for conceiving what to express before deciding the current response through simulating single-action dialogs. Experimental results on the MultiWOZ dataset demonstrate that our fully supervised learning-based method achieves a solid task success rate of 90.6%, improving 3% compared to the state-of-the-art methods.

preprint, 2022, arXiv

A physics and data co-driven surrogate modeling approach for temperature field prediction on irregular geometric domain

In the whole aircraft structural optimization loop, thermal analysis plays a very important role. However, directly applying traditional numerical analysis tools imposes a severe computational burden, especially when each optimization step involves repeated parameter modification followed by thermal analysis. Recently, with the fast development of deep learning, several Convolutional Neural Network (CNN) surrogate models have been introduced to overcome this obstacle. However, for temperature field prediction on irregular geometric domains (TFP-IGD), CNNs are hardly adequate, since most of them were designed for processing regular images. To alleviate this difficulty, we propose a novel physics and data co-driven surrogate modeling method. First, after adopting the Bezier curve for geometric parameterization, a body-fitted coordinate mapping is introduced to generate coordinate transforms between the irregular physical plane and the regular computational plane. Second, a physics-driven CNN surrogate with partial differential equation (PDE) residuals as a loss function is utilized for fast meshing (meshing surrogate); then, we present a data-driven surrogate model based on the multi-level reduced-order method, aiming to learn solutions of the temperature field in the above regular computational plane (thermal surrogate). Finally, combining the grid position information provided by the meshing surrogate with the scalar temperature field information provided by the thermal surrogate (combined model), we reach an end-to-end surrogate model from geometric parameters to temperature field prediction on an irregular geometric domain. Numerical results demonstrate that our method can significantly improve prediction accuracy on a smaller dataset while reducing the training time when compared with other CNN methods.

preprint, 2022, arXiv

A Virtual Reality-based Training and Assessment System for Bridge Inspectors with an Assistant Drone

Over 600,000 bridges in the U.S. must be inspected every two years to identify flaws, defects, or potential problems that may need follow-up maintenance. Bridge inspection has adopted unmanned aerial vehicles (or drones) for improving safety, efficiency, and cost-effectiveness. Although drones can operate in an autonomous mode, keeping inspectors in the loop is critical for complex tasks in bridge inspection. Therefore, inspectors need to develop the skill and confidence to operate drones in their jobs. This paper presents the design and development of a virtual reality-based training and assessment system for inspectors assisted by a drone in bridge inspection. The system is composed of four integrated modules: a simulated bridge inspection developed in Unity, an interface that allows a trainee to operate the drone in simulation using a remote controller, data monitoring and analysis to provide real-time, in-task feedback to trainees to assist their learning, and a post-study assessment supporting personalized training. The paper also conducts a proof-of-concept pilot study to illustrate the functionality of this system. The study demonstrated that TASBID, as a tool for the early-stage training, can objectively identify the training needs of individuals in detail and, further, help them develop the skill and confidence in collaborating with a drone in bridge inspection. The system has built a modeling and analysis platform for exploring advanced solutions to the human-drone cooperative inspection of civil infrastructure.

preprint, 2022, arXiv

Active Learning for Open-set Annotation

Existing active learning studies typically work in the closed-set setting by assuming that all data examples to be labeled are drawn from known classes. However, in real annotation tasks, the unlabeled data usually contains a large amount of examples from unknown classes, resulting in the failure of most active learning methods. To tackle this open-set annotation (OSA) problem, we propose a new active learning framework called LfOSA, which boosts the classification performance with an effective sampling strategy to precisely detect examples from known classes for annotation. The LfOSA framework introduces an auxiliary network to model the per-example max activation value (MAV) distribution with a Gaussian Mixture Model, which can dynamically select the examples with highest probability from known classes in the unlabeled set. Moreover, by reducing the temperature $T$ of the loss function, the detection model will be further optimized by exploiting both known and unknown supervision. The experimental results show that the proposed method can significantly improve the selection quality of known classes, and achieve higher classification accuracy with lower annotation cost than state-of-the-art active learning methods. To the best of our knowledge, this is the first work of active learning for open-set annotation.

preprint, 2022, arXiv

Enhanced two-component superconductivity in CoSi2/TiSi2 heterojunctions

We report enhanced two-component superconductivity in (CoSi2/Si)/TiSi2 superconductor/normal-metal (S/N) heterojunctions. An enhanced superconducting transition temperature about twice that of CoSi2 and an upper critical field about 20 times bigger than that of epitaxial CoSi2/Si films were found. The tunneling spectra of three-terminal S/N junctions show pronounced zero-bias conductance peaks (ZBCPs) that signify penetration of odd-frequency, spin-triplet and even-parity Cooper pairs in TiSi2 from triplet dominant pairing in CoSi2/Si driven by symmetry reduction at the CoSi2/Si interface. Both the enhancement of the superconducting transition temperature and the ZBCPs are found to be more pronounced if TiSi2 is made more diffusive.

preprint, 2022, arXiv

Genome-wide nucleotide-resolution model of single-strand break site reveals species evolutionary hierarchy

Single-strand breaks (SSBs) are the major form of DNA damage in the genome, arising spontaneously as the outcome of genotoxins and intermediates of DNA transactions. SSBs play a crucial role in various biological processes and show a non-random distribution in the genome. Several SSB detection approaches such as S1 END-seq and SSiNGLe-ILM have emerged to characterize the genomic landscape of SSBs at nucleotide resolution. However, these sequencing-based methods are costly and infeasible for large-scale analysis of diverse species. Thus, we propose the first computational approach, SSBlazer, which is an explainable and scalable deep learning framework for genome-wide nucleotide-resolution SSB site prediction. We demonstrate that SSBlazer can accurately predict SSB sites and sufficiently alleviate false positives by constructing an imbalanced dataset to simulate the realistic SSB distribution. The model interpretation analysis reveals that SSBlazer captures the pattern of individual CpG in genomic context and the motif of TGCC in the center region as critical features. Besides, SSBlazer is a lightweight model with robust generalization ability in the cross-species evaluation, which enables large-scale genome-wide application in diverse species. Strikingly, the putative SSB genomic landscapes of 216 vertebrates reveal a negative correlation between SSB frequency and evolutionary hierarchy, suggesting that the genome tends toward greater integrity during evolution.

preprint, 2022, arXiv

Hot-Refresh Model Upgrades with Regression-Alleviating Compatible Training in Image Retrieval

The task of hot-refresh model upgrades of image retrieval systems plays an essential role in the industry but has never been investigated in academia before. Conventional cold-refresh model upgrades can only deploy new models after the gallery is overall backfilled, taking weeks or even months for massive data. In contrast, hot-refresh model upgrades deploy the new model immediately and then gradually improve the retrieval accuracy by backfilling the gallery on-the-fly. Compatible training has made this possible; however, the problem of model regression with negative flips poses a great challenge to the stable improvement of user experience. We argue that it is mainly due to the fact that new-to-old positive query-gallery pairs may show less similarity than new-to-new negative pairs. To solve the problem, we introduce a Regression-Alleviating Compatible Training (RACT) method to properly constrain the feature compatibility while reducing negative flips. The core is to encourage the new-to-old positive pairs to be more similar than both the new-to-old negative pairs and the new-to-new negative pairs. An efficient uncertainty-based backfilling strategy is further introduced to accelerate accuracy improvements. Extensive experiments on large-scale retrieval benchmarks (e.g., Google Landmark) demonstrate that our RACT effectively alleviates the model regression for one more step towards seamless model upgrades. The code will be available at https://github.com/binjiezhang/RACT_ICLR2022.

preprint, 2022, arXiv

Interpretable RNA Foundation Model from Unannotated Data for Highly Accurate RNA Structure and Function Predictions

Non-coding RNA structure and function are essential to understanding various biological processes, such as cell signaling, gene expression, and post-transcriptional regulations. These are all among the core problems in the RNA field. With the rapid growth of sequencing technology, we have accumulated a massive amount of unannotated RNA sequences. On the other hand, expensive experimental observation yields only limited amounts of annotated data and 3D structures. Hence, it is still challenging to design computational methods for predicting their structures and functions. The lack of annotated data and systematic study causes inferior performance. To resolve the issue, we propose a novel RNA foundation model (RNA-FM) to take advantage of all the 23 million non-coding RNA sequences through self-supervised learning. Within this approach, we discover that the pre-trained RNA-FM could infer sequential and evolutionary information of non-coding RNAs without using any labels. Furthermore, we demonstrate RNA-FM's effectiveness by applying it to the downstream secondary/3D structure prediction, SARS-CoV-2 genome structure and evolution prediction, protein-RNA binding preference modeling, and gene expression regulation modeling. The comprehensive experiments show that the proposed method improves the RNA structural and functional modelling results significantly and consistently. Despite only being trained with unlabelled data, RNA-FM can serve as the foundational model for the field.

preprint, 2022, arXiv

Knowledge-Grounded Dialogue Generation with a Unified Knowledge Representation

Knowledge-grounded dialogue systems are challenging to build due to the lack of training data and heterogeneous knowledge sources. Existing systems perform poorly on unseen topics due to limited topics covered in the training data. In addition, heterogeneous knowledge sources make it challenging for systems to generalize to other tasks because knowledge sources in different knowledge representations require different knowledge encoders. To address these challenges, we present PLUG, a language model that homogenizes different knowledge sources to a unified knowledge representation for knowledge-grounded dialogue generation tasks. PLUG is pre-trained on a dialogue generation task conditioned on a unified essential knowledge representation. It can generalize to different downstream knowledge-grounded dialogue generation tasks with a few training examples. The empirical evaluation on two benchmarks shows that our model generalizes well across different knowledge-grounded tasks. It can achieve comparable performance with state-of-the-art methods under a fully-supervised setting and significantly outperforms other methods in zero-shot and few-shot settings.

preprint, 2022, arXiv

MixDefense: A Defense-in-Depth Framework for Adversarial Example Detection Based on Statistical and Semantic Analysis

Machine learning with deep neural networks (DNNs) has become one of the foundation techniques in many safety-critical systems, such as autonomous vehicles and medical diagnosis systems. DNN-based systems, however, are known to be vulnerable to adversarial examples (AEs) that are maliciously perturbed variants of legitimate inputs. While there has been a vast body of research to defend against AE attacks in the literature, the performances of existing defense techniques are still far from satisfactory, especially for adaptive attacks, wherein attackers are knowledgeable about the defense mechanisms and craft AEs accordingly. In this work, we propose a multilayer defense-in-depth framework for AE detection, namely MixDefense. For the first layer, we focus on those AEs with large perturbations. We propose to leverage the `noise' features extracted from the inputs to discover the statistical difference between natural images and tampered ones for AE detection. For AEs with small perturbations, the inference result of such inputs would largely deviate from their semantic information. Consequently, we propose a novel learning-based solution to model such contradictions for AE detection. Both layers are resilient to adaptive attacks because there do not exist gradient propagation paths for AE generation. Experimental results with various AE attack methods on image classification datasets show that the proposed MixDefense solution outperforms the existing AE detection techniques by a considerable margin.

preprint, 2022, arXiv

Multipole-fluctuation pairing mechanism of $d_{x^2-y^2}+ig$ superconductivity in Sr$_2$RuO$_4$

Despite many experimental and theoretical efforts, the pairing symmetry of superconductivity in Sr$_2$RuO$_4$ remains undecided. The accidentally degenerate $d_{x^2-y^2}+ig$ is consistent with most current experiments and seems to be one of the most probable candidates, but we still lack a satisfactory theoretical mechanism for its appearance. Here we construct a phenomenological model combining realistic electronic band structures and all symmetry-allowed multipole fluctuations as potential pairing glues, and make a systematic survey of major pairing states within the Eliashberg framework. Our calculations show that $d_{x^2-y^2}+ig$ can arise naturally from the interplay of antiferromagnetic, ferromagnetic, and electric multipole fluctuations whose coexistence is manifested in previous experiments and calculations. Our work provides a physically reasonable basis supporting the possibility of $d_{x^2-y^2}+ig$ pairing in superconducting Sr$_2$RuO$_4$.

preprint, 2022, arXiv

Neutron Scattering Study of Fluctuating and Static Spin Correlations in the Anisotropic Spin Glass Fe$_2$TiO$_5$

The anisotropic spin glass transition, in which spin freezing is observed only along the c-axis in pseudobrookite Fe$_2$TiO$_5$, has long been perplexing because the Fe$^{3+}$ moments (d$^5$) are expected to be isotropic. Recently, neutron diffraction demonstrated that surfboard-shaped antiferromagnetic nanoregions coalesce above the glass transition temperature, T$_g$ $\approx$ 55 K, and a model was proposed in which the freezing of the fluctuations of the surfboards' magnetization leads to the anisotropic spin glass state. Given this new model, we have carried out high resolution inelastic neutron scattering measurements of the spin-spin correlations to understand the temperature dependence of the intra-surfboard spin dynamics on neutron (picosecond) time-scales. Here, we report on the temperature-dependence of the spin fluctuations measured from single crystal Fe$_2$TiO$_5$. Strong quasi-elastic magnetic scattering, arising from intra-surfboard correlations, is observed well above T$_g$. The spin fluctuations possess a steep energy-wave vector relation and are indicative of strong exchange interactions, consistent with the large Curie-Weiss temperature. As the temperature approaches T$_g$ from above, a shift in spectral weight from inelastic to elastic scattering is observed. At various temperatures between 4 K and 300 K, a characteristic relaxation rate of the fluctuations is determined. Despite the freezing of the majority of the spin correlations, an inelastic contribution remains even at base temperature, signifying the presence of fluctuating intra-surfboard spin correlations to at least T/T$_g$ $\approx$ 0.1 consistent with a description of Fe$_2$TiO$_5$ as a hybrid between conventional and geometrically frustrated spin glasses.

preprint, 2022, arXiv

Parameter-robust Braess-Sarazin-type smoothers for linear elasticity problems

In this work, we propose three Braess-Sarazin-type multigrid relaxation schemes for solving linear elasticity problems, where the marker and cell scheme, a finite difference method, is used for the discretization. The three relaxation schemes are Jacobi-Braess-Sarazin, Mass-Braess-Sarazin, and Vanka-Braess-Sarazin. A local Fourier analysis (LFA) for the block-structured relaxation schemes is presented to study multigrid convergence behavior. From LFA, we derive the optimal LFA smoothing factor for each case. We obtain highly efficient smoothing factors, which are independent of the Lamé constants. The Vanka-Braess-Sarazin relaxation scheme is the most efficient of the three. In each relaxation, a Schur complement system needs to be solved. Because a direct solve is often expensive, an inexact version is developed, where we simply use at most three weighted Jacobi iterations on the Schur complement system. Finally, two-grid and V-cycle multigrid performances are presented to validate our theoretical results. Our numerical results show that the inexact versions can achieve the same performance as the exact versions and that our methods are robust to the Lamé constants.
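The weighted Jacobi iteration mentioned in this abstract can be illustrated with a minimal sketch. It runs three damped Jacobi sweeps on a small, generic, diagonally dominant system. The matrix, the right-hand side, the damping weight `omega = 2/3`, and the function names are assumptions chosen for illustration; they are not the Schur complement system or the parameters used in the paper.

```python
def weighted_jacobi(A, b, x0, omega=2.0 / 3.0, iterations=3):
    """Run a fixed number of weighted (damped) Jacobi sweeps on A x = b."""
    n = len(b)
    x = list(x0)
    for _ in range(iterations):
        # Residual r = b - A x, computed from the previous iterate only.
        r = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        # Update x <- x + omega * D^{-1} r, where D is the diagonal of A.
        x = [x[i] + omega * r[i] / A[i][i] for i in range(n)]
    return x


def residual_max_norm(A, b, x):
    """Largest absolute entry of the residual b - A x."""
    n = len(b)
    return max(abs(b[i] - sum(A[i][j] * x[j] for j in range(n))) for i in range(n))


# Hypothetical 3x3 test system (illustrative only).
A = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
b = [1.0, 2.0, 3.0]
x0 = [0.0, 0.0, 0.0]
x3 = weighted_jacobi(A, b, x0)
print(residual_max_norm(A, b, x0), residual_max_norm(A, b, x3))
```

The sketch only shows why a handful of cheap sweeps can stand in for a direct solve: each sweep needs one matrix-vector product and a diagonal scaling, and the residual shrinks at every step for this kind of matrix.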

preprint, 2022, arXiv

Rethinking Knowledge Distillation via Cross-Entropy

Knowledge Distillation (KD) has developed extensively and boosted various tasks. The classical KD method adds the KD loss to the original cross-entropy (CE) loss. We try to decompose the KD loss to explore its relation with the CE loss. Surprisingly, we find it can be regarded as a combination of the CE loss and an extra loss which has the identical form as the CE loss. However, we notice the extra loss forces the student's relative probability to learn the teacher's absolute probability. Moreover, the sum of the two probabilities is different, making it hard to optimize. To address this issue, we revise the formulation and propose a distributed loss. In addition, we utilize teachers' target output as the soft target, proposing the soft loss. Combining the soft loss and the distributed loss, we propose a new KD loss (NKD). Furthermore, we smooth students' target output to treat it as the soft target for training without teachers and propose a teacher-free new KD loss (tf-NKD). Our method achieves state-of-the-art performance on CIFAR-100 and ImageNet. For example, with ResNet-34 as the teacher, we boost the ImageNet Top-1 accuracy of ResNet18 from 69.90% to 71.96%. In training without teachers, MobileNet, ResNet-18 and SwinTransformer-Tiny achieve 70.04%, 70.76%, and 81.48%, which are 0.83%, 0.86%, and 0.30% higher than the baseline, respectively. The code is available at https://github.com/yzd-v/cls_KD.
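For orientation, the classical objective that this abstract starts from (a KD term added to the cross-entropy loss) can be sketched as follows. This is the commonly used temperature-scaled formulation, not the NKD or tf-NKD loss proposed in the paper; the weight `alpha`, the temperature value, the logits, and the function names are assumptions made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(student_logits, target_index):
    """CE loss of the student against the hard label."""
    return -math.log(softmax(student_logits)[target_index])


def kd_term(student_logits, teacher_logits, temperature):
    """KL divergence from the softened teacher to the softened student."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return sum(t * math.log(t / s) for t, s in zip(p_teacher, p_student))


def classical_kd_loss(student_logits, teacher_logits, target_index,
                      alpha=1.0, temperature=4.0):
    """CE loss plus a weighted, temperature-scaled KD term."""
    ce = cross_entropy(student_logits, target_index)
    kd = kd_term(student_logits, teacher_logits, temperature)
    return ce + alpha * temperature ** 2 * kd


# Hypothetical logits for a 3-class problem (illustrative only).
student = [2.0, 0.5, -1.0]
teacher = [3.0, 1.0, -2.0]
print(classical_kd_loss(student, teacher, target_index=0))
```

The KD term vanishes when student and teacher logits coincide, so in that case the total reduces to the plain cross-entropy loss.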

preprint, 2022, arXiv

Semantic Guided Single Image Reflection Removal

Reflection is common in images capturing scenes behind a glass window; it is not only a visual disturbance but also degrades the performance of other computer vision algorithms. Single image reflection removal is an ill-posed problem because the color at each pixel needs to be separated into two values, i.e., the desired clear background and the reflection. To solve it, existing methods propose priors such as smoothness and color consistency. However, these low-level priors are not reliable in complex scenes. For instance, when capturing a real outdoor scene through a window, both the foreground and the background contain smooth and sharp areas and a variety of colors. In this paper, inspired by the fact that humans can separate the two layers easily by recognizing the objects, we use object semantics as guidance to force pixels of the same semantic object to belong to the same layer. Extensive experiments on different datasets show that adding the semantic information offers a significant improvement to reflection separation. We also demonstrate the applications of the proposed method to other computer vision tasks.

preprint, 2022, arXiv

Temporally Efficient Vision Transformer for Video Instance Segmentation

Recently, the vision transformer has achieved tremendous success on image-level visual recognition tasks. To effectively and efficiently model the crucial temporal information within a video clip, we propose a Temporally Efficient Vision Transformer (TeViT) for video instance segmentation (VIS). Different from previous transformer-based VIS methods, TeViT is nearly convolution-free and contains a transformer backbone and a query-based video instance segmentation head. In the backbone stage, we propose a nearly parameter-free messenger shift mechanism for early temporal context fusion. In the head stages, we propose a parameter-shared spatiotemporal query interaction mechanism to build the one-to-one correspondence between video instances and queries. Thus, TeViT fully utilizes both frame-level and instance-level temporal context information and obtains strong temporal modeling capacity with negligible extra computational cost. On three widely adopted VIS benchmarks, i.e., YouTube-VIS-2019, YouTube-VIS-2021, and OVIS, TeViT obtains state-of-the-art results and maintains high inference speed, e.g., 46.6 AP with 68.9 FPS on YouTube-VIS-2019. Code is available at https://github.com/hustvl/TeViT.

preprint, 2022, arXiv

Towards explainable artificial intelligence (XAI) for early anticipation of traffic accidents

Traffic accident anticipation is a vital function of Automated Driving Systems (ADSs) for providing a safety-guaranteed driving experience. An accident anticipation model aims to predict accidents promptly and accurately before they occur. Existing Artificial Intelligence (AI) models of accident anticipation lack a human-interpretable explanation of their decision-making. Although these models perform well, they remain a black box to ADS users and thus struggle to gain their trust. To this end, this paper presents a Gated Recurrent Unit (GRU) network that learns spatio-temporal relational features for the early anticipation of traffic accidents from dashcam video data. A post-hoc attention mechanism named Grad-CAM is integrated into the network to generate saliency maps as the visual explanation of the accident anticipation decision. An eye tracker captures human eye fixation points for generating human attention maps. The explainability of network-generated saliency maps is evaluated in comparison to human attention maps. Qualitative and quantitative results on a public crash dataset confirm that the proposed explainable network can anticipate an accident on average 4.57 seconds before it occurs, with 94.02% average precision. Furthermore, various post-hoc attention-based XAI methods are evaluated and compared. The comparison confirms that the Grad-CAM chosen by this study can generate high-quality, human-interpretable saliency maps (with 1.23 Normalized Scanpath Saliency) for explaining the crash anticipation decision. Importantly, results confirm that the proposed AI model, with a human-inspired design, can outperform humans in accident anticipation.
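The Normalized Scanpath Saliency (NSS) figure quoted in this abstract compares a saliency map against human fixation points. A minimal sketch of the standard definition follows: the map is standardized to zero mean and unit standard deviation, and the standardized values are averaged at the fixated pixels. The toy map, the fixation coordinates, the function name, and the use of the population standard deviation are assumptions for illustration; the paper's own evaluation code is not shown in the source.

```python
import statistics


def normalized_scanpath_saliency(saliency_map, fixations):
    """Mean z-scored saliency at the fixated (row, col) positions."""
    values = [v for row in saliency_map for v in row]
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population std; an assumption here
    if std == 0:
        raise ValueError("saliency map is constant; NSS is undefined")
    scores = [(saliency_map[r][c] - mean) / std for r, c in fixations]
    return statistics.fmean(scores)


# Hypothetical 2x2 saliency map with a single bright pixel.
toy_map = [[0.0, 0.0], [0.0, 4.0]]
print(normalized_scanpath_saliency(toy_map, [(1, 1)]))  # fixation on the bright pixel
print(normalized_scanpath_saliency(toy_map, [(0, 0)]))  # fixation on a dark pixel
```

A positive score means the fixated locations are more salient than the map average, a score near zero means chance-level agreement, and a negative score means the map highlights regions that people did not look at.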

preprint, 2022, arXiv

Towards Vivid and Diverse Image Colorization with Generative Color Prior

Colorization has attracted increasing interest in recent years. Classic reference-based methods usually rely on external color images for plausible results. A large image database or online search engine is inevitably required for retrieving such exemplars. Recent deep-learning-based methods can automatically colorize images at a low cost. However, their results are always accompanied by unsatisfactory artifacts and incoherent colors. In this work, we propose GCP-Colorization, which leverages the rich and diverse color priors encapsulated in a pretrained Generative Adversarial Network (GAN) for automatic colorization. Specifically, we first "retrieve" matched features (similar to exemplars) via a GAN encoder and then incorporate these features into the colorization process with feature modulations. Thanks to the powerful generative color prior (GCP) and delicate designs, our GCP-Colorization can produce vivid colors with a single forward pass. Moreover, it is highly convenient to obtain diverse results by modifying GAN latent codes. GCP-Colorization also inherits the merit of interpretable controls of GANs and can attain controllable and smooth transitions by walking through the GAN latent space. Extensive experiments and user studies demonstrate that GCP-Colorization outperforms previous works. Codes are available at https://github.com/ToTheBeginning/GCP-Colorization.

preprint, 2022, arXiv

Unseasonal super ionospheric plasma bubble and scintillations seeded by the 2022 Tonga Volcano Eruption related perturbations

The Hunga-Tonga volcano eruption at 04:14:45 UT on 15 January 2022 produced various waves propagating globally, disturbing the background atmosphere and ionosphere. Coinciding with the arrival of perturbation waves, several equatorial plasma bubbles (EPBs) were consecutively generated at post-sunset hours over the East/Southeast Asian region, with the largest extension to middle latitudes. These EPBs caused intense L-band amplitude scintillations at middle-to-low latitudes, with signal fading depths up to ~16 dB. Considering the very rare occurrence of EPBs during this season in the East/Southeast Asian sector and the significantly modulated background ionosphere, we believe that the perturbation waves launched by the volcano eruption triggered the generation of unseasonal super EPBs. The ionospheric perturbations linked with the 2022 Tonga volcano eruption propagated coincidently through the East/Southeast Asia longitude sector near sunset, modulated the equatorial F region bottomside plasma density and acted as the seeding source for the generation of unseasonal super bubbles. Our results imply that a volcano eruption could indirectly affect satellite communication links in a region more than ten thousand kilometers away.

preprint2022arXiv

Using Chatbots to Teach Languages

This paper reports on progress towards building an online language learning tool to provide learners with conversational experience by using dialog systems as conversation practice partners. Our system can adapt to users' language proficiency on the fly. We also provide automatic grammar error feedback to help users learn from their mistakes. According to our first adopters, our system is entertaining and useful. Furthermore, we will provide the learning technology community a large-scale conversation dataset on language learning and grammar correction. Our next step is to make our system more adaptive to user profile information by using reinforcement learning algorithms.

preprint2022arXiv

What You See is Not What the Network Infers: Detecting Adversarial Examples Based on Semantic Contradiction

Adversarial examples (AEs) pose severe threats to the applications of deep neural networks (DNNs) to safety-critical domains, e.g., autonomous driving. While there has been a vast body of AE defense solutions, to the best of our knowledge, they all suffer from some weaknesses, e.g., defending against only a subset of AEs or causing a relatively high accuracy loss for legitimate inputs. Moreover, most existing solutions cannot defend against adaptive attacks, wherein attackers are knowledgeable about the defense mechanisms and craft AEs accordingly. In this paper, we propose a novel AE detection framework based on the very nature of AEs, i.e., their semantic information is inconsistent with the discriminative features extracted by the target DNN model. To be specific, the proposed solution, namely ContraNet, models such contradiction by first taking both the input and the inference result to a generator to obtain a synthetic output and then comparing it against the original input. For legitimate inputs that are correctly inferred, the synthetic output tries to reconstruct the input. On the contrary, for AEs, instead of reconstructing the input, the synthetic output would be created to conform to the wrong label whenever possible. Consequently, by measuring the distance between the input and the synthetic output with metric learning, we can differentiate AEs from legitimate inputs. We perform comprehensive evaluations under various AE attack scenarios, and experimental results show that ContraNet outperforms existing solutions by a large margin, especially under adaptive attacks. Moreover, our analysis shows that successful AEs that can bypass ContraNet tend to have much-weakened adversarial semantics. We have also shown that ContraNet can be easily combined with adversarial training techniques to achieve further improved AE defense capabilities.

preprint2021arXiv

CLiMP: A Benchmark for Chinese Language Model Evaluation

Linguistically informed analyses of language models (LMs) contribute to the understanding and improvement of these models. Here, we introduce the corpus of Chinese linguistic minimal pairs (CLiMP), which can be used to investigate what knowledge Chinese LMs acquire. CLiMP consists of sets of 1,000 minimal pairs (MPs) for 16 syntactic contrasts in Mandarin, covering 9 major Mandarin linguistic phenomena. The MPs are semi-automatically generated, and human agreement with the labels in CLiMP is 95.8%. We evaluated 11 different LMs on CLiMP, covering n-grams, LSTMs, and Chinese BERT. We find that classifier-noun agreement and verb complement selection are the phenomena that models generally perform best at. However, models struggle the most with the ba construction, binding, and filler-gap dependencies. Overall, Chinese BERT achieves an 81.8% average accuracy, while the performances of LSTMs and 5-grams are only moderately above chance level.
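As a rough illustration of how a minimal-pair benchmark of this kind is commonly scored, the sketch below counts a pair as correct when the language model assigns a higher log-probability to the acceptable sentence than to the unacceptable one. The scoring rule, the add-one-smoothed unigram model, the function names `train_unigram` and `minimal_pair_accuracy`, and the toy English tokens are all assumptions made for illustration; they are not taken from the CLiMP paper or its data.

```python
import math
from collections import Counter


def train_unigram(corpus, alpha=1.0):
    """Return a log-probability scorer from a toy add-alpha unigram model (illustrative only)."""
    counts = Counter(tok for sent in corpus for tok in sent)
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot reserved for unseen tokens

    def log_prob(sentence):
        return sum(
            math.log((counts[t] + alpha) / (total + alpha * vocab))
            for t in sentence
        )

    return log_prob


def minimal_pair_accuracy(log_prob, pairs):
    """Fraction of (acceptable, unacceptable) pairs where the acceptable sentence scores higher."""
    correct = sum(1 for good, bad in pairs if log_prob(good) > log_prob(bad))
    return correct / len(pairs)


# Hypothetical toy data: each pair differs in exactly one token.
corpus = [
    ["the", "cat", "sleeps"],
    ["the", "dog", "sleeps"],
    ["the", "cat", "eats"],
]
pairs = [
    (["the", "cat", "sleeps"], ["the", "cat", "sleep"]),
    (["the", "dog", "eats"], ["the", "dog", "eat"]),
    (["the", "bird", "sings"], ["the", "bird", "sleeps"]),
]

score = train_unigram(corpus)
accuracy = minimal_pair_accuracy(score, pairs)
print(f"minimal-pair accuracy: {accuracy:.3f}")
```

Under this scoring rule a model that guesses at random would land near 50%, which is the sense in which a result can be described as only moderately above chance.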

preprint2021arXiv

On the structure of Ricci shrinkers

We develop a structure theory for non-collapsed Ricci shrinkers without any curvature condition. As applications, we obtain some curvature estimates of the Ricci shrinkers depending only on the non-collapsing constant.

preprint2021arXiv

Orbital-Selective High-Temperature Cooper Pairing Developed in the Two-Dimensional Limit

The orbital multiplicity in multiband superconductors yields orbital differentiation in normal-state properties, and can lead to orbital-selective spin-fluctuation Cooper pairing. This phenomenon has become increasingly pivotal in clarifying the pairing 'enigma' particularly for multiband high-temperature superconductors. In one-unit-cell (1-UC) FeSe/SrTiO3, the thinnest and highest-Tc member of iron-based superconductors, the standard electron-hole Fermi pocket nesting scenario is apparently not applicable since the Gamma-centered hole pockets are absent, so the actual pairing mechanism is the subject of intense debate. Here, by measuring high-resolution Bogoliubov quasiparticle interference, we report observations of highly anisotropic magnetic Cooper pairing in 1-UC FeSe. From a theoretical point of view, it is important to incorporate effects of electronic correlations within a spin-fluctuation pairing calculation, where the dxy orbital becomes coherence-suppressed. The resulting pairing gap is compatible with the experimental findings, which suggests that high-Tc Cooper pairing with orbital selectivity applies to 1-UC FeSe. Our findings imply the general existence of orbital selectivity in iron-based superconductors and the universal importance of electron correlations in high-Tc superconductors.

preprint2021arXiv

Two-Stage Single Image Reflection Removal with Reflection-Aware Guidance

Removing undesired reflection from an image captured through a glass surface is a very challenging problem with many practical application scenarios. To improve reflection removal, cascaded deep models are usually adopted to estimate the transmission in a progressive manner. However, most existing methods are still limited in exploiting the result of the prior stage to guide transmission estimation. In this paper, we present a novel two-stage network with reflection-aware guidance (RAGNet) for single image reflection removal (SIRR). Specifically, the reflection layer is estimated first because it is generally much simpler and relatively easier to estimate. A reflection-aware guidance (RAG) module is then designed to better exploit the estimated reflection in predicting the transmission layer. By incorporating feature maps from the estimated reflection and the observation, RAG can be used (i) to mitigate the effect of reflection in the observation, and (ii) to generate a mask in partial convolution for mitigating the effect of deviations from the linear combination hypothesis. A dedicated mask loss is further presented for reconciling the contributions of encoder and decoder features. Experiments on five commonly used datasets demonstrate the quantitative and qualitative superiority of our RAGNet in comparison to the state-of-the-art SIRR methods. The source code and pre-trained model are available at https://github.com/liyucs/RAGNet.

preprint2020arXiv

A new generalized inverse of matrices from core-EP decomposition

A new generalized inverse for a square matrix $H\in\mathbb{C}^{n\times n}$, called the CCE-inverse, is established by the core-EP decomposition and the Moore-Penrose inverse $H^\dagger$. We propose some characterizations of the CCE-inverse. Furthermore, two canonical forms of the CCE-inverse are presented. Finally, we introduce the definitions of CCE-matrices and $k$-CCE matrices, and prove that CCE-matrices are the same as $i$-EP matrices studied by Wang and Liu in [The weak group matrix, Aequationes Mathematicae, 93(6): 1261-1273, 2019].

preprint2020arXiv

Ancient solutions to the Ricci flow with isotropic curvature conditions

We show that every $n$-dimensional, $κ$-noncollapsed, noncompact, complete ancient solution to the Ricci flow with uniformly PIC for $n=4$ or $n\ge 12$ has weakly PIC$_2$ and bounded curvature. Combining this with earlier results, we prove that any such solution is isometric to either a family of shrinking cylinders (or a quotient thereof) or the Bryant soliton. Also, we classify all complex 2-dimensional, $κ$-noncollapsed, complete ancient solutions to the Kähler Ricci flow with weakly PIC.

preprint2020arXiv

Attention Guided Low-light Image Enhancement with a Large Scale Low-light Simulation Dataset

Low-light image enhancement is challenging in that it needs to consider not only brightness recovery but also complex issues like color distortion and noise, which usually hide in the dark. Simply adjusting the brightness of a low-light image will inevitably amplify those artifacts. To address this difficult problem, this paper proposes a novel end-to-end attention-guided method based on multi-branch convolutional neural network. To this end, we first construct a synthetic dataset with carefully designed low-light simulation strategies. The dataset is much larger and more diverse than existing ones. With the new dataset for training, our method learns two attention maps to guide the brightness enhancement and denoising tasks respectively. The first attention map distinguishes underexposed regions from well lit regions, and the second attention map distinguishes noises from real textures. With their guidance, the proposed multi-branch decomposition-and-fusion enhancement network works in an input adaptive way. Moreover, a reinforcement-net further enhances color and contrast of the output image. Extensive experiments on multiple datasets demonstrate that our method can produce high fidelity enhancement results for low-light images and outperforms the current state-of-the-art methods by a large margin both quantitatively and visually.

preprint2020arXiv

Classification Calibration for Long-tail Instance Segmentation

Remarkable progress has been made in object instance detection and segmentation in recent years. However, existing state-of-the-art methods are mostly evaluated with fairly balanced and class-limited benchmarks, such as Microsoft COCO dataset [8]. In this report, we investigate the performance drop phenomenon of state-of-the-art two-stage instance segmentation models when processing extreme long-tail training data based on the LVIS [5] dataset, and find a major cause is the inaccurate classification of object proposals. Based on this observation, we propose to calibrate the prediction of classification head to improve recognition performance for the tail classes. Without much additional cost and modification of the detection model architecture, our calibration method improves the performance of the baseline by a large margin on the tail classes. Codes will be available. Importantly, after the submission, we find significant improvement can be further achieved by modifying the calibration head, which we will update later.

preprint2020arXiv

Dual Semantic Fusion Network for Video Object Detection

Video object detection is a tough task due to the deteriorated quality of video sequences captured under complex environments. Currently, this area is dominated by a series of feature enhancement based methods, which distill beneficial semantic information from multiple frames and generate enhanced features through fusing the distilled information. However, the distillation and fusion operations are usually performed at either frame level or instance level with external guidance using additional information, such as optical flow and feature memory. In this work, we propose a dual semantic fusion network (abbreviated as DSFNet) to fully exploit both frame-level and instance-level semantics in a unified fusion framework without external guidance. Moreover, we introduce a geometric similarity measure into the fusion process to alleviate the influence of information distortion caused by noise. As a result, the proposed DSFNet can generate more robust features through the multi-granularity fusion and avoid being affected by the instability of external guidance. To evaluate the proposed DSFNet, we conduct extensive experiments on the ImageNet VID dataset. Notably, the proposed dual semantic fusion network achieves, to the best of our knowledge, the best performance of 84.1% mAP among the current state-of-the-art video object detectors with ResNet-101 and 85.4% mAP with ResNeXt-101 without using any post-processing steps.

preprint2020arXiv

Fast Video Object Segmentation using the Global Context Module

We developed a real-time, high-quality semi-supervised video object segmentation algorithm. Its accuracy is on par with the most accurate, time-consuming online-learning model, while its speed is similar to the fastest template-matching method with sub-optimal accuracy. The core component of the model is a novel global context module that effectively summarizes and propagates information through the entire video. Compared to previous approaches that only use one frame or a few frames to guide the segmentation of the current frame, the global context module uses all past frames. Unlike the previous state-of-the-art space-time memory network that caches a memory at each spatio-temporal position, the global context module uses a fixed-size feature representation. Therefore, it uses constant memory regardless of the video length and costs substantially less memory and computation. With the novel module, our model achieves top performance on standard benchmarks at a real-time speed.

preprint2020arXiv

Hyperspectral City V1.0 Dataset and Benchmark

This document introduces the background and the usage of the Hyperspectral City Dataset and the benchmark. The documentation starts with the background and motivation of the dataset. Following that, we briefly describe the method of collecting the dataset and the processing from the raw dataset to the final released dataset, specifically version 1.0. We also provide the detailed usage of the dataset and the evaluation metric for submitting results to the 2019 Hyperspectral City Challenge.

preprint2020arXiv

Learning to Stop While Learning to Predict

There is a recent surge of interest in designing deep architectures based on the update steps in traditional algorithms, or learning neural networks to improve and replace traditional algorithms. While traditional algorithms have certain stopping criteria for outputting results at different iterations, many algorithm-inspired deep models are restricted to a "fixed-depth" for all inputs. Similar to algorithms, the optimal depth of a deep architecture may be different for different input instances, either to avoid "over-thinking", or because we want to compute less for operations converged already. In this paper, we tackle this varying depth problem using a steerable architecture, where a feed-forward deep model and a variational stopping policy are learned together to sequentially determine the optimal number of layers for each input instance. Training such architecture is very challenging. We provide a variational Bayes perspective and design a novel and effective training procedure which decomposes the task into an oracle model learning stage and an imitation stage. Experimentally, we show that the learned deep model along with the stopping policy improves the performances on a diverse set of tasks, including learning sparse recovery, few-shot meta learning, and computer vision tasks.
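The idea of an input-dependent depth can be sketched as a simple inference loop: after each layer, a stopping policy is consulted, and the forward pass ends as soon as the policy votes to stop. Everything below is a hypothetical toy, assuming a hand-written policy and scalar "layers"; the function `run_with_stopping`, the threshold, and the policy are illustrative assumptions and do not reproduce the learned variational policy or the training procedure described in the abstract.

```python
def run_with_stopping(x, layers, stop_prob, threshold=0.5):
    """Apply layers one at a time and halt once the stopping policy fires.

    stop_prob(x, depth) is assumed to return a probability in [0, 1].
    Returns the output and the number of layers actually applied.
    """
    for depth, layer in enumerate(layers, start=1):
        x = layer(x)
        if stop_prob(x, depth) >= threshold:
            return x, depth
    return x, len(layers)


# Toy setup (assumption): each layer halves its input, and the hand-written
# policy stops once the value is small enough to count as converged.
layers = [lambda v: v / 2.0] * 6


def toy_policy(value, depth):
    return 1.0 if abs(value) < 1.0 else 0.0


easy_out, easy_depth = run_with_stopping(0.5, layers, toy_policy)
mid_out, mid_depth = run_with_stopping(4.0, layers, toy_policy)
hard_out, hard_depth = run_with_stopping(100.0, layers, toy_policy)
print(easy_depth, mid_depth, hard_depth)
```

In this toy, inputs that are already close to the fixed point exit after one layer, while harder inputs use the full stack, which is the compute saving the fixed-depth design cannot offer.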

preprint2020arXiv

On the regular-convexity of Ricci shrinker limit spaces

In this paper, we study the structure of the pointed-Gromov-Hausdorff limits of sequences of Ricci shrinkers. We define a regular-singular decomposition following the work of Cheeger-Colding for manifolds with a uniform Ricci curvature lower bound, and prove that the regular part of any Ricci shrinker limit space is convex, inspired by Colding-Naber's original idea of parabolic smoothing of the distance functions.

preprint2020arXiv

Overcoming Classifier Imbalance for Long-tail Object Detection with Balanced Group Softmax

Solving long-tail large vocabulary object detection with deep learning based models is a challenging and demanding task, which is however under-explored. In this work, we provide the first systematic analysis of the underperformance of state-of-the-art models when facing long-tail distributions. We find existing detection methods are unable to model few-shot classes when the dataset is extremely skewed, which can result in classifier imbalance in terms of parameter magnitude. Directly adapting long-tail classification models to detection frameworks cannot solve this problem due to the intrinsic difference between detection and classification. We therefore propose a novel balanced group softmax (BAGS) module for balancing the classifiers within the detection frameworks through group-wise training. It implicitly modulates the training process for the head and tail classes and ensures they are both sufficiently trained, without requiring any extra sampling for the instances from the tail classes. Extensive experiments on the very recent long-tail large vocabulary object recognition benchmark LVIS show that our proposed BAGS significantly improves the performance of detectors with various backbones and frameworks on both object detection and instance segmentation. It beats all state-of-the-art methods transferred from long-tail image classification and establishes a new state of the art. Code is available at https://github.com/FishYuLi/BalancedGroupSoftmax.
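A minimal sketch of the group-wise idea, assuming classes are binned by their training-instance counts and a separate softmax is computed inside each bin so that head-class logits cannot suppress tail-class logits. The bin boundaries, the helper names `group_classes` and `group_softmax`, and the example counts are hypothetical; the sketch deliberately leaves out how the released BAGS code handles background proposals or combines groups at inference.

```python
import math


def group_classes(instance_counts, boundaries):
    """Bin class indices by training-instance count (boundaries must be ascending)."""
    groups = [[] for _ in range(len(boundaries) + 1)]
    for cls, n in enumerate(instance_counts):
        bin_index = sum(1 for b in boundaries if n >= b)
        groups[bin_index].append(cls)
    return [g for g in groups if g]  # drop empty bins


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def group_softmax(logits, groups):
    """Normalize logits within each group only; each group's probabilities sum to 1."""
    probs = [0.0] * len(logits)
    for group in groups:
        for cls, p in zip(group, softmax([logits[c] for c in group])):
            probs[cls] = p
    return probs


# Hypothetical per-class instance counts: two head, two medium, two tail classes.
instance_counts = [5000, 3000, 40, 60, 8, 3]
groups = group_classes(instance_counts, boundaries=[10, 100])

logits = [6.0, 5.0, 1.0, 0.5, -1.0, -1.5]
probs = group_softmax(logits, groups)
for group in groups:
    print(group, [round(probs[c], 3) for c in group])
```

Because normalization is local to each bin, the tail classes here compete only with one another, whereas a single softmax over all six logits would give them almost no probability mass.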

preprint2020arXiv

RNA Secondary Structure Prediction By Learning Unrolled Algorithms

In this paper, we propose an end-to-end deep learning model, called E2Efold, for RNA secondary structure prediction which can effectively take into account the inherent constraints in the problem. The key idea of E2Efold is to directly predict the RNA base-pairing matrix, and use an unrolled algorithm for constrained programming as the template for deep architectures to enforce constraints. With comprehensive experiments on benchmark datasets, we demonstrate the superior performance of E2Efold: it predicts significantly better structures compared to previous SOTA (especially for pseudoknotted structures), while being as efficient as the fastest algorithms in terms of inference time.

preprint2020arXiv

Topological Hall effect and the magnetic states of Nowotny chimney ladder compound Cr$_{11}$Ge$_{19}$

We have investigated the magnetic and charge transport properties of single crystals of the Nowotny chimney ladder compound Cr$_{11}$Ge$_{19}$ and mapped out a comprehensive phase diagram reflecting the complicated interplay between the Dzyaloshinskii-Moriya (DM) interaction, the dipolar interaction, and the magnetic anisotropy. We have identified a set of interesting magnetic phases and attributed a finite topological Hall effect to the recently discovered bi-skyrmion phase. These data also suggest the existence of an anti-skyrmion state at finite fields for temperatures just below the magnetic ordering temperature, $T_c$, as indicated by a distinct change in sign of the topological Hall effect. Above $T_c$, we discovered a region of enhanced magnetic response corresponding to a disordered phase likely existing near the ferromagnetic critical point under small magnetic fields. Strong spin chirality fluctuations are demonstrated by the large value of the topological Hall resistivity persisting up to 1 T, which is most likely due to the existence of the DM interaction. We argue that changes to the topological Hall effect correspond to different topological spin textures that are controlled by magnetic dipolar and DM interactions that vary in importance with temperature.

preprint2020arXiv

Towards Structured Prediction in Bioinformatics with Deep Learning

Using machine learning, especially deep learning, to facilitate biological research is a fascinating research direction. However, in addition to the standard classification or regression problems, in bioinformatics, we often need to predict more complex structured targets, such as 2D images and 3D molecular structures. The above complex prediction tasks are referred to as structured prediction. Structured prediction is more complicated than the traditional classification but has much broader applications, considering that most of the original bioinformatics problems have complex output objects. Due to the properties of those structured prediction problems, such as having problem-specific constraints and dependency within the labeling space, the straightforward application of existing deep learning models can lead to unsatisfactory results. Here, we argue that the following ideas can help resolve structured prediction problems in bioinformatics. Firstly, we can combine deep learning with other classic algorithms, such as probabilistic graphical models, which model the problem structure explicitly. Secondly, we can design the problem-specific deep learning architectures or methods by considering the structured labeling space and problem constraints, either explicitly or implicitly. We demonstrate our ideas with six projects from four bioinformatics subfields, including sequencing analysis, structure prediction, function annotation, and network analysis. The structured outputs cover 1D signals, 2D images, 3D structures, hierarchical labeling, and heterogeneous networks. With the help of the above ideas, all of our methods can achieve SOTA performance on the corresponding problems. The success of these projects motivates us to extend our work towards other more challenging but important problems, such as health-care problems, which can directly benefit people's health and wellness.

preprint2020arXiv

Transient Grating Spectroscopy of Photocarrier Dynamics in Semiconducting Polymer Thin Films

While charge carrier dynamics and thermal management are both keys to the operational efficiency and stability for energy-related devices, experimental techniques that can simultaneously characterize both properties are still lacking. In this paper, we use laser-induced transient grating (TG) spectroscopy to characterize thin films of the archetypal organic semiconductor regioregular poly(3-hexylthiophene) (P3HT) and its blends with the electron acceptor [6,6]-phenyl-C61-butyric acid methyl ester (PCBM) on glass substrates. While the thermal response is determined to be dominated by the substrates, we show that the recombination dynamics of photocarriers in the organic semiconductor thin films occur on a similar timescale and can be separated from the thermal response. Our measurements indicate that the photocarrier dynamics are determined by multiple recombination processes and our extracted recombination rates are in good agreement with previous reports using other techniques. We further apply TG spectroscopy to characterize another conjugated polymer and a molecular fluorescent material to demonstrate its general applicability. Our study indicates the potential of transient grating spectroscopy to simultaneously characterize thermal transport and photocarrier dynamics in organic optoelectronic devices.

preprint2020arXiv

Unsupervised Learning for Intrinsic Image Decomposition from a Single Image

Intrinsic image decomposition, which is an essential task in computer vision, aims to infer the reflectance and shading of the scene. It is challenging since it needs to separate one image into two components. To tackle this, conventional methods introduce various priors to constrain the solution, yet with limited performance. Meanwhile, the problem is typically solved by supervised learning methods, which is actually not an ideal solution since obtaining ground truth reflectance and shading for massive general natural scenes is challenging and even impossible. In this paper, we propose a novel unsupervised intrinsic image decomposition framework, which relies on neither labeled training data nor hand-crafted priors. Instead, it directly learns the latent feature of reflectance and shading from unsupervised and uncorrelated data. To enable this, we explore the independence between reflectance and shading, the domain invariant content constraint and the physical constraint. Extensive experiments on both synthetic and real image datasets demonstrate consistently superior performance of the proposed method.

preprint2019arXiv

Spin fluctuation anisotropy as a probe of orbital-selective hole-electron quasiparticle excitations in detwinned Ba(Fe1-xCox)2As2

We use inelastic neutron scattering to study spin excitation anisotropy in mechanically detwinned Ba(Fe1-xCox)2As2 with x = 0.048 and 0.054. Both samples exhibit a tetragonal-to-orthorhombic structural transition at Ts, a collinear static antiferromagnetic (AF) order at wave vector Q1 = QAF = (1, 0) below the Neel temperature TN, and superconductivity below Tc (Ts > TN > Tc). In the high temperature paramagnetic tetragonal phase (T > Ts), spin excitations centered at Q1 and Q2 = (0, 1) are gapless and have four-fold (C4) rotational symmetry. On cooling to below TN but above Tc, spin excitations become highly anisotropic, developing a gap at Q2 but still are gapless at Q1. Upon entering into the superconducting state, a neutron spin resonance appears at Q1 with no magnetic scattering at Q2. By comparing these results with those from angle resolved photoemission spectroscopy experiments, we conclude that the anisotropic shift of the dyz and dxz bands in detwinned Ba(Fe1-xCox)2As2 below Ts is associated with the spin excitation anisotropy, and the superconductivity-induced resonance arises from the electron-hole Fermi surface nesting of quasiparticles with the dyz orbital characters.

preprint2019arXiv

Tuneable terahertz oscillation arising from Bloch-point dynamics in chiral magnets

Skyrmionic textures are being extensively investigated due to the occurrence of novel topological magnetic phenomena and their promising applications in a new generation of spintronic devices that take advantage of the robust topological stability of their spin structures. The development of practical devices relies on a detailed understanding of how skyrmionic structures can be formed, transferred, detected and annihilated. In this work, our considerations go beyond static skyrmions and theoretically show that the formation/annihilation of both skyrmions and antiskyrmions is enabled by the transient creation and propagation of topological singularities (magnetic monopole-like Bloch points). Critically, during the winding/unwinding of skyrmionic textures, our results predict that the Bloch-point propagation will give rise to an emergent electric field in a terahertz frequency range and with substantial amplitude. We also demonstrate ways for controlling Bloch-point dynamics, which directly enable the tuneability on both frequency and amplitude of this signal. Our studies provide a concept of directly exploiting topological singularities for terahertz skyrmion-based electronic devices.

preprint2018arXiv

c-axis pressure induced antiferromagnetic order in optimally P-doped BaFe2(As0.70P0.30)2 superconductor

Superconductivity in BaFe2(As1-xPx)2 iron pnictides emerges when the in-plane two-dimensional (2D) orthorhombic lattice distortion associated with the nematic phase at Ts and the three-dimensional (3D) collinear antiferromagnetic (AF) order at TN (Ts = TN) are gradually suppressed with increasing x, reaching optimal superconductivity around x = 0.30 with Tc $\approx$ 30 K. Here we show that a moderate uniaxial pressure along the c-axis in BaFe2(As0.70P0.30)2 spontaneously induces a 3D collinear AF order with TN = Ts > 30 K, while only slightly suppressing Tc. Although a pressure of ~400 MPa compresses the c-axis lattice while expanding the in-plane lattice and increasing the nearest-neighbor Fe-Fe distance, it barely changes the average iron-pnictogen height in BaFe2(As0.70P0.30)2. Therefore, the pressure-induced AF order must arise from a strong in-plane magnetoelastic coupling, suggesting that the 2D nematic phase is a competing state with superconductivity.

preprint2016arXiv

A Mott insulator continuously connected to iron pnictide superconductors

Iron-based superconductivity develops near an antiferromagnetic order and out of a bad metal normal state, which has been interpreted as originating from a proximate Mott transition. Whether an actual Mott insulator can be realized in the phase diagram of the iron pnictides remains an open question. Here we use transport, transmission electron microscopy, X-ray absorption spectroscopy, and neutron scattering to demonstrate that NaFe$_{1-x}$Cu$_x$As near $x\approx 0.5$ exhibits real-space Fe and Cu ordering and is an antiferromagnetic insulator, with the insulating behavior persisting above the Néel temperature, indicative of a Mott insulator. Upon decreasing $x$ from $0.5$, the antiferromagnetic ordered moment continuously decreases, yielding to superconductivity around $x=0.05$. Our discovery of a Mott insulating state in NaFe$_{1-x}$Cu$_x$As thus makes it the only known Fe-based material in which superconductivity can be smoothly connected to the Mott insulating state, highlighting the important role of electron correlations in the high-$T_{\rm c}$ superconductivity.

preprint2016arXiv

Absence of long wavelength nematic fluctuations in LiFeAs

We investigated long-wavelength nematic fluctuations in an Fe-based superconductor LiFeAs near q=(0.05,0,0) by measuring temperature-dependent renormalization of acoustic phonons through inelastic neutron scattering. We found that the phonons have conventional behavior, as would be expected in the absence of electronic nematic fluctuations. This observation implies that either electron-phonon coupling is too weak to see any effect or that nematic fluctuations are not present.

preprint2016arXiv

Dynamical co-existence of excitons and free carriers in perovskite probed by density-resolved fluorescent spectroscopic method

Using transient fluorescent spectra at time-zero, we develop a density-resolved fluorescent spectroscopic method for investigating photoproducts in CH3NH3PbI3 perovskite and related photophysics. The density dependent dynamical co-existence of excitons and free carriers over a wide density range is experimentally observed for the first time. The exciton binding energy (EB) and the effective mass of electron-hole pair can be estimated based on such co-existence. No ionic polarization is found contributing to photophysical behavior. It also solves the conflict between the large experimentally measured EB and the small predicted values. The spectroscopic method also helps to detect the true free carrier density under continuous illumination without the interference of ionic conductivity. Our methods and results profoundly enrich the study and understanding of the photophysics in perovskite materials for photovoltaic applications.

preprint2016arXiv

Electron doping evolution of the magnetic excitations in NaFe$_{1-x}$Co$_x$As

We use time-of-flight (ToF) inelastic neutron scattering (INS) spectroscopy to investigate the doping dependence of magnetic excitations across the phase diagram of NaFe$_{1-x}$Co$_x$As with $x=0, 0.0175, 0.0215, 0.05,$ and $0.11$. The effect of electron-doping by partially substituting Fe by Co is to form resonances that couple with superconductivity, broaden and suppress low energy ($E\le 80$ meV) spin excitations compared with spin waves in undoped NaFeAs. However, high energy ($E> 80$ meV) spin excitations are weakly Co-doping dependent. Integration of the local spin dynamic susceptibility $χ^{\prime\prime}(ω)$ of NaFe$_{1-x}$Co$_x$As reveals a total fluctuating moment of 3.6 $μ_B^2$/Fe and a small but systematic reduction with electron doping. The presence of a large spin gap in the Co-overdoped nonsuperconducting NaFe$_{0.89}$Co$_{0.11}$As suggests that Fermi surface nesting is responsible for low-energy spin excitations. These results parallel Ni-doping evolution of spin excitations in BaFe$_{2-x}$Ni$_x$As$_2$, confirming the notion that low-energy spin excitations coupling with itinerant electrons are important for superconductivity, while weakly doping dependent high-energy spin excitations result from localized moments.

preprint2016arXiv

Haze Visibility Enhancement: A Survey and Quantitative Benchmarking

This paper provides a comprehensive survey of methods dealing with visibility enhancement of images taken in hazy or foggy scenes. The survey begins by discussing the optical models of atmospheric scattering media and image formation. This is followed by a survey of existing methods, which are grouped into multiple-image methods, polarizing-filter-based methods, methods with known depth, and single-image methods. We also provide a benchmark of a number of well-known single-image methods, based on a recent dataset provided by Fattal and our newly generated scattering media dataset that contains ground truth images for quantitative evaluation. To our knowledge, this is the first benchmark using numerical metrics to evaluate dehazing techniques. This benchmark allows us to objectively compare the results of existing methods and to better identify the strengths and limitations of each method.

preprint2016arXiv

Interplay between multiple charge-density waves and the relationship with superconductivity in Pd$_x$HoTe$_{3}$

HoTe$_{3}$, a member of the rare-earth tritelluride ($R$Te$_{3}$) family, and its Pd-intercalated compounds, Pd$_x$HoTe$_{3}$, where superconductivity (SC) sets in as the charge-density wave (CDW) transition is suppressed by the intercalation of a small amount of Pd, are investigated using angle-resolved photoemission spectroscopy (ARPES) and electrical resistivity. Two incommensurate CDWs with perpendicular nesting vectors are observed in HoTe$_{3}$ at low temperatures. With a slight Pd intercalation ($x$ = 0.01), the large CDW gap decreases and the small one increases. The momentum dependence of the gaps along the inner Fermi surface (FS) evolves from orthorhombicity to near tetragonality, manifesting the competition between the two CDW orders. At $x$ = 0.02, both CDW gaps decrease with the emergence of SC. Further increasing the Pd content to $x$ = 0.04 completely suppresses the CDW instabilities and gives rise to the maximal SC order. The evolution of the electronic structures and electron-phonon couplings (EPCs) of the multiple CDWs upon Pd intercalation is carefully scrutinized. We discuss in detail the interplay between multiple CDW orders and the competition between CDW and SC.

preprint2016arXiv

Orbital selective spin excitations and their impact on superconductivity of LiFe1-xCoxAs

We use neutron scattering to study spin excitations in single crystals of LiFe$_{0.88}$Co$_{0.12}$As, which is located near the boundary of the superconducting phase of LiFe$_{1-x}$Co$_{x}$As and exhibits non-Fermi-liquid behavior indicative of a quantum critical point. By comparing spin excitations of LiFe$_{0.88}$Co$_{0.12}$As with a combined density functional theory (DFT) and dynamical mean field theory (DMFT) calculation, we conclude that wave-vector correlated low-energy spin excitations are mostly from the $d_{xy}$ orbitals, while high-energy spin excitations arise from the $d_{yz}$ and $d_{xz}$ orbitals. Unlike in most iron pnictides, the strong orbital-selective spin excitations in the LiFeAs family cannot be described by an anisotropic Heisenberg Hamiltonian. While the evolution of low-energy spin excitations of LiFe$_{1-x}$Co$_x$As is consistent with the electron-hole Fermi surface nesting condition for the $d_{xy}$ orbital, the reduced superconductivity in LiFe$_{0.88}$Co$_{0.12}$As suggests that Fermi surface nesting conditions for the $d_{yz}$ and $d_{xz}$ orbitals are also important for superconductivity in iron pnictides.

preprint2015arXiv

A Multilevel Newton Iteration Method for Eigenvalue Problems

We propose a new type of multilevel method for solving eigenvalue problems based on Newton iteration. With the proposed iteration method, solving an eigenvalue problem on the finest finite element space is replaced by solving a small-scale eigenvalue problem in a coarse space and solving a series of augmented linear problems, derived from the Newton step, in the corresponding series of finite element spaces. This iteration scheme improves the overall efficiency of the finite element method for solving eigenvalue problems. Finally, some numerical examples are provided to validate the efficiency of the proposed numerical scheme.

preprint2015arXiv

Direct Observation of Long Electron-Hole Diffusion Distance in CH3NH3PbI3 Perovskite Thin Film

In high-performance perovskite-based solar cells, CH3NH3PbI3 is the key material. We carried out a study of charge diffusion in spin-coated CH3NH3PbI3 perovskite thin films by transient fluorescent spectroscopy. A thickness-dependent fluorescent lifetime was found. By coating the film with an electron or hole transfer layer, [6,6]-phenyl-C61-butyric acid methyl ester (PCBM) or 2,2,7,7-tetrakis(N,N-di-p-methoxyphenylamine)-9,9-spirobifluorene (Spiro-OMeTAD) respectively, we observed the charge transfer directly through the fluorescence quenching. A one-dimensional diffusion model was applied to obtain long charge diffusion distances in thick films, which are ~1.7 μm for electrons and up to ~6.3 μm for holes. A short diffusion distance of a few hundred nanometers was also observed in thin films. This thickness-dependent charge diffusion explains the formerly reported short charge diffusion distance (~100 nm) in films and resolves its conflict with the thick working layer (300-500 nm) in real devices. This study presents direct support for high-performance perovskite solar cells and will benefit device design.

preprint2015arXiv

Explicit solutions, conservation laws of the extended (2+1)-dimensional Jaulent-Miodek equation

By applying the direct symmetry method, the symmetry reductions and some new group invariant solutions are obtained. We derive some exact solutions by using the relationship between the new solutions and the old ones, including Weierstrass periodic solutions, elliptic periodic solutions, triangular function solutions and so on. To illustrate the characteristics and properties of these solutions, we give figures of some of them. In addition, we give the conservation laws of the extended (2+1)-dimensional Jaulent-Miodek equation. Finally, we draw conclusions and discuss the results.

preprint2015arXiv

Neutron spin resonance as a probe of superconducting gap anisotropy in partially detwinned electron underdoped NaFe$_{0.985}$Co$_{0.015}$As

We use inelastic neutron scattering (INS) to study the spin excitations in partially detwinned NaFe$_{0.985}$Co$_{0.015}$As, which has coexisting static antiferromagnetic (AF) order and superconductivity ($T_c=15$ K, $T_N=30$ K). In previous INS work on a twinned sample, spin excitations form a dispersive sharp resonance near $E_{r1}=3.25$ meV and a broad dispersionless mode at $E_{r2}=6$ meV at the AF ordering wave vector ${\bf Q}_{\rm AF}={\bf Q}_1=(1,0)$ and its twinned domain ${\bf Q}_2=(0,1)$. For partially detwinned NaFe$_{0.985}$Co$_{0.015}$As with the static AF order mostly occurring at ${\bf Q}_{\rm AF}=(1,0)$, we still find a double resonance at both wave vectors with similar intensity. Since ${\bf Q}_1=(1,0)$ characterizes the explicit breaking of the spin rotational symmetry associated with the AF order, these results indicate that the double resonance cannot be due to the static and fluctuating AF orders, but originates from the superconducting gap anisotropy.

preprint2015arXiv

What is Cook's theorem?

In this paper, we make a preliminary interpretation of Cook's theorem presented in [1]. This interpretation reveals cognitive biases in the proof of Cook's theorem that arise from the attempt to construct a formula in CNF to represent a computation of a nondeterministic Turing machine. Such cognitive biases are due to a lack of understanding of the essence of nondeterminism; they lead to confusion between different levels of nondeterminism and determinism, and thus cause the loss of nondeterminism from NP-completeness theory. The work shows that Cook's theorem is the origin of the loss of nondeterminism in terms of the equivalence of the two definitions of NP, the one defining NP as the class of problems solvable by a nondeterministic Turing machine in polynomial time, and the other defining NP as the class of problems verifiable by a deterministic Turing machine in polynomial time. Therefore, we argue that the fundamental difficulties in understanding P versus NP lie first at the cognitive level, then at the logical level.

preprint2015arXiv

What is NP? - Interpretation of a Chinese paradox "white horse is not horse"

The notion of nondeterminism has disappeared from the current definition of NP, which has led to ambiguities in understanding NP and caused fundamental difficulties in studying the relation P versus NP. In this paper, we question the equivalence of the two definitions of NP, the one defining NP as the class of problems solvable by a nondeterministic Turing machine in polynomial time, and the other defining NP as the class of problems verifiable by a deterministic Turing machine in polynomial time, and reveal cognitive biases in this equivalence. Inspired by the famous Chinese paradox "white horse is not horse", we further analyze these cognitive biases. The work shows that these cognitive biases arise from the confusion between different levels of nondeterminism and determinism, due to a lack of understanding of the essence of nondeterminism. Therefore, we argue that the fundamental difficulties in understanding P versus NP lie first at the cognitive level, then at the logical level.

preprint2014arXiv

A Multigrid Method Based On Shifted-Inverse Power Technique for Eigenvalue Problems

A multigrid method is proposed in this paper to solve eigenvalue problems by the finite element method based on the shifted-inverse power iteration technique. With this scheme, solving an eigenvalue problem is transformed into solving a series of nonsingular boundary value problems on multilevel meshes. Because the difficult eigenvalue solve is replaced by the easier solution of boundary value problems, the multigrid approach can improve the overall efficiency of solving the eigenvalue problem. Some numerical experiments are presented to validate the efficiency of this new method.

preprint2014arXiv

Local and Parallel Finite Element Algorithm Based On Multilevel Discretization for Eigenvalue Problem

A local and parallel algorithm based on multilevel discretization is proposed in this paper to solve the eigenvalue problem by the finite element method. With this new scheme, solving the eigenvalue problem on the finest grid is reduced to solving the eigenvalue problem on the coarsest mesh and a series of boundary value problems by using the local and parallel algorithm. The computational work on each processor can reach the optimal order. Therefore, this type of multilevel local and parallel method improves the overall efficiency of solving the eigenvalue problem. Some numerical experiments are presented to validate the efficiency of the new method.

preprint2014arXiv

Tail Behavoir of sums of random components

A class of stochastic processes strongly related to random sums plays an important role in networks and in finance. In this paper, we study this kind of stochastic process, discuss a parameter that remains unchanged over time, and reveal its asymptotic behavior.

preprint2013arXiv

Gröbner-Shirshov bases for some Lie algebras

We give Gröbner-Shirshov bases for the Drinfeld-Kohno Lie algebra $\textbf{L}_{n}$ in \cite{[Et]} and the Kukin Lie algebra $A_P$ in \cite{Kukin}, where $P$ is a semigroup. As applications, we show that $\textbf{L}_{n}$ is free as a $\mathbb{Z}$-module and give a $\mathbb{Z}$-basis of $\textbf{L}_{n}$. We give another proof of Kukin's theorem: if the semigroup $P$ has an undecidable word problem, then the Lie algebra $A_P$ has the same property.

preprint2013arXiv

Some remarks for the Akivis algebras and the Pre-Lie algebras

In this paper, by using the Composition-Diamond lemma for non-associative algebras invented by A. I. Shirshov in 1962, we give Gröbner-Shirshov bases for free Pre-Lie algebras and the universal enveloping non-associative algebra of an Akivis algebra, respectively. As applications, we show I.P. Shestakov's result that any Akivis algebra is linear and D. Segal's result that the set of all good words in $X^{**}$ forms a linear basis of the free Pre-Lie algebra $PLie(X)$ generated by the set $X$. For completeness, we give the details of the proof of Shirshov's Composition-Diamond lemma for non-associative algebras.

preprint2012arXiv

A Parallel Method for Population Balance Equations Based on the Method of Characteristics

In this paper, we present a parallel scheme to solve the population balance equations based on the method of characteristics and the finite element discretization. The application of the method of characteristics transforms the higher-dimensional population balance equation into a series of lower-dimensional convection-diffusion-reaction equations which can be solved in parallel. Some numerical results are presented to show the accuracy and efficiency.

preprint2011arXiv

Adiabatic and non-adiabatic perturbations for loop quantum cosmology

We generalize the perturbation theory of loop quantum cosmology to a hydrodynamical form and define an effective curvature perturbation on uniform-density hypersurfaces, $ζ_e$. As in classical cosmology, $ζ_e$ should be gauge-invariant and conserved on large scales. The evolution of both the adiabatic and the non-adiabatic perturbations for a multi-fluid model is investigated in the framework of the effective hydrodynamical theory of loop quantum cosmology with the inverse triad correction. We find that, unlike in classical cosmology, the evolution of the large-scale non-adiabatic entropy perturbation can be driven by an adiabatic curvature perturbation, and this adiabatic source for the non-adiabatic perturbation is a quantum effect. As an application of the formalism, we study a decay model and give numerical results.

preprint2011arXiv

Application of higher order holonomy corrections to perturbation theory of cosmology

Applying the higher-order holonomy corrections to the perturbation theory of cosmology, the lattice power law of Loop Quantum Cosmology, $\tilde{μ}\propto p^β$, is analysed and the range of $β$ is determined to be [-1,0], which is different from the conventional range $-0.1319>β\geq-5/2$ \cite{lqct}. At the same time, we find that there is an anomaly-free condition in this theory, and we obtain this condition in the vector and tensor modes. We also find that the nonzero mass of the gravitational wave essentially results from the quantum nature of the Riemannian geometry of loop quantum gravity.

preprint2011arXiv

Convergence of Perturbations for a Big Bounce in Loop Quantum Cosmology

We investigate the convergence behavior of the scalar and the vector perturbations for a big bounce phase in loop quantum cosmology. Two models are discussed: one is a universe filled by a massless scalar field; the other is a toy model which is radiation-dominated in the asymptotic past and future. We find that the Bardeen potential of the scalar mode is well behaved near both the bounce point and the transition point of the null energy condition; moreover, the unlimited growth of the vector perturbation can be avoided in our bounce model. This is different from the bounce models in pure general relativity. We also find that the maximum of an observable vector mode is inversely proportional to the square of the minimum scale factor $a_{bounce}$. This conclusion is independent of the bounce model, and we may conclude that the bounce in loop quantum cosmology is reasonable.

preprint2011arXiv

Genus one open books with non-left-orderable fundamental group

Let $Y$ be a closed, connected, orientable three-manifold admitting a genus one open book decomposition with one boundary component. We prove that if $Y$ is an L-space, then the fundamental group of $Y$ is not left-orderable. This answers a question posed by John Baldwin.

preprint2011arXiv

Gröbner-Shirshov bases for categories

In this paper we establish the Composition-Diamond lemma for small categories. We give Gröbner-Shirshov bases for the simplicial category and the cyclic category.

preprint2011arXiv

Lyndon-Shirshov basis and anti-commutative algebras

Chen, Fox, Lyndon 1958 \cite{CFL58} and Shirshov 1958 \cite{Sh58} independently introduced non-associative Lyndon-Shirshov words and proved that they form a linear basis of a free Lie algebra. In this paper we give another approach to the definition of the Lyndon-Shirshov basis, i.e., we find an anti-commutative Gröbner-Shirshov basis $S$ of a free Lie algebra such that $Irr(S)$ is the set of all non-associative Lyndon-Shirshov words, where $Irr(S)$ is the set of all monomials of $N(X)$, a basis of the free anti-commutative algebra on $X$, not containing maximal monomials of polynomials from $S$. It follows from Shirshov's anti-commutative Gröbner-Shirshov bases theory \cite{S62a2} that the set $Irr(S)$ is a linear basis of a free Lie algebra.

preprint2009arXiv

Composition-Diamond lemma for differential algebras

In this paper, we establish the Composition-Diamond lemma for free differential algebras. As applications, we give Groebner-Shirshov bases for free Lie-differential algebra and free commutative-differential algebra, respectively.

preprint2009arXiv

Gröbner--Shirshov bases for Vinberg--Koszul--Gerstenhaber right-symmetric algebras

In this paper, we establish the Composition-Diamond lemma for right-symmetric algebras. As an application, we give a Gröbner-Shirshov basis for the universal enveloping right-symmetric algebra of a Lie algebra.

preprint2008arXiv

Anti-commutative Groebner-Shirshov basis of a free Lie algebra

One of the natural ways to prove that the Hall words (Philip Hall, 1933) form a basis of a free Lie algebra is a direct construction: to start with a linear space spanned by Hall words, to define the Lie product of Hall words, and then to check that the product satisfies the Lie identities (Marshall Hall, 1950). Here we suggest another way, using the Composition-Diamond lemma for free anti-commutative (non-associative) algebras (A.I. Shirshov, 1962).

74 published item(s)

Simply Stabilizing the Loop via Fully Looped Transformer

Drug Synergistic Combinations Predictions via Large-Scale Pre-Training and Graph Structure Learning

Heat kernel on Ricci shrinkers (II)

"Think Before You Speak": Improving Multi-Action Dialog Policy by Planning Single-Action Dialogs

A physics and data co-driven surrogate modeling approach for temperature field prediction on irregular geometric domain

A Virtual Reality-based Training and Assessment System for Bridge Inspectors with an Assistant Drone

Active Learning for Open-set Annotation

Enhanced two-component superconductivity in CoSi2/TiSi2 heterojunctions

Genome-wide nucleotide-resolution model of single-strand break site reveals species evolutionary hierarchy

Hot-Refresh Model Upgrades with Regression-Alleviating Compatible Training in Image Retrieval

Interpretable RNA Foundation Model from Unannotated Data for Highly Accurate RNA Structure and Function Predictions

Knowledge-Grounded Dialogue Generation with a Unified Knowledge Representation

MixDefense: A Defense-in-Depth Framework for Adversarial Example Detection Based on Statistical and Semantic Analysis

Multipole-fluctuation pairing mechanism of $d_{x^2-y^2}+ig$ superconductivity in Sr$_2$RuO$_4$

Neutron Scattering Study of Fluctuating and Static Spin Correlations in the Anisotropic Spin Glass Fe$_2$TiO$_5$

Parameter-robust Braess-Sarazin-type smoothers for linear elasticity problems

Rethinking Knowledge Distillation via Cross-Entropy

Semantic Guided Single Image Reflection Removal

Temporally Efficient Vision Transformer for Video Instance Segmentation

Towards explainable artificial intelligence (XAI) for early anticipation of traffic accidents

Towards Vivid and Diverse Image Colorization with Generative Color Prior

Unseasonal super ionospheric plasma bubble and scintillations seeded by the 2022 Tonga Volcano Eruption related perturbations

Using Chatbots to Teach Languages

What You See is Not What the Network Infers: Detecting Adversarial Examples Based on Semantic Contradiction

CLiMP: A Benchmark for Chinese Language Model Evaluation

On the structure of Ricci shrinkers

Orbital-Selective High-Temperature Cooper Pairing Developed in the Two-Dimensional Limit

Two-Stage Single Image Reflection Removal with Reflection-Aware Guidance

A new generalized inverse of matrices from core-EP decomposition

Ancient solutions to the Ricci flow with isotropic curvature conditions

Attention Guided Low-light Image Enhancement with a Large Scale Low-light Simulation Dataset

Classification Calibration for Long-tail Instance Segmentation

Dual Semantic Fusion Network for Video Object Detection

Fast Video Object Segmentation using the Global Context Module

Hyperspectral City V1.0 Dataset and Benchmark

Learning to Stop While Learning to Predict

On the regular-convexity of Ricci shrinker limit spaces

Overcoming Classifier Imbalance for Long-tail Object Detection with Balanced Group Softmax

RNA Secondary Structure Prediction By Learning Unrolled Algorithms

Topological Hall effect and the magnetic states of Nowotney chimney ladder compound Cr$_{11}$Ge$_{19}$

Towards Structured Prediction in Bioinformatics with Deep Learning

Transient Grating Spectroscopy of Photocarrier Dynamics in Semiconducting Polymer Thin Films

Unsupervised Learning for Intrinsic Image Decomposition from a Single Image

Spin fluctuation anisotropy as a probe of orbital-selective hole-electron quasiparticle excitations in detwinned Ba(Fe1-xCox)2As2

Tuneable terahertz oscillation arising from Bloch-point dynamics in chiral magnets

c-axis pressure induced antiferromagnetic order in optimally P-doped BaFe2(As0.70P0.30)2 superconductor

A Mott insulator continuously connected to iron pnictide superconductors

Absence of long wavelength nematic fluctuations in LiFeAs

Dynamical co-existence of excitons and free carriers in perovskite probed by density-resolved fluorescent spectroscopic method

Electron doping evolution of the magnetic excitations in NaFe$_{1-x}$Co$_x$As

Haze Visibility Enhancement: A Survey and Quantitative Benchmarking

Interplay between multiple charge-density waves and the relationship with superconductivity in Pd$_x$HoTe$_{3}$

Orbital selective spin excitations and their impact on superconductivity of LiFe1-xCoxAs

A Multilevel Newton Iteration Method for Eigenvalue Problems

Direct Observation of Long Electron-Hole Diffusion Distance in CH3NH3PbI3 Perovskite Thin Film

Explicit solutions, conservation laws of the extended (2+1)-dimensional Jaulent-Miodek equation

Neutron spin resonance as a probe of superconducting gap anisotropy in partially detwinned electron underdoped NaFe$_{0.985}$Co$_{0.015}$As

What is Cook's theorem?

What is NP? - Interpretation of a Chinese paradox "white horse is not horse"

A Multigrid Method Based On Shifted-Inverse Power Technique for Eigenvalue Problems

Local and Parallel Finite Element Algorithm Based On Multilevel Discretization for Eigenvalue Problem

Tail Behavoir of sums of random components

Gröbner-Shirshov bases for some Lie algebras

Some remarks for the Akivis algebras and the Pre-Lie algebras

A Parallel Method for Population Balance Equations Based on the Method of Characteristics

Adiabatic and non-adiabatic perturbations for loop quantum cosmology

Application of higher order holonomy corrections to perturbation theory of cosmology

Convergence of Perturbations for a Big Bounce in Loop Quantum Cosmology

Genus one open books with non-left-orderable fundamental group

Gröbner-Shirshov bases for categories

Lyndon-Shirshov basis and anti-commutative algebras

Composition-Diamond lemma for differential algebras

Gröbner--Shirshov bases for Vinberg--Koszul--Gerstenhaber right-symmetric algebras

Anti-commutative Groebner-Shirshov basis of a free Lie algebra