Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
64 works
0 followers
37 topics
4 close collaborators

Published work

64 published item(s)

preprint · 2026 · arXiv

From Flat Language Labels to Typological Priors: Structured Language Conditioning for Multilingual Speech-to-Speech Translation

Compositional speech-to-speech translation (S2ST) systems built upon speech large language models (SpeechLLMs) have recently shown promising performance. However, existing S2ST systems often either neglect source-language information or encode it through a language-as-label paradigm, representing each source language as an independent flat embedding. Such a design overlooks systematic linguistic structure shared across languages, which may limit data-efficient multilingual adaptation when supervised S2ST data are scarce. To address this issue, we propose S2ST-Omni 2, a many-to-one compositional S2ST framework that systematically reformulates multilingual language conditioning from flat language labels to structured typological priors. Specifically, S2ST-Omni 2 revisits language conditioning at three levels: typology-informed hierarchical language encoding for structured source-language representation, dynamically-gated language-aware Dual-CTC for content-adaptive acoustic modulation, and typology-aware LLM prompting for decoder-side linguistic guidance. Experiments on CVSS-C show that S2ST-Omni 2 achieves superior average performance among representative S2ST approaches across BLEU, COMET, ASR-BLEU, and BLASER 2.0 under the adopted evaluation protocol. Ablation studies indicate that the proposed representation-level, acoustic-level, and decoding-level strategies provide complementary benefits. Moreover, controlled data-budget analyses and a Japanese-to-English evaluation using only approximately 3 hours of supervised training data suggest that explicit typological priors provide useful inductive biases for data-efficient multilingual S2ST.

preprint · 2026 · arXiv

Generalized $2$-split for higher-derivative YM and GR amplitudes at tree-level

We study the generalized $2$-split of higher-derivative amplitudes, including Yang-Mills (YM) and Gravity (GR) amplitudes with special insertions of higher-derivative vertices, by expanding them into ${\rm YM}\oplus{\rm BAS}$, ${\rm GR}\oplus{\rm YM}$, and ${\rm GR}\oplus{\rm YM}\oplus{\rm BAS}$ amplitudes, respectively. By leveraging the established $2$-split properties of these constituent theories, we show that these higher-derivative amplitudes -- which also exhibit another newly discovered phenomenon called hidden zero -- do not factorize into a single product of two currents. Instead, their factorization universally appears as a sum of multiple $2$-split contributions.

preprint · 2026 · arXiv

High-Q AlN microresonators for nonlinear near-infrared and near-visible photonics

High Q-factors of microresonators are crucial for nonlinear integrated photonics, as many nonlinear dynamics have quadratic or even cubic dependence on Q-factors. The unique material properties make AlN microresonators invaluable for microcomb generation, Raman lasing and visible integrated photonics. However, the loss level of AlN falls behind other integrated platforms. By optimizing the fabrication, we demonstrate record Q-factors of 5.4$\times$10$^6$ and 2.2$\times$10$^6$ for AlN microresonators in the near-infrared and near-visible, respectively. Polarized-mode interaction was used to create anomalous dispersion to support bright AlN Dirac solitons. Measurement of polarization-dependent spectra reveals the polarization hybridization of the Dirac soliton. In a microresonator with normal dispersion, Raman-assisted four-wave mixing (RFWM) was observed to initiate platicon formation, adding an approach to generate normal-dispersion microcombs. A design of width-varying waveguides was used to ensure both efficient coupling and high Q-factor for racetrack microresonators at 780 nm. The microresonator was pumped to generate a near-visible Raman laser at 820 nm with a fundamental linewidth narrower than 220 Hz. Our work unlocks new opportunities for integrated AlN photonics by improving Q-factors and uncovering nonlinear dynamics in AlN microresonators.

preprint · 2026 · arXiv

Improved LLM Agents for Financial Document Question Answering

Large language models (LLMs) have shown impressive capabilities on numerous natural language processing tasks. However, LLMs still struggle with numerical question answering for financial documents that include tabular and textual data. Recent works have shown the effectiveness of critic agents (i.e., self-correction) for this task given oracle labels. Building upon this framework, this paper examines the effectiveness of the traditional critic agent when oracle labels are not available, and shows, through experiments, that this critic agent's performance deteriorates in this scenario. With this in mind, we present an improved critic agent, along with the calculator agent which outperforms the previous state-of-the-art approach (program-of-thought) and is safer. Furthermore, we investigate how our agents interact with each other, and how this interaction affects their performance.

preprint · 2026 · arXiv

POLYCHARTQA: Benchmarking Large Vision-Language Models with Multilingual Chart Question Answering

Charts are a universally adopted medium for data communication, yet existing chart understanding benchmarks are overwhelmingly English-centric, limiting their accessibility and relevance to global audiences. To address this limitation, we introduce PolyChartQA, the first large-scale multilingual benchmark for chart question answering, comprising 22,606 charts and 26,151 QA pairs across 10 diverse languages. PolyChartQA is constructed through a scalable pipeline that enables efficient multilingual chart generation via data translation and code reuse, supported by LLM-based translation and rigorous quality control. We systematically evaluate multilingual chart understanding with PolyChartQA on state-of-the-art LVLMs and reveal a significant performance gap between English and other languages, particularly low-resource ones. Additionally, we introduce a companion multilingual chart question answering training set, PolyChartQA-Train, on which fine-tuning LVLMs yields substantial gains in multilingual chart understanding across diverse model sizes and architectures. Together, our benchmark provides a foundation for developing globally inclusive vision-language models capable of understanding charts across diverse linguistic contexts.

preprint · 2026 · arXiv

X-ray and radio observations of the AMXP MAXI J1957+032 covering the 2022-2025 outbursts

We presented a comprehensive multi-epoch timing and multiwavelength analysis of the accreting millisecond X-ray pulsar MAXI J1957+032, covering two major outbursts in 2022 and 2025. By reanalyzing the 2022 outburst data from the Neutron Star Interior Composition Explorer (NICER), we found the spin frequency and orbital parameters from the observations in 0.3-5 keV. For the 2025 outburst, we reported the detection of pulsations with the Einstein Probe (EP). Based on the $\sim$3-year baseline between these two outbursts, we measured a significant long-term spin-down rate of $\dot{\nu} = (-5.73 \pm 0.28) \times 10^{-14}~{\rm Hz~s^{-1}}$. Assuming that the quiescent spin-down is driven by magnetic dipole radiation, we inferred a spin-down luminosity of $L \approx 1.1 \times 10^{36}~{\rm erg~s^{-1}}$ and a surface dipolar magnetic field of $B \approx (7.3 - 10.4) \times 10^8$ G. Furthermore, we conducted a deep radio pulsation search with the Five-hundred-meter Aperture Spherical radio Telescope (FAST) during the X-ray quiescent state in 2024, resulting in a non-detection with a 7$\sigma$ flux density upper limit of 12.3 $\mu$Jy. This corresponds to a radio efficiency upper limit of $\xi < 2.8 \times 10^{-10}$, which is significantly lower than that of typical millisecond pulsars with a similar spin-down power. This profound radio pulsation faintness can be explained by two primary scenarios: either a geometric effect, wherein the pulsar's radio beam is directed away from our line of sight, or a physical suppression of the emission mechanism, potentially caused by a persistent low-level accretion flow during the X-ray quiescent state.
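
For orientation, the textbook relations that link a measured spin-down rate to a spin-down luminosity and a dipole field can be sketched as follows. This is a hedged illustration, not the paper's calculation: the moment of inertia $I$, stellar radius $R$, and magnetic inclination $\alpha$ are assumed symbols that the abstract does not specify, and the vacuum-dipole formula is only one of several prescriptions the authors may have used (the quoted range for $B$ suggests more than one set of assumptions).

```latex
L_{\rm sd} = 4\pi^{2} I \,\nu\,|\dot{\nu}|, \qquad
B \sin\alpha \simeq \left( \frac{3 c^{3} I\, |\dot{\nu}|}{8\pi^{2} R^{6}\, \nu^{3}} \right)^{1/2}
```

Here $\nu$ is the spin frequency, $\dot{\nu}$ the spin-down rate, and $B$ the surface dipole field at the magnetic equator. Both expressions follow from equating the loss of rotational energy, $-\frac{d}{dt}\left(2\pi^{2} I \nu^{2}\right)$, with magnetic dipole radiation.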

preprint · 2024 · arXiv

A multimodal gesture recognition dataset for desktop human-computer interaction

Gesture recognition is an indispensable component of natural and efficient human-computer interaction technology, particularly in desktop-level applications, where it can significantly enhance people's productivity. However, the current gesture recognition community lacks a suitable desktop-level (top-view perspective) dataset for lightweight gesture capture devices. In this study, we have established a dataset named GR4DHCI. What distinguishes this dataset is its inherent naturalness, intuitive characteristics, and diversity. Its primary purpose is to serve as a valuable resource for the development of desktop-level portable applications. GR4DHCI comprises over 7,000 gesture samples and a total of 382,447 frames for both Stereo IR and skeletal modalities. We also address the variances in hand positioning during desktop interactions by incorporating 27 different hand positions into the dataset. Building upon the GR4DHCI dataset, we conducted a series of experimental studies, the results of which demonstrate that the fine-grained classification blocks proposed in this paper can enhance the model's recognition accuracy. Our dataset and experimental findings presented in this paper are anticipated to propel advancements in desktop-level gesture recognition research.

preprint · 2024 · arXiv

Flowmind2Digital: The First Comprehensive Flowmind Recognition and Conversion Approach

Flowcharts and mind maps, collectively known as flowmind, are vital in daily activities, with hand-drawn versions facilitating real-time collaboration. However, there's a growing need to digitize them for efficient processing. Automated conversion methods are essential to overcome manual conversion challenges. Existing sketch recognition methods face limitations in practical situations, being field-specific and lacking digital conversion steps. Our paper introduces the Flowmind2digital method and hdFlowmind dataset to address these challenges. Flowmind2digital, utilizing neural networks and keypoint detection, achieves a record 87.3% accuracy on our dataset, surpassing previous methods by 11.9%. The hdFlowmind dataset, comprising 1,776 annotated flowminds across 22 scenarios, outperforms existing datasets. Additionally, our experiments emphasize the importance of simple graphics, enhancing accuracy by 9.3%.

preprint · 2022 · arXiv

A Densely Connected Criss-Cross Attention Network for Document-level Relation Extraction

Document-level relation extraction (RE) aims to identify relations between two entities in a given document. Compared with its sentence-level counterpart, document-level RE requires complex reasoning. Previous research normally completed reasoning through information propagation on the mention-level or entity-level document-graph, but rarely considered reasoning at the entity-pair-level. In this paper, we propose a novel model, called Densely Connected Criss-Cross Attention Network (Dense-CCNet), for document-level RE, which can complete logical reasoning at the entity-pair-level. Specifically, the Dense-CCNet performs entity-pair-level logical reasoning through the Criss-Cross Attention (CCA), which can collect contextual information in horizontal and vertical directions on the entity-pair matrix to enhance the corresponding entity-pair representation. In addition, we densely connect multiple layers of the CCA to simultaneously capture the features of single-hop and multi-hop logical reasoning. We evaluate our Dense-CCNet model on three public document-level RE datasets, DocRED, CDR, and GDA. Experimental results demonstrate that our model achieves state-of-the-art performance on these three datasets.
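
To make the idea of collecting context "in horizontal and vertical directions on the entity-pair matrix" concrete, here is a minimal pure-Python sketch of a single criss-cross attention pass. It is an illustration under assumptions, not the Dense-CCNet implementation: the function names, the single-head scaled dot-product form, the residual connection, and the projection matrices `Wq`, `Wk`, `Wv` are all hypothetical choices made for this example.

```python
import math


def project(x, W):
    """Multiply a feature vector x (length d) by a d x d matrix W."""
    return [sum(x[k] * W[k][c] for k in range(len(x))) for c in range(len(W[0]))]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def criss_cross_attention(H, Wq, Wk, Wv):
    """One criss-cross attention pass over an entity-pair matrix.

    H is an n x n grid of feature vectors; H[i][j] represents entity pair (i, j).
    Wq, Wk, Wv are hypothetical d x d projection matrices (assumptions).
    Each cell attends only to cells in its own row and its own column.
    """
    n, d = len(H), len(H[0][0])
    Q = [[project(H[i][j], Wq) for j in range(n)] for i in range(n)]
    K = [[project(H[i][j], Wk) for j in range(n)] for i in range(n)]
    V = [[project(H[i][j], Wv) for j in range(n)] for i in range(n)]
    out = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Criss-cross context: all of row i, plus column j without (i, j) itself.
            ctx = [(i, c) for c in range(n)] + [(r, j) for r in range(n) if r != i]
            scores = [
                sum(q * k for q, k in zip(Q[i][j], K[r][c])) / math.sqrt(d)
                for r, c in ctx
            ]
            w = softmax(scores)
            attended = [
                sum(w[t] * V[r][c][k] for t, (r, c) in enumerate(ctx))
                for k in range(d)
            ]
            # Residual connection keeps the original pair feature.
            out[i][j] = [h + a for h, a in zip(H[i][j], attended)]
    return out
```

After one pass, a cell only sees pairs that share an entity with it (same row or column). Applying the function again to its own output lets information from any cell reach any other through an intermediate cell, which is one plausible reading of how stacked layers relate to single-hop versus multi-hop reasoning.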

preprint · 2022 · arXiv

A Masked Image Reconstruction Network for Document-level Relation Extraction

Document-level relation extraction aims to extract relations among entities within a document. Compared with its sentence-level counterpart, document-level relation extraction requires inference over multiple sentences to extract complex relational triples. Previous research normally completes reasoning through information propagation on the mention-level or entity-level document-graphs, regardless of the correlations between the relationships. In this paper, we propose a novel Document-level Relation Extraction model based on a Masked Image Reconstruction network (DRE-MIR), which models inference as a masked image reconstruction problem to capture the correlations between relationships. Specifically, we first leverage an encoder module to get the features of entities and construct the entity-pair matrix based on the features. After that, we treat the entity-pair matrix as an image, randomly mask it, and restore it through an inference module to capture the correlations between the relationships. We evaluate our model on three public document-level relation extraction datasets, i.e. DocRED, CDR, and GDA. Experimental results demonstrate that our model achieves state-of-the-art performance on these three datasets and has excellent robustness against noise during the inference process.

preprint · 2022 · arXiv

A Roadmap for Big Model

With the rapid development of deep learning, training Big Models (BMs) for multiple downstream tasks becomes a popular paradigm. Researchers have achieved various outcomes in the construction of BMs and the BM application in many fields. At present, there is a lack of research work that sorts out the overall progress of BMs and guides the follow-up research. In this paper, we cover not only the BM technologies themselves but also the prerequisites for BM training and applications with BMs, dividing the BM review into four parts: Resource, Models, Key Technologies and Application. We introduce 16 specific BM-related topics in those four parts: Data, Knowledge, Computing System, Parallel Training System, Language Model, Vision Model, Multi-modal Model, Theory & Interpretability, Commonsense Reasoning, Reliability & Security, Governance, Evaluation, Machine Translation, Text Generation, Dialogue and Protein Research. In each topic, we clearly summarize the current studies and propose some future research directions. At the end of this paper, we conclude with a more general view of the further development of BMs.

preprint · 2022 · arXiv

Analysis and visualization of spatial transcriptomic data

Human and animal tissues consist of heterogeneous cell types that organize and interact in highly structured manners. Bulk and single-cell sequencing technologies remove cells from their original microenvironments, resulting in a loss of spatial information. Spatial transcriptomics is a recent technological innovation that measures transcriptomic information while preserving spatial information. Spatial transcriptomic data can be generated in several ways. RNA molecules are measured by in situ sequencing, in situ hybridization, or spatial barcoding to recover original spatial coordinates. The inclusion of spatial information expands the range of possibilities for analysis and visualization, and has spurred the development of numerous novel methods. In this review, we summarize the core concepts of spatial genomics technology and provide a comprehensive review of current analysis and visualization methods for spatial transcriptomics.

preprint · 2022 · arXiv

Approaching the Fundamental Limit of Orbital Angular Momentum Multiplexing Through a Hologram Metasurface

Establishing and approaching the fundamental limit of orbital angular momentum (OAM) multiplexing are necessary and increasingly urgent for current multiple-input multiple-output research. In this work, we elaborate the fundamental limit in terms of independent scattering channels (or degrees of freedom of scattered fields) through angular-spectral analysis, in conjunction with a rigorous Green function method. The scattering channel limit is universal for arbitrary spatial mode multiplexing, which is launched by a planar electromagnetic device, such as an antenna or a metasurface, with a predefined physical size. As a proof of concept, we demonstrate both theoretically and experimentally the limit by a metasurface hologram that transforms orthogonal OAM modes to plane-wave modes scattered at critically separated angular-spectral regions. Particularly, a minimax optimization algorithm is applied to suppress angular spectrum aliasing, achieving good performance in both full-wave simulation and experimental measurement at microwave frequencies. This work offers a theoretical upper bound and corresponding approach route for engineering designs of OAM multiplexing.

preprint · 2022 · arXiv

Birth places of extreme ultraviolet waves driven by impingement of solar jets upon coronal loops

Solar extreme ultraviolet (EUV) waves are large-scale propagating disturbances in the corona. It is generally believed that the vital key for the formation of EUV waves is the rapid expansion of the loops that overlie erupting cores in solar eruptions, such as coronal mass ejections (CMEs) and solar jets. However, the details of the interaction between the erupting cores and overlying loops are not clear, because the overlying loops are always instantly opened after the energetic eruptions. Here, we present three typical jet-driven EUV waves without CMEs to study the interaction between the jets and the overlying loops that remained closed during the events. All three jets emanated from magnetic flux cancelation sites in source regions. Interestingly, after the interactions between jets and overlying loops, three EUV waves respectively formed ahead of the top, the near end (close to the jet source), and the far (another) end of the overlying loops. According to the magnetic field distribution of the loops extrapolated from the Potential Field Source Surface method, it is confirmed that the birth places of three jet-driven EUV waves were around the weakest magnetic field strength part of the overlying loops. We suggest that the jet-driven EUV waves preferentially occur at the weakest part of the overlying loops, and the location can be subject to the magnetic field intensity around the ends of the loops.

preprint · 2022 · arXiv

CATCH: Chasing All Transients Constellation Hunters Space Mission

In time-domain astronomy, a substantial number of transients will be discovered by multi-wavelength and multi-messenger observatories, posing a great challenge for follow-up capabilities. We have thus proposed an intelligent X-ray constellation, the Chasing All Transients Constellation Hunters (CATCH) space mission. Consisting of 126 micro-satellites in three types, CATCH will have the capability to perform follow-up observations for a large number of different types of transients simultaneously. Each satellite in the constellation will carry lightweight X-ray optics and use a deployable mast to increase the focal length. The combination of different optics and detector systems enables different types of satellites to have multiform observation capabilities, including timing, spectroscopy, imaging, and polarization. Controlled by the intelligent system, different satellites can cooperate to perform uninterrupted monitoring, all-sky follow-up observations, and scanning observations with a flexible field of view (FOV) and multi-dimensional observations. Therefore, CATCH will be a powerful mission to study the dynamic universe. Here, we present the current design of the spacecraft, optics, detector system, constellation configuration and observing modes, as well as the development plan.

preprint · 2022 · arXiv

Contrastive Brain Network Learning via Hierarchical Signed Graph Pooling Model

Recently, brain networks have been widely adopted to study brain dynamics, brain development and brain diseases. Graph representation learning techniques on brain functional networks can facilitate the discovery of novel biomarkers for clinical phenotypes and neurodegenerative diseases. However, current graph learning techniques have several issues on brain network mining. Firstly, most current graph learning models are designed for unsigned graphs, which hinders the analysis of many signed network data (e.g., brain functional networks). Meanwhile, the insufficiency of brain network data limits the model performance on clinical phenotype prediction. Moreover, few current graph learning models are interpretable, so they may be unable to provide biological insights for model outcomes. Here, we propose an interpretable hierarchical signed graph representation learning model to extract graph-level representations from brain functional networks, which can be used for different prediction tasks. In order to further improve the model performance, we also propose a new strategy to augment functional brain network data for contrastive learning. We evaluate this framework on different classification and regression tasks using the data from HCP and OASIS. Our results from extensive experiments demonstrate the superiority of the proposed model compared to several state-of-the-art techniques. Additionally, we use graph saliency maps, derived from these prediction tasks, to demonstrate detection and interpretation of phenotypic biomarkers.

preprint · 2022 · arXiv

Control spontaneous symmetry breaking of photonic chirality with reconfigurable anomalous nonlinearity

Spontaneous symmetry breaking in nonlinear systems provides a unified method to understand vastly different phenomena, ranging from Higgs mechanism [1] and superconductivity [2] to ecological stability [3] and genome generation [4]. Spontaneous symmetry breaking is typically considered as the intrinsic property of nonlinear systems with fixed occurrence condition and property, as the form and magnitude of nonlinear interactions cannot be modified [5]. Here, we report the development of reconfigurable Kerr optical nonlinearity to control spontaneous symmetry breaking. This is achieved through the interference between the intrinsic Kerr and cascaded second-order nonlinear processes [6]. Anomalous Kerr effects including negative self-phase modulation and strength tuning between competing nonlinear processes have been demonstrated. With the reconfigurable Kerr nonlinearity, we realize the in-situ prohibition and facilitation of spontaneous symmetry breaking of photonic chirality. This work could empower the experimental study of spontaneous symmetry breaking in unexplored regimes and inspire the development of novel photonic functions.

preprint · 2022 · arXiv

Coupling between the accreting corona and the relativistic jet in the micro quasar GRS 1915+105

GRS 1915+105 was the first stellar-mass black-hole in our Galaxy to display a superluminal radio jet, similar to those observed in active galactic nuclei with a supermassive black hole at the centre. It has been proposed that the radio emission in GRS 1915+105 is fed by instabilities in the accretion disc by which the inner parts of the accretion flow are ejected in the jet. Here we show that there is a significant correlation between: (i) the radio flux, coming from the jet, and the flux of the iron emission line, coming from the disc and, (ii) the temperature of the corona that produces the high-energy part of the X-ray spectrum via inverse Compton scattering and the amplitude of a high-frequency variability component coming from the innermost part of the accretion flow. At the same time, the radio flux and the flux of the iron line are strongly anti-correlated with the temperature of the X-ray corona and the amplitude of the high-frequency variability component. These correlations persist over ~10 years, despite the highly variable X-ray and radio properties of the source in that period. Our findings provide, for the first time, incontrovertible evidence that the energy that powers this black-hole system can be directed either to the X-ray corona or the jet. When this energy is used to power the corona, raising its temperature, there is less energy left to fuel the jet and the radio flux drops, and vice versa. These facts, plus the modelling of the variability in this source, show conclusively that in GRS 1915+105 the X-ray corona morphs into the jet.

preprint · 2022 · arXiv

Determination of QPO properties in the presence of strong broad-band noise: a case study on the data of MAXI J1820+070

Accurate calculation of the phase lags of quasi-periodic oscillations (QPOs) will provide insight into their origin. In this paper we investigate the phase lag correction method which has been applied to calculate the intrinsic phase lags of the QPOs in MAXI J1820+070. We find that the traditional additive model between broad-band noise (BBN) and QPOs in the time domain is rejected, but the convolution model is accepted. By introducing a convolution mechanism in the time domain, the Fourier cross-spectrum analysis shows that the phase lags between QPO components in different energy bands will have a simple linear relationship with the phase lags between the total signals, so that the intrinsic phase lags of the QPOs can be obtained by linear correction. The power density spectrum (PDS) thus requires a multiplicative model to interpret the data. We briefly discuss a physical scenario for interpreting the convolution. In this scenario, the corona acts as a low-pass filter, the Green's function containing the noise is convolved with the QPOs to form the low-frequency part of the PDS, while the high-frequency part requires an additive component. We use a multiplicative PDS model to fit the data observed by Insight-HXMT. The overall fitting results are similar to those of the traditional additive PDS model. The width and centroid frequency of the QPOs obtained from the two PDS models do not differ significantly; only the r.m.s. of the QPOs does. Our work thus provides a new perspective on the coupling of noise and QPOs.
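
The step from a time-domain convolution to a linear phase-lag relation follows from the convolution theorem. The sketch below uses notation that does not appear in the abstract and is assumed purely for illustration: $x_k(t)$ is the total light curve in energy band $k$, $n_k(t)$ the broad-band noise response, $q_k(t)$ the QPO signal, capital letters denote their Fourier transforms, and $\phi$ denotes the phase of the corresponding cross-spectrum between bands 1 and 2.

```latex
x_k(t) = (n_k * q_k)(t) \;\Longrightarrow\; X_k(f) = N_k(f)\,Q_k(f), \qquad
|X_k(f)|^{2} = |N_k(f)|^{2}\,|Q_k(f)|^{2}

C_x(f) = X_1^{*}(f)\,X_2(f) = \big[N_1^{*}(f)\,N_2(f)\big]\,\big[Q_1^{*}(f)\,Q_2(f)\big]
\;\Longrightarrow\; \phi_x(f) = \phi_n(f) + \phi_q(f) \pmod{2\pi}
```

Under these assumptions the power spectrum is a product rather than a sum, and the QPO lag $\phi_q$ differs from the measured total lag $\phi_x$ only by the noise term $\phi_n$, which is the kind of offset a linear correction would remove. The correction actually used in the paper may differ in detail.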

preprint · 2022 · arXiv

Digital Twin for Networking: A Data-driven Performance Modeling Perspective

Emerging technologies and applications make the network unprecedentedly complex and heterogeneous, leading physical network practices to be costly and risky. The digital twin network (DTN) can ease these burdens by virtually enabling users to understand how performance changes accordingly with modifications. For this "What-if" performance evaluation, conventional simulation and analytical approaches are inefficient, inaccurate, and inflexible, and we argue that data-driven methods are most promising. In this article, we identify three requirements (fidelity, efficiency, and flexibility) for performance evaluation. Then we present a comparison of selected data-driven methods and investigate their potential trends in data, models, and applications. Although extensive applications have been enabled, there are still significant conflicts between models' capacities to handle diversified inputs and limited data collected from the production network. We further illustrate the opportunities for data collection, model construction, and application prospects. This survey aims to provide a reference for performance evaluation while also facilitating future DTN research.

preprint · 2022 · arXiv

Expression might be enough: representing pressure and demand for reinforcement learning based traffic signal control

Many studies confirmed that a proper traffic state representation is more important than complex algorithms for the classical traffic signal control (TSC) problem. In this paper, we (1) present a novel, flexible and efficient method, namely advanced max pressure (Advanced-MP), taking both running and queuing vehicles into consideration to decide whether to change current signal phase; (2) inventively design the traffic movement representation with the efficient pressure and effective running vehicles from Advanced-MP, namely advanced traffic state (ATS); and (3) develop a reinforcement learning (RL) based algorithm template, called Advanced-XLight, by combining ATS with the latest RL approaches, and generate two RL algorithms, namely "Advanced-MPLight" and "Advanced-CoLight" from Advanced-XLight. Comprehensive experiments on multiple real-world datasets show that: (1) the Advanced-MP outperforms baseline methods, and it is also efficient and reliable for deployment; and (2) Advanced-MPLight and Advanced-CoLight achieve state-of-the-art performance.
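
As background for the notion of "pressure", the following minimal Python sketch shows the classical max-pressure rule on which such methods build. It is not Advanced-MP: the abstract does not give its formula, and the handling of running vehicles is omitted here. The lane names, the data layout, and the function `max_pressure_phase` are hypothetical and chosen only for illustration.

```python
def phase_pressure(movements, queues):
    """Sum of (upstream queue - downstream queue) over a phase's movements.

    movements: list of (incoming_lane, outgoing_lane) pairs served by the phase.
    queues: dict mapping lane name -> number of queued vehicles.
    """
    return sum(queues[lane_in] - queues[lane_out] for lane_in, lane_out in movements)


def max_pressure_phase(phases, queues):
    """Pick the phase with the largest total pressure (classical max-pressure rule).

    phases: dict mapping phase name -> list of movements, as in phase_pressure.
    """
    return max(phases, key=lambda name: phase_pressure(phases[name], queues))


# Hypothetical four-approach intersection with two phases.
phases = {
    "NS": [("n_in", "s_out"), ("s_in", "n_out")],
    "EW": [("e_in", "w_out"), ("w_in", "e_out")],
}
queues = {
    "n_in": 8, "s_in": 5, "e_in": 3, "w_in": 2,
    "n_out": 0, "s_out": 1, "e_out": 0, "w_out": 4,
}
chosen = max_pressure_phase(phases, queues)
```

The rule favors phases that drain long upstream queues into lanes with spare capacity downstream, which is why a blocked exit lane lowers a phase's priority even when its approach is congested.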

preprint · 2022 · arXiv

Finite-time enclosing control for multiple moving targets: a continuous estimator approach

This work addresses the finite-time enclosing control problem where a set of followers are deployed to encircle and rotate around multiple moving targets with a predefined spacing pattern in finite time. A novel distributed and continuous estimator is first proposed to track the geometric center of targets in finite time using only local information for every follower. Then a pair of decentralized control laws for both the relative distance and included angle, respectively, are designed to achieve the desired spacing pattern in finite time based on the output of the proposed estimator. Through both theoretical analysis and simulation validation, we show that the proposed estimator is continuous and therefore can avoid dithering control output while still inheriting the merit of finite-time convergence. The steady errors of the estimator and the enclosing controller are guaranteed to converge to some bounded and adjustable regions around zero.

preprint · 2022 · arXiv

Formation and Immediate Deformation of a Small Filament Through Intermittent Magnetic Interactions

It is generally believed that filament formation involves a process of the accumulation of magnetic energy. However, in this paper we discuss the idea that filaments will not erupt and will only deform when the stored magnetic energy is released gradually. Combining high-quality observations from Solar Dynamics Observatory and other instruments, we present the formation and immediate deformation of a small filament (F1) in the active region (AR) 12760 on 28-30 April 2020. Before the filament formation, three successive dipoles quickly emerged with separation motions in the center of AR 12760. Due to the magnetic interaction between magnetic dipoles and pre-existing positive polarities, coronal brightenings consequently appeared in the overlying atmosphere. Subsequently, because of the continuous cancellation of magnetic flux that happened around the adjacent ends of F1 and another nearby filament (F2), magnetic reconnections occurred intermittently between F1 and F2. Finally, F1 lessened in the shear, and F2 became shorter. All the results show that the formation of F1 was closely associated with intermittent interactions between the sequence of emerging dipoles and pre-existing magnetic polarities, and the immediate deformation of F1 was intimately related to intermittent interactions between F1 and F2. We also suggest that the intermittent magnetic interactions driven by the continuous magnetic activities (magnetic-flux emergence, cancellation, and convergence) play an important role in the formation and deformation of filaments.

preprint2022arXiv

Generalizing Multimodal Pre-training into Multilingual via Language Acquisition

English-based Vision-Language Pre-training (VLP) has achieved great success in various downstream tasks. Some efforts have been made to generalize this success to non-English languages through Multilingual Vision-Language Pre-training (M-VLP). However, due to the large number of languages, M-VLP models often require huge computing resources and cannot be flexibly extended to new languages. In this work, we propose a MultiLingual Acquisition (MLA) framework that can easily generalize a monolingual Vision-Language Pre-training model into a multilingual one. Specifically, we design a lightweight language acquisition encoder based on state-of-the-art monolingual VLP models. We further propose a two-stage training strategy to optimize the language acquisition encoder, namely the Native Language Transfer stage and the Language Exposure stage. With much less multilingual training data and computing resources, our model achieves state-of-the-art performance on multilingual image-text and video-text retrieval benchmarks.

preprint2022arXiv

Learning from Pixel-Level Noisy Label : A New Perspective for Light Field Saliency Detection

Saliency detection with light field images is becoming attractive given the abundant cues available. However, this comes at the expense of large-scale pixel-level annotated data, which is expensive to generate. In this paper, we propose to learn light field saliency from pixel-level noisy labels obtained from unsupervised hand-crafted feature-based saliency methods. Given this goal, a natural question is: can we efficiently incorporate the relationships among light field cues while identifying clean labels in a unified framework? We address this question by formulating the learning as a joint optimization of an intra-light-field feature fusion stream and an inter-scene correlation stream to generate the predictions. Specifically, we first introduce a pixel-forgetting-guided fusion module to mutually enhance the light field features and exploit pixel consistency across iterations to identify noisy pixels. Next, we introduce a cross-scene noise penalty loss to better reflect latent structures of the training data and make the learning invariant to noise. Extensive experiments on multiple benchmark datasets demonstrate the superiority of our framework, showing that it learns saliency prediction comparable to state-of-the-art fully supervised light field saliency methods. Our code is available at https://github.com/OLobbCode/NoiseLF.

preprint2022arXiv

MobileCodec: Neural Inter-frame Video Compression on Mobile Devices

Realizing the potential of neural video codecs on mobile devices is a big technological challenge due to the computational complexity of deep networks and the power-constrained mobile hardware. We demonstrate practical feasibility by leveraging Qualcomm's technology and innovation, bridging the gap from neural network-based codec simulations running on wall-powered workstations, to real-time operation on a mobile device powered by Snapdragon technology. We show the first-ever inter-frame neural video decoder running on a commercial mobile phone, decoding high-definition videos in real-time while maintaining a low bitrate and high visual quality.

preprint2022arXiv

NC-DRE: Leveraging Non-entity Clue Information for Document-level Relation Extraction

Document-level relation extraction (RE), which requires reasoning on multiple entities in different sentences to identify complex inter-sentence relations, is more challenging than sentence-level RE. To extract the complex inter-sentence relations, previous studies usually employ graph neural networks (GNN) to perform inference upon heterogeneous document-graphs. Despite their great successes, these graph-based methods, which normally only consider the words within the mentions in the process of building graphs and reasoning, tend to ignore the non-entity clue words that are not in the mentions but provide important clue information for relation reasoning. To alleviate this problem, we treat graph-based document-level RE models as an encoder-decoder framework, which typically uses a pre-trained language model as the encoder and a GNN model as the decoder, and propose a novel graph-based model NC-DRE that introduces decoder-to-encoder attention mechanism to leverage Non-entity Clue information for Document-level Relation Extraction.

preprint2022arXiv

Probabilistic network topology prediction for active planning: An adaptive algorithm and application

This paper tackles the problem of active planning to achieve cooperative localization for multi-robot systems (MRS) under measurement uncertainty in GNSS-limited scenarios. Specifically, we address the issue of accurately predicting the probability of a future connection between two robots equipped with range-based measurement devices. Due to the limited range of the equipped sensors, edges in the network connection topology will be created or destroyed as the robots move with respect to one another. Accurately predicting the future existence of an edge, given imperfect state estimation and noisy actuation, is therefore a challenging task. An adaptive power series expansion (or APSE) algorithm is developed based on current estimates and control candidates. Such an algorithm applies the power series expansion formula of the quadratic positive form in a normal distribution. Finite-term approximation is made to realize the computational tractability. Further analyses are presented to show that the truncation error in the finite-term approximation can be theoretically reduced to a desired threshold by adaptively choosing the summation degree of the power series. Several sufficient conditions are rigorously derived as the selection principles. Finally, extensive simulation results and comparisons, with respect to both single and multi-robot cases, validate that a formally computed and therefore more accurate probability of future topology can help improve the performance of active planning under uncertainty.

preprint2022arXiv

Production of $ΩNN$ and $ΩΩN$ in ultra-relativistic heavy ion collisions

Even though lots of $Λ$-hypernuclei have been found and measured, multi-strangeness hypernuclei consisting of $Ω$ are not yet discovered. The studies of multi-strangeness hypernuclei help us further understand the interaction between hyperons and nucleons. Recently the $ΩN$ and $ΩΩ$ interactions as well as binding energies were calculated by the HAL-QCD's lattice Quantum Chromo-Dynamics (LQCD) simulations and production rates of $Ω$-dibaryon in Au + Au collisions at RHIC and Pb + Pb collisions at LHC energies were estimated by a coalescence model. The present work discusses the production of more exotic triple-baryons including $Ω$, namely $ΩNN$ and $ΩΩN$ as well as their decay channels. A variation method is used in calculations of bound states and binding energy of $ΩNN$ and $ΩΩN$ with the potentials from the HAL-QCD's results. The productions of $ΩNN$ and $ΩΩN$ are predicted by using a blast-wave model plus coalescence model in ultra-relativistic heavy-ion collisions at $\sqrt{s_{NN}} = 200$ GeV and $2.76$ TeV. Furthermore, plots for baryon number dependent yields of different baryons ($N$ and $Ω$), their dibaryons and hypernuclei are made and the production rate of a more exotic tetra-baryon ($ΩΩNN$) is extrapolated.

preprint2022arXiv

Radiation hardness study on a CMOS pixel sensor for charged particle tracking

A CMOS pixel sensor, named Supix-1, is developed for a pixelated silicon tracker for the Circular Electron-Positron Collider (CEPC) project. The sensor, consisting of nine sectors varying in pixel sizes, diode sizes and geometries, is fabricated with a 180 nm CMOS Image Sensor (CIS) process to study the particle detection performance of enlarged pixels. In this work, the radiation-induced effects on the charge collection of the sensor under the fluence of 1 $\times$ 10^13 1 MeV neq/cm^2 are studied by the measurements with the radioactive source of Fe-55 and the Technology Computer Aided Design (TCAD) simulations, since the radiation hardness of 6.8 $\times$ 10^12 1 MeV neq/cm^2 per year for Non-Ionizing Energy Loss (NIEL) effects is required. In measurements, the sensor gain has been calibrated using the k-$α$ peak of Fe-55 before and after irradiation. The pixel-wise equivalent noise charge (ENC), charge collection efficiency (CCE) and signal-to-noise ratio (SNR) were evaluated. The radiation-induced effects on cluster properties are studied through a self-developed reconstruction algorithm. In TCAD simulations, charge collections in 5 $\times$ 5 pixel matrices for two typical impinging cases of incident particles were simulated with and without irradiation. Both measurements and simulations indicate that enlarged pixels with area of 21 $μ$m $\times$ 84 $μ$m, though suffering greater loss in sensor performance than small pixels do, still have satisfactory noise and charge collection performance after irradiation for particle tracking in the upcoming collider detectors.

preprint2022arXiv

Scene Graph Generation: A Comprehensive Survey

Deep learning techniques have led to remarkable breakthroughs in the field of generic object detection and have spawned a lot of scene-understanding tasks in recent years. Scene graph has been the focus of research because of its powerful semantic representation and applications to scene understanding. Scene Graph Generation (SGG) refers to the task of automatically mapping an image into a semantic structural scene graph, which requires the correct labeling of detected objects and their relationships. Although this is a challenging task, the community has proposed a lot of SGG approaches and achieved good results. In this paper, we provide a comprehensive survey of recent achievements in this field brought about by deep learning techniques. We review 138 representative works that cover different input modalities, and systematically summarize existing methods of image-based SGG from the perspective of feature extraction and fusion. We attempt to connect and systematize the existing visual relationship detection methods, to summarize, and interpret the mechanisms and the strategies of SGG in a comprehensive way. Finally, we finish this survey with deep discussions about current existing problems and future research directions. This survey will help readers to develop a better understanding of the current research status and ideas.

preprint2022arXiv

Spatial Parsing and Dynamic Temporal Pooling networks for Human-Object Interaction detection

The key to Human-Object Interaction (HOI) recognition is to infer the relationship between humans and objects. Recently, image-based HOI detection has made significant progress. However, there is still room for improvement in video HOI detection performance. Existing one-stage methods use well-designed end-to-end networks to detect a video segment and directly predict an interaction. This makes model learning and further optimization of the network more complex. This paper introduces the Spatial Parsing and Dynamic Temporal Pooling (SPDTP) network, which takes the entire video as a spatio-temporal graph with human and object nodes as input. Unlike existing methods, our proposed network predicts the difference between interactive and non-interactive pairs through explicit spatial parsing, and then performs interaction recognition. Moreover, we propose a learnable and differentiable Dynamic Temporal Module (DTM) to emphasize the keyframes of the video and suppress redundant frames. Furthermore, the experimental results show that SPDTP can pay more attention to active human-object pairs and valid keyframes. Overall, we achieve state-of-the-art performance on the CAD-120 and Something-Else datasets.

preprint2022arXiv

The accretion flow geometry of MAXI J1820+070 through broadband noise research with Insight-HXMT

Here we present a detailed study of the broadband noise in the power density spectra of the black hole X-ray binary MAXI J1820+070 during the hard state of its 2018 outburst, using the Hard X-ray Modulation Telescope (Insight-HXMT) observations. The broadband noise shows two main humps, which might separately correspond to variability from a variable disk and two Comptonization regions. We fitted the two humps with multiple Lorentzian functions and studied the energy-dependent properties of each component up to 100--150 keV and their evolution with spectral changes. The lowest frequency component is considered to be the sub-harmonic of the QPO component and shows a different energy dependence compared with the other broadband noise components. We found that although the fractional rms of all the broadband noise components mainly decreases with energy, their rms spectra are different in shape. Above $\sim$ 20--30 keV, the characteristic frequencies of these components increase sharply with energy, meaning that the high-energy component is more variable on short timescales. Our results suggest that the hot inner flow in MAXI J1820+070 is likely to be inhomogeneous. We propose a geometry with a truncated accretion disk and two Comptonization regions.

preprint2022arXiv

The deformation of an erupting magnetic flux rope in a confined solar flare

Magnetic flux ropes (MFRs), sets of coherently twisted magnetic field lines, are believed to be core structures of various solar eruptions. Their evolution plays an important role in understanding the physical mechanisms of solar eruptions, and can shed light on adverse space weather near the Earth. However, erupting MFRs are occasionally confined by strong overlying magnetic fields, and the MFR evolution during the descending phase in the confined cases has received little attention. Here, we present the deformation of an erupting MFR accompanied by a confined double-peaked solar flare. The first peak corresponded to the MFR eruption in a standard flare model, and the second peak was closely associated with the flashings of an underlying sheared arcade (SA), the reversal slipping motion of the L-shaped flare ribbon, the falling of the MFR, and the shifting of the top of the filament threads. All results suggest that the confined MFR eruption involved a two-step magnetic reconnection presenting two distinct episodes of energy release in the flare impulsive phase, and the latter magnetic reconnection between the confined MFR and the underlying SA caused the deformation of the MFR.

preprint2022arXiv

The disk wind in GRS 1915+105 as seen by Insight-HXMT

We analyze three observations of GRS 1915+105 in 2017 by Insight-HXMT when the source was in a spectrally soft state. We find strong absorption lines from highly ionized iron, which are due to absorption by a disk wind outflowing at a velocity of $\sim$ 1000 km s$^{-1}$ along our line of sight. Two of the three observations show large-amplitude oscillations in their light curves and the variation pattern corresponds to state $κ$ of GRS 1915+105. From time-averaged and flux-resolved analysis, we find that the variation of the ionization state of the disk wind follows the X-ray continuum on timescales from hundreds of seconds to months. The radial location of the disk wind is consistent with thermal driving. The mass-loss rate due to the outflowing wind is comparable to the mass accretion rate in the inner disk, which demonstrates the important role of the disk wind in the disk accretion system.

preprint2022arXiv

The evolution of the corona in MAXI J1535-571 through type-C quasi-periodic oscillations with Insight-HXMT

Type-C quasi-periodic oscillations (QPOs) in black hole X-ray transients can appear when the source is in the low-hard and hard-intermediate states. The spectral-timing evolution of the type-C QPO in MAXI J1535-571 has been recently studied with Insight-HXMT. Here we fit simultaneously the time-averaged energy spectrum, using a relativistic reflection model, and the fractional rms and phase-lag spectra of the type-C QPOs, using a recently developed time-dependent Comptonization model when the source was in the intermediate state. We show, for the first time, that the time-dependent Comptonization model can successfully explain the X-ray data up to 100 keV. We find that in the hard-intermediate state the frequency of the type-C QPO decreases from 2.6 Hz to 2.1 Hz, then increases to 3.3 Hz, and finally increases to ~ 9 Hz. Simultaneously with this, the evolution of corona size and the feedback fraction (the fraction of photons up-scattered in the corona that return to the disc) indicates the change of the morphology of the corona. Comparing with contemporaneous radio observations, this evolution suggests a possible connection between the corona and the jet when the system is in the hard-intermediate state and about to transit into the soft-intermediate state.

preprint2022arXiv

The evolution of the high-frequency variability in the black hole candidate GRS 1915+105 as seen by RXTE

GRS 1915+105 can show type-C quasi-periodic oscillations (QPOs) in the power density spectrum. A high-frequency QPO (HFQPO) at 67 Hz has been observed in this source, albeit less often than the type-C QPOs. Besides these features, GRS 1915+105 sometimes shows a broad bump in the power spectrum at around 30-150 Hz. We study the power spectra of GRS 1915+105 with the Rossi X-ray Timing Explorer when the source was in the $χ$ class. We find that the rms amplitude of the bump depends strongly upon both the frequency of the type-C QPO and the hardness ratio, and is correlated with the corona temperature and anti-correlated with the radio flux at 15 GHz. The characteristic frequency of the bump is better correlated with a combination of the frequency of the type-C QPO and the hardness ratio than with the frequency of the type-C QPO alone. The rms amplitude of the bump generally increases with energy from ~1-2% at ~3 keV to ~10-15% at ~30 keV. We suggest that the bump and the high-frequency QPO may be the same variability component but the properties of the corona affect the coherence of this variability, leading either to a HFQPO when the spectrum is in the relatively soft $γ$ class, or to a bump when the spectrum is in the hard $χ$ class. Finally, we discuss the anti-correlation between the rms amplitude of the bump and the radio flux in the context of the relation between the corona and the jet.

preprint2022arXiv

The evolving properties of the corona of GRS 1915+105: A spectral-timing perspective through variable-Comptonisation modelling

The inverse Compton process by which soft photons are up-scattered by hot electrons in a corona plays a fundamental role in shaping the X-ray spectra of black-hole (BH) low-mass X-ray binaries (LMXBs), particularly in the hard and hard-intermediate states. In these states, the power-density spectra of these sources typically show Type-C low-frequency quasi-periodic oscillations (QPOs). Although several models have been proposed to explain the dynamical origin of their frequency, only a few of those models predict the spectral-timing radiative properties of the QPOs. Here we study the physical and geometrical properties of the corona of the BH-LMXB GRS 1915+105 based on a large sample of observations available in the RXTE archive. We use a recently-developed spectral-timing Comptonisation model to fit simultaneously the energy-dependent fractional rms amplitude and phase-lag spectra of the Type-C QPO in 398 observations. For this, we include spectral information gathered from fitting a Comptonisation model to the corresponding time-averaged spectra. We analyse the dependence of the physical and geometrical properties of the corona upon the QPO frequency and spectral state of the source, the latter characterised by the hardness ratio. We find consistent trends in the evolution of the corona size, temperature, and feedback (the fraction of the corona photons that impinge back onto the disc) that persist for roughly 15 years. By correlating our observations with simultaneous radio-monitoring of the source at 15 GHz, we propose a scenario in which the disc-corona interactions connect with the launching mechanism of the radio jet in this source.

preprint2022arXiv

The Future of Traditional Fuel Vehicles (TFV) and New Energy Vehicles (NEV): Creative Destruction or Co-existence?

There has been rapid development and commercialization of New Energy Vehicles (NEV) in recent years. Although traditional fuel vehicles (TFV) still occupy a majority share of the market, it is generally believed that NEV is more efficient, more environmentally friendly, and has a greater potential for a Schumpeterian "creative destruction" that may lead to a paradigm shift in auto production and consumption. However, less is discussed regarding the potential environmental impact of NEV production and future uncertainty in the R&D bottleneck of NEV technology and innovation. This paper aims to propose a modelling framework based on Lux (1995) that investigates the long-term dynamics of TFV and NEV, along with their associated environmental externality. We argue that environmental and technological policies will play a critical role in determining its future development. It is of vital importance to constantly monitor the potential environmental impact of both sectors and support the R&D of critical NEV technology, as well as curbing its negative externality in a preemptive manner.

preprint2022arXiv

The KLT Relation from the Tree formula and Permutohedron

In this paper, we generalize the Nguyen-Spradlin-Volovich-Wen (NSVW) tree formula from the MHV sector to any helicity sector. We find a close connection between the Permutohedron and the KLT relation, and construct a non-trivial mapping between them, linking the amplitudes in the gauge and gravity theories. The gravity amplitude can also be mapped from a determinant that follows from the matrix-tree theorem. In addition, we use binary tree graphs to manifest its Lie structure. In our tree formula, there is an evident Hopf algebra of the permutation group behind the gravity amplitudes. Using the tree formula, we can directly re-derive the soft/collinear limit of the amplitudes.

preprint2022arXiv

Twin extreme ultraviolet waves in the solar corona

Solar extreme ultraviolet (EUV) waves are spectacular propagating disturbances with EUV enhancements in annular shapes in the solar corona. These EUV waves carry critical information about the coronal magnetised plasma that can shed light on the elusive physical parameters (e.g. the magnetic field strength) by global solar coronal magneto-seismology. EUV waves are closely associated with a wide range of solar atmospheric eruptions, from violent flares and coronal mass ejections (CMEs) to less energetic plasma jets or mini-filament eruptions. However, the physical nature and driving mechanism of EUV waves are still controversial. Here, we report the unique discovery of twin EUV waves (TEWs) that were formed in a single eruption with observations from two different perspectives. In all earlier studies, a single eruption was associated at most with a single EUV wave. The newly found TEWs urge us to revisit our theoretical understanding of the underlying formation mechanism(s) of coronal EUV waves. Two distinct scenarios of TEWs were found. In the first scenario, the two waves were separately associated with a filament eruption and a precursor jet, while in the other scenario the two waves were successively associated with a filament eruption. Hence, we label these distinct scenarios as "fraternal TEWs" and "identical TEWs", respectively. Further, we also suggest that impulsive lateral expansions of two distinct groups of coronal loops are critical to the formation of TEWs in a single eruption.

preprint2022arXiv

Variability and phase lags of the type-C quasi-periodic oscillation of MAXI J1348-630 with NICER

We study the properties of the type-C quasi-periodic oscillation (type-C QPO) of MAXI J1348-630 during its 2019 outburst and reflare with NICER. This is the first time that the evolution of the properties of type-C QPOs is studied during an outburst reflare. We found that the properties of the type-C QPO during the reflare are similar to those of type-C QPOs observed in other black-hole systems during outburst. This suggests that the physical processes responsible for type-C QPOs are the same in a reflare and in an outburst. We also found that the FWHM of a high-frequency broadband component observed during the reflare changes significantly with energy. We studied the energy-dependent fractional rms amplitude and phase lags of the type-C QPO from 0.5 keV to 12 keV. We found that the fractional rms amplitude increases up to 2-3 keV and then remains approximately constant above this energy, and the lag spectra of the type-C QPO are hard. We discuss the dependence of the fractional rms amplitude and phase lags with energy in the context of Comptonisation as the radiative mechanism driving the QPO rms and lag spectra.

preprint2021arXiv

MeSIN: Multilevel Selective and Interactive Network for Medication Recommendation

Recommending medications for patients using electronic health records (EHRs) is a crucial data mining task for an intelligent healthcare system. It can assist doctors in making clinical decisions more efficiently. However, the inherent complexity of EHR data makes this a challenging task: (1) Multilevel structures: EHR data typically contain multilevel structures which are closely related to the decision-making pathways, e.g., laboratory results lead to disease diagnoses, and then contribute to the prescribed medications; (2) Multiple sequence interactions: multiple sequences in EHR data are usually closely correlated with each other; (3) Abundant noise: many task-unrelated features or noise within EHR data generally result in suboptimal performance. To tackle the above challenges, we propose a multilevel selective and interactive network (MeSIN) for medication recommendation. Specifically, MeSIN is designed with three components. First, an attentional selective module (ASM) is applied to assign flexible attention scores to different medical code embeddings by their relevance to the recommended medications in every admission. Second, we incorporate a novel interactive long-short term memory network (InLSTM) to reinforce the interactions of multilevel medical sequences in EHR data with the help of the calibrated memory-augmented cell and an enhanced input gate. Finally, we employ a global selective fusion module (GSFM) to infuse the multi-sourced information embeddings into final patient representations for medication recommendation. To validate our method, extensive experiments have been conducted on a real-world clinical dataset. The results demonstrate a consistent superiority of our framework over several baselines and testify to the effectiveness of our proposed approach.

preprint2020arXiv

A Systematic Analysis of the Phase Lags Associated with the Type-C Quasi-periodic Oscillation in GRS 1915+105

We present a systematic analysis of the phase lags associated with the type-C QPOs in GRS 1915+105 using RXTE data. Our sample comprises 620 RXTE observations with type-C QPOs ranging from ~0.4 Hz to ~6.3 Hz. Based on our analysis, we confirm that the QPO phase lags decrease with QPO frequency, and change sign from positive to negative at a QPO frequency of ~2 Hz. In addition, we find that the slope of this relation is significantly different between QPOs below and above 2 Hz. The relation between the QPO lags and QPO rms can be well fitted with a broken line: as the QPO lags go from negative to positive, the QPO rms first increases, reaching its maximum at around zero lag, and then decreases. The phase-lag behaviour of the subharmonic of the QPO is similar to that of the QPO fundamental, where the subharmonic lags decrease with subharmonic frequency and change sign from positive to negative at a subharmonic frequency of ~1 Hz; on the contrary, the second harmonic of the QPO shows a quite different phase-lag behaviour, where all the second harmonics show hard lags that remain more or less constant. For both the QPO and its (sub)harmonics, the slope of the lag-energy spectra shows a similar evolution with frequency as the average phase lags. This suggests that the lag-energy spectra drive the average phase lags. We discuss possible explanations for the change in lag sign, and the physical origin of the QPO lags.

preprint2020arXiv

A two-component Comptonisation model for the type-B QPO in MAXI J1348-630

Spectral-timing analysis of the fast variability observed in X-rays is a powerful tool to study the physical and geometrical properties of the accretion/ejection flows in black-hole binaries. The origin of type-B quasi-periodic oscillations (QPO), predominantly observed in black-hole candidates in the soft-intermediate state, has been linked to emission arising from the relativistic jet. In this state, the X-ray spectrum is characterised by a soft-thermal blackbody-like emission due to the accretion disc, an iron emission line (in the 6-7 keV range), and a power-law like hard component due to Inverse-Compton scattering of the soft-photon source by hot electrons in a corona or the relativistic jet itself. The spectral-timing properties of MAXI J1348-630 have been recently studied using observations obtained with the NICER observatory. The data show a strong type-B QPO at ~4.5 Hz with increasing fractional rms amplitude with energy and positive lags with respect to a reference band at 2-2.5 keV. We use a variable-Comptonisation model that assumes a sinusoidal coherent oscillation of the Comptonised X-ray flux and the physical parameters of the corona at the QPO frequency, to fit simultaneously the energy-dependent fractional rms amplitude and phase lags of this QPO. We show that two physically-connected Comptonisation regions can successfully explain the radiative properties of the QPO in the full 0.8-10 keV energy range.

preprint2020arXiv

Adversarial Data Encryption

In the big data era, many organizations face the dilemma of data sharing. Regular data sharing is often necessary for human-centered discussion and communication, especially in medical scenarios. However, unprotected data sharing may also lead to data leakage. Inspired by adversarial attack, we propose a method for data encryption, so that for human beings the encrypted data look identical to the original version, but for machine learning methods they are misleading. To show the effectiveness of our method, we collaborate with the Beijing Tiantan Hospital, which has a world leading neurological center. We invite $3$ doctors to manually inspect our encryption method based on real world medical images. The results show that the encrypted images can be used for diagnosis by the doctors, but not by machine learning methods.

preprint2020arXiv

COVID-19 Chest CT Image Segmentation -- A Deep Convolutional Neural Network Solution

A novel coronavirus disease 2019 (COVID-19) was detected and has spread rapidly across various countries around the world since the end of the year 2019. Computed Tomography (CT) images have been used as a crucial alternative to the time-consuming RT-PCR test. However, pure manual segmentation of CT images faces a serious challenge with the increase of suspected cases, resulting in urgent requirements for accurate and automatic segmentation of COVID-19 infections. Unfortunately, since the imaging characteristics of the COVID-19 infection are diverse and similar to the backgrounds, existing medical image segmentation methods cannot achieve satisfactory performance. In this work, we try to establish a new deep convolutional neural network tailored for segmenting the chest CT images with COVID-19 infections. We first maintain a large and new chest CT image dataset consisting of 165,667 annotated chest CT images from 861 patients with confirmed COVID-19. Inspired by the observation that the boundary of the infected lung can be enhanced by adjusting the global intensity, in the proposed deep CNN, we introduce a feature variation (FV) block which adaptively adjusts the global properties of the features for segmenting COVID-19 infection. The proposed FV block can enhance the capability of feature representation effectively and adaptively for diverse cases. We fuse features at different scales by proposing Progressive Atrous Spatial Pyramid Pooling to handle the sophisticated infection areas with diverse appearance and shapes. We conducted experiments on the data collected in China and Germany and show that the proposed deep CNN can produce impressive performance effectively.

preprint2020arXiv

DeText: A Deep Text Ranking Framework with BERT

Ranking is the most important component in a search system. Most search systems deal with large amounts of natural language data, hence an effective ranking system requires a deep understanding of text semantics. Recently, deep learning based natural language processing (deep NLP) models have generated promising results on ranking systems. BERT is one of the most successful models that learn contextual embedding, which has been applied to capture complex query-document relations for search ranking. However, this is generally done by exhaustively interacting each query word with each document word, which is inefficient for online serving in search product systems. In this paper, we investigate how to build an efficient BERT-based ranking model for industry use cases. The solution is further extended to a general ranking framework, DeText, that is open sourced and can be applied to various ranking productions. Offline and online experiments of DeText on three real-world search systems present significant improvement over state-of-the-art approaches.

preprint2020arXiv

Discovery of oscillations above 200 keV in a black hole X-ray binary with Insight-HXMT

Low-frequency quasi-periodic oscillations (LFQPOs) are commonly found in black hole X-ray binaries, and their origin is still under debate. The properties of LFQPOs at high energies (above 30 keV) are closely related to the nature of the accretion flow in the innermost regions, and thus play a crucial role in critically testing various theoretical models. The Hard X-ray Modulation Telescope (Insight-HXMT) is capable of detecting emissions above 30 keV, and is therefore an ideal instrument to do so. Here we report the discovery of LFQPOs above 200 keV in the new black hole MAXI J1820+070 in the X-ray hard state, which allows us to understand the behaviours of LFQPOs at hundreds of kiloelectronvolts. The phase lag of the LFQPO is constant around zero below 30 keV, and becomes a soft lag (that is, the high-energy photons arrive first) above 30 keV. The soft lag gradually increases with energy and reaches ~0.9s in the 150-200 keV band. The detection at energies above 200 keV, the large soft lag and the energy-related behaviors of the LFQPO pose a great challenge for most currently existing models, but suggest that the LFQPO probably originates from the precession of a small-scale jet.

preprint2020arXiv

Efficient Scene Text Detection with Textual Attention Tower

Scene text detection has received attention for years and achieved an impressive performance across various benchmarks. In this work, we propose an efficient and accurate approach to detect multioriented text in scene images. The proposed feature fusion mechanism allows us to use a shallower network to reduce the computational complexity. A self-attention mechanism is adopted to suppress false positive detections. Experiments on public benchmarks including ICDAR 2013, ICDAR 2015 and MSRA-TD500 show that our proposed approach can achieve better or comparable performances with fewer parameters and less computational cost.

preprint2020arXiv

MeDaS: An open-source platform as service to help break the walls between medicine and informatics

In the past decade, deep learning (DL) has achieved unprecedented success in numerous fields including computer vision, natural language processing, and healthcare. In particular, DL is experiencing an increasing development in applications for advanced medical image analysis in terms of analysis, segmentation, classification, and more. On the one hand, tremendous needs that leverage the power of DL for medical image analysis are arising from the research community of a medical, clinical, and informatics background to jointly share their expertise, knowledge, skills, and experience. On the other hand, barriers between disciplines often hamper a full and efficient collaboration. To this end, we propose our novel open-source platform, i.e., MeDaS -- the MeDical open-source platform as Service. To the best of our knowledge, MeDaS is the first open-source platform providing a collaborative and interactive service that lets researchers from a medical background easily use DL related toolkits, and at the same time helps scientists or engineers from information sciences understand the medical knowledge side. Based on a series of toolkits and utilities from the idea of RINV (Rapid Implementation aNd Verification), our proposed MeDaS platform can implement pre-processing, post-processing, augmentation, visualization, and other phases needed in medical image analysis. Five tasks including the subjects of lung, liver, brain, chest, and pathology, are validated and demonstrated to be efficiently realisable by using MeDaS.

preprint2020arXiv

On Vocabulary Reliance in Scene Text Recognition

The pursuit of high performance on public benchmarks has been the driving force for research in scene text recognition, and notable progress has been achieved. However, a close investigation reveals a startling fact that the state-of-the-art methods perform well on images with words within vocabulary but generalize poorly to images with words outside vocabulary. We call this phenomenon "vocabulary reliance". In this paper, we establish an analytical framework to conduct an in-depth study on the problem of vocabulary reliance in scene text recognition. Key findings include: (1) Vocabulary reliance is ubiquitous, i.e., all existing algorithms more or less exhibit such characteristic; (2) Attention-based decoders prove weak in generalizing to words outside vocabulary and segmentation-based decoders perform well in utilizing visual features; (3) Context modeling is highly coupled with the prediction layers. These findings provide new insights and can benefit future research in scene text recognition. Furthermore, we propose a simple yet effective mutual learning strategy to allow models of two families (attention-based and segmentation-based) to learn collaboratively. This remedy alleviates the problem of vocabulary reliance and improves the overall scene text recognition performance.

preprint2020arXiv

Perfect coherent transfer in an on-chip reconfigurable nanoelectromechanical network

Realizing a controllable network with multiple degrees of interaction is a challenge to physics and engineering. Here, we experimentally report an on-chip reconfigurable network based on nanoelectromechanical resonators with nearest-neighbor (NN) and next-nearest-neighbor (NNN) strong couplings. By applying different parametric voltages on the same on-chip device, we carry out perfect coherent transfer in NN and NNN coupled array networks. Moreover, the low-loss resonators ensure the desired evolution to achieve perfect transfer and the demonstration of the parity-dependent phase relation at transmission cycles. The realization of NNN couplings demonstrates the capability of engineering coherent coupling beyond a simple model of a NN coupled array of doubly clamped resonators. Our reconfigurable nanoelectromechanical network provides a highly tunable physical platform and offers the possibilities of investigating various interesting phenomena, such as topological transport, synchronization of networks, as well as metamaterials.

preprint2020arXiv

Prediction of mechanical properties of non-equiatomic high-entropy alloy by atomistic simulation and machine learning

High-entropy alloys (HEAs) with multiple constituent elements have been extensively studied in the past 20 years due to their promising engineering application. Previous experimental and computational studies of HEAs focused mainly on equiatomic or near equiatomic HEAs. However, there is probably far more treasure in those non-equiatomic HEAs with carefully designed composition. In this study, molecular dynamics (MD) simulation combined with machine learning (ML) methods was used to predict the mechanical properties of non-equiatomic CuFeNiCrCo HEAs. A database was established based on a tensile test of 900 HEA single-crystal samples by MD simulation. We investigated and compared eight ML models for the learning tasks, ranging from shallow models to deep models. It was found that the kernel-based extreme learning machine (KELM) model outperformed others for the prediction of yield stress and Young's modulus. The accuracy of the KELM model was further verified by the large-sized polycrystal HEA samples.

preprint2020arXiv

Realization of programmable nanomechanical lattice with both nearest-neighboring and next-nearest-neighboring couplings

The programmable artificial lattice, based on the controllability of coupling strengths and the scalability of multiple sites, is desperately desired in engineering metamaterials and exploring fundamental physics. In this work, we experimentally present a programmable lattice consisting of multiple paralleled nanomechanical resonators, whose internal interactions can be linearly manipulated by external voltages. Flexural modes of nearest-neighboring (NN) and next-nearest-neighboring (NNN) resonators are parametrically coupled through modulated electrostatic interactions. Particularly, in a wide range up to deep strong coupling regime, both the NN and NNN coupling strengths are precisely proportional to manipulation voltage. The realization of long-range coupling provides a promising prospect in constructing complex lattice structure, which is essential in investigating mechanical logic devices, topological physics and coherent phononic dynamics.

preprint2020arXiv

Structure-Feature based Graph Self-adaptive Pooling

Various methods to deal with graph data have been proposed in recent years. However, most of these methods focus on graph feature aggregation rather than graph pooling. Besides, the existing top-k selection graph pooling methods have a few problems. First, to construct the pooled graph topology, current top-k selection methods evaluate the importance of the node from a single perspective only, which is simplistic and unobjective. Second, the feature information of unselected nodes is directly lost during the pooling process, which inevitably leads to a massive loss of graph feature information. To solve these problems mentioned above, we propose a novel graph self-adaptive pooling method with the following objectives: (1) to construct a reasonable pooled graph topology, structure and feature information of the graph are considered simultaneously, which provide additional veracity and objectivity in node selection; and (2) to make the pooled nodes contain sufficiently effective graph information, node feature information is aggregated before discarding the unimportant nodes; thus, the selected nodes contain information from neighbor nodes, which can enhance the use of features of the unselected nodes. Experimental results on four different datasets demonstrate that our method is effective in graph classification and outperforms state-of-the-art graph pooling methods.

preprint2020arXiv

Structures and Properties of $β$-Titanium Doping Trace Transition Metal Elements: a Density Functional Theory Study

We systematically calculate the structure, formation enthalpy, formation free energy, elastic constants and electronic structure of Ti$_{0.98}$X$_{0.02}$ system by density functional theory (DFT) simulations to explore the effect of transition metal X (X=Ag, Cd, Co, Cr, Cu, Fe, Mn, Mo, Nb, Ni, Pd, Rh, Ru, Tc, and Zn) on the stability mechanism of $β$-titanium. Based on our calculations, the results of formation enthalpy and free energy show that adding trace X is beneficial to the thermodynamic stability of $β$-titanium. This behavior is well explained by the density of state (DOS). However, the tetragonal shear moduli of Ti$_{0.98}$X$_{0.02}$ systems are negative, indicating that $β$-titanium doping with a low concentration of X is still elastically unstable at 0 K. Therefore, we theoretically explain that $β$-titanium doping with trace transition metal X is unstable in the ground state.

preprint2020arXiv

ThreshKnot: Thresholded ProbKnot for Improved RNA Secondary Structure Prediction

RNA structure prediction is a challenging problem, especially with pseudoknots. Recently, there has been a shift from the classical minimum free energy-based methods (MFE) to partition function-based ones that assemble structures using base-pairing probabilities. Two examples of the latter group are the popular maximum expected accuracy (MEA) method and the ProbKnot method. ProbKnot is a fast heuristic that pairs nucleotides that are reciprocally most probable pairing partners, and unlike MEA, can also predict structures with pseudoknots. However, ProbKnot's full potential has been largely overlooked. In particular, when introduced, it did not have an MEA-like hyperparameter that can balance between positive predictive value (PPV) and sensitivity. We show that a simple thresholded version of ProbKnot, which we call ThreshKnot, leads to more accurate overall predictions by filtering out unlikely pairs whose probabilities fall under a given threshold. We also show that on three widely-used folding engines (RNAstructure, Vienna RNAfold, and CONTRAfold), ThreshKnot always outperforms the much more involved MEA algorithm in (1) its higher structure prediction accuracy, (2) its capability to predict pseudoknots, and (3) its faster runtime and easier implementation. This suggests that ThreshKnot should replace MEA as the default partition function-based structure prediction algorithm. ThreshKnot is already available in the widely used RNAstructure software package version 6.2 (released November 27, 2019): https://rna.urmc.rochester.edu/RNAstructure.html
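
The pairing rule described in this abstract (keep a pair only when the two nucleotides are each other's most probable partner and the pair probability does not fall under a threshold) can be illustrated with a minimal sketch. This is not the RNAstructure implementation: the function name `threshknot_pairs`, the plain list-of-lists probability matrix, and the default threshold of 0.3 are all assumptions made for illustration.

```python
def threshknot_pairs(probs, threshold=0.3):
    """Return reciprocal-best base pairs whose probability is >= threshold.

    probs: symmetric n x n list of lists, probs[i][j] = pairing probability
           of nucleotides i and j (hypothetical input format).
    threshold: illustrative cutoff, not a value taken from the abstract.
    """
    n = len(probs)
    # Most probable partner of each nucleotide (first index wins ties).
    best = [max(range(n), key=lambda k: probs[i][k]) for i in range(n)]
    pairs = []
    for i in range(n):
        j = best[i]
        # Keep (i, j) once, only if the choice is mutual and likely enough.
        if i < j and best[j] == i and probs[i][j] >= threshold:
            pairs.append((i, j))
    return pairs


# Toy 4-nucleotide example: (0, 3) is a strong pair, (1, 2) a weak one.
toy = [
    [0.0, 0.0, 0.1, 0.9],
    [0.0, 0.0, 0.2, 0.05],
    [0.1, 0.2, 0.0, 0.0],
    [0.9, 0.05, 0.0, 0.0],
]
print(threshknot_pairs(toy, threshold=0.3))
print(threshknot_pairs(toy, threshold=0.1))
```

Raising the threshold removes the weak pair (1, 2) while keeping (0, 3), which is the trade-off between positive predictive value and sensitivity that the abstract refers to. Because pairs are chosen independently of nesting, crossing pairs (pseudoknots) are not excluded.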

preprint2020arXiv

Time lags of the type-B QPO in MAXI J1348-630

The fast variability observed in the X-ray emission from black-hole binaries has a very complex phenomenology, but offers the possibility to investigate directly the properties of the inner accretion flow. In particular, type-B oscillations in the 2-8 Hz range, observed in the Soft-Intermediate state, have been associated to the emission from a relativistic jet. We present the results of the timing and spectral analysis of a set of observations of the bright transient MAXI J1348-630 made with the NICER telescope. The observations are in the brightest part of the outburst and all feature a strong type-B QPO at ~4.5 Hz. We compute the energy dependence of the fractional rms and the phase lags at the QPO frequency, obtaining high signal-to-noise data and sampling for the first time at energies below 2 keV. The fractional rms decreases from more than 10% at 9 keV to 0.6% at 1.5 keV, and is constant below that energy. Taking the 2-3 keV band as reference, photons at all energies show a hard lag, increasing with the distance from the reference band. The behaviour below 2 keV has never been observed before, due to the higher energy bandpass of previous timing instruments. The energy spectrum can be fitted with a standard model for this state, consisting of a thin disc component and a harder power law, plus an emission line between 6 and 7 keV. We discuss the results, concentrating on the phase lags, and show that they can be interpreted within a Comptonization model.

preprint2020arXiv

Towards Label-Free 3D Segmentation of Optical Coherence Tomography Images of the Optic Nerve Head Using Deep Learning

Since the introduction of optical coherence tomography (OCT), it has been possible to study the complex 3D morphological changes of the optic nerve head (ONH) tissues that occur along with the progression of glaucoma. Although several deep learning (DL) techniques have been recently proposed for the automated extraction (segmentation) and quantification of these morphological changes, the device specific nature and the difficulty in preparing manual segmentations (training data) limit their clinical adoption. With several new manufacturers and next-generation OCT devices entering the market, the complexity in deploying DL algorithms clinically is only increasing. To address this, we propose a DL based 3D segmentation framework that is easily translatable across OCT devices in a label-free manner (i.e. without the need to manually re-segment data for each device). Specifically, we developed 2 sets of DL networks. The first (referred to as the enhancer) was able to enhance OCT image quality from 3 OCT devices, and harmonized image-characteristics across these devices. The second performed 3D segmentation of 6 important ONH tissue layers. We found that the use of the enhancer was critical for our segmentation network to achieve device independency. In other words, our 3D segmentation network trained on any of 3 devices successfully segmented ONH tissue layers from the other two devices with high performance (Dice coefficients > 0.92). With such an approach, we could automatically segment images from new OCT devices without ever needing manual segmentation data from such devices.
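
For readers unfamiliar with the Dice coefficient quoted above, a minimal sketch of the standard definition, 2|A ∩ B| / (|A| + |B|), follows. The flat 0/1 mask format, the function name `dice_coefficient`, and the convention of returning 1.0 for two empty masks are assumptions for illustration; the authors' evaluation code is not described in the abstract.

```python
def dice_coefficient(pred, truth):
    """Dice overlap between two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    size_sum = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if size_sum == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / size_sum


# One voxel overlaps; the prediction has 2 voxels and the reference has 1.
print(dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0]))
```

In a multi-layer setting such as the six tissue layers mentioned above, one would typically compute this per layer on that layer's binary mask.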

preprint2019arXiv

Overview to the Hard X-ray Modulation Telescope (Insight-HXMT) Satellite

As China's first X-ray astronomical satellite, the Hard X-ray Modulation Telescope (HXMT), which was dubbed Insight-HXMT after the launch on June 15, 2017, is a wide-band (1-250 keV) slat-collimator-based X-ray astronomy satellite with the capability of all-sky monitoring in 0.2-3 MeV. It was designed to perform pointing, scanning and gamma-ray burst (GRB) observations and, based on the Direct Demodulation Method (DDM), the image of the scanned sky region can be reconstructed. Here we give an overview of the mission and its progress, including payload, core sciences, ground calibration/facility, ground segment, data archive, software, in-orbit performance, calibration, background model, observations and some preliminary results.

preprint2019arXiv

Spaceborne low-noise single-photon detection for satellite-based quantum communications

Single-photon detectors (SPDs) play important roles in highly sensitive detection applications, such as fluorescence spectroscopy, remote sensing and ranging, deep space optical communications, elementary particle detection, and quantum communications. However, the adverse conditions in space, such as the increased radiation flux and thermal vacuum, severely limit their noise performances, reliability, and lifetime. Herein, we present the first example of spaceborne, low-noise, high reliability SPDs, based on commercial off-the-shelf (COTS) silicon avalanche photodiodes (APD). Based on the high noise-radiation sensitivity of silicon APD, we have developed special shielding structures, multistage cooling technologies, and configurable driver electronics that significantly improved the COTS APD reliability and mitigated the SPD noise-radiation sensitivity. This led to a reduction of the expected in-orbit radiation-induced dark count rate (DCR) from ~219 counts per second (cps) per day to ~0.76 cps/day. During continuous in-orbit operation spanning 1029 days, the SPD DCR was maintained below 1000 cps; the actual in-orbit radiation-induced DCR increment rate was ~0.54 cps/day, two orders of magnitude lower than those evoked by previous technologies, while its photon detection efficiency was > 45%. Our spaceborne, low-noise SPDs established a feasible satellite-based up-link quantum communication that was validated on the quantum experiment science satellite platform. Moreover, our SPDs open new windows of opportunities for space research and applications in deep-space optical communications, single-photon laser ranging, as well as for testing the fundamental principles of physics in space.

preprint2019arXiv

Unveiling delay-time-resolved phase noise dynamics of narrow-linewidth laser via coherent optical time domain reflectometry

Laser with high spectral purity plays a crucial role in high-precision optical metrology and coherent communication. Thanks to the rapid development of laser frequency stabilization, the laser phase noise can be remarkably compensated, allowing its ultra-narrow linewidth subject to mostly quantum limit. Nevertheless, the accurate characterization of phase noise dynamics and its intrinsic linewidth of a highly coherent laser remains ambiguous and challenging. Here, we present an approach capable of revealing delay-time-resolved phase noise dynamics of a coherent laser based on coherent optical time domain reflectometry (COTDR), in which distributed Rayleigh scattering along a delay fibre essentially allows a time-of-flight mapping of a heterodyne beating signal associated with delay-time-dependent phase information from a single laser source. Ultimately, this novel technique facilitates a precise measurement of ultra-narrow laser linewidth by exploiting its delay-time-resolved phase jitter statistics, confirmed with the analytical modelling and numerical simulations.

preprint2018arXiv

Doubly Robust Sure Screening for Elliptical Copula Regression Model

Regression analysis has always been a hot research topic in statistics. We propose a very flexible semi-parametric regression model called the Elliptical Copula Regression (ECR) model, which covers a large class of linear and nonlinear regression models such as the additive regression model and the single index model. Besides, the ECR model can capture the heavy-tail characteristic and tail dependence between variables, thus it could be widely applied in many areas such as econometrics and finance. In this paper we mainly focus on the feature screening problem for the ECR model in the ultra-high dimensional setting. We propose a doubly robust sure screening procedure for the ECR model, in which two types of correlation coefficient are involved: Kendall tau correlation and canonical correlation. Theoretical analysis shows that the procedure enjoys the sure screening property, i.e., with probability tending to 1, the screening procedure selects all important variables and substantially reduces the dimensionality to a moderate size against the sample size. Thorough numerical studies are conducted to illustrate its advantage over existing sure independence screening methods and thus it can be used as a safe replacement of the existing procedures in practice. Finally, the proposed procedure is applied to a real gene-expression data set to show its empirical usefulness.
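
To give a rough sense of rank-based feature screening, the sketch below ranks features by the absolute value of their marginal Kendall tau correlation with the response and keeps the top few. It is a simplified illustration only, not the doubly robust procedure of the paper: the canonical-correlation step is omitted, the tau-a estimator without tie correction is used, and the names `kendall_tau` and `screen_features` are hypothetical.

```python
def kendall_tau(x, y):
    """Kendall tau-a: (concordant - discordant) pairs over all pairs."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            if prod > 0:
                score += 1
            elif prod < 0:
                score -= 1
    return 2.0 * score / (n * (n - 1))


def screen_features(columns, response, keep):
    """Return indices of the `keep` columns with the largest |tau|."""
    taus = [kendall_tau(col, response) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda k: -abs(taus[k]))
    return ranked[:keep]


y = [1, 2, 3, 4, 5]
features = [
    [5, 4, 3, 2, 1],  # perfectly anti-monotone in y, tau = -1.0
    [2, 1, 4, 3, 5],  # mostly increasing, tau = 0.6
    [3, 1, 5, 2, 4],  # weakly related, tau = 0.2
]
print(screen_features(features, y, keep=2))
```

Because Kendall tau depends only on ranks, the ordering is unchanged under monotone transformations of a feature or the response, which is one reason rank correlations are attractive for heavy-tailed data.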