Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
62 works
0 followers
31 topics
4 close collaborators

Published work

62 published item(s)

preprint 2026 arXiv

Bayesian constraints on quark stars from multi-messenger observations

We perform a systematic Bayesian analysis of quark star equations of state under current multimessenger constraints, investigating the impact of prior assumptions and extreme-mass observations. Quark matter is modeled within an interacting MIT bag framework that consistently accommodates color-superconducting phases (2SC, 2SC+s, and CFL) and perturbative QCD corrections. We find that quark star models exhibit a distinct advantage in naturally accommodating the ultra-low mass object HESS J1731-347, a configuration that is challenging for standard neutron star models. In the high-mass regime, the interpretation of the secondary component of GW190814 is shown to be strongly prior-dependent: only broad priors allow for the substantial stiffness required to support such a massive object ($\sim$2.6 M$_\odot$), while more restrictive priors favor a softer equation of state consistent with standard pulsar populations. Microscopically, we demonstrate that current data tightly constrain the effective bag constant and the overall stiffness, but cannot distinguish between different color-superconducting phases. Furthermore, we validate a reduction of the model to two effective parameters without loss of information. Our results indicate that if quark stars exist, their sound speeds consistently exceed the conformal limit ($c_s^2>1/3$) at stellar densities.
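As a reference point for the conformal limit quoted above, it may help to recall the simplest MIT bag equation of state. The form below assumes massless, non-interacting quarks, which is a textbook baseline and not the interacting model analyzed in the paper; $B$ denotes the bag constant and $\varepsilon$ the energy density.

```latex
p = \frac{1}{3}\left(\varepsilon - 4B\right)
\quad\Longrightarrow\quad
c_s^2 = \frac{\mathrm{d}p}{\mathrm{d}\varepsilon} = \frac{1}{3}
```

Under this assumption the sound speed sits exactly at the conformal value regardless of $B$, so any $c_s^2 > 1/3$ would have to come from terms beyond the baseline, such as the interaction and pairing contributions the abstract mentions.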

preprint 2026 arXiv

Bridging External and Parametric Knowledge: Mitigating Hallucination of LLMs with Shared-Private Semantic Synergy in Dual-Stream Knowledge

Retrieval-augmented generation (RAG) aims to mitigate the hallucination of Large Language Models (LLMs) by retrieving and incorporating relevant external knowledge into the generation process. However, the external knowledge may contain noise and conflict with the parametric knowledge of LLMs, leading to degraded performance. Current LLMs lack inherent mechanisms for resolving such conflicts. To fill this gap, we propose a Dual-Stream Knowledge-Augmented Framework for Shared-Private Semantic Synergy (DSSP-RAG). Central to it is the refinement of traditional self-attention into a mixed attention that distinguishes shared and private semantics for controlled knowledge integration. An unsupervised hallucination detection method that captures the LLMs' intrinsic cognitive uncertainty ensures that external knowledge is introduced only when necessary. To reduce noise in external knowledge, an Energy Quotient (EQ), defined by attention difference matrices between task-aligned and task-misaligned layers, is proposed. Extensive experiments show that DSSP-RAG achieves superior performance over strong baselines.

preprint 2026 arXiv

CineMatte: Background Matting for Virtual Production and Beyond

LED Virtual Production (VP) uses large LED volumes to render backgrounds in real time, enabling in-camera visual effects but making post-shot changes labor-intensive. We address this with CineMatte, a robust background matting framework for VP and beyond. CineMatte employs a cross-attention-conditioned design. Instead of concatenating the background with the input, CineMatte employs a Siamese, frozen DINOv3 Vision Transformer with shared weights to encode the input frame and the captured background separately. A cross-attention module compares the two streams to predict the foreground, preserving pretrained semantics and improving robustness to background shifts. Previous ViT-based matting models use a parallel convolutional "detail branch" to recover fine details, which can cause boundary artifacts in real-world samples due to semantic misalignment with the backbone. We instead replace it with a pretrained, image-guided feature upsampler, which largely mitigates the problem. We also introduce CineMatte-4K, a 4K HDR image-video dataset captured on a professional LED VP stage. To the best of our knowledge, the image subset is the first dataset for VP matting and is non-synthetic, obtained via green-screen insertion; the video subset includes camera motion with tracked trajectories so that arbitrary backgrounds can be rendered later with correct parallax. Across CineMatte-4K and public benchmarks (VideoMatte240K, YouTubeMatte), CineMatte not only excels in VP but also generalizes robustly to real-world footage.

preprint 2026 arXiv

DRL-STAF: A Deep Reinforcement Learning Framework for State-Aware Forecasting of Complex Multivariate Hidden Markov Processes

Forecasting multivariate hidden Markov processes is challenging due to nonlinear and nonstationary observations, latent state transitions, and cross-sequence dependencies. While deep learning methods achieve strong predictive accuracy, they typically lack explicit state modeling, whereas Hidden Markov Models (HMMs) provide interpretable latent states but struggle with complex nonlinear emissions and scalability. To address these limitations, we propose DRL-STAF, a Deep Reinforcement Learning based STate-Aware Forecasting framework that jointly predicts next-step observations and estimates the corresponding hidden states for complex multivariate hidden Markov processes. Specifically, DRL-STAF models complex nonlinear emissions using deep neural networks and estimates discrete hidden states using reinforcement learning, reducing the reliance on predefined transition structures and enabling flexible adaptation to diverse temporal dynamics. In particular, DRL-STAF mitigates the state-space explosion encountered by typical multivariate HMM-based methods. Extensive experiments demonstrate that DRL-STAF outperforms HMM variants, standalone deep learning models, and existing DL-HMM hybrids in most cases, while also providing reliable hidden-state estimates.

preprint 2026 arXiv

Efficient Context Scaling with LongCat ZigZag Attention

We introduce LongCat ZigZag Attention (LoZA), a sparse attention scheme designed to transform any existing full-attention model into a sparse version with a rather limited compute budget. In long-context scenarios, LoZA can achieve significant speed-ups both for prefill-intensive (e.g., retrieval-augmented generation) and decode-intensive (e.g., tool-integrated reasoning) cases. Specifically, by applying LoZA to LongCat-Flash during mid-training, we serve LongCat-Flash-Exp as a long-context foundation model that can swiftly process up to 1 million tokens, enabling efficient long-term reasoning and long-horizon agentic capabilities.

preprint 2026 arXiv

Electronic Nematicity Revealed by Polarized Ultrafast Spectroscopy in Bilayer La$_3$Ni$_2$O$_7$

We report a polarized ultrafast pump-probe study of the normal-state electronic dynamics in bilayer La$_3$Ni$_2$O$_7$ and trilayer La$_4$Ni$_3$O$_{10}$ single crystals at ambient pressure. While both nickelates exhibit density-wave (DW) transitions accompanied by the opening of a quasiparticle relaxation bottleneck, their electronic responses display strikingly different symmetry properties. La$_4$Ni$_3$O$_{10}$ maintains an isotropic optical response across the entire temperature range. In contrast, La$_3$Ni$_2$O$_7$ exhibits a pronounced twofold ($C_2$) anisotropy in its low-temperature electronic dynamics. This electronic nematicity, evident in both the relaxation dynamics and the effective gap scales, competes with a secondary isotropic order emerging below 115 K. The presence of macroscopic electronic anisotropy in the bilayer system, and its absence in the trilayer system, suggests an intimate relation between electronic nematic fluctuations and superconducting pairing in La$_3$Ni$_2$O$_7$ that is worth deeper exploration.

preprint 2026 arXiv

EvoMemBench: Benchmarking Agent Memory from a Self-Evolving Perspective

Recent benchmarks for Large Language Model (LLM) agents mainly evaluate reasoning, planning, and execution. However, memory is also essential for agents, as it enables them to store, update, and retrieve information over time. This ability remains under-evaluated, largely because existing benchmarks do not provide a systematic way to assess memory mechanisms. In this paper, we study agent memory from a self-evolving perspective and introduce EvoMemBench, a unified benchmark organized along two axes: memory scope (in-episode vs. cross-episode) and memory content (knowledge-oriented vs. execution-oriented). We compare 15 representative memory methods with strong long-context baselines under a standardized protocol. Results show that current memory systems are still far from a general solution: long-context baselines remain highly competitive, memory helps most when the current context is insufficient or tasks are difficult, and no single memory form works consistently across all settings. Retrieval-based methods remain strong for knowledge-intensive settings, whereas procedural and long-term memory methods are more effective for execution-oriented tasks when their stored experience matches the task structure. We hope EvoMemBench facilitates future research on more effective memory systems for LLM-based agents. Our code is available at https://github.com/DSAIL-Memory/EvoMemBench.

preprint 2026 arXiv

IMPACT-Scribe: Interactive Temporal Action Segmentation with Boundary Scribbles and Query Planning

Dense temporal annotation of procedural activity videos is vital for action understanding and embodied intelligence but remains labor-intensive due to reactive tools. Each correction is treated as an isolated edit, limiting reuse of information on annotator uncertainty and model reliability. We introduce IMPACT-Scribe, a correction-driven framework for dense labeling that uses each correction to improve future human-machine collaboration. IMPACT-Scribe combines uncertainty-aware boundary scribble supervision, local proposal modeling, cost-aware query planning, structured propagation, and correction-driven adaptation. Experiments and a human study show that this closed-loop design improves labeling quality per effort, enhances boundary accuracy, and fosters better human-machine interaction over time. The code will be made publicly available at https://github.com/BanzQians/IMPACT_AS.

preprint 2026 arXiv

Kagome goldene with flat bands and Dirac nodal line fermions via line-graph epitaxy

The kagome lattice has emerged as a promising platform for investigating exotic quantum phases. However, achieving a single-atomic-layer kagome lattice in elemental materials remains a significant challenge. Here, we introduce line-graph epitaxy, a novel approach that enables the atomic-scale synthesis of goldene, a monolayer of elemental gold atoms arranged in a kagome lattice. Through scanning tunneling microscopy/spectroscopy (STM/STS) and density functional theory (DFT) calculations, we demonstrate the formation of kagome goldene, featuring a flat band with a van Hove singularity approximately 1.1 eV below the Fermi level, signaling strong electron correlation effects. Notably, the flat band is disrupted at the zigzag edges of goldene nanoflakes, revealing substantial edge effects. Furthermore, our calculations show that weak interlayer interactions between goldene and the underlying Au$_2$Ge substrate generate dual Dirac nodal lines through a proximity effect. These findings offer not only a novel strategy for constructing elemental kagome lattices, but also a generalizable framework for fabricating and controlling line-graph materials. This research advances the exploration of quantum phases driven by strong correlations and the design of materials for next-generation quantum technologies.

preprint 2026 arXiv

MiMo-V2-Flash Technical Report

We present MiMo-V2-Flash, a Mixture-of-Experts (MoE) model with 309B total parameters and 15B active parameters, designed for fast, strong reasoning and agentic capabilities. MiMo-V2-Flash adopts a hybrid attention architecture that interleaves Sliding Window Attention (SWA) with global attention, with a 128-token sliding window under a 5:1 hybrid ratio. The model is pre-trained on 27 trillion tokens with Multi-Token Prediction (MTP), employing a native 32k context length and subsequently extended to 256k. To efficiently scale post-training compute, MiMo-V2-Flash introduces a novel Multi-Teacher On-Policy Distillation (MOPD) paradigm. In this framework, domain-specialized teachers (e.g., trained via large-scale reinforcement learning) provide dense and token-level reward, enabling the student model to perfectly master teacher expertise. MiMo-V2-Flash rivals top-tier open-weight models such as DeepSeek-V3.2 and Kimi-K2, despite using only 1/2 and 1/3 of their total parameters, respectively. During inference, by repurposing MTP as a draft model for speculative decoding, MiMo-V2-Flash achieves up to 3.6 acceptance length and 2.6x decoding speedup with three MTP layers. We open-source both the model weights and the three-layer MTP weights to foster open research and community collaboration.
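The hybrid attention layout described above can be illustrated with a small sketch. It assumes that "5:1" means five sliding-window layers followed by one global layer in each group of six, and that the window counts the current token; neither detail is stated in the abstract. The function names are hypothetical and do not come from the MiMo-V2-Flash code.

```python
def layer_schedule(num_layers, swa_per_global=5):
    """Label each layer 'swa' or 'global'.

    Assumption: every (swa_per_global + 1)-th layer is global and the
    rest use sliding-window attention.
    """
    period = swa_per_global + 1
    return ["global" if (i + 1) % period == 0 else "swa"
            for i in range(num_layers)]


def sliding_window_mask(seq_len, window):
    """mask[q][k] is True when query q may attend to key k.

    The mask is causal (k <= q) and limited to the most recent
    `window` positions, counting the current token (an assumption).
    """
    return [[(k <= q) and (q - k < window) for k in range(seq_len)]
            for q in range(seq_len)]


def attended_pairs(seq_len, window=None):
    """Count (query, key) pairs; window=None means full causal attention."""
    if window is None:
        return seq_len * (seq_len + 1) // 2
    return sum(sum(row) for row in sliding_window_mask(seq_len, window))


if __name__ == "__main__":
    print(layer_schedule(12))
    # Compare the attention cost of one sliding-window layer with one
    # global layer at the 128-token window quoted in the abstract.
    print(attended_pairs(1024, window=128), attended_pairs(1024))
```

In a sliding-window layer the number of attended pairs grows linearly with sequence length once the sequence exceeds the window, whereas a global layer grows quadratically. This is the usual motivation for interleaving the two kinds of layer.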

preprint 2026 arXiv

PruneTIR: Inference-Time Tool Call Pruning for Effective yet Efficient Tool-Integrated Reasoning

Tool-integrated reasoning (TIR) enables large language models (LLMs) to enhance their capabilities by interacting with external tools, such as code interpreters (CI). Most recent studies focus on exploring various methods to equip LLMs with the ability to use tools. However, how to further boost the reasoning ability of already tool-capable LLMs at inference time remains underexplored. Improving reasoning at inference time requires no additional training and can help LLMs better leverage tools to solve problems. We observe that, during tool-capable LLM inference, both the number and the proportion of erroneous tool calls are negatively correlated with answer correctness. Moreover, erroneous tool calls are typically resolved successfully within a few subsequent turns. If not, LLMs often struggle to resolve such errors even with many additional turns. Building on the above observations, we propose PruneTIR, a rather effective yet efficient framework that enhances the tool-integrated reasoning at inference time. During LLM inference, PruneTIR prunes trajectories, resamples tool calls, and suspends tool usage through three components: Success-Triggered Pruning, Stuck-Triggered Pruning and Resampling, and Retry-Triggered Tool Suspension. These three components enable PruneTIR to mitigate the negative impact of erroneous tool calls and prevent LLMs from getting stuck in repeated failed resolution attempts, thereby improving overall LLM performance. Extensive experimental results demonstrate the effectiveness of PruneTIR, which significantly improves Pass@1 and efficiency while reducing the working context length for tool-capable LLMs.

preprint 2026 arXiv

Rapid post-merger signal of circularly polarized gravitational wave from magnetic black hole superradiance: novel approach to detect magnetic monopole

We present an analytic framework demonstrating that a spinning black hole endowed with a net magnetic charge exhibits a dramatically amplified superradiant instability against charged scalar fields, enhanced by several orders of magnitude compared with the neutral Kerr case. The amplification arises from a monopole-induced reduction of the centrifugal barrier. This shift deepens the gravitational bound-state potential well and produces a parametrically larger instability growth rate. The resulting rapid growth yields a macroscopic boson cloud that acts as a coherent source of nearly monochromatic continuous gravitational waves (GWs). We find an enhanced GW power. Monopole harmonic selection rules restrict the emission from the north (south) clouds corresponding to opposite helicities. Their superposition generates (approximately) circularly polarized continuous GWs at a fixed sky location within even-parity general relativity, distinct from the generic elliptical polarization of the Kerr case. In light of these new findings, we propose a potential smoking-gun search strategy for magnetic monopoles and ultralight bosons: the rapid post-merger follow-up GW signals from binary-black-hole merger remnants through ground-based and space-based GW experiments. In contrast to the Kerr case, where the signal turn-on can be delayed by decades to centuries, a magnetic remnant can form a cloud and emit stronger, circularly polarized continuous GWs within weeks to months. Taking magnetic supermassive remnants as an example, we demonstrate that the rapid follow-up GW signal in the mHz band appears within just a few weeks after binary black hole mergers. Moreover, future polarization (ellipticity) measurements can distinguish the magnetic scenario from Kerr while providing a parity-even mechanism for circularly polarized GWs in general relativity.

preprint 2026 arXiv

Sandwich Reasoning: An Answer-Reasoning-Answer Approach for Low-Latency Query Correction

Query correction is a critical entry point in modern search pipelines, demanding high accuracy strictly within real-time latency constraints. Chain-of-Thought (CoT) reasoning improves accuracy but incurs prohibitive latency for real-time query correction. A potential solution is to output an answer before reasoning to reduce latency; however, under autoregressive decoding, the early answer is independent of subsequent reasoning, preventing the model from leveraging its reasoning capability to improve accuracy. To address this issue, we propose Sandwich Reasoning (SandwichR), a novel approach that explicitly aligns a fast initial answer with post-hoc reasoning, enabling low-latency query correction without sacrificing reasoning-aware accuracy. SandwichR follows an Answer-Reasoning-Answer paradigm, producing an initial correction, an explicit reasoning process, and a final refined correction. To align the initial answer with post-reasoning insights, we design a consistency-aware reinforcement learning (RL) strategy: a dedicated consistency reward enforces alignment between the initial and final corrections, while margin-based rejection sampling prioritizes borderline samples where reasoning drives the most impactful corrective gains. Additionally, we construct a high-quality query correction dataset, addressing the lack of specialized benchmarks for complex query correction. Experimental results demonstrate that SandwichR achieves SOTA accuracy comparable to standard CoT while delivering a 40-70% latency reduction, resolving the latency-accuracy trade-off in online search.

preprint 2026 arXiv

SpatialJB: How Text Distribution Art Becomes the "Jailbreak Key" for LLM Guardrails

While Large Language Models (LLMs) have powerful capabilities, they remain vulnerable to jailbreak attacks, which are a critical barrier to their safe real-time web application. Current commercial LLM providers deploy output guardrails to filter harmful outputs, yet these defenses are not impenetrable. Due to LLMs' reliance on autoregressive, token-by-token inference, their semantic representations lack robustness to spatially structured perturbations, such as redistributing tokens across different rows, columns, or diagonals. Exploiting the Transformer's spatial weakness, we propose SpatialJB to disrupt the model's output generation process, allowing harmful content to bypass guardrails without detection. Comprehensive experiments conducted on leading LLMs achieve a nearly 100% attack success rate (ASR), demonstrating the high effectiveness of SpatialJB. Even after adding advanced output guardrails, like the OpenAI Moderation API, SpatialJB consistently maintains a success rate exceeding 75%, outperforming current jailbreak techniques by a significant margin. SpatialJB exposes a key weakness in current guardrails and emphasizes the importance of spatial semantics, offering new insights to advance LLM safety research. To prevent potential misuse, we also present baseline defense strategies against SpatialJB and evaluate their effectiveness in mitigating such attacks. The code for the attack, baseline defenses, and a demo are available at https://anonymous.4open.science/r/SpatialJailbreak-8E63.

preprint 2026 arXiv

UniPPTBench: A Unified Benchmark for Presentation Generation Across Diverse Input Settings

Existing works typically focus on presentation generation under isolated input settings, whereas real-world use cases span diverse scenarios, including vague user prompts, long documents, multimodal materials, and multiple heterogeneous sources. Moreover, current evaluations are often insufficiently scenario-specific. They mainly rely on generic presentation-quality criteria, such as visual appeal, layout quality, and overall coherence, but fail to assess the core capabilities required by different input settings, including grounded compression, visual-text alignment, and cross-source synthesis. Consequently, the field lacks a unified benchmark and a scenario-aware evaluation framework for faithfully diagnosing presentation-generation systems across diverse real-world settings. We present UniPPTBench, a unified benchmark for presentation generation across four representative input settings: vague-prompt, long-document, multimodal-document, and multi-source generation. We further introduce UniPPTEval, a scenario-aware evaluation protocol that combines shared metrics for cross-setting comparison with scenario-specific metrics tailored to the core requirements of each setting. We also provide transparent reference baselines to support reproducible comparison. Experiments on UniPPTBench reveal substantial performance variation across settings and recurring failure modes in content grounding, multimodal integration, and cross-source synthesis. In particular, strong performance on generic presentation-quality metrics does not necessarily imply strong task fulfillment in grounded scenarios. Together, UniPPTBench and UniPPTEval provide a faithful and diagnostic foundation for evaluating presentation generation across diverse real-world scenarios. Code and data will be publicly available.

preprint 2025 arXiv

Difference between quark stars and neutron stars in universal relations and their effect on gravitational waves

We calculate the $f$-mode frequency and tidal overlap of quark stars using the full general relativity method. We verify the universal relations obtained from conventional neutron stars in the case of quark stars and explore the cases with different values of parameters of the quark star equation of state. Since quark stars have significantly smaller radii compared to neutron stars in the low mass range, the relation between the tidal deformability and $f$-mode frequency times radius is different for neutron stars and quark stars. This difference has an impact on the dynamical tide, which is the lowest-order effect we know of that can distinguish quark stars and neutron stars from the gravitational wave during the inspiral phase. We calculate the tidal dephasing caused by this effect in the post-Newtonian method and find that it cannot be detected even by the next-generation gravitational wave detectors.

preprint 2024 arXiv

Hybrid Strangeon Stars

It was conjectured that the basic units of the ground state of bulk strong matter may be strange-clusters called strangeons, and they can form self-bound strangeon stars that are highly compact. Strangeon stars can develop a strange quark matter (SQM) core at high densities, particularly in the color-flavor-locking phase, yielding a branch of hybrid strangeon stars. We explore the stellar structure and astrophysical implications of hybrid strangeon stars. We find that hybrid strangeon stars can meet various astrophysical constraints on pulsar masses, radii, and tidal deformabilities. Finally, we show that the strangeon-SQM mixed phase is not preferred if the charge-neutrality condition is imposed at the strangeon-SQM transition region.

preprint 2023 arXiv

Transition routes of electrokinetic flow in a divergent microchannel with bending walls

Electrokinetic flow can be generated as a highly coupled phenomenon among the velocity field, electric conductivity field and electric field. It can exhibit different responses to AC electric fields in different frequency regimes, according to different instability/receptivity mechanisms. In this investigation, using both flow visualization and the single-point laser-induced fluorescence (LIF) method, the response of AC electrokinetic flow and the transition routes towards chaos and turbulence have been experimentally investigated. We find that when the AC frequency $f_f<30$ Hz, the interface responds at both the neutral frequency of the basic flow and the AC frequency. However, when $f_f\geq 30$ Hz, the interface responds only at the neutral frequency of the basic flow. Both period-doubling and subcritical bifurcations have been observed in the transition of AC electrokinetic flow. We hope the current investigation can advance our understanding of the ultrafast transition process of electrokinetic flow from the laminar state to turbulence.

preprint 2022 arXiv

A two-step backward compatible fullband speech enhancement system

Speech enhancement methods based on deep learning have surpassed traditional methods. While many of these new approaches operate at the wideband (16 kHz) sample rate, a new fullband (48 kHz) speech enhancement system is proposed in this paper. Compared to the existing fullband systems that utilize perceptually motivated features to train fullband speech enhancement using a single network structure, the proposed system is a two-step system that ensures good fullband speech enhancement quality while remaining backward compatible with the existing wideband systems.

preprint 2022 arXiv

A Weakly Supervised Learning Framework for Salient Object Detection via Hybrid Labels

Fully-supervised salient object detection (SOD) methods have made great progress, but such methods often rely on a large number of pixel-level annotations, which are time-consuming and labour-intensive. In this paper, we focus on a new weakly-supervised SOD task under hybrid labels, where the supervision labels include a large number of coarse labels generated by a traditional unsupervised method and a small number of real labels. To address the issues of label noise and quantity imbalance in this task, we design a new pipeline framework with three sophisticated training strategies. In terms of model framework, we decouple the task into a label refinement sub-task and a salient object detection sub-task, which cooperate with each other and train alternately. Specifically, the R-Net is designed as a two-stream encoder-decoder model equipped with Blender with Guidance and Aggregation Mechanisms (BGA), aiming to rectify the coarse labels for more reliable pseudo-labels, while the S-Net is a replaceable SOD network supervised by the pseudo-labels generated by the current R-Net. Note that only the trained S-Net is needed for testing. Moreover, in order to guarantee the effectiveness and efficiency of network training, we design three training strategies, including an alternate iteration mechanism, a group-wise incremental mechanism, and a credibility verification mechanism. Experiments on five SOD benchmarks show that our method achieves competitive performance against weakly-supervised/unsupervised methods both qualitatively and quantitatively.

preprint 2022 arXiv

Adaptable Text Matching via Meta-Weight Regulator

Neural text matching models have been used in a range of applications such as question answering and natural language inference, and have yielded good performance. However, these neural models have limited adaptability, resulting in a decline in performance when encountering test examples from a different dataset or even a different task. Adaptability is particularly important in the few-shot setting: in many cases, there is only a limited amount of labeled data available for a target dataset or task, while we may have access to a richly labeled source dataset or task. However, adapting a model trained on the abundant source data to a few-shot target dataset or task is challenging. To tackle this challenge, we propose a Meta-Weight Regulator (MWR), which is a meta-learning approach that learns to assign weights to the source examples based on their relevance to the target loss. Specifically, MWR first trains the model on the uniformly weighted source examples, and measures the efficacy of the model on the target examples via a loss function. By iteratively performing a (meta) gradient descent, high-order gradients are propagated to the source examples. These gradients are then used to update the weights of source examples, in a way that is relevant to the target performance. As MWR is model-agnostic, it can be applied to any backbone neural model. Extensive experiments are conducted with various backbone text matching models, on four widely used datasets and two tasks. The results demonstrate that our proposed approach significantly outperforms a number of existing adaptation methods and effectively improves the cross-dataset and cross-task adaptability of the neural text matching models in the few-shot setting.
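The reweighting loop described above can be sketched on a toy problem. The code below is a generic one-step-lookahead example-reweighting scheme for a scalar linear model with squared loss, written under the assumption that it captures the general idea; it is not the authors' MWR implementation, and every name in it is hypothetical. The meta-gradient is worked out by hand with the chain rule, so no autodiff library is needed.

```python
def grad(w, x, y):
    """Gradient of 0.5 * (w * x - y) ** 2 with respect to w."""
    return (w * x - y) * x


def reweight_step(w, weights, source, target, lr=0.1, meta_lr=1.0):
    """One inner update on weighted source data, then one weight update.

    source and target are lists of (x, y) pairs; weights holds one
    non-negative weight per source example.
    """
    # Inner step: a single gradient step on the weighted source loss.
    src_grads = [grad(w, x, y) for x, y in source]
    w_new = w - lr * sum(a * g for a, g in zip(weights, src_grads))

    # Target-loss gradient at the updated parameter, averaged over examples.
    tgt_grad = sum(grad(w_new, x, y) for x, y in target) / len(target)

    # Chain rule: dL_target/da_i = tgt_grad * d(w_new)/da_i
    #                            = tgt_grad * (-lr * g_i).
    meta_grads = [tgt_grad * (-lr * g) for g in src_grads]

    # Gradient descent on the weights, clipped at zero and renormalized.
    new_weights = [max(0.0, a - meta_lr * m)
                   for a, m in zip(weights, meta_grads)]
    total = sum(new_weights)
    if total > 0:
        new_weights = [a / total for a in new_weights]
    return w_new, new_weights


if __name__ == "__main__":
    # The first source example agrees with the target relation y = 2x,
    # the second contradicts it.
    source = [(1.0, 2.0), (1.0, -2.0)]
    target = [(1.0, 2.0)]
    w, weights = reweight_step(0.0, [0.5, 0.5], source, target)
    print(w, weights)
```

In this toy setting the source example whose gradient points the same way as the target gradient gains weight and the conflicting one loses weight, which is the qualitative behavior the abstract attributes to weighting source examples by their relevance to the target loss.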

preprint 2022 arXiv

ANT: Exploiting Adaptive Numerical Data Type for Low-bit Deep Neural Network Quantization

Quantization is a technique to reduce the computation and memory cost of DNN models, which are getting increasingly large. Existing quantization solutions use fixed-point integer or floating-point types, which have limited benefits, as both require more bits to maintain the accuracy of original models. On the other hand, variable-length quantization uses low-bit quantization for normal values and high-precision for a fraction of outlier values. Even though this line of work brings algorithmic benefits, it also introduces significant hardware overheads due to variable-length encoding and decoding. In this work, we propose a fixed-length adaptive numerical data type called ANT to achieve low-bit quantization with tiny hardware overheads. Our data type ANT leverages two key innovations to exploit the intra-tensor and inter-tensor adaptive opportunities in DNN models. First, we propose a particular data type, flint, that combines the advantages of float and int for adapting to the importance of different values within a tensor. Second, we propose an adaptive framework that selects the best type for each tensor according to its distribution characteristics. We design a unified processing element architecture for ANT and show its ease of integration with existing DNN accelerators. Our design results in 2.8$\times$ speedup and 2.5$\times$ energy efficiency improvement over the state-of-the-art quantization accelerators.

preprint 2022 arXiv

Aspect-specific Context Modeling for Aspect-based Sentiment Analysis

Aspect-based sentiment analysis (ABSA) aims at predicting sentiment polarity (SC) or extracting opinion span (OE) expressed towards a given aspect. Previous work in ABSA mostly relies on rather complicated aspect-specific feature induction. Recently, pretrained language models (PLMs), e.g., BERT, have been used as context modeling layers to simplify the feature induction structures and achieve state-of-the-art performance. However, such PLM-based context modeling may not be sufficiently aspect-specific. Therefore, a key question is left under-explored: how can the aspect-specific context be better modeled through PLMs? To answer the question, we attempt to enhance aspect-specific context modeling with PLMs in a non-intrusive manner. We propose three aspect-specific input transformations, namely aspect companion, aspect prompt, and aspect marker. Informed by these transformations, non-intrusive aspect-specific PLMs can be achieved to promote the PLM to pay more attention to the aspect-specific context in a sentence. Additionally, we craft an adversarial benchmark for ABSA (advABSA) to see how aspect-specific modeling can impact model robustness. Extensive experimental results on standard and adversarial benchmarks for SC and OE demonstrate the effectiveness and robustness of the proposed method, yielding new state-of-the-art performance on OE and competitive performance on SC.

preprint2022arXiv

Automatic Song Translation for Tonal Languages

This paper develops automatic song translation (AST) for tonal languages and addresses the unique challenge of aligning words' tones with the melody of a song in addition to conveying the original meaning. We propose three criteria for effective AST -- preserving meaning, singability and intelligibility -- and design metrics for these criteria. We develop a new benchmark for English--Mandarin song translation and develop an unsupervised AST system, Guided AliGnment for Automatic Song Translation (GagaST), which combines pre-training with three decoding constraints. Both automatic and human evaluations show GagaST successfully balances semantics and singability.

preprint2022arXiv

BANet: Motion Forecasting with Boundary Aware Network

We propose a motion forecasting model called BANet (Boundary-Aware Network), a variant of LaneGCN. We believe that it is not enough to use only the lane centerline as input to obtain the embedding features of the vector map nodes. The lane centerline can only provide the topology of the lanes, while other elements of the vector map also contain rich information. For example, the lane boundary can provide important traffic rule constraints, such as whether it is possible to change lanes. Therefore, we achieved better performance by encoding more vector map elements in the motion forecasting model. We report our results on the 2022 Argoverse2 Motion Forecasting challenge and rank 1st on the test leaderboard.

preprint2022arXiv

CATCH: Chasing All Transients Constellation Hunters Space Mission

In time-domain astronomy, a substantial number of transients will be discovered by multi-wavelength and multi-messenger observatories, posing a great challenge for follow-up capabilities. We have thus proposed an intelligent X-ray constellation, the Chasing All Transients Constellation Hunters (CATCH) space mission. Consisting of 126 micro-satellites in three types, CATCH will have the capability to perform follow-up observations for a large number of different types of transients simultaneously. Each satellite in the constellation will carry lightweight X-ray optics and use a deployable mast to increase the focal length. The combination of different optics and detector systems enables different types of satellites to have multiform observation capabilities, including timing, spectroscopy, imaging, and polarization. Controlled by the intelligent system, different satellites can cooperate to perform uninterrupted monitoring, all-sky follow-up observations, and scanning observations with a flexible field of view (FOV) and multi-dimensional observations. Therefore, CATCH will be a powerful mission to study the dynamic universe. Here, we present the current design of the spacecraft, optics, detector system, constellation configuration and observing modes, as well as the development plan.

preprint2022arXiv

Connection-oriented and Connectionless Quantum Internet Considering Quantum Repeaters

With the rapid development of quantum information and technology in recent years, the construction of quantum internet for interconnecting all kinds of quantum devices, such as quantum processors and sensors, will be the next trend for practical quantum applications. In this paper, we propose the protocols for construction of connection-oriented and connectionless quantum networks by considering the concrete quantum repeater (QR) nodes. Four classes of QRs networks are considered first and designed with two types of protocols in link layer, i.e. simultaneous and one-by-one link. Based on those two link models, the connection-oriented protocol is presented for all classes of QRs networks and the connectionless protocol is proposed for the first, second and third classes QRs networks by only one-by-one link. Furthermore, we introduce a new hybrid connection model of quantum networks combined with connection-oriented and connectionless for practical uses. Our work is a new attempt to study the model of the network layer for different kinds of QR networks and paves the way for developing the protocol stack of universal large-scale quantum internet.

preprint2022arXiv

Dark Confinement and Chiral Phase Transitions: Gravitational Waves vs Matter Representations

We study the gravitational-wave signal stemming from strongly coupled models featuring both dark chiral and confinement phase transitions. We therefore identify strongly coupled theories that can feature a first-order phase transition. Employing the Polyakov-Nambu-Jona-Lasinio model, we focus our attention on SU(3) Yang-Mills theories featuring fermions in fundamental, adjoint, and two-index symmetric representations. We discover that, for the gravitational-wave signal analysis, there are significant differences between the various representations. Interestingly, we also observe that the two-index symmetric representation leads to the strongest first-order phase transition and therefore to a higher chance of being detected by the Big Bang Observer experiment. Our study of the confinement and chiral phase transitions is further applicable to extensions of the Standard Model featuring composite dynamics.

preprint2022arXiv

Data-Driven Decision Making in COVID-19 Response: A Survey

COVID-19 has spread all over the world, having an enormous effect on our daily life and work. In response to the epidemic, a lot of important decisions need to be taken to save communities and economies worldwide. Data clearly plays a vital role in effective decision making. Data-driven decision making uses data related evidence and insights to guide the decision making process and to verify the plan of action before it is committed. To better handle the epidemic, governments and policy making institutes have investigated abundant data originating from COVID-19. These data include those related to medicine, knowledge, media, etc. Based on these data, many prevention and control policies are made. In this survey paper, we summarize the progress of data-driven decision making in the response to COVID-19, including COVID-19 prevention and control, psychological counselling, financial aid, work resumption, and school re-opening. We also propose some current challenges and open issues in data-driven decision making, including data collection and quality, complex data analysis, and fairness in decision making. This survey paper sheds light on current policy making driven by data, which also provides a feasible direction for further scientific research.

preprint2022arXiv

Familiarity-based Collaborative Team Recognition in Academic Social Networks

Collaborative teamwork is key to major scientific discoveries. However, the prevalence of collaboration among researchers makes team recognition increasingly challenging. Previous studies have demonstrated that people are more likely to collaborate with individuals they are familiar with. In this work, we employ the definition of familiarity and then propose MOTO (faMiliarity-based cOllaborative Team recOgnition algorithm) to recognize collaborative teams. MOTO calculates the shortest distance matrix within the global collaboration network and the local density of each node. Central team members are initially recognized based on local density. Then MOTO recognizes the remaining team members by using the familiarity metric and shortest distance matrix. Extensive experiments have been conducted upon a large-scale data set. The experimental results show that compared with baseline methods, MOTO can recognize the largest number of teams. The teams recognized by MOTO possess more cohesive team structures and lower team communication costs compared with other methods. MOTO utilizes familiarity in team recognition to identify cohesive academic teams. The recognized teams are in line with real-world collaborative teamwork patterns. Based on team recognition using MOTO, the research team structure and performance are further analyzed for given time periods. The number of teams that consist of members from different institutions increases gradually. Such teams are found to perform better in comparison with those whose members are from the same institution.

preprint2022arXiv

Knowledge-enhanced Iterative Instruction Generation and Reasoning for Knowledge Base Question Answering

Multi-hop Knowledge Base Question Answering (KBQA) aims to find the answer entity in a knowledge base which is several hops from the topic entity mentioned in the question. Existing retrieval-based approaches first generate instructions from the question and then use them to guide the multi-hop reasoning on the knowledge graph. As the instructions are fixed during the whole reasoning procedure and the knowledge graph is not considered in instruction generation, the model cannot revise its mistake once it predicts an intermediate entity incorrectly. To handle this, we propose KBIGER (Knowledge Base Iterative Instruction GEnerating and Reasoning), a novel and efficient approach to generate the instructions dynamically with the help of the reasoning graph. Instead of generating all the instructions before reasoning, we take the (k-1)-th reasoning graph into consideration to build the k-th instruction. In this way, the model can check the prediction from the graph and generate new instructions to revise the incorrect prediction of intermediate entities. We conduct experiments on two multi-hop KBQA benchmarks and outperform the existing approaches, establishing a new state of the art. Further experiments show that our method does detect the incorrect prediction of intermediate entities and has the ability to revise such errors.

preprint2022arXiv

MDD-Eval: Self-Training on Augmented Data for Multi-Domain Dialogue Evaluation

Chatbots are designed to carry out human-like conversations across different domains, such as general chit-chat, knowledge exchange, and persona-grounded conversations. To measure the quality of such conversational agents, a dialogue evaluator is expected to conduct assessment across domains as well. However, most of the state-of-the-art automatic dialogue evaluation metrics (ADMs) are not designed for multi-domain evaluation. We are motivated to design a general and robust framework, MDD-Eval, to address the problem. Specifically, we first train a teacher evaluator with human-annotated data to acquire a rating skill to tell good dialogue responses from bad ones in a particular domain, and then adopt a self-training strategy to train a new evaluator with teacher-annotated multi-domain data, which helps the new evaluator generalize across multiple domains. MDD-Eval is extensively assessed on six dialogue evaluation benchmarks. Empirical results show that the MDD-Eval framework achieves strong performance, with an absolute improvement of 7% over the state-of-the-art ADMs in terms of mean Spearman correlation scores across all the evaluation benchmarks.

preprint2022arXiv

Multi-scale temporal-frequency attention for music source separation

In recent years, deep neural network (DNN) based approaches have achieved state-of-the-art performance for music source separation (MSS). Although previous methods have addressed large receptive field modeling in various ways, the temporal and frequency correlations of the music spectrogram with repeated patterns have not been explicitly explored for the MSS task. In this paper, a temporal-frequency attention module is proposed to model the spectrogram correlations along both temporal and frequency dimensions. Moreover, a multi-scale attention is proposed to effectively capture the correlations for music signals. The experimental results on the MUSDB18 dataset show that the proposed method outperforms the existing state-of-the-art systems with 9.51 dB signal-to-distortion ratio (SDR) on separating the vocal stems, which is the primary practical application of MSS.

preprint2022arXiv

NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality

Text to speech (TTS) has made rapid progress in both academia and industry in recent years. Some questions naturally arise: whether a TTS system can achieve human-level quality, how to define and judge that quality, and how to achieve it. In this paper, we answer these questions by first defining human-level quality based on the statistical significance of a subjective measure and introducing appropriate guidelines to judge it, and then developing a TTS system called NaturalSpeech that achieves human-level quality on a benchmark dataset. Specifically, we leverage a variational autoencoder (VAE) for end-to-end text to waveform generation, with several key modules to enhance the capacity of the prior from text and reduce the complexity of the posterior from speech, including phoneme pre-training, differentiable duration modeling, bidirectional prior/posterior modeling, and a memory mechanism in VAE. Experimental evaluations on the popular LJSpeech dataset show that our proposed NaturalSpeech achieves -0.01 CMOS (comparative mean opinion score) to human recordings at the sentence level, with Wilcoxon signed rank test at p-level p >> 0.05, which demonstrates no statistically significant difference from human recordings for the first time on this dataset.

preprint2022arXiv

ReLyMe: Improving Lyric-to-Melody Generation by Incorporating Lyric-Melody Relationships

Lyric-to-melody generation, which generates melody according to given lyrics, is one of the most important automatic music composition tasks. With the rapid development of deep learning, previous works address this task with end-to-end neural network models. However, deep learning models cannot well capture the strict but subtle relationships between lyrics and melodies, which compromises the harmony between lyrics and generated melodies. In this paper, we propose ReLyMe, a method that incorporates Relationships between Lyrics and Melodies from music theory to ensure the harmony between lyrics and melodies. Specifically, we first introduce several principles that lyrics and melodies should follow in terms of tone, rhythm, and structure relationships. These principles are then integrated into neural network lyric-to-melody models by adding corresponding constraints during the decoding process to improve the harmony between lyrics and melodies. We use a series of objective and subjective metrics to evaluate the generated melodies. Experiments on both English and Chinese song datasets show the effectiveness of ReLyMe, demonstrating the superiority of incorporating lyric-melody relationships from the music domain into neural lyric-to-melody generation.

preprint2022arXiv

Split Hierarchical Variational Compression

Variational autoencoders (VAEs) have witnessed great success in performing the compression of image datasets. This success, made possible by the bits-back coding framework, has produced competitive compression performance across many benchmarks. However, despite this, VAE architectures are currently limited by a combination of coding practicalities and compression ratios. That is, not only do state-of-the-art methods, such as normalizing flows, often demonstrate out-performance, but the initial bits required in coding make single and parallel image compression challenging. To remedy this, we introduce Split Hierarchical Variational Compression (SHVC). SHVC introduces two novelties. Firstly, we propose an efficient autoregressive prior, the autoregressive sub-pixel convolution, that allows a generalisation between per-pixel autoregressions and fully factorised probability models. Secondly, we define our coding framework, the autoregressive initial bits, that flexibly supports parallel coding and avoids -- for the first time -- many of the practicalities commonly associated with bits-back coding. In our experiments, we demonstrate SHVC is able to achieve state-of-the-art compression performance across full-resolution lossless image compression tasks, with up to 100x fewer model parameters than competing VAE approaches.

preprint2022arXiv

SQuant: On-the-Fly Data-Free Quantization via Diagonal Hessian Approximation

Quantization of deep neural networks (DNN) has been proven effective for compressing and accelerating DNN models. Data-free quantization (DFQ) is a promising approach without the original datasets under privacy-sensitive and confidential scenarios. However, current DFQ solutions degrade accuracy, need synthetic data to calibrate networks, and are time-consuming and costly. This paper proposes an on-the-fly DFQ framework with sub-second quantization time, called SQuant, which can quantize networks on inference-only devices with low computation and memory requirements. With the theoretical analysis of the second-order information of DNN task loss, we decompose and approximate the Hessian-based optimization objective into three diagonal sub-items, which have different areas corresponding to three dimensions of weight tensor: element-wise, kernel-wise, and output channel-wise. Then, we progressively compose sub-items and propose a novel data-free optimization objective in the discrete domain, minimizing Constrained Absolute Sum of Error (or CASE in short), which surprisingly does not need any dataset and is even not aware of network architecture. We also design an efficient algorithm without back-propagation to further reduce the computation complexity of the objective solver. Finally, without fine-tuning and synthetic datasets, SQuant accelerates the data-free quantization process to a sub-second level with >30% accuracy improvement over the existing data-free post-training quantization works, with the evaluated models under 4-bit quantization. We have open-sourced the SQuant framework at https://github.com/clevercool/SQuant.

preprint2022arXiv

Structural Bias for Aspect Sentiment Triplet Extraction

Structural bias has recently been exploited for aspect sentiment triplet extraction (ASTE) and led to improved performance. On the other hand, it is recognized that explicitly incorporating structural bias would have a negative impact on efficiency, whereas pretrained language models (PLMs) can already capture implicit structures. Thus, a natural question arises: Is structural bias still a necessity in the context of PLMs? To answer the question, we propose to address the efficiency issues by using an adapter to integrate structural bias in the PLM and using a cheap-to-compute relative position structure in place of the syntactic dependency structure. Benchmarking evaluation is conducted on the SemEval datasets. The results show that our proposed structural adapter is beneficial to PLMs and achieves state-of-the-art performance over a range of strong baselines, yet with a light parameter demand and low latency. Meanwhile, we give rise to the concern that the current evaluation default with data of small scale is under-confident. Consequently, we release a large-scale dataset for ASTE. The results on the new dataset hint that the structural adapter is confidently effective and efficient to a large scale. Overall, we draw the conclusion that structural bias shall still be a necessity even with PLMs.

preprint2022arXiv

SummScore: A Comprehensive Evaluation Metric for Summary Quality Based on Cross-Encoder

Text summarization models are often trained to produce summaries that meet human quality requirements. However, the existing evaluation metrics for summary text are only rough proxies for summary quality, suffering from low correlation with human scoring and inhibition of summary diversity. To solve these problems, we propose SummScore, a comprehensive metric for summary quality evaluation based on Cross-Encoder. Firstly, by adopting the original-summary measurement mode and comparing the semantics of the original text, SummScore gets rid of the inhibition of summary diversity. With the help of the text-matching pre-training Cross-Encoder, SummScore can effectively capture the subtle differences between the semantics of summaries. Secondly, to improve the comprehensiveness and interpretability, SummScore consists of four fine-grained submodels, which measure Coherence, Consistency, Fluency, and Relevance separately. We use semi-supervised multi-round training to improve the performance of our model on extremely limited annotated data. Extensive experiments show that SummScore significantly outperforms existing evaluation metrics in the above four dimensions in correlation with human scoring. We also provide the quality evaluation results of SummScore on 16 mainstream summarization models for later research.

preprint2022arXiv

The Astropy Project: Sustaining and Growing a Community-oriented Open-source Project and the Latest Major Release (v5.0) of the Core Package

The Astropy Project supports and fosters the development of open-source and openly-developed Python packages that provide commonly needed functionality to the astronomical community. A key element of the Astropy Project is the core package $\texttt{astropy}$, which serves as the foundation for more specialized projects and packages. In this article, we summarize key features in the core package as of the recent major release, version 5.0, and provide major updates for the Project. We then discuss supporting a broader ecosystem of interoperable packages, including connections with several astronomical observatories and missions. We also revisit the future outlook of the Astropy Project and the current status of Learn Astropy. We conclude by raising and discussing the current and future challenges facing the Project.

preprint2022arXiv

Three-Port Impedance Model and Validation of VSCs for Stability Analysis

The modern power system is undergoing a paradigm shift from a synchronous generator-based system to a power electronics converter-dominated system. With the high penetration of converters, serious stability problems are provoked, especially wideband oscillations. Various studies have been conducted in this respect, but most of them separate the ac-side stability from the dc-side stability. However, for the stability analysis of the hybrid AC/DC grid, it is necessary to consider the converter ac-side and dc-side simultaneously. In this paper, the stability analysis of voltage source converters (VSCs) considering both ac and dc dynamics is carried out. First, the three-port AC/DC admittance model of VSCs is established, and the corresponding measurement method from simulations is presented to validate its accuracy. Second, based on this three-port model, two stability analysis methods are presented: one is based on the system open-loop model, where the stability can be judged via the Generalized Nyquist Criterion (GNC); the other is based on the system closed-loop model, whose stability can be predicted through the pole-zero calculation. Finally, a test AC/DC system is built in MATLAB/Simulink, by which the effectiveness of the three-port model-based stability analysis is validated.

preprint2022arXiv

Ultrafast Optical Spectroscopy Evidence of Pseudogap and Electron-Phonon Coupling in an Iron-Based Superconductor KCa$_2$Fe$_4$As$_4$F$_2$

We use ultrafast optical spectroscopy to study the nonequilibrium quasiparticle relaxation dynamics of the iron-based superconductor KCa$_2$Fe$_4$As$_4$F$_2$ with $T_c=33.5$ K. Our results reveal a possible pseudogap ($\Delta_{PG}$ = 2.4 $\pm$ 0.1 meV) below $T^*\approx 50$ K but prior to the opening of a superconducting gap ($\Delta_{SC}$(0) $\approx$ 4.3 $\pm$ 0.1 meV). Measurements under high pump fluence reveal two distinct, coherent phonon oscillations with 1.95 and 5.51 THz frequencies, respectively. The high-frequency $A_{1g}$(2) mode corresponds to the $c$-axis polarized vibrations of FeAs planes with a nominal electron-phonon coupling constant $\lambda_{A_{1g}(2)}$ = 0.194 $\pm$ 0.02. Our findings suggest that the pseudogap is likely a precursor of superconductivity, and the electron-phonon coupling may play an essential role in the superconducting pairing in KCa$_2$Fe$_4$As$_4$F$_2$.

preprint2021arXiv

CARE: Commonsense-Aware Emotional Response Generation with Latent Concepts

Rationality and emotion are two fundamental elements of humans. Endowing agents with rationality and emotion has been one of the major milestones in AI. However, in the field of conversational AI, most existing models only specialize in one aspect and neglect the other, which often leads to dull or unrelated responses. In this paper, we hypothesize that combining rationality and emotion into conversational agents can improve response quality. To test the hypothesis, we focus on one fundamental aspect of rationality, i.e., commonsense, and propose CARE, a novel model for commonsense-aware emotional response generation. Specifically, we first propose a framework to learn and construct commonsense-aware emotional latent concepts of the response given an input message and a desired emotion. We then propose three methods to collaboratively incorporate the latent concepts into response generation. Experimental results on two large-scale datasets support our hypothesis and show that our model can produce more accurate and commonsense-aware emotional responses and achieve better human ratings than state-of-the-art models that only specialize in one aspect.

preprint2021arXiv

More on the Weak Gravity Conjecture via Convexity of Charged Operators

The Weak Gravity Conjecture has recently been re-formulated in terms of a particle with non-negative self-binding energy. Because of the dual conformal field theory (CFT) formulation in the anti-de Sitter space the conformal dimension $\Delta(Q)$ of the lowest-dimension operator with charge Q under some global U(1) symmetry must be a convex function of Q. This property has been conjectured to hold for any (unitary) conformal field theory and generalized to larger global symmetry groups. Here we refine and further test the convex charge conjecture via semiclassical computations for fixed charge sectors of different theories in different dimensions. We analyze the convexity properties of the leading and next-to-leading order terms stemming from the semiclassical computation, de facto, extending previous tests beyond the leading perturbative contributions and to arbitrary charges. In particular, the leading contribution is sufficient to test convexity in the semiclassical computations. We also consider intriguing cases in which the models feature a transition from real to complex conformal dimensions either as a function of the charge or number of matter fields. As a relevant example of the first kind, we investigate the $O(N)$ model in $4+\epsilon$ dimensions. As an example of the second type we consider the $U(N)\times U(M)$ model in $4-\epsilon$ dimensions. Both models display a rich dynamics where, by changing the number of matter fields and/or charge, one can achieve dramatically different physical regimes. We discover that whenever a complex conformal dimension appears, the real part satisfies the convexity property.

preprint2021arXiv

Quantum-assisted Distortion-free audio signal sensing

Quantum sensors offer cutting-edge sensitivities in metrology. However, for highly sensitive measurements of arbitrary signals, limitations in linear dynamic range could introduce distortions when sensing the frequency, magnitude and phase of unknown signals. Here, we overcome these limitations with an advanced sensing protocol that combines quantum phase-sensitive detection with heterodyne readout. We present theoretical and experimental investigations using nitrogen-vacancy centers in diamond, showing the ability to sense radio signals with a 98 dB linear dynamic range, a 31 pT/Hz$^{1/2}$ sensitivity, and arbitrary frequency resolution. Further, we perform quantum-assisted distortion-free audio signal (melody, speech) sensing with high fidelity. The methods developed here could broaden the horizon for quantum sensors towards applications in telecommunication, where high fidelity and low distortion at multiple frequency bands within small sensing volumes are required.

preprint2021arXiv

Temperature evolution of quasiparticle dispersion dynamics in semimetallic 1T-TiTe2 via high-resolution angle-resolved photoemission spectroscopy and ultrafast optical pump-probe spectroscopy

High-resolution angle-resolved photoemission spectroscopy and ultrafast optical pump-probe spectroscopy were used to study semimetallic 1T-TiTe2 quasiparticle dispersion and dynamics. A kink and a flat band, having the same energy scale and temperature-dependent behaviors along the G-M direction, were detected. Both manifested at low temperatures but blurred as temperature increased. The kink was formed by electron-phonon coupling, and the localized flat band might be closely related to electron-phonon coupling as well. Ultrafast optical spectroscopy identified multiple distinct time scales in the 10-300 K range. Quantitative analysis of the fastest decay process evidenced a significant lifetime temperature dependence at high temperatures, while this starts to change slowly below ~ 100 K where an anomalous Hall coefficient occurred. At low temperature, a coherent A1g phonon mode with a frequency of ~ 4.36 THz was extracted. Frequency temperature dependence suggests that phonon hardening occurs as temperature falls and anharmonic effects can explain it. Frequency fluence dependence indicates that the phonons soften as fluence increases.

preprint2020arXiv

A Driver Fatigue Recognition Algorithm Based on Spatio-Temporal Feature Sequence

Research shows that fatigued driving is one of the important causes of road traffic accidents, so studying driver fatigue recognition algorithms is of great significance for improving road traffic safety. In recent years, with the development of deep learning, the field of pattern recognition has made great progress. This paper designs a real-time fatigue state recognition algorithm based on spatio-temporal feature sequences, which can be mainly applied to fatigued driving recognition scenarios. The algorithm is divided into three task networks: a face detection network, a facial landmark detection and head pose estimation network, and a fatigue recognition network. Experiments show that the algorithm has the advantages of small size, high speed and high accuracy.

preprint2020arXiv

A Survey on Dynamic Network Embedding

Real-world networks are composed of diverse interacting and evolving entities, while most existing research simply characterizes them as particular static networks, without consideration of the evolution trend in dynamic networks. Recently, significant progress has been made in tracking the properties of dynamic networks, exploiting changes of entities and links in the network to devise network embedding techniques. Compared to widely proposed static network embedding methods, dynamic network embedding endeavors to encode nodes as low-dimensional dense representations that effectively preserve the network structures and the temporal dynamics, which is beneficial to multifarious downstream machine learning tasks. In this paper, we conduct a systematic survey on dynamic network embedding. Specifically, basic concepts of dynamic network embedding are described. Notably, we propose a novel taxonomy of existing dynamic network embedding techniques for the first time, including matrix factorization based, Skip-Gram based, autoencoder based, neural network based and other embedding methods. Additionally, we carefully summarize the commonly used datasets and a wide variety of subsequent tasks that dynamic network embedding can benefit. Finally, we suggest several challenges that the existing algorithms face and outline possible directions to facilitate future research, such as dynamic embedding models, large-scale dynamic networks, heterogeneous dynamic networks, dynamic attributed networks, task-oriented dynamic network embedding and more embedding spaces.

preprint2020arXiv

Charging the Walking U(N)$\times$U(N) Higgs Theory as a Complex CFT

We apply a semi-classical method to compute the conformal field theory (CFT) data for the U(N)$\times$U(N) non-abelian Higgs theory in four minus epsilon dimensions at its complex fixed point. The theory features more than one coupling and walking dynamics. Given our charge configuration, we identify a family of corresponding operators and compute their scaling dimensions, which agree remarkably well with available results from conventional perturbation theory, validating the use of the state-operator correspondence for a complex CFT.

preprint2020arXiv

Crowding Prediction of In-Situ Metro Passengers Using Smart Card Data

The metro system is playing an increasingly important role in the urban public transit network, transferring a massive human flow across space every day in the city. In recent years, extensive research studies have been conducted to improve the service quality of metro systems. Among them, crowd management has been a critical issue for both public transport agencies and train operators. In this paper, by utilizing accumulated smart card data, we propose a statistical model to predict in-situ passenger density, i.e., the number of on-board passengers between any two neighbouring stations, inside a closed metro system. The proposed model performs two main tasks: i) forecasting the time-dependent Origin-Destination (OD) matrix by applying mature statistical models; and ii) estimating the travel time cost required by different parts of the metro network via truncated normal mixture distributions with the Expectation-Maximization (EM) algorithm. Based on the prediction results, we are able to provide accurate prediction of in-situ passenger density for a future time point. A case study using real smart card data from the Singapore Mass Rapid Transit (MRT) system demonstrates the efficacy and efficiency of our proposed method.

preprint2020arXiv

Deeper Insights into Weight Sharing in Neural Architecture Search

With the success of deep neural networks, Neural Architecture Search (NAS) as a way of automatic model design has attracted wide attention. As training every child model from scratch is very time-consuming, recent works leverage weight-sharing to speed up the model evaluation procedure. These approaches greatly reduce computation by maintaining a single copy of weights on the super-net and sharing the weights among all child models. However, weight-sharing has no theoretical guarantee and its impact has not been well studied before. In this paper, we conduct comprehensive experiments to reveal the impact of weight-sharing: (1) The best-performing models from different runs or even from consecutive epochs within the same run have significant variance; (2) Even with high variance, we can extract valuable information from training the super-net with shared weights; (3) The interference between child models is a main factor that induces high variance; (4) Properly reducing the degree of weight sharing could effectively reduce variance and improve performance.

preprint2020arXiv

Long-Short Term Spatiotemporal Tensor Prediction for Passenger Flow Profile

Spatiotemporal data is very common in many applications, such as manufacturing systems and transportation systems. It is typically difficult to predict accurately given its intrinsically complex spatial and temporal correlations. Most existing methods, based on various statistical models and regularization terms, fail to preserve innate features in data alongside their complex correlations. In this paper, we focus on tensor-based prediction and propose several practical techniques to improve prediction. For long-term prediction specifically, we propose the "Tensor Decomposition + 2-Dimensional Auto-Regressive Moving Average (2D-ARMA)" model, and an effective way to update predictions in real time; for short-term prediction, we propose to conduct tensor completion based on tensor clustering to avoid oversimplifying and ensure accuracy. A case study based on metro passenger flow data is conducted to demonstrate the improved performance.

preprint2020arXiv

Microwave-free vector magnetometry with nitrogen-vacancy centers along a single axis in diamond

Sensing vector magnetic fields is critical to many applications in fundamental physics, bioimaging, and material science. Magnetic-field sensors exploiting nitrogen-vacancy (NV) centers are particularly compelling as they offer high sensitivity and spatial resolution even at nanoscale. Achieving vector magnetometry has, however, often required applying microwaves sequentially or simultaneously, limiting the sensors' applications under cryogenic temperature. Here we propose and demonstrate a microwave-free vector magnetometer that simultaneously measures all Cartesian components of a magnetic field using NV ensembles in diamond. In particular, the present magnetometer leverages the level anticrossing in the triplet ground state at 102.4 mT, allowing the measurement of both longitudinal and transverse fields with a wide bandwidth from zero to megahertz range. Full vector sensing capability is proffered by modulating fields along the preferential NV axis and in the transverse plane and subsequent demodulation of the signal. This sensor exhibits a root mean square noise floor of about 300 pT/Hz^(1/2) in all directions. The present technique is broadly applicable to both ensemble sensors and potentially also single-NV sensors, extending the vector capability to nanoscale measurement under ambient temperatures.

preprint2020arXiv

Partially Observable Online Change Detection via Smooth-Sparse Decomposition

We consider online change detection of high dimensional data streams with sparse changes, where only a subset of data streams can be observed at each sensing time point due to limited sensing capacities. On the one hand, the detection scheme should be able to deal with partially observable data and meanwhile have efficient detection power for sparse changes. On the other hand, the scheme should be able to adaptively and actively select the most important variables to observe to maximize the detection power. To address these two points, in this paper, we propose a novel detection scheme called CDSSD. In particular, it describes the structure of high dimensional data with sparse changes by smooth-sparse decomposition, whose parameters can be learned via spike-slab variational Bayesian inference. Then the posterior Bayes factor, which incorporates the learned parameters and sparse change information, is formulated as a detection statistic. Finally, by formulating the statistic as the reward of a combinatorial multi-armed bandit problem, an adaptive sampling strategy based on Thompson sampling is proposed. The efficacy and applicability of our method in practice are demonstrated with numerical studies and a real case study.

preprint2020arXiv

Probabilistic Residual Learning for Aleatoric Uncertainty in Image Restoration

Aleatoric uncertainty is an intrinsic property of ill-posed inverse and imaging problems. Its quantification is vital for assessing the reliability of relevant point estimates. In this paper, we propose an efficient framework for quantifying aleatoric uncertainty for deep residual learning and showcase its significant potential on image restoration. In the framework, we divide the conditional probability modeling for the residual variable into a deterministic homo-dimensional level, a stochastic low-dimensional level and a merging level. The low-dimensionality is especially suitable for sparse correlation between image pixels, enables efficient sampling for high dimensional problems and acts as a regularizer for the distribution. Preliminary numerical experiments show that the proposed method can give not only state-of-the-art point estimates of image restoration but also useful associated uncertainty information.

preprint2020arXiv

Probing up-down quark matter via gravitational waves

Recently, it was shown that quark matter with only $u$ and $d$ quarks ($ud$QM) can be the ground state of matter for baryon numbers $A>A_\textrm{min}$ with $A_{\rm min}\gtrsim 300$. In this paper, we explore $ud$ quark stars ($ud$QSs) that are composed of $ud$QM, in the context of the two-families scenario in which $ud$QSs and hadronic stars (HSs) can coexist. Distinct signatures are discussed compared to the conventional study regarding strange quark stars (SQSs). We show that the requirements of $A_{\rm min}\gtrsim 300$ and the most massive compact star observed being a $ud$QS together may put stringent constraints on the allowed parameter space of $ud$QSs. Then, we study the related gravitational-wave probe of the tidal deformability in binary star mergers, including the $ud$QS-$ud$QS and $ud$QS-HS cases. The obtained values of the tidal deformability at 1.4 solar masses and the average tidal deformability are all in good compatibility with the experimental constraints of GW170817. This study points to a new possible interpretation of the GW170817 binary merger event, where $ud$QS may be at least one component of the binary system detected.

preprint2020arXiv

Task-Level Curriculum Learning for Non-Autoregressive Neural Machine Translation

Non-autoregressive translation (NAT) achieves faster inference speed but at the cost of worse accuracy compared with autoregressive translation (AT). Since AT and NAT can share model structure and AT is an easier task than NAT due to the explicit dependency on previous target-side tokens, a natural idea is to gradually shift the model training from the easier AT task to the harder NAT task. To smooth the shift from AT training to NAT training, in this paper, we introduce semi-autoregressive translation (SAT) as intermediate tasks. SAT contains a hyperparameter k, and each k value defines a SAT task with a different degree of parallelism. Specifically, SAT covers AT and NAT as its special cases: it reduces to AT when k = 1 and to NAT when k = N (N is the length of the target sentence). We design curriculum schedules to gradually shift k from 1 to N, with different pacing functions and numbers of tasks trained at the same time. We call our method task-level curriculum learning for NAT (TCL-NAT). Experiments on IWSLT14 De-En, IWSLT16 En-De, WMT14 En-De and De-En datasets show that TCL-NAT achieves significant accuracy improvements over previous NAT baselines and reduces the performance gap between NAT and AT models to 1-2 BLEU points, demonstrating the effectiveness of our proposed method.

preprint2020arXiv

Testing dispersion of gravitational waves from eccentric extreme-mass-ratio inspirals

In general relativity, there is no dispersion in gravitational waves, while some modified gravity theories predict dispersion phenomena in the propagation of gravitational waves. In this paper, we demonstrate that this dispersion will induce an observable deviation of waveforms if the orbits have large eccentricities. The mechanism is that waveform modes with different frequencies will be emitted at the same time due to the existence of eccentricity. During the propagation, because of the dispersion, the arrival times of different modes will differ, producing a deviation and dephasing of the waveforms compared with general relativity. This kind of dispersion phenomenon related to extreme-mass-ratio inspirals could be observed by space-borne detectors, and the constraint on the graviton mass could be improved. Moreover, we find that the dispersion effect may also be constrained by ground detectors better than the current result if a highly eccentric intermediate-mass-ratio inspiral is observed.

preprint2020arXiv

The Techni-Pati-Salam Composite Higgs

Composite Higgs models can be extended to the Planck scale by means of the partially unified partial compositeness (PUPC) framework. We present in detail the Techni-Pati-Salam model, based on a renormalizable gauge theory $SU(8)_{PS}\times SU(2)_L\times SU(2)_R$. We demonstrate that masses and mixings for all generations of standard model fermions can be obtained via partial compositeness at low energy, with four-fermion operators mediated by either heavy gauge bosons or scalars. The strong dynamics is predicted to be that of a confining $Sp(4)_{\rm HC}$ gauge group, with hyper-fermions in the fundamental and two-index anti-symmetric representations, with fixed multiplicities. This motivates lattice studies of the infrared near-conformal walking phase, with results that may validate or rule out the model. This is the first complete and realistic attempt at providing an ultraviolet completion for composite Higgs models with top partial compositeness. In the baryon-number conserving vacuum, the theory also predicts a Dark Matter candidate, with mass in the few TeV range, protected by semi-integer baryon number.

preprint2019arXiv

Angle-resolved photoemission spectroscopy study of crystal electric field in heavy fermion compound CePt2In7

The three-dimensional electronic structure and Ce 4f electrons of the heavy fermion superconductor CePt2In7 are investigated. Angle-resolved photoemission spectroscopy using variable photon energy establishes the existence of quasi-two- and three-dimensional Fermi surface topologies. Temperature-dependent 4d-4f on-resonance photoemission spectroscopies reveal that heavy quasiparticle bands begin to form at a temperature well above the characteristic (coherence) temperature T*. The emergence of T* may be closely related to crystal electric field splitting, particularly the low-lying heavy band formed by crystal electric field splitting.