Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
80 works
0 followers
35 topics
4 close collaborators

Published work

80 published item(s)

preprint 2026 arXiv

PresentAgent-2: Towards Generalist Multimodal Presentation Agents

Presentation generation is moving beyond static slide creation toward end-to-end presentation video generation with research grounding, multimodal media, and interactive delivery. We introduce PresentAgent-2, an agentic framework for generating presentation videos from user queries. Given an open-ended user query and a selected presentation mode, PresentAgent-2 first summarizes the query into a focused topic and performs deep research over presentation-friendly sources to collect multimodal resources, including relevant text, images, GIFs, and videos. It then constructs presentation slides, generates mode-specific scripts, and composes slides, audio, and dynamic media into a complete presentation video. PresentAgent-2 supports three independent presentation modes within a unified framework: Single Presentation, which generates a single-speaker narrated presentation video; Discussion, which creates a multi-speaker presentation with structured speaker roles, such as for asking guiding questions, explaining concepts, clarifying details, and summarizing key points; and Interaction, which independently supports answering audience questions grounded in the generated slides, scripts, retrieved evidence, and presentation context. To evaluate these capabilities, we build a multimodal presentation benchmark covering single presentation, discussion, and interaction scenarios, with task-specific evaluation criteria for content quality, media relevance, dynamic media use, dialogue naturalness, and interaction grounding. Overall, PresentAgent-2 extends presentation generation from document-dependent slide creation to query-driven, research-grounded presentation video generation with multimodal media, dialogue, and interaction. Code: https://github.com/AIGeeksGroup/PresentAgent-2. Website: https://aigeeksgroup.github.io/PresentAgent-2.

preprint 2023 arXiv

Polar Codes with Local-Global Decoding

In this paper, we investigate a coupled polar code architecture that supports both local and global decoding. This local-global construction is motivated by practical applications in data storage and transmission where reduced-latency recovery of sub-blocks of the coded information is required. Local decoding allows random access to sub-blocks of the full code block. When local decoding performance is insufficient, global decoding provides improved data reliability. The coupling scheme incorporates a systematic outer polar code and a partitioned mapping of the outer codeword to semipolarized bit-channels of the inner polar codes. Error rate simulation results are presented for 2 and 4 sub-blocks. Design issues affecting the trade-off between local and global decoding performance are also discussed.

preprint 2023 arXiv

RGB-T Multi-Modal Crowd Counting Based on Transformer

Crowd counting aims to estimate the number of persons in a scene. Most state-of-the-art crowd counting methods based on color images do not work well in poor illumination conditions because objects become invisible. With the widespread use of infrared cameras, crowd counting based on color and thermal images has been studied. Existing methods only achieve multi-modal fusion without a count objective constraint. To better exploit multi-modal information, we use count-guided multi-modal fusion and modal-guided count enhancement to achieve strong performance. The proposed count-guided multi-modal fusion module utilizes a multi-scale token transformer to let the two modalities interact under the guidance of count information and to perceive different scales from the token perspective. The proposed modal-guided count enhancement module employs a multi-scale deformable transformer decoder structure to enhance one modality's features and count information using the other modality. Experiments on the public RGBT-CC dataset show that our method sets new state-of-the-art results. Code: https://github.com/liuzywen/RGBTCC

preprint 2022 arXiv

A ferrotoroidic candidate with well-separated spin chains

The search for novel quasi one-dimensional (1D) materials is an important aspect of materials science. A toroidal moment, the order parameter of ferrotoroidic order, can be generated by a head-to-tail configuration of magnetic moments. It has been theoretically proposed that a 1D dimerized, antiferromagnetic-like spin chain hosts ferrotoroidicity, with a toroidal moment composed of only two antiparallel spins. Here, we report a ferrotoroidic candidate, Ba6Cr2S10, that realizes this theoretical spin-chain model. The structure consists of unique dimerized face-sharing CrS6 octahedral chains along the c axis. An antiferromagnetic-like ordering at ~10 K breaks both space- and time-reversal symmetries, and the magnetic point group mm'2' allows three ferroic orders in Ba6Cr2S10: (anti)ferromagnetic, ferroelectric and ferrotoroidic. Our investigation reveals that Ba6Cr2S10 is a rare ferrotoroidic candidate with a quasi-1D spin chain, which can be considered a starting point for further exploration of the physics and applications of ferrotoroidicity.

preprint 2022 arXiv

A Stochastic Process Model for Time Warping Functions

Time warping functions provide a mathematical representation to measure phase variability in functional data. Recent studies have developed various approaches to estimate optimal warping between functions and provide non-Euclidean models. However, a principled, linear, generative model on time warping functions is still under-explored. This is a highly challenging problem because the space of warping functions is non-linear with the conventional Euclidean metric. To address this problem, we propose a stochastic process model for time warping functions, where the key is to define a linear, inner-product structure on the time warping space and then transform the warping functions into a sub-space of the $\mathbb L^2$ Euclidean space. With certain constraints on the warping functions, this transformation is an isometric isomorphism. In the transformed space, we adopt the $\mathbb L^2$ basis in the Hilbert space for representation. This new framework can easily build generative models on time warping by using different types of stochastic processes. It can also be used to conduct statistical inference such as functional PCA, functional ANOVA, and functional regression. Furthermore, we demonstrate the effectiveness of this new framework by using it as a new prior in Bayesian registration, and propose an efficient gradient method to address the important maximum a posteriori estimation. We illustrate the new Bayesian method using simulations which properly characterize nonuniform and correlated constraints in the time domain. Finally, we apply the new framework to the famous Berkeley growth data and obtain reasonable results on modeling, resampling, group comparison, and classification analysis.

preprint 2022 arXiv

AirCode: A Robust Object Encoding Method

Object encoding and identification are crucial for many robotic tasks such as autonomous exploration and semantic relocalization. Existing works rely heavily on the tracking of detected objects but have difficulty recalling revisited objects precisely. In this paper, we propose a novel object encoding method, named AirCode, based on a graph of key-points. To be robust to the number of key-points detected, we propose a feature sparse encoding and object dense encoding method to ensure that each key-point can only affect a small part of the object descriptors, making it robust to viewpoint changes, scaling, occlusion, and even object deformation. In the experiments, we show that it achieves better object identification performance than the state-of-the-art algorithms and is able to provide reliable semantic relocalization. It is a plug-and-play module and we expect that it will play an important role in various applications.

preprint 2022 arXiv

Anomalous thermal Hall effect and anomalous Nernst effect of CsV$_{3}$Sb$_{5}$

Motivated by time-reversal symmetry breaking and the giant anomalous Hall effect in the kagome superconductor \textit{A}V$_3$Sb$_5$ (\textit{A} = Cs, K, Rb), we carried out thermal transport measurements on CsV$_3$Sb$_5$. In addition to the anomalous Hall effect, the anomalous Nernst effect and the anomalous thermal Hall effect emerge. Interestingly, the longitudinal thermal conductivity $\kappa_{xx}$ largely deviates from the electronic contribution obtained from the longitudinal conductivity $\sigma_{xx}$ by the Wiedemann-Franz law. In contrast, the thermal Hall conductivity $\kappa_{xy}$ is roughly consistent with the electronic contribution given by the Wiedemann-Franz law. All these results indicate a large phonon contribution to the longitudinal thermal conductivity. Moreover, the thermal Hall conductivity is also slightly greater than the theoretical electronic contribution, indicating other charge-neutral contributions. Furthermore, the Nernst coefficient and Hall resistivity show multi-band behavior with a possible additional contribution from Berry curvature at low fields.

preprint 2022 arXiv

Backbone is All Your Need: A Simplified Architecture for Visual Object Tracking

Exploiting a general-purpose neural architecture to replace hand-wired designs or inductive biases has recently drawn extensive interest. However, existing tracking approaches rely on customized sub-modules and need prior knowledge for architecture selection, hindering the tracking development in a more general system. This paper presents a Simplified Tracking architecture (SimTrack) by leveraging a transformer backbone for joint feature extraction and interaction. Unlike existing Siamese trackers, we serialize the input images and concatenate them directly before the one-branch backbone. Feature interaction in the backbone helps to remove well-designed interaction modules and produce a more efficient and effective framework. To reduce the information loss from down-sampling in vision transformers, we further propose a foveal window strategy, providing more diverse input patches with acceptable computational costs. Our SimTrack improves the baseline with 2.5%/2.6% AUC gains on LaSOT/TNL2K and gets results competitive with other specialized tracking algorithms without bells and whistles.

preprint 2022 arXiv

Boosting Camouflaged Object Detection with Dual-Task Interactive Transformer

Camouflaged object detection aims to discover concealed objects hidden in their surroundings. Existing methods follow the bio-inspired framework, which first locates the object and then refines the boundary. We argue that the discovery of camouflaged objects depends on the recurrent search for the object and the boundary. Such recurrent processing is tiring for humans, but it is precisely the strength of a transformer with global search ability. Therefore, a dual-task interactive transformer is proposed to detect both the accurate position of the camouflaged object and its detailed boundary. The boundary feature is used as the Query to improve camouflaged object detection, and meanwhile the object feature is used as the Query to improve boundary detection. Camouflaged object detection and boundary detection interact fully through multi-head self-attention. Besides, to obtain the initial object feature and boundary feature, transformer-based backbones are adopted to extract the foreground and background. The foreground is just the object, while foreground minus background is considered as the boundary. Here, the boundary feature can be obtained from the blurry boundary region of the foreground and background. Supervised by the object, the background and the boundary ground truth, the proposed model achieves state-of-the-art performance on public datasets. Code: https://github.com/liuzywen/COD

preprint 2022 arXiv

CLOWER: A Pre-trained Language Model with Contrastive Learning over Word and Character Representations

Pre-trained Language Models (PLMs) have achieved remarkable performance gains across numerous downstream tasks in natural language understanding. Various Chinese PLMs have been successively proposed for learning better Chinese language representation. However, most current models use Chinese characters as inputs and are not able to encode semantic information contained in Chinese words. While recent pre-trained models incorporate both words and characters simultaneously, they usually suffer from deficient semantic interactions and fail to capture the semantic relation between words and characters. To address the above issues, we propose a simple yet effective PLM CLOWER, which adopts the Contrastive Learning Over Word and charactER representations. In particular, CLOWER implicitly encodes the coarse-grained information (i.e., words) into the fine-grained representations (i.e., characters) through contrastive learning on multi-grained information. CLOWER is of great value in realistic scenarios since it can be easily incorporated into any existing fine-grained based PLMs without modifying the production pipelines. Extensive experiments conducted on a range of downstream tasks demonstrate the superior performance of CLOWER over several state-of-the-art baselines.

preprint 2022 arXiv

Continuous-variable quantum sensing of a dissipative reservoir

We propose a continuous-variable quantum sensing scheme, in which a harmonic oscillator is employed as the probe to estimate the parameters in the spectral density of a quantum reservoir, within a non-Markovian dynamical framework. It is revealed that the sensing sensitivity can be effectively boosted by (i) optimizing the weight of the momentum-position-type coupling in the whole probe-reservoir interaction Hamiltonian, (ii) the initial quantum squeezing resource provided by the probe, (iii) the noncanonical equilibration induced by the non-Markovian effect, and (iv) applying an external driving field. Our results may have some potential applications in understanding and controlling the decoherence of dissipative continuous-variable systems.

preprint 2022 arXiv

Data-Driven, Soft Alignment of Functional Data Using Shapes and Landmarks

Alignment or registration of functions is a fundamental problem in statistical analysis of functions and shapes. While there are several approaches available, a more recent approach based on the Fisher-Rao metric and square-root velocity functions (SRVFs) has been shown to have good performance. However, this SRVF method has two limitations: (1) it is susceptible to over-alignment, i.e., alignment of noise as well as the signal, and (2) when there is additional information in the form of landmarks, the original formulation does not prescribe a way to incorporate that information. In this paper we propose an extension that allows for incorporation of landmark information to seek a compromise between matching curves and landmarks. This results in a soft landmark alignment that pushes landmarks closer without requiring their exact overlay, and so finds a compromise between contributions from functions and landmarks. The proposed method is demonstrated to be superior in certain practical scenarios.

preprint 2022 arXiv

Dimensionality of the superconductivity in the transition metal pnictide WP

We report theoretical and experimental results on the transition metal pnictide WP. The theoretical outcomes based on tight-binding calculations and density functional theory indicate that WP is a three-dimensional superconductor with an anisotropic electronic structure and nonsymmorphic symmetries. On the other hand, magnetoresistance experimental data and the analysis of superconducting fluctuations of the conductivity in external magnetic field indicate a weakly anisotropic three-dimensional superconducting phase.

preprint 2022 arXiv

Domain-Oriented Prefix-Tuning: Towards Efficient and Generalizable Fine-tuning for Zero-Shot Dialogue Summarization

The most advanced abstractive dialogue summarizers lack generalization ability on new domains, and existing research on domain adaptation in summarization generally relies on large-scale pre-training. To explore lightweight fine-tuning methods for domain adaptation of dialogue summarization, in this paper, we propose an efficient and generalizable Domain-Oriented Prefix-tuning model, which utilizes a domain word initialized prefix module to alleviate domain entanglement and adopts discrete prompts to guide the model to focus on key contents of dialogues and enhance model generalization. We conduct zero-shot experiments and build domain adaptation benchmarks on two multi-domain dialogue summarization datasets, TODSum and QMSum. Adequate experiments and qualitative analysis prove the effectiveness of our methods.

preprint 2022 arXiv

Effects of counter-rotating-wave terms on the noisy frequency estimation

We investigate the problem of estimating the tunneling frequency of a two-level atomic system embedded in a dissipative environment by employing a numerically rigorous hierarchical equations of motion method. The effect of counter-rotating-wave terms on the attainable precision of the noisy quantum metrology is systematically studied beyond the usual framework of perturbative treatments. We find the counter-rotating-wave terms are able to boost the noisy quantum metrological performance in the intermediate and strong coupling regimes, whether the dissipative environment is composed of bosons or fermions. The result presented in this paper may pave a guideline to design a high-precision quantum estimation scenario under practical decoherence.

preprint 2022 arXiv

Electromagnetic Source Imaging via a Data-Synthesis-Based Convolutional Encoder-Decoder Network

Electromagnetic source imaging (ESI) requires solving a highly ill-posed inverse problem. To seek a unique solution, traditional ESI methods impose various forms of priors that may not accurately reflect the actual source properties, which may hinder their broad applications. To overcome this limitation, in this paper a novel data-synthesized spatio-temporally convolutional encoder-decoder network method termed DST-CedNet is proposed for ESI. DST-CedNet recasts ESI as a machine learning problem, where discriminative learning and latent-space representations are integrated in a convolutional encoder-decoder network (CedNet) to learn a robust mapping from the measured electroencephalography/magnetoencephalography (E/MEG) signals to the brain activity. In particular, by incorporating prior knowledge regarding dynamical brain activities, a novel data synthesis strategy is devised to generate large-scale samples for effectively training CedNet. This stands in contrast to traditional ESI methods where the prior information is often enforced via constraints primarily aimed for mathematical convenience. Extensive numerical experiments as well as analysis of a real MEG and Epilepsy EEG dataset demonstrate that DST-CedNet outperforms several state-of-the-art ESI methods in robustly estimating source signals under a variety of source configurations.

preprint 2022 arXiv

Ensemble Multi-Relational Graph Neural Networks

It is well established that graph neural networks (GNNs) can be interpreted and designed from the perspective of optimization objective. With this clear optimization objective, the deduced GNNs architecture has sound theoretical foundation, which is able to flexibly remedy the weakness of GNNs. However, this optimization objective is only proved for GNNs with single-relational graph. Can we infer a new type of GNNs for multi-relational graphs by extending this optimization objective, so as to simultaneously solve the issues in previous multi-relational GNNs, e.g., over-parameterization? In this paper, we propose a novel ensemble multi-relational GNNs by designing an ensemble multi-relational (EMR) optimization objective. This EMR optimization objective is able to derive an iterative updating rule, which can be formalized as an ensemble message passing (EnMP) layer with multi-relations. We further analyze the nice properties of EnMP layer, e.g., the relationship with multi-relational personalized PageRank. Finally, a new multi-relational GNNs which well alleviate the over-smoothing and over-parameterization issues are proposed. Extensive experiments conducted on four benchmark datasets well demonstrate the effectiveness of the proposed model.

preprint 2022 arXiv

Gamma-ray spectral properties of the Galactic globular clusters: constraint on the numbers of millisecond pulsars

We study the gamma-ray spectra of 30 globular clusters (GCs) thus far detected with the Fermi Gamma-ray Space Telescope. Presuming that the gamma-ray emission of a GC comes from the millisecond pulsars (MSPs) it contains, a model that generates spectra for the GCs is built based on the gamma-ray properties of the detected MSP sample. We fit the GCs' spectra with the model, and for 27 of them, their emission can be explained as arising from MSPs. The spectra of the other three, NGC 7078, 2MS-GC01, and Terzan 1, cannot be fit with our model, indicating that MSPs' emission should not be the dominant one in the first two and that the third one has a unique hard spectrum. We also investigate six nearby GCs that have relatively high encounter rates as the comparison cases. The candidate spectrum of NGC 6656 can be fit with that of one MSP, supporting its possible association with the gamma-ray source at its position. The five others do not have detectable gamma-ray emission. Their spectral upper limits set limits of $\leq 1$ MSPs in them, consistent with the numbers of radio MSPs found in them. The estimated numbers of MSPs in the gamma-ray GCs generally match well those reported for radio pulsars. Our studies of the gamma-ray GCs and the comparison nearby GCs indicate that the encounter rate should not be the only factor determining the number of MSPs a GC contains.

preprint 2022 arXiv

Generalized Intent Discovery: Learning from Open World Dialogue System

Traditional intent classification models are based on a pre-defined intent set and only recognize limited in-domain (IND) intent classes. But users may input out-of-domain (OOD) queries in a practical dialogue system. Such OOD queries can provide directions for future improvement. In this paper, we define a new task, Generalized Intent Discovery (GID), which aims to extend an IND intent classifier to an open-world intent set including IND and OOD intents. We hope to simultaneously classify a set of labeled IND intent classes while discovering and recognizing new unlabeled OOD types incrementally. We construct three public datasets for different application scenarios and propose two kinds of frameworks, pipeline-based and end-to-end for future work. Further, we conduct exhaustive experiments and qualitative analysis to comprehend key challenges and provide new guidance for future GID research.

preprint 2022 arXiv

Graph Adaptive Semantic Transfer for Cross-domain Sentiment Classification

Cross-domain sentiment classification (CDSC) aims to use the transferable semantics learned from the source domain to predict the sentiment of reviews in the unlabeled target domain. Existing studies in this task attach more attention to the sequence modeling of sentences while largely ignoring the rich domain-invariant semantics embedded in graph structures (i.e., the part-of-speech tags and dependency relations). As an important aspect of exploring characteristics of language comprehension, adaptive graph representations have played an essential role in recent years. To this end, in the paper, we aim to explore the possibility of learning invariant semantic features from graph-like structures in CDSC. Specifically, we present Graph Adaptive Semantic Transfer (GAST) model, an adaptive syntactic graph embedding method that is able to learn domain-invariant semantics from both word sequences and syntactic graphs. More specifically, we first raise a POS-Transformer module to extract sequential semantic features from the word sequences as well as the part-of-speech tags. Then, we design a Hybrid Graph Attention (HGAT) module to generate syntax-based semantic features by considering the transferable dependency relations. Finally, we devise an Integrated aDaptive Strategy (IDS) to guide the joint learning process of both modules. Extensive experiments on four public datasets indicate that GAST achieves comparable effectiveness to a range of state-of-the-art models.

preprint 2022 arXiv

Graph Neural Network-Based Scheduling for Multi-UAV-Enabled Communications in D2D Networks

In this paper, we jointly design the power control and position dispatch for Multi-unmanned aerial vehicle (UAV)-enabled communication in device-to-device (D2D) networks. Our objective is to maximize the total transmission rate of downlink users (DUs). Meanwhile, the quality of service (QoS) of all D2D users must be satisfied. We comprehensively considered the interference among D2D communications and downlink transmissions. The original problem is strongly non-convex, which requires high computational complexity for traditional optimization methods. And to make matters worse, the results are not necessarily globally optimal. In this paper, we propose a novel graph neural networks (GNN) based approach that can map the considered system into a specific graph structure and achieve the optimal solution in a low complexity manner. Particularly, we first construct a GNN-based model for the proposed network, in which the transmission links and interference links are formulated as vertexes and edges, respectively. Then, by taking the channel state information and the coordinates of ground users as the inputs, as well as the location of UAVs and the transmission power of all transmitters as outputs, we obtain the mapping from inputs to outputs through training the parameters of GNN. Simulation results verified that the way to maximize the total transmission rate of DUs can be extracted effectively via the training on samples. Moreover, it also shows that the performance of proposed GNN-based method is better than that of traditional means.

preprint 2022 arXiv

HQANN: Efficient and Robust Similarity Search for Hybrid Queries with Structured and Unstructured Constraints

The in-memory approximate nearest neighbor search (ANNS) algorithms have achieved great success for fast high-recall query processing, but are extremely inefficient when handling hybrid queries with unstructured (i.e., feature vectors) and structured (i.e., related attributes) constraints. In this paper, we present HQANN, a simple yet highly efficient hybrid query processing framework which can be easily embedded into existing proximity graph-based ANNS algorithms. We guarantee both low latency and high recall by leveraging navigation sense among attributes and fusing vector similarity search with attribute filtering. Experimental results on both public and in-house datasets demonstrate that HQANN is 10x faster than the state-of-the-art hybrid ANNS solutions at the same recall quality, and its performance is hardly affected by the complexity of attributes. It can reach 99% recall@10 in around 50 microseconds on GLOVE-1.2M with thousands of attribute constraints.

preprint 2022 arXiv

InstructionNER: A Multi-Task Instruction-Based Generative Framework for Few-shot NER

Recently, prompt-based methods have achieved significant performance in few-shot learning scenarios by bridging the gap between language model pre-training and fine-tuning for downstream tasks. However, existing prompt templates are mostly designed for sentence-level tasks and are inappropriate for sequence labeling objectives. To address the above issue, we propose a multi-task instruction-based generative framework, named InstructionNER, for low-resource named entity recognition. Specifically, we reformulate the NER task as a generation problem, which enriches source sentences with task-specific instructions and answer options, then infers the entities and types in natural language. We further propose two auxiliary tasks, entity extraction and entity typing, which enable the model to capture more boundary information of entities and deepen the understanding of entity type semantics, respectively. Experimental results show that our method consistently outperforms other baselines on five datasets in few-shot settings.

preprint 2022 arXiv

Intelligent Resource Allocations for IRS-Assisted OFDM Communications: A Hybrid MDQN-DDPG Approach

In this paper, we study the resource allocation problem for an intelligent reflecting surface (IRS)-assisted OFDM system. The system sum rate maximization framework is formulated by jointly optimizing subcarrier allocation, base station transmit beamforming and IRS phase shift. Considering the continuous and discrete hybrid action space characteristics of the optimization variables, we propose an efficient resource allocation algorithm combining multiple deep Q networks (MDQN) and deep deterministic policy-gradient (DDPG) to deal with this issue. In our algorithm, MDQN are employed to solve the problem of large discrete action space, while DDPG is introduced to tackle the continuous action allocation. Compared with the traditional approaches, our proposed MDQN-DDPG based algorithm has the advantage of continuous behavior improvement through learning from the environment. Simulation results demonstrate superior performance of our design in terms of system sum rate compared with the benchmark schemes.

preprint 2022 arXiv

Investigation of the Effect of Quantum Measurement on Parity-Time Symmetry

Symmetry, including the parity-time ($\mathcal{PT}$)-symmetry, is a striking topic, widely discussed and employed in many fields. It is well-known that quantum measurement can destroy or disturb quantum systems. However, can and how does quantum measurement destroy the symmetry of the measured system? To answer the pertinent question, we establish the correlation between the quantum measurement and Floquet $\mathcal{PT}$-symmetry and investigate for the first time how the measurement frequency and measurement strength affect the $\mathcal{PT}$-symmetry of the measured system using the $^{40}\mathrm{Ca}^{+}$ ion. It is already shown that the measurement at high frequencies would break the $\mathcal{PT}$ symmetry. Notably, even for an inadequately fast measurement frequency, if the measurement strength is sufficiently strong, the $\mathcal{PT}$ symmetry breaking can occur. The current work can enhance our knowledge of quantum measurement and symmetry and may inspire further research on the effect of quantum measurement on symmetry.

preprint 2022 arXiv

Knowledgeable Prompt-tuning: Incorporating Knowledge into Prompt Verbalizer for Text Classification

Tuning pre-trained language models (PLMs) with task-specific prompts has been a promising approach for text classification. Particularly, previous studies suggest that prompt-tuning has remarkable superiority in the low-data scenario over the generic fine-tuning methods with extra classifiers. The core idea of prompt-tuning is to insert text pieces, i.e., template, to the input and transform a classification problem into a masked language modeling problem, where a crucial step is to construct a projection, i.e., verbalizer, between a label space and a label word space. A verbalizer is usually handcrafted or searched by gradient descent, which may lack coverage and bring considerable bias and high variances to the results. In this work, we focus on incorporating external knowledge into the verbalizer, forming a knowledgeable prompt-tuning (KPT), to improve and stabilize prompt-tuning. Specifically, we expand the label word space of the verbalizer using external knowledge bases (KBs) and refine the expanded label word space with the PLM itself before predicting with the expanded label word space. Extensive experiments on zero and few-shot text classification tasks demonstrate the effectiveness of knowledgeable prompt-tuning.

preprint · 2022 · arXiv

Learning to Express in Knowledge-Grounded Conversation

Grounding dialogue generation in extra knowledge has shown great potential for building a system capable of replying with knowledgeable and engaging responses. Existing studies focus on how to synthesize a response with proper knowledge, yet neglect that the same knowledge could be expressed differently by speakers even under the same context. In this work, we mainly consider two aspects of knowledge expression, namely the structure of the response and the style of the content in each part. We therefore introduce two sequential latent variables to represent the structure and the content style, respectively. We propose a segmentation-based generation model and optimize the model by a variational approach to discover the underlying pattern of knowledge expression in a response. Evaluation results on two benchmarks indicate that our model can learn the structure style defined by a few examples and generate responses in the desired content style.

preprint · 2022 · arXiv

Learning What You Need from What You Did: Product Taxonomy Expansion with User Behaviors Supervision

Taxonomies have been widely used in various domains to underpin numerous applications. In particular, product taxonomies play an essential role in the e-commerce domain for recommendation, browsing, and query understanding. However, taxonomies need to constantly capture newly emerged terms or concepts on e-commerce platforms to stay up to date, which is expensive and labor-intensive if it relies on manual maintenance and updates. Therefore, we target the taxonomy expansion task to attach new concepts to existing taxonomies automatically. In this paper, we present a self-supervised and user behavior-oriented product taxonomy expansion framework to append new concepts to existing taxonomies. Our framework extracts hyponymy relations that conform to users' intentions and cognition. Specifically, i) to fully exploit user behavioral information, we extract candidate hyponymy relations that match user interests from query-click concepts; ii) to enhance the semantic information of new concepts and better detect hyponymy relations, we model concepts and relations through both user-generated content and structural information in existing taxonomies and user click logs, by leveraging Pre-trained Language Models and Graph Neural Networks combined with Contrastive Learning; iii) to reduce the cost of dataset construction and overcome data skews, we construct a high-quality and balanced training dataset from the existing taxonomy with no supervision. Extensive experiments on real-world product taxonomies on the Meituan platform, a leading Chinese vertical e-commerce platform for ordering take-out with more than 70 million daily active users, demonstrate the superiority of our proposed framework over state-of-the-art methods. Notably, our method enlarges the size of real-world product taxonomies from 39,263 to 94,698 relations with 88% precision.

preprint · 2022 · arXiv

Let Me Check the Examples: Enhancing Demonstration Learning via Explicit Imitation

Demonstration learning aims to guide the prompt prediction by providing answered demonstrations in few-shot settings. Despite achieving promising results, existing work only concatenates the answered examples as demonstrations to the prompt template (including the raw context) without any additional operation, neglecting the prompt-demonstration dependencies. Besides, prior research found that randomly replacing the labels of demonstrations only marginally hurts performance, illustrating that the model could not properly learn the knowledge brought by the demonstrations. Inspired by the human learning process, in this paper, we introduce Imitation DEMOnstration Learning (Imitation-Demo) to strengthen demonstration learning by explicitly imitating human review behaviour, which includes (1) a contrastive learning mechanism to concentrate on similar demonstrations and (2) a demonstration-label re-prediction method to consolidate known knowledge. Experimental results show that our proposed method achieves state-of-the-art performance on 11 out of 14 classification corpora. Further studies also show that Imitation-Demo strengthens the association between prompt and demonstrations, which could provide the basis for exploring how demonstration learning works.

preprint · 2022 · arXiv

Local central limit theorem for gradient field models

We consider the gradient field model in $\left[ -N,N\right] ^{2}\cap \mathbb{Z}^{2}$ with a uniformly convex interaction potential. Naddaf-Spencer \cite{NS} and Miller \cite{Mi} proved that the macroscopic averages of linear statistics of the field converge to a continuum Gaussian free field. In this paper we prove that the distribution of $ϕ(0)/\sqrt{\log N}$ converges uniformly to a Gaussian density, with a Berry-Esseen type bound. This implies that the distribution of $ϕ(0)$ is sufficiently `Gaussian like' on the interval $[-\sqrt {\log N}, \sqrt {\log N}]$.

preprint · 2022 · arXiv

Long Short-Term Preference Modeling for Continuous-Time Sequential Recommendation

Modeling the evolution of user preference is essential in recommender systems. Recently, dynamic graph-based methods have been studied and have achieved SOTA results for recommendation, most of which focus on users' stable long-term preferences. However, in real-world scenarios, a user's short-term preference evolves dynamically over time. Although there exist sequential methods that attempt to capture it, how to model the evolution of short-term preference with dynamic graph-based methods has not yet been well addressed. In particular: 1) existing methods do not explicitly encode and capture the evolution of short-term preference as sequential methods do; 2) simply using the last few interactions is not enough for modeling the changing trend. In this paper, we propose Long Short-Term Preference Modeling for Continuous-Time Sequential Recommendation (LSTSR) to capture the evolution of short-term preference on dynamic graphs. Specifically, we explicitly encode short-term preference and optimize it via a memory mechanism, which has three key operations: Message, Aggregate and Update. Our memory mechanism can not only store one-hop information, but also be triggered by new interactions online. Extensive experiments conducted on five public datasets show that LSTSR consistently outperforms many state-of-the-art recommendation methods across various lines.

preprint · 2022 · arXiv

Mask and Reason: Pre-Training Knowledge Graph Transformers for Complex Logical Queries

Knowledge graph (KG) embeddings have been a mainstream approach for reasoning over incomplete KGs. However, limited by their inherently shallow and static architectures, they can hardly deal with the rising focus on complex logical queries, which comprise logical operators, imputed edges, multiple source entities, and unknown intermediate entities. In this work, we present the Knowledge Graph Transformer (kgTransformer) with masked pre-training and fine-tuning strategies. We design a KG triple transformation method to enable Transformer to handle KGs, which is further strengthened by the Mixture-of-Experts (MoE) sparse activation. We then formulate the complex logical queries as masked prediction and introduce a two-stage masked pre-training strategy to improve transferability and generalizability. Extensive experiments on two benchmarks demonstrate that kgTransformer can consistently outperform both KG embedding-based baselines and advanced encoders on nine in-domain and out-of-domain reasoning tasks. Additionally, kgTransformer can reason with explainability via providing the full reasoning paths to interpret given answers.

preprint · 2022 · arXiv

Non-Markovian quantum thermometry

The rapidly developing quantum technologies and thermodynamics have put forward a requirement to precisely control and measure the temperature of microscopic matter at the quantum level. Many quantum thermometry schemes have been proposed. However, precisely measuring low temperature is still challenging because the obtained sensing errors generally tend to diverge with decreasing temperature. Using a continuous-variable system as a thermometer, we propose non-Markovian quantum thermometry to measure the temperature of a quantum reservoir. A mechanism to make the sensing error $δT$ scale with the temperature $T$ as the Landau bound $δT\simeq T$ in the full-temperature regime is discovered. Our analysis reveals that it is the quantum criticality of the total thermometer-reservoir system that causes this enhanced sensitivity. Efficiently avoiding the error-divergence problem, our result gives an efficient way to precisely measure the low temperature of quantum systems.

preprint · 2022 · arXiv

Pay More Attention to History: A Context Modelling Strategy for Conversational Text-to-SQL

Conversational text-to-SQL aims at converting multi-turn natural language queries into their corresponding SQL (Structured Query Language) representations. One of the most intractable problems of conversational text-to-SQL is modelling the semantics of multi-turn queries and gathering the proper information required for the current query. This paper shows that explicitly modelling the semantic changes introduced by each added turn, together with a summarization of the whole context, can bring better performance on converting conversational queries into SQL. In particular, we propose two conversational modelling tasks, at turn grain and at conversation grain. These two tasks simply work as auxiliary training tasks to help with multi-turn conversational semantic parsing. We conducted empirical studies and achieved new state-of-the-art results on the large-scale open-domain conversational text-to-SQL dataset. The results demonstrate that the proposed mechanism significantly improves the performance of multi-turn semantic parsing.

preprint · 2022 · arXiv

Perceptual Quality Assessment for Fine-Grained Compressed Images

Recent years have witnessed the rapid development of image storage and transmission systems, in which image compression plays an important role. Generally speaking, image compression algorithms are developed to ensure good visual quality at limited bit rates. However, due to the different compression optimization methods, the compressed images may have different levels of quality, which need to be evaluated quantitatively. Nowadays, the mainstream full-reference (FR) metrics are effective at predicting the quality of compressed images at coarse-grained levels (where the bit-rate differences between compressed images are obvious); however, they may perform poorly for fine-grained compressed images whose bit-rate differences are quite subtle. Therefore, to improve the Quality of Experience (QoE) and provide useful guidance for compression algorithms, we propose a full-reference image quality assessment (FR-IQA) method for compressed images at fine-grained levels. Specifically, the reference images and compressed images are first converted to the $YCbCr$ color space. The gradient features are extracted from regions that are sensitive to compression artifacts. Then we employ the Log-Gabor transformation to further analyze the texture difference. Finally, the obtained features are fused into a quality score. The proposed method is validated on the fine-grained compression image quality assessment (FGIQA) database, which is specially constructed for assessing the quality of compressed images with close bit rates. The experimental results show that our metric outperforms mainstream FR-IQA metrics on the FGIQA database. We also test our method on other commonly used compression IQA databases, and the results show that our method obtains competitive performance on the coarse-grained compression IQA databases as well.

preprint · 2022 · arXiv

Power law decay at criticality for the q-state antiferromagnetic Potts model on regular trees

We present a proof of the power law decay of the magnetic moment for the $q$-state antiferromagnetic Potts model on the regular tree at the critical temperature, and also show that the exact exponent is $\frac{1}{2}$. Our proof relies on the assumption of uniqueness at the critical temperature, which has been established for $q=3,4$, and for $q \ge 5$ with large degree. An iterative contraction inequality is developed, which may be of independent interest.

preprint · 2022 · arXiv

Searching for Optimal Subword Tokenization in Cross-domain NER

Input distribution shift is one of the vital problems in unsupervised domain adaptation (UDA). The most popular UDA approaches focus on domain-invariant representation learning (DIRL), trying to align the features from different domains into similar feature distributions. However, these approaches ignore the direct alignment of input word distributions between domains, which is a vital factor in word-level classification tasks such as cross-domain NER. In this work, we shed new light on cross-domain NER by introducing a subword-level solution, X-Piece, for input word-level distribution shift in NER. Specifically, we re-tokenize the input words of the source domain to approach the target subword distribution, which is formulated and solved as an optimal transport problem. As this approach focuses on the input level, it can also be combined with previous DIRL methods for further improvement. Experimental results on four benchmark NER datasets show the effectiveness of the proposed method based on a BERT tagger. The proposed method is also shown to benefit DIRL methods such as DANN.

preprint · 2022 · arXiv

Self-Testing of a Single Quantum System: Theory and Experiment

Certifying individual quantum devices with minimal assumptions is crucial for the development of quantum technologies. Here, we investigate how to leverage single-system contextuality to realize self-testing. We develop a robust self-testing protocol based on the simplest contextuality witness for the simplest contextual quantum system, the Klyachko-Can-Binicioğlu-Shumovsky (KCBS) inequality for the qutrit. We establish a lower bound on the fidelity of the state and the measurements (to an ideal configuration) as a function of the value of the witness under a pragmatic assumption on the measurements we call the KCBS orthogonality condition. We apply the method in an experiment with randomly chosen measurements on a single trapped $^{40}{\rm Ca}^+$ ion with near-perfect detection efficiency. The observed statistics allow us to self-test the system and provide the first experimental demonstration of quantum self-testing of a single system. Further, we quantify and report that deviations from our assumptions are minimal, an aspect previously overlooked by contextuality experiments.

preprint · 2022 · arXiv

Statistical Depth for Point Process via the Isometric Log-Ratio Transformation

Statistical depth, a useful tool to measure the center-outward rank of multivariate and functional data, is still under-explored in temporal point processes. Recent studies on point process depth proposed a weighted product of two terms - one indicates the depth of the cardinality of the process, and the other characterizes the conditional depth of the temporal events given the cardinality. The second term is very challenging because of the apparent nonlinear structure of event times, and so far only basic parametric representations such as Gaussian and Dirichlet densities have been adopted in the definitions. However, these simplified forms ignore the underlying distribution of the process events, which makes the methods difficult to interpret and to apply to complicated patterns. To deal with these problems, in this paper we propose a distribution-based approach to the conditional depth via the well-known Isometric Log-Ratio (ILR) transformation on the inter-event times. The new depth, called the ILR depth, is first defined for the homogeneous Poisson process by using the density function on the transformed space. The definition is then extended to any general point process via a time-rescaling transformation. We illustrate the ILR depth using simulations of Poisson and non-Poisson processes and demonstrate its superiority over previous methods. We also thoroughly examine its mathematical properties and asymptotics in large samples. Finally, we apply the ILR depth to a real dataset, and the result clearly shows the effectiveness of the new method.

preprint · 2022 · arXiv

Structural Bias for Aspect Sentiment Triplet Extraction

Structural bias has recently been exploited for aspect sentiment triplet extraction (ASTE) and has led to improved performance. On the other hand, it is recognized that explicitly incorporating structural bias would have a negative impact on efficiency, whereas pretrained language models (PLMs) can already capture implicit structures. Thus, a natural question arises: Is structural bias still a necessity in the context of PLMs? To answer the question, we propose to address the efficiency issues by using an adapter to integrate structural bias in the PLM and using a cheap-to-compute relative position structure in place of the syntactic dependency structure. Benchmarking evaluation is conducted on the SemEval datasets. The results show that our proposed structural adapter is beneficial to PLMs and achieves state-of-the-art performance over a range of strong baselines, yet with a light parameter demand and low latency. Meanwhile, we raise the concern that the current default evaluation on small-scale data is under-confident. Consequently, we release a large-scale dataset for ASTE. The results on the new dataset suggest that the structural adapter is confidently effective and efficient at large scale. Overall, we draw the conclusion that structural bias shall still be a necessity even with PLMs.

preprint · 2022 · arXiv

TANet: Thread-Aware Pretraining for Abstractive Conversational Summarization

Although pre-trained language models (PLMs) have achieved great success and become a milestone in NLP, abstractive conversational summarization remains a challenging but less studied task. The difficulty lies in two aspects. One is the lack of large-scale conversational summary data. The other is that applying the existing pre-trained models to this task is tricky because of the structural dependence within the conversation and its informal expression. In this work, we first build a large-scale (11M) pretraining dataset called RCS, based on the multi-person discussions in the Reddit community. We then present TANet, a thread-aware Transformer-based network. Unlike the existing pre-trained models that treat a conversation as a sequence of sentences, we argue that the inherent contextual dependency among the utterances plays an essential role in understanding the entire conversation and thus propose two new techniques to incorporate the structural information into our model. The first is thread-aware attention, which is computed by taking into account the contextual dependency within utterances. Second, we apply a thread prediction loss to predict the relations between utterances. We evaluate our model on four datasets of real conversations, covering meeting transcripts, customer-service records, and forum threads. Experimental results demonstrate that TANet achieves a new state-of-the-art in terms of both automatic evaluation and human judgment.

preprint · 2022 · arXiv

Target-Relevant Knowledge Preservation for Multi-Source Domain Adaptive Object Detection

Domain adaptive object detection (DAOD) is a promising way to alleviate the performance drop of detectors in new scenes. Despite great effort in single-source domain adaptation, the more general task with multiple source domains remains underexplored, due to knowledge degradation during their combination. To address this issue, we propose a novel approach, namely target-relevant knowledge preservation (TRKP), to unsupervised multi-source DAOD. Specifically, TRKP adopts the teacher-student framework, where the multi-head teacher network is built to extract knowledge from labeled source domains and guide the student network to learn detectors in the unlabeled target domain. The teacher network is further equipped with an adversarial multi-source disentanglement (AMSD) module to preserve source domain-specific knowledge and simultaneously perform cross-domain alignment. Besides, a holistic target-relevant mining (HTRM) scheme is developed to re-weight the source images according to the source-target relevance. In this way, the teacher network is forced to capture target-relevant knowledge, which helps reduce domain shift when mentoring object detection in the target domain. Extensive experiments are conducted on various widely used benchmarks with new state-of-the-art scores reported, highlighting the effectiveness of the approach.

preprint · 2022 · arXiv

Unified Knowledge Prompt Pre-training for Customer Service Dialogues

Dialogue bots have been widely applied in customer service scenarios to provide a timely and user-friendly experience. These bots must classify the appropriate domain of a dialogue, understand the intent of users, and generate proper responses. Existing dialogue pre-training models are designed only for several dialogue tasks and ignore weakly-supervised expert knowledge in customer service dialogues. In this paper, we propose a novel unified knowledge prompt pre-training framework, UFA (Unified Model For All Tasks), for customer service dialogues. We formulate all the tasks of customer service dialogues as a unified text-to-text generation task and introduce a knowledge-driven prompt strategy to jointly learn from a mixture of distinct dialogue tasks. We pre-train UFA on a large-scale Chinese customer service corpus collected from practical scenarios and obtain significant improvements on both natural language understanding (NLU) and natural language generation (NLG) benchmarks.

preprint · 2022 · arXiv

Unmanned Aerial Vehicle Swarm-Enabled Edge Computing: Potentials, Promising Technologies, and Challenges

Unmanned aerial vehicle (UAV) swarm-enabled edge computing is envisioned to be promising in sixth-generation wireless communication networks due to its wide application scenarios and flexible deployment. However, most of the existing works focus on edge computing enabled by a single UAV or a small number of UAVs, which is very different from UAV swarm-enabled edge computing. In order to facilitate the practical applications of UAV swarm-enabled edge computing, the state-of-the-art research is presented in this article. The potential applications, architectures and implementation considerations are illustrated. Moreover, the promising enabling technologies for UAV swarm-enabled edge computing are discussed. Furthermore, we outline challenges and open issues in order to shed light on future research directions.

preprint · 2022 · arXiv

Unsupervised Learning of Accurate Siamese Tracking

Unsupervised learning has been popular in various computer vision tasks, including visual object tracking. However, prior unsupervised tracking approaches rely heavily on spatial supervision from template-search pairs and are still unable to track objects with strong variation over a long time span. As unlimited self-supervision signals can be obtained by tracking a video along a cycle in time, we investigate evolving a Siamese tracker by tracking videos forward-backward. We present a novel unsupervised tracking framework, in which we can learn temporal correspondence on both the classification branch and the regression branch. Specifically, to propagate reliable template features in the forward propagation process so that the tracker can be trained in the cycle, we first propose a consistency propagation transformation. We then identify an ill-posed penalty problem in conventional cycle training in the backward propagation process. Thus, a differentiable region mask is proposed to select features as well as to implicitly penalize tracking errors on intermediate frames. Moreover, since noisy labels may degrade training, we propose a mask-guided loss reweighting strategy to assign dynamic weights based on the quality of pseudo labels. In extensive experiments, our tracker outperforms preceding unsupervised methods by a substantial margin, performing on par with supervised methods on large-scale datasets such as TrackingNet and LaSOT. Code is available at https://github.com/FlorinShum/ULAST.

preprint · 2022 · arXiv

Work statistics and thermal phase transitions

Many previous studies have demonstrated that work statistics can exhibit certain singular behaviors in the quantum critical regimes of many-body systems at zero or very low temperatures. However, as the temperature increases, it is commonly believed that such singularities will vanish. Contrary to this common belief, we report a nonanalytic behavior of the averaged work done, which occurs at finite temperature, in the Dicke model as well as the Lipkin-Meshkov-Glick model subjected to sudden quenches of their work parameters. It is revealed that work statistics can be viewed as a signature of the thermal phase transition when the quenched parameters are tuned across the critical line that separates two different thermal phases.

preprint · 2021 · arXiv

BaPipe: Exploration of Balanced Pipeline Parallelism for DNN Training

The size of deep neural networks (DNNs) grows rapidly as the complexity of the machine learning algorithm increases. To satisfy the requirement of computation and memory of DNN training, distributed deep learning based on model parallelism has been widely recognized. We propose a new pipeline parallelism training framework, BaPipe, which can automatically explore pipeline parallelism training methods and balanced partition strategies for DNN distributed training. In BaPipe, each accelerator calculates the forward propagation and backward propagation of different parts of networks to implement the intra-batch pipeline parallelism strategy. BaPipe uses a new load balancing automatic exploration strategy that considers the parameters of DNN models and the computation, memory, and communication resources of accelerator clusters. We have trained different DNNs such as VGG-16, ResNet-50, and GNMT on GPU clusters and simulated the performance of different FPGA clusters. Compared with state-of-the-art data parallelism and pipeline parallelism frameworks, BaPipe provides up to 3.2x speedup and 4x memory reduction in various platforms.

preprint · 2021 · arXiv

BSN++: Complementary Boundary Regressor with Scale-Balanced Relation Modeling for Temporal Action Proposal Generation

Generating human action proposals in untrimmed videos is an important yet challenging task with wide applications. Current methods often suffer from noisy boundary locations and the inferior quality of the confidence scores used for proposal retrieval. In this paper, we present BSN++, a new framework which exploits a complementary boundary regressor and relation modeling for temporal proposal generation. First, we propose a novel boundary regressor based on the complementary characteristics of both starting and ending boundary classifiers. Specifically, we utilize a U-shaped architecture with nested skip connections to capture rich contexts and introduce a bi-directional boundary matching mechanism to improve boundary precision. Second, to account for the proposal-proposal relations ignored in previous methods, we devise a proposal relation block which includes two self-attention modules from the aspects of position and channel. Furthermore, we find that there inevitably exist data imbalance problems in the positive/negative proposals and temporal durations, which harm the model performance on tail distributions. To relieve this issue, we introduce a scale-balanced re-sampling strategy. Extensive experiments are conducted on two popular benchmarks, ActivityNet-1.3 and THUMOS14, which demonstrate that BSN++ achieves state-of-the-art performance. Not surprisingly, the proposed BSN++ ranked first on the CVPR19 - ActivityNet challenge leaderboard on the temporal action localization task.

preprint · 2021 · arXiv

Electro-Optic Lithium Niobate Metasurfaces

Many applications of metasurfaces require the ability to dynamically change their properties in the time domain. Electrical tuning techniques are of particular interest, since they pave the way to on-chip integration of metasurfaces with optoelectronic devices. In this work, we propose and experimentally demonstrate an electro-optic lithium niobate (EO-LN) metasurface that shows dynamic modulation of the phase retardation of transmitted light. Quasi-bound states in the continuum (QBIC) are observed in our metasurface. By applying external electric voltages, the refractive index of the LN is changed by the Pockels EO nonlinearity, leading to efficient phase modulation of the transmitted light around the QBIC wavelength. Our EO-LN metasurface opens up new routes for potential applications in the fields of display, pulse shaping, and spatial light modulation.

preprint · 2021 · arXiv

Learning Statistical Texture for Semantic Segmentation

Existing semantic segmentation works mainly focus on learning the contextual information in high-level semantic features with CNNs. In order to maintain a precise boundary, low-level texture features are directly skip-connected into the deeper layers. Nevertheless, texture features are not only about local structure, but also include global statistical knowledge of the input image. In this paper, we take full advantage of the low-level texture features and propose a novel Statistical Texture Learning Network (STLNet) for semantic segmentation. For the first time, STLNet analyzes the distribution of low-level information and efficiently utilizes it for the task. Specifically, a novel Quantization and Counting Operator (QCO) is designed to describe the texture information in a statistical manner. Based on QCO, two modules are introduced: (1) Texture Enhance Module (TEM), to capture texture-related information and enhance the texture details; (2) Pyramid Texture Feature Extraction Module (PTFEM), to effectively extract the statistical texture features from multiple scales. Through extensive experiments, we show that the proposed STLNet achieves state-of-the-art performance on three semantic segmentation benchmarks: Cityscapes, PASCAL Context and ADE20K.

preprint · 2021 · arXiv

Non-Fermi liquid phase and linear-in-temperature scattering rate in overdoped two dimensional Hubbard model

Understanding electronic properties that violate the Landau Fermi liquid paradigm in cuprate superconductors remains a major challenge in condensed matter physics. The strange metal state in overdoped cuprates that exhibits linear-in-temperature scattering rate and dc resistivity is a particularly puzzling example. Here, we compute the electronic scattering rate in the two-dimensional Hubbard model using a cluster generalization of dynamical mean-field theory. We present a global phase diagram documenting an apparent non-Fermi liquid phase, in between the pseudogap and Fermi liquid phases in the doped Mott insulator regime. We discover that in this non-Fermi liquid phase, the electronic scattering rate $γ_k(T)$ can display linear temperature dependence as temperature $T$ goes to zero. In the temperature range that we can access, the $T$-dependent scattering rate is isotropic on the Fermi surface, in agreement with recent experiments. Using fluctuation diagnostic techniques, we identify antiferromagnetic fluctuations as the physical origin of the $T$-linear electronic scattering rate.

preprint · 2021 · arXiv

SuperNeurons: FFT-based Gradient Sparsification in the Distributed Training of Deep Neural Networks

The performance and efficiency of distributed training of Deep Neural Networks highly depend on the performance of gradient averaging among all participating nodes, which is bounded by the communication between nodes. There are two major strategies to reduce communication overhead: one is to hide communication by overlapping it with computation, and the other is to reduce message sizes. The first solution works well for linear neural architectures, but recent networks such as ResNet and Inception offer limited opportunity for this overlapping. Therefore, researchers have paid more attention to minimizing communication. In this paper, we present a novel gradient compression framework, derived from insights into real gradient distributions, which strikes a balance between compression ratio, accuracy, and computational overhead. Our framework has two major novel components: sparsification of gradients in the frequency domain, and a range-based floating point representation to quantize and further compress gradient frequencies. Both components are dynamic, with tunable parameters that achieve different compression ratios based on the accuracy requirement and the system platform, and achieve very high throughput on GPUs. We prove that our techniques guarantee convergence with a diminishing compression ratio. Our experiments show that the proposed compression framework effectively improves the scalability of most popular neural networks on a 32 GPU cluster relative to the no-compression baseline, without compromising the accuracy and convergence speed.

preprint2020arXiv

A linear combination of atomic orbitals (LCAO) model for deterministically placed acceptor arrays in silicon

We develop a tight-binding model based on linear combination of atomic orbitals (LCAO) methods to describe the electronic structure of arrays of acceptors, where the underlying basis states are derived from an effective-mass-theory solution for a single acceptor in either the spherical approximation or the cubic model. Our model allows for arbitrarily strong spin-orbit coupling in the valence band of the semiconductor. We have studied pairs and dimerised linear chains of acceptors in silicon in the 'independent-hole' approximation, and investigated the conditions for the existence of topological edge states in the chains. For the finite chain we find a complex interplay between electrostatic effects and the dimerisation, with the long-range Coulomb attraction of the hole to the acceptors splitting off states localised at the end acceptors from the rest of the chain. A further pair of states then splits off from each band to form a pair localised on the next-to-end acceptors for one sense of the bond alternation, and merges into the bulk bands for the other sense of the alternation. We confirm the topologically non-trivial nature of these next-to-end localised states by calculating the Zak phase. We argue that for the more physically accessible case of one hole per acceptor these long-range electrostatic effects will be screened out; we show this by treating a simple phenomenologically screened model in which electrostatic contributions from beyond the nearest neighbours of each acceptor pair are removed. Topological states are now found on the end acceptors of the chains. In some cases the termination of the chain required to produce topological states is not the one expected on the basis of simple geometry (short versus long bonds); we argue this is because of a non-monotonic relationship between the bond length and the effective Hamiltonian matrix elements between the acceptors.

preprint2020arXiv

Class-wise Dynamic Graph Convolution for Semantic Segmentation

Recent works have made great progress in semantic segmentation by exploiting contextual information in a local or global manner with dilated convolutions, pyramid pooling or self-attention mechanisms. In order to avoid the potentially misleading aggregation of contextual information in previous works, we propose a class-wise dynamic graph convolution (CDGC) module to adaptively propagate information. The graph reasoning is performed among pixels in the same class. Based on the proposed CDGC module, we further introduce the Class-wise Dynamic Graph Convolution Network (CDGCNet), which consists of two main parts, the CDGC module and a basic segmentation network, forming a coarse-to-fine paradigm. Specifically, the CDGC module takes the coarse segmentation result as a class mask to extract node features for graph construction and performs dynamic graph convolutions on the constructed graph to learn the feature aggregation and weight allocation. Then the refined feature and the original feature are fused to get the final prediction. We conduct extensive experiments on three popular semantic segmentation benchmarks, Cityscapes, PASCAL VOC 2012 and COCO Stuff, and achieve state-of-the-art performance on all three.

preprint2020arXiv

Collaborative Distillation in the Parameter and Spectrum Domains for Video Action Recognition

Recent years have witnessed significant progress on the action recognition task with deep networks. However, most current video networks require large memory and computational resources, which hinders their application in practice. Existing knowledge distillation methods are limited to the image-level spatial domain, ignoring the temporal and frequency information, which provides structural knowledge and is important for video analysis. This paper explores how to train small and efficient networks for action recognition. Specifically, we propose two distillation strategies in the frequency domain, namely feature spectrum distillation and parameter distribution distillation. Our insight is that appealing action recognition performance requires explicitly modeling the temporal frequency spectrum of video features. Therefore, we introduce a spectrum loss that forces the student network to mimic the temporal frequency spectrum of the teacher network, instead of implicitly distilling features as in many previous works. Second, the parameter frequency distribution is further adopted to guide the student network to learn the appearance modeling process from the teacher. In addition, a collaborative learning strategy is presented to optimize the training process from a probabilistic view. Extensive experiments are conducted on several action recognition benchmarks, such as Kinetics, Something-Something, and Jester, which consistently verify the effectiveness of our approach and demonstrate that our method can achieve higher performance than state-of-the-art methods with the same backbone.
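A spectrum loss of the kind described above can be sketched as follows. The abstract does not give the exact formulation, so everything here is an assumption made for illustration: features are taken to be arrays of shape `(T, C)` (frames by channels) with matching shapes for student and teacher, the spectrum is the FFT magnitude along the time axis, and the discrepancy is a mean squared error.

```python
import numpy as np

def temporal_spectrum(features):
    """Magnitude spectrum along the time axis; features has shape (T, C)."""
    return np.abs(np.fft.rfft(features, axis=0))

def spectrum_loss(student, teacher):
    """Mean squared difference between temporal magnitude spectra.

    Hypothetical form: the paper's actual loss may weight frequencies
    differently or use another distance.
    """
    diff = temporal_spectrum(student) - temporal_spectrum(teacher)
    return float(np.mean(diff ** 2))

rng = np.random.default_rng(0)
teacher = rng.normal(size=(16, 8))                   # 16 frames, 8 channels
student = teacher + 0.1 * rng.normal(size=(16, 8))   # imperfect student
print(spectrum_loss(student, teacher))
```

Because only magnitudes are compared, this particular form ignores temporal phase; a variant could compare the complex coefficients instead.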

preprint2020arXiv

Complementary Boundary Generator with Scale-Invariant Relation Modeling for Temporal Action Localization: Submission to ActivityNet Challenge 2020

This technical report presents an overview of our solution submitted to ActivityNet Challenge 2020 Task 1 (temporal action localization/detection). Temporal action localization requires not only precisely locating the temporal boundaries of action instances, but also accurately classifying the untrimmed videos into specific categories. In this paper, we decouple the temporal action localization task into two stages (i.e. proposal generation and classification) and enrich the proposal diversity by exhaustively exploring the influence of multiple components from different but complementary perspectives. Specifically, in order to generate high-quality proposals, we consider several factors including the video feature encoder, the proposal generator, the proposal-proposal relations, the scale imbalance, and the ensemble strategy. Finally, in order to obtain accurate detections, we further train an optimal video classifier to recognize the generated proposals. Our proposed scheme achieves state-of-the-art performance on the temporal action localization task with 42.26 average mAP on the challenge testing set.

preprint2020arXiv

Controllable dynamics of a dissipative two-level system

We propose a strategy to modulate the decoherence dynamics of a two-level system, which interacts with a dissipative bosonic environment, by introducing an assisted degree of freedom. We show that the decay rate of the two-level system can be significantly suppressed by suitably steering the assisted degree of freedom. Our result provides an alternative way to fight decoherence and realize controllable dissipative dynamics.

preprint2020arXiv

Convenient Real-Time Monitoring of the Contamination of Surface Ion Trap

Recent studies indicated that contamination by adatoms on the surface ion trap can generate contact potential, leading to fluctuations in patch potential. By investigating contamination induced by surface adatoms during a loading process, a direct physical image of the contamination process and the relationship between the capacitance change and the contamination from surface adatoms are examined theoretically and experimentally. Based on this relationship, the contamination by surface adatoms and the effect of in situ treatment processes can be monitored in real time through the capacitance between electrodes. This study is foundational to further research on anomalous heating, with practical applications in quantum information processing with surface ion traps.

preprint2020arXiv

Coreference Resolution as Query-based Span Prediction

In this paper, we present an accurate and extensible approach for the coreference resolution task. We formulate the problem as a span prediction task, like in machine reading comprehension (MRC): A query is generated for each candidate mention using its surrounding context, and a span prediction module is employed to extract the text spans of the coreferences within the document using the generated query. This formulation comes with the following key advantages: (1) The span prediction strategy provides the flexibility of retrieving mentions left out at the mention proposal stage; (2) In the MRC framework, encoding the mention and its context explicitly in a query makes it possible to have a deep and thorough examination of cues embedded in the context of coreferent mentions; and (3) A plethora of existing MRC datasets can be used for data augmentation to improve the model's generalization capability. Experiments demonstrate significant performance boost over previous models, with 87.5 (+2.5) F1 score on the GAP benchmark and 83.1 (+3.5) F1 score on the CoNLL-2012 benchmark.

preprint2020arXiv

Deep learning to estimate the physical proportion of infected region of lung for COVID-19 pneumonia with CT image set

Utilizing computed tomography (CT) images to quickly estimate the severity of COVID-19 cases is one of the most straightforward and efficacious methods. Two tasks were studied in the present paper. One was to segment the mask of the intact lung in cases of pneumonia. The other was to generate the masks of regions infected by COVID-19. The masks of these two parts of the images were then converted to corresponding volumes to calculate the physical proportion of the infected region of the lung. A total of 129 CT image sets were collected and studied. The intrinsic Hounsfield value of the CT images was first utilized to generate an initial, noisy version of the labeled masks for both the intact lung and the infected regions. Then, the samples were carefully adjusted and improved by two professional radiologists to generate the final training set and test benchmark. Two deep learning models were evaluated: UNet and 2.5D UNet. For the segmentation of infected regions, a deep learning based classifier was then applied to remove unrelated blur-edged regions that were wrongly segmented, such as air tubes and blood vessel tissue. For the segmented masks of the intact lung and the infected regions, the best method achieved mean Dice similarity coefficients of 0.972 and 0.757, respectively, on our test benchmark. For the overall proportion of the infected region of the lung, the final result showed a Pearson's correlation coefficient of 0.961 and a mean absolute percent error of 11.7%. The instant proportion of infected regions of the lung could be used as visual evidence to assist clinical physicians in determining the severity of a case. Furthermore, a quantified report of infected regions can help predict the prognosis for COVID-19 cases that were scanned periodically within the treatment cycle.
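The two quantities reported above, the Dice similarity coefficient and the infected proportion of the lung, can be computed from binary masks as in the sketch below. The function names, the `voxel_volume_mm3` parameter, and the choice to count only infected voxels that lie inside the lung mask are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

def dice(pred, truth):
    """Dice similarity coefficient 2|A∩B| / (|A| + |B|) of two binary masks."""
    pred = np.asarray(pred).astype(bool)
    truth = np.asarray(truth).astype(bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0  # convention: two empty masks agree perfectly
    return 2.0 * np.logical_and(pred, truth).sum() / denom

def infected_proportion(lung_mask, infection_mask, voxel_volume_mm3=1.0):
    """Infected volume divided by lung volume (assumed restricted to the lung)."""
    lung = np.asarray(lung_mask).astype(bool)
    infected = np.logical_and(lung, np.asarray(infection_mask).astype(bool))
    lung_volume = lung.sum() * voxel_volume_mm3
    if lung_volume == 0:
        raise ValueError("lung mask is empty")
    return infected.sum() * voxel_volume_mm3 / lung_volume

# Toy 4x4x4 volume: an 8-voxel "lung" with 4 infected voxels
lung = np.zeros((4, 4, 4), dtype=np.uint8)
lung[1:3, 1:3, 1:3] = 1
infection = np.zeros_like(lung)
infection[1:2, 1:3, 1:3] = 1

print(infected_proportion(lung, infection))  # 0.5
print(dice(lung, infection))                 # 2*4 / (8+4)
```

With uniform voxels the voxel volume cancels in the ratio; it is kept as a parameter only to make the conversion from voxel counts to physical volume explicit.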

preprint2020arXiv

Description Based Text Classification with Reinforcement Learning

The task of text classification is usually divided into two stages: text feature extraction and classification. In this standard formalization, categories are merely represented as indexes in the label vocabulary, and the model lacks explicit instructions on what to classify. Inspired by the current trend of formalizing NLP problems as question answering tasks, we propose a new framework for text classification, in which each category label is associated with a category description. Descriptions are generated by hand-crafted templates or using abstractive/extractive models from reinforcement learning. The concatenation of the description and the text is fed to the classifier to decide whether or not the current label should be assigned to the text. The proposed strategy forces the model to attend to the most salient texts with respect to the label, which can be regarded as a hard version of attention, leading to better performance. We observe significant performance boosts over strong baselines on a wide range of text classification tasks including single-label classification, multi-label classification and multi-aspect sentiment analysis.

preprint2020arXiv

Estimation of the Laser Frequency Noise Spectrum by Continuous Dynamical Decoupling

Decoherence induced by laser frequency noise is one of the most important obstacles in quantum information processing. In order to suppress this decoherence, the noise power spectral density needs to be accurately characterized. In particular, measuring the noise spectrum through the coherence characteristics of qubits is a meaningful but still challenging method. Here, we theoretically analyze and experimentally obtain the spectrum of laser frequency noise based on the continuous dynamical decoupling technique. We first estimate the mixture-noise (including laser and magnetic noises) spectrum up to $2π \times 530$ kHz by monitoring the transverse relaxation from an initial state $+X$, followed by a gradient descent data processing protocol. Then the contribution from the laser noise is extracted by encoding the qubits on different Zeeman sublevels. We also investigate two sufficiently strong noise components by making an analogy between these noises and driving lasers whose linewidth is assumed to be negligible. This method is verified experimentally and finally helps to characterize the noise.

preprint2020arXiv

Glyce: Glyph-vectors for Chinese Character Representations

It is intuitive that NLP tasks for logographic languages like Chinese should benefit from the use of the glyph information in those languages. However, due to the lack of rich pictographic evidence in glyphs and the weak generalization ability of standard computer vision models on character data, an effective way to utilize the glyph information remains to be found. In this paper, we address this gap by presenting Glyce, the glyph-vectors for Chinese character representations. We make three major innovations: (1) We use historical Chinese scripts (e.g., bronzeware script, seal script, traditional Chinese, etc.) to enrich the pictographic evidence in characters; (2) We design CNN structures (called tianzege-CNN) tailored to Chinese character image processing; and (3) We use image-classification as an auxiliary task in a multi-task learning setup to increase the model's ability to generalize. We show that glyph-based models are able to consistently outperform word/char ID-based models in a wide range of Chinese NLP tasks. We are able to set new state-of-the-art results for a variety of Chinese NLP tasks, including tagging (NER, CWS, POS), sentence pair classification, single sentence classification tasks, dependency parsing, and semantic role labeling. For example, the proposed model achieves an F1 score of 80.6 on the OntoNotes dataset of NER, +1.5 over BERT; it achieves an almost perfect accuracy of 99.8% on the Fudan corpus for text classification. Code found at https://github.com/ShannonAI/glyce.

preprint2020arXiv

Heat transfer in a nonequilibrium spin-boson model: A perturbative approach

We investigate heat transport in a nonequilibrium spin-boson model, in which a two-level system bridges two harmonic reservoirs at different temperatures, by employing a unitary transformation along with a resolvent operator expansion technique. Analytical expressions for the heat current and the thermal conductance of this model are obtained. Compared with other methods, namely the nonequilibrium Green's function method and the equation of motion formulation, our approach provides a reasonable description of the heat transfer properties of the nonequilibrium spin-boson model in the weak-coupling region at low temperature.

preprint2020arXiv

Hierarchical Feature Embedding for Attribute Recognition

Attribute recognition is a crucial but challenging task due to viewpoint changes, illumination variations, appearance diversity, etc. Most previous work considers only attribute-level feature embedding, which might perform poorly in complicated heterogeneous conditions. To address this problem, we propose a hierarchical feature embedding (HFE) framework, which learns a fine-grained feature embedding by combining attribute and ID information. In HFE, we maintain the inter-class and intra-class feature embedding simultaneously. Not only samples with the same attribute but also samples with the same ID are gathered more closely, which could restrict the feature embedding of visually hard samples with regard to attributes and improve the robustness to varying conditions. We establish this hierarchical structure by utilizing an HFE loss consisting of attribute-level and ID-level constraints. We also introduce an absolute boundary regularization and a dynamic loss weight as supplementary components to help build up the feature embedding. Experiments show that our method achieves state-of-the-art results on two pedestrian attribute datasets and a facial attribute dataset.

preprint2020arXiv

Influence of equilibrium and nonequilibrium environments on macroscopic realism through the Leggett-Garg inequalities

We study macroscopic realism (macrorealism) through the two- and three-time Leggett-Garg inequalities (LGIs) in a system of two interacting qubits. The two qubits are coupled either to two bosonic (thermal or photonic) baths or to two fermionic (electronic) baths. We study how both equilibrium and nonequilibrium environments influence the LGIs. One way to characterize the nonequilibrium condition is by the temperature difference (for the bosonic baths) or the chemical potential difference (for the fermionic baths). We also study the heat or particle current and the entropy production rate generated by the nonequilibrium environments. Analytical forms of the LGIs and of their maximal value are derived from the quantum master equation beyond the secular approximation. The LGI functions and the corresponding maximal value have separate contributions: a part describing the coherent evolution and a part describing the coupling between the system and the environments. The environment-coupling part can come from either the equilibrium or the nonequilibrium environment. The nonequilibrium dynamics is described by the Bloch-Redfield equation, which is beyond the Lindblad form. We found that the nonequilibriumness quantified by the temperature difference or the chemical potential difference can lead to LGI violations or to an increase of the maximal value of the LGIs, restoring the quantum nature from certain equilibrium cases where the LGIs are preserved. The corresponding nonequilibrium thermodynamic cost is quantified by the nonzero entropy production rate. Our finding of nonequilibrium-promoted LGI violations suggests a new strategy for the design of quantum information processing and quantum computational devices that maintain their quantum nature and quantum correlations for a long time.
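For reference, the standard three-time Leggett-Garg inequality for a dichotomic observable $Q = \pm 1$ measured at times $t_1 < t_2 < t_3$ can be written as below. This is the textbook form, given as background only; the abstract does not state which specific forms the authors use, and the symbols $K_3$ and $C_{ij}$ are notation introduced for this illustration.

```latex
C_{ij} = \langle Q(t_i)\, Q(t_j) \rangle, \qquad
K_3 = C_{21} + C_{32} - C_{31}, \qquad
-3 \le K_3 \le 1 .
```

Macrorealism implies $K_3 \le 1$, so a measured value above 1 signals a violation of the kind the abstract refers to.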

preprint2020arXiv

Low-Resource Knowledge-Grounded Dialogue Generation

Responding with knowledge has been recognized as an important capability for an intelligent conversational agent. Yet knowledge-grounded dialogues, as training data for learning such a response generation model, are difficult to obtain. Motivated by the challenge in practice, we consider knowledge-grounded dialogue generation under a natural assumption that only limited training examples are available. In such a low-resource setting, we devise a disentangled response decoder in order to isolate parameters that depend on knowledge-grounded dialogues from the entire generation model. By this means, the major part of the model can be learned from a large number of ungrounded dialogues and unstructured documents, while the remaining small parameters can be well fitted using the limited training examples. Evaluation results on two benchmarks indicate that with only 1/8 training data, our model can achieve the state-of-the-art performance and generalize well on out-of-domain knowledge.

preprint2020arXiv

Massless Phases for the Villain model in $d\geq 3$

We consider the classical Villain rotator model in $\mathbb{Z}^d, d\geq 3$ at sufficiently low temperature, and prove that the truncated two-point function decays asymptotically as $|x|^{2-d}$, with an algebraic rate of convergence. We also obtain the same asymptotic decay separately for the transversal two-point functions. This quantifies the spontaneous magnetization result for the Villain model at low temperature, and rigorously establishes the Gaussian spin-wave conjecture in dimension $d\ge 3$. We believe that our method extends to finite range interactions and to other abelian spin systems and abelian gauge theory in $d\geq 3$.

preprint2020arXiv

Mott transition and high-temperature crossovers at half-filling

The interaction-driven Mott transition in the half-filled Hubbard model is a first-order phase transition that terminates at a critical point $(T_\mathrm{c},U_\mathrm{c})$ in the temperature-interaction plane $T-U$. A number of crossovers occur along lines that extend for some range above $(T_\mathrm{c},U_\mathrm{c})$. Asymptotically close to $(T_\mathrm{c},U_\mathrm{c})$, these lines coalesce into the so-called Widom line. The existence of $(T_\mathrm{c},U_\mathrm{c})$ and of the associated crossovers becomes unclear when long-wavelength fluctuations or long-range order occur above $(T_\mathrm{c},U_\mathrm{c})$. We study this problem using continuous-time quantum Monte Carlo methods as impurity solvers for both Dynamical Mean-Field Theory (DMFT) and Cellular Dynamical Mean-Field Theory (CDMFT). We contrast the cases of the square lattice, where antiferromagnetic fluctuations dominate in the vicinity of the Mott transition, and the triangular lattice where they do not. The inflexion points and maxima found near the Widom line for the square lattice can serve as proxy for the triangular lattice case. But the only crossover observable in all cases at sufficiently high temperature is that associated with the opening of the Mott gap. The same physics also controls an analog crossover in the resistivity called the "Quantum Widom line".

preprint2020arXiv

Optically Addressed Spatial Light Modulator based on Nonlinear Metasurface

Spatial light modulators (SLMs) are devices for modulating the amplitude, phase or polarization of a light beam on demand. Such devices have had an indispensable influence in many areas, from daily entertainment to scientific research. In the past decades, SLMs have mainly been operated in an electrically addressed (EASLM) manner, wherein the writing images are created and loaded via conventional electronic interfaces. However, the adoption of pixelated electrodes limits both the resolution and the efficiency of EASLMs. Here, we present an optically addressed SLM based on a nonlinear metasurface (MS-OASLM), in which signal light is directly modulated by another writing beam, requiring no electrodes. The MS-OASLM shows unprecedented compactness, with a total thickness of 400 nm, benefitting from the outstanding nonlinearity of the metasurface. Its subwavelength feature size enables a high resolution of up to 250 line pairs per millimeter, which is more than one order of magnitude better than any currently commercial SLM. Such MS-OASLMs could provide opportunities to develop the next generation of high resolution displays and all-optical information processing technologies.

preprint2020arXiv

Regression Models Using Shapes of Functions as Predictors

Functional variables are often used as predictors in regression problems. A commonly-used parametric approach, called scalar-on-function regression, uses the $\mathbb{L}^2$ inner product to map functional predictors into scalar responses. This method can perform poorly when predictor functions contain undesired phase variability, causing phases to have disproportionately large influence on the response variable. One past solution has been to perform phase-amplitude separation (as a pre-processing step) and then use only the amplitudes in the regression model. Here we propose a more integrated approach, termed elastic functional regression model (EFRM), where phase-separation is performed inside the regression model, rather than as a pre-processing step. This approach generalizes the notion of phase in functional data, and is based on the norm-preserving time warping of predictors. Due to its invariance properties, this representation provides robustness to predictor phase variability and results in improved predictions of the response variable over traditional models. We demonstrate this framework using a number of datasets involving gait signals, NMR data, and stock market prices.
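The baseline scalar-on-function regression mentioned above (not the elastic model the paper proposes) can be sketched on a discretized grid. The model assumed here is $y_i = \alpha + \langle f_i, \beta \rangle$, with $\beta$ expanded in a small basis; the basis choice, the function names, and the synthetic data are all illustrative assumptions.

```python
import numpy as np

def l2_inner(f, g, t):
    """Trapezoidal approximation of the L2 inner product of f and g on grid t."""
    h = f * g
    return float(np.sum(0.5 * (h[1:] + h[:-1]) * np.diff(t)))

def fit_scalar_on_function(F, y, t, basis):
    """Least-squares fit of y_i = alpha + sum_j c_j <f_i, b_j>.

    Returns [alpha, c_1, ..., c_J]; beta(t) is then sum_j c_j * b_j(t).
    """
    X = np.array([[1.0] + [l2_inner(f, b, t) for b in basis] for f in F])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Synthetic example: the true coefficient function is beta(t) = 2 sin(2 pi t)
t = np.linspace(0.0, 1.0, 201)
basis = [np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)]
rng = np.random.default_rng(0)
a = rng.normal(size=(50, 3))
F = a[:, [0]] + a[:, [1]] * basis[0] + a[:, [2]] * basis[1]   # 50 predictor curves
y = np.array([0.5 + l2_inner(f, 2.0 * basis[0], t) for f in F])

coef = fit_scalar_on_function(F, y, t, basis)
print(coef)  # close to [0.5, 2.0, 0.0]
```

If each curve were shifted or warped in time, the inner products would change even though the shape of the curve had not, which is the phase sensitivity the abstract describes.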

preprint2020arXiv

SAMOT: Switcher-Aware Multi-Object Tracking and Still Another MOT Measure

Multi-Object Tracking (MOT) is a popular topic in computer vision. However, the identity issue, i.e., an object being wrongly associated with another object of a different identity, remains a challenging problem. To address it, attention should be paid to switchers, i.e., confusing targets that may cause identity issues. Based on this motivation, this paper proposes a novel switcher-aware framework for multi-object tracking, which consists of a Spatial Conflict Graph model (SCG) and Switcher-Aware Association (SAA). The SCG eliminates spatial switchers within one frame by building a conflict graph and working out the optimal subgraph. The SAA utilizes additional information from potential temporal switchers across frames, enabling more accurate data association. In addition, we propose a new MOT evaluation measure, the Still Another IDF score (SAIDF), which aims to focus more on identity issues. This new measure may overcome some problems of the previous measures and provide better insight into identity issues in MOT. Finally, the proposed framework is tested under both the traditional measures and the new measure we propose. Extensive experiments show that our method achieves competitive results on all measures.

preprint2020arXiv

Scope Head for Accurate Localization in Object Detection

Existing anchor-based and anchor-free object detectors in multi-stage or one-stage pipelines have achieved very promising detection performance. However, they still encounter the design difficulty of hand-crafted 2D anchor definition and the learning complexity of 1D direct location regression. To tackle these issues, in this paper, we propose a novel detector coined ScopeNet, which models the anchors of each location as a mutually dependent relationship. This approach quantizes the prediction space and employs a coarse-to-fine strategy for localization. It achieves flexibility comparable to that of regression-based anchor-free methods while producing more precise predictions. In addition, an inherent anchor selection score is learned to indicate the localization quality of the detection result, and we propose to better represent the confidence of a detection box by combining the category-classification score and the anchor-selection score. With our concise and effective design, the proposed ScopeNet achieves state-of-the-art results on COCO

preprint2020arXiv

Towards information-rich, logical text generation with knowledge-enhanced neural models

Text generation systems have made substantial progress thanks to deep learning techniques and have been widely applied in daily life. However, existing end-to-end neural models tend to generate uninformative and generic text because they cannot ground the input context in background knowledge. To solve this problem, many researchers have begun to consider incorporating external knowledge into text generation systems, an approach known as knowledge-enhanced text generation. The challenges of knowledge-enhanced text generation include how to select the appropriate knowledge from large-scale knowledge bases, how to read and understand the extracted knowledge, and how to integrate knowledge into the generation process. This survey gives a comprehensive review of knowledge-enhanced text generation systems, summarizes research progress on solving these challenges, and proposes some open issues and research directions.

preprint2020arXiv

Triply magic conditions for microwave transitions of optically trapped alkali-metal atoms

We report the finding of "triply magic" conditions (the doubly magic frequency-intensity conditions of an optical dipole trap plus the magic magnetic field) for the microwave transitions of optically trapped alkali-metal atoms. The differential light shift (DLS) induced by a degenerate two-photon process is adopted to compensate a DLS associated with the one-photon process. Thus, doubly magic conditions for the intensity and frequency of the optical trap beam can be found. Moreover, the DLS decouples from the magnetic field in a linearly polarized optical dipole trap, so that the magic condition of the magnetic field can be applied independently. Therefore, the "triply magic" conditions can be realized simultaneously. We also experimentally demonstrate the doubly magic frequency-intensity conditions as well as the independence of the magnetic field. When the triply magic conditions are fulfilled, the inhomogeneous and homogeneous decoherences for the optically trapped atom will be dramatically suppressed, and the coherence time can be extended significantly.

preprint2019arXiv

High-numerical-aperture and long-working-distance objectives for single-atom experiments

We present two long-working-distance objective lenses with numerical apertures (NA) of 0.29 and 0.4 for single-atom experiments. The objective lenses are assembled entirely from commercial on-catalog $Φ$1'' singlets. Both objectives are capable of correcting the spherical aberrations due to standard flat vacuum glass windows of various thicknesses. The working distances of the NA$=0.29$ and NA$=0.4$ objectives are 34.6 mm and 18.2 mm, respectively, at the design wavelength of 852 nm with a 5-mm-thick silica window. In addition, the objectives can also be optimized to work at the diffraction limit at a single wavelength anywhere in the visible and near infrared regions by slightly tuning the distance between the first two lenses. The diffraction-limited fields of view for the NA$=0.29$ and NA$=0.4$ objectives are 0.62 mm and 0.61 mm, and the spatial resolutions are 1.8 $μ$m and 1.3 $μ$m at the design wavelength. The performance is simulated with commercial ray-tracing software and confirmed by imaging a resolution chart and a 1.18 $μ$m pinhole. The two objectives can be used for trapping and manipulating single atoms of various species.
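As a quick consistency check, the quoted spatial resolutions match the Rayleigh criterion $r = 0.61\,λ/\mathrm{NA}$ at 852 nm. The abstract does not say which resolution criterion the authors used, so the choice of the Rayleigh formula here is an assumption; the function name is illustrative.

```python
def rayleigh_resolution_um(wavelength_um, numerical_aperture):
    """Rayleigh-criterion resolution r = 0.61 * lambda / NA, in micrometres."""
    return 0.61 * wavelength_um / numerical_aperture

DESIGN_WAVELENGTH_UM = 0.852  # 852 nm, from the abstract

for na in (0.29, 0.4):
    r = rayleigh_resolution_um(DESIGN_WAVELENGTH_UM, na)
    print(f"NA = {na}: {r:.2f} um")
# NA = 0.29: 1.79 um
# NA = 0.4: 1.30 um
```

Rounded to one decimal place these give 1.8 μm and 1.3 μm, in line with the values stated in the abstract.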

preprint2019arXiv

Low-Resource Response Generation with Template Prior

We study open domain response generation with limited message-response pairs. The problem exists in real-world applications but has been less explored in existing work. Since the paired data alone are no longer enough to train a neural generation model, we consider leveraging large-scale unpaired data, which are much easier to obtain, and propose response generation with both paired and unpaired data. The generation model is defined by an encoder-decoder architecture with templates as prior, where the templates are estimated from the unpaired data as a neural hidden semi-Markov model. By this means, response generation learned from the small paired data can be aided by the semantic and syntactic knowledge in the large unpaired data. To balance the effect of the prior and the input message on response generation, we propose learning the whole generation model with an adversarial approach. Empirical studies on question response generation and sentiment response generation indicate that when only a few pairs are available, our model can significantly outperform several state-of-the-art response generation models in terms of both automatic and human evaluation.

preprint2019arXiv

Not all doped Mott insulators have a pseudogap: key role of van Hove singularities

The Mott insulating phase of the parent compounds is frequently taken as starting point for the underdoped high-$T_c$ cuprate superconductors. In particular, the pseudogap state is often considered as deriving from the Mott insulator. In this work, we systematically investigate different weakly-doped Mott insulators on the square and triangular lattice to clarify the relationship between the pseudogap and Mottness. We show that doping a two-dimensional Mott insulator does not necessarily lead to a pseudogap phase. Despite its inherent strong-coupling nature, we find that the existence or absence of a pseudogap depends sensitively on non-interacting band parameters and identify the crucial role played by the van Hove singularities of the system. Motivated by a SU(2) gauge theory for the pseudogap state, we propose and verify numerically a simple equation that governs the evolution of characteristic features in the electronic scattering rate.

preprint2018arXiv

Prolonged mixed phase induced by high pressure in MnRuP

Hexagonally structured MnRuP was studied under high pressure up to 35 GPa from 5 to 300 K using synchrotron X-ray diffraction. We observed that a partial phase transition from hexagonal to orthorhombic symmetry started at 11 GPa. The new and denser orthorhombic phase coexisted with its parent phase over an unusually long pressure range, ΔP ~ 50 GPa. We attribute this structural transformation to a magnetic origin, where a decisive criterion for the boundary of the mixed phase lies in the different distances between the Mn-Mn atoms. In addition, our theoretical study shows that the orthorhombic phase of MnRuP remains stable even at very high pressures up to ~ 250 GPa, when it should transform to a new tetragonal phase.