Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
88 works
0 followers
36 topics
4 close collaborators

Published work

88 published item(s)

preprint | 2026 | arXiv

CL-bench Life: Can Language Models Learn from Real-Life Context?

Today's AI assistants such as OpenClaw are designed to handle context effectively, making context learning an increasingly important capability for models. As these systems move beyond professional settings into everyday life, the nature of the contexts they must handle also shifts. Real-life contexts are often messy, fragmented, and deeply tied to personal and social experience, such as multi-party conversations, personal archives, and behavioral traces. Yet it remains unclear whether current frontier language models can reliably learn from such contexts and solve tasks grounded in them. To this end, we introduce CL-bench Life, a fully human-curated benchmark comprising 405 context-task pairs and 5,348 verification rubrics, covering common real-life scenarios. Solving tasks in CL-bench Life requires models to reason over complex, messy real-life contexts, calling for strong real-life context learning abilities that go far beyond those evaluated in existing benchmarks. We evaluate ten frontier LMs and find that real-life context learning remains highly challenging: even the best-performing model achieves only 19.3% task solving rate, while the average performance across models is only 13.8%. Models still struggle to reason over contexts such as messy group chat histories and fragmented behavioral records from everyday life. CL-bench Life provides a crucial testbed for advancing real-life context learning, and progress on it can enable more intelligent and reliable AI assistants in everyday life.

preprint | 2024 | arXiv

Weakly Augmented Variational Autoencoder in Time Series Anomaly Detection

Due to their unsupervised training and uncertainty estimation, deep Variational Autoencoders (VAEs) have become powerful tools for reconstruction-based Time Series Anomaly Detection (TSAD). Existing VAE-based TSAD methods, either statistical or deep, tune meta-priors to estimate the likelihood probability for effectively capturing spatiotemporal dependencies in the data. However, these methods confront the challenge of inherent data scarcity, which is often the case in anomaly detection tasks. Such scarcity easily leads to latent holes, discontinuous regions in latent space, resulting in non-robust reconstructions on these discontinuous spaces. We propose a novel generative framework that combines VAEs with self-supervised learning (SSL) to address this issue.

preprint | 2023 | arXiv

Forward-backward stochastic differential equations on tensor fields and application to Navier-Stokes equations on Riemannian manifolds

In this paper we introduce a class of forward-backward stochastic differential equations on tensor fields of Riemannian manifolds, which are related to semi-linear parabolic partial differential equations on tensor fields. Moreover, we will use these forward-backward stochastic differential equations to give a stochastic characterization of incompressible Navier-Stokes equations on Riemannian manifolds, where some extra conditions used in [22] are not required.

preprint | 2023 | arXiv

One-Bit-Aided Modulo Sampling for DOA Estimation

Modulo sampling has recently drawn a great deal of attention for cutting-edge applications because it overcomes the information loss caused by sensor saturation and clipping. This loss is a significant problem, especially when the range of signal amplitudes is unknown or in the near-far case. To overcome this fundamental bottleneck, we propose a one-bit-aided (1bit-aided) modulo sampling scheme for direction-of-arrival (DOA) estimation. On the one hand, one-bit quantization involving a simple comparator offers the advantages of low-cost and low-complexity implementation. On the other hand, one-bit quantization provides an estimate of the normalized covariance matrix of the unquantized measurements via the arcsin law. This estimate is used to implement a blind integer-forcing (BIF) decoder that unwraps the modulo samples to construct the covariance matrix, after which subspace methods can be used to perform the DOA estimation. Our approach, named 1bit-aided-BIF, addresses the near-far problem well and overcomes the intrinsic low dynamic range of one-bit quantization. Numerical experiments validate the excellent performance of the proposed algorithm.
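
The arcsin law mentioned in this abstract can be illustrated with a minimal sketch. It assumes real-valued, zero-mean Gaussian measurements (the paper's array signals are likely complex, which this sketch does not cover), and the function name `one_bit_normalized_covariance` is hypothetical rather than taken from the paper. For such data, the correlation of the sign-quantized samples equals (2/pi) * arcsin(rho), so applying sin(pi/2 * r) to the one-bit sample correlation recovers the normalized covariance regardless of channel gain.

```python
import numpy as np


def one_bit_normalized_covariance(x):
    """Estimate the normalized covariance of real zero-mean Gaussian data
    from its signs only, by inverting the arcsin law.

    x has shape (channels, snapshots).
    """
    y = np.sign(x)
    y[y == 0] = 1.0                      # treat exact zeros as positive
    r_y = (y @ y.T) / x.shape[1]         # sample correlation of one-bit data
    return np.sin(0.5 * np.pi * r_y)     # inverse of r_y = (2/pi) * arcsin(rho)


# Illustrative data: two channels with true correlation 0.6.
rng = np.random.default_rng(0)
rho = 0.6
cov = np.array([[1.0, rho], [rho, 1.0]])
x = rng.multivariate_normal(np.zeros(2), cov, size=200_000).T
x[1] *= 50.0                             # a large gain does not change the signs
r_hat = one_bit_normalized_covariance(x)
```

Because only signs are used, the estimate is insensitive to the amplitude difference between the two channels, which is the property the abstract relies on for the near-far case.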

preprint | 2022 | arXiv

A Knowledge-Enhanced Adversarial Model for Cross-lingual Structured Sentiment Analysis

Structured sentiment analysis, which aims to extract complex semantic structures such as holders, expressions, targets, and polarities, has obtained widespread attention from both industry and academia. Unfortunately, the existing structured sentiment analysis datasets cover only a few languages and are relatively small, limiting neural network models' performance. In this paper, we focus on the cross-lingual structured sentiment analysis task, which aims to transfer knowledge from the source language to the target one. Notably, we propose a Knowledge-Enhanced Adversarial Model (KEAM) with both implicit distributed and explicit structural knowledge to enhance the cross-lingual transfer. First, we design an adversarial embedding adapter for learning an informative and robust representation by adaptively capturing implicit semantic information from diverse multi-lingual embeddings. Then, we propose a syntax GCN encoder to transfer the explicit semantic information (e.g., universal dependency tree) among multiple languages. We conduct experiments on five datasets and compare KEAM with both supervised and unsupervised methods. The extensive experimental results show that our KEAM model outperforms all the unsupervised baselines in various metrics.

preprint | 2022 | arXiv

A Multi-agent Reinforcement Learning Approach for Efficient Client Selection in Federated Learning

Federated learning (FL) is a training technique that enables client devices to jointly learn a shared model by aggregating locally-computed models without exposing their raw data. While most of the existing work focuses on improving FL model accuracy, in this paper we focus on improving training efficiency, which is often a hurdle for adopting FL in real-world applications. Specifically, we design an efficient FL framework which jointly optimizes model accuracy, processing latency and communication efficiency, all of which are primary design considerations for real implementation of FL. Inspired by the recent success of Multi-Agent Reinforcement Learning (MARL) in solving complex control problems, we present FedMarl, a MARL-based FL framework which performs efficient run-time client selection. Experiments show that FedMarl can significantly improve model accuracy with much lower processing latency and communication cost.

preprint | 2022 | arXiv

A Semantic Alignment System for Multilingual Query-Product Retrieval

This paper describes our winning solution (team name: www) to the Amazon ESCI Challenge of KDD CUP 2022, which achieved an NDCG score of 0.9043 and won first place on task 1, the query-product ranking track. In this competition, participants are provided with a real-world, large-scale multilingual shopping queries data set containing query-product pairs in English, Japanese and Spanish. Three different tasks are proposed in this competition: ranking the results list (task 1), classifying the query/product pairs into Exact, Substitute, Complement, or Irrelevant (ESCI) categories (task 2), and identifying substitute products for a given query (task 3). We mainly focus on task 1 and propose a semantic alignment system for multilingual query-product retrieval. Pre-trained multilingual language models (LMs) are adopted to get the semantic representation of queries and products. Our models are all first trained with cross-entropy loss to classify the query-product pairs into the four ESCI categories, and then we use a weighted sum of the 4-class probabilities to get the score for ranking. To further boost the model, we also apply elaborate data preprocessing, data augmentation by translation, special handling of English texts with English LMs, adversarial training with AWP and FGM, self-distillation, pseudo labeling, label smoothing and ensembling. Finally, our solution outperforms others on both the public and private leaderboards.
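
The scoring step described in this abstract, a weighted sum over the 4-class ESCI probabilities, can be sketched as follows. This is only an illustration: the weight values in `ESCI_WEIGHTS` and the function names are placeholders chosen for the example, not values or code from the paper.

```python
import numpy as np

# Hypothetical per-class weights in the order Exact, Substitute, Complement,
# Irrelevant. The paper does not state its weights here; these are placeholders.
ESCI_WEIGHTS = np.array([1.0, 0.1, 0.01, 0.0])


def ranking_scores(probs, weights=ESCI_WEIGHTS):
    """Collapse (n_products, 4) class probabilities into one score per product."""
    probs = np.asarray(probs, dtype=float)
    return probs @ weights


def rank_products(probs, weights=ESCI_WEIGHTS):
    """Return product indices sorted from highest to lowest score."""
    scores = ranking_scores(probs, weights)
    return np.argsort(-scores, kind="stable")


# Classifier outputs for three candidate products of one query (made-up numbers).
probs = np.array([
    [0.10, 0.60, 0.20, 0.10],   # most likely a substitute
    [0.80, 0.10, 0.05, 0.05],   # most likely an exact match
    [0.05, 0.05, 0.10, 0.80],   # most likely irrelevant
])
scores = ranking_scores(probs)
order = rank_products(probs)
```

With these placeholder weights the likely exact match is ranked first, the likely substitute second, and the likely irrelevant product last.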

preprint | 2022 | arXiv

AIM 2022 Challenge on Super-Resolution of Compressed Image and Video: Dataset, Methods and Results

This paper reviews the Challenge on Super-Resolution of Compressed Image and Video at AIM 2022. This challenge includes two tracks. Track 1 aims at the super-resolution of compressed image, and Track 2 targets the super-resolution of compressed video. In Track 1, we use the popular dataset DIV2K as the training, validation and test sets. In Track 2, we propose the LDV 3.0 dataset, which contains 365 videos, including the LDV 2.0 dataset (335 videos) and 30 additional videos. In this challenge, there are 12 teams and 2 teams that submitted the final results to Track 1 and Track 2, respectively. The proposed methods and solutions gauge the state-of-the-art of super-resolution on compressed image and video. The proposed LDV 3.0 dataset is available at https://github.com/RenYang-home/LDV_dataset. The homepage of this challenge is at https://github.com/RenYang-home/AIM22_CompressSR.

preprint | 2022 | arXiv

An Effective Way for Cross-Market Recommendation with Hybrid Pre-Ranking and Ranking Models

The Cross-Market Recommendation task of WSDM CUP 2022 is about finding solutions to improve individual recommendation systems in resource-scarce target markets by leveraging data from similar high-resource source markets. Our team, OPDAI, won first place with an NDCG@10 score of 0.6773 on the leaderboard. This paper details our solution to the task. To better transfer information from source markets to target markets, we adopt two stages of ranking. In the pre-ranking stage, we adopt diverse pre-ranking methods or models to do feature generation. After elaborate feature analysis and feature selection, we train LightGBM with 10-fold bagging to do the final ranking.

preprint | 2022 | arXiv

An Ion Exchange Mechanism Inspired Story Ending Generator for Different Characters

Story ending generation aims at generating reasonable endings for a given story context. Most existing studies in this area focus on generating coherent or diversified story endings, while they ignore that different characters may lead to different endings for a given story. In this paper, we propose a Character-oriented Story Ending Generator (CoSEG) to customize an ending for each character in a story. Specifically, we first propose a character modeling module to learn the personalities of characters from their descriptive experiences extracted from the story context. Then, inspired by the ion exchange mechanism in chemical reactions, we design a novel vector breaking/forming module to learn the intrinsic interactions between each character and the corresponding context through an analogical information exchange procedure. Finally, we leverage the attention mechanism to learn effective character-specific interactions and feed each interaction into a decoder to generate character-oriented endings. Extensive experimental results and case studies demonstrate that CoSEG achieves significant improvements in the quality of generated endings compared with state-of-the-art methods, and it effectively customizes the endings for different characters.

preprint | 2022 | arXiv

Bayesian Sequential Stacking Algorithm for Concurrently Designing Molecules and Synthetic Reaction Networks

In the last few years, de novo molecular design using machine learning has made great technical progress but its practical deployment has not been as successful. This is mostly owing to the cost and technical difficulty of synthesizing such computationally designed molecules. To overcome such barriers, various methods for synthetic route design using deep neural networks have been studied intensively in recent years. However, little progress has been made in designing molecules and their synthetic routes simultaneously. Here, we formulate the problem of simultaneously designing molecules with the desired set of properties and their synthetic routes within the framework of Bayesian inference. The design variables consist of a set of reactants in a reaction network and its network topology. The design space is extremely large because it consists of all combinations of purchasable reactants, often in the order of millions or more. In addition, the designed reaction networks can adopt any topology beyond simple multistep linear reaction routes. To solve this hard combinatorial problem, we present a powerful sequential Monte Carlo algorithm that recursively designs a synthetic reaction network by sequentially building up single-step reactions. In a case study of designing drug-like molecules based on commercially available compounds, compared with heuristic combinatorial search methods, the proposed method shows overwhelming performance in terms of computational efficiency and coverage and novelty with respect to existing compounds.

preprint | 2022 | arXiv

Bitcoin Transaction Forecasting with Deep Network Representation Learning

Bitcoin and its decentralized computing paradigm for digital currency trading are among the most disruptive technologies of the 21st century. This paper presents a novel approach to developing a Bitcoin transaction forecast model, DLForecast, by leveraging deep neural networks for learning Bitcoin transaction network representations. DLForecast makes three original contributions. First, we explore three interesting properties between Bitcoin transaction accounts: topological connectivity pattern of Bitcoin accounts, transaction amount pattern, and transaction dynamics. Second, we construct a time-decaying reachability graph and a time-decaying transaction pattern graph, aiming at capturing different types of spatial-temporal Bitcoin transaction patterns. Third, we employ node embedding on both graphs and develop a Bitcoin transaction forecasting system between user accounts based on historical transactions with a built-in time-decaying factor. To maintain effective transaction forecasting performance, we leverage the multiplicative model update (MMU) ensemble to combine prediction models built on different transaction features extracted from each corresponding Bitcoin transaction graph. Evaluated on real-world Bitcoin transaction data, our spatial-temporal forecasting model is efficient, with fast runtime, and effective, with forecasting accuracy over 60%, and it improves the prediction performance by 50% when compared to a forecasting model built on the static graph baseline.

preprint | 2022 | arXiv

Causal Intervention Improves Implicit Sentiment Analysis

Despite having achieved great success in sentiment analysis, existing neural models struggle with implicit sentiment analysis. This may be because they latch onto spurious correlations ("shortcuts", e.g., focusing only on explicit sentiment words), which undermines the effectiveness and robustness of the learned model. In this work, we propose a causal intervention model for Implicit Sentiment Analysis using Instrumental Variable (ISAIV). We first review sentiment analysis from a causal perspective and analyze the confounders existing in this task. Then, we introduce an instrumental variable to eliminate the confounding causal effects, thus extracting the pure causal effect between sentence and sentiment. We compare the proposed ISAIV model with several strong baselines on both the general implicit sentiment analysis and aspect-based implicit sentiment analysis tasks. The results indicate the great advantages of our model and the efficacy of implicit sentiment reasoning.

preprint | 2022 | arXiv

Chaos is a Ladder: A New Theoretical Understanding of Contrastive Learning via Augmentation Overlap

Recently, contrastive learning has risen to be a promising approach for large-scale self-supervised learning. However, theoretical understanding of how it works is still unclear. In this paper, we propose a new guarantee on the downstream performance without resorting to the conditional independence assumption that is widely adopted in previous work but hardly holds in practice. Our new theory hinges on the insight that the support of different intra-class samples will become more overlapped under aggressive data augmentations, thus simply aligning the positive samples (augmented views of the same sample) could make contrastive learning cluster intra-class samples together. Based on this augmentation overlap perspective, theoretically, we obtain asymptotically closed bounds for downstream performance under weaker assumptions, and empirically, we propose an unsupervised model selection metric ARC that aligns well with downstream accuracy. Our theory suggests an alternative understanding of contrastive learning: the role of aligning positive samples is more like a surrogate task than an ultimate goal, and the overlapped augmented views (i.e., the chaos) create a ladder for contrastive learning to gradually learn class-separated representations. The code for computing ARC is available at https://github.com/zhangq327/ARC.

preprint | 2022 | arXiv

CMMD: Cross-Metric Multi-Dimensional Root Cause Analysis

In large-scale online services, crucial metrics, a.k.a., key performance indicators (KPIs), are monitored periodically to check their running statuses. Generally, KPIs are aggregated along multiple dimensions and derived by complex calculations among fundamental metrics from the raw data. Once abnormal KPI values are observed, root cause analysis (RCA) can be applied to identify the reasons for anomalies, so that we can troubleshoot quickly. Recently, several automatic RCA techniques were proposed to localize the related dimensions (or a combination of dimensions) to explain the anomalies. However, their analyses are limited to the data on the abnormal metric and ignore the data of other metrics which may be also related to the anomalies, leading to imprecise or even incorrect root causes. To this end, we propose a cross-metric multi-dimensional root cause analysis method, named CMMD, which consists of two key components: 1) relationship modeling, which utilizes graph neural network (GNN) to model the unknown complex calculation among metrics and aggregation function among dimensions from historical data; 2) root cause localization, which adopts the genetic algorithm to efficiently and effectively dive into the raw data and localize the abnormal dimension(s) once the KPI anomalies are detected. Experiments on synthetic datasets, public datasets and online production environment demonstrate the superiority of our proposed CMMD method compared with baselines. Currently, CMMD is running as an online service in Microsoft Azure.

preprint | 2022 | arXiv

Communication-Efficient Actor-Critic Methods for Homogeneous Markov Games

Recent success in cooperative multi-agent reinforcement learning (MARL) relies on centralized training and policy sharing. Centralized training eliminates the issue of non-stationarity in MARL yet induces large communication costs, and policy sharing is empirically crucial to efficient learning in certain tasks yet lacks theoretical justification. In this paper, we formally characterize a subclass of cooperative Markov games where agents exhibit a certain form of homogeneity such that policy sharing provably incurs no suboptimality. This enables us to develop the first consensus-based decentralized actor-critic method where the consensus update is applied to both the actors and the critics while ensuring convergence. We also develop practical algorithms based on our decentralized actor-critic method to reduce the communication cost during training, while still yielding policies comparable with centralized training.

preprint | 2022 | arXiv

Convergence and Price of Anarchy Guarantees of the Softmax Policy Gradient in Markov Potential Games

We study the performance of policy gradient methods for the subclass of Markov games known as Markov potential games (MPGs), which extends the notion of normal-form potential games to the stateful setting and includes the important special case of the fully cooperative setting where the agents share an identical reward function. Our focus in this paper is to study the convergence of the policy gradient method for solving MPGs under softmax policy parameterization, both tabular and parameterized with general function approximators such as neural networks. We first show the asymptotic convergence of this method to a Nash equilibrium of MPGs for tabular softmax policies. Second, we derive the finite-time performance of the policy gradient in two settings: 1) using the log-barrier regularization, and 2) using the natural policy gradient under the best-response dynamics (NPG-BR). Finally, extending the notion of price of anarchy (POA) and smoothness in normal-form games, we introduce the POA for MPGs and provide a POA bound for NPG-BR. To our knowledge, this is the first POA bound for solving MPGs. To support our theoretical results, we empirically compare the convergence rates and POA of policy gradient variants for both tabular and neural softmax policies.

preprint | 2022 | arXiv

Cross-View Cross-Scene Multi-View Crowd Counting

Multi-view crowd counting has been previously proposed to utilize multi-cameras to extend the field-of-view of a single camera, capturing more people in the scene, and improve counting performance for occluded people or those in low resolution. However, the current multi-view paradigm trains and tests on the same single scene and camera-views, which limits its practical application. In this paper, we propose a cross-view cross-scene (CVCS) multi-view crowd counting paradigm, where the training and testing occur on different scenes with arbitrary camera layouts. To dynamically handle the challenge of optimal view fusion under scene and camera layout change and non-correspondence noise due to camera calibration errors or erroneous features, we propose a CVCS model that attentively selects and fuses multiple views together using camera layout geometry, and a noise view regularization method to train the model to handle non-correspondence errors. We also generate a large synthetic multi-camera crowd counting dataset with a large number of scenes and camera views to capture many possible variations, which avoids the difficulty of collecting and annotating such a large real dataset. We then test our trained CVCS model on real multi-view counting datasets, by using unsupervised domain transfer. The proposed CVCS model trained on synthetic data outperforms the same model trained only on real data, and achieves promising performance compared to fully supervised methods that train and test on the same single scene.

preprint | 2022 | arXiv

Deblur-NeRF: Neural Radiance Fields from Blurry Images

Neural Radiance Field (NeRF) has gained considerable attention recently for 3D scene reconstruction and novel view synthesis due to its remarkable synthesis quality. However, image blurriness caused by defocus or motion, which often occurs when capturing scenes in the wild, significantly degrades its reconstruction quality. To address this problem, we propose Deblur-NeRF, the first method that can recover a sharp NeRF from blurry input. We adopt an analysis-by-synthesis approach that reconstructs blurry views by simulating the blurring process, thus making NeRF robust to blurry inputs. The core of this simulation is a novel Deformable Sparse Kernel (DSK) module that models spatially-varying blur kernels by deforming a canonical sparse kernel at each spatial location. The ray origin of each kernel point is jointly optimized, inspired by the physical blurring process. This module is parameterized as an MLP that can generalize to various blur types. Jointly optimizing the NeRF and the DSK module allows us to restore a sharp NeRF. We demonstrate that our method can be used on both camera motion blur and defocus blur, the two most common types of blur in real scenes. Evaluation results on both synthetic and real-world data show that our method outperforms several baselines. The synthetic and real datasets, along with the source code, are publicly available at https://limacv.github.io/deblurnerf/

preprint | 2022 | arXiv

Decorrelate Irrelevant, Purify Relevant: Overcome Textual Spurious Correlations from a Feature Perspective

Natural language understanding (NLU) models tend to rely on spurious correlations (i.e., dataset bias) to achieve high performance on in-distribution datasets but poor performance on out-of-distribution ones. Most existing debiasing methods identify and weaken samples with biased features (i.e., superficial surface features that cause such spurious correlations). However, down-weighting these samples prevents the model from learning from the non-biased parts of these samples. To tackle this challenge, in this paper, we propose to eliminate spurious correlations in a fine-grained manner from a feature space perspective. Specifically, we introduce Random Fourier Features and weighted re-sampling to decorrelate the dependencies between features to mitigate spurious correlations. After obtaining decorrelated features, we further design a mutual-information-based method to purify them, which forces the model to learn features that are more relevant to tasks. Extensive experiments on two well-studied NLU tasks demonstrate that our method is superior to other comparative approaches.

preprint | 2022 | arXiv

Direct Hydrogen Production from Water/Seawater by Irradiation/Vibration-Activated Using Defective Ferroelectric BaTiO3-x Nanoparticles

Hydrogen is a promising alternative to fossil fuels owing to its environmentally neutral emissions and high energy density. However, the need for purified water and external power are critical hindrances to implementation of hydrogen production. The present work reveals the potential to overcome these shortcomings through piezo-photocatalysis of seawater using BaTiO3-x (BTO) nanoparticles. This material was made piezoelectrically active by annealing under different atmospheres, including O2, N2, Ar, and H2, the latter of which caused Ti4+ to Ti(4-x)+ multiple reductions and structural expansions that stabilized piezoelectric tetragonal BTO domains. The resultant defect equilibria combine ionic and electron effects, including Ti redox reactions, charge-compensating surface oxygen vacancy formation, and color centre alterations. Further, a variety of experimental techniques revealed the effects of reduction on the energy band structure. A strong piezoelectric effect and the presence of self-polarization were confirmed by piezoresponse force microscopy, while simulation work clarified the role of vibration on band bending deriving from the former. The performance data contrasted H2 evolution using deionized (DI) water, simulated seawater, and natural seawater subjected to photocatalysis, piezocatalysis, and piezo-photocatalysis. An efficient H2 evolution rate of 132.4 micromol/g/h was achieved from DI water using piezo-photocatalysis for 5 h. In contrast, piezocatalysis for 2 h followed by piezo-photocatalysis for 3 h resulted in H2 evolution rates of 100.7 micromol/g/h for DI water, 63.4 micromol/g/h for simulated seawater, and 48.7 micromol/g/h for natural seawater. This work provides potential new strategies for large-scale green H2 production using abundant natural resources with conventional piezoelectric material while leveraging the effects of ions dissolved in seawater.

preprint | 2022 | arXiv

Distill-VQ: Learning Retrieval Oriented Vector Quantization By Distilling Knowledge from Dense Embeddings

Vector quantization (VQ) based ANN indexes, such as Inverted File System (IVF) and Product Quantization (PQ), have been widely applied to embedding based document retrieval thanks to the competitive time and memory efficiency. Originally, VQ is learned to minimize the reconstruction loss, i.e., the distortions between the original dense embeddings and the reconstructed embeddings after quantization. Unfortunately, such an objective is inconsistent with the goal of selecting ground-truth documents for the input query, which may cause severe loss of retrieval quality. Recent works identify such a defect, and propose to minimize the retrieval loss through contrastive learning. However, these methods intensively rely on queries with ground-truth documents, whose performance is limited by the insufficiency of labeled data. In this paper, we propose Distill-VQ, which unifies the learning of IVF and PQ within a knowledge distillation framework. In Distill-VQ, the dense embeddings are leveraged as "teachers", which predict the query's relevance to the sampled documents. The VQ modules are treated as the "students", which are learned to reproduce the predicted relevance, such that the reconstructed embeddings may fully preserve the retrieval result of the dense embeddings. By doing so, Distill-VQ is able to derive substantial training signals from the massive unlabeled data, which significantly contributes to the retrieval quality. We perform comprehensive explorations for the optimal conduct of knowledge distillation, which may provide useful insights for the learning of VQ based ANN index. We also experimentally show that the labeled data is no longer a necessity for high-quality vector quantization, which indicates Distill-VQ's strong applicability in practice.

preprint | 2022 | arXiv

Distributed Deep Learning Inference Acceleration using Seamless Collaboration in Edge Computing

This paper studies inference acceleration using distributed convolutional neural networks (CNNs) in collaborative edge computing. To ensure inference accuracy in inference task partitioning, we consider the receptive field when performing segment-based partitioning. To maximize the parallelization between the communication and computing processes, thereby minimizing the total inference time of an inference task, we design a novel task collaboration scheme, named HALP, in which the overlapping zone of the sub-tasks on secondary edge servers (ESs) is executed on the host ES. We further extend HALP to the scenario of multiple tasks. Experimental results show that HALP can accelerate CNN inference in VGG-16 by 1.7-2.0x for a single task and 1.7-1.8x for 4 tasks per batch on GTX 1080TI and JETSON AGX Xavier, which outperforms the state-of-the-art work MoDNN. Moreover, we evaluate the service reliability under a time-variant channel, which shows that HALP is an effective solution to ensure high service reliability with a strict service deadline.

preprint | 2022 | arXiv

Divide and Conquer: Text Semantic Matching with Disentangled Keywords and Intents

Text semantic matching is a fundamental task that has been widely used in various scenarios, such as community question answering, information retrieval, and recommendation. Most state-of-the-art matching models, e.g., BERT, directly perform text comparison by processing each word uniformly. However, a query sentence generally comprises content that calls for different levels of matching granularity. Specifically, keywords represent factual information such as action, entity, and event that should be strictly matched, while intents convey abstract concepts and ideas that can be paraphrased into various expressions. In this work, we propose a simple yet effective training strategy for text semantic matching in a divide-and-conquer manner by disentangling keywords from intents. Our approach can be easily combined with pre-trained language models (PLM) without influencing their inference efficiency, achieving stable performance improvements against a wide range of PLMs on three benchmarks.

preprint2022arXiv

Dynamic Split Computing for Efficient Deep Edge Intelligence

Deploying deep neural networks (DNNs) on IoT and mobile devices is a challenging task due to their limited computational resources. Thus, demanding tasks are often entirely offloaded to edge servers, which can accelerate inference; however, this also incurs communication cost and raises privacy concerns. In addition, this approach leaves the computational capacity of end devices unused. Split computing is a paradigm where a DNN is split into two sections; the first section is executed on the end device, and the output is transmitted to the edge server where the final section is executed. Here, we introduce dynamic split computing, where the optimal split location is dynamically selected based on the state of the communication channel. By using natural bottlenecks that already exist in modern DNN architectures, dynamic split computing avoids retraining and hyperparameter optimization, and does not have any negative impact on the final accuracy of DNNs. Through extensive experiments, we show that dynamic split computing achieves faster inference in edge computing environments where the data rate and server load vary over time.
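
As a rough illustration of selecting a split location from the channel state, the sketch below picks the candidate split that minimizes device time plus transfer time plus server time. The `SplitPoint` type, the candidate list and all timing and size figures are hypothetical placeholders, not measurements or code from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SplitPoint:
    """One hypothetical candidate split location in a DNN."""
    name: str
    device_ms: float    # time to run the first section on the end device
    output_bytes: int   # size of the intermediate output sent to the server
    server_ms: float    # time to run the final section on the edge server


def transfer_ms(num_bytes: int, rate_mbps: float) -> float:
    """Transmission time in milliseconds at rate_mbps (megabits per second)."""
    return num_bytes * 8 / (rate_mbps * 1e6) * 1e3


def best_split(points, rate_mbps):
    """Return the split point with the lowest estimated end-to-end latency."""
    return min(
        points,
        key=lambda p: p.device_ms + transfer_ms(p.output_bytes, rate_mbps) + p.server_ms,
    )


# Illustrative numbers only (assumed, not taken from the paper).
CANDIDATES = [
    SplitPoint("input", device_ms=0.0, output_bytes=600_000, server_ms=20.0),
    SplitPoint("bottleneck", device_ms=30.0, output_bytes=50_000, server_ms=12.0),
    SplitPoint("on_device", device_ms=120.0, output_bytes=1_000, server_ms=0.0),
]

if __name__ == "__main__":
    for rate in (1.0, 100.0, 1000.0):
        print(rate, "Mbps ->", best_split(CANDIDATES, rate).name)
```

With these placeholder numbers, a fast link favors sending the raw input, a moderate link favors the bottleneck layer, and a slow link favors running almost everything on the device. Server load could be modeled the same way by scaling `server_ms`.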

preprint2022arXiv

Enhancing Event-Level Sentiment Analysis with Structured Arguments

Previous studies on event-level sentiment analysis (SA) usually model the event as a topic, a category or target terms, while the structured arguments (e.g., subject, object, time and location) that have potential effects on the sentiment are not well studied. In this paper, we redefine the task as structured event-level SA and propose an End-to-End Event-level Sentiment Analysis (E³SA) approach to solve this issue. Specifically, we explicitly extract and model the event structure information for enhancing event-level SA. Extensive experiments demonstrate the great advantages of our proposed approach over the state-of-the-art methods. Noting the lack of a suitable dataset, we also release a large-scale real-world dataset with event arguments and sentiment labelling to promote further research. The dataset is available at https://github.com/zhangqi-here/E3SA.

preprint2022arXiv

Evidence for electronic signature of magnetic transition in topological magnet HoSbTe

Topological insulators with intrinsic magnetic order are emerging as an exciting platform to realize fundamentally new excitations from topological quantum states of matter. To study these systems and their physics, people have proposed a variety of magnetic topological insulator systems, including HoSbTe, an antiferromagnetic weak topological insulator candidate. In this work, we use scanning tunneling microscopy to probe the electronic structure of HoSbTe with antiferromagnetic and ferromagnetic orders that are tuned by applying an external magnetic field. Although we find only minor differences between the quasi-particle interferences under the ferromagnetic and antiferromagnetic orders around the Fermi energy, deep inside the valence region a new quasi-particle interference signal emerges with ferromagnetism. This observation is consistent with our first-principles calculations indicating the magnetism-driven transition of the electronic states in this spin-orbit coupled topological magnet.

preprint2022arXiv

Exploring Anchor-based Detection for Ego4D Natural Language Query

In this paper we provide the technical report for the Ego4D natural language query challenge at CVPR 2022. The natural language query task is challenging because it requires comprehensive understanding of video content. Most previous works address this task on third-person view datasets, while little research interest has so far been placed in the ego-centric view. Although great progress has been made, we notice that previous works cannot adapt well to ego-centric view datasets such as Ego4D, mainly for two reasons: 1) most queries in Ego4D have an excessively small temporal duration (e.g., less than 5 seconds); 2) queries in Ego4D require much more complex video understanding of long-term temporal order. Considering these, we propose our solution to this challenge to address the above issues.

preprint2022arXiv

Fairness-aware Maximal Clique in Large Graphs: Concepts and Algorithms

Cohesive subgraph mining on attributed graphs is a fundamental problem in graph data analysis. Existing cohesive subgraph mining algorithms on attributed graphs do not consider the fairness of attributes in the subgraph. In this paper, we, for the first time, introduce fairness into the widely-used clique model to mine fairness-aware cohesive subgraphs. In particular, we propose three novel fairness-aware maximal clique models on attributed graphs, called weak fair clique, strong fair clique and relative fair clique, respectively. To enumerate all weak fair cliques, we develop an efficient backtracking algorithm called WFCEnum equipped with a novel colorful k-core based pruning technique. We also propose an efficient enumeration algorithm called SFCEnum to find all strong fair cliques based on a new attribute-alternatively-selection search technique. To further improve the efficiency, we also present several non-trivial ordering techniques for both weak and strong fair clique enumerations. To enumerate all relative fair cliques, we design an enhanced colorful k-core based pruning technique for 2D attribute, and then develop two efficient search algorithms: RFCRefineEnum and RFCAlterEnum based on the ideas of WFCEnum and SFCEnum for arbitrary dimension attribute. The results of extensive experiments on four real-world graphs demonstrate the efficiency, scalability and effectiveness of the proposed algorithms.

preprint2022arXiv

FENeRF: Face Editing in Neural Radiance Fields

Previous portrait image generation methods roughly fall into two categories: 2D GANs and 3D-aware GANs. 2D GANs can generate high fidelity portraits but with low view consistency. 3D-aware GAN methods can maintain view consistency but their generated images are not locally editable. To overcome these limitations, we propose FENeRF, a 3D-aware generator that can produce view-consistent and locally-editable portrait images. Our method uses two decoupled latent codes to generate corresponding facial semantics and texture in a spatially aligned 3D volume with shared geometry. Benefiting from such underlying 3D representation, FENeRF can jointly render the boundary-aligned image and semantic mask and use the semantic mask to edit the 3D volume via GAN inversion. We further show such 3D representation can be learned from widely available monocular image and semantic mask pairs. Moreover, we reveal that jointly learning semantics and texture helps to generate finer geometry. Our experiments demonstrate that FENeRF outperforms state-of-the-art methods in various face editing tasks.

preprint2022arXiv

Gain-gain and gain-lossless PT-symmetry broken from PT-phase diagram

Parity-time (PT) symmetry and its breaking in micro/nano photonic structures have been investigated extensively, as they bring new opportunities to control the flow of light based on non-Hermitian optics. Previous studies have focused on PT-symmetry breaking in loss-loss or gain-loss coupled systems. Here, we theoretically predict gain-gain and gain-lossless PT-symmetry breaking from the phase diagram, where the boundaries between the PT-symmetric and PT-broken phases can be clearly defined in the full parameter space including gain, lossless and loss. For specific micro/nano photonic structures, such as coupled waveguides, we give the transmission matrices of each phase space, which can be used for beam splitting. Taking coupled waveguides as an example, we obtain periodic energy exchange in the PT-symmetric phase and exponential gain or loss in the PT-broken phase, which are consistent with the phase diagram. This scenario, giving a full view of PT symmetry and its breaking, will not only deepen the understanding of fundamental physics but also promote breakthroughs in photonic applications such as optical routers and beam splitters.
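
For orientation, the phase boundary can be illustrated with the textbook two-mode coupled-mode model. This is an assumed generic model, and the symbols $a_j$, $\gamma_j$ and $\kappa$ are introduced here for illustration; the paper's own parameterization may differ. Take $\gamma_j > 0$ for gain, $\gamma_j = 0$ for a lossless waveguide and $\gamma_j < 0$ for loss, with real coupling $\kappa$.

```latex
i\,\frac{d}{dz}\begin{pmatrix} a_1 \\ a_2 \end{pmatrix}
  = \begin{pmatrix} i\gamma_1 & \kappa \\ \kappa & i\gamma_2 \end{pmatrix}
    \begin{pmatrix} a_1 \\ a_2 \end{pmatrix},
\qquad
\lambda_\pm = \frac{i(\gamma_1+\gamma_2)}{2}
  \pm \sqrt{\kappa^2 - \frac{(\gamma_1-\gamma_2)^2}{4}} .
```

With solutions $a_j \propto e^{-i\lambda z}$, the common factor $e^{(\gamma_1+\gamma_2)z/2}$ is an overall gain or loss. Once it is factored out, $\kappa > |\gamma_1-\gamma_2|/2$ gives a real splitting and hence periodic energy exchange, while $\kappa < |\gamma_1-\gamma_2|/2$ gives an imaginary splitting and hence exponential growth or decay of one supermode relative to the other. In this model the boundary depends only on the difference $\gamma_1-\gamma_2$, so it applies equally to gain-gain ($\gamma_1,\gamma_2>0$) and gain-lossless ($\gamma_2=0$) pairs.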

preprint2022arXiv

Hallucinated Neural Radiance Fields in the Wild

Neural Radiance Fields (NeRF) has recently gained popularity for its impressive novel view synthesis ability. This paper studies the problem of hallucinated NeRF: i.e., recovering a realistic NeRF at a different time of day from a group of tourism images. Existing solutions adopt NeRF with a controllable appearance embedding to render novel views under various conditions, but they cannot render view-consistent images with an unseen appearance. To solve this problem, we present an end-to-end framework for constructing a hallucinated NeRF, dubbed as Ha-NeRF. Specifically, we propose an appearance hallucination module to handle time-varying appearances and transfer them to novel views. Considering the complex occlusions of tourism images, we introduce an anti-occlusion module to decompose the static subjects for visibility accurately. Experimental results on synthetic data and real tourism photo collections demonstrate that our method can hallucinate the desired appearances and render occlusion-free images from different views. The project and supplementary materials are available at https://rover-xingyu.github.io/Ha-NeRF/.

preprint2022arXiv

HousE: Knowledge Graph Embedding with Householder Parameterization

The effectiveness of knowledge graph embedding (KGE) largely depends on the ability to model intrinsic relation patterns and mapping properties. However, existing approaches can only capture some of them with insufficient modeling capacity. In this work, we propose a more powerful KGE framework named HousE, which involves a novel parameterization based on two kinds of Householder transformations: (1) Householder rotations to achieve superior capacity of modeling relation patterns; (2) Householder projections to handle sophisticated relation mapping properties. Theoretically, HousE is capable of modeling crucial relation patterns and mapping properties simultaneously. Besides, HousE is a generalization of existing rotation-based models while extending the rotations to high-dimensional spaces. Empirically, HousE achieves new state-of-the-art performance on five benchmark datasets. Our code is available at https://github.com/anrep/HousE.
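
As background for the Householder transformations mentioned above, the sketch below builds elementary Householder reflections with NumPy and checks a general linear-algebra fact: each reflection is orthogonal with determinant -1, so the product of two reflections is a rotation. The helper name `householder` and the example vectors are illustrative assumptions; how HousE actually parameterizes its rotations and projections is not specified in the abstract.

```python
import numpy as np


def householder(v):
    """Return the Householder reflection I - 2 u u^T, where u = v / ||v||."""
    v = np.asarray(v, dtype=float)
    u = v / np.linalg.norm(v)
    return np.eye(u.size) - 2.0 * np.outer(u, u)


# Two arbitrary (hypothetical) reflection vectors in 3D.
H1 = householder([1.0, 2.0, 0.5])
H2 = householder([0.0, 1.0, -1.0])

# Composing an even number of reflections yields a rotation.
R = H1 @ H2

if __name__ == "__main__":
    x = np.array([0.3, -1.2, 2.0])
    print("det(H1) =", np.linalg.det(H1))
    print("det(R)  =", np.linalg.det(R))
    print("norm preserved:", np.isclose(np.linalg.norm(R @ x), np.linalg.norm(x)))
```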

preprint2022arXiv

Locate Then Ask: Interpretable Stepwise Reasoning for Multi-hop Question Answering

Multi-hop reasoning requires aggregating multiple documents to answer a complex question. Existing methods usually decompose the multi-hop question into simpler single-hop questions to solve the problem for illustrating the explainable reasoning process. However, they ignore grounding on the supporting facts of each reasoning step, which tends to generate inaccurate decompositions. In this paper, we propose an interpretable stepwise reasoning framework to incorporate both single-hop supporting sentence identification and single-hop question generation at each intermediate step, and utilize the inference of the current hop for the next until reasoning out the final result. We employ a unified reader model for both intermediate hop reasoning and final hop inference and adopt joint optimization for more accurate and robust multi-hop reasoning. We conduct experiments on two benchmark datasets HotpotQA and 2WikiMultiHopQA. The results show that our method can effectively boost performance and also yields a better interpretable reasoning process without decomposition supervision.

preprint2022arXiv

Magnetization-direction-tunable kagome Weyl line

Kagome magnets provide a fascinating platform for a plethora of topological quantum phenomena. Here, utilizing angle-resolved photoemission spectroscopy, we demonstrate Weyl lines with strong out-of-plane dispersion in an A-A stacked kagome magnet TbxGd1-xMn6Sn6. On the Gd rich side, the Weyl line remains nearly spin-orbit-gapless due to a remarkable cooperative interplay between Kane-Mele spin-orbit-coupling, low site symmetry and in-plane magnetic order. Under Tb substitution, the kagome Weyl line gaps due to a magnetic reorientation to out-of-plane order. Our results illustrate the magnetic moment direction as an efficient tuning knob for realizing distinct three-dimensional topological phases.

preprint2022arXiv

MQBench: Towards Reproducible and Deployable Model Quantization Benchmark

Model quantization has emerged as an indispensable technique to accelerate deep learning inference. While researchers continue to push the frontier of quantization algorithms, existing quantization work is often unreproducible and undeployable. This is because researchers do not choose consistent training pipelines and ignore the requirements for hardware deployments. In this work, we propose Model Quantization Benchmark (MQBench), a first attempt to evaluate, analyze, and benchmark the reproducibility and deployability for model quantization algorithms. We choose multiple different platforms for real-world deployments, including CPU, GPU, ASIC, DSP, and evaluate extensive state-of-the-art quantization algorithms under a unified training pipeline. MQBench acts like a bridge to connect the algorithm and the hardware. We conduct a comprehensive analysis and find considerable intuitive or counter-intuitive insights. By aligning the training settings, we find existing algorithms have about the same performance on the conventional academic track. While for the hardware-deployable quantization, there is a huge accuracy gap which remains unsettled. Surprisingly, no existing algorithm wins every challenge in MQBench, and we hope this work could inspire future research directions.

preprint2022arXiv

Multi-view Information Bottleneck Without Variational Approximation

By "intelligently" fusing the complementary information across different views, multi-view learning is able to improve the performance of classification tasks. In this work, we extend the information bottleneck principle to a supervised multi-view learning scenario and use the recently proposed matrix-based Rényi's α-order entropy functional to optimize the resulting objective directly, without the necessity of variational approximation or adversarial training. Empirical results in both synthetic and real-world datasets suggest that our method enjoys improved robustness to noise and redundant information in each view, especially given limited training samples. Code is available at https://github.com/archy666/MEIB.

preprint2022arXiv

Noisy induced entanglement transition in one-dimensional random quantum circuits

The random quantum circuit is a minimally structured model to study the entanglement dynamics of many-body quantum systems. In this paper, we considered a one-dimensional quantum circuit with noisy Haar-random unitary gates using density matrix operator and tensor contraction methods. It is shown that the entanglement evolution of the random quantum circuits is properly characterized by the logarithmic entanglement negativity. By performing exact numerical calculations, we found that, as the physical error rate is decreased below a critical value p_c ≈ 0.056, the logarithmic entanglement negativity changes from the area law to the volume law, giving rise to an entanglement transition. The critical exponent of the correlation length can be determined from the finite-size scaling analysis, revealing the universal dynamic property of noisy intermediate-scale quantum devices.
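
To make the logarithmic entanglement negativity concrete, the sketch below computes it for a two-qubit density matrix as the base-2 logarithm of the trace norm of the partial transpose. The base-2 convention, the function names and the depolarized Bell-state example are assumptions for illustration; this toy is unrelated to the circuit simulations or the critical error rate reported in the paper.

```python
import numpy as np


def partial_transpose(rho, dim_a, dim_b):
    """Transpose subsystem B of a density matrix on a (dim_a x dim_b) bipartite space."""
    r = rho.reshape(dim_a, dim_b, dim_a, dim_b)
    # Swap the row and column indices that belong to subsystem B.
    return r.transpose(0, 3, 2, 1).reshape(dim_a * dim_b, dim_a * dim_b)


def log_negativity(rho, dim_a, dim_b):
    """Logarithmic negativity log2 ||rho^{T_B}||_1 (base-2 convention assumed)."""
    eigs = np.linalg.eigvalsh(partial_transpose(rho, dim_a, dim_b))
    return float(np.log2(np.abs(eigs).sum()))


# Bell state (|00> + |11>) / sqrt(2) as a density matrix.
psi = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2.0)
BELL = np.outer(psi, psi)


def depolarized_bell(p):
    """Mix the Bell state with the maximally mixed state with weight p."""
    return (1.0 - p) * BELL + p * np.eye(4) / 4.0


if __name__ == "__main__":
    for p in (0.0, 0.3, 0.8):
        print(p, log_negativity(depolarized_bell(p), 2, 2))
```

In this toy example the negativity of the depolarized Bell state falls from 1 at p = 0 to 0 once p reaches 2/3, which shows how noise can remove entanglement as measured by this quantity.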

preprint2022arXiv

NTIRE 2021 Challenge on Quality Enhancement of Compressed Video: Methods and Results

This paper reviews the first NTIRE challenge on quality enhancement of compressed video, with a focus on the proposed methods and results. In this challenge, the new Large-scale Diverse Video (LDV) dataset is employed. The challenge has three tracks. Tracks 1 and 2 aim at enhancing the videos compressed by HEVC at a fixed QP, while Track 3 is designed for enhancing the videos compressed by x265 at a fixed bit-rate. Besides, the quality enhancement of Tracks 1 and 3 targets improving the fidelity (PSNR), and Track 2 targets enhancing the perceptual quality. In total, the three tracks attracted 482 registrations. In the test phase, 12 teams, 8 teams and 11 teams submitted the final results of Tracks 1, 2 and 3, respectively. The proposed methods and solutions gauge the state-of-the-art of video quality enhancement. The homepage of the challenge: https://github.com/RenYang-home/NTIRE21_VEnh

preprint2022arXiv

PEAR: Personalized Re-ranking with Contextualized Transformer for Recommendation

The goal of recommender systems is to provide ordered item lists to users that best match their interests. As a critical task in the recommendation pipeline, re-ranking has received increasing attention in recent years. In contrast to conventional ranking models that score each item individually, re-ranking aims to explicitly model the mutual influences among items to further refine the ordering of items given an initial ranking list. In this paper, we present a personalized re-ranking model (dubbed PEAR) based on contextualized transformer. PEAR makes several major improvements over the existing methods. Specifically, PEAR not only captures feature-level and item-level interactions, but also models item contexts from both the initial ranking list and the historical clicked item list. In addition to item-level ranking score prediction, we also augment the training of PEAR with a list-level classification task to assess users' satisfaction on the whole ranking list. Experimental results on both public and production datasets have shown the superior effectiveness of PEAR compared to the previous re-ranking models.

preprint2022arXiv

PixelFolder: An Efficient Progressive Pixel Synthesis Network for Image Generation

Pixel synthesis is a promising research paradigm for image generation, which can well exploit pixel-wise prior knowledge for generation. However, existing methods still suffer from excessive memory footprint and computation overhead. In this paper, we propose a progressive pixel synthesis network towards efficient image generation, coined as PixelFolder. Specifically, PixelFolder formulates image generation as a progressive pixel regression problem and synthesizes images via a multi-stage structure, which can greatly reduce the overhead caused by large tensor transformations. In addition, we introduce novel pixel folding operations to further improve model efficiency while maintaining pixel-wise prior knowledge for end-to-end regression. With these innovative designs, we greatly reduce the expenditure of pixel synthesis, e.g., reducing 89% computation and 53% parameters compared with the latest pixel synthesis method CIPS. To validate our approach, we conduct extensive experiments on two benchmark datasets, namely FFHQ and LSUN Church. The experimental results show that with much less expenditure, PixelFolder obtains new state-of-the-art (SOTA) performance on two benchmark datasets, i.e., 3.77 FID and 2.45 FID on FFHQ and LSUN Church, respectively. Meanwhile, PixelFolder is also more efficient than the SOTA methods like StyleGAN2, reducing about 72% computation and 31% parameters, respectively. These results greatly validate the effectiveness of the proposed PixelFolder.

preprint2022arXiv

Process Knowledge-infused Learning for Suicidality Assessment on Social Media

Improving the performance and natural language explanations of deep learning algorithms is a priority for adoption by humans in the real world. In several domains, such as healthcare, such technology has significant potential to reduce the burden on humans by providing quality assistance at scale. However, current methods rely on the traditional pipeline of predicting labels from data, thus completely ignoring the process and guidelines used to obtain the labels. Furthermore, post hoc explanations on the data to label prediction using explainable AI (XAI) models, while satisfactory to computer scientists, leave much to be desired to the end-users due to lacking explanations of the process in terms of human-understandable concepts. We introduce, formalize, and develop a novel Artificial Intelligence (AI) paradigm: Process Knowledge-infused Learning (PK-iL). PK-iL utilizes a structured process knowledge that explicitly explains the underlying prediction process in a way that makes sense to end-users. The qualitative human evaluation confirms, through an annotator agreement of 0.72, that humans understand the explanations for the predictions. PK-iL also performs competitively with the state-of-the-art (SOTA) baselines.

preprint2022arXiv

Progressively Optimized Bi-Granular Document Representation for Scalable Embedding Based Retrieval

Ad-hoc search calls for the selection of appropriate answers from a massive-scale corpus. Nowadays, embedding-based retrieval (EBR) has become a promising solution, where deep learning based document representation and ANN search techniques are allied to handle this task. However, a major challenge is that the ANN index can be too large to fit into memory, given the considerable size of the answer corpus. In this work, we tackle this problem with Bi-Granular Document Representation, where the lightweight sparse embeddings are indexed and kept on standby in memory for coarse-grained candidate search, and the heavyweight dense embeddings are hosted on disk for fine-grained post verification. For the best retrieval accuracy, a Progressive Optimization framework is designed. The sparse embeddings are learned ahead for high-quality search of candidates. Conditioned on the candidate distribution induced by the sparse embeddings, the dense embeddings are continuously learned to optimize the discrimination of ground-truth from the shortlisted candidates. Besides, two techniques, the contrastive quantization and the locality-centric sampling, are introduced for the learning of sparse and dense embeddings, which substantially contribute to their performances. Thanks to the above features, our method effectively handles massive-scale EBR with strong advantages in accuracy: with up to +4.3% recall gain on million-scale corpus, and up to +17.5% recall gain on billion-scale corpus. Besides, our method is applied to a major sponsored search platform with substantial gains on revenue (+1.95%), Recall (+1.01%) and CTR (+0.49%). Our code is available at https://github.com/microsoft/BiDR.

preprint2022arXiv

Quantum reflection of single photons in a cold Rydberg atomic gas

We propose and analyze a scheme for realizing the quantum reflection of single photons in a cold Rydberg atomic gas via electromagnetically induced transparency, by which a deep and tunable attractive potential well can be prepared by using stored gate photons. Such a scheme is promising for designing dispersion-type single-photon switches, and may be taken as a quantum device for observing the wave and particle natures of photons simultaneously.

preprint2022arXiv

Quantum Squeezing of Slow-Light Solitons

We investigate the quantum squeezing of slow-light solitons generated in a Λ-shaped three-level atomic system working under the condition of electromagnetically induced transparency (EIT). We show that due to the giant Kerr nonlinearity contributed from the EIT effect, significant quantum squeezing of the slow-light soliton can be realized within a short propagation distance. The results reported here are helpful for understanding the quantum property of slow-light solitons and for realizing light squeezing via EIT in cold atomic gases experimentally.

preprint2022arXiv

Rapid Phase Ambiguity Elimination Methods for DOA Estimator via Hybrid Massive MIMO Receive Array

For a sub-connected hybrid multiple-input multiple-output (MIMO) receiver with K subarrays and N antennas, there exists a challenging problem of how to rapidly remove phase ambiguity in only a single time-slot. First, a DOA estimator of maximizing received power (Max-RP) is proposed to find the maximum value of the K subarray output powers, where each subarray is in charge of one sector, and the center angle of the sector corresponding to the maximum output is the estimated true DOA. To enhance precision, a Max-RP plus quadratic interpolation (Max-RP-QI) method is designed. In the proposed Max-RP-QI, a quadratic interpolation scheme is adopted to interpolate the three DOA values corresponding to the largest three receive powers of Max-RP. Finally, to achieve the CRLB, a Root-MUSIC plus Max-RP-QI scheme is developed. Simulation results show that the proposed three methods eliminate the phase ambiguity during one time-slot and also have low computational complexity. In particular, the proposed Root-MUSIC plus Max-RP-QI scheme can reach the CRLB, while the proposed Max-RP and Max-RP-QI still show performance losses of about 2-4 dB compared to the CRLB.
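
The two-stage idea behind Max-RP and Max-RP-QI can be sketched as follows: take the sector center with the largest received power as a coarse estimate, then fit a parabola through the three sector centers with the largest powers and use its vertex as the refined estimate. The function names, the sector grid and the synthetic power profile are assumptions for illustration; the paper's exact interpolation formula is not given in the abstract.

```python
import numpy as np


def max_rp(sector_centers_deg, powers):
    """Coarse DOA estimate: center angle of the sector with the largest power."""
    centers = np.asarray(sector_centers_deg, dtype=float)
    return float(centers[int(np.argmax(powers))])


def max_rp_qi(sector_centers_deg, powers):
    """Refined DOA estimate: vertex of a parabola through the three strongest sectors."""
    centers = np.asarray(sector_centers_deg, dtype=float)
    powers = np.asarray(powers, dtype=float)
    top3 = np.argsort(powers)[-3:]          # indices of the three largest powers
    a, b, _ = np.polyfit(centers[top3], powers[top3], 2)
    if a >= 0:                              # no interior maximum, keep the coarse estimate
        return max_rp(centers, powers)
    return float(-b / (2.0 * a))


# Synthetic example (assumed): seven sectors and a power profile peaking at 6 degrees.
CENTERS = np.array([-60.0, -40.0, -20.0, 0.0, 20.0, 40.0, 60.0])
POWERS = 10.0 - 0.01 * (CENTERS - 6.0) ** 2

if __name__ == "__main__":
    print("Max-RP   :", max_rp(CENTERS, POWERS))
    print("Max-RP-QI:", max_rp_qi(CENTERS, POWERS))
```

Because the synthetic profile is itself a parabola, the interpolated estimate recovers the peak exactly here; with real array responses the fit is only an approximation near the peak.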

preprint2022arXiv

Searching for Optimal Subword Tokenization in Cross-domain NER

Input distribution shift is one of the vital problems in unsupervised domain adaptation (UDA). The most popular UDA approaches focus on domain-invariant representation learning (DIRL), trying to align the features from different domains into similar feature distributions. However, these approaches ignore the direct alignment of input word distributions between domains, which is a vital factor in word-level classification tasks such as cross-domain NER. In this work, we shed new light on cross-domain NER by introducing a subword-level solution, X-Piece, for input word-level distribution shift in NER. Specifically, we re-tokenize the input words of the source domain to approach the target subword distribution, which is formulated and solved as an optimal transport problem. As this approach focuses on the input level, it can also be combined with previous DIRL methods for further improvement. Experimental results show the effectiveness of the proposed method based on BERT-tagger on four benchmark NER datasets. Also, the proposed method is shown to benefit DIRL methods such as DANN.

preprint2022arXiv

Sequential/Session-based Recommendations: Challenges, Approaches, Applications and Opportunities

In recent years, sequential recommender systems (SRSs) and session-based recommender systems (SBRSs) have emerged as a new paradigm of RSs to capture users' short-term but dynamic preferences for enabling more timely and accurate recommendations. Although SRSs and SBRSs have been extensively studied, there are many inconsistencies in this area caused by the diverse descriptions, settings, assumptions and application domains. There is no work to provide a unified framework and problem statement to remove the commonly existing and various inconsistencies in the area of SR/SBR. There is a lack of work to provide a comprehensive and systematic demonstration of the data characteristics, key challenges, most representative and state-of-the-art approaches, typical real-world applications and important future research directions in the area. This work aims to fill in these gaps so as to facilitate further research in this exciting and vibrant area.

preprint2022arXiv

Single-Frame based Deep View Synchronization for Unsynchronized Multi-Camera Surveillance

Multi-camera surveillance has been an active research topic for understanding and modeling scenes. Compared to a single camera, multi-cameras provide larger field-of-view and more object cues, and the related applications are multi-view counting, multi-view tracking, 3D pose estimation or 3D reconstruction, etc. It is usually assumed that the cameras are all temporally synchronized when designing models for these multi-camera based tasks. However, this assumption is not always valid, especially for multi-camera systems with network transmission delay and low frame-rates due to limited network bandwidth, resulting in desynchronization of the captured frames across cameras. To handle the issue of unsynchronized multi-cameras, in this paper, we propose a synchronization model that works in conjunction with existing DNN-based multi-view models, thus avoiding the redesign of the whole model. Under the low-fps regime, we assume that only a single relevant frame is available from each view, and synchronization is achieved by matching together image contents guided by epipolar geometry. We consider two variants of the model, based on where in the pipeline the synchronization occurs: scene-level synchronization and camera-level synchronization. The view synchronization step and the task-specific view fusion and prediction step are unified in the same framework and trained in an end-to-end fashion. Our view synchronization models are applied to different DNN-based multi-camera vision tasks under the unsynchronized setting, including multi-view counting and 3D pose estimation, and achieve good performance compared to baselines.

preprint2022arXiv

Single-Layer Vision Transformers for More Accurate Early Exits with Less Overhead

Deploying deep learning models in time-critical applications with limited computational resources, for instance in edge computing systems and IoT networks, is a challenging task that often relies on dynamic inference methods such as early exiting. In this paper, we introduce a novel architecture for early exiting based on the vision transformer architecture, as well as a fine-tuning strategy that significantly increases the accuracy of early exit branches compared to conventional approaches while introducing less overhead. Through extensive experiments on image and audio classification as well as audiovisual crowd counting, we show that our method works for both classification and regression problems, and in both single- and multi-modal settings. Additionally, we introduce a novel method for integrating audio and visual modalities within early exits in audiovisual data analysis, which can lead to a more fine-grained dynamic inference.

preprint2022arXiv

Stability of ferroelectric bubble domains

Nanoscale ferroelectric topologies such as vortices, anti-vortices, bubble patterns etc. are stabilized in thin films by a delicate balance of both mechanical and electrical boundary conditions. A systematic understanding of the phase stability of bubble domains, particularly when the above factors act simultaneously, remains elusive. Here we present first-principle-based simulations in combination with scanning probe microscopy of ultrathin epitaxial (001) PbZr0.4Ti0.6O3 heterostructures to address this gap. The simulations predict that as-grown labyrinthine domains will transform to bubbles under combinations of reduced film thickness, increased mechanical pressure and/or improved electrical screening. These topological transitions are explained by a common fundamental mechanism. Namely, we argue that, independently of the nature of the driving force, the evolution of the domain morphology allows the system to conserve its original residual depolarization field. Thereby, the latter remains pinned to a value determined by an external or built-in electric bias. To verify our predictions, we then exploit tomographic atomic force microscopy to achieve the concurrent effect of reducing film thickness and increased mechanical stimulus. The results provide a systematic understanding of phase stability and demonstrate controlled manipulation of nanoscale ferroelectric bubble domains.

preprint2022arXiv

Stereo Unstructured Magnification: Multiple Homography Image for View Synthesis

This paper studies the problem of view synthesis with a certain amount of rotation from a pair of images, which we call stereo unstructured magnification. While the multi-plane image representation is well suited for view synthesis with depth invariant, how to generalize it to unstructured views remains a significant challenge. This is primarily due to the depth-dependency caused by camera frontal parallel representation. Here we propose a novel multiple homography image (MHI) representation, comprising a set of scene planes with fixed normals and distances. A two-stage network is developed for novel view synthesis. Stage-1 is an MHI reconstruction module that predicts the MHIs and composites layered multi-normal images along the normal direction. Stage-2 is a normal-blending module to find blending weights. We also derive an angle-based cost to guide the blending of multi-normal images by exploiting per-normal geometry. Compared with the state-of-the-art methods, our method achieves superior performance for view synthesis qualitatively and quantitatively, especially for cases when the cameras undergo rotations.

preprint2022arXiv

Supervised Deep Hashing for High-dimensional and Heterogeneous Case-based Reasoning

Case-based Reasoning (CBR) on high-dimensional and heterogeneous data is a trending yet challenging and computationally expensive task in the real world. A promising approach is to obtain low-dimensional hash codes representing cases and perform a similarity retrieval of cases in Hamming space. However, previous methods based on data-independent hashing rely on random projections or manual construction, inapplicable to address specific data issues (e.g., high-dimensionality and heterogeneity) due to their insensitivity to data characteristics. To address these issues, this work introduces a novel deep hashing network to learn similarity-preserving compact hash codes for efficient case retrieval and proposes a deep-hashing-enabled CBR model HeCBR. Specifically, we introduce position embedding to represent heterogeneous features and utilize a multilinear interaction layer to obtain case embeddings, which effectively filtrates zero-valued features to tackle high-dimensionality and sparsity and captures inter-feature couplings. Then, we feed the case embeddings into fully-connected layers, and subsequently a hash layer generates hash codes with a quantization regularizer to control the quantization loss during relaxation. To cater to incremental learning of CBR, we further propose an adaptive learning strategy to update the hash function. Extensive experiments on public datasets show that HeCBR greatly reduces storage and significantly accelerates case retrieval. HeCBR achieves desirable performance compared with the state-of-the-art CBR methods and performs significantly better than hashing-based CBR methods in classification.
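
The retrieval step described above, ranking stored cases by Hamming distance between compact hash codes, can be sketched as follows. This is a minimal illustration, not HeCBR itself: the case identifiers, the 8-bit codes, and the function names `hamming_distance` and `retrieve` are hypothetical, and the codes are hard-coded here rather than produced by a learned hash layer.

```python
from typing import Dict, List, Tuple


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two hash codes (stored as ints) differ."""
    return bin(a ^ b).count("1")


def retrieve(query: int, case_base: Dict[str, int], k: int = 2) -> List[Tuple[str, int]]:
    """Return the k stored cases closest to the query in Hamming space."""
    ranked = sorted(
        ((case_id, hamming_distance(query, code)) for case_id, code in case_base.items()),
        key=lambda pair: (pair[1], pair[0]),  # sort by distance, then by id for stable ties
    )
    return ranked[:k]


# Hypothetical 8-bit hash codes; a real system would obtain them from the hash layer.
case_base = {
    "case_a": 0b10110010,
    "case_b": 0b10110011,
    "case_c": 0b01001101,
}

print(retrieve(0b10110110, case_base))
```

Because the distance is an XOR followed by a bit count, comparing a query against a stored case costs a handful of integer operations, which is one plausible reason hash-based retrieval can reduce both storage and lookup time.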

preprint2022arXiv

Uni-Retriever: Towards Learning The Unified Embedding Based Retriever in Bing Sponsored Search

Embedding based retrieval (EBR) is a fundamental building block in many web applications. However, EBR in sponsored search is distinguished from other generic scenarios and technically challenging due to the need of serving multiple retrieval purposes: firstly, it has to retrieve high-relevance ads, which may exactly serve user's search intent; secondly, it needs to retrieve high-CTR ads so as to maximize the overall user clicks. In this paper, we present a novel representation learning framework Uni-Retriever developed for Bing Search, which unifies two different training modes knowledge distillation and contrastive learning to realize both required objectives. On one hand, the capability of making high-relevance retrieval is established by distilling knowledge from the ``relevance teacher model''. On the other hand, the capability of making high-CTR retrieval is optimized by learning to discriminate user's clicked ads from the entire corpus. The two training modes are jointly performed as a multi-objective learning process, such that the ads of high relevance and CTR can be favored by the generated embeddings. Besides the learning strategy, we also elaborate our solution for EBR serving pipeline built upon the substantially optimized DiskANN, where massive-scale EBR can be performed with competitive time and memory efficiency, and accomplished in high-quality. We make comprehensive offline and online experiments to evaluate the proposed techniques, whose findings may provide useful insights for the future development of EBR systems. Uni-Retriever has been mainstreamed as the major retrieval path in Bing's production thanks to the notable improvements on the representation and EBR serving quality.

preprint2022arXiv

Visualizing the out-of-plane electronic dispersions in an intercalated transition metal dichalcogenide

Layered transition metal dichalcogenides have rich phase diagrams, and many of their physical properties are two-dimensional in character. Co1/3NbS2 is one of the newest members of this family where Co atoms are intercalated into the Van der Waals gaps between NbS2 layers. We study the three-dimensional electronic band structure of Co1/3NbS2 using both surface and bulk sensitive angle-resolved photoemission spectroscopy. We show that the electronic bands do not fit into the rigid-band-shift picture after the Co intercalation. Instead, Co1/3NbS2 displays a different orbital character near the Fermi level compared to the pristine NbS2 compound and has a clear band dispersion in the kz direction despite its layered structure. Our photoemission study demonstrates the out-of-plane electronic correlations introduced by the Co intercalation, thus offering a new perspective on this compound. Finally, we propose how Fermi level tuning could lead to exotic phases such as spin density wave instability.

preprint2022arXiv

Wide-Area Crowd Counting: Multi-View Fusion Networks for Counting in Large Scenes

Crowd counting in single-view images has achieved outstanding performance on existing counting datasets. However, single-view counting is not applicable to large and wide scenes (e.g., public parks, long subway platforms, or event spaces) because a single camera cannot capture the whole scene in adequate detail for counting, e.g., when the scene is too large to fit into the field-of-view of the camera, too long so that the resolution is too low on faraway crowds, or when there are too many large objects that occlude large portions of the crowd. Therefore, solving the wide-area counting task requires multiple cameras with overlapping fields-of-view. In this paper, we propose a deep neural network framework for multi-view crowd counting, which fuses information from multiple camera views to predict a scene-level density map on the ground-plane of the 3D world. We consider three versions of the fusion framework: the late fusion model fuses camera-view density maps; the naive early fusion model fuses camera-view feature maps; and the multi-view multi-scale early fusion model ensures that features aligned to the same ground-plane point have consistent scales. A rotation selection module further ensures consistent rotation alignment of the features. We test our 3 fusion models on 3 multi-view counting datasets, PETS2009, DukeMTMC, and a newly collected multi-view counting dataset containing a crowded street intersection. Our methods achieve state-of-the-art results compared to other multi-view counting baselines.

preprint2021arXiv

Connection between inverse engineering and optimal control in shortcuts to adiabaticity

We consider fast high-fidelity quantum control by using a shortcut to adiabaticity (STA) technique and optimal control theory (OCT). Three specific examples, including expansion of cold atoms from the harmonic trap, atomic transport by moving harmonic trap, and spin dynamics in the presence of dissipation, are explicitly detailed. Using OCT as a qualitative guide, we demonstrate how STA protocols designed from inverse engineering method, can approach with very high precision optimal solutions built about physical constraints, by a proper choice of the interpolation function and with a very reduced number of adjustable parameters.

preprint2021arXiv

Knowledge Infused Policy Gradients for Adaptive Pandemic Control

COVID-19 has impacted nations differently based on their policy implementations. Effective policy requires taking into account public information and adaptability to new knowledge. Epidemiological models built to understand COVID-19 seldom provide the policymaker with the capability for adaptive pandemic control (APC). The core challenges to be overcome include (a) inability to handle a high degree of non-homogeneity in different contributing features across the pandemic timeline, (b) lack of an approach that enables adaptive incorporation of public health expert knowledge, and (c) transparent models that enable understanding of the decision-making process in suggesting policy. In this work, we take the early steps to address these challenges using Knowledge Infused Policy Gradient (KIPG) methods. Prior work on knowledge infusion does not handle soft and hard imposition of varying forms of knowledge in disease information and guidelines to necessarily comply with. Furthermore, the models do not attend to non-homogeneity in feature counts, manifesting as partial observability in informing the policy. Additionally, interpretable structures are extracted post-learning instead of learning an interpretable model required for APC. To this end, we introduce a mathematical framework for KIPG methods that can (a) induce relevant feature counts over multi-relational features of the world, (b) handle latent non-homogeneous counts as hidden variables that are linear combinations of kernelized aggregates over the features, and (c) infuse knowledge as functional constraints in a principled manner. The study establishes a theory for imposing hard and soft constraints and simulates it through experiments. In comparison with knowledge-intensive baselines, we show quick sample efficient adaptation to new knowledge and interpretability in the learned policy, especially in a pandemic context.

preprint2021arXiv

Robust Consumption Portfolio Optimization with Stochastic Differential Utility

This paper examines a continuous time intertemporal consumption and portfolio choice problem with a stochastic differential utility preference of Epstein-Zin type for a robust investor, who worries about model misspecification and seeks robust decision rules. We provide a verification theorem which formulates the Hamilton-Jacobi-Bellman-Isaacs equation under a non-Lipschitz condition. Then, with the verification theorem, the explicit closed-form optimal robust consumption and portfolio solutions to a Heston model are given. Also we compare our robust solutions with the non-robust ones, and the comparisons shown in a few figures coincide with our common sense.

preprint2020arXiv

A framework for the analysis of supervised discrete event systems under attack

This paper focuses on the problem of cyber attacks for discrete event systems under supervisory control. In more detail, the goal of the supervisor, who has a partial observation of the system evolution, is that of preventing the system from reaching a set of unsafe states. An attacker may act in two different ways: he can corrupt the observation of the supervisor editing the sensor readings, and can enable events that are disabled by the supervisor. This is done with the aim of leading the plant to an unsafe state, and keeping the supervisor unaware of that before the unsafe state is reached. A special automaton, called attack structure is constructed as the parallel composition of two special structures. Such an automaton can be used by the attacker to select appropriate actions (if any) to reach the above goal, or equivalently by the supervisor, to validate its robustness with respect to such attacks.
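
The abstract states that the attack structure is built as a parallel composition of two structures. The sketch below shows only the generic parallel (synchronous) composition of two deterministic automata, in which shared events must occur in both components and private events interleave. It is not the paper's attack structure: the two automata `g1` and `g2`, their events, and the dictionary layout are hypothetical.

```python
from collections import deque


def parallel_composition(a, b):
    """Compose two automata given as dicts with keys 'init', 'events', 'delta'.

    'delta' maps (state, event) -> next state. Only reachable states are built.
    """
    shared = a["events"] & b["events"]
    init = (a["init"], b["init"])
    delta = {}
    states = {init}
    queue = deque([init])
    while queue:
        x, y = queue.popleft()
        for event in sorted(a["events"] | b["events"]):
            next_x = a["delta"].get((x, event))
            next_y = b["delta"].get((y, event))
            if event in shared:
                # A shared event fires only if both components allow it.
                if next_x is None or next_y is None:
                    continue
                target = (next_x, next_y)
            elif event in a["events"]:
                # Private event of the first automaton: the second stays put.
                if next_x is None:
                    continue
                target = (next_x, y)
            else:
                # Private event of the second automaton: the first stays put.
                if next_y is None:
                    continue
                target = (x, next_y)
            delta[((x, y), event)] = target
            if target not in states:
                states.add(target)
                queue.append(target)
    return {"init": init, "events": a["events"] | b["events"], "delta": delta, "states": states}


# Hypothetical components: 'c' is shared, 'a' and 'b' are private.
g1 = {"init": 0, "events": {"a", "c"}, "delta": {(0, "a"): 1, (1, "c"): 0}}
g2 = {"init": 0, "events": {"b", "c"}, "delta": {(0, "b"): 1, (1, "c"): 0}}

composed = parallel_composition(g1, g2)
print(sorted(composed["states"]))
```

In this toy example the shared event `c` becomes possible only after both private events have occurred, which illustrates how composition restricts behavior to what both components jointly permit.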

preprint2020arXiv

A Machine Learning-enhanced Robust P-Phase Picker for Real-time Seismic Monitoring

Identifying the arrival times of seismic P-phases plays a significant role in real-time seismic monitoring, which provides critical guidance for emergency response activities. While considerable research has been conducted on this topic, efficiently capturing the arrival times of seismic P-phases hidden within intensively distributed and noisy seismic waves, such as those generated by the aftershocks of destructive earthquakes, remains a real challenge since most common existing methods in seismology rely on laborious expert supervision. To this end, in this paper, we present a machine learning-enhanced framework based on ensemble learning strategy, EL-Picker, for the automatic identification of seismic P-phase arrivals on continuous and massive waveforms. More specifically, EL-Picker consists of three modules, namely, Trigger, Classifier, and Refiner, and an ensemble learning strategy is exploited to integrate several machine learning classifiers. An evaluation of the aftershocks following the MS 8.0 Wenchuan earthquake demonstrates that EL-Picker can not only achieve the best identification performance but also identify 120% more seismic P-phase arrivals as complementary data. Meanwhile, experimental results also reveal both the applicability of different machine learning models for waveforms collected from different seismic stations and the regularities of seismic P-phase arrivals that might be neglected during manual inspection. These findings clearly validate the effectiveness, efficiency, flexibility and stability of EL-Picker.

preprint2020arXiv

A Unified Framework for Adjustable Robust Optimization with Endogenous Uncertainty

This work proposes a framework for multistage adjustable robust optimization that unifies the treatment of three different types of endogenous uncertainty, where decisions, respectively, (i) alter the uncertainty set, (ii) affect the materialization of uncertain parameters, and (iii) determine the time when the true values of uncertain parameters are observed. We provide a systematic analysis of the different types of endogenous uncertainty and highlight the connection between optimization under endogenous uncertainty and active learning. We consider decision-dependent polyhedral uncertainty sets and propose a decision rule approach that incorporates both continuous and binary recourse, including recourse decisions that affect the uncertainty set. The proposed method enables the modeling of decision-dependent nonanticipativity and results in a tractable reformulation of the problem. We demonstrate the effectiveness of the approach in computational experiments that cover a range of applications, including plant redesign, maintenance planning with inspections, optimizing revision points in capacity planning, and production scheduling with active parameter estimation. The results show significant benefits from the proper modeling of endogenous uncertainty and active learning.

preprint2020arXiv

Adaptive Task Partitioning at Local Device or Remote Edge Server for Offloading in MEC

Mobile edge computing (MEC) is one of the promising solutions to process computational-intensive tasks for the emerging time-critical Internet-of-Things (IoT) use cases, e.g., virtual reality (VR), augmented reality (AR), autonomous vehicle. The latency can be reduced further, when a task is partitioned and computed by multiple edge servers' (ESs) collaboration. However, the state-of-the-art work studies the MEC-enabled offloading based on a static framework, which partitions tasks at either the local user equipment (UE) or the primary ES. The dynamic selection between the two offloading schemes has not been well studied yet. In this paper, we investigate a dynamic offloading framework in a multi-user scenario. Each UE can decide who partitions a task according to the network status, e.g., channel quality and allocated computation resource. Based on the framework, we model the latency to complete a task, and formulate an optimization problem to minimize the average latency among UEs. The problem is solved by jointly optimizing task partitioning and the allocation of the communication and computation resources. The numerical results show that, compared with the static offloading schemes, the proposed algorithm achieves the lower latency in all tested scenarios. Moreover, both mathematical derivation and simulation illustrate that the wireless channel quality difference between a UE and different ESs can be used as an important criterion to determine the right scheme.

preprint2020arXiv

An under-approximation for the robust uncertain two-level cooperative set covering problem

This paper investigates the robust uncertain two-level cooperative set covering problem (RUTLCSCP). Given two types of facilities, called y-facilities and z-facilities, the problem is to decide which facilities of both types to select in order to cover the demand nodes cooperatively with minimal cost. It combines the concepts of robust, probabilistic, and cooperative covering by introducing "$Γ$-robust two-level-cooperative $α$-cover" constraints. Additionally, the constraint-relaxed version of the RUTLCSCP, which is also a linear approximation robust counterpart version of RUTLCSCP (RUTLCSCP-LA-RC), is developed by linear approximation of the constraints, and can be stated as a compact mixed-integer linear programming problem. We show that the solution for RUTLCSCP-LA-RC, an $\varepsilon$-under-approximate solution, can also be the solution for RUTLCSCP under some conditions. Computational experiments show that the solutions in 333 instances (10125 instances in total) with 12 types, which slightly violate the constraints of RUTLCSCP, can be efficient under-approximate solutions, while the feasible solutions in other instances are proven to be optimal.

preprint2020arXiv

Branch-and-Price for a Class of Nonconvex Mixed-Integer Nonlinear Programs

This work attempts to combine the strengths of two major technologies that have matured over the last three decades: global mixed-integer nonlinear optimization and branch-and-price. We consider a class of generally nonconvex mixed-integer nonlinear programs (MINLPs) with linear complicating constraints and integer linking variables. If the complicating constraints are removed, the problem becomes easy to solve, e.g. due to decomposable structure. Integrality of the linking variables allows us to apply a discretization approach to derive a Dantzig-Wolfe reformulation and solve the problem to global optimality using branch-and-price. It is a remarkably simple idea; but to our surprise, it has barely found any application in the literature. In this work, we show that many relevant problems directly fall or can be reformulated into this class of MINLPs. We present the branch-and-price algorithm and demonstrate its effectiveness (and sometimes ineffectiveness) in an extensive computational study considering multiple large-scale problems of practical relevance, showing that, in many cases, orders-of-magnitude reductions in solution time can be achieved.

preprint2020arXiv

Computation Resource Allocation for Heterogeneous Time-Critical IoT Services in MEC

Mobile edge computing (MEC) is one of the promising solutions to process computational-intensive tasks within short latency for emerging Internet-of-Things (IoT) use cases, e.g., virtual reality (VR), augmented reality (AR), autonomous vehicle. Due to the coexistence of heterogeneous services in MEC system, the task arrival interval and required execution time can vary depending on services. It is challenging to schedule computation resource for the services with stochastic arrivals and runtime at an edge server (ES). In this paper, we propose a flexible computation offloading framework among users and ESs. Based on the framework, we propose a Lyapunov-based algorithm to dynamically allocate computation resource for heterogeneous time-critical services at the ES. The proposed algorithm minimizes the average timeout probability without any prior knowledge on task arrival process and required runtime. The numerical results show that, compared with the standard queuing models used at ES, the proposed algorithm achieves at least 35% reduction of the timeout probability, and approximated utilization efficiency of computation resource to non-cause queuing model under various scenarios.

preprint2020arXiv

Deep-AIR: A Hybrid CNN-LSTM Framework for Fine-Grained Air Pollution Forecast

Poor air quality has become an increasingly critical challenge for many metropolitan cities, which carries many catastrophic physical and mental consequences on human health and quality of life. However, accurately monitoring and forecasting air quality remains a highly challenging endeavour. Limited by geographically sparse data, traditional statistical models and newly emerging data-driven methods of air quality forecasting mainly focused on the temporal correlation between the historical temporal datasets of air pollutants. However, in reality, both distribution and dispersion of air pollutants are highly location-dependant. In this paper, we propose a novel hybrid deep learning model that combines Convolutional Neural Networks (CNN) and Long Short Term Memory (LSTM) together to forecast air quality at high-resolution. Our model can utilize the spatial correlation characteristic of our air pollutant datasets to achieve higher forecasting accuracy than existing deep learning models of air pollution forecast.

preprint2020arXiv

Linear Response Theory for Nonlinear Stochastic Differential Equations with $α$-stable Lévy Noises

We consider a nonlinear stochastic differential equation driven by an $α$-stable Lévy process ($1<α<2$). We first obtain some regularity results for the probability density of its invariant measure via establishing the a priori estimate of the corresponding stationary Fokker-Planck equation. Then by the a priori estimate of Kolmogorov backward equations and the perturbation property of Markov semigroup, we derive the response function and generalize the famous linear response theory in nonequilibrium statistical mechanics to non-Gaussian stochastic dynamic systems.

preprint2020arXiv

Multistage Robust Mixed-Integer Optimization Under Endogenous Uncertainty

Endogenous, i.e. decision-dependent, uncertainty has received increased interest in the stochastic programming community. In the robust optimization context, however, it has rarely been considered. This work addresses multistage robust mixed-integer optimization with decision-dependent uncertainty sets. The proposed framework allows us to consider both continuous and integer recourse, including recourse decisions that affect the uncertainty set. We derive a tractable reformulation of the problem by leveraging recent advances in the construction of nonlinear decision rules, and introduce discontinuous piecewise linear decision rules for continuous recourse. Computational experiments are performed to gain insights on the impact of endogenous uncertainty, the benefit of discrete recourse, and computational performance. Our results indicate that the level of conservatism in the solution can be significantly reduced if endogenous uncertainty and mixed-integer recourse are properly modeled.

preprint2020arXiv

Multivariate Time-series Anomaly Detection via Graph Attention Network

Anomaly detection on multivariate time-series is of great importance in both data mining research and industrial applications. Recent approaches have achieved significant progress in this topic, but there are remaining limitations. One major limitation is that they do not capture the relationships between different time-series explicitly, resulting in inevitable false alarms. In this paper, we propose a novel self-supervised framework for multivariate time-series anomaly detection to address this issue. Our framework considers each univariate time-series as an individual feature and includes two graph attention layers in parallel to learn the complex dependencies of multivariate time-series in both temporal and feature dimensions. In addition, our approach jointly optimizes a forecasting-based model and a reconstruction-based model, obtaining better time-series representations through a combination of single-timestamp prediction and reconstruction of the entire time-series. We demonstrate the efficacy of our model through extensive experiments. The proposed method outperforms other state-of-the-art models on three real-world datasets. Further analysis shows that our method has good interpretability and is useful for anomaly diagnosis.

preprint2020arXiv

Nanoscale electrometry based on a magnetic-field-resistant spin sensor

The nitrogen-vacancy (NV) center is a potential atomic-scale spin sensor for electric field sensing. However, its natural susceptibility to the magnetic field hinders effective detection of the electric field. Here we propose a robust electrometric method utilizing continuous dynamic decoupling (CDD) technique. During the CDD period, the NV center evolves in a dressed-state space, where the sensor is resistant to magnetic fields but remains sensitive to electric fields. As an example, we use this method to isolate the electric noise from a complex electro-magnetical environment near diamond surface via measuring the dephasing rate between dressed states. By reducing the surface electric noise with different covered liquids, we observe an unambiguous relation between the dephasing rate and the dielectric permittivity of the liquid, which enables a quantitative investigation of electric noise model near diamond surface.

preprint2020arXiv

Observation of sixfold degenerate fermions in PdSb$_2$

Three types of fermions have been extensively studied in topological quantum materials: Dirac, Weyl, and Majorana fermions. Beyond the fundamental fermions in high energy physics, exotic fermions are allowed in condensed matter systems residing in three-, six- or eightfold degenerate band crossings. Here, we use angle-resolved photoemission spectroscopy to directly visualize three-doubly-degenerate bands in PdSb$_2$. The ultrahigh energy resolution we are able to achieve allows for the confirmation of all the sixfold degenerate bands at the R point, in remarkable consistency with first-principles calculations. Moreover, we find that this sixfold degenerate crossing has quadratic dispersion as predicted by theory. Finally, we compare sixfold degenerate fermions with previously confirmed fermions to demonstrate the importance of this work: our study indicates a topological fermion beyond the constraints of high energy physics.

preprint2020arXiv

On quotients of Boolean control networks

In this paper, we focus on the study of quotients of Boolean control networks (BCNs) with the motivation that they might serve as smaller models that still carry enough information about the original network. Given a BCN and an equivalence relation on the state set, we consider a labeled transition system that is generated by the BCN. The resulting quotient transition system then naturally captures the quotient dynamics of the BCN concerned. We therefore develop a method for constructing a Boolean system that behaves equivalently to the resulting quotient transition system. The use of the obtained quotient system for control design is discussed and we show that for BCNs, controller synthesis can be done by first designing a controller for a quotient and subsequently lifting it to the original model. We finally demonstrate the applicability of the proposed techniques on a biological example.
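
The quotient transition system mentioned above can be illustrated on a toy network. The sketch below is an assumption-laden example, not the paper's construction: the two-node update rule `bcn_step` and the equivalence relation `block_of` (states are equivalent when their first bit agrees) are hypothetical. A quotient transition from one block to another under input `u` is recorded whenever some state in the first block moves into the second block under `u`.

```python
from itertools import product


def bcn_step(state, u):
    """Hypothetical Boolean control network: two state bits, one input bit."""
    x1, x2 = state
    return (x2 & u, x1 | u)


def block_of(state):
    """Hypothetical equivalence relation: states with the same first bit are equivalent."""
    return state[0]


def quotient_transitions():
    """Collect (source block, input, target block) triples of the quotient system."""
    transitions = set()
    for state in product((0, 1), repeat=2):
        for u in (0, 1):
            transitions.add((block_of(state), u, block_of(bcn_step(state, u))))
    return transitions


for source, u, target in sorted(quotient_transitions()):
    print(f"block {source} --u={u}--> block {target}")
```

In this example the quotient is nondeterministic under `u = 1`, since states in the same block can land in different blocks. That is why it is natural to treat the quotient as a transition system first, before asking whether a Boolean system can reproduce its behavior.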

preprint2020arXiv

Pentaquarks with the $qqs\bar{Q}Q$ configuration in the Chiral Quark Model

We study the five-quark system composed of $qqs\bar{Q}Q$ configuration ($q = u$ or $d$, $Q=b$ or $c$), in the framework of the chiral quark model. In consequence, a series of bound states with heavy flavors are predicted by precise five-body dynamical calculations. We found that taking color-octet structure into consideration always provides more binding energy than color-singlet structure, and the heavier the quarks present, the easier it is to form bound states. We suggest $qqs\bar{b}b$ configuration is a compact $\bar{b}b$-pair surrounded by three other quarks, while $qqs\bar{b}c$, $qqs\bar{c}b$ and $qqs\bar{c}c$ configurations are molecular states.

preprint2020arXiv

Person Re-identification in Aerial Imagery

Nowadays, with the rapid development of consumer Unmanned Aerial Vehicles (UAVs), visual surveillance by utilizing the UAV platform has been very attractive. Most of the research works for UAV captured visual data are mainly focused on the tasks of object detection and tracking. However, limited attention has been paid to the task of person Re-identification (ReID) which has been widely studied in ordinary surveillance cameras with fixed emplacements. In this paper, to facilitate the research of person ReID in aerial imagery, we collect a large scale airborne person ReID dataset named as Person ReID for Aerial Imagery (PRAI-1581), which consists of 39,461 images of 1581 person identities. The images of the dataset are shot by two DJI consumer UAVs flying at an altitude ranging from 20 to 60 meters above the ground, which covers most of the real UAV surveillance scenarios. In addition, we propose to utilize subspace pooling of convolution feature maps to represent the input person images. Our method can learn a discriminative and compact feature representation for ReID in aerial imagery and can be trained in an end-to-end fashion efficiently. We conduct extensive experiments on the proposed dataset and the experimental results demonstrate that re-identify persons in aerial imagery is a challenging problem, where our method performs favorably against state of the arts. Our dataset can be accessed via \url{https://github.com/stormyoung/PRAI-1581}.

preprint2020arXiv

Quantum Reflections of Nonlocal Optical Solitons in a Cold Rydberg Atomic Gas

Quantum reflection refers to a non-vanishing reflection probability in the absence of a classical turning point. Much attention has been paid to such reflections due to their fundamental, intriguing physics and potential practical applications. Here we propose a scheme to realize a quantum reflection of nonlocal nonlinear optical beams in a cold Rydberg atomic gas via electromagnetically induced transparency working in a dispersion regime. Based on the long-range interaction between Rydberg atoms, we found that the system supports low-power nonlocal optical solitons. Such nonlocal solitons can display a sharp transition between reflection, trapping, and transmission when scattered by a linear attractive potential, created by gate photons stored in another Rydberg state. Different from conventional physical systems explored up to now, the quantum reflection of the nonlocal optical solitons in the Rydberg atomic gas exhibits interesting anomalous behaviors, which can be actively manipulated by tuning the incident velocity and intensity of the probe field, as well as the nonlocality of the Kerr nonlinearity inherent in the Rydberg atomic gas. The results reported here are not only useful for developing Rydberg nonlinear optics but also helpful for characterizing the physical property of the Rydberg gas and for designing novel nonlinear optical devices.

preprint2020arXiv

ReCO: A Large Scale Chinese Reading Comprehension Dataset on Opinion

This paper presents ReCO, a human-curated Chinese Reading Comprehension dataset on Opinion. The questions in ReCO are opinion-based queries issued to a commercial search engine. The passages are provided by crowdworkers who extract the supporting snippet from the retrieved documents. Finally, an abstractive yes/no/uncertain answer is given by the crowdworkers. The release of ReCO consists of 300k questions, which to our knowledge is the largest in Chinese reading comprehension. A prominent characteristic of ReCO is that in addition to the original context paragraph, we also provide the support evidence that could be directly used to answer the question. Quality analysis demonstrates the challenge of ReCO that requires various types of reasoning skills, such as causal inference, logical reasoning, etc. Current QA models that perform very well on many question answering problems, such as BERT, only achieve 77% accuracy on this dataset, a large margin behind humans' nearly 92% performance, indicating ReCO presents a good challenge for machine reading comprehension. The code and datasets are freely available at https://github.com/benywon/ReCO.

preprint2020arXiv

Rethinking Generalization of Neural Models: A Named Entity Recognition Case Study

While neural network-based models have achieved impressive performance on a large body of NLP tasks, the generalization behavior of different models remains poorly understood: Does this excellent performance imply a perfect generalization model, or are there still some limitations? In this paper, we take the NER task as a testbed to analyze the generalization behavior of existing models from different perspectives and characterize the differences of their generalization abilities through the lens of our proposed measures, which guides us to better design models and training methods. Experiments with in-depth analyses diagnose the bottleneck of existing neural NER models in terms of breakdown performance analysis, annotation errors, dataset bias, and category relationships, which suggest directions for improvement. We have released the datasets: (ReCoNLL, PLONER) for the future research at our project page: http://pfliu.com/InterpretNER/. As a by-product of this paper, we have open-sourced a project that involves a comprehensive summary of recent NER papers and classifies them into different research topics: https://github.com/pfliu-nlp/Named-Entity-Recognition-NER-Papers.

preprint2020arXiv

Single DNA Electron Spin Resonance Spectroscopy in Aqueous Solutions

Magnetic resonance spectroscopy of single biomolecules under near-physiological conditions may substantially advance understanding of biological function, yet remains very challenging. Here we use nitrogen-vacancy centers in diamonds to detect electron spin resonance spectra of individual, tethered DNA duplexes labeled with a nitroxide spin label in aqueous buffer solutions at ambient temperatures. This paves the way for magnetic resonance studies on single biomolecules and their inter-molecular interactions in a native-like environment.

preprint2020arXiv

Spin-orbit quantum impurity in a topological kagome magnet

Quantum states induced by single-atomic impurities are the current frontier of material and information science. Recently the spin-orbit coupled correlated kagome magnets are emerging as a new class of topological quantum materials, although the effect of single-atomic impurities remains unexplored. Here we use state-of-the-art scanning tunneling microscopy/spectroscopy (STM/S) to study the atomic indium impurity in a topological kagome magnet Co3Sn2S2, which is designed to support the spin-orbit quantum state. We find each impurity features a strongly localized bound state. Our systematic magnetization-polarized tunneling probe reveals its spin-down polarized nature with an unusual moment of $-5\mu_B$, indicative of additional orbital magnetization. As the separation between two impurities progressively shrinks, their respective bound states interact and form quantized molecular orbital states. The molecular orbital of three neighboring impurities further exhibits an intriguing splitting owing to the combination of geometry, magnetism, and spin-orbit coupling, analogous to the splitting of the topological Weyl fermion line. Our work demonstrates the quantum-level interplay between magnetism and spin-orbit coupling at an individual atomic impurity, which provides insights into the emergent impurity behavior in a topological kagome magnet and the potential of spin-orbit quantum impurities for information science.

preprint2020arXiv

Tensor network approach to phase transitions of a non-Abelian topological phase

The non-abelian topological phase with Fibonacci anyons minimally supports universal quantum computation. In order to investigate the possible phase transitions out of the Fibonacci topological phase, we propose a generic quantum-net wavefunction with two tuning parameters dual with each other, and the norm can be exactly mapped into a partition function of the two-coupled $ϕ^{2}$-state Potts models, where $ϕ=(\sqrt{5}+1)/2$ is the golden ratio. By developing the tensor network representation of this wavefunction on a square lattice, we can accurately calculate the full phase diagram with the numerical methods of tensor networks. More importantly, it is found that the non-abelian Fibonacci topological phase is enclosed by three distinct non-topological phases and their dual phases of a single $ϕ^{2}$-state Potts model: the gapped dilute net phase, critical dense net phase, and spontaneous translation symmetry breaking gapped phase. We also determine the critical properties of the phase transitions among the Fibonacci topological phase and those non-topological phases.
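As a short worked step (not part of the abstract itself), the number of Potts states quoted above follows directly from the defining property of the golden ratio, $ϕ^{2}=ϕ+1$:

$$ϕ^{2}=\left(\frac{\sqrt{5}+1}{2}\right)^{2}=\frac{6+2\sqrt{5}}{4}=\frac{3+\sqrt{5}}{2}=ϕ+1\approx 2.618$$

So the "$ϕ^{2}$-state" Potts model referred to here has a non-integer number of states, roughly 2.618, lying between the 2-state and 3-state Potts models.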

preprint2020arXiv

Terahertz Emission From an Exchange-Coupled Synthetic Antiferromagnet

We report on terahertz emission from FeMnPt/Ru/FeMnPt and Pt/CoFeB/Ru/CoFeB/Pt synthetic antiferromagnet (SAF) structures upon irradiation by a femtosecond laser; the former is via the anomalous Hall effect, whereas the latter is through the inverse spin Hall effect. The antiparallel alignment of the two ferromagnetic layers leads to a terahertz emission peak amplitude that is almost double that for a corresponding single-layer or bilayer emitter with the same equivalent thickness. In addition, we demonstrate by both simulation and experiment that terahertz emission provides a powerful tool to probe the magnetization reversal processes of individual ferromagnetic layers in a SAF structure, as the terahertz signal is proportional to the vector difference of the magnetizations of the two ferromagnetic layers.

preprint2020arXiv

The Non-Hermitian quantum mechanics and its canonical structure

The non-Hermitian Schrödinger equation is re-expressed generally in the form of Hamilton's canonical equation without any approximation. Its quantization, called non-Hermitian quantum field theory, is discussed. By virtue of the canonical equation, the theory of non-Hermitian quantum mechanics is totally reformulated, including the probability amplitudes of states, the expectations of operators, as well as the expressions of interaction terms. The conventional difficulties in non-Hermitian quantum mechanics are totally overcome by the reformulation. Specifically, the imaginary parts of the non-Hermitian eigenenergy and adiabatic geometric phase are actually unphysical, although they are mathematically perfect.

preprint2020arXiv

Two Step Joint Model for Drug Drug Interaction Extraction

When patients need to take medicine, particularly more than one kind of drug simultaneously, they should be alert to possible drug-drug interactions. Interaction between drugs may have a negative impact on patients or even cause death. Generally, drugs that conflict with a specific drug (or label drug) are described in its drug label or package insert. Since more and more new drug products come onto the market, it is difficult to collect such information manually. We take part in the Drug-Drug Interaction (DDI) Extraction from Drug Labels challenge of the Text Analysis Conference (TAC) 2018, choosing task1 and task2 to automatically extract DDI-related mentions and DDI relations, respectively. Instead of treating task1 as a named entity recognition (NER) task and task2 as a relation extraction (RE) task and then solving them in a pipeline, we propose a two-step joint model to detect DDIs and their related mentions jointly. A sequence tagging system (CNN-GRU encoder-decoder) first finds precipitants and then, in the second step, searches for the fine-grained Trigger of each precipitant and determines its DDI. Moreover, a rule-based model is built to determine the sub-type for pharmacokinetic interaction. Our system achieved the best results in both task1 and task2. The F-measure reaches 0.46 in task1 and 0.40 in task2.

preprint2019arXiv

Generalized Deduplication: Bounds, Convergence, and Asymptotic Properties

We study a generalization of deduplication, which enables lossless deduplication of highly similar data and show that standard deduplication with fixed chunk length is a special case. We provide bounds on the expected length of coded sequences for generalized deduplication and show that the coding has asymptotic near-entropy cost under the proposed source model. More importantly, we show that generalized deduplication allows for multiple orders of magnitude faster convergence than standard deduplication. This means that generalized deduplication can provide compression benefits much earlier than standard deduplication, which is key in practical systems. Numerical examples demonstrate our results, showing that our lower bounds are achievable, and illustrating the potential gain of using the generalization over standard deduplication. In fact, we show that even for a simple case of generalized deduplication, the gain in convergence speed is linear with the size of the data chunks.
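To make the idea above concrete, here is a minimal Python sketch of deduplicating similar (not only identical) chunks. It is an illustration under stated assumptions, not the paper's construction: the split of each chunk into a "base" and a "deviation" by masking low-order bits, and the names `split_chunk`, `generalized_dedup`, `restore`, and `dev_bits`, are all hypothetical. Setting `dev_bits=0` makes the base equal to the whole chunk, which reduces to standard fixed-length deduplication.

```python
from typing import Dict, List, Tuple


def split_chunk(chunk: bytes, dev_bits: int) -> Tuple[bytes, bytes]:
    """Hypothetical transform: the base keeps the high bits of every byte,
    the deviation keeps the low `dev_bits` bits."""
    mask = (1 << dev_bits) - 1
    base = bytes(b & ~mask & 0xFF for b in chunk)
    deviation = bytes(b & mask for b in chunk)
    return base, deviation


def generalized_dedup(
    data: bytes, chunk_len: int, dev_bits: int
) -> Tuple[List[bytes], List[Tuple[int, bytes]]]:
    """Store each distinct base once; each chunk becomes (base id, deviation)."""
    base_ids: Dict[bytes, int] = {}
    records: List[Tuple[int, bytes]] = []
    for start in range(0, len(data), chunk_len):
        base, deviation = split_chunk(data[start:start + chunk_len], dev_bits)
        base_id = base_ids.setdefault(base, len(base_ids))
        records.append((base_id, deviation))
    # Dicts preserve insertion order, so list position equals base id.
    return list(base_ids), records


def restore(bases: List[bytes], records: List[Tuple[int, bytes]]) -> bytes:
    """Losslessly rebuild the data by recombining each base with its deviation."""
    out = bytearray()
    for base_id, deviation in records:
        out.extend(b | d for b, d in zip(bases[base_id], deviation))
    return bytes(out)


if __name__ == "__main__":
    sample = bytes([16, 17, 18, 19, 16, 16, 16, 16])
    bases, records = generalized_dedup(sample, chunk_len=4, dev_bits=2)
    print(len(bases), restore(bases, records) == sample)
```

In this toy setting, two chunks that differ only in their low bits share a single stored base, whereas with `dev_bits=0` they would be stored as two separate chunks.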

preprint2019arXiv

Monopoles in non-Hermitian systems

The monopole for the geometric curvature is studied for non-Hermitian systems. We find that the monopole contains not only the exceptional points but also branch cuts. As the mathematical choice of branch cut in the complex plane is rather arbitrary, the monopole changes with the branch-cut choice. Despite this branch-cut dependence, our monopole is invariant under the $GL(l,\mathbb{C})$ gauge transformation that is inherent in non-Hermitian systems. Although our results are generic, they are presented in the context of a two-mode non-Hermitian Dirac model. A corresponding two-mode Hermitian system is also discussed to illustrate the essential difference between monopoles in Hermitian systems and non-Hermitian systems.

preprint2019arXiv

Multidimensional Variational Line Spectra Estimation

The fundamental multidimensional line spectral estimation problem is addressed using Bayesian methods. Motivated by the recently proposed variational line spectral estimation (VALSE) algorithm, multidimensional VALSE (MDVALSE) is developed. MDVALSE inherits the advantages of VALSE, such as automatically estimating the model order and noise variance and providing uncertainty degrees for the frequency estimates. In contrast to VALSE, the multidimensional frequencies of a single component are treated as a whole, and the probability density function is projected onto independent univariate von Mises distributions to perform tractable inference. In addition, for the initialization, the fast Fourier transform (FFT) is adopted to significantly reduce the computational complexity of MDVALSE. Numerical results demonstrate the effectiveness of MDVALSE compared to state-of-the-art methods.

preprint2019arXiv

Terahertz emission from anomalous Hall effect in a single-layer ferromagnet

We report on terahertz emission from a single-layer ferromagnet, which involves the generation of a backflow nonthermal charge current from the ferromagnet/dielectric interface by femtosecond laser excitation and the subsequent conversion of this charge current to a transverse transient charge current via the anomalous Hall effect, thereby generating the THz radiation. The THz emission can be enhanced or suppressed, or its polarity can even be reversed, by introducing a magnetization gradient in the thickness direction of the ferromagnet. Unlike spintronic THz emitters reported previously, it does not require an additional non-magnetic layer or Rashba interface.