Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
69 works
0 followers
29 topics
4 close collaborators

Published work

69 published items

preprint, 2026, arXiv

CacheRAG: A Semantic Caching System for Retrieval-Augmented Generation in Knowledge Graph Question Answering

The integration of Large Language Models (LLMs) with Retrieval-Augmented Generation (RAG) has significantly advanced Knowledge Graph Question Answering (KGQA). However, existing LLM-driven KGQA systems act as stateless planners, generating retrieval plans in isolation without exploiting historical query patterns: analogous to a database system that optimizes every query from scratch without a plan cache. This fundamental design flaw leads to schema hallucinations and limited retrieval coverage. We propose CacheRAG, a systematic cache-augmented architecture for LLM-based KGQA that transforms stateless planners into continual learners. Unlike traditional database plan caching (which optimizes for frequency), CacheRAG introduces three novel design principles tailored for LLM contexts: (1) Schema-agnostic user interface: A two-stage semantic parsing framework via Intermediate Semantic Representation (ISR) enables non-expert users to interact purely in natural language, while a Backend Adapter grounds the LLM with local schema context to compile executable physical queries safely. (2) Diversity-optimized cache retrieval: A two-layer hierarchical index (Domain $\rightarrow$ Aspect) coupled with Maximal Marginal Relevance (MMR) maximizes structural variety in cached examples, effectively mitigating reasoning homogeneity. (3) Bounded heuristic expansion: Deterministic depth and breadth subgraph operators with strict complexity guarantees significantly enhance retrieval recall without risking unbounded API execution. Extensive experiments on multiple benchmarks demonstrate that CacheRAG significantly outperforms state-of-the-art baselines (e.g., +13.2% accuracy and +17.5% truthfulness on the CRAG dataset).
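The diversity-optimized retrieval in the abstract above relies on Maximal Marginal Relevance (MMR). As a minimal sketch of generic MMR selection (not CacheRAG's implementation), assuming cached examples are plain embedding vectors compared by cosine similarity, and with the hypothetical names `mmr_select` and `lam` introduced only for illustration:

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def mmr_select(query, candidates, k, lam=0.5):
    """Greedily pick up to k candidate indices by Maximal Marginal Relevance.

    lam weights relevance to the query; (1 - lam) penalizes similarity
    to candidates that were already selected.
    """
    selected = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:

        def score(i):
            relevance = cosine(query, candidates[i])
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance - (1.0 - lam) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

With `lam=1.0` the selection reduces to plain relevance ranking; lowering `lam` trades relevance for variety among the chosen examples.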

preprint, 2022, arXiv

A Data-Driven Column Generation Algorithm For Bin Packing Problem in Manufacturing Industry

The bin packing problem arises widely in real logistics scenarios (e.g., packing pipelines, express delivery), where the goal is to improve packing efficiency and reduce transportation cost. In this NP-hard combinatorial optimization problem, the position and quantity of each item in the box are strictly restricted by complex constraints and special customer requirements. Existing approaches struggle to obtain the optimal solution because such rigorous constraints cannot be handled within a reasonable computational load. In this paper, to address this difficulty, packing knowledge is extracted from historical data collected from the packing pipeline of Huawei. First, by fully exploiting the relationship between historical packing records and input orders (orders to be packed), the problem is reformulated as a set cover problem. Then, two novel strategies, a constraint handling strategy and a process acceleration strategy, are applied to the classic column generation approach to solve this set cover problem. The cost of solving the pricing problem for generating new columns is high due to the complex constraints and customer requirements. The proposed constraint handling strategy exploits the historical packing records with the most negative reduced cost. The constraints are already implicitly satisfied in these historical packing records, so no further constraint evaluation is needed and computational load is saved. To further eliminate the iteration process of the column generation algorithm and accelerate the optimization process, a Learning to Price approach called Modified Pointer Network is proposed, which directly determines which historical packing records should be selected. Through experiments on real-world datasets, we show that our proposed method can improve the packing success rate and decrease the computation time simultaneously.
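The pricing step mentioned in the abstract above, choosing the historical record with the most negative reduced cost, can be sketched for a generic set cover formulation. This is an illustration under stated assumptions, not the paper's code: each record is modeled as a set of order-item ids, every column has unit cost, and the dual prices are taken as given from a restricted master problem. The function name `most_negative_reduced_cost` is hypothetical.

```python
def most_negative_reduced_cost(records, duals, cost=1.0):
    """Return (index, reduced_cost) of the best entering column, or (None, 0.0).

    records: list of sets of item ids covered by each historical packing record.
    duals:   dict mapping item id -> dual price from the restricted master LP.
    cost:    assumed unit cost per column (one bin per record).
    """
    best_idx, best_rc = None, 0.0
    for idx, items in enumerate(records):
        # Set cover reduced cost: column cost minus the duals of covered rows.
        rc = cost - sum(duals.get(item, 0.0) for item in items)
        if rc < best_rc - 1e-12:
            best_idx, best_rc = idx, rc
    return best_idx, best_rc
```

When no record has a negative reduced cost the function returns `(None, 0.0)`, which is the usual stopping condition for column generation.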

preprint, 2022, arXiv

A gap at 1 au in the disk of DI Cha A revealed by infrared interferometry

DI Cha A is a K0-type pre-main sequence star, the brightest component of a quadruple stellar system. Here we report on a detailed study of this star based on archival VLTI/MIDI and VLTI/PIONIER infrared interferometric observations, as well as optical-infrared photometric monitoring from ground-based and space-borne instruments. We determined the structure of the circumstellar disk by simultaneously fitting the interferometric visibilities and the spectral energy distribution, using both analytical models and the radiative transfer code RADMC-3D. The modeling revealed that the radial density distribution of the disk appears to have a gap between 0.21 and 3.0 au. The inner ring, whose inner size coincides with the sublimation radius, is devoid of small, submicrometer-sized dust grains. The inner edge of the outer disk features a puffed-up rim, typically seen in intermediate-mass stars. Grain growth, although less progressed, was also detected in the outer disk. The inner ring is variable at mid-infrared wavelengths on both daily and annual timescales, while the star stays remarkably constant in the optical, pointing to geometrical or accretion changes in the disk as a possible explanation for the flux variations.

preprint, 2022, arXiv

A nanodiamonds-engineered optical-fiber plasmonic interface for sensitivity-enhanced biosensing

Benefiting from excellent characteristics such as low cytotoxicity, functionalization versatility, and tunable fluorescence, nanodiamonds (NDs) have shown enormous application potential in the biomedical field. Herein, we propose, for the first time to the best of our knowledge, to integrate NDs on a plasmonic interface constructed on a side-polished fiber using a drop-casting method. The added NDs engineer the plasmonic interface towards improving the sensing field, thus enhancing the sensitivity, which, moreover, depends significantly on the number of drop-casting cycles (DCs) and the concentration of the NDs dispersion solution. Experimental results suggest that properly increasing the NDs dispersion concentration helps obtain a higher sensitivity with fewer DCs, but an excessive concentration severely deteriorates the resonance dip. Experimentally, using the optimal 0.2 mg/mL concentration and 3 DCs, we achieve the highest RI sensitivity of 3582 nm/RIU, an enhancement of 73.8% compared to the case without NDs modification. The sensitivity enhancement in biosensing is also demonstrated using bovine serum albumin as a demo. The underlying mechanism is explored via characterizations and simulations. This work opens up a new application form for NDs, i.e., integrating NDs with a plasmonic interface towards high-performance biosensing.

preprint, 2022, arXiv

A new method controlling the error probability for detecting the photon-number-splitting attack in the decoy-state quantum key distribution

The existing decoy-state quantum key distribution (QKD) approach to beating the photon-number-splitting (PNS) attack provides a more accurate method to estimate the secure key rate, but it still assumes that only single-photon pulses can generate secure keys in any case. However, multiphoton pulses can also generate secure keys if we can confirm that there is no attack. In this paper, under the null hypothesis of no PNS attack, we first determine whether there is an attack by retrieving the missing information of the existing decoy-state protocols, extract a Cauchy distribution statistic, and further provide a detection method and the Type I error probability. If the result is judged to be an attack, we can use the existing decoy-state method and the GLLP formula to estimate the secure key rate. Otherwise, all received pulses, including both single-photon and multiphoton pulses, can be used to generate keys, and we give the secure key rate in this case. Finally, the associated experiments we performed (at a significance level of $5\%$) show the correctness of our method.
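To make the link between a Cauchy-distributed statistic and a controlled Type I error concrete, here is a minimal sketch. It assumes a standard Cauchy statistic (location 0, scale 1) and a two-sided test; the abstract does not state the exact statistic, its scaling, or the sidedness, so all three are assumptions, and the function names are hypothetical.

```python
import math


def cauchy_two_sided_threshold(alpha):
    """Critical value c with P(|X| > c) = alpha for a standard Cauchy X.

    Uses the standard Cauchy CDF F(x) = 1/2 + arctan(x) / pi.
    """
    return math.tan(math.pi * (1.0 - alpha) / 2.0)


def reject_no_attack(statistic, alpha=0.05):
    """Reject the null hypothesis (no PNS attack) at Type I error level alpha."""
    return abs(statistic) > cauchy_two_sided_threshold(alpha)
```

At `alpha = 0.05` the threshold is about 12.71, far larger than the 1.96 of a normal statistic, which reflects the heavy tails of the Cauchy distribution.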

preprint, 2022, arXiv

An I/O-Efficient Disk-based Graph System for Scalable Second-Order Random Walk of Large Graphs

Random walk is widely used in many graph analysis tasks, especially the first-order random walk. However, as a simplification of real-world problems, the first-order random walk is poor at modeling higher-order structures in the data. Recently, second-order random walk-based applications (e.g., Node2vec, Second-order PageRank) have become attractive. Due to the complexity of the second-order random walk models and memory limitations, it is not scalable to run second-order random walk-based applications on a single machine. Existing disk-based graph systems are only friendly to the first-order random walk models and suffer from expensive disk I/Os when executing the second-order random walks. This paper introduces an I/O-efficient disk-based graph system for the scalable second-order random walk of large graphs, called GraSorw. First, to eliminate massive light vertex I/Os, we develop a bi-block execution engine that converts random I/Os into sequential I/Os by applying a new triangular bi-block scheduling strategy, the bucket-based walk management, and the skewed walk storage. Second, to improve the I/O utilization, we design a learning-based block loading model to leverage the advantages of the full-load and on-demand load methods. Finally, we conducted extensive experiments on six large real datasets as well as several synthetic datasets. The empirical results demonstrate that the end-to-end time cost of popular tasks in GraSorw is reduced by more than one order of magnitude compared to the existing disk-based graph systems.
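For readers unfamiliar with second-order walks, the sketch below shows the Node2vec transition rule named in the abstract above, where the next step depends on both the current and the previous vertex. It is an in-memory illustration only, not GraSorw; the graph is assumed to be an unweighted dict of neighbor sets, and `p` and `q` are Node2vec's return and in-out parameters.

```python
import random


def second_order_weights(graph, prev, cur, p=1.0, q=1.0):
    """Unnormalized Node2vec transition weights from cur, given the previous vertex."""
    weights = {}
    for nxt in graph[cur]:
        if nxt == prev:
            weights[nxt] = 1.0 / p  # return to the previous vertex
        elif nxt in graph[prev]:
            weights[nxt] = 1.0      # stay within distance 1 of prev
        else:
            weights[nxt] = 1.0 / q  # move farther away from prev
    return weights


def step(graph, prev, cur, p=1.0, q=1.0, rng=random):
    """Sample the next vertex of a second-order walk."""
    weights = second_order_weights(graph, prev, cur, p, q)
    nodes = list(weights)
    return rng.choices(nodes, weights=[weights[n] for n in nodes], k=1)[0]
```

The lookup `nxt in graph[prev]` needs the adjacency list of the previous vertex as well as the current one, which is why each step touches two neighborhoods rather than one.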

preprint, 2022, arXiv

Arbitrary Shape Text Detection via Segmentation with Probability Maps

Arbitrary shape text detection is a challenging task due to the significantly varied sizes and aspect ratios, arbitrary orientations or shapes, inaccurate annotations, etc. Due to the scalability of pixel-level prediction, segmentation-based methods can adapt to texts of various shapes and hence have attracted considerable attention recently. However, accurate pixel-level annotations of texts are formidable to obtain, and the existing datasets for scene text detection only provide coarse-grained boundary annotations. Consequently, numerous misclassified text pixels or background pixels inside annotations always exist, degrading the performance of segmentation-based text detection methods. Generally speaking, whether a pixel belongs to text or not is highly related to its distance from the adjacent annotation boundary. With this observation, in this paper, we propose an innovative and robust segmentation-based detection method via probability maps for accurately detecting text instances. To be concrete, we adopt a Sigmoid Alpha Function (SAF) to transfer the distances between boundaries and their inside pixels to a probability map. However, one probability map cannot cover complex probability distributions well because of the uncertainty of coarse-grained text boundary annotations. Therefore, we adopt a group of probability maps computed by a series of Sigmoid Alpha Functions to describe the possible probability distributions. In addition, we propose an iterative model to learn to predict and assimilate probability maps for providing enough information to reconstruct text instances. Finally, simple region growth algorithms are adopted to aggregate probability maps to complete text instances. Experimental results demonstrate that our method achieves state-of-the-art performance in terms of detection accuracy on several benchmarks.

preprint, 2022, arXiv

Bridge-Prompt: Towards Ordinal Action Understanding in Instructional Videos

Action recognition models have shown a promising capability to classify human actions in short video clips. In a real scenario, multiple correlated human actions commonly occur in particular orders, forming semantically meaningful human activities. Conventional action recognition approaches focus on analyzing single actions. However, they fail to fully reason about the contextual relations between adjacent actions, which provide potential temporal logic for understanding long videos. In this paper, we propose a prompt-based framework, Bridge-Prompt (Br-Prompt), to model the semantics across adjacent actions, so that it simultaneously exploits both out-of-context and contextual information from a series of ordinal actions in instructional videos. More specifically, we reformulate the individual action labels as integrated text prompts for supervision, which bridge the gap between individual action semantics. The generated text prompts are paired with corresponding video clips, and together co-train the text encoder and the video encoder via a contrastive approach. The learned vision encoder has a stronger capability for ordinal-action-related downstream tasks, e.g. action segmentation and human activity recognition. We evaluate the performances of our approach on several video datasets: Georgia Tech Egocentric Activities (GTEA), 50Salads, and the Breakfast dataset. Br-Prompt achieves state-of-the-art on multiple benchmarks. Code is available at https://github.com/ttlmh/Bridge-Prompt

preprint, 2022, arXiv

Conceptual design of the Spin Physics Detector

The Spin Physics Detector, a universal facility for studying the nucleon spin structure and other spin-related phenomena with polarized proton and deuteron beams, is proposed to be placed at one of the two interaction points of the NICA collider that is under construction at the Joint Institute for Nuclear Research (Dubna, Russia). The project builds on extensive experience with polarized beams at JINR. The main objective of the proposed experiment is the comprehensive study of the unpolarized and polarized gluon content of the nucleon. Spin measurements at the Spin Physics Detector at the NICA collider have bright prospects to make a unique contribution and challenge our understanding of the spin structure of the nucleon. In this document the Conceptual Design of the Spin Physics Detector is presented.

preprint, 2022, arXiv

Cost-Effective Algorithms for Average-Case Interactive Graph Search

Interactive graph search (IGS) uses human intelligence to locate the target node in a hierarchy, which can be applied to image classification, product categorization, and database search. Specifically, IGS aims to categorize an object from a given category hierarchy via several rounds of interactive queries. In each round of query, the search algorithm picks a category and receives a boolean answer on whether the object is under the chosen category. The main efficiency goal asks for the minimum number of queries to identify the correct hierarchical category for the object. In this paper, we study the average-case interactive graph search (AIGS) problem that aims to minimize the expected number of queries when the objects follow a probability distribution. We propose a greedy search policy that splits the candidate categories as evenly as possible with respect to the probability weights, which offers an approximation guarantee of $O(\log n)$ for AIGS when the category hierarchy is a directed acyclic graph (DAG), where $n$ is the total number of categories. Meanwhile, if the input hierarchy is a tree, we show that a constant approximation factor of $(1+\sqrt{5})/2$ can be achieved. Furthermore, we present efficient implementations of the greedy policy, namely GreedyTree and GreedyDAG, that can quickly categorize the object in practice. Extensive experiments in real-world scenarios are carried out to demonstrate the superiority of our proposed methods.
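The greedy policy described above, which splits the candidate categories as evenly as possible by probability weight, can be illustrated on a tree with a deliberately naive sketch. This is not the paper's GreedyTree implementation: it recomputes subtree weights from scratch every round, and it assumes the hierarchy is a dict of child lists, node ids are sortable, and `oracle(v)` answers whether the target lies in the subtree of `v`. All names are hypothetical.

```python
def subtree(children, root):
    """Return the set of nodes in the subtree rooted at root."""
    stack, seen = [root], set()
    while stack:
        node = stack.pop()
        seen.add(node)
        stack.extend(children.get(node, []))
    return seen


def greedy_search(children, root, prob, oracle):
    """Locate the target category; returns (target, number_of_queries)."""
    candidates = subtree(children, root)
    cur_root, queries = root, 0
    while len(candidates) > 1:
        total = sum(prob[v] for v in candidates)
        best, best_gap, best_sub = None, None, None
        for v in sorted(candidates):
            if v == cur_root:
                continue  # querying the current root reveals nothing
            sub = subtree(children, v) & candidates
            # Gap between the "yes" mass and the "no" mass for this query.
            gap = abs(2.0 * sum(prob[u] for u in sub) - total)
            if best is None or gap < best_gap:
                best, best_gap, best_sub = v, gap, sub
        queries += 1
        if oracle(best):
            candidates, cur_root = best_sub, best
        else:
            candidates = candidates - best_sub
    return next(iter(candidates)), queries
```

Each query removes at least one candidate, so the loop always terminates; an efficient implementation would maintain subtree weights incrementally instead of recomputing them.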

preprint, 2022, arXiv

Efficient k-clique Listing with Set Intersection Speedup [Technical Report]

Listing all k-cliques is a fundamental problem in graph mining, with applications in finance, biology, and social network analysis. However, owing to the exponential growth of the search space as k increases, listing all k-cliques is algorithmically challenging. DDegree and DDegCol are the state-of-the-art algorithms that exploit ordering heuristics based on degree ordering and color ordering, respectively. Both DDegree and DDegCol incur high time and space overhead for set intersections because they construct and maintain all induced subgraphs. Meanwhile, it is non-trivial to implement data-level parallelism to further accelerate DDegree and DDegCol. In this paper, we propose two efficient algorithms, SDegree and BitCol, for k-clique listing. We mainly focus on accelerating the set intersections for k-clique listing. Both SDegree and BitCol exploit data-level parallelism for further acceleration with single instruction multiple data (SIMD) or vector instruction sets. Furthermore, we propose two preprocessing techniques, Pre-Core and Pre-List, which run in linear time. The preprocessing techniques significantly reduce the size of the original graph and prevent exploring a large number of invalid nodes. In the theoretical analysis, our algorithms have a comparable time complexity and a slightly lower space complexity than the state-of-the-art algorithms. The comprehensive experiments reveal that our algorithms outperform the state-of-the-art algorithms by 3.75x for degree ordering and 5.67x for color ordering on average.
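The role of set intersection in k-clique listing can be shown with a small sketch that represents candidate sets as bitsets. This is neither SDegree nor BitCol: Python integers stand in for SIMD bitmaps, the code counts cliques rather than listing them, and it assumes a simple undirected graph on vertices `0..n-1` with no self-loops and `k >= 1`. The function name `count_k_cliques` is hypothetical.

```python
def count_k_cliques(n, edges, k):
    """Count k-cliques using a degree ordering and bitset intersections."""
    degree = [0] * n
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # Rank vertices by (degree, id); orient each edge from low to high rank.
    order = sorted(range(n), key=lambda v: (degree[v], v))
    rank = {v: i for i, v in enumerate(order)}
    out = [0] * n  # out[v]: bitset (indexed by rank) of higher-ranked neighbors
    for u, v in edges:
        if rank[u] > rank[v]:
            u, v = v, u
        out[u] |= 1 << rank[v]

    def extend(cands, remaining):
        if remaining == 1:
            return bin(cands).count("1")
        total = 0
        while cands:
            low = cands & -cands          # lowest set bit
            cands ^= low
            v = order[low.bit_length() - 1]
            # One AND replaces an explicit set intersection.
            total += extend(cands & out[v], remaining - 1)
        return total

    return extend((1 << n) - 1, k)
```

Because every edge points from lower to higher rank, each clique is discovered exactly once, in rank order.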

preprint, 2022, arXiv

Electron correlations and charge density wave in the topological kagome metal FeGe

Charge order in kagome metals is of extensive current interest. Recently, a charge density wave was discovered in the magnetic binary kagome metal FeGe. In analogy to its predecessor, the non-magnetic $A$V$_3$Sb$_5$ ($A$=K, Cs, Rb), the in-plane ordering occurs at the $M$ point. In contrast, however, the system manifestly shows effects of substantial correlations. Here we identify the topological bands crossing the Fermi energy (E$_F$) in FeGe and characterize the correlation-induced renormalization of these bands. We then derive a charge order from an effective model comprising topological kagome `flat' bands in the presence of a magnetic order. We demonstrate edge states as well as excess out-of-plane magnetic moment associated with the charge order; both are fingerprints of non-trivial band topology and consistent with recent experimental observations. Our results point to FeGe as an ideal platform to realize and elucidate correlated topological physics.

preprint, 2022, arXiv

GridTuner: Reinvestigate Grid Size Selection for Spatiotemporal Prediction Models [Technical Report]

With the development of traffic prediction technology, spatiotemporal prediction models have attracted more and more attention from academia and industry. However, most existing research focuses on reducing the model's prediction error but ignores the error caused by the uneven distribution of spatial events within a region. In this paper, we study a region partitioning problem, namely the optimal grid size selection problem (OGSS), which aims to minimize the real error of spatiotemporal prediction models by selecting the optimal grid size. To solve OGSS, we analyze the upper bound of the real error of spatiotemporal prediction models and minimize the real error by minimizing its upper bound. Through in-depth analysis, we find that the upper bound of the real error first decreases and then increases as the number of model grids increases from 1 to the maximum allowed value. Then, we propose two algorithms, namely Ternary Search and Iterative Method, to automatically find the optimal grid size. Finally, the experiments verify that the prediction error follows the same trend as its upper bound, and that the upper bound of the real error first decreases and then increases as the number of model grids grows. Meanwhile, in a case study, by selecting the optimal grid size, the order dispatching results of a state-of-the-art prediction-based algorithm can be improved by up to 13.6%, which shows the effectiveness of our methods in tuning the region partition for spatiotemporal prediction models.
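Because the abstract above reports that the error bound first decreases and then increases with the number of grids, a ternary search over integers is a natural fit. The following is a generic sketch, not GridTuner's code: it assumes the objective is strictly unimodal on the search range, and `f` stands for a hypothetical error-bound function supplied by the caller.

```python
def ternary_search_min(f, lo, hi):
    """Return an integer in [lo, hi] minimizing a strictly unimodal f."""
    while hi - lo > 2:
        third = (hi - lo) // 3
        m1, m2 = lo + third, hi - third
        if f(m1) < f(m2):
            hi = m2 - 1  # the minimum lies strictly left of m2
        else:
            lo = m1 + 1  # the minimum lies strictly right of m1
    # At most three candidates remain; check them directly.
    return min(range(lo, hi + 1), key=f)
```

Each iteration discards roughly a third of the range, so the number of evaluations of `f` grows logarithmically with the maximum allowed grid count.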

preprint, 2022, arXiv

Hot 2DHG states in tellurium

The elemental semiconductor Te is very popular in both fundamental electronic structure studies and device fabrication research due to its unique band structure. Specifically, at low temperatures, Te exhibits strong quantum oscillations with the magnetic field applied in the basal plane, following either the Shubnikov-de Haas (SdH) oscillation rule or the log-periodic oscillation rule. With the magnetic field applied along the [001] direction, the SdH oscillations are attributed to the two-dimensional hole gas (2DHG) surface states. Here we report an interesting SdH oscillation in Te-based single crystals, with the magnetic field applied along the [001] direction of the crystals, showing the maximum oscillation intensity at ~75 K and still traceable at 200 K, which indicates a rather hot 2DHG state. A nontrivial Berry phase can also be obtained from the oscillations, implying a contribution from topological states. More importantly, the high-temperature SdH oscillation phenomena are observed in different Te single-crystal samples, and in Te single crystals with nonmagnetic/magnetic dopants, showing robustness to bulk defects. Therefore, the oscillation may be contributed by the bulk symmetry protected hot 2DHG states, which will offer a new platform for high-temperature quantum transport studies.

preprint, 2022, arXiv

Hyperlink-induced Pre-training for Passage Retrieval in Open-domain Question Answering

To alleviate the data scarcity problem in training question answering systems, recent works propose additional intermediate pre-training for dense passage retrieval (DPR). However, there still remains a large discrepancy between the provided upstream signals and the downstream question-passage relevance, which leads to less improvement. To bridge this gap, we propose the HyperLink-induced Pre-training (HLP), a method to pre-train the dense retriever with the text relevance induced by hyperlink-based topology within Web documents. We demonstrate that the hyperlink-based structures of dual-link and co-mention can provide effective relevance signals for large-scale pre-training that better facilitate downstream passage retrieval. We investigate the effectiveness of our approach across a wide range of open-domain QA datasets under zero-shot, few-shot, multi-hop, and out-of-domain scenarios. The experiments show our HLP outperforms the BM25 by up to 7 points as well as other pre-training methods by more than 10 points in terms of top-20 retrieval accuracy under the zero-shot scenario. Furthermore, HLP significantly outperforms other pre-training methods under the other scenarios.
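The top-20 retrieval accuracy quoted above is commonly computed as the fraction of questions for which at least one relevant passage appears among the top k retrieved results. A minimal sketch under that assumption follows; the input layout (ranked passage ids per question, plus a set of relevant ids per question) and the function name `top_k_accuracy` are hypothetical.

```python
def top_k_accuracy(ranked_ids, relevant_ids, k=20):
    """Fraction of questions with at least one relevant passage in the top k.

    ranked_ids:   list of ranked passage-id lists, one per question.
    relevant_ids: list of sets of relevant passage ids, one per question.
    """
    if not ranked_ids:
        return 0.0
    hits = sum(
        1
        for ranked, relevant in zip(ranked_ids, relevant_ids)
        if any(pid in relevant for pid in ranked[:k])
    )
    return hits / len(ranked_ids)
```

A gain of "7 points" in this metric then means an increase of 0.07 in the returned fraction.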

preprint, 2022, arXiv

Improving Sequential Latent Variable Models with Autoregressive Flows

We propose an approach for improving sequence modeling based on autoregressive normalizing flows. Each autoregressive transform, acting across time, serves as a moving frame of reference, removing temporal correlations, and simplifying the modeling of higher-level dynamics. This technique provides a simple, general-purpose method for improving sequence modeling, with connections to existing and classical techniques. We demonstrate the proposed approach both with standalone flow-based models and as a component within sequential latent variable models. Results are presented on three benchmark video datasets, where autoregressive flow-based dynamics improve log-likelihood performance over baseline models. Finally, we illustrate the decorrelation and improved generalization properties of using flow-based dynamics.

preprint, 2022, arXiv

Investigating Accuracy-Novelty Performance for Graph-based Collaborative Filtering

Recent years have witnessed the strong accuracy of graph-based Collaborative Filtering (CF) models for recommender systems. By treating user-item interaction behavior as a graph, these graph-based CF models borrow the success of Graph Neural Networks (GNN) and iteratively perform neighborhood aggregation to propagate the collaborative signals. While conventional CF models are known to face the challenge of popularity bias, which favors popular items, one may wonder: "Do the existing graph-based CF models alleviate or exacerbate the popularity bias of recommender systems?" To answer this question, we first investigate the two-fold performance w.r.t. accuracy and novelty for existing graph-based CF methods. The empirical results show that the symmetric neighborhood aggregation adopted by most existing graph-based CF models exacerbates the popularity bias, and this phenomenon becomes more serious as the depth of graph propagation increases. Further, we theoretically analyze the cause of popularity bias for graph-based CF. Then, we propose a simple yet effective plugin, namely r-AdjNorm, to achieve an accuracy-novelty trade-off by controlling the normalization strength in the neighborhood aggregation process. Meanwhile, r-AdjNorm can be smoothly applied to the existing graph-based CF backbones without additional computation. Finally, experimental results on three benchmark datasets show that our proposed method can improve novelty without sacrificing accuracy under various graph-based CF backbones.

preprint, 2022, arXiv

KMIR: A Benchmark for Evaluating Knowledge Memorization, Identification and Reasoning Abilities of Language Models

Previous works show the great potential of pre-trained language models (PLMs) for storing a large amount of factual knowledge. However, to figure out whether PLMs can be reliable knowledge sources and used as alternative knowledge bases (KBs), we need to further explore some critical features of PLMs. Firstly, knowledge memorization and identification abilities: traditional KBs can store various types of entities and relationships; do PLMs have a high knowledge capacity to store different types of knowledge? Secondly, reasoning ability: a qualified knowledge source should not only provide a collection of facts, but also support a symbolic reasoner. Can PLMs derive new knowledge based on the correlations between facts? To evaluate these features of PLMs, we propose a benchmark, named Knowledge Memorization, Identification, and Reasoning test (KMIR). KMIR covers 3 types of knowledge, including general knowledge, domain-specific knowledge, and commonsense, and provides 184,348 well-designed questions. Preliminary experiments with various representative pre-trained language models on KMIR reveal many interesting phenomena: 1) The memorization ability of PLMs depends more on the number of parameters than on training schemes. 2) Current PLMs struggle to robustly remember the facts. 3) Model compression technology retains the amount of knowledge well, but hurts the identification and reasoning abilities. We hope KMIR can facilitate the design of PLMs as better knowledge sources.

preprint, 2022, arXiv

MATrIX -- Modality-Aware Transformer for Information eXtraction

We present MATrIX - a Modality-Aware Transformer for Information eXtraction in the Visual Document Understanding (VDU) domain. VDU covers information extraction from visually rich documents such as forms, invoices, receipts, tables, graphs, presentations, or advertisements. In these, text semantics and visual information supplement each other to provide a global understanding of the document. MATrIX is pre-trained in an unsupervised way with specifically designed tasks that require the use of multi-modal information (spatial, visual, or textual). We consider the spatial and text modalities all at once in a single token set. To make the attention more flexible, we use a learned modality-aware relative bias in the attention mechanism to modulate the attention between the tokens of different modalities. We evaluate MATrIX on 3 different datasets each with strong baselines.

preprint, 2022, arXiv

Modeling mandatory and discretionary lane changes using dynamic interaction networks

A quantitative understanding of dynamic lane-changing (LC) interaction patterns is indispensable for improving the decision-making of autonomous vehicles, especially in mixed traffic with human-driven vehicles. This paper develops a novel framework combining the hidden Markov model and graph structure to identify the difference in dynamic interaction networks between mandatory lane changes (MLC) and discretionary lane changes (DLC). A hidden Markov model is developed to decompose LC interactions into homogenous segments and reveal the temporal properties of these segments. Then, conditional mutual information is used to quantify the interaction intensity, and the graph structure is used to characterize the connectivity between vehicles. Finally, the critical vehicle in each dynamic interaction network is identified. Based on the LC events extracted from the INTERACTION dataset, the proposed analytical framework is applied to modeling MLC and DLC under congested traffic with levels of service E and F. The results show that there are multiple heterogeneous dynamic interaction network structures in an LC process. A comparison of MLC and DLC demonstrates that MLC are more complex, while DLC are more random. The complexity of MLC is attributed to the intense interaction and frequent transition of the interaction network structure, while the random DLC demonstrate no obvious evolution rules and dominant vehicles in interaction networks. The findings in this study are useful for understanding the connectivity structure between vehicles in LC interactions, and for designing appropriate and well-directed driving decision-making models for autonomous vehicles and advanced driver-assistance systems.

preprint, 2022, arXiv

Nanodiamonds based optical-fiber quantum probe for magnetic field and biological sensing

Owing to their unique electronic spin properties, the nitrogen-vacancy (NV) centers hosted in diamond have emerged as a powerful quantum sensor for various physical parameters and biological species. In this work, a miniature optical-fiber quantum probe, configured by chemically modifying nanodiamond NV centers on the surface of a cone fiber tip, is developed. Based on the continuous-wave optically detected magnetic resonance method and the lock-in amplifying technique, it is found that the sensing performance of the probe can be engineered by varying the nanodiamond dispersion concentration and modification duration in the chemical modification process. Combined with a pair of magnetic flux concentrators, the magnetic field detection sensitivity of the probe is significantly enhanced to 0.57 nT/Hz$^{1/2}$ at 1 Hz, a new record among fiber magnetometers based on nanodiamond NV centers. Taking Gd$^{3+}$ as a demo, the capability of the probe in paramagnetic species detection is also demonstrated experimentally. Our work provides a new approach to developing NV centers as quantum probes featuring high integration, miniature size, multifunctionality, and high sensitivity.

preprint, 2022, arXiv

Nielsen Realization for sphere twists on 3-manifolds

For a 3-manifold M, the twist group Twist(M) is the subgroup of the mapping class group Mod(M) generated by twists about embedded 2-spheres. We study the Nielsen realization problem for subgroups of Twist(M). We prove that a nontrivial subgroup G<Twist(M) is realized by diffeomorphisms if and only if G is cyclic and M is a connected sum of lens spaces. We also apply our methods to the Burnside problem for 3-manifolds and show that Diff(M) does not contain an infinite torsion group when M is reducible and not a connected sum of lens spaces.

preprint2022arXiv

Non-realizability of some big mapping class groups

In this note, we prove that the compactly supported mapping class group of a surface containing a genus $3$ subsurface has no realization as a subgroup of the homeomorphism group. We also prove that for certain surfaces with order $6$ symmetries, their mapping class groups have no realization as a subgroup of the homeomorphism group. Examples of such surfaces include the plane minus a Cantor set and the sphere minus a Cantor set.

preprint2022arXiv

Orbital-selective Mott phase as a dehybridization fixed point

Studies on the iron-based superconductors and related strongly correlated systems have focused attention on the bad-metal normal state in proximity to antiferromagnetic order. An orbital-selective Mott phase (OSMP) has been extensively discussed as anchoring the orbital-selective correlation phenomena in this regime. Motivated by recent experiments, we advance the notion that an OSMP is synonymous with correlation-driven dehybridization. This idea is developed in terms of a competition between inter-orbital hopping and dynamical spatial spin correlations. Within effective models that arise from extended dynamical mean-field theory (EDMFT), and using a combination of continuous-time quantum Monte Carlo and analytical methods, we show how the OSMP emerges as a stable dehybridization fixed point. Concomitantly, the stability of the OSMP is demonstrated. Connections of this mechanism with the partial localization-delocalization transition in other strongly correlated metals are discussed.

preprint2022arXiv

PSP: Million-level Protein Sequence Dataset for Protein Structure Prediction

Proteins are essential components of human life, and their structures are important for function and mechanism analysis. Recent work has shown the potential of AI-driven methods for protein structure prediction. However, the development of new models is restricted by the lack of datasets and benchmark training procedures. To the best of our knowledge, the existing open-source datasets fall far short of the needs of modern protein sequence-structure research. To solve this problem, we present the first million-level protein structure prediction dataset with high coverage and diversity, named PSP. This dataset consists of 570k true structure sequences (10TB) and 745k complementary distillation sequences (15TB). In addition, we provide the benchmark training procedure for a SOTA protein structure prediction model on this dataset. We validate the utility of this dataset for training by participating in the CAMEO contest, in which our model won first place. We hope our PSP dataset, together with the training benchmark, can enable a broader community of AI/biology researchers to pursue AI-driven protein-related research.

preprint2022arXiv

Scene-adaptive Knowledge Distillation for Sequential Recommendation via Differentiable Architecture Search

Sequential recommender systems (SRS) have become a research hotspot due to their power in modeling users' dynamic interests and sequential behavioral patterns. To maximize model expressive ability, a default choice is to apply a larger and deeper network architecture, which, however, often brings high network latency when generating online recommendations. We therefore argue that compressing heavy recommendation models into middle- or light-weight neural networks is of great importance for practical production systems. To realize this goal, we propose AdaRec, a knowledge distillation (KD) framework that compresses the knowledge of a teacher model into a student model adaptively according to its recommendation scene by using differentiable Neural Architecture Search (NAS). Specifically, we introduce a target-oriented distillation loss to guide the structure search process for finding the student network architecture, and a cost-sensitive loss as a constraint on model size, which achieves a superior trade-off between recommendation effectiveness and efficiency. In addition, we leverage Earth Mover's Distance (EMD) to realize many-to-many layer mapping during knowledge distillation, which enables each intermediate student layer to learn from other intermediate teacher layers adaptively. Extensive experiments on real-world recommendation datasets demonstrate that our model achieves competitive or better accuracy with a notable inference speedup compared to strong counterparts, while discovering diverse neural architectures for sequential recommender models under different recommendation scenes.

preprint2022arXiv

Security measurement of a medical communication scheme based on chaos and DNA coding

To encrypt sensitive information in color DICOM images, a medical privacy protection scheme (called MPPS) based on chaos and DNA coding was proposed, using two coupled chaotic systems to produce cryptographic primitives. Relying on some empirical analyses and experimental results, the designers of MPPS claimed that it can withstand a chosen-plaintext attack and some other classic attack models. However, this statement is groundless. In this paper, we investigate the essential properties of MPPS and DNA coding, and we then propose an efficient chosen-plaintext attack to disclose its equivalent secret key. The attack only needs $\lceil \log_{256}(3\cdot M\cdot N)\rceil+4$ pairs of chosen plain-images and the corresponding cipher-images, where $M \times N$ and 3 are the size of the RGB color image and the number of color channels, respectively. In addition, the other claimed superiorities are questioned from the perspective of modern cryptography. Both theoretical and experimental results are presented to support the efficiency of the proposed attack and the other reported security faults. The proposed cryptanalysis results will promote the proper application of DNA encoding to protect multimedia privacy data, especially that in a DICOM image.
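To give a feel for how small the attack's data requirement is, the following sketch evaluates the stated bound $\lceil \log_{256}(3\cdot M\cdot N)\rceil+4$ for a few image sizes. The function name `chosen_plain_images` and the example sizes are assumptions made for illustration; only the formula itself comes from the abstract. Integer arithmetic is used so that the ceiling is not affected by floating-point rounding.

```python
def chosen_plain_images(m: int, n: int, channels: int = 3, base: int = 256) -> int:
    """Evaluate ceil(log_base(channels * m * n)) + 4 with exact integer arithmetic."""
    total = channels * m * n
    if total < 1:
        raise ValueError("the image must contain at least one sample")
    exponent = 0
    power = 1
    # The smallest exponent with base**exponent >= total equals ceil(log_base(total)).
    while power < total:
        power *= base
        exponent += 1
    return exponent + 4


# Hypothetical image sizes, chosen only to illustrate the growth of the bound.
for m, n in [(256, 256), (512, 512), (4096, 4096)]:
    print(f"{m}x{n}: {chosen_plain_images(m, n)} pairs of chosen plain-images")
```

Because the bound grows logarithmically in the number of pixels, going from a 512x512 image to a 4096x4096 image adds only one pair.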

preprint2022arXiv

SideRT: A Real-time Pure Transformer Architecture for Single Image Depth Estimation

Since context modeling is critical for estimating depth from a single image, researchers have put tremendous effort into obtaining global context. Many global manipulations are designed for traditional CNN-based architectures to overcome the locality of convolutions. Attention mechanisms or transformers, originally designed for capturing long-range dependencies, might be a better choice, but they usually complicate architectures and could lead to a decrease in inference speed. In this work, we propose a pure transformer architecture called SideRT that can attain excellent predictions in real-time. In order to capture better global context, Cross-Scale Attention (CSA) and Multi-Scale Refinement (MSR) modules are designed to work collaboratively to fuse features of different scales efficiently. CSA modules focus on fusing features of high semantic similarity, while MSR modules aim to fuse features at corresponding positions. These two modules contain a few learnable parameters without convolutions, based on which a lightweight yet effective model is built. This architecture achieves state-of-the-art performance in real-time (51.3 FPS) and becomes much faster with a reasonable performance drop on a smaller backbone, Swin-T (83.1 FPS). Furthermore, its performance surpasses the previous state-of-the-art by a large margin, improving the AbsRel metric by 6.9% on KITTI and 9.7% on NYU. To the best of our knowledge, this is the first work to show that transformer-based networks can attain state-of-the-art performance in real-time in the single image depth estimation field. Code will be made available soon.

preprint2022arXiv

Structure theorems for actions of homeomorphism groups

We give general classification and structure theorems for actions of groups of homeomorphisms and diffeomorphisms on manifolds, reminiscent of classical results for actions of (locally) compact groups. This gives a negative answer to Ghys' "extension problem" for diffeomorphisms of manifolds with boundary, as well as a classification of all homomorphisms $\mathrm{Homeo}_0(M) \to \mathrm{Homeo}_0(N)$ when dim(M) = dim(N) (and related results for diffeomorphisms), and a complete classification of actions of $\mathrm{Homeo}_0(S^1)$ on surfaces. This resolves many problems in a program initiated by Ghys, and gives definitive answers to conjectures of Militon and Hurtado and a question of Rubin.

preprint2022arXiv

The rationality about the assumption that the signal and decoy states are indistinguishable in decoy-state quantum key distribution

Decoy-state quantum key distribution (QKD) has become the most efficient method to resist the photon-number-splitting (PNS) attack and estimate the secure key rate. The decoy-state method relies on many assumptions, among which a critical one is that an eavesdropper (Eve) cannot distinguish between the signal and decoy states. However, a rigorous proof of the rationality of this assumption is not yet available. In fact, due to the difference in photon-number probability distribution between the signal and decoy states, Eve is able to distinguish the two states with a certain probability. In this work, we adopt the Bayesian decision to distinguish the signal and decoy states in one-decoy-state QKD, and perform different PNS attack strategies for the two states according to that decision. The numerical simulations indicate that the attack has no obvious effect or even fails. Thus, it is reasonable to assume that the signal and decoy states are indistinguishable in decoy-state QKD. In addition, we provide a method to set the intensities of the signal and decoy states properly, which can not only reduce the preparation cost and improve the communication efficiency, but also avoid an attack in which Eve exploits the intensity difference between the signal and decoy states.

preprint2022arXiv

Topological semimetals without quasiparticles

The interplay between interactions and topology in quantum materials is of extensive current interest. Strong correlations are known to be important for insulating topological states, as exemplified by the fractional quantum Hall effect. For the metallic case, whether and how they can drive topological states that have no free-electron counterparts is an open and pressing question. We introduce a general framework for lattice symmetries to constrain single-particle excitations even when they are not quasiparticles, and substantiate it in a periodic Anderson model with two channels of conduction electrons. We demonstrate that symmetry constrains correlation-induced emergent excitations to produce non-Fermi liquid topological phases. The loss of quasiparticles in these phases is manifested in a non-Fermi liquid form of spectral and transport properties, whereas its topological nature is characterized by surface states and valley and spin Hall conductivities. We also identify candidate materials to realize the proposed phases. Our work opens a door to a variety of non-Fermi liquid topological phases in a broad range of strongly correlated materials.

preprint2022arXiv

Use of Transmission and Reflection Complex Time Delays to Reveal Scattering Matrix Poles and Zeros: Example of the Ring Graph

We identify the poles and zeros of the scattering matrix of a simple quantum graph by means of systematic measurement and analysis of Wigner, transmission, and reflection complex time delays. We examine the ring graph because it displays both shape and Feshbach resonances, the latter of which arises from an embedded eigenstate on the real frequency axis. Our analysis provides a unified understanding of the so-called shape, Feshbach, electromagnetically-induced transparency, and Fano resonances, on the basis of the distribution of poles and zeros of the scattering matrix in the complex frequency plane. It also provides a first-principles understanding of sharp resonant scattering features, and associated large time delay, in a variety of practical devices, including photonic microring resonators, microwave ring resonators, and mesoscopic ring-shaped conductor devices. Our analysis is the first use of reflection time difference, as well as the first comprehensive use of complex time delay, to analyze experimental scattering data.

preprint2021arXiv

Certification of Genuine Multipartite Entanglement with General and Robust Device-independent Witnesses

Genuine multipartite entanglement represents the strongest type of entanglement, which is an essential resource for quantum information processing. Standard methods to detect genuine multipartite entanglement, e.g., entanglement witnesses, state tomography, or quantum state verification, require full knowledge of the Hilbert space dimension and precise calibration of measurement devices, which are usually difficult to acquire in an experiment. The most radical way to overcome these problems is to detect entanglement solely based on the Bell-like correlations of measurement outcomes collected in the experiment, namely, device-independently (DI). However, it is difficult to certify genuine entanglement of practical multipartite states in this way, and even more difficult to quantify it, owing to the difficulty of identifying optimal multipartite Bell inequalities and protocols tolerant to state impurity. In this work, we explore a general and robust DI method which can be applied to various realistic multipartite quantum states in arbitrary finite dimension, while merely relying on bipartite Bell inequalities. Our method allows us both to certify the presence of genuine multipartite entanglement and to quantify it. Several important classes of entangled states are tested with this method, leading to the detection of genuinely entangled states. We also certify genuine multipartite entanglement in weakly-entangled GHZ states, thus showing that the method applies equally well to less standard states.

preprint2021arXiv

Chaotic dynamics of Bose-Einstein condensate in a density-dependent gauge field

In this work we study the effect of a density-dependent gauge field on the collective dynamics of a harmonically trapped Bose-Einstein condensate, beyond the linear response regime. The density-dependent gauge field, as a backaction of the condensate, can in turn affect the condensate dynamics, resulting in highly nonlinear equations of motion. We find that the dipole and breathing oscillations of the condensate along the direction of the gauge field are coupled by this field. For a quasi-one-dimensional condensate, this coupling makes the collective motion quasiperiodic. For a quasi-two-dimensional condensate, the gauge field can also induce a Hall effect, manifested as an additional coupling between dipole and breathing oscillations in the perpendicular direction. When the density-dependent gauge field is strong, the interplay between these oscillations can cause the collective dynamics of the condensate to become chaotic. Our findings reveal an important effect of a dynamical gauge field on the nonlinear dynamics of a Bose-Einstein condensate.

preprint2021arXiv

Far-Field Super-Resolution Imaging By Nonlinear Excited Evanescent Waves

Abbe's resolution limit, one of the best-known physical limitations, poses a great challenge for any wave system in imaging, wave transport, and dynamics. Originally formulated in linear optics, Abbe's limit can be broken using nonlinear optical interactions. Here we extend the Abbe theory into a nonlinear regime and experimentally demonstrate a far-field, label-free, and scan-free super-resolution imaging technique based on nonlinear four-wave mixing to retrieve near-field scattered evanescent waves, achieving sub-wavelength resolution of $λ/15.6$. This method paves the way for application in biomedical imaging, semiconductor metrology, and photolithography.

preprint2021arXiv

Observation of robust edge superconductivity in Fe(Se,Te) under strong magnetic perturbation

The iron-chalcogenide high-temperature superconductor Fe(Se,Te) (FST) has been reported to exhibit complex magnetic ordering and nontrivial band topology, which may lead to novel superconducting phenomena. However, recent studies have so far been largely concentrated on its band and spin structures, while its mesoscopic electronic and magnetic response, crucial for future device applications, has not been explored experimentally. Here, we used scanning superconducting quantum interference device microscopy, for its sensitivity to both local diamagnetic susceptibility and current distribution, in order to image the superfluid density and supercurrent in FST. We found that in FST with 10% interstitial Fe, whose magnetic structure was heavily disrupted, bulk superconductivity was significantly suppressed whereas the edge still preserved strong superconducting diamagnetism. The edge dominantly carried supercurrent despite a very long magnetic penetration depth. The temperature dependences of the superfluid density and supercurrent distribution were distinctly different between the edge and the bulk. Our Heisenberg modeling showed that magnetic dopants stabilize antiferromagnetic spin correlation along the edge, which may contribute towards its robust superconductivity. Our observations hold implications for FST as a potential platform for topological quantum computation and superconducting spintronics.

preprint2021arXiv

Proposal for measuring Newtonian constant of gravitation at an exceptional point in an optomechanical system

We develop a quantum mechanical method of measuring the Newtonian constant of gravitation, G. In this method, an optomechanical system consisting of two cavities and two membrane resonators is used. The added source mass induces shifts of the eigenfrequencies of the supermodes. By detecting these shifts, we can perform our measurement of G. Furthermore, our system can feature exceptional points (EPs), which are branch point singularities of the spectrum and eigenfunctions. In the paper, we demonstrate that operating the system at an EP can enhance our measurement of G. In addition, we derive the relationship between the EP-enlarged eigenfrequency shift and the Newtonian constant. This work provides a way to engineer EP-assisted optomechanical devices for applications in the field of precision measurement of G.

preprint2021arXiv

Revealing the intrinsic superconducting gap anisotropy in surface-neutralized BaFe$_2$(As$_{0.7}$P$_{0.3}$)$_2$

Alkaline-earth iron arsenide (122) is one of the most studied families of iron-based superconductors, especially by angle-resolved photoemission spectroscopy. While extensive photoemission results have been obtained, the surface complexity of 122 caused by its charge-non-neutral surface is rarely considered. Here, we show that the surface of 122 can be neutralized by potassium deposition. In potassium-coated BaFe$_2$(As$_{0.7}$P$_{0.3}$)$_2$, the surface-induced spectral broadening is strongly suppressed, and hence the coherent spectra that reflect the intrinsic bulk electronic state recover. This enables the measurement of the superconducting gap with unprecedented precision. The result shows the existence of two pairing channels. While the gap anisotropy on the outer hole/electron pockets can be well fitted using an s$_\pm$ gap function, the gap anisotropy on the inner hole/electron pockets shows a clear deviation. Our results provide quantitative constraints for refining theoretical models and also demonstrate an experimental method for revealing the intrinsic electronic properties of 122 in future studies.

preprint2021arXiv

UniNet: Scalable Network Representation Learning with Metropolis-Hastings Sampling

Network representation learning (NRL) techniques have been successfully adopted in various data mining and machine learning applications. Random walk based NRL is one popular paradigm, which uses a set of random walks to capture the network structural information, and then employs word2vec models to learn the low-dimensional representations. However, there is still no framework that unifies existing random walk based NRL models and supports efficient learning from large networks. The main obstacle comes from the diverse random walk models and the inefficient sampling method for random walk generation. In this paper, we first introduce a new and efficient edge sampler based on the Metropolis-Hastings sampling technique, and theoretically show the convergence property of the edge sampler to arbitrary discrete probability distributions. Then we propose a random walk model abstraction, in which users can easily define different transition probabilities by specifying dynamic edge weights and random walk states. The abstraction is efficiently supported by our edge sampler, since our sampler can draw samples from an unnormalized probability distribution in constant time complexity. Finally, with the new edge sampler and random walk model abstraction, we carefully implement a scalable NRL framework called UniNet. We conduct comprehensive experiments with five random walk based NRL models over eleven real-world datasets, and the results clearly demonstrate the efficiency of UniNet over billion-edge networks.
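The idea of drawing from an unnormalized discrete distribution in constant time per draw can be illustrated with a generic Metropolis-Hastings sampler. This is a minimal sketch under stated assumptions, not UniNet's implementation: the uniform proposal, the function name `mh_edge_sampler`, and the burn-in length are all choices made here for illustration. Each step only compares two weights, so no normalizing constant or cumulative table is needed.

```python
import random
from collections import Counter


def mh_edge_sampler(weights, num_samples, burn_in=1000, seed=0):
    """Sample indices with probability proportional to non-negative `weights`.

    Assumes a symmetric uniform proposal over all indices (an illustrative
    choice) and at least one strictly positive weight.
    """
    rng = random.Random(seed)
    n = len(weights)
    current = rng.randrange(n)
    samples = []
    for step in range(burn_in + num_samples):
        proposal = rng.randrange(n)
        # Accept with probability min(1, w[proposal] / w[current]);
        # written without division so a zero weight cannot cause an error.
        if rng.random() * weights[current] < weights[proposal]:
            current = proposal
        if step >= burn_in:
            samples.append(current)
    return samples


# Hypothetical unnormalized edge weights for the outgoing edges of one node.
weights = [1.0, 2.0, 3.0, 4.0]
draws = mh_edge_sampler(weights, num_samples=200_000)
counts = Counter(draws)
frequencies = [counts[i] / len(draws) for i in range(len(weights))]
print(frequencies)
```

Successive draws from such a chain are correlated, so the empirical frequencies approach the target proportions only as the number of steps grows.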

preprint2020arXiv

2-Entity RANSAC for robust visual localization in changing environment

Visual localization has attracted considerable attention due to its low-cost and stable sensor, which is desired in many applications, such as autonomous driving, inspection robots and unmanned aerial vehicles. However, current visual localization methods still struggle with environmental changes across weather conditions and seasons, as there is significant appearance variation between the map and the query image. The crucial challenge in this situation is that the percentage of outliers, i.e. incorrect feature matches, is high. In this paper, we derive minimal closed-form solutions for 3D-2D localization with the aid of inertial measurements, using only 2 point matches or 1 point match and 1 line match. These solutions are further utilized in the proposed 2-entity RANSAC, which is more robust to outliers as both line and point features can be used simultaneously and the number of matches required for pose calculation is reduced. Furthermore, we introduce three feature sampling strategies with different advantages, enabling an automatic selection mechanism. With this mechanism, our 2-entity RANSAC can adapt to environments with different distributions of feature types in different segments. Finally, we evaluate the method on both synthetic and real-world datasets, validating its performance and effectiveness in inter-session scenarios.
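Why a smaller minimal sample helps under heavy outlier contamination can be seen from the standard RANSAC iteration estimate $N = \log(1-p)/\log(1-w^s)$, where $w$ is the inlier ratio, $s$ the sample size and $p$ the desired confidence. The sketch below applies this textbook formula; the inlier ratio, the confidence level, and the 3-match comparison case are assumptions chosen for illustration and are not figures from the paper.

```python
import math


def ransac_iterations(inlier_ratio: float, sample_size: int, confidence: float = 0.99) -> int:
    """Iterations needed so that, with probability `confidence`,
    at least one random minimal sample contains only inliers."""
    if not 0.0 < inlier_ratio <= 1.0:
        raise ValueError("inlier_ratio must be in (0, 1]")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be in (0, 1)")
    p_all_inliers = inlier_ratio ** sample_size
    if p_all_inliers >= 1.0:
        return 1  # every sample is outlier-free
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_all_inliers))


# Assumed setting: 10% inliers; compare a 2-entity sample with a 3-match sample.
for s in (2, 3):
    print(f"sample size {s}: {ransac_iterations(0.1, s)} iterations")
```

Since the required count scales roughly with $w^{-s}$, removing one entity from the minimal sample cuts the iteration budget by about a factor of $1/w$ when inliers are rare.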

preprint2020arXiv

Adapting Grad-CAM for Embedding Networks

The gradient-weighted class activation mapping (Grad-CAM) method can faithfully highlight important regions in images for deep model prediction in image classification, image captioning and many other tasks. It uses the gradients in back-propagation as weights (grad-weights) to explain network decisions. However, applying Grad-CAM to embedding networks raises significant challenges because embedding networks are trained by millions of dynamically paired examples (e.g. triplets). To overcome these challenges, we propose an adaptation of the Grad-CAM method for embedding networks. First, we aggregate grad-weights from multiple training examples to improve the stability of Grad-CAM. Then, we develop an efficient weight-transfer method to explain decisions for any image without back-propagation. We extensively validate the method on the standard CUB200 dataset in which our method produces more accurate visual attention than the original Grad-CAM method. We also apply the method to a house price estimation application using images. The method produces convincing qualitative results, showcasing the practicality of our approach.

preprint2020arXiv

AutoSF: Searching Scoring Functions for Knowledge Graph Embedding

Scoring functions (SFs), which measure the plausibility of triplets in a knowledge graph (KG), have become the crux of KG embedding. Many SFs, which aim to capture different kinds of relations in KGs, have been designed by humans in recent years. However, as relations can exhibit complex patterns that are hard to infer before training, none of them consistently performs better than the others on existing benchmark data sets. In this paper, inspired by the recent success of automated machine learning (AutoML), we propose to automatically design SFs (AutoSF) for distinct KGs using AutoML techniques. However, it is non-trivial to explore domain-specific information here to make AutoSF efficient and effective. We first identify a unified representation over commonly used SFs, which helps to set up a search space for AutoSF. Then, we propose a greedy algorithm to search in such a space efficiently. The algorithm is further sped up by a filter and a predictor, which avoid repeatedly training SFs with the same expressive ability and help remove bad candidates during the search before model training. Finally, we perform extensive experiments on benchmark data sets. Results on link prediction and triplet classification show that the SFs found by AutoSF are KG dependent, new to the literature, and outperform the state-of-the-art SFs designed by humans.

preprint2020arXiv

Block Hankel Tensor ARIMA for Multiple Short Time Series Forecasting

This work proposes a novel approach for multiple time series forecasting. First, the multi-way delay embedding transform (MDT) is employed to represent time series as low-rank block Hankel tensors (BHT). Then, the higher-order tensors are projected to compressed core tensors by applying Tucker decomposition. At the same time, the generalized tensor Autoregressive Integrated Moving Average (ARIMA) is explicitly used on consecutive core tensors to predict future samples. In this manner, the proposed approach tactically incorporates the unique advantages of MDT tensorization (to exploit mutual correlations) and tensor ARIMA coupled with low-rank Tucker decomposition into a unified framework. This framework exploits the low-rank structure of block Hankel tensors in the embedded space and captures the intrinsic correlations among multiple time series, which can thus improve the forecasting results, especially for multiple short time series. Experiments conducted on three public datasets and two industrial datasets verify that the proposed BHT-ARIMA effectively improves forecasting accuracy and reduces computational cost compared with the state-of-the-art methods.
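The tensorization step can be illustrated with a small delay-embedding sketch. It assumes that the embedding is applied along the time axis only, with a hypothetical window length `tau`, so that a set of series of length T becomes a tensor of shape (number of series, tau, T - tau + 1); the function names are illustrative and the Tucker and ARIMA stages are not shown.

```python
def delay_embed(series, tau):
    """Hankelize one series: entry (a, b) holds series[a + b]."""
    cols = len(series) - tau + 1
    if tau < 1 or cols < 1:
        raise ValueError("tau must satisfy 1 <= tau <= len(series)")
    return [[series[a + b] for b in range(cols)] for a in range(tau)]


def block_hankel(multi_series, tau):
    """Stack the Hankel matrices of several series into a third-order tensor
    represented as nested lists: [series][delay][time]."""
    return [delay_embed(s, tau) for s in multi_series]


# Two short, made-up series of length 5, embedded with window length 3.
data = [[1, 2, 3, 4, 5], [10, 20, 30, 40, 50]]
tensor = block_hankel(data, tau=3)
for matrix in tensor:
    for row in matrix:
        print(row)
    print()
```

Every anti-diagonal of each slice repeats a single observation, which is the redundancy that gives the embedded tensor its low-rank structure.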

preprint2020arXiv

CoinMagic: A Differential Privacy Framework for Ring Signature Schemes

By allowing users to obscure their transactions by including "mixins" (chaff coins), ring signature schemes have been widely used to protect the identity of a transaction's sender in privacy-preserving blockchain systems, like Monero and Bytecoin. However, recent works point out that the existing ring signature scheme is vulnerable to "chain-reaction" analysis (i.e., the spent coin in a given ring signature can be deduced through elimination). In particular, when the diversity of mixins is low, the spent coin has a high risk of being detected. To overcome this weakness, the ring signature should consist of a set of mixins with high diversity and produce observations having "similar" distributions for any two coins. In this paper, we propose a notion, namely $ε$-coin-indistinguishability ($ε$-CI), to formally define the "similar" distribution guaranteed through a differential privacy scheme. Then, we formally define the CI-aware mixins selection problem with disjoint-superset constraint (CIA-MS-DS), which aims to find a mixin set that has maximal diversity and satisfies the constraints of $ε$-CI and the budget. In CIA-MS-DS, each ring signature is either disjoint with or a superset of its preceding ring signatures. We prove that CIA-MS-DS is NP-hard and thus intractable. To solve the CIA-MS-DS problem, we propose two approximation algorithms, namely the Progressive Algorithm and the Game Theoretic Algorithm, with theoretical guarantees. Through extensive experiments on both real data sets and synthetic data sets, we demonstrate the efficiency and the effectiveness of our approaches.

preprint2020arXiv

Consistent and Complementary Graph Regularized Multi-view Subspace Clustering

This study investigates the problem of multi-view clustering, where multiple views contain consistent information and each view also includes complementary information. Exploration of all information is crucial for good multi-view clustering. However, most traditional methods blindly or crudely combine multiple views for clustering and are unable to fully exploit the valuable information. Therefore, we propose a method that involves consistent and complementary graph-regularized multi-view subspace clustering (GRMSC), which simultaneously integrates a consistent graph regularizer with a complementary graph regularizer into the objective function. In particular, the consistent graph regularizer learns the intrinsic affinity relationship of data points shared by all views. The complementary graph regularizer investigates the specific information of multiple views. It is noteworthy that the consistent and complementary regularizers are formulated by two different graphs constructed from the first-order proximity and second-order proximity of multiple views, respectively. The objective function is optimized by the augmented Lagrangian multiplier method in order to achieve multi-view clustering. Extensive experiments on six benchmark datasets serve to validate the effectiveness of the proposed method over other state-of-the-art multi-view clustering methods.

preprint2020arXiv

DDSL: Efficient Subgraph Listing on Distributed and Dynamic Graphs

Subgraph listing is a fundamental problem in graph theory and has wide applications in areas like sociology, chemistry, and social networks. Modern graphs are often large-scale as well as highly dynamic, which challenges the efficiency of existing subgraph listing algorithms. Recent works have shown the benefits of partitioning and processing big graphs in a distributed system; however, only a few works target subgraph listing on dynamic graphs in a distributed environment. In this paper, we propose an efficient approach, called Distributed and Dynamic Subgraph Listing (DDSL), which can incrementally update the results instead of running from scratch. DDSL follows a general distributed join framework. In this framework, we use a Neighbor-Preserved storage for data graphs, which takes bounded extra space and supports dynamic updating. After that, we propose a comprehensive cost model to estimate the I/O cost of listing subgraphs. Then, based on this cost model, we develop an algorithm to find the optimal join tree for a given pattern. To handle dynamic graphs, we propose an efficient left-deep join algorithm to incrementally update the join results. Extensive experiments are conducted on real-world datasets. The results show that DDSL outperforms existing methods in dealing with both static and dynamic graphs in terms of response time.

preprint2020arXiv

DR Loss: Improving Object Detection by Distributional Ranking

Most object detection algorithms can be categorized into two classes: two-stage detectors and one-stage detectors. Recently, many efforts have been devoted to one-stage detectors for their simple yet effective architecture. Different from two-stage detectors, one-stage detectors aim to identify foreground objects from all candidates in a single stage. This architecture is efficient but can suffer from the imbalance issue with respect to two aspects: the inter-class imbalance between the number of candidates from foreground and background classes, and the intra-class imbalance in the hardness of background candidates, where only a few candidates are hard to identify. In this work, we propose a novel distributional ranking (DR) loss to handle the challenge. For each image, we convert the classification problem to a ranking problem, which considers pairs of candidates within the image, to address the inter-class imbalance problem. Then, we push the distributions of confidence scores for foreground and background towards the decision boundary. After that, we optimize the rank of the expectations of the derived distributions in lieu of the original pairs. Our method not only mitigates the intra-class imbalance issue in background candidates but also improves the efficiency of the ranking algorithm. By merely replacing the focal loss in RetinaNet with the developed DR loss and applying ResNet-101 as the backbone, the mAP of the single-scale test on COCO can be improved from 39.1% to 41.7% without bells and whistles, which demonstrates the effectiveness of the proposed loss function. Code is available at https://github.com/idstcv/DR_loss.

preprint2020arXiv

Fragile Insulator and Electronic Nematicity in a Graphene Moire System

Strongly correlated quantum matter exhibits a rich variety of remarkable properties, but the organizing principles that underlie the behavior remain to be established. Graphene heterostructures, which can host narrow moire electron bands that amplify the correlation effect, represent a new setting to make progress on this overarching issue. In such correlated moire systems, an insulating state is a prominent feature of the phase diagram and may hold the key to understanding the basic physics. Here we advance the notion of a fragile insulator, a correlation-driven insulating state that is on the verge of a delocalization transition into a bad metal. Using a realistic multiorbital Hubbard model as a prototype for narrow band moire systems, we realize such a fragile insulator and demonstrate a nematic order in this state as well as in the nearby bad metal regime. Our results are consistent with the observed electronic anisotropy in the graphene moire systems and provide a natural understanding of what happens when the insulator is tuned into a bad metal. We propose the fragile insulator and the accompanying bad metal as competing states at integer fillings that analogously anchor the overall phase diagram of the correlated moire systems and beyond.

preprint2020arXiv

Globally optimal consensus maximization for robust visual inertial localization in point and line map

Map based visual inertial localization is a crucial step to reduce the drift in state estimation of mobile robots. The underlying problem for localization is to estimate the pose from a set of 3D-2D feature correspondences, of which the main challenge is the presence of outliers, especially in changing environments. In this paper, we propose a robust solution based on efficient global optimization of the consensus maximization problem, which is insensitive to a high percentage of outliers. We first introduce translation invariant measurements (TIMs) for both points and lines to decouple the consensus maximization problem into rotation and translation subproblems, allowing for a two-stage solver with reduced solution dimensions. Then we show that (i) the rotation can be calculated by minimizing TIMs using only 1-dimensional branch-and-bound (BnB), and (ii) the translation can be found by running a 1-dimensional search three times with prioritized progressive voting. Compared with the popular randomized solver, our solver achieves deterministic global convergence without depending on an initial value. Compared with existing BnB based methods, ours is exponentially faster. Finally, by evaluating the performance on both simulation and real-world datasets, we show that our approach gives an accurate pose even when there are 90% outliers (only 2 inliers).

preprint2020arXiv

Gravitational waves detection with exceptional points in micro cavities

Here we propose a new gravitational wave (GW) detector with a broad frequency band, which is operated at exceptional points (EPs) in micro cavities. The detected signal is an eigenfrequency splitting of the mechanical modes caused by the spatial strain. Due to the complex square root topology near the EP, the splitting is greatly enhanced for sufficiently small perturbations. Compared to current strategies, it can be operated at room temperature and has advantages in micro device scale, wide frequency band, and higher sensitivity.
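
The square-root enhancement mentioned in this abstract can be illustrated with a generic two-mode model. This is a textbook sketch and not the authors' cavity model; the symbols $\omega_0$ (bare frequency), $g$ (gain/loss rate), $\kappa$ (coupling), and $\epsilon$ (perturbation of the coupling) are assumptions introduced here for illustration only.

$$
H = \begin{pmatrix} \omega_0 + i g & \kappa \\ \kappa & \omega_0 - i g \end{pmatrix},
\qquad
\omega_\pm = \omega_0 \pm \sqrt{\kappa^2 - g^2}.
$$

The two eigenfrequencies coalesce at the EP, $\kappa = g$. Perturbing the coupling to $\kappa = g + \epsilon$ with $0 < \epsilon \ll g$ gives

$$
\Delta\omega = \omega_+ - \omega_- = 2\sqrt{2 g \epsilon + \epsilon^2} \approx 2\sqrt{2 g \epsilon}.
$$

In this toy model the splitting grows as $\sqrt{\epsilon}$ rather than linearly in $\epsilon$, so a weak perturbation produces a comparatively large frequency shift near the EP.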

preprint2020arXiv

Learning to Transfer Graph Embeddings for Inductive Graph based Recommendation

With the increasing availability of videos, how to edit them and present the most interesting parts to users, i.e., video highlights, has become an urgent need with many broad applications. As users' visual preferences are subjective and vary from person to person, previous generalized video highlight extraction models fail to tailor to users' unique preferences. In this paper, we study the problem of personalized video highlight recommendation with rich visual content. By dividing each video into non-overlapping segments, we formulate the problem as a personalized segment recommendation task with many new segments in the test stage. The key challenges of this problem are the cold-start users with limited video highlight records in the training data and the new segments without any user ratings at the test stage. We propose an inductive Graph based Transfer learning framework for personalized video highlight Recommendation (TransGRec). TransGRec is composed of two parts: a graph neural network followed by an item embedding transfer network. Specifically, the graph neural network part exploits the higher-order proximity between users and segments to alleviate the user cold-start problem. The transfer network is designed to approximate the learned item embeddings from graph neural networks by taking each item's visual content as input, in order to tackle the new segment problem in the test phase. We design two detailed implementations of the transfer learning optimization function, and we show how the two parts of TransGRec can be efficiently optimized with different transfer learning optimization functions. Extensive experimental results on a real-world dataset clearly show the effectiveness of our proposed model.

preprint2020arXiv

Multi-View Self-Attention for Interpretable Drug-Target Interaction Prediction

The drug discovery stage is a vital aspect of the drug development process and forms part of the initial stages of the development pipeline. In recent times, machine learning-based methods are actively being used to model drug-target interactions for rational drug discovery due to the successful application of these methods in other domains. In machine learning approaches, the numerical representation of molecules is critical to the performance of the model. While significant progress has been made in molecular representation engineering, this has resulted in several descriptors for both targets and compounds. Also, the interpretability of model predictions is a vital feature that could have several pharmacological applications. In this study, we propose a self-attention-based multi-view representation learning approach for modeling drug-target interactions. We evaluated our approach using three benchmark kinase datasets and compared the proposed method to some baseline models. Our experimental results demonstrate the ability of our method to achieve competitive prediction performance and offer biologically plausible drug-target interaction interpretations.

preprint2020arXiv

Noise-Sampling Cross Entropy Loss: Improving Disparity Regression Via Cost Volume Aware Regularizer

Recent end-to-end deep neural networks for disparity regression have achieved state-of-the-art performance. However, many well-acknowledged properties specific to disparity estimation are omitted in these deep learning algorithms. In particular, the matching cost volume, one of the most important components, is treated as an ordinary intermediate feature for the subsequent softargmin regression and lacks the explicit constraints found in traditional algorithms. In this paper, inspired by the canonical definition of the cost volume, we propose the noise-sampling cross entropy loss function to regularize the cost volume produced by deep neural networks to be unimodal and coherent. Extensive experiments validate that the proposed noise-sampling cross entropy loss not only helps neural networks learn a more informative cost volume, but also leads to better stereo matching performance compared with several representative algorithms.

preprint2020arXiv

Non-realizability of the pure braid group as area-preserving homeomorphisms

Let $\text{Homeo}_+(D^2_n)$ be the group of orientation-preserving homeomorphisms of $D^2$ fixing the boundary pointwise and $n$ marked points as a set. The Nielsen realization problem for the braid group asks whether the natural projection $p_n:\text{Homeo}_+(D^2_n)\to B_n:=π_0(\text{Homeo}_+(D^2_n))$ has a section over subgroups of $B_n$. All of the previous methods use either torsion or Thurston stability, neither of which applies to the pure braid group $PB_n$, the subgroup of $B_n$ that fixes the $n$ marked points pointwise. In this paper, we use rotation numbers to show that the pure braid group has no realization inside the area-preserving homeomorphisms.

preprint2020arXiv

Personalized Multimedia Item and Key Frame Recommendation

When recommending or advertising items to users, an emerging trend is to present each multimedia item with a key frame image (e.g., the poster of a movie). As each multimedia item can be represented as multiple fine-grained visual images (e.g., related images of the movie), personalized key frame recommendation is necessary in these applications to match users' unique visual preferences. However, previous personalized key frame recommendation models relied on users' fine-grained image behavior on multimedia items (e.g., user-image interaction behavior), which is often not available in real scenarios. In this paper, we study the general problem of joint multimedia item and key frame recommendation in the absence of fine-grained user-image behavior. We argue that the key challenge of this problem lies in discovering users' visual profiles for key frame recommendation, as most recommendation models would fail without any fine-grained image behavior from users. To tackle this challenge, we leverage users' item behavior by projecting users (items) into two latent spaces: a collaborative latent space and a visual latent space. We further design a model to discern both the collaborative and visual dimensions of users, and to model how users form decisive item preferences from these two spaces. As a result, the learned user visual profiles can be directly applied to key frame recommendation. Finally, experimental results on a real-world dataset clearly show the effectiveness of our proposed model on the two recommendation tasks.

preprint2020arXiv

Quantitative Evaluations on Saliency Methods: An Experimental Study

It has long been debated that eXplainable AI (XAI) is an important topic, but it lacks a rigorous definition and fair metrics. In this paper, we briefly summarize the status quo of the metrics, along with an exhaustive experimental study based on them, including faithfulness, localization, false-positives, sensitivity check, and stability. With the experimental results, we conclude that among all the methods we compare, no single explanation method dominates the others in all metrics. Nonetheless, Gradient-weighted Class Activation Mapping (Grad-CAM) and Randomized Input Sampling for Explanation (RISE) perform fairly well in most of the metrics. Utilizing a set of filtered metrics, we further present a case study to diagnose the classification bases for models. While providing a comprehensive experimental study of metrics, we also examine measuring factors that are missed in current metrics and hope this work can serve as a guide for future research.

preprint2020arXiv

Revisiting Graph based Collaborative Filtering: A Linear Residual Graph Convolutional Network Approach

Graph Convolutional Networks (GCNs) are state-of-the-art graph based representation learning models that iteratively stack multiple layers of convolution aggregation operations and non-linear activation operations. Recently, in Collaborative Filtering (CF) based Recommender Systems (RS), by treating the user-item interaction behavior as a bipartite graph, some researchers have modeled higher-layer collaborative signals with GCNs. These GCN based recommender models show superior performance compared to traditional works. However, these models suffer from training difficulty with non-linear activations for large user-item graphs. Besides, most GCN based models cannot model deeper layers due to the over-smoothing effect of the graph convolution operation. In this paper, we revisit GCN based CF models from two aspects. First, we empirically show that removing non-linearities enhances recommendation performance, which is consistent with the theories in simple graph convolutional networks. Second, we propose a residual network structure that is specifically designed for CF with user-item interaction modeling, which alleviates the over-smoothing problem in the graph convolution aggregation operation with sparse user-item interaction data. The proposed model is linear, easy to train, scales to large datasets, and yields better efficiency and effectiveness on two real datasets. We publish the source code at https://github.com/newlei/LRGCCF.

preprint2020arXiv

Semi-Anchored Detector for One-Stage Object Detection

A standard one-stage detector is comprised of two tasks: classification and regression. Anchors of different shapes are introduced for each location in the feature map to mitigate the challenge of regression for multi-scale objects. However, the performance of classification can degrade due to the severe class imbalance among anchors. Recently, many anchor-free algorithms have been proposed to classify locations directly. The anchor-free strategy benefits the classification task but can lead to sub-optimal results for the regression task due to the lack of prior bounding boxes. In this work, we propose a semi-anchored framework. Concretely, we identify positive locations in classification, and associate multiple anchors to the positive locations in regression. With ResNet-101 as the backbone, the proposed semi-anchored detector achieves 43.6% mAP on the COCO data set, which demonstrates state-of-the-art performance among one-stage detectors.

preprint2020arXiv

SPL-MLL: Selecting Predictable Landmarks for Multi-Label Learning

Although significant progress has been achieved, multi-label classification is still challenging due to the complexity of correlations among different labels. Furthermore, modeling the relationships between input and some (dull) classes further increases the difficulty of accurately predicting all possible labels. In this work, we propose to select a small subset of labels as landmarks which are easy to predict from the input (predictable) and can well recover the other possible labels (representative). Different from existing methods, which separate landmark selection and landmark prediction in a two-step manner, the proposed algorithm, termed Selecting Predictable Landmarks for Multi-Label Learning (SPL-MLL), jointly conducts landmark selection, landmark prediction, and label recovery in a unified framework, to ensure both the representativeness and the predictability of the selected landmarks. We employ the Alternating Direction Method (ADM) to solve our problem. Empirical studies on real-world datasets show that our method achieves superior classification performance over other state-of-the-art methods.

preprint2020arXiv

STAS: Adaptive Selecting Spatio-Temporal Deep Features for Improving Bias Correction on Precipitation

Numerical Weather Prediction (NWP) can reduce human suffering by predicting disastrous precipitation in time. A commonly used NWP system is that of the European Centre for Medium-Range Weather Forecasts (EC). However, EC forecasts need to be corrected through Bias Correcting on Precipitation (BCoP), since the mechanism of precipitation is still not fully understood and EC forecasts are therefore often biased. Existing BCoP methods suffer from limited prior data and a fixed Spatio-Temporal (ST) scale. We thus propose an end-to-end deep-learning BCoP model named the Spatio-Temporal feature Auto-Selective (STAS) model to select the optimal ST regularity from EC via the ST Feature-selective Mechanisms (SFM/TFM). Given different input features, these two mechanisms can automatically adjust the spatial and temporal scales for correcting. Experiments on an EC public dataset indicate that, compared with 8 published BCoP methods, STAS shows state-of-the-art performance on several criteria of BCoP, namely threat scores (TS). Further, ablation studies confirm that the SFM/TFM indeed work well in boosting the performance of BCoP, especially on heavy precipitation.

preprint2020arXiv

There are no exotic actions of diffeomorphism groups on 1-manifolds

Let $M$ be a manifold, $N$ a 1-dimensional manifold. Assuming $r \neq \dim(M)+1$, we show that any nontrivial homomorphism $ρ: \text{Diff}^r_c(M)\to \text{Homeo}(N)$ has a standard form: necessarily $M$ is $1$-dimensional, and there are countably many embeddings $ϕ_i: M\to N$ with disjoint images such that the action of $ρ$ is conjugate (via the product of the $ϕ_i$) to the diagonal action of $\text{Diff}^r_c(M)$ on $M \times M \times ...$ on $\bigcup_i ϕ_i(M)$, and trivial elsewhere. This solves a conjecture of Matsumoto. We also show that the groups $\text{Diff}^r_c(M)$ have no countable index subgroups.

preprint2020arXiv

U-Net Using Stacked Dilated Convolutions for Medical Image Segmentation

This paper proposes a novel U-Net variant using stacked dilated convolutions for medical image segmentation (SDU-Net). SDU-Net adopts the architecture of vanilla U-Net with modifications in the encoder and decoder operations (an operation indicates all the processing for feature maps of the same resolution). Unlike vanilla U-Net, which incorporates two standard convolutions in each encoder/decoder operation, SDU-Net uses one standard convolution followed by multiple dilated convolutions and concatenates all dilated convolution outputs as input to the next operation. Experiments showed that SDU-Net outperformed vanilla U-Net, attention U-Net (AttU-Net), and recurrent residual U-Net (R2U-Net) in all four tested segmentation tasks while using around 40% of vanilla U-Net's parameters, 17% of AttU-Net's, and 15% of R2U-Net's.
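
The encoder/decoder operation described in this abstract (one standard convolution, then several dilated convolutions whose outputs are concatenated) can be sketched in one dimension with plain Python. This is only an illustrative toy under stated assumptions, not the authors' implementation: the function names `conv1d` and `sdu_operation`, the 1D setting, the shared odd-length kernel, the chaining of each dilated convolution onto the previous one (inferred here from the word "stacked"), and the dilation rates are all hypothetical choices.

```python
from typing import List, Sequence


def conv1d(signal: Sequence[float], kernel: Sequence[float], dilation: int = 1) -> List[float]:
    """Zero-padded, same-length 1D convolution (cross-correlation form).

    Assumes an odd-length kernel; taps are spaced `dilation` samples apart.
    """
    center = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, weight in enumerate(kernel):
            idx = i + (j - center) * dilation
            if 0 <= idx < len(signal):  # positions outside the signal count as zero
                acc += weight * signal[idx]
        out.append(acc)
    return out


def sdu_operation(signal: Sequence[float], kernel: Sequence[float],
                  dilations: Sequence[int]) -> List[List[float]]:
    """One standard convolution followed by chained dilated convolutions.

    Returns the dilated outputs as a list of channels, which stands in for
    channel-wise concatenation. The rates in `dilations` are assumptions.
    """
    x = conv1d(signal, kernel, dilation=1)  # the single standard convolution
    channels = []
    for rate in dilations:
        x = conv1d(x, kernel, dilation=rate)  # each dilated conv consumes the previous output
        channels.append(x)
    return channels


impulse = [0, 0, 0, 1, 0, 0, 0]
channels = sdu_operation(impulse, kernel=[1, 1, 1], dilations=(2, 3))
```

Each successive channel sees a wider receptive field than the one before it, so the concatenated result mixes several spatial scales without adding more standard convolutions.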

preprint2019arXiv

A Hierarchical Attention Model for Social Contextual Image Recommendation

Image based social networks are among the most popular social networking services in recent years. With a tremendous number of images uploaded every day, understanding users' preferences on user-generated images and making recommendations have become an urgent need. In fact, many hybrid models have been proposed to fuse various kinds of side information (e.g., image visual representation, social network) and user-item historical behavior for enhancing recommendation performance. However, due to the unique characteristics of user-generated images in social image platforms, previous studies failed to capture the complex aspects that influence users' preferences in a unified framework. Moreover, most of these hybrid models relied on predefined weights in combining different kinds of information, which usually resulted in sub-optimal recommendation performance. To this end, in this paper, we develop a hierarchical attention model for social contextual image recommendation. In addition to basic latent user interest modeling in the popular matrix factorization based recommendation, we identify three key aspects (i.e., upload history, social influence, and owner admiration) that affect each user's latent preferences, where each aspect summarizes a contextual factor from the complex relationships between users and images. After that, we design a hierarchical attention network that naturally mirrors the hierarchical relationship (the elements within each aspect, and the aspect level) of users' latent interests with the identified key aspects. Specifically, by taking embeddings from state-of-the-art deep learning models that are tailored for each kind of data, the hierarchical attention network can learn to attend differently to more or less content. Finally, extensive experimental results on real-world datasets clearly show the superiority of our proposed model.

preprint2019arXiv

On the non-realizability of braid groups by homeomorphisms

In this paper, we will show that the projection $\text{Homeo}^+(D^2_n)\to B_n$ does not have a section; i.e. the braid group $B_n$ cannot be geometrically realized as a group of homeomorphisms of a disk fixing the boundary point-wise and $n$ marked points in the interior as a set. We also give a new proof of a result of Markovic that the mapping class group of a closed surface cannot be geometrically realized as a group of homeomorphisms.

preprint2019arXiv

Spectral evolution and radial dust transport in the prototype young eruptive system EX Lup

EX Lup is the prototype of a class of pre-main sequence eruptive stars defined by their repetitive outbursts lasting several months. In 2008 January-September EX Lup underwent its historically largest outburst, brightening by about 4 magnitudes in visual light. In previous studies we discovered on-going silicate crystal formation in the inner disk during the outburst, but also noticed that the measured crystallinity fraction started decreasing after the source returned to the quiescent phase. Here we present new observations of the 10 $μ$m silicate feature, obtained with the MIDI and VISIR instruments at Paranal Observatory. The observations demonstrate that within five years practically all crystalline forsterite disappeared from the surface of the inner disk. We reconstruct this process by presenting a series of parametric axisymmetric radiative transfer models of an expanding dust cloud that transports the crystals from the terrestrial zone to outer disk regions where comets are supposed to form. Possibly the early Sun also experienced similar flare-ups, and the forming planetesimals might have incorporated crystalline silicate material produced by such outbursts. Finally, we discuss how far the location of the dust cloud could be constrained by future JWST observations.