Trust snapshot

Quick read

Trust 21 - Emerging, Verification L1, Unclaimed author
60 works
0 followers
35 topics
4 close collaborators

Published work

60 published item(s)

preprint 2026 arXiv

Electronic Nematicity Revealed by Polarized Ultrafast Spectroscopy in Bilayer La$_3$Ni$_2$O$_7$

We report a polarized ultrafast pump-probe study of the normal-state electronic dynamics in bilayer La$_3$Ni$_2$O$_7$ and trilayer La$_4$Ni$_3$O$_{10}$ single crystals at ambient pressure. While both nickelates exhibit density-wave (DW) transitions accompanied by the opening of a quasiparticle relaxation bottleneck, their electronic responses display strikingly different symmetry properties. La$_4$Ni$_3$O$_{10}$ maintains an isotropic optical response across the entire temperature range. In contrast, La$_3$Ni$_2$O$_7$ exhibits a pronounced twofold ($C_2$) anisotropy in its low-temperature electronic dynamics. This electronic nematicity, evident in both the relaxation dynamics and the effective gap scales, competes with a secondary isotropic order emerging below 115 K. The presence of macroscopic electronic anisotropy in the bilayer system, and its absence in the trilayer system, suggests an intimate relation between electronic nematic fluctuations and superconducting pairing in La$_3$Ni$_2$O$_7$ that is worth exploring further.

preprint 2026 arXiv

Long-horizon prediction of three-dimensional wall-bounded turbulence with CTA-Swin-UNet and resolvent analysis

Long-horizon prediction of three-dimensional (3D) wall-bounded turbulence with machine-learning methods remains a challenging task, due to the rapid accumulation of autoregressive errors and the substantial computational cost. To address these challenges, we present a hybrid machine-learning framework, in which a channel-time-attention Swin-UNet (CTA-Swin-UNet) and a multi-time-scale fusion correction (MTFC) strategy are developed to predict the turbulent flow fields in a wall-parallel plane at an affordable computational cost. Then, 3D flow fields are reconstructed from the predicted planar flow via a resolvent-based spectral linear stochastic estimation (SLSE). Results show that the CTA-Swin-UNet outperforms the baseline models (LSTM, FNO and traditional Swin-UNet) in both single-step prediction and autoregressive rollouts, indicating the effectiveness of introducing the CTA module into the Swin-UNet architecture. At the same temporal interval, the CTA-Swin-UNet remains stable for approximately 150 rollout steps, while the baseline models fail within 20 to 50 rollout steps. After introducing the MTFC strategy, a longer horizon of up to 300 steps is achieved. The resolvent-based SLSE reconstruction further recovers the 3D flow structures and energy spectral distributions from the predicted planar inputs, which demonstrates that the proposed framework provides an effective and computationally efficient approach for long-horizon autoregressive prediction of 3D wall-bounded turbulence.

preprint 2026 arXiv

NextFlow: Unified Sequential Modeling Activates Multimodal Understanding and Generation

We present NextFlow, a unified decoder-only autoregressive transformer trained on 6 trillion interleaved text-image discrete tokens. By leveraging a unified vision representation within a unified autoregressive architecture, NextFlow natively activates multimodal understanding and generation capabilities, unlocking abilities of image editing, interleaved content and video generation. Motivated by the distinct nature of modalities - where text is strictly sequential and images are inherently hierarchical - we retain next-token prediction for text but adopt next-scale prediction for visual generation. This departs from traditional raster-scan methods, enabling the generation of 1024x1024 images in just 5 seconds - orders of magnitude faster than comparable AR models. We address the instabilities of multi-scale generation through a robust training recipe. Furthermore, we introduce a prefix-tuning strategy for reinforcement learning. Experiments demonstrate that NextFlow achieves state-of-the-art performance among unified models and rivals specialized diffusion baselines in visual quality.

preprint 2026 arXiv

Unleashing the Native Recommendation Potential: LLM-Based Generative Recommendation via Structured Term Identifiers

Leveraging the vast open-world knowledge and understanding capabilities of Large Language Models (LLMs) to develop general-purpose, semantically-aware recommender systems has emerged as a pivotal research direction in generative recommendation. However, existing methods face bottlenecks in constructing item identifiers. Text-based methods introduce LLMs' vast output space, leading to hallucination, while methods based on Semantic IDs (SIDs) encounter a semantic gap between SIDs and LLMs' native vocabulary, requiring costly vocabulary expansion and alignment training. To address this, this paper introduces Term IDs (TIDs), defined as a set of semantically rich and standardized textual keywords, to serve as robust item identifiers. We propose GRLM, a novel framework centered on TIDs, which employs Context-aware Term Generation to convert an item's metadata into standardized TIDs and utilizes Integrative Instruction Fine-tuning to collaboratively optimize term internalization and sequential recommendation. Additionally, Elastic Identifier Grounding is designed for robust item mapping. Extensive experiments on real-world datasets demonstrate that GRLM significantly outperforms baselines across multiple scenarios, pointing to a promising direction for generalizable and high-performance generative recommendation systems.

preprint 2023 arXiv

Disentangling Past-Future Modeling in Sequential Recommendation via Dual Networks

Sequential recommendation (SR) plays an important role in personalized recommender systems because it captures dynamic and diverse preferences from users' real-time increasing behaviors. Unlike the standard autoregressive training strategy, future data (also available during training) has been used to facilitate model training as it provides richer signals about a user's current interests and can be used to improve the recommendation quality. However, these methods suffer from a severe training-inference gap, i.e., both past and future contexts are modeled by the same encoder when training, while only historical behaviors are available during inference. This discrepancy leads to potential performance degradation. To alleviate the training-inference gap, we propose a new framework, DualRec, which achieves past-future disentanglement and past-future mutual enhancement by a novel dual network. Specifically, a dual network structure is exploited to model the past and future context separately, and a bi-directional knowledge transferring mechanism enhances the knowledge learnt by the dual network. Extensive experiments on four real-world datasets demonstrate the superiority of our approach over baseline methods. Besides, we demonstrate the compatibility of DualRec by instantiating it with RNN, Transformer, and filter-MLP backbones. Further empirical analysis verifies the high utility of modeling future contexts under our DualRec framework. The code of DualRec is publicly available at https://github.com/zhy99426/DualRec.

preprint 2023 arXiv

keqing: knowledge-based question answering is a nature chain-of-thought mentor of LLM

Large language models (LLMs) have exhibited remarkable performance on various natural language processing (NLP) tasks, especially question answering. However, when faced with problems beyond the scope of their knowledge, these LLMs tend to talk nonsense with a straight face; a potential solution is to incorporate an Information Retrieval (IR) module and generate responses based on the retrieved knowledge. In this paper, we present a novel framework to assist LLMs, such as ChatGPT, in retrieving question-related structured information on the knowledge graph, and demonstrate that Knowledge-based question answering (Keqing) could be a natural Chain-of-Thought (CoT) mentor to guide the LLM to sequentially find the answer entities of a complex question through interpretable logical chains. Specifically, the Keqing workflow decomposes a complex question according to predefined templates, retrieves candidate entities on the knowledge graph, reasons out answers to the sub-questions, and finally generates a response with reasoning paths, which greatly improves the reliability of the LLM's response. The experimental results on KBQA datasets show that Keqing can achieve competitive performance and illustrate the logic of answering each question.

preprint 2023 arXiv

Learning stability of partially observed switched linear systems

This paper deals with learning stability of partially observed switched linear systems under arbitrary switching. Such systems are widely used to describe cyber-physical systems which arise by combining physical systems with digital components. In many real-world applications, the internal states cannot be observed directly. It is thus more realistic to conduct system analysis using the outputs of the system. Stability is one of the most frequent requirements for safety and robustness of cyber-physical systems. Existing methods for analyzing stability of switched linear systems often require the knowledge of the parameters and/or all the states of the underlying system. In this paper, we propose an algorithm for deciding stability of switched linear systems under arbitrary switching based purely on observed output data. The proposed algorithm essentially relies on an output-based Lyapunov stability framework and returns an estimate of the joint spectral radius (JSR). We also prove a probably approximately correct error bound on the quality of the estimate of the JSR from the perspective of statistical learning theory.

preprint 2023 arXiv

Semantic-aware Contrastive Learning for More Accurate Semantic Parsing

Since the meaning representations are detailed and accurate annotations which express fine-grained sequence-level semantics, it is usually hard to train discriminative semantic parsers via Maximum Likelihood Estimation (MLE) in an autoregressive fashion. In this paper, we propose a semantic-aware contrastive learning algorithm, which can learn to distinguish fine-grained meaning representations and take the overall sequence-level semantics into consideration. Specifically, a multi-level online sampling algorithm is proposed to sample confusing and diverse instances. Three semantic-aware similarity functions are designed to accurately measure the distance between meaning representations as a whole. A ranked contrastive loss is then proposed to pull the representations of the semantic-identical instances together and push negative instances away. Experiments on two standard datasets show that our approach achieves significant improvements over MLE baselines and reaches state-of-the-art performance by simply applying semantic-aware contrastive learning on a vanilla Seq2Seq model.

preprint 2022 arXiv

1st Place Solution for YouTubeVOS Challenge 2022: Referring Video Object Segmentation

The task of referring video object segmentation aims to segment the object in the frames of a given video to which the referring expressions refer. Previous methods adopt a multi-stage approach and design complex pipelines to obtain promising results. Recently, the end-to-end method based on Transformer has proved its superiority. In this work, we draw on the advantages of the above methods to provide a simple and effective pipeline for RVOS. First, we improve the state-of-the-art one-stage method ReferFormer to obtain mask sequences that are strongly correlated with language descriptions. Second, based on a reliable and high-quality keyframe, we leverage the superior performance of a video object segmentation model to further enhance the quality and temporal consistency of the mask results. Our single model reaches 70.3 J&F on the Referring Youtube-VOS validation set and 63.0 on the test set. After ensembling, we achieve 64.1 on the final leaderboard, ranking 1st in the CVPR 2022 Referring Youtube-VOS challenge. Code will be available at https://github.com/Zhiweihhh/cvpr2022-rvos-challenge.git.

preprint 2022 arXiv

AGO-Net: Association-Guided 3D Point Cloud Object Detection Network

The human brain can effortlessly recognize and localize objects, whereas current 3D object detection methods based on LiDAR point clouds still report inferior performance for detecting occluded and distant objects: the point cloud appearance varies greatly due to occlusion, and has inherent variance in point densities along the distance to sensors. Therefore, designing feature representations robust to such point clouds is critical. Inspired by human associative recognition, we propose a novel 3D detection framework that associates intact features for objects via domain adaptation. We bridge the gap between the perceptual domain, where features are derived from real scenes with sub-optimal representations, and the conceptual domain, where features are extracted from augmented scenes that consist of non-occlusion objects with rich detailed information. A feasible method is investigated to construct conceptual scenes without external datasets. We further introduce an attention-based re-weighting module that adaptively strengthens the feature adaptation of more informative regions. The network's feature enhancement ability is exploited without introducing extra cost during inference, which is plug-and-play in various 3D detection frameworks. We achieve new state-of-the-art performance on the KITTI 3D detection benchmark in both accuracy and speed. Experiments on nuScenes and Waymo datasets also validate the versatility of our method.

preprint 2022 arXiv

Asynchronous Optimisation for Event-based Visual Odometry

Event cameras open up new possibilities for robotic perception due to their low latency and high dynamic range. On the other hand, developing effective event-based vision algorithms that fully exploit the beneficial properties of event cameras remains work in progress. In this paper, we focus on event-based visual odometry (VO). While existing event-driven VO pipelines have adopted continuous-time representations to asynchronously process event data, they either assume a known map, restrict the camera to planar trajectories, or integrate other sensors into the system. Towards map-free event-only monocular VO in SE(3), we propose an asynchronous structure-from-motion optimisation back-end. Our formulation is underpinned by a principled joint optimisation problem involving non-parametric Gaussian Process motion modelling and incremental maximum a posteriori inference. A high-performance incremental computation engine is employed to reason about the camera trajectory with every incoming event. We demonstrate the robustness of our asynchronous back-end in comparison to frame-based methods which depend on accurate temporal accumulation of measurements.

preprint 2022 arXiv

Binary Neural Networks as a general-propose compute paradigm for on-device computer vision

For binary neural networks (BNNs) to become the mainstream on-device computer vision algorithm, they must achieve a better speed-vs-accuracy tradeoff than 8-bit quantization and establish a similar degree of general applicability in vision tasks. To this end, we propose a BNN framework comprising 1) a minimalistic inference scheme for hardware-friendliness, 2) an over-parameterized training scheme for high accuracy, and 3) a simple procedure to adapt to different vision tasks. The resultant framework overtakes 8-bit quantization in the speed-vs-accuracy tradeoff for classification, detection, segmentation, super-resolution and matching: our BNNs not only retain the accuracy levels of their 8-bit baselines but also showcase 1.3-2.4$\times$ faster FPS on mobile CPUs. Similar conclusions can be drawn for prototypical systolic-array-based AI accelerators, where our BNNs promise 2.8-7$\times$ fewer execution cycles than 8-bit and 2.1-2.7$\times$ fewer cycles than alternative BNN designs. These results suggest that the time for large-scale BNN adoption could be upon us.

preprint 2022 arXiv

Charge Carrier Mediation and Ferromagnetism induced in MnBi6Te10 Magnetic Topological Insulators by antimony doping

The MnBi2Te4 family, a new kind of intrinsic magnetic topological insulator (MTI), has shed light on the observation of novel topological quantum effects such as the quantum anomalous Hall effect (QAHE). However, the strong anti-ferromagnetic (AFM) coupling and high carrier concentration in the bulk hinder practical applications. In the closely related materials MnBi4Te7 and MnBi6Te10, the interlayer magnetic coupling is greatly suppressed by Bi2Te3 layer intercalation. However, AFM is still the ground state in these compounds. Here, by magnetic and transport measurements, we demonstrate that the Sb substitutional dopant plays a dual role in MnBi6Te10: it can not only adjust the charge carrier type and concentration, but also drive the solid into a ferromagnetic (FM) ground state. AFM ground state region which is also close to the charge neutral point can be found in the phase diagram of Mn(SbxBi1-x)6Te10 when x ~ 0.25. An intrinsic FM-MTI candidate is thus demonstrated, and it may be a step toward the realization of high-quality and high-temperature QAHE and the related topological quantum effects in the future.

preprint 2022 arXiv

CODE: Contrastive Pre-training with Adversarial Fine-tuning for Zero-shot Expert Linking

Expert finding, a popular service provided by many online websites such as Expertise Finder, LinkedIn, and AMiner, is beneficial for seeking candidate qualifications, consultants, and collaborators. However, its quality suffers from a lack of ample sources of expert information. This paper employs AMiner as the basis, with the aim of linking any external experts to their counterparts on AMiner. As it is infeasible to acquire sufficient linkages from arbitrary external sources, we explore the problem of zero-shot expert linking. In this paper, we propose CODE, which first pre-trains an expert linking model by contrastive learning on AMiner such that it can capture the representation and matching patterns of experts without supervised signals; the model is then fine-tuned between AMiner and external sources in an adversarial manner to enhance its transferability. For evaluation, we first design two intrinsic tasks, author identification and paper clustering, to validate the representation and matching capability endowed by contrastive learning. Then the final external expert linking performance on two genres of external sources also implies the superiority of the adversarial fine-tuning method. Additionally, we show the online deployment of CODE, and continuously improve its online performance via active learning.

preprint 2022 arXiv

Differential Privacy for Symbolic Systems with Application to Markov Chains

Data-driven systems are gathering increasing amounts of data from users, and sensitive user data requires privacy protections. In some cases, the data gathered is non-numerical or symbolic, and conventional approaches to privacy, e.g., adding noise, do not apply, though such systems still require privacy protections. Accordingly, we present a novel differential privacy framework for protecting trajectories generated by symbolic systems. These trajectories can be represented as words or strings over a finite alphabet. We develop new differential privacy mechanisms that approximate a sensitive word using a random word that is likely to be near it. An offline mechanism is implemented efficiently using a Modified Hamming Distance Automaton to generate whole privatized output words over a finite time horizon. Then, an online mechanism is implemented by taking in a sensitive symbol and generating a randomized output symbol at each timestep. This work is extended to Markov chains to generate differentially private state sequences that a given Markov chain could have produced. Statistical accuracy bounds are developed to quantify the accuracy of these mechanisms, and numerical results validate the accuracy of these techniques for strings of English words.

preprint 2022 arXiv

Distributed Estimation for Interconnected Systems with Arbitrary Coupling Structures

This paper is concerned with the problem of distributed estimation for time-varying interconnected dynamic systems with arbitrary coupling structures. To guarantee the robustness of the designed estimators, novel distributed stability conditions are proposed with only local information and the information from neighbors. Then, simplified stability conditions which do not require timely exchange of neighbors' estimator gain information are further developed for systems with delayed communication. By merging these subsystem-level stability conditions and the optimization-based estimator gain design, the distributed, stable and optimal estimators are proposed. Quite notably, these optimization solutions can be easily obtained by standard software packages, and it is also shown that the designed estimators are scalable in the sense of adding or subtracting subsystems. Finally, an illustrative example is employed to show the effectiveness of the proposed methods.

preprint 2022 arXiv

Distributed Event-Triggered Nonlinear Fusion Estimation under Resource Constraints

This paper studies the event-triggered distributed fusion estimation problems for a class of nonlinear networked multisensor fusion systems without noise statistical characteristics. When considering the limited resource problems of two kinds of communication channels (i.e., sensor-to-remote estimator channel and smart sensor-to-fusion center channel), an event-triggered strategy and a dimensionality reduction strategy are introduced in a unified networked framework to lighten the communication burden. Then, two kinds of compensation strategies in terms of a unified model are designed to restructure the untransmitted information, and the local/fusion estimators are proposed based on the compensation information. Furthermore, the linearization errors caused by the Taylor expansion are modeled by the state-dependent matrices with uncertain parameters when establishing estimation error systems, and then different robust recursive optimization problems are constructed to determine the estimator gains and the fusion criteria. Meanwhile, the stability conditions are also proposed such that the square errors of the designed nonlinear estimators are bounded. Finally, a vehicle localization system is employed to demonstrate the effectiveness and advantages of the proposed methods.

preprint 2022 arXiv

Equivariant Bordism of 2-Torus Manifolds and Unitary Toric Manifolds

The equivariant bordism classification of manifolds with group actions is an essential subject in the study of transformation groups. We are interested in the action of the 2-torus group $\mathbb{Z}_2^n$ and the torus group $T^n$, and study the equivariant bordism of 2-torus manifolds and unitary toric manifolds. In this paper, we give a new description of the group $\mathcal{Z}_n(\mathbb{Z}_2^n)$ of 2-torus manifolds, and determine the dimension of $\mathcal{Z}_n(\mathbb{Z}_2^n)$ as a $\mathbb{Z}_2$-vector space. With the help of toric topology, Lü and Tan proved that the bordism groups $\mathcal{Z}_n(\mathbb{Z}_2^n)$ are generated by small covers. We will give a new proof of this result. These results can be generalized to the equivariant bordism of unitary toric manifolds, that is, we will give a new description of the group $\mathcal{Z}_n^U(T^n)$ of unitary torus manifolds, and prove that $\mathcal{Z}_n^U(T^n)$ can be generated by quasitoric manifolds with omniorientations.

preprint 2022 arXiv

Graph Contrastive Learning for Anomaly Detection

Graph-based anomaly detection has been widely used for detecting malicious activities in real-world applications. Existing attempts to address this problem have thus far focused on structural feature engineering or learning in the binary classification regime. In this work, we propose to leverage graph contrastive coding and present the supervised GraphCAD model for contrasting abnormal nodes with normal ones in terms of their distances to the global context (e.g., the average of all nodes). To handle scenarios with scarce labels, we further enable GraphCAD as a self-supervised framework by designing a graph corrupting strategy for generating synthetic node labels. To achieve the contrastive objective, we design a graph neural network encoder that can infer and further remove suspicious links during message passing, as well as learn the global context of the input graph. We conduct extensive experiments on four public datasets, demonstrating that 1) GraphCAD significantly and consistently outperforms various advanced baselines and 2) its self-supervised version without fine-tuning can achieve comparable performance with its fully supervised version.

preprint 2022 arXiv

Large Exchange Bias Effect and Coverage-Dependent Interfacial Coupling in CrI3/MnBi2Te4 van der Waals Heterostructures

Igniting interface magnetic ordering of magnetic topological insulators by building a van der Waals heterostructure can help to reveal novel quantum states and design functional devices. Here, we observe an interesting exchange bias effect, indicating successful interfacial magnetic coupling, in CrI3/MnBi2Te4 ferromagnetic insulator/antiferromagnetic topological insulator (FMI/AFM-TI) heterostructure devices. The devices originally exhibit a negative exchange bias field, which decays with increasing temperature and is unaffected by the back-gate voltage. When we change the device configuration to be half-covered by CrI3, the exchange bias becomes positive with a very large exchange bias field exceeding 300 mT. Such sensitive manipulation is explained by the competition between the FM and AFM coupling at the interface of CrI3 and MnBi2Te4, pointing to coverage-dependent interfacial magnetic interactions. Our work will facilitate the development of topological and antiferromagnetic devices.

preprint 2022 arXiv

Matching Visual Features to Hierarchical Semantic Topics for Image Paragraph Captioning

Observing a set of images and their corresponding paragraph-captions, a challenging task is to learn how to produce a semantically coherent paragraph to describe the visual content of an image. Inspired by recent successes in integrating semantic topics into this task, this paper develops a plug-and-play hierarchical-topic-guided image paragraph generation framework, which couples a visual extractor with a deep topic model to guide the learning of a language model. To capture the correlations between the image and text at multiple levels of abstraction and learn the semantic topics from images, we design a variational inference network to build the mapping from image features to textual captions. To guide the paragraph generation, the learned hierarchical topics and visual features are integrated into the language model, including Long Short-Term Memory (LSTM) and Transformer, and jointly optimized. Experiments on public datasets demonstrate that the proposed models, which are competitive with many state-of-the-art approaches in terms of standard evaluation metrics, can be used to both distill interpretable multi-layer semantic topics and generate diverse and coherent captions. We release our code at https://github.com/DandanGuo1993/VTCM-based-image-paragraph-caption.git

preprint 2022 arXiv

MISS: Multi-Interest Self-Supervised Learning Framework for Click-Through Rate Prediction

CTR prediction is essential for modern recommender systems. Ranging from early factorization machines to deep learning based models in recent years, existing CTR methods focus on capturing useful feature interactions or mining important behavior patterns. Despite their effectiveness, we argue that these methods suffer from the risk of label sparsity (i.e., the user-item interactions are highly sparse with respect to the feature space), label noise (i.e., the collected user-item interactions are usually noisy), and the underuse of domain knowledge (i.e., the pairwise correlations between samples). To address these challenging problems, we propose a novel Multi-Interest Self-Supervised learning (MISS) framework which enhances the feature embeddings with interest-level self-supervision signals. With the help of two novel CNN-based multi-interest extractors, self-supervision signals are discovered with full consideration of different interest representations (point-wise and union-wise), interest dependencies (short-range and long-range), and interest correlations (inter-item and intra-item). Based on that, contrastive learning losses are further applied to the augmented views of interest representations, which effectively improves the feature representation learning. Furthermore, our proposed MISS framework can be used as a plug-in component with existing CTR prediction models to further boost their performance. Extensive experiments on three large-scale datasets show that MISS significantly outperforms the state-of-the-art models, by up to 13.55% in AUC, and also enjoys good compatibility with representative deep CTR models.

preprint 2022 arXiv

Neural Re-ranking in Multi-stage Recommender Systems: A Review

As the final stage of the multi-stage recommender system (MRS), re-ranking directly affects user experience and satisfaction by rearranging the input ranking lists, and thereby plays a critical role in MRS. With the advances in deep learning, neural re-ranking has become a trending topic and been widely applied in industrial applications. This review aims at integrating re-ranking algorithms into a broader picture, and paving ways for more comprehensive solutions for future research. For this purpose, we first present a taxonomy of current methods on neural re-ranking. Then we give a description of these methods along with the historic development according to their objectives. The network structure, personalization, and complexity are also discussed and compared. Next, we provide benchmarks of the major neural re-ranking models and quantitatively analyze their re-ranking performance. Finally, the review concludes with a discussion on future prospects of this field. A list of papers discussed in this review, the benchmark datasets, our re-ranking library LibRerank, and detailed parameter settings are publicly available at https://github.com/LibRerank-Community/LibRerank.

preprint 2022 arXiv

Observation of magnetism induced topological edge state in antiferromagnetic topological insulator MnBi4Te7

Breaking time reversal symmetry in a topological insulator may lead to the quantum anomalous Hall effect and the axion insulator phase. MnBi4Te7 is a recently discovered antiferromagnetic topological insulator with TN ~12.5 K, which consists of alternately stacked magnetic (MnBi2Te4) and non-magnetic (Bi2Te3) layers. By means of scanning tunneling spectroscopy, we clearly observe an electronic state that is present at a step edge of a magnetic MnBi2Te4 layer but absent at non-magnetic Bi2Te3 layers at 4.5 K. Furthermore, we find that as the temperature rises above TN, the edge state vanishes, while the point-defect-induced state persists as the temperature increases. These results confirm the observation of magnetism-induced edge states. Our analysis based on an axion insulator theory reveals the nontrivial topological nature of the observed edge state.

preprint2022arXiv

Ordinal Graph Gamma Belief Network for Social Recommender Systems

To build recommender systems that not only consider user-item interactions represented as ordinal variables, but also exploit the social network describing the relationships between the users, we develop a hierarchical Bayesian model termed ordinal graph factor analysis (OGFA), which jointly models user-item and user-user interactions. OGFA not only achieves good recommendation performance, but also extracts interpretable latent factors corresponding to representative user preferences. We further extend OGFA to ordinal graph gamma belief network, which is a multi-stochastic-layer deep probabilistic model that captures the user preferences and social communities at multiple semantic levels. For efficient inference, we develop a parallel hybrid Gibbs-EM algorithm, which exploits the sparsity of the graphs and is scalable to large datasets. Our experimental results show that the proposed models not only outperform recent baselines on recommendation datasets with explicit or implicit feedback, but also provide interpretable latent representations.

preprint2022arXiv

Representing Mixtures of Word Embeddings with Mixtures of Topic Embeddings

A topic model is often formulated as a generative model that explains how each word of a document is generated given a set of topics and document-specific topic proportions. It focuses on capturing the word co-occurrences in a document and hence often performs poorly on short documents. In addition, its parameter estimation often relies on approximate posterior inference that is either not scalable or suffers from large approximation error. This paper introduces a new topic-modeling framework where each document is viewed as a set of word embedding vectors and each topic is modeled as an embedding vector in the same embedding space. Embedding the words and topics in the same vector space, we define a method to measure the semantic difference between the embedding vectors of the words of a document and those of the topics, and optimize the topic embeddings to minimize the expected difference over all documents. Experiments on text analysis demonstrate that the proposed method, which is amenable to mini-batch stochastic gradient descent based optimization and hence scalable to big corpora, provides competitive performance in discovering more coherent and diverse topics and extracting better document representations.

preprint2022arXiv

Secure Fusion Estimation Against FDI Sensor Attacks in Cyber-Physical Systems

This paper is concerned with the problem of secure multi-sensor fusion estimation for cyber-physical systems, where sensor measurements may be tampered with by false data injection (FDI) attacks. In this work, it is assumed that the adversary may not be able to attack all sensors; that is, some sensors remain unattacked. In this case, new local reorganized subsystems that include the FDI attack signals and the un-attacked sensor measurements are constructed by the augmentation method. Then, a joint Kalman fusion estimator is designed in the linear minimum variance sense to estimate the system state and the FDI attack signals simultaneously. Finally, illustrative examples are employed to show the effectiveness and advantages of the proposed methods.

preprint2022arXiv

The Block-based Mobile PDE Systems Are Not Secure -- Experimental Attacks

Nowadays, mobile devices are used broadly to store and process sensitive data. To ensure confidentiality of the sensitive data, Full Disk Encryption (FDE) is often integrated in mainstream mobile operating systems like Android and iOS. FDE, however, cannot defend against coercive attacks in which the adversary can force the device owner to disclose the decryption key. To combat coercive attacks, Plausibly Deniable Encryption (PDE) is leveraged to plausibly deny the very existence of sensitive data. However, most of the existing PDE systems for mobile devices are deployed at the block layer and suffer from deniability compromises. Having observed that none of the existing works in the literature have experimentally demonstrated the aforementioned compromises, our work bridges this gap by experimentally confirming the deniability compromises of the block-layer mobile PDE systems. We have built a mobile device testbed, which consists of a host computing device and a flash storage device. Additionally, we have deployed both the hidden volume PDE and the steganographic file system at the block layer of the testbed and performed disk forensics to assess potential compromises on the raw NAND flash. Our experimental results confirm that it is indeed possible for the adversary to compromise the block-layer PDE systems by accessing the raw NAND flash in practice. We also discuss potential issues when performing such attacks in the real world.

preprint2022arXiv

Three-dimensional Propagation of the Global EUV Wave associated with a solar eruption on 2021 October 28

We present a case study for the global extreme ultraviolet (EUV) wave and its chromospheric counterpart "Moreton-Ramsey wave" associated with the second X-class flare in Solar Cycle 25 and a halo coronal mass ejection (CME). The EUV wave was observed in the H$α$ and EUV passbands with different characteristic temperatures. In the 171 Å and 193/195 Å images, the wave propagates circularly with an initial velocity of 600-720 km s$^{-1}$ and a deceleration of 110-320 m s$^{-2}$. The local coronal plasma is heated from log(T/K)=5.9 to log(T/K)=6.2 during the passage of the wavefront. The H$α$ and 304 Å images also reveal signatures of wave propagation with a velocity of 310-540 km s$^{-1}$. With multi-wavelength and dual-perspective observations, we found that the wavefront likely propagates forwardly inclined to the solar surface with a tilt angle of ~53.2$^{\circ}$. Our results suggest that this EUV wave is a fast-mode magnetohydrodynamic wave or shock driven by the expansion of the associated CME, whose wavefront is likely a dome-shaped structure that could impact the upper chromosphere, transition region and corona.

preprint2022arXiv

Three-step Formation of Diamonds in Shock-compressed Hydrocarbons: Decomposition, Species Separation, and Nucleation

The accumulation and circulation of carbon-hydrogen dictate the chemical evolution of ice giant planets. Species separation and diamond precipitation have been reported in carbon-hydrogen systems, verified by static and shock-compression experiments. Nevertheless, the dynamic formation processes behind these phenomena are still insufficiently understood. Here, combining a deep learning model, we demonstrate that diamonds form through a three-step process involving decomposition, species separation and nucleation. Under a shock condition of 125 GPa and 4590 K, hydrocarbons decompose to give hydrogen and low-molecular-weight alkanes (CH4 and C2H6), which escape from the carbon chains, resulting in C/H species separation. The remaining carbon atoms without C-H bonds accumulate and nucleate to form diamond crystals. The process of diamond growth is found to be associated with a critical nucleus size, where the dynamic energy barrier plays a key role. These dynamic processes of diamond formation offer insight for establishing a model of ice giant planet evolution.

preprint2022arXiv

Towards Large-Scale and Spatio-temporally Resolved Diagnosis of Electronic Density of States by Deep Learning

Modern laboratory techniques like ultrafast laser excitation and shock compression can bring matter into highly nonequilibrium states with complex structural transformation, metallization and dissociation dynamics. Traditional methods struggle to capture and model the dramatic changes in both electronic structure and ion dynamics during such processes. Here, we demonstrate the ability of a deep neural network (DNN) to capture the atomic local-environment dependence of the electronic density of states (DOS) for both a multicomponent system under exoplanet thermodynamic conditions and a nonequilibrium system during the super-heated melting process. Large-scale and time-resolved diagnosis of the DOS can be achieved efficiently within the accuracy of ab initio methods. Moreover, the atomic contribution to the DOS given by the DNN model accurately reveals the local neighborhood of a selected atom and can thus serve as a robust order parameter to identify different phases and intermediate local structures, which strongly highlights the efficacy of this DNN model in studying dynamic processes.

preprint2022arXiv

Update Compression for Deep Neural Networks on the Edge

An increasing number of artificial intelligence (AI) applications involve the execution of deep neural networks (DNNs) on edge devices. Many practical reasons motivate the need to update the DNN model on the edge device post-deployment, such as refining the model, concept drift, or outright change in the learning task. In this paper, we consider the scenario where retraining can be done on the server side based on a copy of the DNN model, with only the necessary data transmitted to the edge to update the deployed model. However, due to bandwidth constraints, we want to minimise the transmission required to achieve the update. We develop a simple approach based on matrix factorisation to compress the model update -- this differs from compressing the model itself. The key idea is to preserve existing knowledge in the current model and optimise only small additional parameters for the update which can be used to reconstitute the model on the edge. We compared our method to similar techniques used in federated learning; our method usually requires less than half of the update size of existing methods to achieve the same accuracy.
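As a rough illustration of why a factorised update can be cheaper to transmit than a dense one, the following minimal sketch applies an additive rank-r update W + A·B in plain Python. The additive low-rank form, the function names, and the toy sizes are assumptions made for illustration; the abstract does not state the exact factorisation the authors use.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def apply_low_rank_update(w, a, b):
    """Reconstruct updated weights W + A @ B from the deployed W and the factors."""
    delta = matmul(a, b)
    return [[w_ij + d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


def transmitted_values(m, n, r):
    """Values sent for a rank-r update of an m x n layer (hypothetical cost model)."""
    return r * (m + n)


# Toy example: a 2 x 3 layer updated with a rank-1 factor pair.
W = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
A = [[1.0], [2.0]]        # shape m x r
B = [[0.5, 0.0, -0.5]]    # shape r x n
W_new = apply_low_rank_update(W, A, B)
```

Under this cost model, a rank-8 update of a 512 x 512 layer sends 8 * (512 + 512) = 8192 values instead of the 262144 values of a dense update.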

preprint2021arXiv

Benchmarking Knowledge-Enhanced Commonsense Question Answering via Knowledge-to-Text Transformation

A fundamental ability of humans is to utilize commonsense knowledge in language understanding and question answering. In recent years, many knowledge-enhanced Commonsense Question Answering (CQA) approaches have been proposed. However, it remains unclear: (1) How far can we get by exploiting external knowledge for CQA? (2) How much of the potential of knowledge has been exploited in current CQA models? (3) Which are the most promising directions for future CQA? To answer these questions, we benchmark knowledge-enhanced CQA by conducting extensive experiments on multiple standard CQA datasets using a simple and effective knowledge-to-text transformation framework. Experiments show that: (1) Our knowledge-to-text framework is effective and achieves state-of-the-art performance on the CommonsenseQA dataset, providing a simple and strong knowledge-enhanced baseline for CQA; (2) The potential of knowledge is still far from being fully exploited in CQA -- there is a significant performance gap between current models and our models with golden knowledge; and (3) Context-sensitive knowledge selection, heterogeneous knowledge exploitation, and commonsense-rich language models are promising CQA directions.

preprint2021arXiv

Experimental evidence on the dissipationless transport of chiral edge state of the high-field Chern insulator in MnBi2Te4 nanodevices

We demonstrate the dissipationless transport of the chiral edge state (CES) in nanodevices of the quantum anomalous Hall insulator candidate MnBi2Te4. The device presents a near-zero longitudinal resistance together with a quantized Hall plateau in excess of 0.97 $h/e^2$ over a range of temperatures from very low up to the Néel temperature of 22 K. Each of the four-probe nonlocal measurements gives near-zero resistance and the two-probe measurements exhibit a plateau of +1 $h/e^2$, while the results of the three-probe nonlocal measurements depend on the magnetic field. This indicates the absence of dissipation as well as the chirality of the edge state. The CES shows three regimes of temperature dependence, i.e., well-preserved dissipationless transport below 6 K, variable range hopping as the temperature increases, and thermal activation above 22 K. Even at the lowest temperature, a current of over 1.4 μA breaks the dissipationless transport. Together these form a complete set of evidence for the Chern insulator state in MnBi2Te4 systems.
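To put the quoted plateau in ordinary units, the short sketch below converts a fraction of the resistance quantum h/e^2 into ohms using the exact 2019 SI values of the Planck constant and the elementary charge. The function and variable names are hypothetical, and the conversion is only a unit illustration, not part of the reported measurements.

```python
PLANCK_H = 6.62607015e-34            # J s, exact in the 2019 SI
ELEMENTARY_CHARGE = 1.602176634e-19  # C, exact in the 2019 SI


def hall_resistance_ohm(fraction=1.0):
    """Return fraction * h / e^2 in ohms."""
    return fraction * PLANCK_H / ELEMENTARY_CHARGE ** 2


quantum = hall_resistance_ohm()          # about 25812.8 ohm
plateau_floor = hall_resistance_ohm(0.97)  # lower bound quoted for the plateau
```

A plateau above 0.97 h/e^2 therefore corresponds to a Hall resistance above roughly 25.0 kΩ.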

preprint2021arXiv

Memory-Efficient Network for Large-scale Video Compressive Sensing

Video snapshot compressive imaging (SCI) captures a sequence of video frames in a single shot using a 2D detector. The underlying principle is that during one exposure time, different masks are imposed on the high-speed scene to form a compressed measurement. With knowledge of the masks, optimization algorithms or deep learning methods are employed to reconstruct the desired high-speed video frames from this snapshot measurement. Unfortunately, though these methods can achieve decent results, the long running time of optimization algorithms or the huge training memory footprint of deep networks still precludes their use in practical applications. In this paper, we develop a memory-efficient network for large-scale video SCI based on multi-group reversible 3D convolutional neural networks. In addition to the basic model for the grayscale SCI system, we go one step further and combine demosaicing and SCI reconstruction to directly recover color video from Bayer measurements. Extensive results on both simulated and real data captured by SCI cameras demonstrate that our proposed model outperforms the previous state of the art with less memory and thus can be used in large-scale problems. The code is at https://github.com/BoChenGroup/RevSCI-net.
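The measurement principle described above is commonly written as Y = sum over t of M_t ⊙ X_t, where X_t are the video frames, M_t the masks, and ⊙ the elementwise product. The sketch below implements that noise-free forward model in plain Python; the binary masks, the toy frame sizes, and the function name are assumptions for illustration and do not come from the paper's code.

```python
def sci_measurement(frames, masks):
    """Compress T frames into one snapshot: Y = sum_t M_t * X_t (elementwise)."""
    if len(frames) != len(masks):
        raise ValueError("need exactly one mask per frame")
    height, width = len(frames[0]), len(frames[0][0])
    snapshot = [[0.0] * width for _ in range(height)]
    for frame, mask in zip(frames, masks):
        for i in range(height):
            for j in range(width):
                # Each pixel accumulates only the frames its mask lets through.
                snapshot[i][j] += mask[i][j] * frame[i][j]
    return snapshot


# Two 2 x 2 frames coded by two binary masks and summed on the detector.
frames = [[[1.0, 2.0], [3.0, 4.0]],
          [[5.0, 6.0], [7.0, 8.0]]]
masks = [[[1, 0], [0, 1]],
         [[0, 1], [1, 1]]]
snapshot = sci_measurement(frames, masks)
```

Reconstruction is the inverse problem: recovering all frames from the single snapshot and the known masks, which is what the optimization and deep learning methods mentioned in the abstract address.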

preprint2021arXiv

MetaSCI: Scalable and Adaptive Reconstruction for Video Compressive Sensing

To capture high-speed videos using a two-dimensional detector, video snapshot compressive imaging (SCI) is a promising system, where the video frames are coded by different masks and then compressed to a snapshot measurement. Following this, efficient algorithms are desired to reconstruct the high-speed frames, where the state-of-the-art results are achieved by deep learning networks. However, these networks are usually trained for specific small-scale masks and often have high demands of training time and GPU memory, which are hence not flexible to (i) a new mask with the same size and (ii) a larger-scale mask. We address these challenges by developing a Meta Modulated Convolutional Network for SCI reconstruction, dubbed MetaSCI. MetaSCI is composed of a shared backbone for different masks, and light-weight meta-modulation parameters to evolve to different modulation parameters for each mask, thus having the properties of fast adaptation to new masks (or systems) and being ready to scale to large data. Extensive simulation and real data results demonstrate the superior performance of our proposed approach. Our code is available at https://github.com/xyvirtualgroup/MetaSCI-CVPR2021.

preprint2021arXiv

Robust Trajectory-Constrained Frequency Control for Microgrids Considering Model Linearization Error

The capability to switch between grid-connected and islanded modes has promoted adoption of microgrid technology for powering remote locations. Stabilizing frequency during the islanding event, however, is a challenging control task, particularly under high penetration of converter-interfaced sources. In this paper, a numerical optimal control (NOC)-based control synthesis methodology is proposed for microgrid islanding preparedness that ensures guaranteed performance. The key feature of the proposed paradigm is near real-time centralized scheduling for real-time decentralized execution. For tractable computation, linearized models are used in the problem formulation. To accommodate the linearization errors, interval analysis is employed to compute the linearization-induced uncertainty as numerical intervals so that the NOC problem can be formulated as a robust mixed-integer linear program. The proposed control is verified on the full nonlinear model in Simulink. The simulation results show the effectiveness of the proposed control paradigm and the necessity of considering linearization-induced uncertainty.

preprint2020arXiv

A framework for constructing a huge name disambiguation dataset: algorithms, visualization and human collaboration

We present a manually labeled Author Name Disambiguation (AND) dataset called WhoisWho, which consists of 399,255 documents and 45,187 distinct authors with 421 ambiguous author names. To label such a large amount of AND data with high accuracy, we propose a novel annotation framework in which human and computer collaborate efficiently and precisely. Within the framework, we also propose an inductive disambiguation model to classify whether two documents belong to the same author. We evaluate the proposed method and other state-of-the-art disambiguation methods on WhoisWho. The experimental results show that: (1) Our model outperforms other disambiguation algorithms on this challenging benchmark. (2) The AND problem remains largely unsolved and requires more in-depth research. We believe that such a large-scale benchmark will bring great value to the author name disambiguation task. We also conduct several experiments to show that our annotation framework can help annotators produce accurate results efficiently and effectively eliminate labeling errors made by human annotators.

preprint2020arXiv

A highly efficient integrated source of twisted single-photons

Photons with a helical phase front (twisted photons) can carry a discrete and, in principle, unbounded amount of orbital angular momentum (OAM). Twisted single photons have been demonstrated as a high-dimensional quantum system with information processing ability far beyond the widely used two-level qubits. To date, the generation of single photons carrying OAM has relied solely on non-linear processes in bulk crystals, e.g., spontaneous parametric down-conversion (SPDC), which unavoidably limits both the efficiency and the scalability of the source. Therefore, an on-demand OAM quantum light source on a semiconductor chip is still elusive and highly desirable for integrated photonic quantum technologies. Here we demonstrate highly efficient emission of twisted single photons from solid-state quantum emitters embedded in a microring with angular gratings. The cavity QED effect allows the generation of single photons and the encoding of OAM in the same nanostructure and therefore enables the realization of devices with very small footprints and great scalability. The OAM states of single photons are clearly identified via quantum interference of single photons with themselves. Our device may boost the development of integrated quantum photonic devices with potential applications in high-dimensional quantum information processing.

preprint2020arXiv

Can weight sharing outperform random architecture search? An investigation with TuNAS

Efficient Neural Architecture Search methods based on weight sharing have shown good promise in democratizing Neural Architecture Search for computer vision models. There is, however, an ongoing debate whether these efficient methods are significantly better than random search. Here we perform a thorough comparison between efficient and random search methods on a family of progressively larger and more challenging search spaces for image classification and detection on ImageNet and COCO. While the efficacies of both methods are problem-dependent, our experiments demonstrate that there are large, realistic tasks where efficient search methods can provide substantial gains over random search. In addition, we propose and evaluate techniques which improve the quality of searched architectures and reduce the need for manual hyper-parameter tuning. Source code and experiment data are available at https://github.com/google-research/google-research/tree/master/tunas

preprint2020arXiv

Comparison and Benchmark of Graph Clustering Algorithms

Graph clustering is widely used in the analysis of biological networks, social networks, etc. Many graph clustering algorithms have been published over more than a decade; however, a comprehensive and consistent performance comparison is not available. In this paper we benchmark more than 70 graph clustering programs to evaluate their runtime and quality performance on both weighted and unweighted graphs. We also analyze the characteristics of the ground truth that affect the performance. Our work can not only supply a starting point for engineers selecting clustering algorithms but also provide a viewpoint for researchers designing new algorithms.

preprint2020arXiv

CONNA: Addressing Name Disambiguation on The Fly

Name disambiguation is a key and also a very tough problem in many online systems such as social search and academic search. Despite considerable research, a critical issue that has not been systematically studied is disambiguation on the fly -- to complete the disambiguation in real time. This is very challenging, as the disambiguation algorithm must be accurate, efficient, and error-tolerant. In this paper, we propose a novel framework -- CONNA -- to train a matching component and a decision component jointly via reinforcement learning. The matching component is responsible for finding the top matched candidate for the given paper, and the decision component is responsible for deciding whether to assign the top matched person or to create a new person. The two components are intertwined and can be bootstrapped via joint training. Empirically, we evaluate CONNA on two name disambiguation datasets. Experimental results show that the proposed framework can achieve a 1.21%-19.84% improvement in F1-score using joint training of the matching and the decision components. The proposed CONNA has been successfully deployed on AMiner -- a large online academic search system.

preprint2020arXiv

Deep Autoencoding Topic Model with Scalable Hybrid Bayesian Inference

To build a flexible and interpretable model for document analysis, we develop deep autoencoding topic model (DATM) that uses a hierarchy of gamma distributions to construct its multi-stochastic-layer generative network. In order to provide scalable posterior inference for the parameters of the generative network, we develop topic-layer-adaptive stochastic gradient Riemannian MCMC that jointly learns simplex-constrained global parameters across all layers and topics, with topic and layer specific learning rates. Given a posterior sample of the global parameters, in order to efficiently infer the local latent representations of a document under DATM across all stochastic layers, we propose a Weibull upward-downward variational encoder that deterministically propagates information upward via a deep neural network, followed by a Weibull distribution based stochastic downward generative model. To jointly model documents and their associated labels, we further propose supervised DATM that enhances the discriminative power of its latent representations. The efficacy and scalability of our models are demonstrated on both unsupervised and supervised learning tasks on big corpora.

preprint2020arXiv

End-to-End Learnable Geometric Vision by Backpropagating PnP Optimization

Deep networks excel in learning patterns from large amounts of data. On the other hand, many geometric vision tasks are specified as optimization problems. To seamlessly combine deep learning and geometric vision, it is vital to perform learning and geometric optimization end-to-end. Towards this aim, we present BPnP, a novel network module that backpropagates gradients through a Perspective-n-Points (PnP) solver to guide parameter updates of a neural network. Based on implicit differentiation, we show that the gradients of a "self-contained" PnP solver can be derived accurately and efficiently, as if the optimizer block were a differentiable function. We validate BPnP by incorporating it in a deep model that can learn camera intrinsics, camera extrinsics (poses) and 3D structure from training datasets. Further, we develop an end-to-end trainable pipeline for object pose estimation, which achieves greater accuracy by combining feature-based heatmap losses with 2D-3D reprojection errors. Since our approach can be extended to other optimization problems, our work helps to pave the way to perform learnable geometric vision in a principled manner. Our PyTorch implementation of BPnP is available on http://github.com/BoChenYS/BPnP.

preprint2020arXiv

Global weak solutions for Landau-Lifshitz flows and heat flows associated to micromagnetic energy functional

We follow the idea of Wang [W] to show the existence of global weak solutions to the Cauchy problems of Landau-Lifshitz type equations and related heat flows from an $n$-dimensional Euclidean domain $\Omega$ or an $n$-dimensional closed Riemannian manifold $M$ into a 2-dimensional unit sphere $\mathbb{S}^{2}$. Our conclusions extend a series of related results obtained in the previous literature.

preprint2020arXiv

JarKA: Modeling Attribute Interactions for Cross-lingual Knowledge Alignment

Cross-lingual knowledge alignment is the cornerstone in building a comprehensive knowledge graph (KG), which can benefit various knowledge-driven applications. As the structures of KGs are usually sparse, attributes of entities may play an important role in aligning the entities. However, the heterogeneity of the attributes across KGs prevents entities from being accurately embedded and compared. To deal with this issue, we propose to model the interactions between attributes, instead of globally embedding an entity with all the attributes. We further propose a joint framework to merge the alignments inferred from the attributes and the structures. Experimental results show that the proposed model outperforms the state-of-the-art baselines by up to 38.48% HitRatio@1. The results also demonstrate that our model can infer the alignments between attributes, relationships and values, in addition to entities.

preprint2020arXiv

Large few-layer hexagonal boron nitride flakes for nonlinear optics

Hexagonal boron nitride (hBN) is a layered dielectric material with a wide range of applications in optics and photonics. In this work, we demonstrate a fabrication method for few-layer hBN flakes with areas up to 5000 $\rm μm^2$. We show that hBN in this form can be integrated with photonic microstructures: as an example, we use a circular Bragg grating (CBG). The layer quality of the exfoliated hBN flake on a CBG is confirmed by second-harmonic generation (SHG) microscopy. We show that the SHG signal is uniform across the hBN sample outside the CBG and is amplified in the centre of the CBG.

preprint2020arXiv

Large Magnetoresistance in Topological Insulator Candidate TaSe3

Large unsaturated magnetoresistance (XMR) with a magnitude of about 1000% is observed in the topological insulator candidate TaSe3 from our high-field (up to 38 T) measurements. Two oscillation modes, associated with one hole pocket and two electron pockets in the bulk, respectively, are detected from our Shubnikov-de Haas (SdH) measurements, consistent with our first-principles calculations. With detailed Hall measurements, our two-band model analysis yields an imperfect density ratio n_h/n_e close to 0.9 at T < 20 K, which suggests that carrier compensation accounts for the XMR in TaSe3.

preprint2020arXiv

Mesh Independence of an Accelerated Block Coordinate Descent Method for Sparse Optimal Control Problems

An accelerated block coordinate descent (ABCD) method in Hilbert space is analyzed to solve the sparse optimal control problem via its dual. The finite element approximation of this method is investigated and convergence results are presented. Based on the second-order growth condition of the dual objective function, we show that the iteration sequence of dual variables has an iteration complexity of $O(1/k)$. Moreover, we also prove the iteration complexity for the primal problem. Two types of mesh independence for the ABCD method are proved, which assert that asymptotically the infinite-dimensional ABCD method and its finite-dimensional discretizations have the same convergence property, and that the number of iterations of the ABCD method remains nearly constant as the discretization is refined.

preprint2020arXiv

MnasFPN: Learning Latency-aware Pyramid Architecture for Object Detection on Mobile Devices

Despite the blooming success of architecture search for vision tasks in resource-constrained environments, the design of on-device object detection architectures has mostly been manual. The few automated search efforts are either centered around non-mobile-friendly search spaces or not guided by on-device latency. We propose MnasFPN, a mobile-friendly search space for the detection head, and combine it with latency-aware architecture search to produce efficient object detection models. The learned MnasFPN head, when paired with a MobileNetV2 body, outperforms MobileNetV3+SSDLite by 1.8 mAP at similar latency on Pixel. It is also both 1.0 mAP more accurate and 10% faster than NAS-FPNLite. Ablation studies show that the majority of the performance gain comes from innovations in the search space. Further explorations reveal an interesting coupling between the search space design and the search algorithm, and that the complexity of the MnasFPN search space may be at a local optimum.

preprint2020arXiv

Numerical schemes to reconstruct three dimensional time-dependent point sources of acoustic waves

This paper is concerned with the numerical simulation of three-dimensional time-dependent inverse source problems of acoustic waves. The reconstructions of both multiple stationary point sources and a moving point source are considered. The modified method of fundamental solutions (MMFS), which expands the solution utilizing the time convolution of the Green's function and the signal function, is proposed to solve the problem. For the reconstruction of a moving point source, moreover, the MMFS is simplified to a simple sampling method at each time step. Numerical experiments are provided to show the effectiveness of the proposed methods.

preprint2020arXiv

Recurrent Hierarchical Topic-Guided RNN for Language Generation

To simultaneously capture syntax and global semantics from a text corpus, we propose a new larger-context recurrent neural network (RNN) based language model, which extracts recurrent hierarchical semantic structure via a dynamic deep topic model to guide natural language generation. Moving beyond a conventional RNN-based language model that ignores long-range word dependencies and sentence order, the proposed model captures not only intra-sentence word dependencies, but also temporal transitions between sentences and inter-sentence topic dependencies. For inference, we develop a hybrid of stochastic-gradient Markov chain Monte Carlo and recurrent autoencoding variational Bayes. Experimental results on a variety of real-world text corpora demonstrate that the proposed model not only outperforms larger-context RNN-based language models, but also learns interpretable recurrent multilayer topics and generates diverse sentences and paragraphs that are syntactically correct and semantically coherent.

preprint2020arXiv

Towards Designing A Secure Plausibly Deniable System for Mobile Devices against Multi-snapshot Adversaries -- A Preliminary Design

Mobile computing devices are used broadly to store, manage and process sensitive or even mission-critical data. To protect the confidentiality of data stored in mobile devices, major mobile operating systems use full disk encryption, which relies on traditional encryption mechanisms and requires that decryption keys are not disclosed. This, however, is not necessarily true, since an active attacker may coerce victims into revealing decryption keys. Plausibly deniable encryption (PDE) can defend against such a coercive attacker by disguising the true secret key with a decoy key. Leveraging the concept of PDE, various deniable storage systems have been built for both PC and mobile platforms. However, a secure PDE system for mobile devices that is compatible with mainstream mobile devices and, meanwhile, remains secure when facing a strong multi-snapshot adversary is still missing. In this work, we propose a preliminary PDE system design for mobile computing devices using flash memory as the underlying storage medium. Ours is the first secure PDE system for mobile devices which has the following new design features: 1) it is compatible with mainstream mobile devices due to its integration of PDE into the flash translation layer (FTL), the most popular form of flash memory used by modern mobile devices; and 2) it can defend against the multi-snapshot adversary by denying hidden writes (over the flash memory) caused by hidden sensitive data using random dummy writes.

preprint2020arXiv

Variational Hetero-Encoder Randomized GANs for Joint Image-Text Modeling

For bidirectional joint image-text modeling, we develop variational hetero-encoder (VHE) randomized generative adversarial network (GAN), a versatile deep generative model that integrates a probabilistic text decoder, probabilistic image encoder, and GAN into a coherent end-to-end multi-modality learning framework. VHE randomized GAN (VHE-GAN) encodes an image to decode its associated text, and feeds the variational posterior as the source of randomness into the GAN image generator. We plug three off-the-shelf modules, including a deep topic model, a ladder-structured image encoder, and StackGAN++, into VHE-GAN, which already achieves competitive performance. This further motivates the development of VHE-raster-scan-GAN that generates photo-realistic images in not only a multi-scale low-to-high-resolution manner, but also a hierarchical-semantic coarse-to-fine fashion. By capturing and relating hierarchical semantic and visual concepts with end-to-end training, VHE-raster-scan-GAN achieves state-of-the-art performance in a wide variety of image-text multi-modality learning and generation tasks.

preprint2020arXiv

WHAI: Weibull Hybrid Autoencoding Inference for Deep Topic Modeling

To train an inference network jointly with a deep generative topic model, making it both scalable to big corpora and fast in out-of-sample prediction, we develop Weibull hybrid autoencoding inference (WHAI) for deep latent Dirichlet allocation, which infers posterior samples via a hybrid of stochastic-gradient MCMC and autoencoding variational Bayes. The generative network of WHAI has a hierarchy of gamma distributions, while the inference network of WHAI is a Weibull upward-downward variational autoencoder, which integrates a deterministic-upward deep neural network and a stochastic-downward deep generative model based on a hierarchy of Weibull distributions. The Weibull distribution can closely approximate a gamma distribution, has an analytic Kullback-Leibler divergence from it, and admits a simple reparameterization via uniform noise; together these properties make it efficient to compute the gradients of the evidence lower bound with respect to the parameters of the inference network. The effectiveness and efficiency of WHAI are illustrated with experiments on big corpora.
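The two Weibull properties mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the standard shape/scale parameterization Weibull(k, λ) and the shape/rate parameterization Gamma(α, β); the function names are hypothetical and are not taken from the WHAI code. The KL expression is the textbook closed form obtained from E[ln x] = ln λ − γ/k and E[x] = λΓ(1 + 1/k), where γ is the Euler-Mascheroni constant.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def sample_weibull(k, lam, rng):
    """Reparameterized draw: push uniform noise through the inverse CDF.

    x = lam * (-ln(1 - u))^(1/k), with u ~ Uniform(0, 1). The sample is a
    deterministic, differentiable function of (k, lam) given the noise u.
    """
    u = rng.random()
    return lam * (-math.log(1.0 - u)) ** (1.0 / k)


def kl_weibull_gamma(k, lam, alpha, beta):
    """Closed-form KL( Weibull(k, lam) || Gamma(alpha, beta) ), rate beta."""
    return (
        EULER_GAMMA * alpha / k
        - alpha * math.log(lam)
        + math.log(k)
        + beta * lam * math.gamma(1.0 + 1.0 / k)
        - EULER_GAMMA
        - 1.0
        - alpha * math.log(beta)
        + math.lgamma(alpha)
    )
```

As a sanity check, Weibull(1, 1) and Gamma(1, 1) are both the unit exponential distribution, so the divergence between them evaluates to zero.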

preprint2019arXiv

Causal variance decompositions for institutional comparisons in healthcare

There is increasing interest in comparing institutions delivering healthcare in terms of disease-specific quality indicators (QIs) that capture processes or outcomes showing variations in the care provided. Such comparisons can be framed in terms of causal models, where adjusting for patient case-mix is analogous to controlling for confounding, and the exposure is, for instance, being treated in a given hospital. Our goal here is to help identify good QIs rather than to compare hospitals in terms of an already chosen QI, and so we focus on the presence and magnitude of overall variation in care between the hospitals rather than the pairwise differences between any two hospitals. We consider how the observed variation in care received at patient level can be decomposed into that causally explained by the hospital performance adjusting for the case-mix, the case-mix itself, and residual variation. For this purpose, we derive a three-way variance decomposition, with particular attention to its causal interpretation in terms of potential outcome variables. We propose model-based estimators for the decomposition, accommodating different link functions and either fixed or random effect models. We evaluate their performance in a simulation study and demonstrate their use in a real data application.

preprint2019arXiv

Experimental observation of the gate-controlled reversal of the anomalous Hall effect in the intrinsic magnetic topological insulator MnBi2Te4 device

Here we report the reversed anomalous Hall effect (AHE) in a 5-septuple-layer van der Waals device of the intrinsic magnetic topological insulator MnBi2Te4. Under top/bottom gating, a negative AHE loop gradually decreases to zero and then changes sign. The reversed AHE exhibits coercive fields and a temperature dependence distinct from those of the original AHE. It reaches its maximum inside the gap of the Dirac cone. The newly observed reversed AHE is attributed to the competition between the intrinsic Berry curvature and the Dirac-gap-enhanced extrinsic skew scattering. Its gate-controlled switching offers a scheme for topological spin field-effect transistors.

preprint2019arXiv

Magneto-transport and Shubnikov-de Haas oscillations in the layered ternary telluride Ta3SiTe6 topological semimetal

Topological semimetals constitute a novel class of quantum materials hosting Dirac/Weyl fermions. The important features of topological fermions can be revealed by quantum oscillations. Here we report the magnetoresistance and Shubnikov-de Haas (SdH) quantum oscillations of the longitudinal resistance in a single crystal of the topological semimetal Ta3SiTe6 in magnetic fields up to 38 T. The periodic amplitude of the oscillations reveals information about the Fermi surface. The fast Fourier transform spectrum shows a single oscillation frequency. The analysis of the oscillations yields a Fermi pocket with a cross-sectional area of 0.13 Å$^{-2}$. Combining magneto-transport measurements and first-principles calculations, we find that these oscillations come from the hole pocket. The Hall resistivity and the SdH oscillations indicate that Ta3SiTe6 is a hole-dominated system.
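For orientation, an extremal Fermi-surface cross-section is conventionally related to an SdH frequency through the Onsager relation F = ħA / (2πe). The sketch below only performs that unit conversion for the area quoted above; the abstract does not state the measured frequency, so the result is a back-of-the-envelope figure rather than a reported value, and the function name is hypothetical.

```python
import math

HBAR = 1.054571817e-34      # reduced Planck constant, J s
E_CHARGE = 1.602176634e-19  # elementary charge, C


def sdh_frequency_tesla(area_inv_angstrom2):
    """Onsager relation: oscillation frequency (T) from an extremal
    Fermi-surface cross-section given in inverse square angstroms."""
    area_inv_m2 = area_inv_angstrom2 * 1e20  # 1 A^-2 = 1e20 m^-2
    return HBAR * area_inv_m2 / (2.0 * math.pi * E_CHARGE)


frequency = sdh_frequency_tesla(0.13)
print(f"{frequency:.0f} T")
```

Under this relation, an area of 0.13 Å$^{-2}$ corresponds to a frequency of roughly 1.36 kT.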

preprint2015arXiv

Structural Properties of an Open Problem in Preemptive Scheduling

Structural properties of optimal preemptive schedules have been studied in a number of recent papers with a primary focus on two structural parameters: the minimum number of preemptions necessary, and a tight lower bound on 'shifts', i.e., the sizes of intervals bounded by the times created by preemptions, job starts, or completions. So far only rough bounds for these parameters have been derived for specific problems. This paper sharpens the bounds on these structural parameters for a well-known open problem in the theory of preemptive scheduling: Instances consist of in-trees of $n$ unit-execution-time jobs with release dates, and the objective is to minimize the total completion time on two processors. This is among the current, tantalizing 'threshold' problems of scheduling theory: Our literature survey reveals that any significant generalization leads to an NP-hard problem, but that any significant simplification leads to a tractable problem. For the above problem, we show that the number of preemptions necessary for optimality need not exceed $2n-1$; that the number must be of order $\Omega(\log n)$ for some instances; and that the minimum shift need not be less than $2^{-2n+1}$. These bounds are obtained by combinatorial analysis of optimal schedules rather than by the analysis of polytope corners for linear-program formulations, an approach to be found in earlier papers. The bounds immediately follow from a fundamental structural property called 'normality', by which minimal shifts of a job are exponentially decreasing functions. In particular, the first interval between a preempted job's start and its preemption is a multiple of 1/2, the second such interval is a multiple of 1/4, and in general, the $i$-th preemption occurs at a multiple of $2^{-i}$. We expect the new structural properties to play a prominent role in finally settling a vexing, still-open question of complexity.
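The bounds quoted in this abstract are easy to tabulate with exact rational arithmetic. The sketch below is only an illustration of the stated formulas ($2n-1$ preemptions, minimum shift $2^{-2n+1}$, and the $2^{-i}$ grid for the $i$-th preemption); the helper names are hypothetical and nothing here reproduces the paper's proofs or algorithms.

```python
from fractions import Fraction


def max_preemptions_needed(n):
    """Upper bound from the abstract: preemptions need not exceed 2n - 1."""
    return 2 * n - 1


def min_shift_bound(n):
    """Bound from the abstract: the minimum shift need not be below 2^(-2n+1)."""
    return Fraction(1, 2 ** (2 * n - 1))


def on_dyadic_grid(t, i):
    """True if time t is an integer multiple of 2^(-i)."""
    return (Fraction(t) * 2 ** i).denominator == 1


for n in (1, 2, 3, 4):
    print(n, max_preemptions_needed(n), min_shift_bound(n))
```

For example, 3/4 lies on the grid for a second preemption (multiples of 1/4) but not on the grid for a first preemption (multiples of 1/2).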