Source author record

Xue Liu

Xue Liu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Catalog footprint

What is connected

36 works

24 topics

4 close collaborators


preprint 2026 arXiv

MINER: Mining Multimodal Internal Representation for Efficient Retrieval

Visual document retrieval has become essential for accessing information in visually rich documents. Existing approaches fall into two camps. Late-interaction retrievers achieve strong quality through fine-grained token-level matching but store hundreds of vectors per page, incurring large index footprints and high serving costs. By contrast, dense single-vector retrievers retain storage and latency advantages but consistently lag in quality because they compress all information into a single final-layer embedding. In this work, we first conduct a layerwise diagnostic on single-vector retrievers, revealing that retrieval-relevant signal resides in internal representations. Motivated by these findings, we propose MINER (Mining Multimodal Internal RepreseNtation for Efficient Retrieval), a lightweight plug-in module that probes and fuses internal signals across transformer layers into a single compact embedding without modifying the backbone or sacrificing single-vector efficiency. The first Retrieval-Aligned Layer Probing stage attaches a lightweight probe at each layer, surfacing which dimensions carry retrieval-relevant information. The subsequent Adaptive Sparse Multi-Layer Fusion stage applies performance-adaptive neuron-level masking to the selected layers and fuses the surviving signals into the final dense vector. Across ViDoRe V1/V2/V3, MINER outperforms existing dense single-vector retrievers on the majority of benchmarks, with up to 4.5% nDCG@5 improvement over its corresponding backbone. Compared to strong late-interaction baselines, in some settings MINER substantially narrows the nDCG@5 gap to 0.2 while preserving the storage and serving advantages of dense retrieval.
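For readers unfamiliar with the nDCG@5 metric quoted in the abstract above, here is a minimal sketch of how it is commonly computed. It assumes linear gain (some evaluations use 2^rel - 1 instead) and assumes the input list holds the graded relevance of every judged document in ranked order; the function names are illustrative and are not taken from the MINER paper.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k ranked relevance grades."""
    # rank is 0-based, so the discount for the first result is log2(2) = 1.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k=5):
    """nDCG@k: DCG of the given ranking divided by the DCG of the ideal ranking."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0
```

A ranking that places the only relevant page first scores 1.0; moving it to the second position lowers the score to 1 / log2(3).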

preprint 2023 arXiv

Anonymous Pattern Molecular Fingerprint and its Applications on Property Identification

Molecular fingerprints are significant cheminformatics tools to map molecules into vectorial space according to their characteristics in diverse functional groups, atom sequences, and other topological structures. In this paper, we set out to investigate a novel molecular fingerprint, Anonymous-FP, that possesses abundant perception about the underlying interactions shaped in small, medium, and large molecular scale links. In detail, the possible inherent atom chains are sampled from each molecule and are extended in a certain anonymous pattern. After that, the molecular fingerprint Anonymous-FP is encoded by means of the Natural Language Processing technique PV-DBOW. Anonymous-FP is studied on molecular property identification and has shown valuable advantages such as rich information content, high experimental performance, and full structural significance. During the experimental verification, the scale of the atom chain or its anonymous manner matters significantly to the overall representation ability of Anonymous-FP. Generally, the typical scale r = 8 enhances the performance on a series of real-world molecules, and specifically, the accuracy could level up to above 93% on all NCI datasets.

preprint 2023 arXiv

Dual-space Hierarchical Learning for Goal-guided Conversational Recommendation

Proactively and naturally guiding the dialog from the non-recommendation context (e.g., Chit-chat) to the recommendation scenario (e.g., Music) is crucial for the Conversational Recommender System (CRS). Prior studies mainly focus on planning the next dialog goal (e.g., chat on a movie star) conditioned on the previous dialog. However, we find the dialog goals can be simultaneously observed at different levels, which can be utilized to improve CRS. In this paper, we propose Dual-space Hierarchical Learning (DHL) to leverage multi-level goal sequences and their hierarchical relationships for conversational recommendation. Specifically, we exploit multi-level goal sequences from both the representation space and the optimization space. In the representation space, we propose the hierarchical representation learning where a cross attention module derives mutually enhanced multi-level goal representations. In the optimization space, we devise the hierarchical weight learning to reweight lower-level goal sequences, and introduce bi-level optimization for stable update. Additionally, we propose a soft labeling strategy to guide optimization gradually. Experiments on two real-world datasets verify the effectiveness of our approach. Code and data are available here.

preprint 2022 arXiv

A Graph Data Augmentation Strategy with Entropy Preservation

The Graph Convolutional Network (GCN) proposed by Kipf and Welling is an effective model for semi-supervised learning, but faces the obstacle of over-smoothing, which weakens the representation ability of GCN. Recently, several works have been proposed to tackle this limitation by randomly perturbing the graph topology or feature matrix to generate data augmentations as input for training. However, these operations inevitably damage the integrity of information structures and have to sacrifice the smoothness of the feature manifold. In this paper, we first introduce a novel graph entropy definition as a measure to quantitatively evaluate the smoothness of a data manifold and then point out that this graph entropy is controlled by triangle motif-based information structures. Considering the preservation of graph entropy, we propose an effective strategy to generate randomly perturbed training data while maintaining both graph topology and graph entropy. Extensive experiments have been conducted on real-world datasets and the results verify the effectiveness of our proposed method in improving semi-supervised node classification accuracy compared with a range of baselines. Beyond that, our proposed approach could significantly enhance the robustness of the training process for GCN.

preprint 2022 arXiv

Encoder-Decoder Architecture for Supervised Dynamic Graph Learning: A Survey

In recent years, prevalent online services have generated a sheer volume of user activity data. Service providers collect these data in order to perform client behavior analysis and offer better and more customized services. The majority of these data can be modeled and stored as graphs, such as the social graph in Facebook and the user-video interaction graph in YouTube. These graphs need to evolve over time to capture the dynamics in the real world, leading to the invention of dynamic graphs. However, the temporal information embedded in the dynamic graphs brings new challenges in analyzing and deploying them. Event staleness, temporal information learning and explicit time dimension usage are some example challenges in dynamic graph learning. In order to offer a convenient reference to both industry and academia, this survey presents the Three Stages Recurrent Temporal Learning Framework based on dynamic graph evolution theories, so as to interpret the learning of temporal information with a generalized framework. Under this framework, this survey categorizes and reviews different learnable encoder-decoder architectures for supervised dynamic graph learning. We believe that this survey could supply useful guidelines to researchers and engineers in finding suitable graph structures for their dynamic learning tasks.

preprint 2022 arXiv

Layer-dependent interlayer antiferromagnetic spin reorientation in air-stable semiconductor CrSBr

Magnetic van der Waals (vdW) materials offer a fantastic platform to investigate and exploit rich spin configurations stabilized in reduced dimensions. One tantalizing magnetic order is the interlayer antiferromagnetism in A-type vdW antiferromagnet, which may be effectively modified by the magnetic field, stacking order and thickness scaling. However, atomically revealing the interlayer spin orientation in the vdW antiferromagnet is highly challenging, because most of the material candidates exhibit an insulating ground state or instability in ambient conditions. Here, we report the layer-dependent interlayer antiferromagnetic reorientation in air-stable semiconductor CrSBr using magnetotransport characterization and first-principles calculations. We reveal a pronounced odd-even layer effect of interlayer reorientation, which originates from the competitions among interlayer exchange, magnetic anisotropy energy and extra Zeeman energy of uncompensated magnetization. Furthermore, we quantitatively constructed the layer-dependent magnetic phase diagram with the help of a linear-chain model. Our work uncovers the layer-dependent interlayer antiferromagnetic reorientation engineered by magnetic field in the air-stable semiconductor, which could contribute to future vdW spintronic devices.

preprint 2022 arXiv

Multi-FR: A Multi-objective Optimization Framework for Multi-stakeholder Fairness-aware Recommendation

Nowadays, most online services are hosted on multi-stakeholder marketplaces, where consumers and producers may have different objectives. Conventional recommendation systems, however, mainly focus on maximizing consumers' satisfaction by recommending the most relevant items to each individual. This may result in unfair exposure of items, thus jeopardizing producer benefits. Additionally, they do not care whether consumers from diverse demographic groups are equally satisfied. To address these limitations, we propose a multi-objective optimization framework for fairness-aware recommendation, Multi-FR, that adaptively balances accuracy and fairness for various stakeholders with Pareto optimality guarantee. We first propose four fairness constraints on consumers and producers. In order to train the whole framework in an end-to-end way, we utilize the smooth rank and stochastic ranking policy to make these fairness criteria differentiable and friendly to back-propagation. Then, we adopt the multiple gradient descent algorithm to generate a Pareto set of solutions, from which the most appropriate one is selected by the Least Misery Strategy. The experimental results demonstrate that Multi-FR largely improves recommendation fairness on multiple stakeholders over the state-of-the-art approaches while maintaining almost the same recommendation accuracy. The training efficiency study confirms our model's ability to simultaneously optimize different fairness constraints for many stakeholders efficiently.

preprint 2022 arXiv

Multi-Pair D2D Communications Aided by An Active RIS over Spatially Correlated Channels with Phase Noise

This paper investigates a multi-pair device-to-device (D2D) communication system aided by an active reconfigurable intelligent surface (RIS) with phase noise and direct link. The approximate closed-form expression of the ergodic sum rate is derived over spatially correlated Rician fading channels with statistical channel state information (CSI). When the Rician factors go to infinity, the asymptotic expressions of the ergodic sum rates are presented to give insights in poor scattering environment. The power scaling law for the special case of a single D2D pair is presented without phase noise under uncorrelated Rician fading condition. Then, to solve the ergodic sum rate maximization problem, a method based on genetic algorithm (GA) is proposed for joint power control and discrete phase shifts optimization. Simulation results verify the accuracy of our derivations, and also show that the active RIS outperforms the passive RIS.

preprint 2022 arXiv

Nonreciprocal transport in a bilayer of MnBi2Te4 and Pt

MnBi2Te4 (MBT) is the first intrinsic magnetic topological insulator with the interaction of spin-momentum locked surface electrons and intrinsic magnetism, and it exhibits novel magnetic and topological phenomena. Recent studies suggested that the interaction of electrons and magnetism can be affected by the Mn-doped Bi2Te3 phase at the surface due to inevitable structural defects. Here we report an observation of nonreciprocal transport, i.e. current-direction-dependent resistance, in a bilayer composed of antiferromagnetic MBT and nonmagnetic Pt. The emergence of the nonreciprocal response below the Néel temperature confirms a correlation between nonreciprocity and intrinsic magnetism in the surface state of MBT. The angular dependence of the nonreciprocal transport indicates that the nonreciprocal response originates from the asymmetric scattering of electrons at the surface of MBT mediated by magnons. Our work provides an insight into nonreciprocity arising from the correlation between magnetism and Dirac surface electrons in intrinsic magnetic topological insulators.

preprint 2022 arXiv

SPENDER: A Platform for Secure and Privacy-Preserving Decentralized P2P E-Commerce

The blockchain technology empowers secure, trustless, and privacy-preserving trading with cryptocurrencies. However, existing blockchain-based trading platforms only support trading cryptocurrencies with digital assets (e.g., NFTs). Although several payment service providers have started to accept cryptocurrency as a payment method for tangible goods (e.g., Visa, PayPal), customers still need to trust and hand over their private information to centralized E-commerce platforms (e.g., Amazon, eBay). To enable trustless and privacy-preserving trading between cryptocurrencies and real goods, we propose SPENDER, a smart-contract-based platform for Secure and Privacy-PresErviNg Decentralized P2P E-commeRce. The design of our platform enables various advantageous features and brings unlimited future potential. Moreover, our platform provides a complete paradigm for designing real-world Web3 infrastructures on the blockchain, which broadens the application scope and exploits the intrinsic values of cryptocurrencies. The platform has been built and tested on the Terra ecosystem, and we plan to open-source the code later.

preprint 2022 arXiv

Unbiased Implicit Feedback via Bi-level Optimization

Implicit feedback is widely leveraged in recommender systems since it is easy to collect and provides weak supervision signals. Recent works reveal a huge gap between the implicit feedback and user-item relevance due to the fact that implicit feedback is also closely related to the item exposure. To bridge this gap, existing approaches explicitly model the exposure and propose unbiased estimators to improve the relevance. Unfortunately, these unbiased estimators suffer from the high gradient variance, especially for long-tail items, leading to inaccurate gradient updates and degraded model performance. To tackle this challenge, we propose a low-variance unbiased estimator from a probabilistic perspective, which effectively bounds the variance of the gradient. Unlike previous works which either estimate the exposure via heuristic-based strategies or use a large biased training set, we propose to estimate the exposure via an unbiased small-scale validation set. Specifically, we first parameterize the user-item exposure by incorporating both user and item information, and then construct an unbiased validation set from the biased training set. By leveraging the unbiased validation set, we adopt bi-level optimization to automatically update exposure-related parameters along with recommendation model parameters during the learning. Experiments on two real-world datasets and two semi-synthetic datasets verify the effectiveness of our method.

preprint 2022 arXiv

Variational Nested Dropout

Nested dropout is a variant of dropout operation that is able to order network parameters or features based on the pre-defined importance during training. It has been explored for: I. Constructing nested nets: the nested nets are neural networks whose architectures can be adjusted instantly during testing time, e.g., based on computational constraints. The nested dropout implicitly ranks the network parameters, generating a set of sub-networks such that any smaller sub-network forms the basis of a larger one. II. Learning ordered representation: the nested dropout applied to the latent representation of a generative model (e.g., auto-encoder) ranks the features, enforcing explicit order of the dense representation over dimensions. However, the dropout rate is fixed as a hyper-parameter during the whole training process. For nested nets, when network parameters are removed, the performance decays in a human-specified trajectory rather than in a trajectory learned from data. For generative models, the importance of features is specified as a constant vector, restraining the flexibility of representation learning. To address the problem, we focus on the probabilistic counterpart of the nested dropout. We propose a variational nested dropout (VND) operation that draws samples of multi-dimensional ordered masks at a low cost, providing useful gradients to the parameters of nested dropout. Based on this approach, we design a Bayesian nested neural network that learns the order knowledge of the parameter distributions. We further exploit the VND under different generative models for learning ordered latent distributions. In experiments, we show that the proposed approach outperforms the nested network in terms of accuracy, calibration, and out-of-domain detection in classification tasks. It also outperforms the related generative models on data generation tasks.
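As a rough illustration of the baseline operation described above (not of the proposed variational version), the sketch below samples an ordered mask with a fixed rate. It assumes the commonly used truncated geometric distribution over the cut-off index; the function name and the parameter `p` are illustrative and do not come from the paper.

```python
import random

def nested_dropout_mask(dim, p=0.1, rng=random):
    """Sample an ordered mask: keep the first b units and drop all later ones.

    The cut-off index b follows a geometric distribution with fixed rate p,
    truncated at dim. This fixed p is the hyper-parameter the abstract refers to.
    """
    b = 1
    while b < dim and rng.random() > p:
        b += 1
    return [1] * b + [0] * (dim - b)
```

Because every sampled mask is a prefix of ones, earlier dimensions are kept more often than later ones, which is what induces the ordering.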

preprint 2021 arXiv

Graph Classification Based on Skeleton and Component Features

Most existing popular methods for learning graph embeddings only consider fixed-order global structural features and lack a hierarchical representation of structures. To address this weakness, we propose a novel graph embedding algorithm named GraphCSC that realizes classification based on skeleton information, using fixed-order structures learned via anonymous random walks, and component information, using subgraphs of different sizes. Two graphs are similar if their skeletons and components are both similar; thus, in our model, we integrate both of them into embeddings as a characterization of graph homogeneity. We demonstrate our model on different datasets in comparison with a comprehensive list of up-to-date state-of-the-art baselines, and experiments show that our work is superior in real-world graph classification tasks.

preprint 2021 arXiv

Knowledge-Enhanced Top-K Recommendation in Poincaré Ball

Personalized recommender systems are increasingly important as more content and services become available and users struggle to identify what might interest them. Thanks to their ability to provide rich information, knowledge graphs (KGs) are being incorporated to enhance recommendation performance and interpretability. To effectively make use of the knowledge graph, we propose a recommendation model in the hyperbolic space, which facilitates the learning of the hierarchical structure of knowledge graphs. Furthermore, a hyperbolic attention network is employed to determine the relative importance of the neighboring entities of a certain item. In addition, we propose an adaptive and fine-grained regularization mechanism to adaptively regularize items and their neighboring representations. Via a comparison using three real-world datasets with state-of-the-art methods, we show that the proposed model outperforms the best existing models by 2-16% in terms of NDCG@K on Top-K recommendation.
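The hyperbolic space in the title above is the Poincaré ball. As a minimal sketch, the standard geodesic distance on the unit ball (curvature -1) can be computed as below; the function name is illustrative, and the abstract does not say exactly how the model uses this distance.

```python
import math

def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit Poincaré ball."""
    def sq_norm(x):
        return sum(a * a for a in x)

    diff = sq_norm([a - b for a, b in zip(u, v)])
    denom = (1.0 - sq_norm(u)) * (1.0 - sq_norm(v))
    # d(u, v) = arcosh(1 + 2 * ||u - v||^2 / ((1 - ||u||^2) * (1 - ||v||^2)))
    return math.acosh(1.0 + 2.0 * diff / denom)
```

Distances grow rapidly as points approach the boundary of the ball, which is why tree-like hierarchies embed well in this geometry.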

preprint 2021 arXiv

Probabilistic Metric Learning with Adaptive Margin for Top-K Recommendation

Personalized recommender systems are playing an increasingly important role as more content and services become available and users struggle to identify what might interest them. Although matrix factorization and deep learning based methods have proved effective in user preference modeling, they violate the triangle inequality and fail to capture fine-grained preference information. To tackle this, we develop a distance-based recommendation model with several novel aspects: (i) each user and item are parameterized by Gaussian distributions to capture the learning uncertainties; (ii) an adaptive margin generation scheme is proposed to generate the margins regarding different training triplets; (iii) explicit user-user/item-item similarity modeling is incorporated in the objective function. The Wasserstein distance is employed to determine preferences because it obeys the triangle inequality and can measure the distance between probabilistic distributions. Via a comparison using five real-world datasets with state-of-the-art methods, the proposed model outperforms the best existing models by 4-22% in terms of recall@K on Top-K recommendation.
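To make the distance used above concrete: for two Gaussians with diagonal covariances, the 2-Wasserstein distance has a simple closed form. The sketch below assumes diagonal covariances parameterized by standard deviations; the abstract does not specify the covariance structure, so this is an illustrative assumption, as is the function name.

```python
import math

def w2_diag_gaussians(mu1, sigma1, mu2, sigma2):
    """2-Wasserstein distance between N(mu1, diag(sigma1^2)) and N(mu2, diag(sigma2^2)).

    For diagonal covariances: W2^2 = ||mu1 - mu2||^2 + ||sigma1 - sigma2||^2.
    """
    mean_term = sum((a - b) ** 2 for a, b in zip(mu1, mu2))
    std_term = sum((a - b) ** 2 for a, b in zip(sigma1, sigma2))
    return math.sqrt(mean_term + std_term)
```

With identical standard deviations, the distance reduces to the Euclidean distance between the means.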

preprint 2020 arXiv

Feature Statistics Guided Efficient Filter Pruning

Building compact convolutional neural networks (CNNs) with reliable performance is a critical but challenging task, especially when deploying them in real-world applications. As a common approach to reduce the size of CNNs, pruning methods delete part of the CNN filters according to some metrics such as l1-norm. However, previous methods hardly leverage the information variance in a single feature map and the similarity characteristics among feature maps. In this paper, we propose a novel filter pruning method, which incorporates two kinds of feature map selections: diversity-aware selection (DFS) and similarity-aware selection (SFS). DFS aims to discover features with low information diversity while SFS removes features that have high similarities with others. We conduct extensive empirical experiments with various CNN architectures on publicly available datasets. The experimental results demonstrate that our model obtains up to 91.6% parameter decrease and 83.7% FLOPs reduction with almost no accuracy loss.
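The l1-norm criterion mentioned above as a common baseline can be sketched as follows. This illustrates only that baseline, not the DFS or SFS selections proposed in the paper; the function names, the 2-D kernel layout, and the `keep_ratio` parameter are assumptions made for the example.

```python
def l1_norm(filter_weights):
    """Sum of absolute weights of one filter, given as a list of rows."""
    return sum(abs(w) for row in filter_weights for w in row)

def prune_by_l1(filters, keep_ratio=0.5):
    """Return the indices of the filters to keep, ranked by descending l1-norm."""
    n_keep = max(1, round(len(filters) * keep_ratio))
    ranked = sorted(range(len(filters)), key=lambda i: l1_norm(filters[i]), reverse=True)
    # Restore the original order so the kept filters stay aligned with their layer.
    return sorted(ranked[:n_keep])
```

Filters with small l1-norm are assumed to contribute little and are removed first; this ignores the feature-map statistics that the abstract argues are informative.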

preprint 2020 arXiv

Out-of-Distribution Detection for Skin Lesion Images with Deep Isolation Forest

In this paper, we study the problem of out-of-distribution detection in skin disease images. Publicly available medical datasets normally have a limited number of lesion classes (e.g. HAM10000 has 8 lesion classes). However, there exist a few thousand clinically identified diseases. Hence, it is important that lesions absent from the training data can be differentiated. Toward this goal, we propose DeepIF, a non-parametric Isolation Forest based approach combined with deep convolutional networks. We conduct comprehensive experiments to compare our DeepIF with three baseline models. Results demonstrate state-of-the-art performance of our proposed approach on the task of detecting abnormal skin lesions.

preprint 2020 arXiv

Quantum oscillation and unusual protection mechanism of the surface state in nonsymmorphic semimetals

In a topological semimetal with Dirac or Weyl points, the bulk-edge correspondence principle predicts a gapless edge mode if the essential symmetry is still preserved at the surface. The detection of such a topological surface state has been considered the fingerprint proof for crystals with a nontrivial topological bulk band. On the contrary, it has been proposed that even with symmetry broken at the surface, a new surface band can emerge in nonsymmorphic topological semimetals. The symmetry reduction at the surface lifts the bulk band degeneracies, producing an unusual floating surface band with trivial topology. Here, we report quantum transport probing of ZrSiSe thin flakes and reveal transport signatures of this new surface state. Remarkably, though topologically trivial, such a surface band exhibits substantial two-dimensional Shubnikov-de Haas quantum oscillations with high mobility, which signifies a new protection mechanism and may open applications for surface-related devices.

preprint 2020 arXiv

Reinforced Epidemic Control: Saving Both Lives and Economy

Saving lives or the economy is a dilemma for epidemic control in most cities, while smart-tracing technology raises people's privacy concerns. In this paper, we propose a solution for the life-or-economy dilemma that does not require private data. We bypass the private-data requirement by suppressing epidemic transmission through a dynamic control on inter-regional mobility that only relies on Origin-Destination (OD) data. We develop DUal-objective Reinforcement-Learning Epidemic Control Agent (DURLECA) to search mobility-control policies that can simultaneously minimize infection spread and maximally retain mobility. DURLECA employs a novel graph neural network, namely Flow-GNN, to estimate the virus-transmission risk induced by urban mobility. The estimated risk is used to support a reinforcement learning agent to generate mobility-control actions. The training of DURLECA is guided by a well-constructed reward function, which captures the natural trade-off between epidemic control and mobility retention. Besides, we design two exploration strategies to improve the agent's searching efficiency and help it escape local optima. Extensive experimental results on a real-world OD dataset show that DURLECA is able to suppress infections at an extremely low level while retaining 76% of the mobility in the city. Our implementation is available at https://github.com/anyleopeace/DURLECA/.

preprint 2020 arXiv

Representation Learning of Graphs Using Graph Convolutional Multilayer Networks Based on Motifs

The graph structure is a commonly used data storage mode, and it turns out that the low-dimensional embedded representation of nodes in the graph is extremely useful in various typical tasks, such as node classification, link prediction, etc. However, most of the existing approaches start from the binary relationship (i.e., edges) in the graph and have not leveraged the higher-order local structure (i.e., motifs) of the graph. Here, we propose mGCMN, a novel framework which utilizes node feature information and the higher-order local structure of the graph to effectively generate node embeddings for previously unseen data. Through research we have found that different types of networks have different key motifs. The advantages of our method over the baseline methods have been demonstrated in a large number of experiments on citation network and social network datasets. At the same time, a positive correlation between the increase in classification accuracy and the clustering coefficient is revealed. It is believed that using high-order structural information can truly manifest the potential of the network, which will greatly improve the learning efficiency of the graph neural network and promote the establishment of a brand-new learning mode.

preprint 2018 arXiv

Improving Viability of Electric Taxis by Taxi Service Strategy Optimization: A Big Data Study of New York City

Electrification of transportation is critical for a low-carbon society. In particular, public vehicles (e.g., taxis) provide a crucial opportunity for electrification. Despite the benefits of eco-friendliness and energy efficiency, adoption of electric taxis faces several obstacles, including constrained driving range, long recharging duration, limited charging stations and low gas price, all of which impede taxi drivers' decisions to switch to electric taxis. On the other hand, the popularity of ride-hailing mobile apps facilitates the computerization and optimization of taxi service strategies, which can provide computer-assisted decisions of navigation and roaming for taxi drivers to locate potential customers. This paper examines the viability of electric taxis with the assistance of taxi service strategy optimization, in comparison with conventional taxis with internal combustion engines. A big data study is provided using a large dataset of real-world taxi trips in New York City. Our methodology is to first model the computerized taxi service strategy by Markov Decision Process (MDP), and then obtain the optimized taxi service strategy based on NYC taxi trip dataset. The profitability of electric taxi drivers is studied empirically under various battery capacity and charging conditions. Consequently, we shed light on the solutions that can improve viability of electric taxis.

preprint 2016 arXiv

An Optimization Framework For Online Ride-sharing Markets

Taxi services and product delivery services are instrumental for our modern society. Thanks to the emergence of the sharing economy, ride-sharing services such as Uber, Didi, Lyft and Google's Waze Rider are becoming more ubiquitous and growing into an integral part of our everyday lives. However, the efficiency of these services is severely limited by the sub-optimal and imbalanced matching between supply and demand. We need a generalized framework and corresponding efficient algorithms to address the efficient matching, and hence optimize the performance of these markets. Existing studies for taxi and delivery services are only applicable in scenarios of the one-sided market. In contrast, this work investigates a highly generalized model for the taxi and delivery services in the market economy (abbreviated as "taxi and delivery market") that can be widely used in two-sided markets. Further, we present efficient online and offline algorithms for different applications. We verify our algorithm with theoretical analysis and trace-driven simulations under realistic settings.

preprint 2016 arXiv

Nano-scale Inhomogeneous Superconductivity in Fe(Te1-xSex) Probed by Nanostructure-transport

Among iron-based superconductors, the layered iron chalcogenide Fe(Te1-xSex) is structurally the simplest and has attracted considerable attention. It has been speculated from bulk studies that nanoscale inhomogeneous superconductivity may inherently exist in this system. However, this has not been directly observed from nanoscale transport measurements. In this work, through simple micromechanical exfoliation and high-precision low-energy ion milling thinning, we prepared Fe(Te0.5Se0.5) nano-flakes with various thicknesses and systematically studied the correlation between the thickness and the superconducting phase transition. Our results revealed a systematic evolution of the superconducting transition with thickness. When the thickness of the Fe(Te0.5Se0.5) flake is reduced down to 12 nm, i.e. the characteristic length of Te/Se fluctuation, the superconducting current path and the metallicity of the normal state in Fe(Te0.5Se0.5) atomic sheets are suppressed. This observation provides the first direct transport evidence for the nano-scale inhomogeneous nature of superconductivity in Fe(Te1-xSex).

preprint 2016 arXiv

PFO: A Parallel Friendly High Performance System for Online Query and Update of Nearest Neighbors

Nearest Neighbor(s) search is a fundamental computational primitive for tackling massive datasets. Locality Sensitive Hashing (LSH) has been a bracing tool for Nearest Neighbor(s) search in high dimensional spaces. However, traditional LSH systems cannot be applied in online big data systems to handle a large volume of query/update requests, because most of the systems optimize the query efficiency under the assumption of infrequent updates and lack a parallel-friendly design. As a result, the state-of-the-art LSH systems cannot adapt the system response to the user behavior interactively. In this paper, we propose a new LSH system called PFO. It handles query/update requests in RAM and scales the system capacity by using flash memory. To achieve high streaming data throughput, PFO adopts a parallel-friendly indexing structure while preserving the distance between data points. Further, it accommodates inbound data in real-time and dispatches update requests intelligently to eliminate cross-thread synchronization. We carried out extensive evaluations with large synthetic and standard benchmark datasets. Results demonstrate that PFO delivers shorter latency and offers scalable capacity compared with the existing LSH systems. PFO serves with higher throughput than the state-of-the-art LSH indexing structure when dealing with online query/update requests to nearest neighbors. Meanwhile, PFO returns neighbors with much better quality, thus being efficient at handling online big data applications, e.g. streaming recommendation systems and interactive machine learning systems.
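For orientation, a minimal sketch of one classic LSH family, random-hyperplane (sign) hashing for cosine similarity, is given below. The abstract does not state which hash family PFO uses, so this is a generic illustration rather than PFO's indexing structure; all names and the seed are hypothetical.

```python
import random

def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian hyperplanes in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

def lsh_signature(vec, hyperplanes):
    """One bit per hyperplane: 1 if vec lies on its non-negative side, else 0."""
    return tuple(
        1 if sum(h * x for h, x in zip(plane, vec)) >= 0 else 0
        for plane in hyperplanes
    )
```

Vectors separated by a small angle agree on most bits, so grouping points by signature yields candidate neighbors without scanning the whole dataset.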

preprint 2016 arXiv

Smart Charging for Electric Vehicles: A Survey From the Algorithmic Perspective

Smart interactions among the smart grid, aggregators and EVs can bring various benefits to all parties involved, e.g., improved reliability and safety for the smart grid, increased profits for the aggregators, and enhanced self-benefit for EV customers. This survey focuses on viewing these smart interactions from an algorithmic perspective. In particular, important dominating factors for coordinated charging are studied from three different perspectives: smart grid oriented, aggregator oriented and customer oriented smart charging. Firstly, for smart grid oriented EV charging, we summarize various formulations proposed for load flattening, frequency regulation and voltage regulation, then explore their nature and the substantial similarity among them. Secondly, for aggregator oriented EV charging, we categorize the algorithmic approaches proposed by research works sharing this perspective as direct and indirect coordinated control, and investigate these approaches in detail. Thirdly, for customer oriented EV charging, based on a commonly shared objective of reducing charging cost, we generalize the different formulations proposed by the research works studied. Moreover, various uncertainty issues, e.g., EV fleet uncertainty, electricity price uncertainty and regulation demand uncertainty, are discussed according to the three perspectives. Finally, we discuss challenging issues that are commonly confronted when modeling the smart interactions, and outline some future research topics in this exciting area.

preprint 2016 arXiv

Topological nodal-line fermions in ZrSiSe and ZrSiTe

The discovery of topological semimetal phases in three-dimensional (3D) systems is a new breakthrough in topological material research. The Dirac nodal-line semimetal is one of the three topological semimetal phases discovered so far; it is characterized by linear band crossing along a line/loop, in contrast with the linear band crossing at discrete momentum points in 3D Dirac and Weyl semimetals. The study of nodal-line semimetals is still at an initial stage; only three material systems have been verified to host nodal-line fermions until now, including PbTaSe2, PtSn4 and ZrSiS. In this letter, we report evidence of nodal-line fermions in ZrSiSe and ZrSiTe probed in de Haas-van Alphen (dHvA) quantum oscillations. Although ZrSiSe and ZrSiTe share a similar layered structure with ZrSiS, our measurements of the angular dependence of dHvA oscillations indicate that the Fermi surface (FS) enclosing the Dirac nodal line is of 2D character in ZrSiTe, in contrast with the 3D-like FS in ZrSiSe and ZrSiS. Another important property revealed in our experiment is that the nodal-line fermion density in ZrSi(S/Se) (~ 10^20-10^21 cm^-3) is much higher than the Dirac/Weyl fermion density of any known topological material. In addition, we have demonstrated that ZrSiSe and ZrSiTe single crystals can be thinned down to 2D atomically thin layers through microexfoliation, which offers a promising platform to verify the predicted 2D topological insulator in monolayer materials with ZrSiS-type structure.

preprint 2016 arXiv

Using Non-invertible Data Transformations to Build Adversarial-Robust Neural Networks

Deep neural networks have proven to be quite effective in a wide variety of machine learning tasks, ranging from improved speech recognition systems to advancing the development of autonomous vehicles. However, despite their superior performance in many applications, these models have recently been shown to be susceptible to a particular type of attack, made possible by generating synthetic examples referred to as adversarial samples. These samples are constructed by manipulating real examples from the training data distribution in order to "fool" the original neural model, resulting in misclassification (with high confidence) of previously correctly classified samples. Addressing this weakness is of utmost importance if deep neural architectures are to be applied to critical applications, such as those in the domain of cybersecurity. In this paper, we present an analysis of this fundamental flaw lurking in all neural architectures to uncover limitations of previously proposed defense mechanisms. More importantly, we present a unifying framework for protecting deep neural models using a non-invertible data transformation, developing two adversary-resilient architectures utilizing both linear and nonlinear dimensionality reduction. Empirical results indicate that our framework provides better robustness compared to state-of-the-art solutions while having negligible degradation in accuracy.

preprint 2015 arXiv

Characterizing Information Spreading in Online Social Networks

Online social networks (OSNs) are changing the way in which information spreads throughout the Internet. A deep understanding of information spreading in OSNs leads to both social and commercial benefits. In this paper, we characterize the dynamics of information spreading (e.g., how fast and widely the information spreads over time) in OSNs by developing a general and accurate model based on Interactive Markov Chains (IMCs) and mean-field theory. This model explicitly reveals the impact of the network topology on information spreading in OSNs. Further, we extend our model to feature time-varying user behaviors and ever-changing information popularity. The complicated dynamic patterns of information spreading are captured by our model using six key parameters. Extensive tests based on Renren's dataset validate the accuracy of our model, demonstrating that it can characterize the dynamic patterns of video sharing in Renren precisely and predict future spreading tendency successfully.

preprint 2015 arXiv

Does "Like" Really Mean Like? A Study of the Facebook Fake Like Phenomenon and an Efficient Countermeasure

Social networks help to bond people who share similar interests all over the world. As a complement, the Facebook "Like" button is an efficient tool that bonds people with online information. People click on the "Like" button to express their fondness for a particular piece of information and in turn tend to visit webpages with a high "Like" count. What makes the Like count important is that it reflects the number of actual users who "liked" this information. However, according to our study, one can easily exploit the defects of the "Like" button to counterfeit a high "Like" count. We provide a proof-of-concept implementation of these exploits, and manage to generate 100 fake Likes in 5 minutes with a single account. We also reveal existing counterfeiting techniques used by some online sellers to achieve an unfair advantage in promoting their products. To address this fake Like problem, we study the varying patterns of Like count and propose an innovative fake Like detection method based on clustering. To evaluate the effectiveness of our algorithm, we collect the Like count history of more than 9,000 websites. Our experiments successfully uncover 16 suspicious fake Like buyers that show abnormal patterns of Like count increase.

preprint 2014 arXiv

Gate Tunable Quantum Oscillations in Air-Stable and High Mobility Few-Layer Phosphorene Heterostructures

As the only non-carbon elemental layered allotrope, few-layer black phosphorus or phosphorene has emerged as a novel two-dimensional (2D) semiconductor with both high bulk mobility and a band gap. Here we report fabrication and transport measurements of phosphorene-hexagonal BN (hBN) heterostructures with one-dimensional (1D) edge contacts. These transistors are stable in ambient conditions for >300 hours, and display ambipolar behavior, a gate-dependent metal-insulator transition, and mobility up to 4000 $cm^2$/Vs. At low temperatures, we observe gate-tunable Shubnikov-de Haas (SdH) magneto-oscillations and Zeeman splitting in a magnetic field with an estimated g-factor ~2. The cyclotron mass of few-layer phosphorene holes is determined to increase from 0.25 to 0.31 $m_e$ as the Fermi level moves towards the valence band edge. Our results underscore the potential of few-layer phosphorene (FLP) as both a platform for novel 2D physics and an electronic material for semiconductor applications.

preprint 2014 arXiv

High Performance Field-Effect Transistor Based on Multilayer Tungsten Disulfide

Semiconducting two-dimensional transition metal chalcogenide crystals have been regarded as promising candidates for future generations of transistors in modern electronics. However, how to fabricate those crystals into practical devices with acceptable performance remains a challenge. Employing tungsten disulfide multilayer thin crystals, we demonstrate that by using gold as the only contact metal and choosing an appropriate crystal thickness, a high-performance transistor with an on/off ratio of $10^{8}$ and mobility up to $234\:cm^{2}V^{-1}s^{-1}$ at room temperature can be realized in a simple device structure. Further low-temperature study revealed that the high performance of our device is caused by the minimized Schottky barrier at the contact and the existence of a shallow impurity level around 80 meV right below the conduction band edge. From the analysis of the temperature dependence of field-effect mobility, we conclude that strongly suppressed phonon scattering and relatively low charge impurity density are the key factors leading to the high mobility of our tungsten disulfide devices.

preprint 2011 arXiv

Location Cheating: A Security Challenge to Location-based Social Network Services

Location-based mobile social network services such as foursquare and Gowalla have grown exponentially over the past several years. These location-based services utilize the geographical position to enrich user experiences in a variety of contexts, including location-based searching and location-based mobile advertising. To attract more users, location-based mobile social network services provide real-world rewards when a user checks in at a certain venue or location. This gives users an incentive to cheat on their locations. In this report, we investigate the threat of location cheating attacks, find the root cause of the vulnerability, and outline possible defense mechanisms. We use foursquare as an example to introduce a novel location cheating attack, which can easily pass the current location verification mechanism (e.g., cheater code of foursquare). We also crawl the foursquare website. By analyzing the crawled data, we show that automated large-scale cheating is possible. Through this work, we aim to call attention to location cheating in mobile social network services and provide insights into defense mechanisms.

preprint 2011 arXiv

SafeVchat: Detecting Obscene Content and Misbehaving Users in Online Video Chat Services

Online video chat services such as Chatroulette, Omegle, and vChatter that randomly match pairs of users in video chat sessions are fast becoming very popular, with over a million users per month in the case of Chatroulette. A key problem encountered in such systems is the presence of flashers and obscene content. This problem is especially acute given the presence of minors in such systems. This paper presents SafeVchat, a novel solution to the problem of flasher detection that employs an array of image detection algorithms. A key contribution of the paper concerns how the results of the individual detectors are fused together, based on Dempster-Shafer Theory, into an overall decision classifying the user as misbehaving or not. The paper introduces a novel, motion-based skin detection method that achieves significantly higher recall and better precision. The proposed methods have been evaluated over real-world data and image traces obtained from Chatroulette.com.

preprint 2010 arXiv

Complete Two-dimensional Muellermetric Imaging of Biological Tissue Using Heterodyned Optical Coherence Tomography

A polarization-sensitive optical coherence tomography system based on heterodyning and filtering techniques is built to perform Stokesmetric imaging of layers at different depths in a porcine tendon sample. The complete $4\times4$ backscattering Muellermetric images of one layer are acquired using such a system. The images reveal information indiscernible from a conventional OCT system.

preprint 2010 arXiv

Intrusions into Privacy in Video Chat Environments: Attacks and Countermeasures

Video chat systems such as Chatroulette have become increasingly popular as a way to meet and converse one-on-one via video and audio with other users online in an open and interactive manner. At the same time, security and privacy concerns inherent in such communication have been little explored. This paper presents one of the first investigations of the privacy threats found in such video chat systems, identifying three such threats, namely de-anonymization attacks, phishing attacks, and man-in-the-middle attacks. The paper further describes countermeasures against each of these attacks.

preprint 2010 arXiv

Pulse delay via tunable white light cavities using fiber optic resonators

Previously, we proposed a data buffering system that makes use of a pair of white light cavities. For application to telecommunication systems, it would be convenient to realize such a device using fiber optic resonators. In this paper, we present the design of such a system, where the white light cavity effect is produced by using stimulated Brillouin scattering. The system consists of a pair of fiber optic white light cavities placed in series. As in the original proposal, the delay time can be controlled independently of the bandwidth of the data pulses. Furthermore, we show how the bandwidth of the system can be made as large as several times the Brillouin frequency shift. We also show that the net delay achievable in such a buffer can be significantly larger than what can be achieved using a conventional recirculating loop buffer.

36 published item(s)

MINER: Mining Multimodal Internal Representation for Efficient Retrieval

Anonymous Pattern Molecular Fingerprint and its Applications on Property Identification

Dual-space Hierarchical Learning for Goal-guided Conversational Recommendation

A Graph Data Augmentation Strategy with Entropy Preservation

Encoder-Decoder Architecture for Supervised Dynamic Graph Learning: A Survey

Layer-dependent interlayer antiferromagnetic spin reorientation in air-stable semiconductor CrSBr

Multi-FR: A Multi-objective Optimization Framework for Multi-stakeholder Fairness-aware Recommendation

Multi-Pair D2D Communications Aided by An Active RIS over Spatially Correlated Channels with Phase Noise

Nonreciprocal transport in a bilayer of MnBi2Te4 and Pt

SPENDER: A Platform for Secure and Privacy-Preserving Decentralized P2P E-Commerce

Unbiased Implicit Feedback via Bi-level Optimization

Variational Nested Dropout

Graph Classification Based on Skeleton and Component Features

Knowledge-Enhanced Top-K Recommendation in Poincaré Ball

Probabilistic Metric Learning with Adaptive Margin for Top-K Recommendation

Feature Statistics Guided Efficient Filter Pruning

Out-of-Distribution Detection for Skin Lesion Images with Deep Isolation Forest

Quantum oscillation and unusual protection mechanism of the surface state in nonsymmorphic semimetals

Reinforced Epidemic Control: Saving Both Lives and Economy

Representation Learning of Graphs Using Graph Convolutional Multilayer Networks Based on Motifs

Improving Viability of Electric Taxis by Taxi Service Strategy Optimization: A Big Data Study of New York City

An Optimization Framework For Online Ride-sharing Markets

Nano-scale Inhomogeneous Superconductivity in Fe(Te1-xSex) Probed by Nanostructure-transport

PFO: A Parallel Friendly High Performance System for Online Query and Update of Nearest Neighbors

Smart Charging for Electric Vehicles: A Survey From the Algorithmic Perspective

Topological nodal-line fermions in ZrSiSe and ZrSiTe

Using Non-invertible Data Transformations to Build Adversarial-Robust Neural Networks

Characterizing Information Spreading in Online Social Networks

Does "Like" Really Mean Like? A Study of the Facebook Fake Like Phenomenon and an Efficient Countermeasure

Gate Tunable Quantum Oscillations in Air-Stable and High Mobility Few-Layer Phosphorene Heterostructures

High Performance Field-Effect Transistor Based on Multilayer Tungsten Disulfide

Location Cheating: A Security Challenge to Location-based Social Network Services

SafeVchat: Detecting Obscene Content and Misbehaving Users in Online Video Chat Services

Complete Two-dimensional Muellermetric Imaging of Biological Tissue Using Heterodyned Optical Coherence Tomography

Intrusions into Privacy in Video Chat Environments: Attacks and Countermeasures

Pulse delay via tunable white light cavities using fiber optic resonators