Researcher profile

Yuan Qi

Yuan Qi contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
20works
0followers
10topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

20 published item(s)

preprint2026arXiv

Nested Spatio-Temporal Time Series Forecasting

Spatiotemporal forecasting is critical for real-world applications like traffic management, yet capturing reliable interactions remains challenging under noisy and non-stationary conditions. Existing methods primarily rely on historical spatial priors, often failing to account for evolving temporal correlations and suffering from systematic errors. In this work, we propose a nested forecasting framework that couples future macro-level regional trends with micro-level historical observations, enabling top-down guidance from abstract future representations for fine-grained forecasting. Specifically, we employ a spectral clustering-based approach to construct semantically coherent regions, providing both theoretical and empirical evidence that this representation effectively filters systematic noise while preserving essential trends. Building on this, we develop a progressive coarse-to-fine predictor to integrate these representative features into the inference process. This enables the model to leverage trend predictions to anticipate dynamic anomalies, such as periodic offsets, in advance. Furthermore, extensive experiments on multiple high-dimensional datasets demonstrate that our method consistently outperforms state-of-the-art baselines, validating the effectiveness of future macro-guided nested forecasting.

preprint2022arXiv

Variational Policy Propagation for Multi-agent Reinforcement Learning

We propose a \emph{collaborative} multi-agent reinforcement learning algorithm named variational policy propagation (VPP) to learn a \emph{joint} policy through the interactions over agents. We prove that the joint policy is a Markov Random Field under some mild conditions, which in turn reduces the policy space effectively. We integrate the variational inference as special differentiable layers in policy such that the actions can be efficiently sampled from the Markov Random Field and the overall policy is differentiable. We evaluate our algorithm on several large scale challenging tasks and demonstrate that it outperforms previous state-of-the-arts.

preprint2020arXiv

A Riemannian Primal-dual Algorithm Based on Proximal Operator and its Application in Metric Learning

In this paper, we consider optimizing a smooth, convex, lower semicontinuous function in Riemannian space with constraints. To solve the problem, we first convert it to a dual problem and then propose a general primal-dual algorithm to optimize the primal and dual variables iteratively. In each optimization iteration, we employ a proximal operator to search optimal solution in the primal space. We prove convergence of the proposed algorithm and show its non-asymptotic convergence rate. By utilizing the proposed primal-dual optimization technique, we propose a novel metric learning algorithm which learns an optimal feature transformation matrix in the Riemannian space of positive definite matrices. Preliminary experimental results on an optimal fund selection problem in fund of funds (FOF) management for quantitative investment showed its efficacy.

preprint2020arXiv

A Semi-supervised Graph Attentive Network for Financial Fraud Detection

With the rapid growth of financial services, fraud detection has been a very important problem to guarantee a healthy environment for both users and providers. Conventional solutions for fraud detection mainly use some rule-based methods or distract some features manually to perform prediction. However, in financial services, users have rich interactions and they themselves always show multifaceted information. These data form a large multiview network, which is not fully exploited by conventional methods. Additionally, among the network, only very few of the users are labelled, which also poses a great challenge for only utilizing labeled data to achieve a satisfied performance on fraud detection. To address the problem, we expand the labeled data through their social relations to get the unlabeled data and propose a semi-supervised attentive graph neural network, namedSemiGNN to utilize the multi-view labeled and unlabeled data for fraud detection. Moreover, we propose a hierarchical attention mechanism to better correlate different neighbors and different views. Simultaneously, the attention mechanism can make the model interpretable and tell what are the important factors for the fraud and why the users are predicted as fraud. Experimentally, we conduct the prediction task on the users of Alipay, one of the largest third-party online and offline cashless payment platform serving more than 4 hundreds of million users in China. By utilizing the social relations and the user attributes, our method can achieve a better accuracy compared with the state-of-the-art methods on two tasks. Moreover, the interpretable results also give interesting intuitions regarding the tasks.

preprint2020arXiv

AGL: a Scalable System for Industrial-purpose Graph Machine Learning

Machine learning over graphs have been emerging as powerful learning tools for graph data. However, it is challenging for industrial communities to leverage the techniques, such as graph neural networks (GNNs), and solve real-world problems at scale because of inherent data dependency in the graphs. As such, we cannot simply train a GNN with classic learning systems, for instance parameter server that assumes data parallel. Existing systems store the graph data in-memory for fast accesses either in a single machine or graph stores from remote. The major drawbacks are in three-fold. First, they cannot scale because of the limitations on the volume of the memory, or the bandwidth between graph stores and workers. Second, they require extra development of graph stores without well exploiting mature infrastructures such as MapReduce that guarantee good system properties. Third, they focus on training but ignore the optimization of inference over graphs, thus makes them an unintegrated system. In this paper, we design AGL, a scalable, fault-tolerance and integrated system, with fully-functional training and inference for GNNs. Our system design follows the message passing scheme underlying the computations of GNNs. We design to generate the $k$-hop neighborhood, an information-complete subgraph for each node, as well as do the inference simply by merging values from in-edge neighbors and propagating values to out-edge neighbors via MapReduce. In addition, the $k$-hop neighborhood contains information-complete subgraphs for each node, thus we simply do the training on parameter servers due to data independency. Our system AGL, implemented on mature infrastructures, can finish the training of a 2-layer graph attention network on a graph with billions of nodes and hundred billions of edges in 14 hours, and complete the inference in 1.2 hour.

preprint2020arXiv

Bandit Samplers for Training Graph Neural Networks

Several sampling algorithms with variance reduction have been proposed for accelerating the training of Graph Convolution Networks (GCNs). However, due to the intractable computation of optimal sampling distribution, these sampling algorithms are suboptimal for GCNs and are not applicable to more general graph neural networks (GNNs) where the message aggregator contains learned weights rather than fixed weights, such as Graph Attention Networks (GAT). The fundamental reason is that the embeddings of the neighbors or learned weights involved in the optimal sampling distribution are changing during the training and not known a priori, but only partially observed when sampled, thus making the derivation of an optimal variance reduced samplers non-trivial. In this paper, we formulate the optimization of the sampling variance as an adversary bandit problem, where the rewards are related to the node embeddings and learned weights, and can vary constantly. Thus a good sampler needs to acquire variance information about more neighbors (exploration) while at the same time optimizing the immediate sampling variance (exploit). We theoretically show that our algorithm asymptotically approaches the optimal variance within a factor of 3. We show the efficiency and effectiveness of our approach on multiple datasets.

preprint2020arXiv

Efficient Probabilistic Logic Reasoning with Graph Neural Networks

Markov Logic Networks (MLNs), which elegantly combine logic rules and probabilistic graphical models, can be used to address many knowledge graph problems. However, inference in MLN is computationally intensive, making the industrial-scale application of MLN very difficult. In recent years, graph neural networks (GNNs) have emerged as efficient and effective tools for large-scale graph problems. Nevertheless, GNNs do not explicitly incorporate prior logic rules into the models, and may require many labeled examples for a target task. In this paper, we explore the combination of MLNs and GNNs, and use graph neural networks for variational inference in MLN. We propose a GNN variant, named ExpressGNN, which strikes a nice balance between the representation power and the simplicity of the model. Our extensive experiments on several benchmark datasets demonstrate that ExpressGNN leads to effective and efficient probabilistic logic reasoning.

preprint2020arXiv

Generating Natural Language Adversarial Examples on a Large Scale with Generative Models

Today text classification models have been widely used. However, these classifiers are found to be easily fooled by adversarial examples. Fortunately, standard attacking methods generate adversarial texts in a pair-wise way, that is, an adversarial text can only be created from a real-world text by replacing a few words. In many applications, these texts are limited in numbers, therefore their corresponding adversarial examples are often not diverse enough and sometimes hard to read, thus can be easily detected by humans and cannot create chaos at a large scale. In this paper, we propose an end to end solution to efficiently generate adversarial texts from scratch using generative models, which are not restricted to perturbing the given texts. We call it unrestricted adversarial text generation. Specifically, we train a conditional variational autoencoder (VAE) with an additional adversarial loss to guide the generation of adversarial examples. Moreover, to improve the validity of adversarial texts, we utilize discrimators and the training framework of generative adversarial networks (GANs) to make adversarial texts consistent with real data. Experimental results on sentiment analysis demonstrate the scalability and efficiency of our method. It can attack text classification models with a higher success rate than existing methods, and provide acceptable quality for humans in the meantime.

preprint2020arXiv

Generative Adversarial User Model for Reinforcement Learning Based Recommendation System

There are great interests as well as many challenges in applying reinforcement learning (RL) to recommendation systems. In this setting, an online user is the environment; neither the reward function nor the environment dynamics are clearly defined, making the application of RL challenging. In this paper, we propose a novel model-based reinforcement learning framework for recommendation systems, where we develop a generative adversarial network to imitate user behavior dynamics and learn her reward function. Using this user model as the simulation environment, we develop a novel Cascading DQN algorithm to obtain a combinatorial recommendation policy which can handle a large number of candidate items efficiently. In our experiments with real data, we show this generative adversarial user model can better explain user behavior than alternatives, and the RL policy based on this model can lead to a better long-term reward for the user and higher click rate for the system.

preprint2020arXiv

Graph Representation Learning for Merchant Incentive Optimization in Mobile Payment Marketing

Mobile payment such as Alipay has been widely used in our daily lives. To further promote the mobile payment activities, it is important to run marketing campaigns under a limited budget by providing incentives such as coupons, commissions to merchants. As a result, incentive optimization is the key to maximizing the commercial objective of the marketing campaign. With the analyses of online experiments, we found that the transaction network can subtly describe the similarity of merchants' responses to different incentives, which is of great use in the incentive optimization problem. In this paper, we present a graph representation learning method atop of transaction networks for merchant incentive optimization in mobile payment marketing. With limited samples collected from online experiments, our end-to-end method first learns merchant representations based on an attributed transaction networks, then effectively models the correlations between the commercial objectives each merchant may achieve and the incentives under varying treatments. Thus we are able to model the sensitivity to incentive for each merchant, and spend the most budgets on those merchants that show strong sensitivities in the marketing campaign. Extensive offline and online experimental results at Alipay demonstrate the effectiveness of our proposed approach.

preprint2020arXiv

How Much Can A Retailer Sell? Sales Forecasting on Tmall

Time-series forecasting is an important task in both academic and industry, which can be applied to solve many real forecasting problems like stock, water-supply, and sales predictions. In this paper, we study the case of retailers' sales forecasting on Tmall|the world's leading online B2C platform. By analyzing the data, we have two main observations, i.e., sales seasonality after we group different groups of retails and a Tweedie distribution after we transform the sales (target to forecast). Based on our observations, we design two mechanisms for sales forecasting, i.e., seasonality extraction and distribution transformation. First, we adopt Fourier decomposition to automatically extract the seasonalities for different categories of retailers, which can further be used as additional features for any established regression algorithms. Second, we propose to optimize the Tweedie loss of sales after logarithmic transformations. We apply these two mechanisms to classic regression models, i.e., neural network and Gradient Boosting Decision Tree, and the experimental results on Tmall dataset show that both mechanisms can significantly improve the forecasting results.

preprint2020arXiv

InfDetect: a Large Scale Graph-based Fraud Detection System for E-Commerce Insurance

The insurance industry has been creating innovative products around the emerging online shopping activities. Such e-commerce insurance is designed to protect buyers from potential risks such as impulse purchases and counterfeits. Fraudulent claims towards online insurance typically involve multiple parties such as buyers, sellers, and express companies, and they could lead to heavy financial losses. In order to uncover the relations behind organized fraudsters and detect fraudulent claims, we developed a large-scale insurance fraud detection system, i.e., InfDetect, which provides interfaces for commonly used graphs, standard data processing procedures, and a uniform graph learning platform. InfDetect is able to process big graphs containing up to 100 millions of nodes and billions of edges. In this paper, we investigate different graphs to facilitate fraudster mining, such as a device-sharing graph, a transaction graph, a friendship graph, and a buyer-seller graph. These graphs are fed to a uniform graph learning platform containing supervised and unsupervised graph learning algorithms. Cases on widely applied e-commerce insurance are described to demonstrate the usage and capability of our system. InfDetect has successfully detected thousands of fraudulent claims and saved over tens of thousands of dollars daily.

preprint2020arXiv

Local Contextual Attention with Hierarchical Structure for Dialogue Act Recognition

Dialogue act recognition is a fundamental task for an intelligent dialogue system. Previous work models the whole dialog to predict dialog acts, which may bring the noise from unrelated sentences. In this work, we design a hierarchical model based on self-attention to capture intra-sentence and inter-sentence information. We revise the attention distribution to focus on the local and contextual semantic information by incorporating the relative position information between utterances. Based on the found that the length of dialog affects the performance, we introduce a new dialog segmentation mechanism to analyze the effect of dialog length and context padding length under online and offline settings. The experiment shows that our method achieves promising performance on two datasets: Switchboard Dialogue Act and DailyDialog with the accuracy of 80.34\% and 85.81\% respectively. Visualization of the attention weights shows that our method can learn the context dependency between utterances explicitly.

preprint2020arXiv

Long Short-Term Sample Distillation

In the past decade, there has been substantial progress at training increasingly deep neural networks. Recent advances within the teacher--student training paradigm have established that information about past training updates show promise as a source of guidance during subsequent training steps. Based on this notion, in this paper, we propose Long Short-Term Sample Distillation, a novel training policy that simultaneously leverages multiple phases of the previous training process to guide the later training updates to a neural network, while efficiently proceeding in just one single generation pass. With Long Short-Term Sample Distillation, the supervision signal for each sample is decomposed into two parts: a long-term signal and a short-term one. The long-term teacher draws on snapshots from several epochs ago in order to provide steadfast guidance and to guarantee teacher--student differences, while the short-term one yields more up-to-date cues with the goal of enabling higher-quality updates. Moreover, the teachers for each sample are unique, such that, overall, the model learns from a very diverse set of teachers. Comprehensive experimental results across a range of vision and NLP tasks demonstrate the effectiveness of this new training method.

preprint2020arXiv

NetDP: An Industrial-Scale Distributed Network Representation Framework for Default Prediction in Ant Credit Pay

Ant Credit Pay is a consumer credit service in Ant Financial Service Group. Similar to credit card, loan default is one of the major risks of this credit product. Hence, effective algorithm for default prediction is the key to losses reduction and profits increment for the company. However, the challenges facing in our scenario are different from those in conventional credit card service. The first one is scalability. The huge volume of users and their behaviors in Ant Financial requires the ability to process industrial-scale data and perform model training efficiently. The second challenges is the cold-start problem. Different from the manual review for credit card application in conventional banks, the credit limit of Ant Credit Pay is automatically offered to users based on the knowledge learned from big data. However, default prediction for new users is suffered from lack of enough credit behaviors. It requires that the proposal should leverage other new data source to alleviate the cold-start problem. Considering the above challenges and the special scenario in Ant Financial, we try to incorporate default prediction with network information to alleviate the cold-start problem. In this paper, we propose an industrial-scale distributed network representation framework, termed NetDP, for default prediction in Ant Credit Pay. The proposal explores network information generated by various interaction between users, and blends unsupervised and supervised network representation in a unified framework for default prediction problem. Moreover, we present a parameter-server-based distributed implement of our proposal to handle the scalability challenge. Experimental results demonstrate the effectiveness of our proposal, especially in cold-start problem, as well as the efficiency for industrial-scale dataset.

preprint2020arXiv

Practical Privacy Preserving POI Recommendation

Point-of-Interest (POI) recommendation has been extensively studied and successfully applied in industry recently. However, most existing approaches build centralized models on the basis of collecting users' data. Both private data and models are held by the recommender, which causes serious privacy concerns. In this paper, we propose a novel Privacy preserving POI Recommendation (PriRec) framework. First, to protect data privacy, users' private data (features and actions) are kept on their own side, e.g., Cellphone or Pad. Meanwhile, the public data need to be accessed by all the users are kept by the recommender to reduce the storage costs of users' devices. Those public data include: (1) static data only related to the status of POI, such as POI categories, and (2) dynamic data depend on user-POI actions such as visited counts. The dynamic data could be sensitive, and we develop local differential privacy techniques to release such data to public with privacy guarantees. Second, PriRec follows the representations of Factorization Machine (FM) that consists of linear model and the feature interaction model. To protect the model privacy, the linear models are saved on users' side, and we propose a secure decentralized gradient descent protocol for users to learn it collaboratively. The feature interaction model is kept by the recommender since there is no privacy risk, and we adopt secure aggregation strategy in federated learning paradigm to learn it. To this end, PriRec keeps users' private raw data and models in users' own hands, and protects user privacy to a large extent. We apply PriRec in real-world datasets, and comprehensive experiments demonstrate that, compared with FM, PriRec achieves comparable or even better recommendation accuracy.

preprint2020arXiv

Riemannian Proximal Policy Optimization

In this paper, We propose a general Riemannian proximal optimization algorithm with guaranteed convergence to solve Markov decision process (MDP) problems. To model policy functions in MDP, we employ Gaussian mixture model (GMM) and formulate it as a nonconvex optimization problem in the Riemannian space of positive semidefinite matrices. For two given policy functions, we also provide its lower bound on policy improvement by using bounds derived from the Wasserstein distance of GMMs. Preliminary experiments show the efficacy of our proposed Riemannian proximal policy optimization algorithm.

preprint2020arXiv

RNE: A Scalable Network Embedding for Billion-scale Recommendation

Nowadays designing a real recommendation system has been a critical problem for both academic and industry. However, due to the huge number of users and items, the diversity and dynamic property of the user interest, how to design a scalable recommendation system, which is able to efficiently produce effective and diverse recommendation results on billion-scale scenarios, is still a challenging and open problem for existing methods. In this paper, given the user-item interaction graph, we propose RNE, a data-efficient Recommendation-based Network Embedding method, to give personalized and diverse items to users. Specifically, we propose a diversity- and dynamics-aware neighbor sampling method for network embedding. On the one hand, the method is able to preserve the local structure between the users and items while modeling the diversity and dynamic property of the user interest to boost the recommendation quality. On the other hand the sampling method can reduce the complexity of the whole method theoretically to make it possible for billion-scale recommendation. We also implement the designed algorithm in a distributed way to further improves its scalability. Experimentally, we deploy RNE on a recommendation scenario of Taobao, the largest E-commerce platform in China, and train it on a billion-scale user-item graph. As is shown on several online metrics on A/B testing, RNE is able to achieve both high-quality and diverse results compared with CF-based methods. We also conduct the offline experiments on Pinterest dataset comparing with several state-of-the-art recommendation methods and network embedding methods. The results demonstrate that our method is able to produce a good result while runs much faster than the baseline methods.

preprint2020arXiv

SpellGCN: Incorporating Phonological and Visual Similarities into Language Models for Chinese Spelling Check

Chinese Spelling Check (CSC) is a task to detect and correct spelling errors in Chinese natural language. Existing methods have made attempts to incorporate the similarity knowledge between Chinese characters. However, they take the similarity knowledge as either an external input resource or just heuristic rules. This paper proposes to incorporate phonological and visual similarity knowledge into language models for CSC via a specialized graph convolutional network (SpellGCN). The model builds a graph over the characters, and SpellGCN is learned to map this graph into a set of inter-dependent character classifiers. These classifiers are applied to the representations extracted by another network, such as BERT, enabling the whole network to be end-to-end trainable. Experiments (The dataset and all code for this paper are available at https://github.com/ACL2020SpellGCN/SpellGCN) are conducted on three human-annotated datasets. Our method achieves superior performance against previous models by a large margin.

preprint2020arXiv

Uncovering Insurance Fraud Conspiracy with Network Learning

Fraudulent claim detection is one of the greatest challenges the insurance industry faces. Alibaba's return-freight insurance, providing return-shipping postage compensations over product return on the e-commerce platform, receives thousands of potentially fraudulent claims every day. Such deliberate abuse of the insurance policy could lead to heavy financial losses. In order to detect and prevent fraudulent insurance claims, we developed a novel data-driven procedure to identify groups of organized fraudsters, one of the major contributions to financial losses, by learning network information. In this paper, we introduce a device-sharing network among claimants, followed by developing an automated solution for fraud detection based on graph learning algorithms, to separate fraudsters from regular customers and uncover groups of organized fraudsters. This solution applied at Alibaba achieves more than 80% precision while covering 44% more suspicious accounts compared with a previously deployed rule-based classifier after human expert investigations. Our approach can easily and effectively generalizes to other types of insurance.