Source author record

Lecheng Zheng

Lecheng Zheng appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning Computational Engineering, Finance, and Science Computer Vision

Catalog footprint

What is connected

4works

3topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2025arXiv

Networked Markets, Fragmented Data: Adaptive Graph Learning for Customer Risk Analytics and Policy Design

Financial institutions face escalating challenges in identifying high-risk customer behaviors within massive transaction networks, where fraudulent activities exploit market fragmentation and institutional boundaries. We address three fundamental problems in customer risk analytics: data silos preventing holistic relationship assessment, extreme behavioral class imbalance, and suboptimal customer intervention strategies that fail to balance compliance costs with relationship value. We develop an integrated customer intelligence framework combining federated learning, relational network analysis, and adaptive targeting policies. Our federated graph neural network enables collaborative behavior modeling across competing institutions without compromising proprietary customer data, using privacy-preserving embeddings to capture cross-market relational patterns. We introduce cross-bank Personalized PageRank to identify coordinated behavioral clusters providing interpretable customer network segmentation for risk managers. A hierarchical reinforcement learning mechanism optimizes dynamic intervention targeting, calibrating escalation policies to maximize prevention value while minimizing customer friction and operational costs. Analyzing 1.4 million customer transactions across seven markets, our approach reduces false positive and false negative rates to 4.64% and 11.07%, substantially outperforming single-institution models. The framework prevents 79.25% of potential losses versus 49.41% under fixed-rule policies, with optimal market-specific targeting thresholds reflecting heterogeneous customer base characteristics. These findings demonstrate that federated customer analytics materially improve both risk management effectiveness and customer relationship outcomes in networked competitive markets.

preprint2022arXiv

Heterogeneous Contrastive Learning

With the advent of big data across multiple high-impact applications, we are often facing the challenge of complex heterogeneity. The newly collected data usually consist of multiple modalities and are characterized with multiple labels, thus exhibiting the co-existence of multiple types of heterogeneity. Although state-of-the-art techniques are good at modeling complex heterogeneity with sufficient label information, such label information can be quite expensive to obtain in real applications. Recently, researchers pay great attention to contrastive learning due to its prominent performance by utilizing rich unlabeled data. However, existing work on contrastive learning is not able to address the problem of false negative pairs, i.e., some `negative' pairs may have similar representations if they have the same label. To overcome the issues, in this paper, we propose a unified heterogeneous learning framework, which combines both the weighted unsupervised contrastive loss and the weighted supervised contrastive loss to model multiple types of heterogeneity. We first provide a theoretical analysis showing that the vanilla contrastive learning loss easily leads to the sub-optimal solution in the presence of false negative pairs, whereas the proposed weighted loss could automatically adjust the weight based on the similarity of the learned representations to mitigate this issue. Experimental results on real-world data sets demonstrate the effectiveness and the efficiency of the proposed framework modeling multiple types of heterogeneity.

preprint2022arXiv

MentorGNN: Deriving Curriculum for Pre-Training GNNs

Graph pre-training strategies have been attracting a surge of attention in the graph mining community, due to their flexibility in parameterizing graph neural networks (GNNs) without any label information. The key idea lies in encoding valuable information into the backbone GNNs, by predicting the masked graph signals extracted from the input graphs. In order to balance the importance of diverse graph signals (e.g., nodes, edges, subgraphs), the existing approaches are mostly hand-engineered by introducing hyperparameters to re-weight the importance of graph signals. However, human interventions with sub-optimal hyperparameters often inject additional bias and deteriorate the generalization performance in the downstream applications. This paper addresses these limitations from a new perspective, i.e., deriving curriculum for pre-training GNNs. We propose an end-to-end model named MentorGNN that aims to supervise the pre-training process of GNNs across graphs with diverse structures and disparate feature spaces. To comprehend heterogeneous graph signals at different granularities, we propose a curriculum learning paradigm that automatically re-weighs graph signals in order to ensure a good generalization in the target domain. Moreover, we shed new light on the problem of domain adaption on relational data (i.e., graphs) by deriving a natural and interpretable upper bound on the generalization error of the pre-trained GNNs. Extensive experiments on a wealth of real graphs validate and verify the performance of MentorGNN.

preprint2021arXiv

Deep Co-Attention Network for Multi-View Subspace Learning

Many real-world applications involve data from multiple modalities and thus exhibit the view heterogeneity. For example, user modeling on social media might leverage both the topology of the underlying social network and the content of the users' posts; in the medical domain, multiple views could be X-ray images taken at different poses. To date, various techniques have been proposed to achieve promising results, such as canonical correlation analysis based methods, etc. In the meanwhile, it is critical for decision-makers to be able to understand the prediction results from these methods. For example, given the diagnostic result that a model provided based on the X-ray images of a patient at different poses, the doctor needs to know why the model made such a prediction. However, state-of-the-art techniques usually suffer from the inability to utilize the complementary information of each view and to explain the predictions in an interpretable manner. To address these issues, in this paper, we propose a deep co-attention network for multi-view subspace learning, which aims to extract both the common information and the complementary information in an adversarial setting and provide robust interpretations behind the prediction to the end-users via the co-attention mechanism. In particular, it uses a novel cross reconstruction loss and leverages the label information to guide the construction of the latent representation by incorporating the classifier into our model. This improves the quality of latent representation and accelerates the convergence speed. Finally, we develop an efficient iterative algorithm to find the optimal encoders and discriminator, which are evaluated extensively on synthetic and real-world data sets. We also conduct a case study to demonstrate how the proposed method robustly interprets the predictions on an image data set.