Researcher profile

Zhiyong Cheng

Zhiyong Cheng contributes to research discovery and scholarly infrastructure.

Researcher, affiliation not imported, open to collaborate

Trust snapshot

Quick read

Trust 21 (Emerging), Verification L1, unclaimed author
10 works
0 followers
5 topics
4 close collaborators

Published work

10 published item(s)

preprint, 2026, arXiv

Graph-Structured Driven Dual Adaptation for Mitigating Popularity Bias

Popularity bias is a common challenge in recommender systems. It often causes unbalanced item recommendation performance and intensifies the Matthew effect. Due to limited user-item interactions, unpopular items are frequently constrained to the embedding neighborhoods of only a few users, leading to representation collapse and weakening the model's generalization. Although existing supervised alignment and reweighting methods can help mitigate this problem, they still face two major limitations: (1) they overlook the inherent variability among different Graph Convolutional Network (GCN) layers, which can result in negative gains in deeper layers; (2) they rely heavily on fixed hyperparameters to balance popular and unpopular items, limiting adaptability to diverse data distributions and increasing model complexity. To address these challenges, we propose the Graph-Structured Dual Adaptation Framework (GSDA), a dual adaptive framework for mitigating popularity bias in recommendation. Our theoretical analysis shows that supervised alignment in GCNs is hindered by the over-smoothing effect, where the distinction between popular and unpopular items diminishes as layers deepen, reducing the effectiveness of alignment at deeper levels. To overcome this limitation, GSDA integrates a hierarchical adaptive alignment mechanism that counteracts entropy decay across layers together with a distribution-aware contrastive weighting strategy based on the Gini coefficient, enabling the model to adapt its debiasing strength dynamically without relying on fixed hyperparameters. Extensive experiments on three benchmark datasets demonstrate that GSDA effectively alleviates popularity bias while consistently outperforming state-of-the-art methods in recommendation performance.
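
The abstract refers to a weighting strategy based on the Gini coefficient. As a rough illustration of that statistic only, and not of GSDA's actual weighting scheme, the following minimal Python sketch computes the Gini coefficient of item interaction counts. The function name `gini` and the example counts are assumptions made for illustration.

```python
def gini(counts):
    """Gini coefficient of non-negative values.

    Returns 0.0 for a perfectly even distribution and approaches 1.0
    as the total concentrates on a single item.
    """
    xs = sorted(counts)          # ascending order is required by the formula
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # G = 2 * sum_i(i * x_i) / (n * sum_i x_i) - (n + 1) / n, with i starting at 1
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


# Hypothetical interaction counts per item (illustrative values only).
even_counts = [5, 5, 5, 5]
skewed_counts = [0, 0, 0, 10]
print(gini(even_counts), gini(skewed_counts))
```

A higher value indicates that interactions are concentrated on fewer items, which is the kind of distribution-level signal a debiasing weight could respond to.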

preprint, 2026, arXiv

Post-hoc Provider Fairness Adaptation via Hierarchical Exposure Alignment

Provider exposure fairness is crucial for sustaining a healthy content ecosystem and preventing monopolization in recommender systems. Yet, most existing methods either incorporate fairness constraints during model training, requiring expensive retraining when fairness objectives change, or rely on post-hoc reranking with fixed criteria, which lacks adaptability to diverse fairness requirements. To overcome these limitations, we propose Post-hoc Fairness Adaptation (PFA), a lightweight framework that equips a frozen recommender with a fairness adapter, enabling flexible fairness control without retraining the backbone model. Specifically, the fairness adapter learns personalized additive score adjustments from user-item embeddings, which are injected into the original ranking scores to steer provider exposure toward fairness. To train the adapter, we minimize the KL divergence between the actual and the target fair exposure distributions. However, this global objective implicitly treats all providers equally, ignoring structural disparities such as imbalanced provider group sizes and heterogeneous exposure within groups. Consequently, fairness may appear satisfied at an aggregate level while severe inter-group and intra-group exposure imbalances persist, undermining practical fairness. To address this, we design Hierarchical Exposure Fairness Alignment (HEFA), which explicitly balances inter- and intra-group provider exposure disparities, enabling flexible adaptation to diverse fairness requirements. To mitigate potential accuracy degradation, PFA jointly optimizes HEFA with a differentiable NDCG loss, enabling end-to-end fairness optimization while preserving ranking quality. Extensive experiments on three public datasets demonstrate that PFA achieves substantial fairness gains with negligible accuracy loss, consistently outperforming strong baselines.
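
The adapter objective described above minimizes the KL divergence between actual and target exposure distributions. The sketch below shows that quantity on plain Python dictionaries. It assumes the direction KL(actual || target), which the abstract does not specify, and the provider names and exposure values are hypothetical. It is not the PFA training loss, which also involves HEFA and a differentiable NDCG term.

```python
import math


def normalize(exposure):
    """Turn raw per-provider exposure into a probability distribution."""
    total = sum(exposure.values())
    return {provider: value / total for provider, value in exposure.items()}


def kl_divergence(actual, target, eps=1e-12):
    """KL(actual || target) over a shared set of providers.

    eps guards against division by zero when the target assigns
    zero mass to a provider (an assumption of this sketch).
    """
    return sum(
        p * math.log((p + eps) / (target[provider] + eps))
        for provider, p in actual.items()
        if p > 0
    )


# Hypothetical exposure counts for two providers (illustrative only).
actual = normalize({"provider_a": 75, "provider_b": 25})
target = normalize({"provider_a": 50, "provider_b": 50})
print(kl_divergence(actual, target))
```

The value is zero when the two distributions match and grows as actual exposure drifts from the target.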

preprint, 2022, arXiv

A Unified End-to-End Retriever-Reader Framework for Knowledge-based VQA

Knowledge-based Visual Question Answering (VQA) expects models to rely on external knowledge for robust answer prediction. Significant as this task is, this paper identifies several leading factors impeding the advancement of current state-of-the-art methods. On the one hand, methods that exploit explicit knowledge treat that knowledge as a complement to a coarsely trained VQA model. Despite their effectiveness, these approaches often suffer from noise incorporation and error propagation. On the other hand, multi-modal implicit knowledge for knowledge-based VQA remains largely unexplored. This work presents a unified end-to-end retriever-reader framework for knowledge-based VQA. In particular, we shed light on the multi-modal implicit knowledge from vision-language pre-training models to mine its potential in knowledge reasoning. As for the noise problem encountered by the retrieval operation on explicit knowledge, we design a novel scheme to create pseudo labels for effective knowledge supervision. This scheme not only provides guidance for knowledge retrieval, but also drops instances that are potentially error-prone for question answering. To validate the effectiveness of the proposed method, we conduct extensive experiments on the benchmark dataset. The experimental results reveal that our method outperforms existing baselines by a noticeable margin. Beyond the reported numbers, this paper further offers several insights on knowledge utilization for future research, along with some empirical findings.

preprint, 2022, arXiv

Disentangled Graph Neural Networks for Session-based Recommendation

Session-based recommendation (SBR) has drawn increasing research attention in recent years, due to its great practical value in exploiting only the limited user behavior history in the current session. Existing methods typically learn the session embedding at the item level, namely, aggregating the embeddings of items with or without the attention weights assigned to items. However, they ignore the fact that a user's intent in adopting an item is driven by certain factors of the item (e.g., the leading actors of a movie). In other words, they have not explored finer-granularity interests of users at the factor level to generate the session embedding, leading to sub-optimal performance. To address the problem, we propose a novel method called Disentangled Graph Neural Network (Disen-GNN) to capture the session purpose with the consideration of factor-level attention on each item. Specifically, we first employ the disentangled learning technique to cast item embeddings into the embeddings of multiple factors, and then use the gated graph neural network (GGNN) to learn the embedding factor-wise based on the item adjacent similarity matrix computed for each factor. Moreover, the distance correlation is adopted to enhance the independence between each pair of factors. After representing each item with independent factors, an attention mechanism is designed to learn user intent toward different factors of each item in the session. The session embedding is then generated by aggregating the item embeddings with attention weights of each item's factors. In this way, our model takes user intents at the factor level into account to infer the user purpose in a session. Extensive experiments on three benchmark datasets demonstrate the superiority of our method over existing methods.

preprint, 2022, arXiv

Temporal Action Localization with Multi-temporal Scales

Temporal action localization, which aims to localize and classify actions in untrimmed videos, plays an important role in video analysis. Previous methods often predict actions on a feature space of a single temporal scale. However, the temporal features of a low-level scale lack enough semantics for action classification, while a high-level scale cannot provide rich details of the action boundaries. To address this issue, we propose to predict actions on a feature space of multi-temporal scales. Specifically, we use refined feature pyramids of different scales to pass semantics from high-level scales to low-level scales. Besides, to establish the long temporal scale of the entire video, we use a spatial-temporal transformer encoder to capture the long-range dependencies of video frames. Then the refined features with long-range dependencies are fed into a classifier for the coarse action prediction. Finally, to further improve the prediction accuracy, we propose to use a frame-level self-attention module to refine the classification and boundaries of each action instance. Extensive experiments show that the proposed method outperforms state-of-the-art approaches on the THUMOS14 dataset and achieves comparable performance on the ActivityNet1.3 dataset. Compared with A2Net (TIP20, Avg{0.3:0.7}), Sub-Action (CSVT2022, Avg{0.1:0.5}), and AFSD (CVPR21, Avg{0.3:0.7}) on the THUMOS14 dataset, the proposed method achieves improvements of 12.6%, 17.4% and 2.2%, respectively.

preprint, 2021, arXiv

Liquidation, Leverage and Optimal Margin in Bitcoin Futures Markets

Using the generalized extreme value theory to characterize tail distributions, we address liquidation, leverage, and optimal margins for bitcoin long and short futures positions. The empirical analysis of perpetual bitcoin futures on BitMEX shows that (1) daily forced liquidations relative to outstanding futures are substantial, at 3.51% and 1.89% for long and short positions, respectively; (2) investors who are forcibly liquidated do trade aggressively, with average leverage of 60X; and (3) exchanges should elevate the current 1% margin requirement to 33% (3X leverage) for long and 20% (5X leverage) for short positions to reduce the daily margin call probability to 1%. Our results further suggest that a normality assumption on returns significantly underestimates optimal margins. Policy implications are also discussed.
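
The margin and leverage figures quoted above are related by simple arithmetic: maximum leverage is the reciprocal of the margin rate. The short Python sketch below reproduces only that relation. It does not model the paper's extreme value analysis, and the function name `max_leverage` is an assumption made for illustration.

```python
def max_leverage(margin_rate):
    """Maximum leverage implied by a margin requirement.

    margin_rate is a fraction of position value, e.g. 0.20 for 20%.
    """
    if not 0 < margin_rate <= 1:
        raise ValueError("margin_rate must be in (0, 1]")
    return 1.0 / margin_rate


# Margin rates quoted in the abstract: current 1%, proposed 33% (long), 20% (short).
for rate in (0.01, 0.33, 0.20):
    print(f"margin {rate:.0%} -> about {max_leverage(rate):.1f}X leverage")
```

A 33% margin gives roughly 3.03X, which the abstract rounds to 3X.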

preprint, 2020, arXiv

A^2-GCN: An Attribute-aware Attentive GCN Model for Recommendation

As important side information, attributes have been widely exploited in existing recommender systems for better performance. In real-world scenarios, it is common that some attributes of items/users are missing (e.g., some movies lack genre data). Prior studies usually use a default value (i.e., "other") to represent the missing attribute, resulting in sub-optimal performance. To address this problem, in this paper, we present an attribute-aware attentive graph convolution network (A^2-GCN). In particular, we first construct a graph, whereby users, items, and attributes are three types of nodes and their associations are edges. Thereafter, we leverage the graph convolution network to characterize the complicated interactions among <users, items, attributes>. To learn the node representation, we turn to the message-passing strategy to aggregate the messages passed from the other directly linked types of nodes (e.g., a user or an attribute). In this way, we are capable of incorporating associated attributes to strengthen the user and item representations, and thus naturally solve the missing-attribute problem. Considering the fact that for different users, the attributes of an item have different influence on their preference for this item, we design a novel attention mechanism to filter the message passed from an item to a target user by considering the attribute information. Extensive experiments have been conducted on several publicly accessible datasets to justify our model. Results show that our model outperforms several state-of-the-art methods and demonstrate the effectiveness of our attention method.

preprint, 2020, arXiv

Direction Finding of Electromagnetic Sources on a Sparse Cross-Dipole Array Using One-Bit Measurements

Sparse array arrangements have been widely used in vector-sensor arrays because of their increased degrees of freedom for identifying more sources than sensors. For large-size sparse vector-sensor arrays, one-bit measurements can further reduce the receiver system complexity by using low-resolution ADCs. In this paper, we present a sparse cross-dipole array with one-bit measurements to estimate the directions of arrival (DOAs) of electromagnetic sources. Based on the independence assumption of sources, we establish the relation between the covariance matrix of one-bit measurements and that of unquantized measurements via the Bussgang theorem. Then we develop a Spatial-Smooth MUSIC (SS-MUSIC) based method, One-Bit MUSIC (OB-MUSIC), to estimate the DOAs. By jointly utilizing the covariance matrices of two dipole arrays, we find that OB-MUSIC is robust against polarization states. We also derive the Cramér-Rao bound (CRB) of DOA estimation for the proposed scheme. Furthermore, we theoretically analyze the applicability of the independence assumption of sources, which underlies the proposed and other typical methods, and verify the assumption in typical communication applications. Numerical results show that, with the same number of sensors, one-bit sparse cross-dipole arrays have performance comparable to unquantized uniform linear arrays and thus provide a compromise between the DOA estimation performance and the system complexity.

preprint, 2020, arXiv

Dual-level Semantic Transfer Deep Hashing for Efficient Social Image Retrieval

Social networks store and disseminate a tremendous amount of user-shared images. Deep hashing is an efficient indexing technique to support large-scale social image retrieval, due to its deep representation capability, fast retrieval speed and low storage cost. In particular, unsupervised deep hashing scales well, as it does not require any manually labelled data for training. However, owing to the lack of label guidance, existing methods suffer from a severe semantic shortage when optimizing a large number of deep neural network parameters. In contrast, in this paper, we propose a Dual-level Semantic Transfer Deep Hashing (DSTDH) method to alleviate this problem with a unified deep hash learning framework. Our model aims to learn semantically enhanced deep hash codes by specifically exploiting the user-generated tags associated with the social images. Specifically, we design a complementary dual-level semantic transfer mechanism to efficiently discover the potential semantics of tags and seamlessly transfer them into binary hash codes. On the one hand, instance-level semantics are directly preserved into hash codes from the associated tags, with adverse noise removed. On the other hand, an image-concept hypergraph is constructed for indirectly transferring the latent high-order semantic correlations of images and tags into hash codes. Moreover, the hash codes are obtained simultaneously with the deep representation learning by the discrete hash optimization strategy. Extensive experiments on two public social image retrieval datasets validate the superior performance of our method compared with state-of-the-art hashing methods. The source code of our method can be obtained at https://github.com/research2020-1/DSTDH

preprint, 2020, arXiv

Multi-Feature Discrete Collaborative Filtering for Fast Cold-start Recommendation

Hashing is an effective technique to address the large-scale recommendation problem, due to its high computation and storage efficiency in calculating user preferences on items. However, existing hashing-based recommendation methods still suffer from two important problems: 1) Their recommendation process mainly relies on the user-item interactions and a single specific content feature. When the interaction history or the content feature is unavailable (the cold-start problem), their performance deteriorates seriously. 2) Existing methods learn the hash codes with relaxed optimization or adopt discrete coordinate descent to directly solve binary hash codes, which results in significant quantization loss or consumes considerable computation time. In this paper, we propose a fast cold-start recommendation method, called Multi-Feature Discrete Collaborative Filtering (MFDCF), to solve these problems. Specifically, a low-rank self-weighted multi-feature fusion module is designed to adaptively project the multiple content features into binary yet informative hash codes by fully exploiting their complementarity. Additionally, we develop a fast discrete optimization algorithm to directly compute the binary hash codes with simple operations. Experiments on two public recommendation datasets demonstrate that MFDCF outperforms state-of-the-art methods in various aspects.
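
To illustrate why binary hash codes make preference scoring cheap, the following minimal Python sketch ranks items for a user by Hamming distance between codes with entries in {-1, +1}. For codes of length r, the inner product equals r minus twice the Hamming distance, so both give the same ranking. This is a generic sketch of hashing-based scoring, not the MFDCF algorithm, and the codes and item names are hypothetical.

```python
def hamming_distance(a, b):
    """Number of positions where two equal-length +/-1 codes differ."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def inner_product(a, b):
    """Inner product of two +/-1 codes; equals len(a) - 2 * hamming_distance(a, b)."""
    return sum(x * y for x, y in zip(a, b))


def rank_items(user_code, item_codes):
    """Return item ids ordered from closest to farthest code."""
    return sorted(item_codes, key=lambda item: hamming_distance(user_code, item_codes[item]))


# Hypothetical 4-bit codes (illustrative only).
user = [1, -1, 1, 1]
items = {
    "item_x": [1, -1, 1, 1],    # distance 0
    "item_y": [1, 1, 1, -1],    # distance 2
    "item_z": [-1, 1, -1, -1],  # distance 4
}
print(rank_items(user, items))
```

In practice such codes are usually packed into machine words so that the distance reduces to an XOR followed by a population count.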