Source author record

Ziheng Chen

Ziheng Chen appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher

Unclaimed source record

Machine Learning, Artificial Intelligence, Computation and Language, Computer Vision, hep-ex, math.NA, math.PR, Numerical Analysis, physics.ins-det, q-fin.PR, Computer Science and Game Theory, Information Retrieval, Neural and Evolutionary Computing, Symbolic Computation

Catalog footprint

What is connected

15 works

14 topics

4 close collaborators

preprint, 2026, arXiv

FROG: Fair Removal on Graphs

With growing emphasis on privacy regulations, machine unlearning has become increasingly critical in real-world applications such as social networks and recommender systems, many of which are naturally represented as graphs. However, existing graph unlearning methods often modify nodes or edges indiscriminately, overlooking their impact on fairness. For instance, forgetting links between users of different genders may inadvertently exacerbate group disparities. To address this issue, we propose a novel framework that jointly optimizes both the graph structure and the model to achieve fair unlearning. Our method rewires the graph by removing redundant edges that hinder forgetting while preserving fairness through targeted edge augmentation. We further introduce a worst-case evaluation mechanism to assess robustness under challenging scenarios. Experiments on real-world datasets show that our approach achieves more effective and fair unlearning than existing baselines.

preprint, 2026, arXiv

On the Fair Allocation to Asymmetric Agents with Binary XOS Valuations

We study the problem of allocating $m$ indivisible goods among $n$ agents, where each agent's valuation is fractionally subadditive (XOS). With respect to AnyPrice Share (APS) fairness, Kulkarni et al. (2024) showed that, when agents have binary marginal values, a $0.1222$-APS allocation can be found in polynomial time, and there exists an instance where no allocation is better than $0.5$-approximate APS. Very recently, Feige and Grinberg (2025) extended the problem to the asymmetric case, where agents may have different entitlements, and improved the approximation ratio to $1/6$ for general XOS valuations. In this work, we focus on the asymmetric setting with binary XOS valuations, and further improve the approximation ratio to $1/2$, which matches the known upper bound. We also present a polynomial-time algorithm to compute such an allocation. Beyond APS fairness, we also study weighted maximin share (WMMS) fairness. Farhadi et al. (2019) showed that a $1/n$-WMMS allocation always exists for agents with general additive valuations, and that this approximation ratio is tight. We extend this result to general XOS valuations, where a $1/n$-WMMS allocation still exists, and this approximation ratio cannot be improved even when marginal values are binary. This stands in sharp contrast to binary additive valuations, where an exact WMMS allocation exists and can be found in polynomial time.

preprint, 2026, arXiv

Reinforcement Learning for Option Hedging: Static Implied-Volatility Fit versus Shortfall-Aware Performance

We extend the Q-learner in Black-Scholes (QLBS) framework by incorporating risk aversion and trading costs, and propose a novel Replication Learning of Option Pricing (RLOP) approach. Both methods are fully compatible with standard reinforcement learning algorithms and operate under market frictions. Using SPY and XOP option data, we evaluate performance along static and dynamic dimensions. Adaptive-QLBS achieves higher static pricing accuracy in implied volatility space, while RLOP delivers superior dynamic hedging performance by reducing shortfall probability. These results highlight the importance of evaluating option pricing models beyond static fit, emphasizing realized hedging outcomes.

preprint, 2026, arXiv

Riemannian Networks over Full-Rank Correlation Matrices

Representations on the Symmetric Positive Definite (SPD) manifold have garnered significant attention across different applications. In contrast, the manifold of full-rank correlation matrices, a normalized alternative to SPD matrices, remains largely underexplored. This paper introduces Riemannian networks over the correlation manifold, leveraging five recently developed correlation geometries. We systematically extend basic layers, including Multinomial Logistic Regression (MLR), Fully Connected (FC), and convolutional layers, to these geometries. In addition, we present methods for accurate backpropagation for two correlation geometries. Experiments comparing our approach against existing SPD and Grassmannian networks demonstrate its effectiveness.

preprint, 2026, arXiv

Toward Understanding Unlearning Difficulty: A Mechanistic Perspective and Circuit-Guided Difficulty Metric

Machine unlearning is becoming essential for building trustworthy and compliant language models. Yet unlearning success varies considerably across individual samples: some are reliably erased, while others persist despite the same procedure. We argue that this disparity is not only a data-side phenomenon, but also reflects model-internal mechanisms that encode and protect memorized information. We study this problem from a mechanistic perspective based on model circuits: structured interaction pathways that govern how predictions are formed. We propose Circuit-guided Unlearning Difficulty (CUD), a pre-unlearning metric that assigns each sample a continuous difficulty score using circuit-level signals. Extensive experiments demonstrate that CUD reliably separates intrinsically easy and hard samples, and remains stable across unlearning methods. We identify key circuit-level patterns that reveal a mechanistic signature of difficulty: easy-to-unlearn samples are associated with shorter, shallower interactions concentrated in earlier-to-intermediate parts of the original model, whereas hard samples rely on longer and deeper pathways closer to late-stage computation. Compared to existing qualitative studies, CUD takes a first step toward a principled, fine-grained, and interpretable analysis of unlearning difficulty, and motivates the development of unlearning methods grounded in model mechanisms.

preprint, 2022, arXiv

AdaLoGN: Adaptive Logic Graph Network for Reasoning-Based Machine Reading Comprehension

Recent machine reading comprehension datasets such as ReClor and LogiQA require performing logical reasoning over text. Conventional neural models are insufficient for logical reasoning, while symbolic reasoners cannot be directly applied to text. To meet the challenge, we present a neural-symbolic approach which, to predict an answer, passes messages over a graph representing logical relations between text units. It incorporates an adaptive logic graph network (AdaLoGN) which adaptively infers logical relations to extend the graph and, essentially, realizes mutual and iterative reinforcement between neural and symbolic reasoning. We also implement a novel subgraph-to-node message passing mechanism to enhance context-option interaction for answering multiple-choice questions. Our approach shows promising results on ReClor and LogiQA.

preprint, 2022, arXiv

DreamNet: A Deep Riemannian Network based on SPD Manifold Learning for Visual Classification

Image set-based visual classification methods have achieved remarkable performance by characterising the image set in terms of a non-singular covariance matrix on a symmetric positive definite (SPD) manifold. To adapt better to complicated visual scenarios, several Riemannian networks (RiemNets) for SPD matrix nonlinear processing have recently been studied. However, it is pertinent to ask whether greater accuracy gains can be achieved by simply increasing the depth of RiemNets. The answer appears to be negative, as deeper RiemNets tend to lose generalization ability. To explore a possible solution to this issue, we propose a new architecture for SPD matrix learning. Specifically, to enrich the deep representations, we adopt SPDNet [1] as the backbone, with a stacked Riemannian autoencoder (SRAE) built on the tail. The associated reconstruction error term can make the embedding functions of both the SRAE and each RAE an approximate identity mapping, which helps to prevent the degradation of statistical information. We then insert several residual-like blocks with shortcut connections to augment the representational capacity of the SRAE, and to simplify the training of a deeper network. The experimental evidence demonstrates that our DreamNet can achieve improved accuracy with increased depth of the network.

preprint, 2022, arXiv

GREASE: Generate Factual and Counterfactual Explanations for GNN-based Recommendations

Recently, graph neural networks (GNNs) have been widely used to develop successful recommender systems. Although powerful, it is very difficult for a GNN-based recommender system to attach tangible explanations of why a specific item ends up in the list of suggestions for a given user. Indeed, explaining GNN-based recommendations is unique, and existing GNN explanation methods are inappropriate for two reasons. First, traditional GNN explanation methods are designed for node, edge, or graph classification tasks rather than ranking, as in recommender systems. Second, standard machine learning explanations are usually intended to support skilled decision-makers. Instead, recommendations are designed for any end-user, and thus their explanations should be provided in user-understandable ways. In this work, we propose GREASE, a novel method for explaining the suggestions provided by any black-box GNN-based recommender system. Specifically, GREASE first trains a surrogate model on a target user-item pair and its $l$-hop neighborhood. Then, it generates both factual and counterfactual explanations by finding optimal adjacency matrix perturbations to capture the sufficient and necessary conditions for an item to be recommended, respectively. Experimental results conducted on real-world datasets demonstrate that GREASE can generate concise and effective explanations for popular GNN-based recommender models.

preprint, 2022, arXiv

Large deviations principle for stochastic delay differential equations with super-linearly growing coefficients

We utilize the weak convergence method to establish the Freidlin-Wentzell large deviations principle (LDP) for stochastic delay differential equations (SDDEs) with super-linearly growing coefficients, which covers a large class of cases with non-globally Lipschitz coefficients. The key ingredient in our proof is the uniform moment estimate of the controlled equation, where we handle the super-linear growth of the coefficients by an iterative argument. Our results allow both the drift and diffusion coefficients of the considered equations to grow super-linearly not only with respect to the delay variable but also with respect to the state variable. This work extends the existing results, which develop LDPs for SDDEs with coefficients that grow super-linearly only with respect to the delay variable.

preprint, 2022, arXiv

RLOP: RL Methods in Option Pricing from a Mathematical Perspective

In this work, we build two environments, namely the modified QLBS and RLOP models, from a mathematical perspective, which enable RL methods in option pricing through portfolio replication. We implement the environment specifications (the source code can be found at https://github.com/owen8877/RLOP), the learning algorithm, and agent parametrization by a neural network. The learned optimal hedging strategy is compared against the BS prediction. The effect of various factors is considered and studied based on how they affect the optimal price and position.
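For readers unfamiliar with the BS (Black-Scholes) benchmark mentioned in the abstract, the following is a minimal sketch of the standard closed-form price and delta of a European call. It is not taken from the RLOP repository; the function names and the example parameters are illustrative assumptions, and the paper's comparison may use a different setup.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(spot: float, strike: float, rate: float, vol: float, maturity: float):
    """Return (price, delta) of a European call under Black-Scholes.

    Hypothetical helper for illustration only; assumes no dividends.
    """
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * maturity) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    price = spot * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d2)
    delta = norm_cdf(d1)  # hedge ratio: shares of the underlying per option
    return price, delta


# Illustrative at-the-money example (parameters are assumptions).
price, delta = bs_call(spot=100.0, strike=100.0, rate=0.05, vol=0.2, maturity=1.0)
print(f"price={price:.4f} delta={delta:.4f}")
```

A learned hedging position can then be compared against `delta` at each rebalancing date, which is one plausible reading of "compared against the BS prediction".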

preprint, 2021, arXiv

Costly Features Classification using Monte Carlo Tree Search

We consider the problem of costly feature classification, where we sequentially select a subset of features to strike a balance between the classification error and the feature cost. In this paper, we first cast the task as an MDP and use the Advantage Actor-Critic algorithm to solve it. In order to further improve the agent's performance and make the policy explainable, we employ Monte Carlo Tree Search to update the policy iteratively. We also consider its performance on unbalanced datasets and its sensitivity to missing values. We evaluate our model on multiple datasets and find that it outperforms other methods.
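One way to picture the MDP formulation described above is the sketch below: each step either buys one more feature at a cost or stops and predicts a class. The class name, the reward shaping (negative feature cost, fixed misclassification penalty) and the action encoding are assumptions made for illustration, not the paper's specification.

```python
from dataclasses import dataclass, field


@dataclass
class CostlyFeatureEnv:
    """Hypothetical single-sample environment for costly feature acquisition."""

    features: list            # true feature values of the sample
    label: int                # true class of the sample
    costs: list               # acquisition cost of each feature
    error_penalty: float = 1.0
    acquired: set = field(default_factory=set)

    def observe(self):
        """State: feature values bought so far, None for the rest."""
        return [v if i in self.acquired else None for i, v in enumerate(self.features)]

    def step(self, action: int):
        """Actions 0..n-1 acquire a feature; action n+k predicts class k.

        Returns (state, reward, done).
        """
        n = len(self.features)
        if action < n:
            # Pay the feature cost once; re-acquiring is free in this sketch.
            reward = 0.0 if action in self.acquired else -self.costs[action]
            self.acquired.add(action)
            return self.observe(), reward, False
        predicted = action - n
        reward = 0.0 if predicted == self.label else -self.error_penalty
        return self.observe(), reward, True


env = CostlyFeatureEnv(features=[0.3, 1.7], label=1, costs=[0.1, 0.4])
state, reward, done = env.step(1)   # buy feature 1
final = env.step(3)                 # stop and predict class 1
```

An actor-critic agent or a tree search would choose the `action` values; here they are hard-coded only to show the interface.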

preprint, 2020, arXiv

A full-discrete exponential Euler approximation of invariant measure for parabolic stochastic partial differential equations

We discretize the ergodic semilinear stochastic partial differential equations in space dimension $d \leq 3$ with additive noise, spatially by a spectral Galerkin method and temporally by an exponential Euler scheme. It is shown that both the spatial semi-discretization and the spatio-temporal full discretization are ergodic. Further, convergence orders of the numerical invariant measures, depending on the regularity of noise, are recovered based on an easy time-independent weak error analysis without relying on Malliavin calculus. To be precise, the convergence order is $1-\varepsilon$ in space and $\frac{1}{2}-\varepsilon$ in time for the space-time white noise case and $2-\varepsilon$ in space and $1-\varepsilon$ in time for the trace class noise case in space dimension $d = 1$, with arbitrarily small $\varepsilon>0$. Numerical results are finally reported to confirm these theoretical findings.

preprint, 2020, arXiv

CLUE: A Fast Parallel Clustering Algorithm for High Granularity Calorimeters in High Energy Physics

One of the challenges of high granularity calorimeters, such as that to be built to cover the endcap region in the CMS Phase-2 Upgrade for HL-LHC, is that the large number of channels causes a surge in the computing load when clustering numerous digitised energy deposits (hits) in the reconstruction stage. In this article, we propose a fast and fully-parallelizable density-based clustering algorithm, optimized for high occupancy scenarios, where the number of clusters is much larger than the average number of hits in a cluster. The algorithm uses a grid spatial index for fast querying of neighbours and its timing scales linearly with the number of hits within the range considered. We also show a comparison of the performance on CPU and GPU implementations, demonstrating the power of algorithmic parallelization in the coming era of heterogeneous computing in high energy physics.
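The grid spatial index mentioned in the abstract can be illustrated with a small sketch: hits are binned into square cells, so a neighbour query only scans the few cells around a hit instead of all hits. This is a simplified 2D illustration under assumed names and parameters (`cell`, `radius`, unweighted counts); it is not the CLUE implementation, which includes further steps not shown here.

```python
from collections import defaultdict
import math


def build_grid(points, cell):
    """Map each grid cell (ix, iy) to the indices of the points inside it."""
    grid = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        grid[(math.floor(x / cell), math.floor(y / cell))].append(idx)
    return grid


def local_density(points, cell, radius):
    """For each point, count the points within `radius` (itself included)."""
    grid = build_grid(points, cell)
    reach = math.ceil(radius / cell)  # how many cells to scan in each direction
    densities = []
    for x, y in points:
        cx, cy = math.floor(x / cell), math.floor(y / cell)
        count = 0
        for gx in range(cx - reach, cx + reach + 1):
            for gy in range(cy - reach, cy + reach + 1):
                for j in grid.get((gx, gy), ()):
                    px, py = points[j]
                    if math.hypot(px - x, py - y) <= radius:
                        count += 1
        densities.append(count)
    return densities


# Three nearby hits and one isolated hit (made-up coordinates).
points = [(0.0, 0.0), (0.5, 0.0), (0.4, 0.3), (5.0, 5.0)]
densities = local_density(points, cell=1.0, radius=1.0)
print(densities)  # → [3, 3, 3, 1]
```

Because each query touches a bounded number of cells, the total work grows roughly linearly with the number of hits when occupancy per cell is bounded, and each point's loop is independent, which is what makes this kind of scheme easy to parallelize.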

preprint, 2020, arXiv

Reconstruction in an imaging calorimeter for HL-LHC

The CMS endcap calorimeter upgrade for the High Luminosity LHC in 2027 uses silicon sensors to achieve radiation tolerance, with the further benefit of a very high readout granularity. Small scintillator tiles with individual SiPM readout are used in regions permitted by the radiation levels. A reconstruction framework is being developed to fully exploit the granularity and other significant features of the detector like precision timing, especially in the high pileup environment of HL-LHC. An iterative clustering framework (TICL) has been put in place, and is being actively developed. The framework takes as input the clusters of energy deposited in individual calorimeter layers delivered by the CLUE algorithm, which has recently been revised and tuned. Mindful of the projected extreme pressure on computing capacity in the HL-LHC era, the algorithms are being designed with modern parallel architectures in mind. Important speedup has recently been obtained for the clustering algorithm by running it on GPUs. Machine learning techniques are being developed and integrated into the reconstruction framework. This paper will describe the approaches being considered and show first results.

preprint, 2020, arXiv

The Bayesian Inversion Problem for Thermal Average Sampling of Quantum Systems

In this article, we propose a novel method for sampling potential functions based on noisy observation data of a finite number of observables in quantum canonical ensembles, which leads to the accurate sampling of a wide class of test observables. The method is based on the Bayesian inversion framework, which provides a platform for analyzing the posterior distribution and naturally leads to an efficient numerical sampling algorithm. We highlight that the stability estimate is obtained by treating the potential functions as intermediate variables in the following way: the discrepancy between two sets of observation data of training observables bounds the distance between the corresponding posterior distributions of potential functions, which in turn bounds the discrepancies between the corresponding thermal averages of test observables. In addition, the training observables can be more flexible than finite samples of the local density function, which are mostly used in previous research. The method also applies to multi-level quantum systems in the non-adiabatic regime. Finally, we provide extensive numerical tests to verify the accuracy and efficiency of the proposed algorithm.
