Source author record

Shuo Han

Shuo Han appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


math.OC, Computer Vision, Cryptography and Security, Data Structures and Algorithms, eess.IV, Systems and Control, Machine Learning, Artificial Intelligence, Computation and Language, Social and Information Networks, cond-mat.mes-hall, cond-mat.mtrl-sci, cond-mat.supr-con, Multiagent Systems, physics.soc-ph

Catalog footprint

What is connected

20 works

15 topics

4 close collaborators

preprint, 2026, arXiv

Ratio-Variance Regularized Policy Optimization for Efficient LLM Fine-tuning

On-policy reinforcement learning (RL), particularly Proximal Policy Optimization (PPO) and Group Relative Policy Optimization (GRPO), has become the dominant paradigm for fine-tuning large language models (LLMs). While policy ratio clipping stabilizes training, this heuristic hard constraint incurs a fundamental cost: it indiscriminately truncates gradients from high-return yet high-divergence actions, suppressing rare but highly informative "eureka moments" in complex reasoning. Moreover, once data becomes slightly stale, hard clipping renders it unusable, leading to severe sample inefficiency. In this work, we revisit the trust-region objective in policy optimization and show that explicitly constraining the \emph{variance (second central moment) of the policy ratio} provides a principled and smooth relaxation of hard clipping. This distributional constraint stabilizes policy updates while preserving gradient signals from valuable trajectories. Building on this insight, we propose $R^2VPO$ (Ratio-Variance Regularized Policy Optimization), a novel primal-dual framework that supports stable on-policy learning and enables principled off-policy data reuse by dynamically reweighting stale samples rather than discarding them. We extensively evaluate $R^2VPO$ on fine-tuning state-of-the-art LLMs, including DeepSeek-Distill-Qwen-1.5B and the openPangu-Embedded series (1B and 7B), across challenging mathematical reasoning benchmarks. Experimental results show that $R^2VPO$ consistently achieves superior asymptotic performance, with average relative gains of up to 17% over strong clipping-based baselines, while requiring approximately 50% fewer rollouts to reach convergence. These findings establish ratio-variance control as a promising direction for improving both stability and data efficiency in RL-based LLM alignment.
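The idea of penalizing the variance of the policy ratio, rather than clipping it, can be illustrated with a minimal sketch. This is not the $R^2VPO$ algorithm from the paper: the function names, the batch-level variance estimate, the variance budget `budget`, and the dual step size `step` are all assumptions made here for illustration.

```python
import math


def ratio_variance_surrogate(logp_new, logp_old, advantages, lam):
    """Return (objective, variance) for a variance-penalized surrogate.

    Hypothetical illustration: the ratio-weighted advantage is kept unclipped,
    and a soft penalty lam * Var(ratio) replaces the hard clipping rule.
    """
    ratios = [math.exp(n - o) for n, o in zip(logp_new, logp_old)]
    count = len(ratios)
    mean_ratio = sum(ratios) / count
    # Second central moment of the policy ratio over the batch.
    variance = sum((r - mean_ratio) ** 2 for r in ratios) / count
    gain = sum(r * a for r, a in zip(ratios, advantages)) / count
    return gain - lam * variance, variance


def dual_update(lam, variance, budget, step):
    """Projected dual ascent: raise lam when the variance exceeds the budget."""
    return max(0.0, lam + step * (variance - budget))
```

In a primal-dual loop of this kind, the policy parameters would take an ascent step on the objective while `lam` is updated with `dual_update`, so the penalty tightens only when the ratio variance drifts above the assumed budget.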

preprint, 2026, arXiv

See or Say Graphs: Agent-Driven Scalable Graph Structure Understanding with Vision-Language Models

Vision-language models (VLMs) have shown promise in graph structure understanding, but remain limited by input-token constraints, facing scalability bottlenecks and lacking effective mechanisms to coordinate textual and visual modalities. To address these challenges, we propose GraphVista, a unified framework that enhances both scalability and modality coordination in graph structure understanding. For scalability, GraphVista organizes graph information hierarchically into a lightweight GraphRAG base, which retrieves only task-relevant textual descriptions and high-resolution visual subgraphs, compressing redundant context while preserving key reasoning elements. For modality coordination, GraphVista introduces a planning agent that decomposes and routes tasks to the most suitable modality, using the text modality for direct access to explicit graph properties and the visual modality for local graph structure reasoning grounded in explicit topology. Extensive experiments demonstrate that GraphVista scales to large graphs, up to 200$\times$ larger than those used in existing benchmarks, and consistently outperforms existing textual, visual, and fusion-based methods, achieving up to 4.4$\times$ quality improvement over the state-of-the-art baselines by fully exploiting the complementary strengths of both modalities.

preprint, 2023, arXiv

Optimal Decoy Resource Allocation for Proactive Defense in Probabilistic Attack Graphs

This paper investigates the problem of synthesizing proactive defense systems in which the defender can allocate deceptive targets and modify the cost of actions for the attacker who aims to compromise security assets in this system. We model the interaction of the attacker and the system using a formal security model -- a probabilistic attack graph. By allocating fake targets/decoys, the defender aims to distract the attacker from compromising true targets. By increasing the cost of some attack actions, the defender aims to discourage the attacker from committing to certain policies and thereby improve the defense. To optimize the defense given limited decoy resources and operational constraints, we formulate the synthesis problem as a bi-level optimization problem in which the defender designs the system in anticipation of the attacker's best response, given that the attacker has disinformation about the system due to the use of deception. Though the general formulation with bi-level optimization is NP-hard, we show that under certain assumptions, the problem can be transformed into a constrained optimization problem. We propose an algorithm to approximately solve this constrained optimization problem using a novel incentive-design method for projected gradient ascent. We demonstrate the effectiveness of the proposed method using extensive numerical experiments.

preprint, 2022, arXiv

A Secure and Efficient Federated Learning Framework for NLP

In this work, we consider the problem of designing secure and efficient federated learning (FL) frameworks. Existing solutions either involve a trusted aggregator or require heavyweight cryptographic primitives, which degrades performance significantly. Moreover, many existing secure FL designs work only under the restrictive assumption that none of the clients drops out of the training protocol. To tackle these problems, we propose SEFL, a secure and efficient FL framework that (1) eliminates the need for trusted entities; (2) achieves similar or even better model accuracy compared with existing FL designs; (3) is resilient to client dropouts. Through extensive experimental studies on natural language processing (NLP) tasks, we demonstrate that SEFL achieves accuracy comparable to existing FL solutions, and the proposed pruning technique can improve runtime performance by up to 13.7x.

preprint, 2022, arXiv

Coordinate Translator for Learning Deformable Medical Image Registration

The majority of deep learning (DL) based deformable image registration methods use convolutional neural networks (CNNs) to estimate displacement fields from pairs of moving and fixed images. This, however, requires the convolutional kernels in the CNN to not only extract intensity features from the inputs but also understand image coordinate systems. We argue that the latter task is challenging for traditional CNNs, limiting their performance in registration tasks. To tackle this problem, we first introduce Coordinate Translator, a differentiable module that identifies matched features between the fixed and moving image and outputs their coordinate correspondences without the need for training. It unloads the burden of understanding image coordinate systems for CNNs, allowing them to focus on feature extraction. We then propose a novel deformable registration network, im2grid, that uses multiple Coordinate Translators with the hierarchical features extracted from a CNN encoder and outputs a deformation field in a coarse-to-fine fashion. We compare im2grid with the state-of-the-art DL and non-DL methods for unsupervised 3D magnetic resonance image registration. Our experiments show that im2grid outperforms these methods both qualitatively and quantitatively.

preprint, 2022, arXiv

Disentangling A Single MR Modality

Disentangling anatomical and contrast information from medical images has gained attention recently, demonstrating benefits for various image analysis tasks. Current methods learn disentangled representations using either paired multi-modal images with the same underlying anatomy or auxiliary labels (e.g., manual delineations) to provide inductive bias for disentanglement. However, these requirements could significantly increase the time and cost in data collection and limit the applicability of these methods when such data are not available. Moreover, these methods generally do not guarantee disentanglement. In this paper, we present a novel framework that learns theoretically and practically superior disentanglement from single modality magnetic resonance images. Moreover, we propose a new information-based metric to quantitatively evaluate disentanglement. Comparisons over existing disentangling methods demonstrate that the proposed method achieves superior performance in both disentanglement and cross-domain image-to-image translation tasks.

preprint, 2022, arXiv

Pressure-induced superconductivity in flat-band Kagome compounds Pd$_3$P$_2$(S$_{1-x}$Se$_x$)$_8$

We performed high-pressure transport studies on the flat-band Kagome compounds, Pd$_3$P$_2$(S$_{1-x}$Se$_x$)$_8$ ($x$ = 0, 0.25), with a diamond anvil cell. For both compounds, the resistivity exhibits an insulating behavior with pressure up to 17 GPa. With pressure above 20 GPa, a metallic behavior is observed at high temperatures in Pd$_3$P$_2$S$_8$, and superconductivity emerges at low temperatures. The onset temperature of superconducting transition $T_{\rm C}$ rises monotonically from 2 K to 4.8 K and does not saturate with pressure up to 43 GPa. For the Se-doped compound Pd$_3$P$_2$(S$_{0.75}$Se$_{0.25}$)$_8$, the $T_{\rm C}$ is about 1.5 K higher than that of the undoped one over the whole pressure range, and reaches 6.4 K at 43 GPa. The upper critical field with field applied along the $c$ axis at typical pressures is about 50$\%$ of the Pauli limit, suggesting a 3D superconductivity. The Hall coefficient in the metallic phase is low and exhibits a peaked behavior at about 30 K, which suggests either a multi-band electronic structure or an electron correlation effect in the system.

preprint, 2021, arXiv

CHAOS Challenge -- Combined (CT-MR) Healthy Abdominal Organ Segmentation

Segmentation of abdominal organs has been a comprehensive, yet unresolved, research field for many years. In the last decade, intensive developments in deep learning (DL) have introduced new state-of-the-art segmentation systems. In order to expand the knowledge on these topics, the CHAOS - Combined (CT-MR) Healthy Abdominal Organ Segmentation challenge has been organized in conjunction with IEEE International Symposium on Biomedical Imaging (ISBI), 2019, in Venice, Italy. CHAOS provides both abdominal CT and MR data from healthy subjects for single and multiple abdominal organ segmentation. Five different but complementary tasks have been designed to analyze the capabilities of current approaches from multiple perspectives. The results are investigated thoroughly, compared with manual annotations and interactive methods. The analysis shows that DL models for a single modality (CT / MR) can deliver reliable volumetric analysis performance (DICE: 0.98 $\pm$ 0.00 / 0.95 $\pm$ 0.01) but the best MSSD performance remains limited (21.89 $\pm$ 13.94 / 20.85 $\pm$ 10.63 mm). The performances of participating models decrease significantly for cross-modality tasks for the liver (DICE: 0.88 $\pm$ 0.15 MSSD: 36.33 $\pm$ 21.97 mm) and all organs (DICE: 0.85 $\pm$ 0.21 MSSD: 33.17 $\pm$ 38.93 mm). Despite contrary examples on different applications, multi-tasking DL models designed to segment all organs seem to perform worse compared to organ-specific ones (performance drop around 5\%). Besides, such directions of further research for cross-modality segmentation would significantly support real-world clinical applications. Moreover, having more than 1500 participants, another important contribution of the paper is the analysis on shortcomings of challenge organizations such as the effects of multiple submissions and peeking phenomena.

preprint, 2020, arXiv

Atomic structure of CdS magic-size clusters by X-ray absorption spectroscopy

Magic-size clusters are ultra-small colloidal semiconductor systems that are intensively studied due to their monodisperse nature and sharp UV-vis absorption peak compared with regular quantum dots. However, the small size of such clusters (<2 nm), and the large surface-to-bulk ratio significantly limit characterisation techniques that can be utilised. Here we demonstrate how a combination of EXAFS and XANES can be used to obtain information about sample stoichiometry and cluster symmetry. Investigating two types of clusters that show sharp UV-vis absorption peaks at 311 nm and 322 nm, we found that both samples possess approximately 2:1 Cd:S ratio and have similar nearest-neighbour structural arrangements. However, both samples demonstrate a significant departure from the tetrahedral structural arrangement, with an average bond angle determined to be around 106.1 degrees showing a bi-fold bond angle distribution. Our results suggest that both samples are quasi-isomers. Their core structure has identical chemical composition but a different atomic arrangement with distinct bond angle distributions.

preprint, 2020, arXiv

Gradient Methods with Dynamic Inexact Oracles

We show that the primal-dual gradient method, also known as the gradient descent ascent method, for solving convex-concave minimax problems can be viewed as an inexact gradient method applied to the primal problem. The gradient, whose exact computation relies on solving the inner maximization problem, is computed approximately by another gradient method. To model the approximate computational routine implemented by iterative algorithms, we introduce the notion of dynamic inexact oracles, which are discrete-time dynamical systems whose output asymptotically approaches the output of an exact oracle. We present a unified convergence analysis for dynamic inexact oracles realized by general first-order methods and demonstrate its use in creating new accelerated primal-dual algorithms.
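A toy example may make the "inexact gradient" reading concrete. The sketch below runs plain gradient descent ascent on an assumed objective $f(x, y) = \tfrac{1}{2}x^2 + xy - \tfrac{1}{2}y^2$, chosen here only because its inner maximizer is $y^*(x) = x$ and its primal function is $\max_y f(x, y) = x^2$. The objective, step size, and iteration count are illustrative assumptions, not taken from the paper.

```python
def gradient_descent_ascent(x, y, step=0.1, iters=500):
    """Simultaneous GDA on f(x, y) = 0.5*x**2 + x*y - 0.5*y**2.

    The exact primal gradient is 2*x, obtained by plugging y*(x) = x into
    df/dx = x + y. GDA instead uses the current y, which only tracks y*(x)
    approximately, so the x-update is an inexact gradient step on the primal.
    """
    for _ in range(iters):
        grad_x = x + y  # df/dx evaluated at the current (inexact) y
        grad_y = x - y  # df/dy; one ascent step moves y toward y*(x) = x
        x, y = x - step * grad_x, y + step * grad_y
    return x, y


if __name__ == "__main__":
    print(gradient_descent_ascent(1.0, -2.0))
```

For this objective and step size the iterates contract toward the saddle point at the origin; other objectives or larger steps may need a different choice.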

preprint, 2019, arXiv

Center-Extraction-Based Three Dimensional Nuclei Instance Segmentation of Fluorescence Microscopy Images

Fluorescence microscopy is an essential tool for the analysis of 3D subcellular structures in tissue. An important step in the characterization of tissue involves nuclei segmentation. In this paper, a two-stage method for segmentation of nuclei using convolutional neural networks (CNNs) is described. In particular, since creating labeled volumes manually for training purposes is not practical due to the size and complexity of the 3D data sets, the paper describes a method for generating synthetic microscopy volumes based on a spatially constrained cycle-consistent adversarial network. The proposed method is tested on multiple real microscopy data sets and outperforms other commonly used segmentation techniques.

preprint, 2016, arXiv

Bio-Inspired Framework for Allocation of Protection Resources in Cyber-Physical Networks

In this chapter, we consider the problem of designing protection strategies to contain spreading processes in complex cyber-physical networks. We illustrate our ideas using a family of bio-motivated spreading models originally proposed in the epidemiological literature, e.g., the Susceptible-Infected-Susceptible (SIS) model. We first introduce a framework in which we are allowed to distribute two types of resources in order to contain the spread, namely, (i) preventive resources able to reduce the spreading rate, and (ii) corrective resources able to increase the recovery rate of nodes in which the resources are allocated. In practice, these resources have an associated cost that depends on either the resiliency level achieved by the preventive resource, or the restoration efficiency of the corrective resource. We present a mathematical framework, based on dynamic systems theory and convex optimization, to find the cost-optimal distribution of protection resources in a network to contain the spread. We also present two extensions to this framework in which (i) we consider generalized epidemic models, beyond the simple SIS model, and (ii) we assume uncertainties in the contact network in which the spreading is taking place. We compare these protection strategies with common heuristics previously proposed in the literature and illustrate our results with numerical simulations using the air traffic network.

preprint, 2016, arXiv

Taxi Dispatch with Real-Time Sensing Data in Metropolitan Areas: A Receding Horizon Control Approach

Traditional taxi systems in metropolitan areas often suffer from inefficiencies due to uncoordinated actions as system capacity and customer demand change. With the pervasive deployment of networked sensors in modern vehicles, large amounts of information regarding customer demand and system status can be collected in real time. This information provides opportunities to perform various types of control and coordination for large-scale intelligent transportation systems. In this paper, we present a receding horizon control (RHC) framework to dispatch taxis, which incorporates highly spatiotemporally correlated demand/supply models and real-time GPS location and occupancy information. The objectives include matching the spatiotemporal ratio between demand and supply for service quality with minimum current and anticipated future taxi idle driving distance. Extensive trace-driven analysis with a data set containing taxi operational records in San Francisco shows that our solution reduces the average total idle distance by 52%, and reduces the supply-demand ratio error across the city during one experimental time slot by 45%. Moreover, our RHC framework is compatible with a wide variety of predictive models and optimization problem formulations. This compatibility property allows us to solve robust optimization problems with corresponding demand uncertainty models that provide disruptive event information.

preprint, 2015, arXiv

Convex Optimal Uncertainty Quantification

Optimal uncertainty quantification (OUQ) is a framework for numerical extreme-case analysis of stochastic systems with imperfect knowledge of the underlying probability distribution. This paper presents sufficient conditions under which an OUQ problem can be reformulated as a finite-dimensional convex optimization problem, for which efficient numerical solutions can be obtained. The sufficient conditions include that the objective function is piecewise concave and the constraints are piecewise convex. In particular, we show that piecewise concave objective functions may appear in applications where the objective is defined by the optimal value of a parameterized linear program.

preprint, 2015, arXiv

Gradual Release of Sensitive Data under Differential Privacy

We introduce the problem of releasing sensitive data under differential privacy when the privacy level is subject to change over time. Existing work assumes that privacy level is determined by the system designer as a fixed value before sensitive data is released. For certain applications, however, users may wish to relax the privacy level for subsequent releases of the same data after either a re-evaluation of the privacy concerns or the need for better accuracy. Specifically, given a database containing sensitive data, we assume that a response $y_1$ that preserves $ε_{1}$-differential privacy has already been published. Then, the privacy level is relaxed to $ε_2$, with $ε_2 > ε_1$, and we wish to publish a more accurate response $y_2$ while the joint response $(y_1, y_2)$ preserves $ε_2$-differential privacy. How much accuracy is lost in the scenario of gradually releasing two responses $y_1$ and $y_2$ compared to the scenario of releasing a single response that is $ε_{2}$-differentially private? Our results show that there exists a composite mechanism that achieves \textit{no loss} in accuracy. We consider the case in which the private data lies within $\mathbb{R}^{n}$ with an adjacency relation induced by the $\ell_{1}$-norm, and we focus on mechanisms that approximate identity queries. We show that the same accuracy can be achieved in the case of gradual release through a mechanism whose outputs can be described by a \textit{lazy Markov stochastic process}. This stochastic process has a closed form expression and can be efficiently sampled. Our results are applicable beyond identity queries. To this end, we demonstrate that our results can be applied in several cases, including Google's RAPPOR project, trading of sensitive data, and controlled transmission of private data in a social network.

preprint, 2015, arXiv

Optimal control in Markov decision processes via distributed optimization

Optimal control synthesis in stochastic systems with respect to quantitative temporal logic constraints can be formulated as linear programming problems. However, centralized synthesis algorithms do not scale to many practical systems. To tackle this issue, we propose a decomposition-based distributed synthesis algorithm. By decomposing a large-scale stochastic system modeled as a Markov decision process into a collection of interacting sub-systems, the original control problem is formulated as a linear programming problem with a sparse constraint matrix, which can be solved through distributed optimization methods. Additionally, we propose a decomposition algorithm which automatically exploits the modular structure of a given large-scale system, if one exists. We illustrate the proposed methods through robotic motion planning examples.

preprint, 2015, arXiv

Optimality of the Laplace Mechanism in Differential Privacy

In the highly interconnected realm of Internet of Things, exchange of sensitive information raises severe privacy concerns. The Laplace mechanism -- adding Laplace-distributed artificial noise to sensitive data -- is one of the widely used methods of providing privacy guarantees within the framework of differential privacy. In this work, we present Lipschitz privacy, a slightly tighter version of differential privacy. We prove that the Laplace mechanism is optimal in the sense that it minimizes the mean-squared error for identity queries which provide privacy with respect to the $\ell_{1}$-norm. In addition to the $\ell_{1}$-norm which respects individuals' participation, we focus on the use of the $\ell_{2}$-norm which provides privacy of high-dimensional data. A variation of the Laplace mechanism is proven to have the optimal mean-squared error from the identity query. Finally, the optimal mechanism for the scenario in which individuals submit their high-dimensional sensitive data is derived.
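For readers unfamiliar with the Laplace mechanism named in this abstract, a minimal standard-library sketch follows. It shows the textbook construction for an identity query (add Laplace noise with scale equal to sensitivity divided by $\epsilon$), not the Lipschitz-privacy variants studied in the paper; the function names and parameters are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample.

    The difference of two independent exponential variables with mean
    `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_mechanism(values, sensitivity, epsilon, rng=None):
    """Release `values` with independent Laplace noise on each coordinate.

    `sensitivity` is the assumed l1-sensitivity of the identity query and
    `epsilon` is the privacy level; both must be positive.
    """
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    if rng is None:
        rng = random.Random()
    scale = sensitivity / epsilon
    return [v + laplace_noise(scale, rng) for v in values]


if __name__ == "__main__":
    print(laplace_mechanism([1.0, 2.0, 3.0], sensitivity=1.0, epsilon=0.5))
```

A smaller `epsilon` gives a larger noise scale, which is the usual trade-off between privacy level and mean-squared error.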

preprint, 2014, arXiv

Data-Driven Allocation of Vaccines for Controlling Epidemic Outbreaks

We propose a mathematical framework, based on conic geometric programming, to control a susceptible-infected-susceptible viral spreading process taking place in a directed contact network with unknown contact rates. We assume that we have access to time series data describing the evolution of the spreading process observed by a collection of sensor nodes over a finite time interval. We propose a data-driven robust convex optimization framework to find the optimal allocation of protection resources (e.g., vaccines and/or antidotes) to eradicate the viral spread at the fastest possible rate. In contrast to current network identification heuristics, in which a single network is identified to explain the observed data, we use available data to define an uncertainty set containing all networks that are coherent with empirical observations. Our characterization of this uncertainty set of networks is tractable in the context of conic geometric programming, recently proposed by Chandrasekaran and Shah, which allows us to efficiently find the optimal allocation of resources to control the worst-case spread that can take place in the uncertainty set of networks. We illustrate our approach in a transportation network from which we collect partial data about the dynamics of a hypothetical epidemic outbreak over a finite period of time.

preprint, 2014, arXiv

Differentially Private Convex Optimization with Piecewise Affine Objectives

Differential privacy is a recently proposed notion of privacy that provides strong privacy guarantees without any assumptions on the adversary. The paper studies the problem of computing a differentially private solution to convex optimization problems whose objective function is piecewise affine. Such problems are motivated by applications in which the affine functions that define the objective function contain sensitive user information. We propose several privacy preserving mechanisms and provide analysis on the trade-offs between optimality and the level of privacy for these mechanisms. Numerical experiments are also presented to evaluate their performance in practice.

preprint, 2014, arXiv

Differentially Private Distributed Constrained Optimization

Many resource allocation problems can be formulated as an optimization problem whose constraints contain sensitive information about participating users. This paper concerns solving this kind of optimization problem in a distributed manner while protecting the privacy of user information. Without privacy considerations, existing distributed algorithms normally consist in a central entity computing and broadcasting certain public coordination signals to participating users. However, the coordination signals often depend on user information, so that an adversary who has access to the coordination signals can potentially decode information on individual users and put user privacy at risk. We present a distributed optimization algorithm that preserves differential privacy, which is a strong notion that guarantees user privacy regardless of any auxiliary information an adversary may have. The algorithm achieves privacy by perturbing the public signals with additive noise, whose magnitude is determined by the sensitivity of the projection operation onto user-specified constraints. By viewing the differentially private algorithm as an implementation of stochastic gradient descent, we are able to derive a bound for the suboptimality of the algorithm. We illustrate the implementation of our algorithm via a case study of electric vehicle charging. Specifically, we derive the sensitivity and present numerical simulations for the algorithm. Through numerical simulations, we are able to investigate various aspects of the algorithm when being used in practice, including the choice of step size, number of iterations, and the trade-off between privacy level and suboptimality.
