Researcher profile

Hongwei Li

Hongwei Li contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 (Emerging) · Verification L1 · Unclaimed author
19 works
0 followers
13 topics
4 close collaborators

Published work

19 published items

preprint · 2026 · arXiv

ExploitGym: Can AI Agents Turn Security Vulnerabilities into Real Attacks?

AI agents are rapidly gaining capabilities that could significantly reshape cybersecurity, making rigorous evaluation urgent. A critical capability is exploitation: turning a vulnerability, which is not yet an attack, into a concrete security impact, such as unauthorized file access or code execution. Exploitation is a particularly challenging task because it requires low-level program reasoning (e.g., about memory layout), runtime adaptation, and sustained progress over long horizons. Meanwhile, it is inherently dual-use, supporting defensive workflows while lowering the barrier for offense. Despite its importance and diagnostic value, exploitation remains under-evaluated. To address this gap, we introduce ExploitGym, a large-scale, diverse, realistic benchmark on the exploitation capabilities of AI agents. Given a program input that triggers a vulnerability, ExploitGym tasks agents with progressively extending it into a working exploit. The benchmark comprises 898 instances sourced from real-world vulnerabilities across three domains, including userspace programs, Google's V8 JavaScript engine, and the Linux kernel. We vary the security protections applied to each instance, isolating their impact on agent performance. All configurations are packaged in reproducible containerized environments. Our evaluation shows that while exploitation remains challenging, frontier models can successfully exploit a non-trivial fraction of vulnerabilities. For example, the strongest configurations are Anthropic's latest model Claude Mythos Preview and OpenAI's GPT-5.5, which produce working exploits for 157 and 120 instances, respectively. Notably, even with widely used defenses enabled, models retain non-trivial success rates. These results establish ExploitGym as an effective testbed for exploitation and highlight the growing cybersecurity risks posed by increasingly capable AI agents.

preprint · 2026 · arXiv

Machine-learning enabled characterization of individual ring resonators in integrated photonic lattices

Accurately determining the underlying physical parameters of individual elements in integrated photonics is increasingly difficult as device architectures become more complex. Inferring these parameters directly from spectral measurements of the system as a whole provides a practical alternative to traditional calibration, allowing characterization of photonic systems without relying on detailed device-specific models. Here, we introduce a supervised machine-learning strategy to learn the onsite losses and resonant frequency shifts of each individual ring in an array of coupled ring resonators from measured spectral power distributions of the whole array. The neural network infers these parameters with high accuracy across multiple experimental configurations. Our methodology provides a scalable and non-invasive method for extracting intrinsic parameters in coupled photonic platforms, paving the way for future development of automated calibration and control methods.

preprint · 2026 · arXiv

State Backdoor: Towards Stealthy Real-world Poisoning Attack on Vision-Language-Action Model in State Space

Vision-Language-Action (VLA) models are widely deployed in safety-critical embodied AI applications such as robotics. However, their complex multimodal interactions also expose new security vulnerabilities. In this paper, we investigate a backdoor threat in VLA models, where malicious inputs cause targeted misbehavior while preserving performance on clean data. Existing backdoor methods predominantly rely on inserting visible triggers into the visual modality, which suffer from poor robustness and low insusceptibility in real-world settings due to environmental variability. To overcome these limitations, we introduce the State Backdoor, a novel and practical backdoor attack that leverages the robot arm's initial state as the trigger. To optimize the trigger for insusceptibility and effectiveness, we design a Preference-guided Genetic Algorithm (PGA) that efficiently searches the state space for minimal yet potent triggers. Extensive experiments on five representative VLA models and five real-world tasks show that our method achieves over 90% attack success rate without affecting benign task performance, revealing an underexplored vulnerability in embodied AI systems.

preprint · 2022 · arXiv

Adaptive Local Implicit Image Function for Arbitrary-scale Super-resolution

Image representation is critical for many visual tasks. Instead of representing images discretely with 2D arrays of pixels, a recent study, namely local implicit image function (LIIF), denotes images as a continuous function where pixel values are obtained by using the corresponding coordinates as inputs. Due to its continuous nature, LIIF can be adopted for arbitrary-scale image super-resolution tasks, resulting in a single effective and efficient model for various up-scaling factors. However, LIIF often suffers from structural distortions and ringing artifacts around edges, mostly because all pixels share the same model, thus ignoring the local properties of the image. In this paper, we propose a novel adaptive local implicit image function (A-LIIF) to alleviate this problem. Specifically, our A-LIIF consists of two main components: an encoder and an expansion network. The former captures cross-scale image features, while the latter models the continuous up-scaling function by a weighted combination of multiple local implicit image functions. Accordingly, our A-LIIF can reconstruct the high-frequency textures and structures more accurately. Experiments on multiple benchmark datasets verify the effectiveness of our method. Our codes are available at https://github.com/LeeHW-THU/A-LIIF.

preprint · 2022 · arXiv

CodeGen-Test: An Automatic Code Generation Model Integrating Program Test Information

Automatic code generation produces program code from a given natural language description. The current mainstream approach uses neural networks to encode the natural language description, outputs an abstract syntax tree (AST) at the decoder, and then converts the AST into program code. While the generated code largely conforms to specific syntax rules, two problems are still ignored. One is the absence of program testing, an essential step in a complete code implementation; the other is a focus on only the syntax compliance of the generated code, which ignores the more important functional requirements of the program. The paper proposes the CodeGen-Test model, which adds program testing steps and incorporates program testing information to iteratively generate code that meets the functional requirements of the program, thereby improving the quality of code generation. The paper also proposes a new evaluation metric, test accuracy (Test-Acc), which represents the proportion of generated code that passes the program tests. Unlike previous evaluation metrics, which assess the quality of code generation only in terms of character similarity, Test-Acc assesses it in terms of program functionality. Moreover, the paper evaluates the CodeGen-Test model on a Python data set, "hearthstone legend". The experimental results show that the proposed method can effectively improve the quality of generated code. Compared with the existing optimal model, the CodeGen-Test model improves the BLEU value by 0.2%, the ROUGE-L value by 0.3%, and Test-Acc by 6%.
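
The Test-Acc idea described in this abstract (the share of generated programs that pass their tests) can be illustrated with a minimal sketch. This is not the paper's implementation: the function names `compute_test_acc` and `passes_add_tests`, the toy `add` task, and the sample programs are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Sequence


def compute_test_acc(programs: Sequence[str], run_tests: Callable[[str], bool]) -> float:
    """Return the fraction of generated programs for which run_tests is True."""
    if not programs:
        return 0.0
    passed = sum(1 for source in programs if run_tests(source))
    return passed / len(programs)


def passes_add_tests(source: str) -> bool:
    """Hypothetical test harness: the source must define a correct add(a, b)."""
    namespace = {}
    try:
        exec(source, namespace)  # only safe for trusted toy inputs like these
        return namespace["add"](2, 3) == 5 and namespace["add"](-1, 1) == 0
    except Exception:
        # Syntax errors, missing definitions and runtime errors count as failures.
        return False


# Four hypothetical model outputs for the same task description.
generated = [
    "def add(a, b):\n    return a + b",  # correct
    "def add(a, b):\n    return a - b",  # valid syntax, wrong behaviour
    "def add(a, b) return a + b",        # syntax error
    "def add(a, b):\n    return b + a",  # correct
]

score = compute_test_acc(generated, passes_add_tests)
print(f"Test-Acc = {score:.2f}")
```

The second program shows why a functional metric can disagree with a character-similarity metric: it is textually close to a correct answer yet fails every test.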

preprint · 2022 · arXiv

Deep Quality Estimation: Creating Surrogate Models for Human Quality Ratings

Human ratings are abstract representations of segmentation quality. To approximate human quality ratings on scarce expert data, we train surrogate quality estimation models. We evaluate on a complex multi-class segmentation problem, specifically glioma segmentation, following the BraTS annotation protocol. The training data features quality ratings from 15 expert neuroradiologists on a scale ranging from 1 to 6 stars for various computer-generated and manual 3D annotations. Even though the networks operate on 2D images and with scarce training data, we can approximate segmentation quality within a margin of error comparable to human intra-rater reliability. Segmentation quality prediction has broad applications. While an understanding of segmentation quality is imperative for successful clinical translation of automatic segmentation quality algorithms, it can play an essential role in training new segmentation models. Due to the split-second inference times, it can be directly applied within a loss function or as a fully-automatic dataset curation mechanism in a federated learning setting.

preprint · 2022 · arXiv

Domain-Adaptive 3D Medical Image Synthesis: An Efficient Unsupervised Approach

Medical image synthesis has attracted increasing attention because it can generate missing image data, improving diagnosis and benefiting many downstream tasks. However, the synthesis models developed so far do not adapt to unseen data distributions that present a domain shift, limiting their applicability in clinical routine. This work focuses on exploring domain adaptation (DA) of 3D image-to-image synthesis models. First, we highlight the technical difference in DA between classification, segmentation and synthesis models. Second, we present a novel efficient adaptation approach based on a 2D variational autoencoder which approximates 3D distributions. Third, we present empirical studies on the effect of the amount of adaptation data and the key hyper-parameters. Our results show that the proposed approach can significantly improve the synthesis accuracy on unseen domains in a 3D setting. The code is publicly available at https://github.com/WinstonHuTiger/2D_VAE_UDA_for_3D_sythesis

preprint · 2022 · arXiv

Hercules: Boosting the Performance of Privacy-preserving Federated Learning

In this paper, we address the problem of privacy-preserving federated neural network training with $N$ users. We present Hercules, an efficient and high-precision training framework that can tolerate collusion of up to $N-1$ users. Hercules follows the POSEIDON framework proposed by Sav et al. (NDSS'21), but makes a qualitative leap in performance with the following contributions: (i) we design a novel parallel homomorphic computation method for matrix operations, which enables fast Single Instruction and Multiple Data (SIMD) operations over ciphertexts. For the multiplication of two $h\times h$ dimensional matrices, our method reduces the computation complexity from $O(h^3)$ to $O(h)$. This greatly improves the training efficiency of the neural network since the ciphertext computation is dominated by the convolution operations; (ii) we present an efficient approximation on the sign function based on the composite polynomial approximation. It is used to approximate non-polynomial functions (i.e., ReLU and max), with the optimal asymptotic complexity. Extensive experiments on various benchmark datasets (BCW, ESR, CREDIT, MNIST, SVHN, CIFAR-10 and CIFAR-100) show that compared with POSEIDON, Hercules obtains up to 4% increase in model accuracy, and up to 60$\times$ reduction in the computation and communication cost.
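
To make the idea of approximating the sign function by composing polynomials concrete, here is a small sketch. It uses the well-known odd polynomial f(x) = (3x - x^3)/2, whose repeated composition converges to sign(x) for nonzero x in [-1, 1]; this is an assumption made for illustration and not necessarily the polynomial used in Hercules. Plain floats stand in for ciphertexts, and the names `approx_sign` and `approx_relu` are hypothetical.

```python
def approx_sign(x: float, iterations: int = 8) -> float:
    """Approximate sign(x) for x in [-1, 1] by composing f(x) = 1.5*x - 0.5*x**3.

    Only additions and multiplications are used, which is the kind of
    arithmetic a homomorphic encryption scheme can evaluate.
    """
    for _ in range(iterations):
        x = 1.5 * x - 0.5 * x ** 3
    return x


def approx_relu(x: float, iterations: int = 8) -> float:
    """ReLU(x) = x * (1 + sign(x)) / 2, with sign replaced by its approximation."""
    return 0.5 * x * (1.0 + approx_sign(x, iterations))


for value in (-0.5, 0.5):
    print(value, approx_sign(value), approx_relu(value))
```

Inputs close to zero need more compositions to converge, so the number of iterations trades accuracy against multiplicative depth.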

preprint · 2022 · arXiv

Privacy-preserving Decentralized Deep Learning with Multiparty Homomorphic Encryption

Decentralized deep learning plays a key role in collaborative model training due to its attractive properties, including tolerating high network latency and being less prone to single-point failures. Unfortunately, such a training mode is more vulnerable to data privacy leaks compared to other distributed training frameworks. Existing efforts exclusively use differential privacy as the cornerstone to alleviate the data privacy threat. However, it is still not clear whether differential privacy can provide a satisfactory utility-privacy trade-off for model training, due to its inherent contradictions. To address this problem, we propose D-MHE, the first secure and efficient decentralized training framework with lossless precision. Inspired by the latest developments in homomorphic encryption technology, we design a multiparty version of Brakerski-Fan-Vercauteren (BFV), one of the most advanced cryptosystems, and use it to implement private gradient updates of users' local models. D-MHE can reduce the communication complexity of general Secure Multiparty Computation (MPC) tasks from quadratic to linear in the number of users, making it very suitable and scalable for large-scale decentralized learning systems. Moreover, D-MHE provides strict semantic security protection even if the majority of users are dishonest with collusion. We conduct extensive experiments on MNIST and CIFAR-10 datasets to demonstrate the superiority of D-MHE in terms of model accuracy, computation and communication cost compared with existing schemes.

preprint · 2022 · arXiv

Relationformer: A Unified Framework for Image-to-Graph Generation

A comprehensive representation of an image requires understanding objects and their mutual relationship, especially in image-to-graph generation, e.g., road network extraction, blood-vessel network extraction, or scene graph generation. Traditionally, image-to-graph generation is addressed with a two-stage approach consisting of object detection followed by a separate relation prediction, which prevents simultaneous object-relation interaction. This work proposes a unified one-stage transformer-based framework, namely Relationformer, that jointly predicts objects and their relations. We leverage direct set-based object prediction and incorporate the interaction among the objects to learn an object-relation representation jointly. In addition to existing [obj]-tokens, we propose a novel learnable token, namely [rln]-token. Together with [obj]-tokens, [rln]-token exploits local and global semantic reasoning in an image through a series of mutual associations. In combination with the pair-wise [obj]-token, the [rln]-token contributes to a computationally efficient relation prediction. We achieve state-of-the-art performance on multiple, diverse and multi-domain datasets that demonstrate our approach's effectiveness and generalizability.

preprint · 2022 · arXiv

Selecting Regularization Parameters for nuclear norm type minimization problems

The reconstruction of a low-rank matrix from its noisy observation finds use in many applications. It can be reformulated into a constrained nuclear norm minimization problem, where the bound $η$ of the constraint is explicitly given or can be estimated from the probability distribution of the noise. When the Lagrangian method is applied to find the minimizer, the solution can be obtained by the singular value thresholding operator, where the thresholding parameter $λ$ is related to the Lagrangian multiplier. In this paper, we first show that the Frobenius norm of the discrepancy between the minimizer and the observed matrix is a strictly increasing function of $λ$. From that we derive a closed-form solution for $λ$ in terms of $η$. The result can be used to solve the constrained nuclear-norm-type minimization problem when $η$ is given. For the unconstrained nuclear-norm-type regularized problems, our result allows us to automatically choose a suitable regularization parameter by using the discrepancy principle. The regularization parameters obtained are comparable to (and sometimes better than) those obtained by Stein's unbiased risk estimator (SURE) approach, while the cost of solving the minimization problem can be reduced by 11--18 times. Numerical experiments with both synthetic data and real MRI data are performed to validate the proposed approach.
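
The monotonicity claim can be illustrated for plain singular value thresholding, under the assumption that the minimizer is obtained by soft-thresholding the singular values of the observed matrix. In that case the residual has singular values min(s_i, λ), so its Frobenius norm is sqrt(sum_i min(s_i, λ)^2). The sketch below works directly on a hypothetical list of singular values and finds λ for a given η by bisection; bisection is a stand-in chosen here and is not the closed-form solution derived in the paper.

```python
import math


def discrepancy(singular_values, lam):
    """Frobenius norm of (thresholded matrix - observed matrix).

    Soft-thresholding lowers each singular value s by min(s, lam),
    so the residual norm depends only on the singular values.
    """
    return math.sqrt(sum(min(s, lam) ** 2 for s in singular_values))


def lambda_for_eta(singular_values, eta, tol=1e-12):
    """Find lam with discrepancy(lam) == eta by bisection (illustrative only)."""
    lo, hi = 0.0, max(singular_values)
    if not 0.0 <= eta <= discrepancy(singular_values, hi):
        raise ValueError("eta must lie between 0 and the norm of the observation")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if discrepancy(singular_values, mid) < eta:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


sigma = [5.0, 2.0, 0.5, 0.1]  # hypothetical singular values of the observed matrix
eta = 1.0                     # hypothetical noise bound
lam = lambda_for_eta(sigma, eta)
print(lam, discrepancy(sigma, lam))
```

Because the discrepancy increases with λ up to the largest singular value, the bisection has a unique target, which is what makes a discrepancy-principle choice of λ well defined.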

preprint · 2022 · arXiv

VerSe: A Vertebrae Labelling and Segmentation Benchmark for Multi-detector CT Images

Vertebral labelling and segmentation are two fundamental tasks in an automated spine processing pipeline. Reliable and accurate processing of spine images is expected to benefit clinical decision-support systems for diagnosis, surgery planning, and population-based analysis on spine and bone health. However, designing automated algorithms for spine processing is challenging predominantly due to considerable variations in anatomy and acquisition protocols and due to a severe shortage of publicly available data. Addressing these limitations, the Large Scale Vertebrae Segmentation Challenge (VerSe) was organised in conjunction with the International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI) in 2019 and 2020, with a call for algorithms towards labelling and segmentation of vertebrae. Two datasets containing a total of 374 multi-detector CT scans from 355 patients were prepared and 4505 vertebrae have individually been annotated at voxel-level by a human-machine hybrid algorithm (https://osf.io/nqjyw/, https://osf.io/t98fz/). A total of 25 algorithms were benchmarked on these datasets. In this work, we present the results of this evaluation and further investigate the performance-variation at vertebra-level, scan-level, and at different fields-of-view. We also evaluate the generalisability of the approaches to an implicit domain shift in data by evaluating the top performing algorithms of one challenge iteration on data from the other iteration. The principal takeaway from VerSe: the performance of an algorithm in labelling and segmenting a spine scan hinges on its ability to correctly identify vertebrae in cases of rare anatomical variations. The content and code concerning VerSe can be accessed at: https://github.com/anjany/verse.

preprint · 2022 · arXiv

What Makes for Automatic Reconstruction of Pulmonary Segments

3D reconstruction of pulmonary segments plays an important role in surgical treatment planning of lung cancer, which facilitates preservation of pulmonary function and helps ensure low recurrence rates. However, automatic reconstruction of pulmonary segments remains unexplored in the era of deep learning. In this paper, we investigate what makes for automatic reconstruction of pulmonary segments. First and foremost, we formulate, clinically and geometrically, the anatomical definitions of pulmonary segments, and propose evaluation metrics adhering to these definitions. Second, we propose ImPulSe (Implicit Pulmonary Segment), a deep implicit surface model designed for pulmonary segment reconstruction. The automatic reconstruction of pulmonary segments by ImPulSe is accurate in metrics and visually appealing. Compared with canonical segmentation methods, ImPulSe outputs continuous predictions of arbitrary resolutions with higher training efficiency and fewer parameters. Lastly, we experiment with different network inputs to analyze what matters in the task of pulmonary segment reconstruction. Our code is available at https://github.com/M3DV/ImPulSe.

preprint · 2021 · arXiv

Deep Class-Specific Affinity-Guided Convolutional Network for Multimodal Unpaired Image Segmentation

Multi-modal medical image segmentation plays an essential role in clinical diagnosis. It remains challenging as the input modalities are often not well-aligned spatially. Existing learning-based methods mainly consider sharing trainable layers across modalities and minimizing visual feature discrepancies. While the problem is often formulated as joint supervised feature learning, multiple-scale features and class-specific representation have not yet been explored. In this paper, we propose an affinity-guided fully convolutional network for multimodal image segmentation. To learn effective representations, we design class-specific affinity matrices to encode the knowledge of hierarchical feature reasoning, together with the shared convolutional layers to ensure the cross-modality generalization. Our affinity matrix does not depend on spatial alignments of the visual features and thus allows us to train with unpaired, multimodal inputs. We extensively evaluated our method on two public multimodal benchmark datasets and outperform state-of-the-art methods.

preprint · 2021 · arXiv

Micro-CT Synthesis and Inner Ear Super Resolution via Generative Adversarial Networks and Bayesian Inference

Existing medical image super-resolution methods rely on pairs of low- and high-resolution images to learn a mapping in a fully supervised manner. However, such image pairs are often not available in clinical practice. In this paper, we address the super-resolution problem in a real-world scenario using unpaired data and synthesize linearly eight times higher resolved Micro-CT images of temporal bone structure, which is embedded in the inner ear. We explore cycle-consistency generative adversarial networks for the super-resolution task and equip the translation approach with Bayesian inference. We further introduce the Hu Moment distance as the evaluation metric to quantify the shape of the temporal bone. We evaluate our method on a public inner ear CT dataset and have seen both visual and quantitative improvement over state-of-the-art deep-learning-based methods. In addition, we perform a multi-rater visual evaluation experiment and find that trained experts consistently give the proposed method the highest quality scores among all methods. Furthermore, we are able to quantify uncertainty in the unpaired translation task, and the uncertainty map can provide structural information of the temporal bone.

preprint · 2020 · arXiv

Domain Adaptive Medical Image Segmentation via Adversarial Learning of Disease-Specific Spatial Patterns

In medical imaging, the heterogeneity of multi-centre data impedes the applicability of deep learning-based methods and results in significant performance degradation when applying models in an unseen data domain, e.g. a new centre or a new scanner. In this paper, we propose an unsupervised domain adaptation framework for boosting image segmentation performance across multiple domains without using any manual annotations from the new target domains, but by re-calibrating the networks on a few images from the target domain. To achieve this, we enforce architectures to be adaptive to new data by rejecting improbable segmentation patterns and implicitly learning through semantic and boundary information, thus capturing disease-specific spatial patterns in an adversarial optimization. The adaptation process needs continuous monitoring; however, as we cannot assume the presence of ground-truth masks for the target domain, we propose two new metrics to monitor the adaptation process, and strategies to train the segmentation algorithm in a stable fashion. We build upon well-established 2D and 3D architectures and perform extensive experiments on three cross-centre brain lesion segmentation tasks, involving multicentre public and in-house datasets. We demonstrate that recalibrating the deep networks on a few unlabeled images from the target domain improves the segmentation accuracy significantly.

preprint · 2020 · arXiv

Fast non-convex low-rank matrix decomposition for separation of potential field data using minimal memory

A fast non-convex low-rank matrix decomposition method for potential field data separation is proposed. The singular value decomposition of the large size trajectory matrix, which is also a block Hankel matrix, is obtained using a fast randomized singular value decomposition algorithm in which fast block Hankel matrix-vector multiplications are implemented with minimal memory storage. This fast block Hankel matrix randomized singular value decomposition algorithm is integrated into the Altproj algorithm, which is a standard non-convex method for solving the robust principal component analysis optimization problem. The improved algorithm avoids the construction of the trajectory matrix. Hence, gravity and magnetic data matrices of large size can be computed. Moreover, it is more efficient than the traditional low-rank matrix decomposition method, which is based on the use of an inexact augmented Lagrange multiplier algorithm. The presented algorithm is also robust and, hence, algorithm-dependent parameters are easily determined. The improved and traditional algorithms are contrasted for the separation of synthetic gravity and magnetic data matrices of different sizes. The presented results demonstrate that the improved algorithm is not only computationally more efficient but it is also more accurate. Moreover, it is possible to solve far larger problems. As an example, for the adopted computational environment, matrices of sizes larger than $205 \times 205$ generate "out of memory" exceptions with the traditional method, but a matrix of size $2001 \times 2001$ can be calculated in $1062.29$ s with the new algorithm. Finally, the improved method is applied to separate real gravity and magnetic data in the Tongling area, Anhui province, China. Areas which may exhibit mineralizations are inferred based on the separated anomalies.

preprint · 2020 · arXiv

Generalisable Cardiac Structure Segmentation via Attentional and Stacked Image Adaptation

Tackling domain shifts in multi-centre and multi-vendor data sets remains challenging for cardiac image segmentation. In this paper, we propose a generalisable segmentation framework for cardiac image segmentation in which multi-centre, multi-vendor, multi-disease datasets are involved. A generative adversarial network with an attention loss was proposed to translate the images from existing source domains to a target domain, thus generating good-quality synthetic cardiac structure and enlarging the training set. A stack of data augmentation techniques was further used to simulate real-world transformation to boost the segmentation performance for unseen domains. We achieved an average Dice score of 90.3% for the left ventricle, 85.9% for the myocardium, and 86.5% for the right ventricle on the hidden validation set across four vendors. We show that the domain shifts in heterogeneous cardiac imaging datasets can be drastically reduced by two aspects: 1) good-quality synthetic data by learning the underlying target domain distribution, and 2) stacked classical image processing techniques for data augmentation.
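
For readers unfamiliar with the Dice score reported above, the following sketch computes it for two binary masks using the standard definition 2|A ∩ B| / (|A| + |B|). The flattened toy masks and the function name `dice_score` are hypothetical; the paper's evaluation pipeline is not described in the abstract.

```python
def dice_score(pred, target):
    """Dice overlap between two binary masks given as equal-length 0/1 sequences."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(pred) + sum(target)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical flattened masks for one structure (1 = voxel belongs to it).
predicted = [1, 1, 0, 0, 1]
reference = [1, 0, 0, 1, 1]

print(f"Dice = {dice_score(predicted, reference):.3f}")
```

A score of 1 means the predicted and reference masks coincide, and 0 means they do not overlap at all.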

preprint · 2019 · arXiv

Quantum algorithms for the Goldreich-Levin learning problem

The Goldreich-Levin algorithm was originally proposed for a cryptographic purpose and then applied to learning. The algorithm finds some of the larger Walsh coefficients of an $n$-variable Boolean function. Roughly speaking, it takes $\mathrm{poly}(n, \frac{1}{ε}\log\frac{1}{δ})$ time to output the vectors $w$ with Walsh coefficients $S(w) \geq ε$ with probability at least $1-δ$. In this paper, a quantum algorithm for this problem is given with query complexity $O(\frac{\log\frac{1}{δ}}{ε^4})$, which is independent of $n$. Furthermore, the quantum algorithm is generalized to an $n$-variable, $m$-output Boolean function $F$ with query complexity $O(2^m\frac{\log\frac{1}{δ}}{ε^4})$.
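
To make the problem statement concrete, the sketch below computes Walsh coefficients by brute force and lists the vectors $w$ whose coefficient reaches a threshold. It assumes the common normalisation S(w) = 2^{-n} Σ_x (-1)^{f(x) + w·x}, which the abstract does not spell out, and it enumerates all 2^n inputs, so it is neither the classical Goldreich-Levin algorithm nor the quantum algorithm. The names `walsh_coefficient`, `large_coefficients` and the example function are hypothetical.

```python
def walsh_coefficient(f, n, w):
    """S(w) = 2**-n * sum over x of (-1)**(f(x) XOR <w, x>), with x, w as n-bit ints."""
    total = 0
    for x in range(2 ** n):
        dot = bin(x & w).count("1") % 2  # inner product of w and x modulo 2
        total += (-1) ** (f(x) ^ dot)
    return total / 2 ** n


def large_coefficients(f, n, eps):
    """Return {w: S(w)} for every w with S(w) >= eps (exhaustive search)."""
    result = {}
    for w in range(2 ** n):
        coefficient = walsh_coefficient(f, n, w)
        if coefficient >= eps:
            result[w] = coefficient
    return result


def parity_of_bits_0_and_2(x):
    """Example Boolean function: XOR of bit 0 and bit 2 of x."""
    return bin(x & 0b101).count("1") % 2


heavy = large_coefficients(parity_of_bits_0_and_2, n=3, eps=0.5)
print(heavy)
```

For a linear function such as this one, all of the spectral weight sits on a single vector (here w = 0b101), which is the easiest case for any learning algorithm of this kind.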