Source author record

Xingyu Li

Xingyu Li appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning Computer Vision Artificial Intelligence eess.IV Quantitative Methods Distributed, Parallel, and Cluster Computing Neural and Evolutionary Computing Computer Science and Game Theory Cryptography and Security econ.EM Genomics math.AP Methodology Robotics

Catalog footprint

What is connected

23works

14topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

From Dataset to Real-world: General 3D Object Detection via Generalized Cross-domain Few-shot Learning

LiDAR-based 3D object detection models often struggle to generalize to real-world environments due to limited object diversity in existing datasets. To tackle it, we introduce the first generalized cross-domain few-shot (GCFS) task in 3D object detection, aiming to adapt a source-pretrained model to both common and novel classes in a new domain with only few-shot annotations. We propose a unified framework that learns stable target semantics under limited supervision by bridging 2D open-set semantics with 3D spatial reasoning. Specifically, an image-guided multi-modal fusion injects transferable 2D semantic cues into the 3D pipeline via vision-language models, while a physically-aware box search enhances 2D-to-3D alignment via LiDAR priors. To capture class-specific semantics from sparse data, we further introduce contrastive-enhanced prototype learning, which encodes few-shot instances into discriminative semantic anchors and stabilizes representation learning. Extensive experiments on GCFS benchmarks demonstrate the effectiveness and generality of our approach in realistic deployment settings.

preprint2026arXiv

GOAL: Graph-based Objective-Aligned Diffusion Solvers for Dynamic Multi-Objective Optimization

Existing neural combinatorial optimization solvers frame solution search as imitation of optimal decisions, inherently limiting their utility to single-objective minimization and static constraints. We propose GOAL, a conditioned diffusion solver over relational graph representations that enables controllable decision generations by conditioning on human-specified objectives. We introduce a heterogeneous graph encoding in which distinct edge types, corresponding to different classes of constraints, define the message passing structure of the graph neural network, which allows information to propagate selectively according to the ontology of each constraint. GOAL is instantiated and evaluated on three canonical scheduling benchmarks of various constraint complexity: the Flow Shop Problem (FSP), the Job Shop Scheduling Problem (JSP), and the Flexible Job Shop Scheduling Problem (FJSP). Generalization is demonstrated across structurally distinct constraint regimes and problem types without architectural modification. On all three benchmarks, GOAL achieves 100% solution feasibility and near-zero MAPE (below 0.20%) on multiple objectives for problem sizes up to 20 jobs and 60 operations, outperforming NSGA-II and MOEA/D in both solution quality and inference speed by up to 25x.

preprint2026arXiv

The RoboSense Challenge: Sense Anything, Navigate Anywhere, Adapt Across Platforms

Autonomous systems are increasingly deployed in open and dynamic environments -- from city streets to aerial and indoor spaces -- where perception models must remain reliable under sensor noise, environmental variation, and platform shifts. However, even state-of-the-art methods often degrade under unseen conditions, highlighting the need for robust and generalizable robot sensing. The RoboSense 2025 Challenge is designed to advance robustness and adaptability in robot perception across diverse sensing scenarios. It unifies five complementary research tracks spanning language-grounded decision making, socially compliant navigation, sensor configuration generalization, cross-view and cross-modal correspondence, and cross-platform 3D perception. Together, these tasks form a comprehensive benchmark for evaluating real-world sensing reliability under domain shifts, sensor failures, and platform discrepancies. RoboSense 2025 provides standardized datasets, baseline models, and unified evaluation protocols, enabling large-scale and reproducible comparison of robust perception methods. The challenge attracted 143 teams from 85 institutions across 16 countries, reflecting broad community engagement. By consolidating insights from 23 winning solutions, this report highlights emerging methodological trends, shared design principles, and open challenges across all tracks, marking a step toward building robots that can sense reliably, act robustly, and adapt across platforms in real-world environments.

preprint2026arXiv

Unsupervised dense random survival forests identify interpretable patient profiles with heterogeneous treatment benefit

Precision oncology aims to prescribe the optimal cancer treatment to the right patients, maximizing therapeutic benefits. However, identifying patient subgroups that may benefit more from experimental cancer treatments based on randomized clinical trials presents a significant analytical challenge. To address this, we introduce a novel unsupervised machine learning approach based on very dense random survival forests (up to 100,000 trees), equipped with a new splitting rule that explicitly targets treatment-effect heterogeneity. This method is robust, interpretable, and effectively identifies responsive subgroups. Extensive simulations confirm its ability to detect heterogeneous patient responses and distinguish between datasets with and without heterogeneity, while maintaining a stringent Type I error rate of 1%. We further validate its performance using Phase III randomized clinical trial datasets, demonstrating significant patient heterogeneity in treatment response based on baseline characteristics.

preprint2026arXiv

Visual Text Compression as Measure Transport

Visual text compression (VTC) promises efficient long-context processing by rendering text into an image and re-encoding it with a vision-language model, often producing $3$--$20\times$ fewer decoder tokens than subword tokenization. Yet token savings do not translate predictably into downstream utility: on some tasks the visual path matches or exceeds the text path, on others it collapses, and the compression ratio itself does not predict which regime will occur. The missing quantity is therefore not another summary of efficiency, but a principled measure of task-relevant information loss induced by visual encoding. We address this problem by formulating VTC in the language of measure transport. Treating text and visual tokens as empirical probability measures, we show that the ViT patch encoder induces a push-forward map whose transport cost decomposes into a precision cost from within-patch aggregation and a coverage cost from cross-patch fragmentation. Both terms are estimable from downstream-label-free probes. This formulation yields two operational consequences: a downstream-label-free routing criterion that selects whether to use the visual path for a given input or benchmark instance, and a transport-informed foveation mechanism that re-encodes high-cost regions at higher resolution. Across $24$ NLP datasets at Qwen3-4B, our label-free rule matches the per-dataset oracle on $17/24$ datasets ($70.8\%$), and improves the average task score by $+3.3\%$ with $-10.3\%$ average tokens relative to a pure-LLM.

preprint2022arXiv

A robust and lightweight deep attention multiple instance learning algorithm for predicting genetic alterations

Deep-learning models based on whole-slide digital pathology images (WSIs) become increasingly popular for predicting molecular biomarkers. Instance-based models has been the mainstream strategy for predicting genetic alterations using WSIs although bag-based models along with self-attention mechanism-based algorithms have been proposed for other digital pathology applications. In this paper, we proposed a novel Attention-based Multiple Instance Mutation Learning (AMIML) model for predicting gene mutations. AMIML was comprised of successive 1-D convolutional layers, a decoder, and a residual weight connection to facilitate further integration of a lightweight attention mechanism to detect the most predictive image patches. Using data for 24 clinically relevant genes from four cancer cohorts in The Cancer Genome Atlas (TCGA) studies (UCEC, BRCA, GBM and KIRC), we compared AMIML with one popular instance-based model and four recently published bag-based models (e.g., CHOWDER, HE2RNA, etc.). AMIML demonstrated excellent robustness, not only outperforming all the five baseline algorithms in the vast majority of the tested genes (17 out of 24), but also providing near-best-performance for the other seven genes. Conversely, the performance of the baseline published algorithms varied across different cancers/genes. In addition, compared to the published models for genetic alterations, AMIML provided a significant improvement for predicting a wide range of genes (e.g., KMT2C, TP53, and SETD2 for KIRC; ERBB2, BRCA1, and BRCA2 for BRCA; JAK1, POLE, and MTOR for UCEC) as well as produced outstanding predictive models for other clinically relevant gene mutations, which have not been reported in the current literature. Furthermore, with the flexible and interpretable attention-based MIL pooling mechanism, AMIML could further zero-in and detect predictive image patches.

preprint2022arXiv

Adversarial Fine-tune with Dynamically Regulated Adversary

Adversarial training is an effective method to boost model robustness to malicious, adversarial attacks. However, such improvement in model robustness often leads to a significant sacrifice of standard performance on clean images. In many real-world applications such as health diagnosis and autonomous surgical robotics, the standard performance is more valued over model robustness against such extremely malicious attacks. This leads to the question: To what extent we can boost model robustness without sacrificing standard performance? This work tackles this problem and proposes a simple yet effective transfer learning-based adversarial training strategy that disentangles the negative effects of adversarial samples on model's standard performance. In addition, we introduce a training-friendly adversarial attack algorithm, which facilitates the boost of adversarial robustness without introducing significant training complexity. Extensive experimentation indicates that the proposed method outperforms previous adversarial training algorithms towards the target: to improve model robustness while preserving model's standard performance on clean data.

preprint2022arXiv

Confidence Intervals of Treatment Effects in Panel Data Models with Interactive Fixed Effects

We consider the construction of confidence intervals for treatment effects estimated using panel models with interactive fixed effects. We first use the factor-based matrix completion technique proposed by Bai and Ng (2021) to estimate the treatment effects, and then use bootstrap method to construct confidence intervals of the treatment effects for treated units at each post-treatment period. Our construction of confidence intervals requires neither specific distributional assumptions on the error terms nor large number of post-treatment periods. We also establish the validity of the proposed bootstrap procedure that these confidence intervals have asymptotically correct coverage probabilities. Simulation studies show that these confidence intervals have satisfactory finite sample performances, and empirical applications using classical datasets yield treatment effect estimates of similar magnitudes and reliable confidence intervals.

preprint2022arXiv

DDDM: a Brain-Inspired Framework for Robust Classification

Despite their outstanding performance in a broad spectrum of real-world tasks, deep artificial neural networks are sensitive to input noises, particularly adversarial perturbations. On the contrary, human and animal brains are much less vulnerable. In contrast to the one-shot inference performed by most deep neural networks, the brain often solves decision-making with an evidence accumulation mechanism that may trade time for accuracy when facing noisy inputs. The mechanism is well described by the Drift-Diffusion Model (DDM). In the DDM, decision-making is modeled as a process in which noisy evidence is accumulated toward a threshold. Drawing inspiration from the DDM, we propose the Dropout-based Drift-Diffusion Model (DDDM) that combines test-phase dropout and the DDM for improving the robustness for arbitrary neural networks. The dropouts create temporally uncorrelated noises in the network that counter perturbations, while the evidence accumulation mechanism guarantees a reasonable decision accuracy. Neural networks enhanced with the DDDM tested in image, speech, and text classification tasks all significantly outperform their native counterparts, demonstrating the DDDM as a task-agnostic defense against adversarial attacks.

preprint2022arXiv

Generalized Federated Learning via Sharpness Aware Minimization

Federated Learning (FL) is a promising framework for performing privacy-preserving, distributed learning with a set of clients. However, the data distribution among clients often exhibits non-IID, i.e., distribution shift, which makes efficient optimization difficult. To tackle this problem, many FL algorithms focus on mitigating the effects of data heterogeneity across clients by increasing the performance of the global model. However, almost all algorithms leverage Empirical Risk Minimization (ERM) to be the local optimizer, which is easy to make the global model fall into a sharp valley and increase a large deviation of parts of local clients. Therefore, in this paper, we revisit the solutions to the distribution shift problem in FL with a focus on local learning generality. To this end, we propose a general, effective algorithm, \texttt{FedSAM}, based on Sharpness Aware Minimization (SAM) local optimizer, and develop a momentum FL algorithm to bridge local and global models, \texttt{MoFedSAM}. Theoretically, we show the convergence analysis of these two algorithms and demonstrate the generalization bound of \texttt{FedSAM}. Empirically, our proposed algorithms substantially outperform existing FL studies and significantly decrease the learning deviation.

preprint2022arXiv

Improving Feature Extraction from Histopathological Images Through A Fine-tuning ImageNet Model

Due to lack of annotated pathological images, transfer learning has been the predominant approach in the field of digital pathology.Pre-trained neural networks based on ImageNet database are often used to extract "off the shelf" features, achieving great success in predicting tissue types, molecular features, and clinical outcomes, etc. We hypothesize that fine-tuning the pre-trained models using histopathological images could further improve feature extraction, and downstream prediction performance.We used 100,000 annotated HE image patches for colorectal cancer (CRC) to finetune a pretrained Xception model via a twostep approach.The features extracted from finetuned Xception (FTX2048) model and Imagepretrained (IMGNET2048) model were compared through: (1) tissue classification for HE images from CRC, same image type that was used for finetuning; (2) prediction of immunerelated gene expression and (3) gene mutations for lung adenocarcinoma (LUAD).Fivefold cross validation was used for model performance evaluation. The extracted features from the finetuned FTX2048 exhibited significantly higher accuracy for predicting tisue types of CRC compared to the off the shelf feature directly from Xception based on ImageNet database. Particularly, FTX2048 markedly improved the accuracy for stroma from 87% to 94%. Similarly, features from FTX2048 boosted the prediction of transcriptomic expression of immunerelated genesin LUAD. For the genes that had signigicant relationships with image fetures, the features fgrom the finetuned model imprroved the prediction for the majority of the genes. Inaddition, fetures from FTX2048 improved prediction of mutation for 5 out of 9 most frequently mutated genes in LUAD.

preprint2022arXiv

LoMar: A Local Defense Against Poisoning Attack on Federated Learning

Federated learning (FL) provides a high efficient decentralized machine learning framework, where the training data remains distributed at remote clients in a network. Though FL enables a privacy-preserving mobile edge computing framework using IoT devices, recent studies have shown that this approach is susceptible to poisoning attacks from the side of remote clients. To address the poisoning attacks on FL, we provide a \textit{two-phase} defense algorithm called {Lo}cal {Ma}licious Facto{r} (LoMar). In phase I, LoMar scores model updates from each remote client by measuring the relative distribution over their neighbors using a kernel density estimation method. In phase II, an optimal threshold is approximated to distinguish malicious and clean updates from a statistical perspective. Comprehensive experiments on four real-world datasets have been conducted, and the experimental results show that our defense strategy can effectively protect the FL system. {Specifically, the defense performance on Amazon dataset under a label-flipping attack indicates that, compared with FG+Krum, LoMar increases the target label testing accuracy from $96.0\%$ to $98.8\%$, and the overall averaged testing accuracy from $90.1\%$ to $97.0\%$.

preprint2022arXiv

Object Wake-up: 3D Object Rigging from a Single Image

Given a single image of a general object such as a chair, could we also restore its articulated 3D shape similar to human modeling, so as to animate its plausible articulations and diverse motions? This is an interesting new question that may have numerous downstream augmented reality and virtual reality applications. Comparing with previous efforts on object manipulation, our work goes beyond 2D manipulation and rigid deformation, and involves articulated manipulation. To achieve this goal, we propose an automated approach to build such 3D generic objects from single images and embed articulated skeletons in them. Specifically, our framework starts by reconstructing the 3D object from an input image. Afterwards, to extract skeletons for generic 3D objects, we develop a novel skeleton prediction method with a multi-head structure for skeleton probability field estimation by utilizing the deep implicit functions. A dataset of generic 3D objects with ground-truth annotated skeletons is collected. Empirically our approach is demonstrated with satisfactory performance on public datasets as well as our in-house dataset; our results surpass those of the state-of-the-arts by a noticeable margin on both 3D reconstruction and skeleton prediction.

preprint2022arXiv

On the Convergence of Multi-Server Federated Learning with Overlapping Area

Multi-server Federated learning (FL) has been considered as a promising solution to address the limited communication resource problem of single-server FL. We consider a typical multi-server FL architecture, where the coverage areas of regional servers may overlap. The key point of this architecture is that the clients located in the overlapping areas update their local models based on the average model of all accessible regional models, which enables indirect model sharing among different regional servers. Due to the complicated network topology, the convergence analysis is much more challenging than single-server FL. In this paper, we firstly propose a novel MS-FedAvg algorithm for this multi-server FL architecture and analyze its convergence on non-iid datasets for general non-convex settings. Since the number of clients located in each regional server is much less than in single-server FL, the bandwidth of each client should be large enough to successfully communicate training models with the server, which indicates that full client participation can work in multi-server FL. Also, we provide the convergence analysis of the partial client participation scheme and develop a new biased partial participation strategy to further accelerate convergence. Our results indicate that the convergence results highly depend on the ratio of the number of clients in each area type to the total number of clients in all three strategies. The extensive experiments show remarkable performance and support our theoretical results.

preprint2022arXiv

Optimize Deep Learning Models for Prediction of Gene Mutations Using Unsupervised Clustering

Deep learning has become the mainstream methodological choice for analyzing and interpreting whole-slide digital pathology images (WSIs). It is commonly assumed that tumor regions carry most predictive information. In this paper, we proposed an unsupervised clustering-based multiple-instance learning, and apply our method to develop deep-learning models for prediction of gene mutations using WSIs from three cancer types in The Cancer Genome Atlas (TCGA) studies (CRC, LUAD, and HNSCC). We showed that unsupervised clustering of image patches could help identify predictive patches, exclude patches lack of predictive information, and therefore improve prediction on gene mutations in all three different cancer types, compared with the WSI based method without selection of image patches and models based on only tumor regions. Additionally, our proposed algorithm outperformed two recently published baseline algorithms leveraging unsupervised clustering to assist model prediction. The unsupervised-clustering-based approach for mutation prediction allows identification of the spatial regions related to mutation of a specific gene via the resolved probability scores, highlighting the heterogeneity of a predicted genotype in the tumor microenvironment. Finally, our study also demonstrated that selection of tumor regions of WSIs is not always the best way to identify patches for prediction of gene mutations, and other tissue types in the tumor micro-environment may provide better prediction ability for gene mutations than tumor tissues.

preprint2022arXiv

Prognostic Significance of Tumor-Infiltrating Lymphocytes Using Deep Learning on Pathology Images in Colorectal Cancers

Purpose Tumor-infiltrating lymphocytes (TILs) have significant prognostic values in cancers. However, very few automated, deep-learning-based TIL scoring algorithms have been developed for colorectal cancers (CRC). Methods We developed an automated, multiscale LinkNet workflow for quantifying cellular-level TILs for CRC tumors using H&E-stained images. The predictive performance of the automatic TIL scores (TIL) for disease progression and overall survival was evaluate using two international datasets, including 554 CRC patients from The Cancer Genome Atlas (TCGA) and 1130 CRC patients from Molecular and Cellular Oncology (MCO). Results The LinkNet model provided an outstanding precision (0.9508), recall (0.9185), and overall F1 score (0.9347). Clear dose-response relationships were observed between TILs and risk of disease progression or death decreased in both TCGA and MCO cohorts. Both univariate and multivariate Cox regression analyses for the TCGA data demonstrated that patients with high TILs had significant (approx. 75%) reduction of risk for disease progression. In both MCO and TCGA studies, the TIL-high group was significantly associated with improved overall survival in univariate analysis (30% and 54% reduction in risk, respectively). However, potential confounding was observed in the MCO dataset. The favorable effects of high TILs were consistently observed in different subgroups according to know risk factors. Conclusion A deep-learning workflow for automatic TIL quantification based on LinkNet was successfully developed.

preprint2021arXiv

Blind stain separation using model-aware generative learning and its applications on fluorescence microscopy images

Multiple stains are usually used to highlight biological substances in biomedical image analysis. To decompose multiple stains for co-localization quantification, blind source separation is usually performed. Prior model-based stain separation methods usually rely on stains' spatial distributions over an image and may fail to solve the co-localization problem. With the advantage of machine learning, deep generative models are used for this purpose. Since prior knowledge of imaging models is ignored in purely data-driven solutions, these methods may be sub-optimal. In this study, a novel learning-based blind source separation framework is proposed, where the physical model of biomedical imaging is incorporated to regularize the learning process. The introduced model-relevant adversarial loss couples all generators in the framework and limits the capacities of the generative models. Further more, a training algorithm is innovated for the proposed framework to avoid inter-generator confusion during learning. This paper particularly takes fluorescence unmixing in fluorescence microscopy images as an application example of the proposed framework. Qualitative and quantitative experimentation on a public fluorescence microscopy image set demonstrates the superiority of the proposed method over both prior model-based approaches and learning-based methods.

preprint2021arXiv

Large Time Asymptotic Behaviors of Two Types of Fast Diffusion Equations

We consider two types of non linear fast diffusion equations in R^N:(1) External drift type equation with general external potential. It is a natural extension of the harmonic potential case, which has been studied in many papers. In this paper we can prove the large time asymptotic behavior to the stationary state by using entropy methods.(2) Mean-field type equation with the convolution term. The stationary solution is the minimizer of the free energy functional, which has direct relation with reverse Hardy-Littlewood-Sobolev inequalities. In this paper, we prove that for some special cases, it also exists large time asymptotic behavior to the stationary state.

preprint2021arXiv

Sample Elicitation

It is important to collect credible training samples $(x,y)$ for building data-intensive learning systems (e.g., a deep learning system). Asking people to report complex distribution $p(x)$, though theoretically viable, is challenging in practice. This is primarily due to the cognitive loads required for human agents to form the report of this highly complicated information. While classical elicitation mechanisms apply to eliciting a complex and generative (and continuous) distribution $p(x)$, we are interested in eliciting samples $x_i \sim p(x)$ from agents directly. We coin the above problem "sample elicitation". This paper introduces a deep learning aided method to incentivize credible sample contributions from self-interested and rational agents. We show that with an accurate estimation of a certain $f$-divergence function we can achieve approximate incentive compatibility in eliciting truthful samples. We then present an efficient estimator with theoretical guarantees via studying the variational forms of the $f$-divergence function. We also show a connection between this sample elicitation problem and $f$-GAN, and how this connection can help reconstruct an estimator of the distribution based on collected samples. Experiments on synthetic data, MNIST, and CIFAR-10 datasets demonstrate that our mechanism elicits truthful samples. Our implementation is available at https://github.com/weijiaheng/Credible-sample-elicitation.git.

preprint2021arXiv

Stragglers Are Not Disaster: A Hybrid Federated Learning Algorithm with Delayed Gradients

Federated learning (FL) is a new machine learning framework which trains a joint model across a large amount of decentralized computing devices. Existing methods, e.g., Federated Averaging (FedAvg), are able to provide an optimization guarantee by synchronously training the joint model, but usually suffer from stragglers, i.e., IoT devices with low computing power or communication bandwidth, especially on heterogeneous optimization problems. To mitigate the influence of stragglers, this paper presents a novel FL algorithm, namely Hybrid Federated Learning (HFL), to achieve a learning balance in efficiency and effectiveness. It consists of two major components: synchronous kernel and asynchronous updater. Unlike traditional synchronous FL methods, our HFL introduces the asynchronous updater which actively pulls unsynchronized and delayed local weights from stragglers. An adaptive approximation method, Adaptive Delayed-SGD (AD-SGD), is proposed to merge the delayed local updates into the joint model. The theoretical analysis of HFL shows that the convergence rate of the proposed algorithm is $\mathcal{O}(\frac{1}{t+τ})$ for both convex and non-convex optimization problems.

preprint2020arXiv

Discriminative Pattern Mining for Breast Cancer Histopathology Image Classification via Fully Convolutional Autoencoder

Accurate diagnosis of breast cancer in histopathology images is challenging due to the heterogeneity of cancer cell growth as well as of a variety of benign breast tissue proliferative lesions. In this paper, we propose a practical and self-interpretable invasive cancer diagnosis solution. With minimum annotation information, the proposed method mines contrast patterns between normal and malignant images in unsupervised manner and generates a probability map of abnormalities to verify its reasoning. Particularly, a fully convolutional autoencoder is used to learn the dominant structural patterns among normal image patches. Patches that do not share the characteristics of this normal population are detected and analyzed by one-class support vector machine and 1-layer neural network. We apply the proposed method to a public breast cancer image set. Our results, in consultation with a senior pathologist, demonstrate that the proposed method outperforms existing methods. The obtained probability map could benefit the pathology practice by providing visualized verification data and potentially leads to a better understanding of data-driven diagnosis solutions.

preprint2020arXiv

How Much Off-The-Shelf Knowledge Is Transferable From Natural Images To Pathology Images?

Deep learning has achieved a great success in natural image classification. To overcome data-scarcity in computational pathology, recent studies exploit transfer learning to reuse knowledge gained from natural images in pathology image analysis, aiming to build effective pathology image diagnosis models. Since transferability of knowledge heavily depends on the similarity of the original and target tasks, significant differences in image content and statistics between pathology images and natural images raise the questions: how much knowledge is transferable? Is the transferred information equally contributed by pre-trained layers? To answer these questions, this paper proposes a framework to quantify knowledge gain by a particular layer, conducts an empirical investigation in pathology image centered transfer learning, and reports some interesting observations. Particularly, compared to the performance baseline obtained by random-weight model, though transferability of off-the-shelf representations from deep layers heavily depend on specific pathology image sets, the general representation generated by early layers does convey transferred knowledge in various image classification applications. The observation in this study encourages further investigation of specific metric and tools to quantify effectiveness and feasibility of transfer learning in future.

preprint2020arXiv

Stain Style Transfer of Histopathology Images Via Structure-Preserved Generative Learning

Computational histopathology image diagnosis becomes increasingly popular and important, where images are segmented or classified for disease diagnosis by computers. While pathologists do not struggle with color variations in slides, computational solutions usually suffer from this critical issue. To address the issue of color variations in histopathology images, this study proposes two stain style transfer models, SSIM-GAN and DSCSI-GAN, based on the generative adversarial networks. By cooperating structural preservation metrics and feedback of an auxiliary diagnosis net in learning, medical-relevant information presented by image texture, structure, and chroma-contrast features is preserved in color-normalized images. Particularly, the smart treat of chromatic image content in our DSCSI-GAN model helps to achieve noticeable normalization improvement in image regions where stains mix due to histological substances co-localization. Extensive experimentation on public histopathology image sets indicates that our methods outperform prior arts in terms of generating more stain-consistent images, better preserving histological information in images, and obtaining significantly higher learning efficiency. Our python implementation is published on https://github.com/hanwen0529/DSCSI-GAN.

Xingyu Li

What is connected

Connect this record

See the researcher in context

Building this map preview

23 published item(s)

From Dataset to Real-world: General 3D Object Detection via Generalized Cross-domain Few-shot Learning

GOAL: Graph-based Objective-Aligned Diffusion Solvers for Dynamic Multi-Objective Optimization

The RoboSense Challenge: Sense Anything, Navigate Anywhere, Adapt Across Platforms

Unsupervised dense random survival forests identify interpretable patient profiles with heterogeneous treatment benefit

Visual Text Compression as Measure Transport

A robust and lightweight deep attention multiple instance learning algorithm for predicting genetic alterations

Adversarial Fine-tune with Dynamically Regulated Adversary

Confidence Intervals of Treatment Effects in Panel Data Models with Interactive Fixed Effects

DDDM: a Brain-Inspired Framework for Robust Classification

Generalized Federated Learning via Sharpness Aware Minimization

Improving Feature Extraction from Histopathological Images Through A Fine-tuning ImageNet Model

LoMar: A Local Defense Against Poisoning Attack on Federated Learning

Object Wake-up: 3D Object Rigging from a Single Image

On the Convergence of Multi-Server Federated Learning with Overlapping Area

Optimize Deep Learning Models for Prediction of Gene Mutations Using Unsupervised Clustering

Prognostic Significance of Tumor-Infiltrating Lymphocytes Using Deep Learning on Pathology Images in Colorectal Cancers

Blind stain separation using model-aware generative learning and its applications on fluorescence microscopy images

Large Time Asymptotic Behaviors of Two Types of Fast Diffusion Equations

Sample Elicitation

Stragglers Are Not Disaster: A Hybrid Federated Learning Algorithm with Delayed Gradients

Discriminative Pattern Mining for Breast Cancer Histopathology Image Classification via Fully Convolutional Autoencoder

How Much Off-The-Shelf Knowledge Is Transferable From Natural Images To Pathology Images?

Stain Style Transfer of Histopathology Images Via Structure-Preserved Generative Learning