Researcher profile

Leyuan Fang

Leyuan Fang contributes to research discovery and scholarly infrastructure.

Researcher, Affiliation not imported, Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging, Verification L1, Unclaimed author
7 works
0 followers
5 topics
4 close collaborators

Published work

7 published items

preprint, 2026, arXiv

MoASE++: Mixture of Activation Sparsity Experts with Domain-Adaptive On-policy Distillation for Continual Test Time Adaptation

Continual test-time adaptation adapts a source-pretrained model to non-stationary, unlabeled target streams while retaining past competence, yet texture-biased backbones risk error accumulation and catastrophic forgetting. Drawing inspiration from the process of decoupling shape and texture in the human visual system, we introduce MoASE, a plug-in mixture-of-experts that disentangles domain-agnostic structure from domain-specific texture using Activation Sparsity Experts with Spatial Differentiable Dropout, forming complementary high- and low-activation pathways, while high- and low-rank bottlenecks diversify representations. The Activation Sparsity Gate produces input-adaptive SDD thresholds for precise token selection, and the Domain-Aware Router assigns per-sample expert weights using texture-sensitive cues. To curb confirmation bias on unlabeled streams and stabilize supervision, we then introduce Domain-Adaptive On-Policy Distillation to constitute MoASE++, with an EMA-anchored on-policy reverse KL distillation and an augmentation policy conditioned on entropy and confidence that aligns predictions across the same views and improves the robustness-plasticity balance. Extensive experiments on classification (CIFAR-10/100-C, ImageNet-C) and semantic segmentation (Cityscapes->ACDC) demonstrate consistent state-of-the-art performance, offering a principled, controllable approach to continual adaptation in dynamic visual environments.
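As a rough illustration of two ingredients the abstract names, an EMA-anchored teacher and a reverse KL distillation term, the following is a minimal sketch in plain Python. It is not the authors' implementation: the function names (`softmax`, `reverse_kl`, `ema_update`), the momentum value, and the use of flat parameter lists are all assumptions made for this example, and the MoASE experts, gating, routing, and augmentation policy are not modeled.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reverse_kl(student_probs, teacher_probs, eps=1e-12):
    """KL(student || teacher): the expectation is taken under the
    student's own predictions, which is what makes it the 'reverse'
    direction relative to the usual KL(teacher || student)."""
    return sum(
        s * (math.log(s + eps) - math.log(t + eps))
        for s, t in zip(student_probs, teacher_probs)
    )


def ema_update(teacher_params, student_params, momentum=0.99):
    """Exponential moving average of student parameters (hypothetical
    flat list of floats); the teacher drifts slowly toward the student."""
    return [
        momentum * t + (1.0 - momentum) * s
        for t, s in zip(teacher_params, student_params)
    ]


# Toy usage: one distillation loss value and one teacher update.
student = softmax([2.0, 0.5, -1.0])
teacher = softmax([1.5, 0.7, -0.5])
loss = reverse_kl(student, teacher)
new_teacher_params = ema_update([1.0, 1.0], [0.0, 2.0], momentum=0.9)
```

In a real test-time adaptation loop the loss would be minimized with respect to the student's parameters and the teacher would be refreshed with `ema_update` after each step; the sketch only shows how the two quantities are computed.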

preprint, 2023, arXiv

FedDiff: Diffusion Model Driven Federated Learning for Multi-Modal and Multi-Clients

With the rapid development of imaging sensor technology in the field of remote sensing, multi-modal remote sensing data fusion has emerged as a crucial research direction for land cover classification tasks. While diffusion models have made great progress in generative models and image classification tasks, existing models primarily focus on single-modality and single-client control, that is, the diffusion process is driven by a single modality on a single computing node. To facilitate the secure fusion of heterogeneous data from clients, it is necessary to enable distributed multi-modal control, such as merging the hyperspectral data of organization A and the LiDAR data of organization B privately on each base station client. In this study, we propose a multi-modal collaborative diffusion federated learning framework called FedDiff. Our framework establishes a dual-branch diffusion model feature extraction setup, where data from the two modalities are fed into separate branches of the encoder. Our key insight is that diffusion models driven by different modalities are inherently complementary in terms of potential denoising steps on which bilateral connections can be built. Considering the challenge of private and efficient communication between multiple clients, we embed the diffusion model into the federated learning communication structure and introduce a lightweight communication module. Qualitative and quantitative experiments validate the superiority of our framework in terms of image quality and conditional consistency.

preprint, 2022, arXiv

NFANet: A Novel Method for Weakly Supervised Water Extraction from High-Resolution Remote Sensing Imagery

The use of deep learning for water extraction requires precise pixel-level labels. However, it is very difficult to label high-resolution remote sensing images at the pixel level. Therefore, we study how to utilize point labels to extract water bodies and propose a novel method called the neighbor feature aggregation network (NFANet). Compared with pixel-level labels, point labels are much easier to obtain, but they lose much information. In this paper, we take advantage of the similarity between the adjacent pixels of a local water body, and propose a neighbor sampler to resample remote sensing images. Then, the sampled images are sent to the network for feature aggregation. In addition, we use an improved recursive training algorithm to further improve the extraction accuracy, making the water boundary more natural. Furthermore, our method utilizes neighboring features instead of global or local features to learn more representative features. The experimental results show that the proposed NFANet method not only outperforms other studied weakly supervised approaches, but also obtains results similar to the state-of-the-art ones.

preprint, 2022, arXiv

Optical Remote Sensing Image Understanding with Weak Supervision: Concepts, Methods, and Perspectives

In recent years, supervised learning has been widely used in various tasks of optical remote sensing image understanding, including remote sensing image classification, pixel-wise segmentation, change detection, and object detection. The methods based on supervised learning need a large amount of high-quality training data and their performance highly depends on the quality of the labels. However, in practical remote sensing applications, it is often expensive and time-consuming to obtain large-scale data sets with high-quality labels, which leads to a lack of sufficient supervised information. In some cases, only coarse-grained labels can be obtained, resulting in the lack of exact supervision. In addition, the supervised information obtained manually may be wrong, resulting in a lack of accurate supervision. Therefore, remote sensing image understanding often faces the problems of incomplete, inexact, and inaccurate supervised information, which will affect the breadth and depth of remote sensing applications. In order to solve the above-mentioned problems, researchers have explored various tasks in remote sensing image understanding under weak supervision. This paper summarizes the research progress of weakly supervised learning in the field of remote sensing, including three typical weakly supervised paradigms: 1) Incomplete supervision, where only a subset of training data is labeled; 2) Inexact supervision, where only coarse-grained labels of training data are given; 3) Inaccurate supervision, where the labels given are not always true on the ground.

preprint, 2022, arXiv

Peng Cheng Object Detection Benchmark for Smart City

Object detection is an algorithm that recognizes and locates the objects in the image and has a wide range of applications in the visual understanding of complex urban scenes. Existing object detection benchmarks mainly focus on a single specific scenario, and their annotation attributes are not rich enough, so object detection models do not generalize well to smart city scenes. Considering the diversity and complexity of scenes in intelligent city governance, we build a large-scale object detection benchmark for the smart city. Our benchmark contains about 500K images and includes three scenarios: intelligent transportation, intelligent security, and drones. To reflect the complexity of real smart city scenes, the images in the three scenarios are annotated with attributes such as weather, occlusion, and other complex environmental conditions. The characteristics of the benchmark are analyzed, and extensive experiments with current state-of-the-art target detection algorithms are conducted on our benchmark to show their performance.

preprint, 2022, arXiv

RRNet: Relational Reasoning Network with Parallel Multi-scale Attention for Salient Object Detection in Optical Remote Sensing Images

Salient object detection (SOD) for optical remote sensing images (RSIs) aims at locating and extracting visually distinctive objects/regions from the optical RSIs. Although some saliency models have been proposed to solve the intrinsic problems of optical RSIs (such as complex backgrounds and scale-variant objects), their accuracy and completeness are still unsatisfactory. To this end, we propose a relational reasoning network with parallel multi-scale attention for SOD in optical RSIs in this paper. The relational reasoning module that integrates the spatial and the channel dimensions is designed to infer the semantic relationship by utilizing high-level encoder features, thereby promoting the generation of more complete detection results. The parallel multi-scale attention module is proposed to effectively restore the detail information and address the scale variation of salient objects by using the low-level features refined by multi-scale attention. Extensive experiments on two datasets demonstrate that our proposed RRNet outperforms the existing state-of-the-art SOD competitors both qualitatively and quantitatively.

preprint, 2020, arXiv

Noise-Powered Disentangled Representation for Unsupervised Speckle Reduction of Optical Coherence Tomography Images

Due to its noninvasive character, optical coherence tomography (OCT) has become a popular diagnostic method in clinical settings. However, the low-coherence interferometric imaging procedure is inevitably contaminated by heavy speckle noise, which impairs both visual quality and diagnosis of various ocular diseases. Although deep learning has been applied for image denoising and achieved promising results, the lack of well-registered clean and noisy image pairs makes it impractical for supervised learning-based approaches to achieve satisfactory OCT image denoising results. In this paper, we propose an unsupervised OCT image speckle reduction algorithm that does not rely on well-registered image pairs. Specifically, by employing the ideas of disentangled representation and generative adversarial network, the proposed method first disentangles the noisy image into content and noise spaces by corresponding encoders. Then, the generator is used to predict the denoised OCT image with the extracted content features. In addition, the noise patches cropped from the noisy image are utilized to facilitate more accurate disentanglement. Extensive experiments have been conducted, and the results suggest that our proposed method is superior to the classic methods and demonstrates competitive performance to several recently proposed learning-based approaches in both quantitative and qualitative aspects.