Source author record

Xiaofeng Yang

Xiaofeng Yang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

physics.med-ph, Computer Vision, eess.IV, Machine Learning, astro-ph.CO, Artificial Intelligence, astro-ph.GA, astro-ph.HE, Computation and Language, Data Structures and Algorithms, Databases, physics.flu-dyn

Catalog footprint

What is connected

34 works

12 topics

4 close collaborators

preprint · 2026 · arXiv

Beyond Single Prompts: Synergistic Fusion and Arrangement for VICL

Vision In-Context Learning (VICL) enables inpainting models to quickly adapt to new visual tasks from only a few prompts. However, existing methods suffer from two key issues: (1) selecting only the most similar prompt discards complementary cues from other high-quality prompts; and (2) the structured information implied by different prompt arrangements goes unexploited. We propose an end-to-end VICL framework to overcome these limitations. First, an adaptive Fusion Module aggregates critical patterns and annotations from multiple prompts to form more precise contextual prompts. Second, we introduce arrangement-specific lightweight MLPs to decouple layout priors from the core model, while minimally affecting the overall model. In addition, a bidirectional fine-tuning mechanism swaps the roles of query and prompt, encouraging the model to reconstruct the original prompt from fused context and thus enhancing collaboration between the fusion module and the inpainting model. Experiments on foreground segmentation, single-object detection, and image colorization demonstrate superior results and strong cross-task generalization of our method.

preprint · 2026 · arXiv

BrainDINO: A Brain MRI Foundation Model for Generalizable Clinical Representation Learning

Brain MRI underpins a wide range of neuroscientific and clinical applications, yet most learning-based methods remain task-specific and require substantial labeled data. Here we show that a single self-supervised representation can generalize across heterogeneous brain MRI endpoints. We trained BrainDINO, a self-distilled foundation model, on approximately 6.6 million unlabeled axial slices from 20 datasets encompassing broad variation in population, disease, and acquisition setting. Using a frozen encoder with lightweight task heads, BrainDINO supported transfer across tumor segmentation, neurodegenerative and neurodevelopmental conditions classification, brain age estimation, post-stroke temporal prediction, molecular status prediction, MRI sequence classification, and survival modeling. Across tasks and supervision regimes, BrainDINO consistently equaled or exceeded natural-image and MRI-specific self-supervised baselines, with particularly strong advantages under label scarcity. Representation analyses further showed anatomically organized and pathology-sensitive feature structure in the absence of task-specific supervision. Our findings indicate that large-scale slice-wise self-supervised learning can yield a unified brain MRI representation that supports diverse neuroimaging tasks without volumetric pretraining or full-network fine-tuning, establishing a scalable foundation for robust and data-efficient brain imaging analysis.

preprint · 2026 · arXiv

EfficientFSL: Enhancing Few-Shot Classification via Query-Only Tuning in Vision Transformers

Large models such as Vision Transformers (ViTs) have demonstrated remarkable superiority over smaller architectures like ResNet in few-shot classification, owing to their powerful representational capacity. However, fine-tuning such large models demands extensive GPU memory and prolonged training time, making them impractical for many real-world low-resource scenarios. To bridge this gap, we propose EfficientFSL, a query-only fine-tuning framework tailored specifically for few-shot classification with ViT, which achieves competitive performance while significantly reducing computational overhead. EfficientFSL fully leverages the knowledge embedded in the pre-trained model and its strong comprehension ability, achieving high classification accuracy with an extremely small number of tunable parameters. Specifically, we introduce a lightweight trainable Forward Block to synthesize task-specific queries that extract informative features from the intermediate representations of the pre-trained model in a query-only manner. We further propose a Combine Block to fuse multi-layer outputs, enhancing the depth and robustness of feature representations. Finally, a Support-Query Attention Block mitigates distribution shift by adjusting prototypes to align with the query set distribution. With minimal trainable parameters, EfficientFSL achieves state-of-the-art performance on four in-domain few-shot datasets and six cross-domain datasets, demonstrating its effectiveness in real-world applications.

preprint · 2026 · arXiv

Enhancing Visual In-Context Learning by Multi-Faceted Fusion

Visual In-Context Learning (VICL) has emerged as a powerful paradigm, enabling models to perform novel visual tasks by learning from in-context examples. The dominant "retrieve-then-prompt" approach typically relies on selecting the single best visual prompt, a practice that often discards valuable contextual information from other suitable candidates. While recent work has explored fusing the top-K prompts into a single, enhanced representation, this still simply collapses multiple rich signals into one, limiting the model's reasoning capability. We argue that a more multi-faceted, collaborative fusion is required to unlock the full potential of these diverse contexts. To address this limitation, we introduce a novel framework that moves beyond single-prompt fusion towards a multi-combination collaborative fusion. Instead of collapsing multiple prompts into one, our method generates three contextual representation branches, each formed by integrating information from different combinations of top-quality prompts. These complementary guidance signals are then fed into the proposed MULTI-VQGAN architecture, which is designed to jointly interpret and utilize collaborative information from multiple sources. Extensive experiments on diverse tasks, including foreground segmentation, single-object detection, and image colorization, highlight its strong cross-task generalization, effective contextual fusion, and ability to produce more robust and accurate predictions than existing methods.

preprint · 2026 · arXiv

InfoSculpt: Sculpting the Latent Space for Generalized Category Discovery

Generalized Category Discovery (GCD) aims to classify instances from both known and novel categories within a large-scale unlabeled dataset, a critical yet challenging task for real-world, open-world applications. However, existing methods often rely on pseudo-labeling or two-stage clustering, which lack a principled mechanism to explicitly disentangle essential, category-defining signals from instance-specific noise. In this paper, we address this fundamental limitation by re-framing GCD from an information-theoretic perspective, grounded in the Information Bottleneck (IB) principle. We introduce InfoSculpt, a novel framework that systematically sculpts the representation space by minimizing a dual Conditional Mutual Information (CMI) objective. InfoSculpt uniquely combines a Category-Level CMI on labeled data to learn compact and discriminative representations for known classes, and a complementary Instance-Level CMI on all data to distill invariant features by compressing augmentation-induced noise. These two objectives work synergistically at different scales to produce a disentangled and robust latent space where categorical information is preserved while noisy, instance-specific details are discarded. Extensive experiments on 8 benchmarks validate the effectiveness of our information-theoretic approach.

preprint · 2026 · arXiv

IPEC: Test-Time Incremental Prototype Enhancement Classifier for Few-Shot Learning

Metric-based few-shot approaches have gained significant popularity due to their relatively straightforward implementation, high interpretability, and computational efficiency. However, they suffer from the batch-independence assumption during testing, which prevents the model from leveraging valuable knowledge accumulated from previous batches. To address this challenge, we propose the Incremental Prototype Enhancement Classifier (IPEC), a novel test-time method that optimizes prototype estimation by leveraging information from previous query samples. IPEC maintains a dynamic auxiliary set by selectively incorporating query samples that are classified with high confidence. To ensure sample quality, we design a robust dual-filtering mechanism that assesses each query sample based on both global prediction confidence and local discriminative ability. By aggregating this auxiliary set with the support set in subsequent tasks, IPEC builds progressively more stable and representative prototypes, effectively reducing its reliance on the initial support set. We ground this approach in a Bayesian interpretation, conceptualizing the support set as a prior and the auxiliary set as a data-driven posterior, which in turn motivates the design of a practical "warm-up and test" two-stage inference protocol. Extensive empirical results validate the superior performance of our proposed method across multiple few-shot classification tasks.
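
The idea of growing prototypes from confidently classified queries can be illustrated with a minimal sketch. This is not the IPEC implementation: it assumes plain feature vectors, a nearest-prototype classifier with a softmax over negative squared Euclidean distances, and a single hypothetical confidence threshold standing in for the dual-filtering mechanism described in the abstract. All function names and the `threshold` parameter are illustrative assumptions.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def class_probabilities(query, prototypes):
    """Softmax over negative squared Euclidean distances to each prototype."""
    labels = sorted(prototypes)
    logits = [
        -sum((q - p) ** 2 for q, p in zip(query, prototypes[label]))
        for label in labels
    ]
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(value - top) for value in logits]
    total = sum(exps)
    return {label: e / total for label, e in zip(labels, exps)}


def classify_with_auxiliary(support, queries, threshold=0.9):
    """Classify queries one by one, reusing confident ones as extra examples.

    `threshold` is a hypothetical single confidence filter, a simplification
    of the dual filter (global confidence plus local discriminability).
    """
    auxiliary = {label: [] for label in support}
    predictions = []
    for query in queries:
        # Prototypes are recomputed from the support set plus the auxiliary set.
        prototypes = {
            label: mean_vector(support[label] + auxiliary[label])
            for label in support
        }
        probs = class_probabilities(query, prototypes)
        label = max(probs, key=probs.get)
        predictions.append(label)
        if probs[label] >= threshold:
            auxiliary[label].append(query)
    return predictions, auxiliary
```

In this sketch an ambiguous query still receives a label but is not added to the auxiliary set, so it cannot pull later prototypes in the wrong direction.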

preprint · 2025 · arXiv

A Physics-Informed Deep Learning Model for MRI Brain Motion Correction

Background: MRI is crucial for brain imaging but is highly susceptible to motion artifacts due to long acquisition times. This study introduces PI-MoCoNet, a physics-informed motion correction network that integrates spatial and k-space information to remove motion artifacts without explicit motion parameter estimation, enhancing image fidelity and diagnostic reliability. Materials and Methods: PI-MoCoNet consists of a motion detection network (U-net with spatial averaging) to identify corrupted k-space lines and a motion correction network (U-net with Swin Transformer blocks) to reconstruct motion-free images. The correction is guided by three loss functions: reconstruction (L1), perceptual (LPIPS), and data consistency (Ldc). Motion artifacts were simulated via rigid phase encoding perturbations and evaluated on IXI and MR-ART datasets against Pix2Pix, CycleGAN, and U-net using PSNR, SSIM, and NMSE. Results: PI-MoCoNet significantly improved image quality. On IXI, for minor artifacts, PSNR increased from 34.15 dB to 45.95 dB, SSIM from 0.87 to 1.00, and NMSE reduced from 0.55% to 0.04%. For moderate artifacts, PSNR improved from 30.23 dB to 42.16 dB, SSIM from 0.80 to 0.99, and NMSE from 1.32% to 0.09%. For heavy artifacts, PSNR rose from 27.99 dB to 36.01 dB, SSIM from 0.75 to 0.97, and NMSE decreased from 2.21% to 0.36%. On MR-ART, PI-MoCoNet achieved PSNR gains of ~10 dB and SSIM improvements of up to 0.20, with NMSE reductions of ~6%. Ablation studies confirmed the importance of data consistency and perceptual losses, yielding a 1 dB PSNR gain and 0.17% NMSE reduction. Conclusions: PI-MoCoNet effectively mitigates motion artifacts in brain MRI, outperforming existing methods. Its ability to integrate spatial and k-space information makes it a promising tool for clinical use in motion-prone settings. Code: https://github.com/mosaf/PI-MoCoNet.git.
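
For readers unfamiliar with the reported metrics, the following is a minimal sketch of PSNR and NMSE on flattened images. It assumes the common textbook definitions (PSNR from the mean squared error against a known peak value, NMSE as squared error normalized by the reference energy); the abstract does not state the exact normalization the authors used, so the function names, the `max_value` parameter, and these conventions are assumptions.

```python
import math


def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB for two equal-length pixel lists.

    `max_value` is the assumed peak intensity (1.0 for normalized images).
    """
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def nmse(reference, estimate):
    """Squared error normalized by the energy of the reference image."""
    error = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    energy = sum(r ** 2 for r in reference)
    return error / energy


if __name__ == "__main__":
    clean = [1.0, 0.5, 0.0, 0.5]
    corrected = [0.9, 0.5, 0.1, 0.5]
    print(f"PSNR: {psnr(clean, corrected):.2f} dB")
    print(f"NMSE: {100 * nmse(clean, corrected):.2f} %")
```

Higher PSNR and lower NMSE both indicate a closer match to the reference, which is the direction of the improvements quoted above.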

preprint · 2025 · arXiv

MRI super-resolution reconstruction using efficient diffusion probabilistic model with residual shifting

Objective: This study introduces a residual error-shifting mechanism that drastically reduces sampling steps while preserving critical anatomical details, thus accelerating MRI reconstruction. Approach: We propose a novel diffusion-based SR framework called Res-SRDiff, which integrates residual error shifting into the forward diffusion process. This enables efficient HR image reconstruction by aligning the degraded HR and LR distributions. We evaluated Res-SRDiff on ultra-high-field brain T1 MP2RAGE maps and T2-weighted prostate images, comparing it with Bicubic, Pix2pix, CycleGAN, and a conventional denoising diffusion probabilistic model with vision transformer backbone (TM-DDPM), using quantitative metrics such as peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), gradient magnitude similarity deviation (GMSD), and learned perceptual image patch similarity (LPIPS). Main results: Res-SRDiff significantly outperformed all comparative methods in terms of PSNR, SSIM, and GMSD across both datasets, with statistically significant improvements (p-values<<0.05). The model achieved high-fidelity image restoration with only four sampling steps, drastically reducing computational time to under one second per slice, which is substantially faster than conventional TM-DDPM with around 20 seconds per slice. Qualitative analyses further demonstrated that Res-SRDiff effectively preserved fine anatomical details and lesion morphology in both brain and pelvic MRI images. Significance: Our findings show that Res-SRDiff is an efficient and accurate MRI SR method, markedly improving computational efficiency and image quality. Integrating residual error shifting into the diffusion process allows for rapid and robust HR image reconstruction, enhancing clinical MRI workflows and advancing medical imaging research. The source code is available at: https://github.com/mosaf/Res-SRDiff
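
A rough sketch of what a residual-shifting forward process can look like is given below. It is an assumption, not the Res-SRDiff code: it follows the general form used in residual-shifting diffusion models, where the mean moves from the clean image toward the degraded one by a fraction `eta_t` of their residual and the noise scale is `kappa * sqrt(eta_t)`. The linear schedule, the parameter names, and the default values are all hypothetical.

```python
import random


def eta_schedule(num_steps, eta_min=0.001, eta_max=0.999):
    """Monotonically increasing shift fractions (linear spacing is an assumption)."""
    if num_steps == 1:
        return [eta_max]
    step = (eta_max - eta_min) / (num_steps - 1)
    return [eta_min + i * step for i in range(num_steps)]


def shifted_sample(clean, degraded, eta_t, kappa, rng):
    """Draw x_t with mean clean + eta_t * (degraded - clean), std kappa * sqrt(eta_t)."""
    std = kappa * eta_t ** 0.5
    return [
        x0 + eta_t * (y - x0) + std * rng.gauss(0.0, 1.0)
        for x0, y in zip(clean, degraded)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    clean = [0.0, 1.0, 0.5]      # stands in for a high-resolution image
    degraded = [0.2, 0.7, 0.5]   # stands in for an upsampled low-resolution image
    for eta_t in eta_schedule(4):  # four steps, matching the step count in the abstract
        print(round(eta_t, 3), shifted_sample(clean, degraded, eta_t, 2.0, rng))
```

Because the final state is centered on the degraded image instead of pure noise, the reverse process only has to undo the residual, which is one intuition for why a few sampling steps can suffice.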

preprint · 2025 · arXiv

Res-MoCoDiff: Residual-guided diffusion models for motion artifact correction in brain MRI

Objective. Motion artifacts in brain MRI, mainly from rigid head motion, degrade image quality and hinder downstream applications. Conventional methods to mitigate these artifacts, including repeated acquisitions or motion tracking, impose workflow burdens. This study introduces Res-MoCoDiff, an efficient denoising diffusion probabilistic model specifically designed for MRI motion artifact correction. Approach. Res-MoCoDiff exploits a novel residual error shifting mechanism during the forward diffusion process to incorporate information from motion-corrupted images. This mechanism allows the model to simulate the evolution of noise with a probability distribution closely matching that of the corrupted data, enabling a reverse diffusion process that requires only four steps. The model employs a U-net backbone, with attention layers replaced by Swin Transformer blocks, to enhance robustness across resolutions. Furthermore, the training process integrates a combined l1+l2 loss function, which promotes image sharpness and reduces pixel-level errors. Res-MoCoDiff was evaluated on both an in-silico dataset generated using a realistic motion simulation framework and an in-vivo MR-ART dataset. Comparative analyses were conducted against established methods, including CycleGAN, Pix2pix, and a diffusion model with a vision transformer backbone, using quantitative metrics such as PSNR, SSIM, and NMSE. Main results. The proposed method demonstrated superior performance in removing motion artifacts across minor, moderate, and heavy distortion levels. Res-MoCoDiff consistently achieved the highest SSIM and the lowest NMSE values, with a PSNR of up to 41.91 ± 2.94 dB for minor distortions. Notably, the average sampling time was reduced to 0.37 seconds per batch of two image slices, compared with 101.74 seconds for conventional approaches.
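
The combined l1+l2 loss mentioned above can be sketched as a weighted sum of mean absolute error and mean squared error. The abstract does not give the weighting, so the equal default weights and the function name below are assumptions.

```python
def l1_l2_loss(prediction, target, l1_weight=1.0, l2_weight=1.0):
    """Weighted sum of mean absolute error and mean squared error.

    The weights are hypothetical; the source only states that l1 and l2
    terms are combined.
    """
    n = len(prediction)
    l1 = sum(abs(p - t) for p, t in zip(prediction, target)) / n
    l2 = sum((p - t) ** 2 for p, t in zip(prediction, target)) / n
    return l1_weight * l1 + l2_weight * l2
```

The l1 term penalizes all errors linearly, which tends to preserve edges, while the l2 term penalizes large pixel-level deviations more heavily.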

preprint · 2024 · arXiv

Fast MRI Reconstruction Using Deep Learning-based Compressed Sensing: A Systematic Review

Magnetic resonance imaging (MRI) has revolutionized medical imaging, providing a non-invasive and highly detailed look into the human body. However, the long acquisition times of MRI present challenges, causing patient discomfort, motion artifacts, and limiting real-time applications. To address these challenges, researchers are exploring various techniques to reduce acquisition time and improve the overall efficiency of MRI. One such technique is compressed sensing (CS), which reduces data acquisition by leveraging image sparsity in transformed spaces. In recent years, deep learning (DL) has been integrated with CS-MRI, leading to a new framework that has seen remarkable growth. DL-based CS-MRI approaches are proving to be highly effective in accelerating MR imaging without compromising image quality. This review comprehensively examines DL-based CS-MRI techniques, focusing on their role in increasing MR imaging speed. We provide a detailed analysis of each category of DL-based CS-MRI including end-to-end, unroll optimization, self-supervised, and federated learning. Our systematic review highlights significant contributions and underscores the exciting potential of DL in CS-MRI. Additionally, our systematic review efficiently summarizes key results and trends in DL-based CS-MRI including quantitative metrics, the dataset used, acceleration factors, and the progress of and research interest in DL techniques over time. Finally, we discuss potential future directions and the importance of DL-based CS-MRI in the advancement of medical imaging. To facilitate further research in this area, we provide a GitHub repository that includes up-to-date DL-based CS-MRI publications and publicly available datasets - https://github.com/mosaf/Awesome-DL-based-CS-MRI.

preprint · 2023 · arXiv

Effective End-to-End Vision Language Pretraining with Semantic Visual Loss

Current vision language pretraining models are dominated by methods using region visual features extracted from object detectors. Given their good performance, the extract-then-process pipeline significantly restricts the inference speed and therefore limits their real-world use cases. However, training vision language models from raw image pixels is difficult, as the raw image pixels give much less prior knowledge than region features. In this paper, we systematically study how to leverage auxiliary visual pretraining tasks to help training end-to-end vision language models. We introduce three types of visual losses that enable much faster convergence and better finetuning accuracy. Compared with region feature models, our end-to-end models could achieve similar or better performance on downstream tasks and run more than 10 times faster during inference. Compared with other end-to-end models, our proposed method could achieve similar or better performance when pretrained for only 10% of the pretraining GPU hours.

preprint · 2023 · arXiv

Self-Training Vision Language BERTs with a Unified Conditional Model

Natural language BERTs are trained with language corpus in a self-supervised manner. Unlike natural language BERTs, vision language BERTs need paired data to train, which restricts the scale of VL-BERT pretraining. We propose a self-training approach that allows training VL-BERTs from unlabeled image data. The proposed method starts with our unified conditional model -- a vision language BERT model that can perform zero-shot conditional generation. Given different conditions, the unified conditional model can generate captions, dense captions, and even questions. We use the labeled image data to train a teacher model and use the trained model to generate pseudo captions on unlabeled image data. We then combine the labeled data and pseudo labeled data to train a student model. The process is iterated by putting the student model as a new teacher. By using the proposed self-training approach and only 300k unlabeled extra data, we are able to get competitive or even better performances compared to the models of similar model size trained with 3 million extra image data.

preprint · 2022 · arXiv

Deep Q-learning of global optimizer of multiply model parameters for viscoelastic imaging

Objective: Estimation of the global optima of multiple model parameters is valuable in imaging to form a reliable diagnostic image. Given the non-convexity of the objective function, it is challenging to avoid getting trapped in local minima. Methods: We first formulate the global search over multiple parameters as a k-D move in the parametric space, and convert parameter updating into a state-action decision-making problem. We propose a novel Deep Q-learning of Model Parameters (DQMP) method for global optimization of model parameters by updating the parameter configurations through actions that maximize a Q-value, which employs a Deep Reward Network designed to learn global reward values from both visible curve fitting errors and hidden parameter errors. Results: The DQMP method was evaluated by viscoelastic imaging on soft matter by Kelvin-Voigt fractional derivative (KVFD) modeling. In comparison to other methods, imaging of parameters by DQMP yielded the smallest errors (< 2%) relative to the ground truth images. DQMP was applied to viscoelastic imaging on biological tissues, which indicated great potential for imaging physical parameters in diagnostic applications. Conclusions: The DQMP method is able to achieve global optima, yielding accurate model parameter estimates in viscoelastic imaging. Assessment of DQMP by simulation imaging and ultrasound breast imaging demonstrated the consistency and reliability of the imaged parameters and the powerful global searching ability of DQMP. Significance: The DQMP method is promising for imaging of multiple parameters, and can be generalized to global optimization for many other complex nonconvex functions and imaging of physical parameters.

preprint · 2022 · arXiv

Multi-organ Segmentation Network with Adversarial Performance Validator

Organ segmentation on computed tomography (CT) images has become a significant building block of modern medical image analysis, supporting clinical workflows in multiple domains. Previous segmentation methods include 2D convolutional neural network (CNN)-based approaches, which are fed CT image slices that lack the structural knowledge in axial view, and 3D CNN-based methods, which incur expensive computation cost in multi-organ segmentation applications. This paper introduces an adversarial performance validation network into a 2D-to-3D segmentation framework. The competition between the classifier and the performance validator contributes to accurate segmentation results via back-propagation. The proposed network organically converts the 2D coarse result to 3D high-quality segmentation masks in a coarse-to-fine manner, allowing joint optimization to improve segmentation accuracy. In addition, the structural information of one specific organ is depicted by a statistically meaningful prior bounding box, which is transformed into a global feature leveraging the learning process in 3D fine segmentation. The experiments on the NIH pancreas segmentation dataset demonstrate that the proposed network achieves state-of-the-art accuracy on small organ segmentation and outperforms the previous best. High accuracy is also reported on multi-organ segmentation on a dataset we collected.

preprint · 2022 · arXiv

Reinforcement Learning in Medical Image Analysis: Concepts, Applications, Challenges, and Future Directions

Motivation: Medical image analysis involves tasks to assist physicians in qualitative and quantitative analysis of lesions or anatomical structures, significantly improving the accuracy and reliability of diagnosis and prognosis. Traditionally, these tasks are performed by physicians or medical physicists, which leads to two major problems: (i) low efficiency; (ii) bias from personal experience. In the past decade, many machine learning methods have been applied to accelerate and automate the image analysis process. Compared to the enormous deployments of supervised and unsupervised learning models, attempts to use reinforcement learning in medical image analysis are scarce. This review article could serve as a stepping-stone for related research. Significance: From our observation, though reinforcement learning has gradually gained momentum in recent years, many researchers in the medical analysis field find it hard to understand and deploy in clinics. One cause is the lack of well-organized review articles targeting readers without professional computer science backgrounds. Rather than providing a comprehensive list of all reinforcement learning models in medical image analysis, this paper may help readers learn how to formulate and solve their medical image analysis research as reinforcement learning problems. Approach & Results: We selected published articles from Google Scholar and PubMed. Considering the scarcity of related articles, we also included some outstanding recent preprints. The papers are carefully reviewed and categorized according to the type of image analysis task. We first review the basic concepts and popular models of reinforcement learning. Then we explore the applications of reinforcement learning models in landmark detection. Finally, we conclude the article by discussing the limitations of the reviewed reinforcement learning approaches and possible improvements.

preprint · 2020 · arXiv

A plan quality control method of treatment planning for Gamma Knife radiosurgery

With many variables to adjust, conventional manual forward planning for Gamma Knife (GK) radiosurgery is very complicated and cumbersome. The resulting plan quality heavily depends on planners' skills, experience and devoted effort, and varies significantly among cases, planners, and institutions. Quality control for GK planning is desired to consistently provide a high-quality plan to each patient. In this study, we proposed a quality control method for GK planning by building a database of high-quality GK plans. Patient anatomy was described by target volume, target shape complexity, and spatial relationship between target and nearby organs, which determine GK planning difficulty level. Plan quality was evaluated using target coverage, selectivity, intermediate dose spillage, maximum dose to 0.1 cc of brainstem, mean dose of ipsilateral cochlea, and beam-on time. When a new plan is created, a high-quality plan that has the most similar target volume size and shape complexity will be identified from the database. A model has also been built to predict the dose to brainstem and cochlea based on their overlap volume histograms. The identified reference plan and the predicted organ dose will help planners to make quality control decisions accordingly. To validate this method, we have built a database for vestibular schwannoma cases, which are considered to be challenging for GK planning due to the irregularly shaped target and its proximity to brainstem and cochlea. Five cases were tested, among which one case was considered to be of high quality and four cases had a lower plan quality than predicted. These four cases were replanned and substantially improved. Our results have demonstrated the efficacy of our proposed quality control method. This method may also be used as a plan quality prediction method to facilitate the development of automatic treatment planning for GK radiosurgery.

preprint · 2020 · arXiv

An institutional study on plan quality and variation of manual forward planning for Gamma Knife radiosurgery for vestibular schwannoma

Due to the complexity and cumbersomeness of Gamma Knife (GK) manual forward planning, the quality of the resulting treatment plans heavily depends on the planner's skill, experience and the amount of effort devoted to plan development. Hence, GK plan quality may vary significantly among institutions and planners, and even for the same planner across different cases. This is particularly a concern for challenging cases with complicated geometry, such as vestibular schwannoma cases. The purpose of this retrospective study is to investigate the plan quality and variation in the manually forward planned, clinically acceptable GK treatment plans of 22 previous vestibular schwannoma cases. Considering the impacts of different patient geometry and different trade-offs among the planning objectives in GK planning, it is difficult to objectively assess the plan quality across different cases. To reduce the effect of these confounding factors on plan quality assessment, we employed our recently developed multiresolution-level inverse planning algorithm to generate a golden plan for each case, which is expected to be on or close to the Pareto surface with a similar trade-off as used in the manual plan. The plan quality of the manual plan is then quantified in terms of its deviation from the golden plan. A scoring criterion between 0 and 100 was designed to calculate a final score for each manual plan to simplify our analysis. Large quality variation was observed in these 22 cases, with two cases having a score lower than 75, three cases scoring between 80 and 85, two cases between 85 and 90, eight cases between 90 and 95, and seven cases higher than 95. Inter- and intra-planner variability was also observed in our study. This large variation in GK manual planning deserves close attention, and merits further investigation on how to reduce the variation in GK treatment plan quality.

preprint · 2020 · arXiv

Artificial Intelligence in Quantitative Ultrasound Imaging: A Review

Quantitative ultrasound (QUS) imaging is a reliable, fast and inexpensive technique to extract physically descriptive parameters for assessing pathologies. Despite its safety and efficacy, QUS suffers from several major drawbacks: poor imaging quality and inter- and intra-observer variability, which hamper the reproducibility of measurements. Therefore, there is a great need to develop automatic methods to improve the imaging quality and aid in measurements in QUS. In recent years, there has been an increasing interest in artificial intelligence (AI) applications in ultrasound imaging. However, no research has been found that surveyed the use of AI in QUS. The purpose of this paper is to review recent research into AI applications in QUS. This review first introduces the AI workflow, and then discusses the various AI applications in QUS. Finally, challenges and future potential AI applications in QUS are discussed.

preprint · 2020 · arXiv

Catching butterflies in the sky: Extended catalog of winged or X-shaped radio sources from the latest FIRST data release

We present a catalog of 290 "winged" or X-shaped radio galaxies (XRGs) extracted from the latest (2014 December 17) data release of the "Very Large Array Faint Images of the Radio Sky at Twenty centimeter." We have combined these radio images with their counterparts in the TIFR GMRT sky survey at 150 MHz, in an attempt to identify any low surface brightness radio emission present in these sources. This has enabled us to assemble a sample of 106 "strong" XRG candidates and 184 "probable" XRG candidates whose XRG designation needs to be verified by further observations. The present sample of 290 XRG candidates is almost twice as large as the number of XRGs currently known. Twenty-five of our 290 XRG candidates (9 "strong" and 16 "probable") are identified as quasars. Double-peaked narrow emission lines are seen in the optical spectra of three of the XRG candidates (two "strong" and one "probable"). Nearly 90% of the sample is located in the FR II domain of the Owen-Ledlow diagram. A few of the strong XRG candidates have a rather flat radio spectrum (spectral index alpha flatter than -0.3) between 150 MHz and 1.4 GHz, or between 1.4 and 5 GHz. Since this is not expected for lobe-dominated extragalactic radio sources (like nearly all known XRGs), these sources are particularly suited for follow-up radio imaging and near-simultaneous measurement of the radio spectrum.

preprint · 2020 · arXiv

Deep Learning in Multi-organ Segmentation

This paper presents a review of deep learning (DL) in multi-organ segmentation. We summarized the latest DL-based methods for medical image segmentation and applications. These methods were classified into six categories according to their network design. For each category, we listed the surveyed works, highlighted important contributions and identified specific challenges. Following the detailed review of each category, we briefly discussed its achievements, shortcomings and future potentials. We provided a comprehensive comparison among DL-based methods for thoracic and head & neck multiorgan segmentation using benchmark datasets, including the 2017 AAPM Thoracic Auto-segmentation Challenge datasets and 2015 MICCAI Head Neck Auto-Segmentation Challenge datasets.

preprint2020arXiv

Deep learning-based Real-time Volumetric Imaging for Lung Stereotactic Body Radiation Therapy: A Proof of Concept Study

Due to the inter- and intra- variation of respiratory motion, it is highly desirable to provide real-time volumetric images during the treatment delivery of lung stereotactic body radiation therapy (SBRT) for accurate and active motion management. In this proof-of-concept study, we propose a novel generative adversarial network integrated with perceptual supervision to derive instantaneous volumetric images from a single 2D projection. Our proposed network, named TransNet, consists of three modules, i.e., encoding, transformation and decoding modules. Rather than only using image distance loss between the generated 3D images and the ground truth 3D CT images to supervise the network, perceptual loss in feature space is integrated into the loss function to force TransNet to yield accurate lung boundaries. Adversarial supervision is also used to improve the realism of the generated 3D images. We conducted a simulation study on 20 patient cases who had received lung SBRT treatments in our institution and undergone 4D-CT simulation, and evaluated the efficacy and consistency of our method for four different projection angles, i.e., 0, 30, 60 and 90 degrees. For each 3D CT image set of a breathing phase, we simulated its 2D projections at these angles. Then, for each projection angle, a patient's 3D CT images of 9 phases and the corresponding 2D projection data were used for training, with the remaining phase used for testing. The mean absolute error (MAE), normalized MAE, peak signal-to-noise ratio and structural similarity index metric achieved by our method are 99.3 HU, 0.032, 23.4 dB and 0.949, respectively. These results demonstrate the feasibility and efficacy of our 2D-to-3D method for lung cancer patients, which provides a potential solution for in-treatment real-time on-board volumetric imaging for accurate dose delivery to ensure the effectiveness of lung SBRT treatment.
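For readers unfamiliar with the image-quality metrics quoted in this abstract, here is a minimal sketch of MAE, normalized MAE and PSNR over flattened voxel values. The abstract does not state how the metrics were normalized, so dividing by the ground-truth intensity range is an assumption, and the voxel values below are hypothetical.

```python
import math


def mean_absolute_error(pred, truth):
    """Mean absolute voxel difference (in HU when the inputs are in HU)."""
    return sum(abs(p - t) for p, t in zip(pred, truth)) / len(truth)


def normalized_mae(pred, truth):
    """MAE divided by the ground-truth intensity range (assumed normalization)."""
    value_range = max(truth) - min(truth)
    return mean_absolute_error(pred, truth) / value_range


def psnr_db(pred, truth):
    """Peak signal-to-noise ratio in dB, using the ground-truth range as the peak."""
    mse = sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth)
    if mse == 0:
        return math.inf
    peak = max(truth) - min(truth)
    return 10.0 * math.log10(peak ** 2 / mse)


# Hypothetical flattened voxel values in HU (illustration only).
truth = [-1000.0, 0.0, 40.0, 1000.0]
pred = [-990.0, 10.0, 30.0, 990.0]

mae = mean_absolute_error(pred, truth)
nmae = normalized_mae(pred, truth)
psnr = psnr_db(pred, truth)
print(f"MAE = {mae:.1f} HU, NMAE = {nmae:.4f}, PSNR = {psnr:.1f} dB")
```

The structural similarity index is omitted here because it depends on windowed local statistics and several implementation choices that the abstract does not specify.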

preprint2020arXiv

Generative Adversarial Network for Image Synthesis

This chapter reviews recent developments of generative adversarial networks (GAN)-based methods for medical and biomedical image synthesis tasks. These methods are classified into conditional GAN and Cycle-GAN according to the network architecture designs. For each category, a literature survey is given, which covers discussions of the network architecture designs, highlights important contributions and identifies specific challenges.

preprint2020arXiv

Intensity Non-uniformity Correction in MR Imaging Using Residual Cycle Generative Adversarial Network

Purpose: Correcting or reducing the effects of voxel intensity non-uniformity (INU) within a given tissue type is a crucial issue for quantitative MRI image analysis in daily clinical practice. In this study, we present a deep learning-based approach for MRI image INU correction. Method: We developed a residual cycle generative adversarial network (res-cycle GAN), which integrates the residual block concept into a cycle-consistent GAN (cycle-GAN). In cycle-GAN, an inverse transformation was implemented between the INU uncorrected and corrected MRI images to constrain the model through forcing the calculation of both an INU corrected MRI and a synthetic corrected MRI. A fully convolutional neural network integrating residual blocks was applied in the generator of cycle-GAN to enhance end-to-end raw MRI to INU corrected MRI transformation. A cohort of 30 abdominal patients with T1-weighted MR INU images and their corrections with a clinically established and commonly used method, namely N4ITK, were used as pairs to evaluate the proposed res-cycle GAN based INU correction algorithm. Quantitative comparisons were made between the proposed method and other approaches. Result: Our res-cycle GAN based method achieved higher accuracy and better tissue uniformity compared to the other algorithms. Moreover, once the model is well trained, our approach can automatically generate the corrected MR images in a few minutes, eliminating the need for manual setting of parameters. Conclusion: In this study, a deep learning based automatic INU correction method in MRI, namely res-cycle GAN, has been investigated. The results show that learning based methods can achieve promising accuracy, while greatly speeding up the correction by avoiding the unintuitive parameter tuning process in N4ITK correction.

preprint2020arXiv

Knowledge-based Radiation Treatment Planning: A Data-driven Method Survey

This paper surveys the data-driven dose prediction approaches introduced for knowledge-based planning (KBP) in the last decade. These methods were classified into two major categories according to their methods and techniques of utilizing previous knowledge: traditional KBP methods and deep-learning-based methods. Previous studies that required geometric or anatomical features to either find the best matched case(s) from a repository of previously delivered treatment plans or build prediction models were included in the traditional methods category, whereas deep-learning-based methods included studies that trained neural networks to predict dose. A comprehensive review of each category is presented, highlighting key parameters, methods, and their outlooks in terms of dose prediction over the years. We separated the cited works according to the framework and cancer site in each category. Finally, we briefly discuss the performance of both traditional KBP methods and deep-learning-based methods, and future trends of both data-driven KBP approaches.

preprint2020arXiv

Learning-Based Stopping Power Mapping on Dual Energy CT for Proton Radiation Therapy

Purpose: Dual-energy CT (DECT) has been used to derive relative stopping power (RSP) maps by obtaining the energy dependence of photon interactions. The DECT-derived RSP maps could potentially be compromised by image noise levels and the severity of artifacts when using physics-based mapping techniques, which would affect subsequent clinical applications. This work presents a noise-robust learning-based method to predict RSP maps from DECT for proton radiation therapy. Methods: The proposed method uses a residual attention cycle-consistent generative adversarial (CycleGAN) network. CycleGAN was used to make the DECT-to-RSP mapping close to a one-to-one mapping by introducing an inverse RSP-to-DECT mapping. We retrospectively investigated 20 head-and-neck cancer patients with DECT scans in proton radiation therapy simulation. Ground truth RSP values were assigned by calculation based on chemical compositions, acted as learning targets in the training process for DECT datasets, and were evaluated against results from the proposed method using a leave-one-out cross-validation strategy. Results: The predicted RSP maps showed an average normalized mean square error (NMSE) of 2.83% across the whole body volume, and average mean error (ME) less than 3% in all volumes of interest (VOIs). With additional simulated noise added to the DECT datasets, the proposed method still maintained comparable performance, while the physics-based stoichiometric method suffered degraded accuracy from the increased noise level. The average differences in DVH metrics for clinical target volumes (CTVs) were less than 0.2 Gy for D95% and Dmax with no statistical significance. Conclusion: These results strongly indicate the high accuracy of RSP maps predicted by our machine-learning-based method and show its potential feasibility for proton treatment planning and dose calculation.

preprint2020arXiv

Learning-Based Synthetic Dual Energy CT Imaging from Single Energy CT for Stopping Power Ratio Calculation in Proton Radiation Therapy

Purpose: Dual-energy CT (DECT) has been shown to derive stopping power ratio (SPR) maps with higher accuracy than conventional single energy CT (SECT) by obtaining the energy dependence of photon interactions. However, DECT is not as widely implemented as SECT in proton radiation therapy simulation. This work presents a learning-based method to synthesize DECT images from SECT for proton radiation therapy. Methods: The proposed method uses a residual attention generative adversarial network. Residual blocks with attention gates were used to force the model to focus on the difference between DECT maps and SECT images. To evaluate the accuracy of the method, we retrospectively investigated 20 head-and-neck cancer patients with both DECT and SECT scans available. The high and low energy CT images acquired from DECT acted as learning targets in the training process for SECT datasets and were evaluated against results from the proposed method using a leave-one-out cross-validation strategy. To evaluate our method in the context of a practical application, we generated SPR maps from synthetic DECT (sDECT) using a physics-based dual-energy stoichiometric method and compared the maps to those generated from DECT. Results: The synthesized DECT images showed an average mean absolute error of around 30 Hounsfield units (HU) across the whole-body volume. The corresponding SPR maps generated from synthetic DECT showed an average normalized mean square error of about 1%, with a lower noise level and fewer artifacts than those from the original DECT. Conclusions: The accuracy of the DECT images synthesized by our machine-learning-based method was evaluated on head-and-neck patients, and potential feasibility for proton treatment planning and dose calculation was shown by generating SPR maps from the synthesized DECT.

preprint2020arXiv

Machine Learning in Quantitative PET Imaging

This paper reviewed the machine learning-based studies for quantitative positron emission tomography (PET). Specifically, we summarized the recent developments of machine learning-based methods in PET attenuation correction and low-count PET reconstruction by listing and comparing the proposed methods, study designs and reported performances of the currently published studies, with brief discussion of representative studies. The contributions and challenges among the reviewed studies were summarized and highlighted in the discussion part.

preprint2020arXiv

Medical Imaging Synthesis using Deep Learning and its Clinical Applications: A Review

This paper reviewed the deep learning-based studies for medical imaging synthesis and its clinical application. Specifically, we summarized the recent developments of deep learning-based methods in inter- and intra-modality image synthesis by listing and highlighting the proposed methods, study designs and reported performances with related clinical applications on representative studies. The challenges among the reviewed studies were summarized in the discussion part.

preprint2020arXiv

Optimal Algorithms for Ranked Enumeration of Answers to Full Conjunctive Queries

We study ranked enumeration of join-query results according to very general orders defined by selective dioids. Our main contribution is a framework for ranked enumeration over a class of dynamic programming problems that generalizes seemingly different problems that had been studied in isolation. To this end, we extend classic algorithms that find the k-shortest paths in a weighted graph. For full conjunctive queries, including cyclic ones, our approach is optimal in terms of the time to return the top result and the delay between results. These optimality properties are derived for the widely used notion of data complexity, which treats query size as a constant. By performing a careful cost analysis, we are able to uncover a previously unknown tradeoff between two incomparable enumeration approaches: one has lower complexity when the number of returned results is small, the other when the number is very large. We theoretically and empirically demonstrate the superiority of our techniques over batch algorithms, which produce the full result and then sort it. Our technique is not only faster for returning the first few results, but on some inputs beats the batch algorithm even when all results are produced.
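The contrast this abstract draws between ranked enumeration and batch processing can be illustrated on a toy case. The sketch below is not the paper's framework: it handles only the cross product of two sorted lists ranked by the sum of weights, and the function names are hypothetical. It does show the key behavior, namely that the first results are produced without materializing and sorting the full output.

```python
import heapq
from itertools import islice


def ranked_pairs(left, right):
    """Lazily yield (cost, a, b) in nondecreasing cost = a + b.

    Both inputs must be sorted ascending. A min-heap holds the frontier of
    candidate index pairs, so each result costs O(log k) heap work.
    """
    if not left or not right:
        return
    heap = [(left[0] + right[0], 0, 0)]
    seen = {(0, 0)}
    while heap:
        cost, i, j = heapq.heappop(heap)
        yield cost, left[i], right[j]
        # Expand the two successors of (i, j) that have not been queued yet.
        for ni, nj in ((i + 1, j), (i, j + 1)):
            if ni < len(left) and nj < len(right) and (ni, nj) not in seen:
                seen.add((ni, nj))
                heapq.heappush(heap, (left[ni] + right[nj], ni, nj))


def batch_pairs(left, right):
    """Batch baseline: build every combination, then sort the full result."""
    return sorted((a + b, a, b) for a in left for b in right)


# Hypothetical weights (illustration only).
left = [1, 4, 7]
right = [2, 3, 10]

top3 = list(islice(ranked_pairs(left, right), 3))
print(top3)
```

Requesting only the top three results touches a handful of heap entries, whereas the batch baseline always builds all nine combinations first.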

preprint2019arXiv

A preliminary study on a multi-resolution-level inverse planning algorithm for Gamma Knife radiosurgery

Manual forward planning for Gamma Knife (GK) radiosurgery is complicated and time-consuming, particularly for cases with large or irregularly shaped targets. Inverse planning eases GK planning by solving an optimization problem. However, due to the vast search space, most inverse planning algorithms have to decouple the planning process into isocenter preselection and sector duration optimization. This sequential scheme does not necessarily lead to optimal isocenter locations and hence optimal plans. In this study, we attempt to optimize the isocenter positions, beam shapes and durations simultaneously by proposing a multi-resolution-level (MRL) strategy to handle the large-scale GK optimization problem. In our approach, several rounds of optimization were performed with a progressively increased spatial resolution for isocenter candidate selection. The isocenters selected in the last round and their neighbors on a finer resolution were used as new isocenter candidates for the next round of optimization. After plan optimization, shot sequencing was performed to group the optimized sectors into deliverable shots supported by GK treatment units. We have tested our algorithm on 6 GK cases previously treated in our institution (2 meningioma cases, 3 cases with single metastasis and 1 case with 6 metastases). Compared with manual planning, while achieving the same coverage and similar selectivity, our algorithm improved the gradient index on average from 3.1 to 2.9 and reduced the maximum dose to the brainstem from 8.0 Gy to 5.6 Gy. The average beam-on time was also reduced from 103.8 min to 87.4 min. Our method was also compared with the inverse planning algorithm provided in the Leksell GammaPlan planning system, and outperformed it with better plan quality for all 6 cases. This preliminary study has demonstrated the effectiveness and feasibility of our MRL inverse planning approach for GK radiosurgery.

preprint2014arXiv

Flow-pattern switching in a Motored Spark Ignition Engine

Cycle-to-cycle variability (CCV) of intake-jet flow in an optical engine was measured using particle image velocimetry (PIV), revealing the possibility of two different flow patterns. A phase-dependent proper orthogonal decomposition (POD) analysis showed that one or the other flow pattern would appear in the average flow, sampled from test to test or sub-sampled within a single test; each data set contained individual cycles showing one flow pattern or the other. Three-dimensional velocity data from a large-eddy simulation (LES) of the engine showed that the PIV plane was cutting through a region of high shear between the intake jet and another large flow structure. Rotating the measurement plane 10° revealed one or the other flow structure observed in the PIV measurements. Thus, it was hypothesized that cycle-to-cycle variations in the swirl ratio result in the two different flow patterns in the PIV plane. Having an unambiguous metric to reveal large-scale flow CCV, causes for this variability were examined within the possible sources present in the available testing. In particular, variations in intake-port and cylinder pressure, lateral valve oscillations, and engine RPM were examined as potential causes for the cycle-to-cycle flow variations using the phase-dependent POD coefficients. No direct correlation was seen between the intake port pressure, or the pressure drop across the intake valve, and the in-cylinder flow pattern. A correlation was observed between dominant flow pattern and cycle-to-cycle variations in intake valve horizontal position. RPM values and in-cylinder flow patterns did not correlate directly. However, a shift in flow pattern was observed between early and late cycles in a 2900-cycle test after an approximately 5 rpm engine speed perturbation.

preprint2013arXiv

Analytical Solutions of Singular Isothermal Quadrupole Lens

Using analytical methods, we study the Singular Isothermal Quadrupole (SIQ) lens system, which is the simplest lens model that can produce four images. In this case, the radial mass distribution is in accord with the profile of the Singular Isothermal Sphere (SIS) lens, and the tangential distribution is given by adding a quadrupole to the monopole component. The basic properties of the SIQ lens have been studied in this paper, including deflection potential, deflection angle, magnification, critical curve, caustic, pseudo-caustic and transition locus. Analytical solutions of the image positions and magnifications for a source on the axes are derived. We find that naked cusps appear when the relative intensity $k$ of quadrupole to monopole is larger than 0.6. According to the magnification invariant theory of the SIQ lens, the sum of the signed magnifications of the four images should be equal to unity \citep{dal98}. However, if a source lies in the naked cusp, the summed magnification of the remaining three images is smaller than the invariant 1. With this simple lens system, we study the situations in which a point source approaches infinitely close to a cusp or a fold. The sum of magnifications of the cusp image triplet is usually not equal to 0, and it is usually positive for a major cusp and negative for a minor cusp. Similarly, the sum of magnifications of the fold image pair is usually not equal to 0 either. Nevertheless, the cusp and fold relations are still equal to 0, because the sums are divided by infinite absolute magnifications by definition.

preprint2013arXiv

Searching for a preferred direction with Union2.1 data

A cosmological preferred direction was reported from the type Ia supernovae (SNe Ia) data in recent years. We use the Union2.1 data to give a simple classification of such studies for the first time. Because the maximum anisotropic direction is independent of isotropic dark energy models, we adopt two cosmological models ($Λ$CDM, $w$CDM) for the hemisphere comparison analysis and the $Λ$CDM model for the dipole fit approach. In the hemisphere comparison method, the matter density and the equation of state of dark energy are adopted as the diagnostic quantities in the $Λ$CDM model and $w$CDM model, respectively. In the dipole fit approach, we fit the fluctuation of the distance modulus. We find a null signal for the hemisphere comparison method, while the dipole fit method yields a preferred direction ($b=-14.3^\circ \pm 10.1^\circ, l=307.1^\circ \pm 16.2^\circ$). This result indicates that the dipole fit is more sensitive than the hemisphere comparison method.

preprint2011arXiv

The optimal weighting function for cosmic magnification measurement through foreground galaxy-background galaxy (quasar) cross correlation

Cosmic magnification has been detected through cross correlation between foreground and background populations (galaxies or quasars). It has been shown that weighting each background object by its $α-1$ can significantly improve the cosmic magnification measurement \citep{Menard02,Scranton05}. Here, $α$ is the logarithmic slope of the luminosity function of background populations. However, we find that this weighting function is optimal only for sparse background populations in which intrinsic clustering is negligible with respect to shot noise. We derive the optimal weighting function for the general case, including scale independent and scale dependent weights. The optimal weighting function improves the S/N (signal to noise ratio) by $\sim 20\%$ for a BigBOSS-like survey and the improvement can reach a factor of $\sim 2$ for surveys with much denser background populations.

34 published item(s)

Beyond Single Prompts: Synergistic Fusion and Arrangement for VICL

BrainDINO: A Brain MRI Foundation Model for Generalizable Clinical Representation Learning

EfficientFSL: Enhancing Few-Shot Classification via Query-Only Tuning in Vision Transformers

Enhancing Visual In-Context Learning by Multi-Faceted Fusion

InfoSculpt: Sculpting the Latent Space for Generalized Category Discovery

IPEC: Test-Time Incremental Prototype Enhancement Classifier for Few-Shot Learning

A Physics-Informed Deep Learning Model for MRI Brain Motion Correction

MRI super-resolution reconstruction using efficient diffusion probabilistic model with residual shifting

Res-MoCoDiff: Residual-guided diffusion models for motion artifact correction in brain MRI

Fast MRI Reconstruction Using Deep Learning-based Compressed Sensing: A Systematic Review

Effective End-to-End Vision Language Pretraining with Semantic Visual Loss

Self-Training Vision Language BERTs with a Unified Conditional Model

Deep Q-learning of global optimizer of multiply model parameters for viscoelastic imaging

Multi-organ Segmentation Network with Adversarial Performance Validator

Reinforcement Learning in Medical Image Analysis: Concepts, Applications, Challenges, and Future Directions

A plan quality control method of treatment planning for Gamma Knife radiosurgery

An institutional study on plan quality and variation of manual forward planning for Gamma Knife radiosurgery for vestibular schwannoma

Artificial Intelligence in Quantitative Ultrasound Imaging: A Review

Catching butterflies in the sky: Extended catalog of winged or X-shaped radio sources from the latest FIRST data release

Deep Learning in Multi-organ Segmentation

Deep learning-based Real-time Volumetric Imaging for Lung Stereotactic Body Radiation Therapy: A Proof of Concept Study

Generative Adversarial Network for Image Synthesis

Intensity Non-uniformity Correction in MR Imaging Using Residual Cycle Generative Adversarial Network

Knowledge-based Radiation Treatment Planning: A Data-driven Method Survey

Learning-Based Stopping Power Mapping on Dual Energy CT for Proton Radiation Therapy

Learning-Based Synthetic Dual Energy CT Imaging from Single Energy CT for Stopping Power Ratio Calculation in Proton Radiation Therapy

Machine Learning in Quantitative PET Imaging

Medical Imaging Synthesis using Deep Learning and its Clinical Applications: A Review

Optimal Algorithms for Ranked Enumeration of Answers to Full Conjunctive Queries

A preliminary study on a multi-resolution-level inverse planning algorithm for Gamma Knife radiosurgery

Flow-pattern switching in a Motored Spark Ignition Engine

Analytical Solutions of Singular Isothermal Quadrupole Lens

Searching for a preferred direction with Union2.1 data

The optimal weighting function for cosmic magnification measurement through foreground galaxy-background galaxy (quasar) cross correlation