Source author record

Ulas Bagci

Ulas Bagci appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Computer Vision eess.IV Machine Learning Quantitative Methods Computation and Language Multimedia physics.med-ph

Catalog footprint

What is connected

33works

7topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

CT-DegradBench: A Physics-Informed Benchmark for CT Degradation Detection and Severity Estimation

Computed tomography (CT) images are frequently degraded by acquisition artifacts, including noise, blur, streaking, aliasing, and metal artifacts. Yet CT enhancement is still largely evaluated using image quality metrics with limited perceptual and clinical validity, while existing datasets remain focused on isolated restoration tasks, hindering unified benchmarking across diverse degradation types. We present CT-DegradBench, a dataset and benchmark for CT degradation detection and severity estimation under controlled single- and mixed-artifact settings. CT-DegradBench enables systematic evaluation across multiple degradation families and severity levels within a common experimental framework. We further propose SeSpeCT (Semantic-Spectral CT degradation estimation), a framework that combines semantic priors from medical vision-language models with complementary frequency-domain cues for artifact analysis. SeSpeCT constructs a training-free semantic quality axis in the multimodal embedding space using radiology-informed text prompts, without task-specific fine-tuning, and combines it with spectral features that capture degradation-specific frequency patterns. The resulting representation enables joint prediction of artifact type and severity. Experimental results show that SeSpeCT consistently outperforms the evaluated baselines under both single- and mixed-degradation settings. The framework is available at https://github.com/yousranb/CT-DEGRADBENCH.

preprint2026arXiv

CTest-Metric: A Unified Framework to Assess Clinical Validity of Metrics for CT Report Generation

In the generative AI era, where even critical medical tasks are increasingly automated, radiology report generation (RRG) continues to rely on suboptimal metrics for quality assessment. Developing domain-specific metrics has therefore been an active area of research, yet it remains challenging due to the lack of a unified, well-defined framework to assess their robustness and applicability in clinical contexts. To address this, we present CTest-Metric, a first unified metric assessment framework with three modules determining the clinical feasibility of metrics for CT RRG. The modules test: (i) Writing Style Generalizability (WSG) via LLM-based rephrasing; (ii) Synthetic Error Injection (SEI) at graded severities; and (iii) Metrics-vs-Expert correlation (MvE) using clinician ratings on 175 "disagreement" cases. Eight widely used metrics (BLEU, ROUGE, METEOR, BERTScore-F1, F1-RadGraph, RaTEScore, GREEN Score, CRG) are studied across seven LLMs built on a CT-CLIP encoder. Using our novel framework, we found that lexical NLG metrics are highly sensitive to stylistic variations; GREEN Score aligns best with expert judgments (Spearman~0.70), while CRG shows negative correlation; and BERTScore-F1 is least sensitive to factual error injection. We will release the framework, code, and allowable portion of the anonymized evaluation data (rephrased/error-injected CT reports), to facilitate reproducible benchmarking and future metric development.

preprint2026arXiv

Ensemble Models for Predicting Treatment Response in Pediatric Low-Grade Glioma Managed with Chemotherapy

In this paper, we introduce a novel pipeline for predicting chemotherapy response in pediatric brain tumors that are not amenable to complete surgical resection, using pre-treatment magnetic resonance imaging combined with clinical information. Our method integrates a state-of-the-art pediatric brain tumor segmentation framework with radiomic feature extraction and clinical data through an ensemble of a Swin UNETR encoder and XGBoost classifier. The segmentation model delineates four tumor subregions enhancing tumor, non-enhancing tumor, cystic component and edema which are used to extract imaging biomarkers and generate predictive features. The Swin UNETR network classifies the response to treatment directly from these segmented MRI scans, while XGBoost predicts response using radiomics and clinical variables including legal sex, ethnicity, race, age at event (in days), molecular subtype, tumor locations, initial surgery status, metastatic status, metastasis location, chemotherapy type, protocol name and chemotherapy agents. The ensemble output provides a non-invasive estimate of chemotherapy response in this historically challenging population characterized by lower progression-free survival. Among compared approaches, our Swin-Ensemble achieved the best performance (precision for non effective cases=0.68, recall for non effective cases=0.85, precision for chemotherapy effective cases=0.64 and overall accuracy=0.69), outperforming Mamba-FeatureFuse, Swin UNETR encoder, and Swin-FeatureFuse models. Our findings suggest that this ensemble framework represents a promising step toward personalized therapy response prediction for pediatric low-grade glioma patients in need of chemotherapy treatment who are not suitable for complete surgical resection, a population with significantly lower progression free survival and for whom chemotherapy remains the primary treatment option.

preprint2026arXiv

Improved Segmentation of Polyps and Visual Explainability Analysis

Colorectal cancer (CRC) remains one of the leading causes of cancer-related morbidity and mortality worldwide, with gastrointestinal (GI) polyps serving as critical precursors according to the World Health Organization (WHO). Early and accurate segmentation of polyps during colonoscopy is essential for reducing CRC progression, yet manual delineation is labor-intensive and prone to observer variability. Deep learning methods have demonstrated strong potential for automated polyp analysis, but their limited interpretability remains a barrier to clinical adoption. In this study, we present PolypSeg-GradCAM, an explainable deep learning framework that integrates a U-Net architecture with a pre-trained ResNet-34 backbone and Gradient-weighted Class Activation Mapping (Grad-CAM) for transparent polyp segmentation. To ensure rigorous benchmarking, the model was trained and evaluated using 5-Fold Cross-Validation on the Kvasir-SEG dataset of 1,000 annotated endoscopic images. Experimental results show a mean Dice coefficient of 0.8902 +/- 0.0125, a mean Intersection-over-Union (IoU) of 0.8023, and an Area Under the Receiver Operating Characteristic Curve (AUC-ROC) of 0.9722. Advanced quantitative analysis using an optimal threshold yielded a Sensitivity of 0.9058 and Precision of 0.9083. Additionally, Grad-CAM visualizations confirmed that predictions were guided by clinically relevant regions, offering insight into the model's decision-making process. This study demonstrates that integrating segmentation accuracy with interpretability can support the development of trustworthy AI-assisted colonoscopy tools.

preprint2025arXiv

ProDM: Synthetic Reality-driven Property-aware Progressive Diffusion Model for Coronary Calcium Motion Correction in Non-gated Chest CT

Coronary artery calcium (CAC) scoring from chest CT is a well-established tool to stratify and refine clinical cardiovascular disease risk estimation. CAC quantification relies on the accurate delineation of calcified lesions, but is oftentimes affected by artifacts introduced by cardiac and respiratory motion. ECG-gated cardiac CTs substantially reduce motion artifacts, but their use in population screening and routine imaging remains limited due to gating requirements and lack of insurance coverage. Although identification of incidental CAC from non-gated chest CT is increasingly considered for it offers an accessible and widely available alternative, this modality is limited by more severe motion artifacts. We present ProDM (Property-aware Progressive Correction Diffusion Model), a generative diffusion framework that restores motion-free calcified lesions from non-gated CTs. ProDM introduces three key components: (1) a CAC motion simulation data engine that synthesizes realistic non-gated acquisitions with diverse motion trajectories directly from cardiac-gated CTs, enabling supervised training without paired data; (2) a property-aware learning strategy incorporating calcium-specific priors through a differentiable calcium consistency loss to preserve lesion integrity; and (3) a progressive correction scheme that reduces artifacts gradually across diffusion steps to enhance stability and calcium fidelity. Experiments on real patient datasets show that ProDM significantly improves CAC scoring accuracy, spatial lesion fidelity, and risk stratification performance compared with several baselines. A reader study on real non-gated scans further confirms that ProDM suppresses motion artifacts and improves clinical usability. These findings highlight the potential of progressive, property-aware frameworks for reliable CAC quantification from routine chest CT imaging.

preprint2022arXiv

A Critical Appraisal of Data Augmentation Methods for Imaging-Based Medical Diagnosis Applications

Current data augmentation techniques and transformations are well suited for improving the size and quality of natural image datasets but are not yet optimized for medical imaging. We hypothesize that sub-optimal data augmentations can easily distort or occlude medical images, leading to false positives or negatives during patient diagnosis, prediction, or therapy/surgery evaluation. In our experimental results, we found that utilizing commonly used intensity-based data augmentation distorts the MRI scans and leads to texture information loss, thus negatively affecting the overall performance of classification. Additionally, we observed that commonly used data augmentation methods cannot be used with a plug-and-play approach in medical imaging, and requires manual tuning and adjustment.

preprint2022arXiv

An Efficient Multi-Scale Fusion Network for 3D Organ at Risk (OAR) Segmentation

Accurate segmentation of organs-at-risks (OARs) is a precursor for optimizing radiation therapy planning. Existing deep learning-based multi-scale fusion architectures have demonstrated a tremendous capacity for 2D medical image segmentation. The key to their success is aggregating global context and maintaining high resolution representations. However, when translated into 3D segmentation problems, existing multi-scale fusion architectures might underperform due to their heavy computation overhead and substantial data diet. To address this issue, we propose a new OAR segmentation framework, called OARFocalFuseNet, which fuses multi-scale features and employs focal modulation for capturing global-local context across multiple scales. Each resolution stream is enriched with features from different resolution scales, and multi-scale information is aggregated to model diverse contextual ranges. As a result, feature representations are further boosted. The comprehensive comparisons in our experimental setup with OAR segmentation as well as multi-organ segmentation show that our proposed OARFocalFuseNet outperforms the recent state-of-the-art methods on publicly available OpenKBP datasets and Synapse multi-organ segmentation. Both of the proposed methods (3D-MSF and OARFocalFuseNet) showed promising performance in terms of standard evaluation metrics. Our best performing method (OARFocalFuseNet) obtained a dice coefficient of 0.7995 and hausdorff distance of 5.1435 on OpenKBP datasets and dice coefficient of 0.8137 on Synapse multi-organ segmentation dataset.

preprint2022arXiv

Automatic Polyp Segmentation with Multiple Kernel Dilated Convolution Network

The detection and removal of precancerous polyps through colonoscopy is the primary technique for the prevention of colorectal cancer worldwide. However, the miss rate of colorectal polyp varies significantly among the endoscopists. It is well known that a computer-aided diagnosis (CAD) system can assist endoscopists in detecting colon polyps and minimize the variation among endoscopists. In this study, we introduce a novel deep learning architecture, named MKDCNet, for automatic polyp segmentation robust to significant changes in polyp data distribution. MKDCNet is simply an encoder-decoder neural network that uses the pre-trained ResNet50 as the encoder and novel multiple kernel dilated convolution (MKDC) block that expands the field of view to learn more robust and heterogeneous representation. Extensive experiments on four publicly available polyp datasets and cell nuclei dataset show that the proposed MKDCNet outperforms the state-of-the-art methods when trained and tested on the same dataset as well when tested on unseen polyp datasets from different distributions. With rich results, we demonstrated the robustness of the proposed architecture. From an efficiency perspective, our algorithm can process at (approx 45) frames per second on RTX 3090 GPU. MKDCNet can be a strong benchmark for building real-time systems for clinical colonoscopies. The code of the proposed MKDCNet is available at https://github.com/nikhilroxtomar/MKDCNet.

preprint2022arXiv

Enhancing Organ at Risk Segmentation with Improved Deep Neural Networks

Organ at risk (OAR) segmentation is a crucial step for treatment planning and outcome determination in radiotherapy treatments of cancer patients. Several deep learning based segmentation algorithms have been developed in recent years, however, U-Net remains the de facto algorithm designed specifically for biomedical image segmentation and has spawned many variants with known weaknesses. In this study, our goal is to present simple architectural changes in U-Net to improve its accuracy and generalization properties. Unlike many other available studies evaluating their algorithms on single center data, we thoroughly evaluate several variations of U-Net as well as our proposed enhanced architecture on multiple data sets for an extensive and reliable study of the OAR segmentation problem. Our enhanced segmentation model includes (a)architectural changes in the loss function, (b)optimization framework, and (c)convolution type. Testing on three publicly available multi-object segmentation data sets, we achieved an average of 80% dice score compared to the baseline U-Net performance of 63%.

preprint2022arXiv

Multi-Contrast MRI Segmentation Trained on Synthetic Images

In our comprehensive experiments and evaluations, we show that it is possible to generate multiple contrast (even all synthetically) and use synthetically generated images to train an image segmentation engine. We showed promising segmentation results tested on real multi-contrast MRI scans when delineating muscle, fat, bone and bone marrow, all trained on synthetic images. Based on synthetic image training, our segmentation results were as high as 93.91\%, 94.11\%, 91.63\%, 95.33\%, for muscle, fat, bone, and bone marrow delineation, respectively. Results were not significantly different from the ones obtained when real images were used for segmentation training: 94.68\%, 94.67\%, 95.91\%, and 96.82\%, respectively.

preprint2022arXiv

Out of Distribution Detection, Generalization, and Robustness Triangle with Maximum Probability Theorem

Maximum Probability Framework, powered by Maximum Probability Theorem, is a recent theoretical development in artificial intelligence, aiming to formally define probabilistic models, guiding development of objective functions, and regularization of probabilistic models. MPT uses the probability distribution that the models assume on random variables to provide an upper bound on the probability of the model. We apply MPT to challenging out-of-distribution (OOD) detection problems in computer vision by incorporating MPT as a regularization scheme in the training of CNNs and their energy-based variants. We demonstrate the effectiveness of the proposed method on 1080 trained models, with varying hyperparameters, and conclude that the MPT-based regularization strategy stabilizes and improves the generalization and robustness of base models in addition to enhanced OOD performance on CIFAR10, CIFAR100, and MNIST datasets.

preprint2022arXiv

TGANet: Text-guided attention for improved polyp segmentation

Colonoscopy is a gold standard procedure but is highly operator-dependent. Automated polyp segmentation, a precancerous precursor, can minimize missed rates and timely treatment of colon cancer at an early stage. Even though there are deep learning methods developed for this task, variability in polyp size can impact model training, thereby limiting it to the size attribute of the majority of samples in the training dataset that may provide sub-optimal results to differently sized polyps. In this work, we exploit size-related and polyp number-related features in the form of text attention during training. We introduce an auxiliary classification task to weight the text-based embedding that allows network to learn additional feature representations that can distinctly adapt to differently sized polyps and can adapt to cases with multiple polyps. Our experimental results demonstrate that these added text embeddings improve the overall performance of the model compared to state-of-the-art segmentation methods. We explore four different datasets and provide insights for size-specific improvements. Our proposed text-guided attention network (TGANet) can generalize well to variable-sized polyps in different datasets.

preprint2022arXiv

Transformer based Generative Adversarial Network for Liver Segmentation

Automated liver segmentation from radiology scans (CT, MRI) can improve surgery and therapy planning and follow-up assessment in addition to conventional use for diagnosis and prognosis. Although convolutional neural networks (CNNs) have become the standard image segmentation tasks, more recently this has started to change towards Transformers based architectures because Transformers are taking advantage of capturing long range dependence modeling capability in signals, so called attention mechanism. In this study, we propose a new segmentation approach using a hybrid approach combining the Transformer(s) with the Generative Adversarial Network (GAN) approach. The premise behind this choice is that the self-attention mechanism of the Transformers allows the network to aggregate the high dimensional feature and provide global information modeling. This mechanism provides better segmentation performance compared with traditional methods. Furthermore, we encode this generator into the GAN based architecture so that the discriminator network in the GAN can classify the credibility of the generated segmentation masks compared with the real masks coming from human (expert) annotations. This allows us to extract the high dimensional topology information in the mask for biomedical image segmentation and provide more reliable segmentation results. Our model achieved a high dice coefficient of 0.9433, recall of 0.9515, and precision of 0.9376 and outperformed other Transformer based approaches.

preprint2022arXiv

TransResU-Net: Transformer based ResU-Net for Real-Time Colonoscopy Polyp Segmentation

Colorectal cancer (CRC) is one of the most common causes of cancer and cancer-related mortality worldwide. Performing colon cancer screening in a timely fashion is the key to early detection. Colonoscopy is the primary modality used to diagnose colon cancer. However, the miss rate of polyps, adenomas and advanced adenomas remains significantly high. Early detection of polyps at the precancerous stage can help reduce the mortality rate and the economic burden associated with colorectal cancer. Deep learning-based computer-aided diagnosis (CADx) system may help gastroenterologists to identify polyps that may otherwise be missed, thereby improving the polyp detection rate. Additionally, CADx system could prove to be a cost-effective system that improves long-term colorectal cancer prevention. In this study, we proposed a deep learning-based architecture for automatic polyp segmentation, called Transformer ResU-Net (TransResU-Net). Our proposed architecture is built upon residual blocks with ResNet-50 as the backbone and takes the advantage of transformer self-attention mechanism as well as dilated convolution(s). Our experimental results on two publicly available polyp segmentation benchmark datasets showed that TransResU-Net obtained a highly promising dice score and a real-time speed. With high efficacy in our performance metrics, we concluded that TransResU-Net could be a strong benchmark for building a real-time polyp detection system for the early diagnosis, treatment, and prevention of colorectal cancer. The source code of the proposed TransResU-Net is publicly available at https://github.com/nikhilroxtomar/TransResUNet.

preprint2022arXiv

Video Analytics in Elite Soccer: A Distributed Computing Perspective

Ubiquitous sensors and Internet of Things (IoT) technologies have revolutionized the sports industry, providing new methodologies for planning, effective coordination of training, and match analysis post game. New methods, including machine learning, image and video processing, have been developed for performance evaluation, allowing the analyst to track the performance of a player in real-time. Following FIFA's 2015 approval of electronics performance and tracking system during games, performance data of a single player or the entire team is allowed to be collected using GPS-based wearables. Data from practice sessions outside the sporting arena is being collected in greater numbers than ever before. Realizing the significance of data in professional soccer, this paper presents video analytics, examines recent state-of-the-art literature in elite soccer, and summarizes existing real-time video analytics algorithms. We also discuss real-time crowdsourcing of the obtained data, tactical and technical performance, distributed computing and its importance in video analytics and propose a future research perspective.

preprint2022arXiv

Video Capsule Endoscopy Classification using Focal Modulation Guided Convolutional Neural Network

Video capsule endoscopy is a hot topic in computer vision and medicine. Deep learning can have a positive impact on the future of video capsule endoscopy technology. It can improve the anomaly detection rate, reduce physicians' time for screening, and aid in real-world clinical analysis. CADx classification system for video capsule endoscopy has shown a great promise for further improvement. For example, detection of cancerous polyp and bleeding can lead to swift medical response and improve the survival rate of the patients. To this end, an automated CADx system must have high throughput and decent accuracy. In this paper, we propose FocalConvNet, a focal modulation network integrated with lightweight convolutional layers for the classification of small bowel anatomical landmarks and luminal findings. FocalConvNet leverages focal modulation to attain global context and allows global-local spatial interactions throughout the forward pass. Moreover, the convolutional block with its intrinsic inductive/learning bias and capacity to extract hierarchical features allows our FocalConvNet to achieve favourable results with high throughput. We compare our FocalConvNet with other SOTA on Kvasir-Capsule, a large-scale VCE dataset with 44,228 frames with 13 classes of different anomalies. Our proposed method achieves the weighted F1-score, recall and MCC} of 0.6734, 0.6373 and 0.2974, respectively outperforming other SOTA methodologies. Furthermore, we report the highest throughput of 148.02 images/second rate to establish the potential of FocalConvNet in a real-time clinical environment. The code of the proposed FocalConvNet is available at https://github.com/NoviceMAn-prog/FocalConvNet.

preprint2021arXiv

Deep Convolutional Neural Network based Classification of Alzheimer's Disease using MRI data

Alzheimer's disease (AD) is a progressive and incurable neurodegenerative disease which destroys brain cells and causes loss to patient's memory. An early detection can prevent the patient from further damage of the brain cells and hence avoid permanent memory loss. In past few years, various automatic tools and techniques have been proposed for diagnosis of AD. Several methods focus on fast, accurate and early detection of the disease to minimize the loss to patients mental health. Although machine learning and deep learning techniques have significantly improved medical imaging systems for AD by providing diagnostic performance close to human level. But the main problem faced during multi-class classification is the presence of highly correlated features in the brain structure. In this paper, we have proposed a smart and accurate way of diagnosing AD based on a two-dimensional deep convolutional neural network (2D-DCNN) using imbalanced three-dimensional MRI dataset. Experimental results on Alzheimer Disease Neuroimaging Initiative magnetic resonance imaging (MRI) dataset confirms that the proposed 2D-DCNN model is superior in terms of accuracy, efficiency, and robustness. The model classifies MRI into three categories: AD, mild cognitive impairment, and normal control: and has achieved 99.89% classification accuracy with imbalanced classes. The proposed model exhibits noticeable improvement in accuracy as compared to the state-fo-the-art methods.

preprint2021arXiv

Hierarchical 3D Feature Learning for Pancreas Segmentation

We propose a novel 3D fully convolutional deep network for automated pancreas segmentation from both MRI and CT scans. More specifically, the proposed model consists of a 3D encoder that learns to extract volume features at different scales; features taken at different points of the encoder hierarchy are then sent to multiple 3D decoders that individually predict intermediate segmentation maps. Finally, all segmentation maps are combined to obtain a unique detailed segmentation mask. We test our model on both CT and MRI imaging data: the publicly available NIH Pancreas-CT dataset (consisting of 82 contrast-enhanced CTs) and a private MRI dataset (consisting of 40 MRI scans). Experimental results show that our model outperforms existing methods on CT pancreas segmentation, obtaining an average Dice score of about 88%, and yields promising segmentation performance on a very challenging MRI data set (average Dice score is about 77%). Additional control experiments demonstrate that the achieved performance is due to the combination of our 3D fully-convolutional deep network and the hierarchical representation decoding, thus substantiating our architectural design.

preprint2020arXiv

Adipose Tissue Segmentation in Unlabeled Abdomen MRI using Cross Modality Domain Adaptation

Abdominal fat quantification is critical since multiple vital organs are located within this region. Although computed tomography (CT) is a highly sensitive modality to segment body fat, it involves ionizing radiations which makes magnetic resonance imaging (MRI) a preferable alternative for this purpose. Additionally, the superior soft tissue contrast in MRI could lead to more accurate results. Yet, it is highly labor intensive to segment fat in MRI scans. In this study, we propose an algorithm based on deep learning technique(s) to automatically quantify fat tissue from MR images through a cross modality adaptation. Our method does not require supervised labeling of MR scans, instead, we utilize a cycle generative adversarial network (C-GAN) to construct a pipeline that transforms the existing MR scans into their equivalent synthetic CT (s-CT) images where fat segmentation is relatively easier due to the descriptive nature of HU (hounsfield unit) in CT images. The fat segmentation results for MRI scans were evaluated by expert radiologist. Qualitative evaluation of our segmentation results shows average success score of 3.80/5 and 4.54/5 for visceral and subcutaneous fat segmentation in MR images.

preprint2020arXiv

Brain Tumor Survival Prediction using Radiomics Features

Surgery planning in patients diagnosed with brain tumor is dependent on their survival prognosis. A poor prognosis might demand for a more aggressive treatment and therapy plan, while a favorable prognosis might enable a less risky surgery plan. Thus, accurate survival prognosis is an important step in treatment planning. Recently, deep learning approaches have been used extensively for brain tumor segmentation followed by the use of deep features for prognosis. However, radiomics-based studies have shown more promise using engineered/hand-crafted features. In this paper, we propose a three-step approach for multi-class survival prognosis. In the first stage, we extract image slices corresponding to tumor regions from multiple magnetic resonance image modalities. We then extract radiomic features from these 2D slices. Finally, we train machine learning classifiers to perform the classification. We evaluate our proposed approach on the publicly available BraTS 2019 data and achieve an accuracy of 76.5% and precision of 74.3% using the random forest classifier, which to the best of our knowledge are the highest reported results yet. Further, we identify the most important features that contribute in improving the prediction.

preprint2020arXiv

Deep Learning for Musculoskeletal Image Analysis

The diagnosis, prognosis, and treatment of patients with musculoskeletal (MSK) disorders require radiology imaging (using computed tomography, magnetic resonance imaging(MRI), and ultrasound) and their precise analysis by expert radiologists. Radiology scans can also help assessment of metabolic health, aging, and diabetes. This study presents how machinelearning, specifically deep learning methods, can be used for rapidand accurate image analysis of MRI scans, an unmet clinicalneed in MSK radiology. As a challenging example, we focus on automatic analysis of knee images from MRI scans and study machine learning classification of various abnormalities including meniscus and anterior cruciate ligament tears. Using widely used convolutional neural network (CNN) based architectures, we comparatively evaluated the knee abnormality classification performances of different neural network architectures under limited imaging data regime and compared single and multi-view imaging when classifying the abnormalities. Promising results indicated the potential use of multi-view deep learning based classification of MSK abnormalities in routine clinical assessment.

preprint2020arXiv

Diagnosing Colorectal Polyps in the Wild with Capsule Networks

Colorectal cancer, largely arising from precursor lesions called polyps, remains one of the leading causes of cancer-related death worldwide. Current clinical standards require the resection and histopathological analysis of polyps due to test accuracy and sensitivity of optical biopsy methods falling substantially below recommended levels. In this study, we design a novel capsule network architecture (D-Caps) to improve the viability of optical biopsy of colorectal polyps. Our proposed method introduces several technical novelties including a novel capsule architecture with a capsule-average pooling (CAP) method to improve efficiency in large-scale image classification. We demonstrate improved results over the previous state-of-the-art convolutional neural network (CNN) approach by as much as 43%. This work provides an important benchmark on the new Mayo Polyp dataset, a significantly more challenging and larger dataset than previous polyp studies, with results stratified across all available categories, imaging devices and modalities, and focus modes to promote future direction into AI-driven colorectal cancer screening systems. Code is publicly available at https://github.com/lalonderodney/D-Caps .

preprint2020arXiv

Encoding Visual Attributes in Capsules for Explainable Medical Diagnoses

Convolutional neural network based systems have largely failed to be adopted in many high-risk application areas, including healthcare, military, security, transportation, finance, and legal, due to their highly uninterpretable "black-box" nature. Towards solving this deficiency, we teach a novel multi-task capsule network to improve the explainability of predictions by embodying the same high-level language used by human-experts. Our explainable capsule network, X-Caps, encodes high-level visual object attributes within the vectors of its capsules, then forms predictions based solely on these human-interpretable features. To encode attributes, X-Caps utilizes a new routing sigmoid function to independently route information from child capsules to parents. Further, to provide radiologists with an estimate of model confidence, we train our network on a distribution of expert labels, modeling inter-observer agreement and punishing over/under confidence during training, supervised by human-experts' agreement. X-Caps simultaneously learns attribute and malignancy scores from a multi-center dataset of over 1000 CT scans of lung cancer screening patients. We demonstrate a simple 2D capsule network can outperform a state-of-the-art deep dense dual-path 3D CNN at capturing visually-interpretable high-level attributes and malignancy prediction, while providing malignancy prediction scores approaching that of non-explainable 3D CNNs. To the best of our knowledge, this is the first study to investigate capsule networks for making predictions based on radiologist-level interpretable attributes and its applications to medical image diagnosis. Code is publicly available at https://github.com/lalonderodney/X-Caps .

preprint2020arXiv

The International Workshop on Osteoarthritis Imaging Knee MRI Segmentation Challenge: A Multi-Institute Evaluation and Analysis Framework on a Standardized Dataset

Purpose: To organize a knee MRI segmentation challenge for characterizing the semantic and clinical efficacy of automatic segmentation methods relevant for monitoring osteoarthritis progression. Methods: A dataset partition consisting of 3D knee MRI from 88 subjects at two timepoints with ground-truth articular (femoral, tibial, patellar) cartilage and meniscus segmentations was standardized. Challenge submissions and a majority-vote ensemble were evaluated using Dice score, average symmetric surface distance, volumetric overlap error, and coefficient of variation on a hold-out test set. Similarities in network segmentations were evaluated using pairwise Dice correlations. Articular cartilage thickness was computed per-scan and longitudinally. Correlation between thickness error and segmentation metrics was measured using Pearson's coefficient. Two empirical upper bounds for ensemble performance were computed using combinations of model outputs that consolidated true positives and true negatives. Results: Six teams (T1-T6) submitted entries for the challenge. No significant differences were observed across all segmentation metrics for all tissues (p=1.0) among the four top-performing networks (T2, T3, T4, T6). Dice correlations between network pairs were high (>0.85). Per-scan thickness errors were negligible among T1-T4 (p=0.99) and longitudinal changes showed minimal bias (<0.03mm). Low correlations (<0.41) were observed between segmentation metrics and thickness error. The majority-vote ensemble was comparable to top performing networks (p=1.0). Empirical upper bound performances were similar for both combinations (p=1.0). Conclusion: Diverse networks learned to segment the knee similarly where high segmentation accuracy did not correlate to cartilage thickness accuracy. Voting ensembles did not outperform individual networks but may help regularize individual models.

preprint2019arXiv

Instance-Level Microtubule Tracking

We propose a new method of instance-level microtubule (MT) tracking in time-lapse image series using recurrent attention. Our novel deep learning algorithm segments individual MTs at each frame. Segmentation results from successive frames are used to assign correspondences among MTs. This ultimately generates a distinct path trajectory for each MT through the frames. Based on these trajectories, we estimate MT velocities. To validate our proposed technique, we conduct experiments using real and simulated data. We use statistics derived from real time-lapse series of MT gliding assays to simulate realistic MT time-lapse image series in our simulated data. This dataset is employed as pre-training and hyperparameter optimization for our network before training on the real data. Our experimental results show that the proposed supervised learning algorithm improves the precision for MT instance velocity estimation drastically to 71.3% from the baseline result (29.3%). We also demonstrate how the inclusion of temporal information into our deep network can reduce the false negative rates from 67.8% (baseline) down to 28.7% (proposed). Our findings in this work are expected to help biologists characterize the spatial arrangement of MTs, specifically the effects of MT-MT interactions.

preprint2016arXiv

Characterization of Lung Nodule Malignancy using Hybrid Shape and Appearance Features

Computed tomography imaging is a standard modality for detecting and assessing lung cancer. In order to evaluate the malignancy of lung nodules, clinical practice often involves expert qualitative ratings on several criteria describing a nodule's appearance and shape. Translating these features for computer-aided diagnostics is challenging due to their subjective nature and the difficulties in gaining a complete description. In this paper, we propose a computerized approach to quantitatively evaluate both appearance distinctions and 3D surface variations. Nodule shape was modeled and parameterized using spherical harmonics, and appearance features were extracted using deep convolutional neural networks. Both sets of features were combined to estimate the nodule malignancy using a random forest classifier. The proposed algorithm was tested on the publicly available Lung Image Database Consortium dataset, achieving high accuracy. By providing lung nodule characterization, this method can provide a robust alternative reference opinion for lung cancer diagnosis.

preprint2015arXiv

Context Driven Label Fusion for segmentation of Subcutaneous and Visceral Fat in CT Volumes

Quantification of adipose tissue (fat) from computed tomography (CT) scans is conducted mostly through manual or semi-automated image segmentation algorithms with limited efficacy. In this work, we propose a completely unsupervised and automatic method to identify adipose tissue, and then separate Subcutaneous Adipose Tissue (SAT) from Visceral Adipose Tissue (VAT) at the abdominal region. We offer a three-phase pipeline consisting of (1) Initial boundary estimation using gradient points, (2) boundary refinement using Geometric Median Absolute Deviation and Appearance based Local Outlier Scores (3) Context driven label fusion using Conditional Random Fields (CRF) to obtain the final boundary between SAT and VAT. We evaluate the proposed method on 151 abdominal CT scans and obtain state-of-the-art 94% and 91% dice similarity scores for SAT and VAT segmentation, as well as significant reduction in fat quantification error measure.

preprint2014arXiv

CIDI-Lung-Seg: A Single-Click Annotation Tool for Automatic Delineation of Lungs from CT Scans

Accurate and fast extraction of lung volumes from computed tomography (CT) scans remains in a great demand in the clinical environment because the available methods fail to provide a generic solution due to wide anatomical variations of lungs and existence of pathologies. Manual annotation, current gold standard, is time consuming and often subject to human bias. On the other hand, current state-of-the-art fully automated lung segmentation methods fail to make their way into the clinical practice due to their inability to efficiently incorporate human input for handling misclassifications and praxis. This paper presents a lung annotation tool for CT images that is interactive, efficient, and robust. The proposed annotation tool produces an "as accurate as possible" initial annotation based on the fuzzy-connectedness image segmentation, followed by efficient manual fixation of the initial extraction if deemed necessary by the practitioner. To provide maximum flexibility to the users, our annotation tool is supported in three major operating systems (Windows, Linux, and the Mac OS X). The quantitative results comparing our free software with commercially available lung segmentation tools show higher degree of consistency and precision of our software with a considerable potential to enhance the performance of routine clinical tasks.

preprint2014arXiv

Near-optimal Keypoint Sampling for Fast Pathological Lung Segmentation

Accurate delineation of pathological lungs from computed tomography (CT) images remains mostly unsolved because available methods fail to provide a reliable generic solution due to high variability of abnormality appearance. Local descriptor-based classification methods have shown to work well in annotating pathologies; however, these methods are usually computationally intensive which restricts their widespread use in real-time or near-real-time clinical applications. In this paper, we present a novel approach for fast, accurate, reliable segmentation of pathological lungs from CT scans by combining region-based segmentation method with local descriptor classification that is performed on an optimized sampling grid. Our method works in two stages; during stage one, we adapted the fuzzy connectedness (FC) image segmentation algorithm to perform initial lung parenchyma extraction. In the second stage, texture-based local descriptors are utilized to segment abnormal imaging patterns using a near optimal keypoint analysis by employing centroid of supervoxel as grid points. The quantitative results show that our pathological lung segmentation method is fast, robust, and improves on current standards and has potential to enhance the performance of routine clinical tasks.

preprint2014arXiv

Optimally Stabilized PET Image Denoising Using Trilateral Filtering

Low-resolution and signal-dependent noise distribution in positron emission tomography (PET) images makes denoising process an inevitable step prior to qualitative and quantitative image analysis tasks. Conventional PET denoising methods either over-smooth small-sized structures due to resolution limitation or make incorrect assumptions about the noise characteristics. Therefore, clinically important quantitative information may be corrupted. To address these challenges, we introduced a novel approach to remove signal-dependent noise in the PET images where the noise distribution was considered as Poisson-Gaussian mixed. Meanwhile, the generalized Anscombe's transformation (GAT) was used to stabilize varying nature of the PET noise. Other than noise stabilization, it is also desirable for the noise removal filter to preserve the boundaries of the structures while smoothing the noisy regions. Indeed, it is important to avoid significant loss of quantitative information such as standard uptake value (SUV)-based metrics as well as metabolic lesion volume. To satisfy all these properties, we extended bilateral filtering method into trilateral filtering through multiscaling and optimal Gaussianization process. The proposed method was tested on more than 50 PET-CT images from various patients having different cancers and achieved the superior performance compared to the widely used denoising techniques in the literature.

preprint2011arXiv

Learning Shape and Texture Characteristics of CT Tree-in-Bud Opacities for CAD Systems

Although radiologists can employ CAD systems to characterize malignancies, pulmonary fibrosis and other chronic diseases; the design of imaging techniques to quantify infectious diseases continue to lag behind. There exists a need to create more CAD systems capable of detecting and quantifying characteristic patterns often seen in respiratory tract infections such as influenza, bacterial pneumonia, or tuborculosis. One of such patterns is Tree-in-bud (TIB) which presents \textit{thickened} bronchial structures surrounding by clusters of \textit{micro-nodules}. Automatic detection of TIB patterns is a challenging task because of their weak boundary, noisy appearance, and small lesion size. In this paper, we present two novel methods for automatically detecting TIB patterns: (1) a fast localization of candidate patterns using information from local scale of the images, and (2) a Möbius invariant feature extraction method based on learned local shape and texture properties. A comparative evaluation of the proposed methods is presented with a dataset of 39 laboratory confirmed viral bronchiolitis human parainfluenza (HPIV) CTs and 21 normal lung CTs. Experimental results demonstrate that the proposed CAD system can achieve high detection rate with an overall accuracy of 90.96%.

preprint2010arXiv

Ball-Scale Based Hierarchical Multi-Object Recognition in 3D Medical Images

This paper investigates, using prior shape models and the concept of ball scale (b-scale), ways of automatically recognizing objects in 3D images without performing elaborate searches or optimization. That is, the goal is to place the model in a single shot close to the right pose (position, orientation, and scale) in a given image so that the model boundaries fall in the close vicinity of object boundaries in the image. This is achieved via the following set of key ideas: (a) A semi-automatic way of constructing a multi-object shape model assembly. (b) A novel strategy of encoding, via b-scale, the pose relationship between objects in the training images and their intensity patterns captured in b-scale images. (c) A hierarchical mechanism of positioning the model, in a one-shot way, in a given image from a knowledge of the learnt pose relationship and the b-scale image of the given image to be segmented. The evaluation results on a set of 20 routine clinical abdominal female and male CT data sets indicate the following: (1) Incorporating a large number of objects improves the recognition accuracy dramatically. (2) The recognition algorithm can be thought as a hierarchical framework such that quick replacement of the model assembly is defined as coarse recognition and delineation itself is known as finest recognition. (3) Scale yields useful information about the relationship between the model assembly and any given image such that the recognition results in a placement of the model close to the actual pose without doing any elaborate searches or optimization. (4) Effective object recognition can make delineation most accurate.

preprint2010arXiv

The Influence of Intensity Standardization on Medical Image Registration

Acquisition-to-acquisition signal intensity variations (non-standardness) are inherent in MR images. Standardization is a post processing method for correcting inter-subject intensity variations through transforming all images from the given image gray scale into a standard gray scale wherein similar intensities achieve similar tissue meanings. The lack of a standard image intensity scale in MRI leads to many difficulties in tissue characterizability, image display, and analysis, including image segmentation. This phenomenon has been documented well; however, effects of standardization on medical image registration have not been studied yet. In this paper, we investigate the influence of intensity standardization in registration tasks with systematic and analytic evaluations involving clinical MR images. We conducted nearly 20,000 clinical MR image registration experiments and evaluated the quality of registrations both quantitatively and qualitatively. The evaluations show that intensity variations between images degrades the accuracy of registration performance. The results imply that the accuracy of image registration not only depends on spatial and geometric similarity but also on the similarity of the intensity values for the same tissues in different images.

Institution

Affiliation not imported yet

This author record came from a source that does not expose affiliation metadata. Once the author claims the profile or we enrich the record from another provider, this section will link to the concrete institution.

Topic footprint

Fields this researcher appears in

Computer Vision eess.IV Machine Learning Quantitative Methods Computation and Language Multimedia physics.med-ph

Source provenance

Where this author record came from

arxivconfidence 95%

external id: arxiv:2512.24948:author:6:ulas-bagci

Imported May 21, 2026Synced May 21, 2026

arxivconfidence 95%

external id: arxiv:2605.16431:author:11:ulas-bagci

Imported May 20, 2026Synced May 20, 2026

8 works

Debesh Jha

Researcher

Debesh Jha contributes to research discovery and scholarly infrastructure.

Open to collaborate

5 works

Daniel J. Mollura

Researcher

Daniel J. Mollura contributes to research discovery and scholarly infrastructure.

Open to collaborate

4 works

Elif Keles

Researcher

Elif Keles contributes to research discovery and scholarly infrastructure.

Open to collaborate

4 works

Nikhil Kumar Tomar

Researcher

Nikhil Kumar Tomar contributes to research discovery and scholarly infrastructure.

Open to collaborate

Ulas Bagci

What is connected

Connect this record

See the researcher in context

Building this map preview

33 published item(s)

CT-DegradBench: A Physics-Informed Benchmark for CT Degradation Detection and Severity Estimation

CTest-Metric: A Unified Framework to Assess Clinical Validity of Metrics for CT Report Generation

Ensemble Models for Predicting Treatment Response in Pediatric Low-Grade Glioma Managed with Chemotherapy

Improved Segmentation of Polyps and Visual Explainability Analysis

ProDM: Synthetic Reality-driven Property-aware Progressive Diffusion Model for Coronary Calcium Motion Correction in Non-gated Chest CT

A Critical Appraisal of Data Augmentation Methods for Imaging-Based Medical Diagnosis Applications

An Efficient Multi-Scale Fusion Network for 3D Organ at Risk (OAR) Segmentation

Automatic Polyp Segmentation with Multiple Kernel Dilated Convolution Network

Enhancing Organ at Risk Segmentation with Improved Deep Neural Networks

Multi-Contrast MRI Segmentation Trained on Synthetic Images

Out of Distribution Detection, Generalization, and Robustness Triangle with Maximum Probability Theorem

TGANet: Text-guided attention for improved polyp segmentation

Transformer based Generative Adversarial Network for Liver Segmentation

TransResU-Net: Transformer based ResU-Net for Real-Time Colonoscopy Polyp Segmentation

Video Analytics in Elite Soccer: A Distributed Computing Perspective

Video Capsule Endoscopy Classification using Focal Modulation Guided Convolutional Neural Network

Deep Convolutional Neural Network based Classification of Alzheimer's Disease using MRI data

Hierarchical 3D Feature Learning for Pancreas Segmentation

Adipose Tissue Segmentation in Unlabeled Abdomen MRI using Cross Modality Domain Adaptation

Brain Tumor Survival Prediction using Radiomics Features

Deep Learning for Musculoskeletal Image Analysis

Diagnosing Colorectal Polyps in the Wild with Capsule Networks

Encoding Visual Attributes in Capsules for Explainable Medical Diagnoses

The International Workshop on Osteoarthritis Imaging Knee MRI Segmentation Challenge: A Multi-Institute Evaluation and Analysis Framework on a Standardized Dataset

Instance-Level Microtubule Tracking

Characterization of Lung Nodule Malignancy using Hybrid Shape and Appearance Features

Context Driven Label Fusion for segmentation of Subcutaneous and Visceral Fat in CT Volumes

CIDI-Lung-Seg: A Single-Click Annotation Tool for Automatic Delineation of Lungs from CT Scans

Near-optimal Keypoint Sampling for Fast Pathological Lung Segmentation

Optimally Stabilized PET Image Denoising Using Trilateral Filtering

Learning Shape and Texture Characteristics of CT Tree-in-Bud Opacities for CAD Systems

Ball-Scale Based Hierarchical Multi-Object Recognition in 3D Medical Images

The Influence of Intensity Standardization on Medical Image Registration