Researcher profile

Dinggang Shen

Dinggang Shen contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
35 works
0 followers
8 topics
4 close collaborators


Published work

35 published item(s)

preprint, 2024, arXiv

Predicting Infant Brain Connectivity with Federated Multi-Trajectory GNNs using Scarce Data

The understanding of the convoluted evolution of infant brain networks during the first postnatal year is pivotal for identifying the dynamics of early brain connectivity development. Existing deep learning solutions suffer from three major limitations. First, they cannot generalize to multi-trajectory prediction tasks, where each graph trajectory corresponds to a particular imaging modality or connectivity type (e.g., T1-w MRI). Second, existing models require extensive training datasets to achieve satisfactory performance which are often challenging to obtain. Third, they do not efficiently utilize incomplete time series data. To address these limitations, we introduce FedGmTE-Net++, a federated graph-based multi-trajectory evolution network. Using the power of federation, we aggregate local learnings among diverse hospitals with limited datasets. As a result, we enhance the performance of each hospital's local generative model, while preserving data privacy. The three key innovations of FedGmTE-Net++ are: (i) presenting the first federated learning framework specifically designed for brain multi-trajectory evolution prediction in a data-scarce environment, (ii) incorporating an auxiliary regularizer in the local objective function to exploit all the longitudinal brain connectivity within the evolution trajectory and maximize data utilization, (iii) introducing a two-step imputation process, comprising a preliminary KNN-based precompletion followed by an imputation refinement step that employs regressors to improve similarity scores and refine imputations. Our comprehensive experimental results showed the outperformance of FedGmTE-Net++ in brain multi-trajectory prediction from a single baseline graph in comparison with benchmark methods.
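
The abstract describes aggregating local learnings across hospitals but does not state the aggregation rule. As a minimal sketch, assuming plain sample-weighted federated averaging (FedAvg), which may differ from what FedGmTE-Net++ actually does, the server-side step could look like this. The function name `federated_average` and the toy parameter dictionaries are hypothetical.

```python
from typing import Dict, List, Sequence


def federated_average(
    local_weights: Sequence[Dict[str, List[float]]],
    sample_counts: Sequence[int],
) -> Dict[str, List[float]]:
    """Average per-hospital parameters, weighted by local dataset size.

    Assumption: every hospital reports the same parameter names and shapes.
    Only parameters are shared here, never the underlying scans.
    """
    if not local_weights or len(local_weights) != len(sample_counts):
        raise ValueError("need one sample count per hospital")
    total = sum(sample_counts)
    if total <= 0:
        raise ValueError("sample counts must sum to a positive number")

    averaged: Dict[str, List[float]] = {}
    for name, reference in local_weights[0].items():
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(local_weights, sample_counts)) / total
            for i in range(len(reference))
        ]
    return averaged


# Two hypothetical hospitals with 1 and 3 training subjects.
hospital_a = {"layer": [1.0, 2.0]}
hospital_b = {"layer": [3.0, 6.0]}
global_weights = federated_average([hospital_a, hospital_b], [1, 3])
print(global_weights)  # {'layer': [2.5, 5.0]}
```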

preprint, 2022, arXiv

A Long Short-term Memory Based Recurrent Neural Network for Interventional MRI Reconstruction

Interventional magnetic resonance imaging (i-MRI) for surgical guidance could help visualize the interventional process such as deep brain stimulation (DBS), improving the surgery performance and patient outcome. Different from retrospective reconstruction in conventional dynamic imaging, i-MRI for DBS has to acquire and reconstruct the interventional images sequentially online. Here we proposed a convolutional long short-term memory (Conv-LSTM) based recurrent neural network (RNN), or ConvLR, to reconstruct interventional images with golden-angle radial sampling. By using an initializer and Conv-LSTM blocks, the priors from the pre-operative reference image and intra-operative frames were exploited for reconstructing the current frame. Data consistency for radial sampling was implemented by a soft-projection method. To improve the reconstruction accuracy, an adversarial learning strategy was adopted. A set of interventional images based on the pre-operative and post-operative MR images were simulated for algorithm validation. Results showed that, with only 10 radial spokes, ConvLR provided the best performance compared with state-of-the-art methods, giving an acceleration of up to 40-fold. The proposed algorithm has the potential to achieve real-time i-MRI for DBS and can be used for general-purpose MR-guided intervention.

preprint, 2022, arXiv

A Novel Unified Conditional Score-based Generative Framework for Multi-modal Medical Image Completion

Multi-modal medical image completion has been extensively applied to alleviate the missing modality issue in a wealth of multi-modal diagnostic tasks. However, for most existing synthesis methods, their inferences of missing modalities can collapse into a deterministic mapping from the available ones, ignoring the uncertainties inherent in the cross-modal relationships. Here, we propose the Unified Multi-Modal Conditional Score-based Generative Model (UMM-CSGM) to take advantage of Score-based Generative Model (SGM) in modeling and stochastically sampling a target probability distribution, and further extend SGM to cross-modal conditional synthesis for various missing-modality configurations in a unified framework. Specifically, UMM-CSGM employs a novel multi-in multi-out Conditional Score Network (mm-CSN) to learn a comprehensive set of cross-modal conditional distributions via conditional diffusion and reverse generation in the complete modality space. In this way, the generation process can be accurately conditioned by all available information, and can fit all possible configurations of missing modalities in a single network. Experiments on BraTS19 dataset show that the UMM-CSGM can more reliably synthesize the heterogeneous enhancement and irregular area in tumor-induced lesions for any missing modalities.

preprint, 2022, arXiv

Eye-gaze-guided Vision Transformer for Rectifying Shortcut Learning

Learning harmful shortcuts such as spurious correlations and biases prevents deep neural networks from learning the meaningful and useful representations, thus jeopardizing the generalizability and interpretability of the learned representation. The situation becomes even more serious in medical imaging, where the clinical data (e.g., MR images with pathology) are limited and scarce while the reliability, generalizability and transparency of the learned model are highly required. To address this problem, we propose to infuse human experts' intelligence and domain knowledge into the training of deep neural networks. The core idea is that we infuse the visual attention information from expert radiologists to proactively guide the deep model to focus on regions with potential pathology and avoid being trapped in learning harmful shortcuts. To do so, we propose a novel eye-gaze-guided vision transformer (EG-ViT) for diagnosis with limited medical image data. We mask the input image patches that are out of the radiologists' interest and add an additional residual connection in the last encoder layer of EG-ViT to maintain the correlations of all patches. The experiments on two public datasets of INbreast and SIIM-ACR demonstrate our EG-ViT model can effectively learn/transfer experts' domain knowledge and achieve much better performance than baselines. Meanwhile, it successfully rectifies the harmful shortcut learning and significantly improves the EG-ViT model's interpretability. In general, EG-ViT takes the advantages of both human expert's prior knowledge and the power of deep neural networks. This work opens new avenues for advancing current artificial intelligence paradigms by infusing human intelligence.

preprint, 2022, arXiv

Follow My Eye: Using Gaze to Supervise Computer-Aided Diagnosis

When deep neural network (DNN) was first introduced to the medical image analysis community, researchers were impressed by its performance. However, it is evident now that a large number of manually labeled data is often a must to train a properly functioning DNN. This demand for supervision data and labels is a major bottleneck in current medical image analysis, since collecting a large number of annotations from experienced experts can be time-consuming and expensive. In this paper, we demonstrate that the eye movement of radiologists reading medical images can be a new form of supervision to train the DNN-based computer-aided diagnosis (CAD) system. Particularly, we record the tracks of the radiologists' gaze when they are reading images. The gaze information is processed and then used to supervise the DNN's attention via an Attention Consistency module. To the best of our knowledge, the above pipeline is among the earliest efforts to leverage expert eye movement for deep-learning-based CAD. We have conducted extensive experiments on knee X-ray images for osteoarthritis assessment. The results show that our method can achieve considerable improvement in diagnosis performance, with the help of gaze supervision.

preprint, 2022, arXiv

Image Synthesis with Disentangled Attributes for Chest X-Ray Nodule Augmentation and Detection

Lung nodule detection in chest X-ray (CXR) images is common to early screening of lung cancers. Deep-learning-based Computer-Assisted Diagnosis (CAD) systems can support radiologists for nodule screening in CXR. However, it requires large-scale and diverse medical data with high-quality annotations to train such robust and accurate CADs. To alleviate the limited availability of such datasets, lung nodule synthesis methods are proposed for the sake of data augmentation. Nevertheless, previous methods lack the ability to generate nodules that are realistic with the size attribute desired by the detector. To address this issue, we introduce a novel lung nodule synthesis framework in this paper, which decomposes nodule attributes into three main aspects including shape, size, and texture, respectively. A GAN-based Shape Generator firstly models nodule shapes by generating diverse shape masks. The following Size Modulation then enables quantitative control on the diameters of the generated nodule shapes in pixel-level granularity. A coarse-to-fine gated convolutional Texture Generator finally synthesizes visually plausible nodule textures conditioned on the modulated shape masks. Moreover, we propose to synthesize nodule CXR images by controlling the disentangled nodule attributes for data augmentation, in order to better compensate for the nodules that are easily missed in the detection task. Our experiments demonstrate the enhanced image quality, diversity, and controllability of the proposed lung nodule synthesis framework. We also validate the effectiveness of our data augmentation on greatly improving nodule detection performance.

preprint, 2022, arXiv

Knee Cartilage Defect Assessment by Graph Representation and Surface Convolution

Knee osteoarthritis (OA) is the most common osteoarthritis and a leading cause of disability. Cartilage defects are regarded as major manifestations of knee OA, which are visible by magnetic resonance imaging (MRI). Thus, early detection and assessment of knee cartilage defects are important for protecting patients from knee OA. Accordingly, many attempts have been made on knee cartilage defect assessment by applying convolutional neural networks (CNNs) to knee MRI. However, the physiologic characteristics of the cartilage may hinder such efforts: the cartilage is a thin curved layer, implying that only a small portion of voxels in knee MRI can contribute to the cartilage defect assessment; heterogeneous scanning protocols further challenge the feasibility of the CNNs in clinical practice; the CNN-based knee cartilage evaluation results lack interpretability. To address these challenges, we model the cartilage's structure and appearance from knee MRI into a graph representation, which is capable of handling highly diverse clinical data. Then, guided by the cartilage graph representation, we design a non-Euclidean deep learning network with the self-attention mechanism, to extract cartilage features at both local and global scales, and to derive the final assessment with a visualized result. Our comprehensive experiments show that the proposed method yields superior performance in knee cartilage defect assessment, plus its convenient 3D visualization for interpretability.

preprint, 2022, arXiv

Learning towards Synchronous Network Memorizability and Generalizability for Continual Segmentation across Multiple Sites

In clinical practice, a segmentation network is often required to continually learn on a sequential data stream from multiple sites rather than a consolidated set, due to the storage cost and privacy restriction. However, during the continual learning process, existing methods are usually restricted in either network memorizability on previous sites or generalizability on unseen sites. This paper aims to tackle the challenging problem of Synchronous Memorizability and Generalizability (SMG) and to simultaneously improve performance on both previous and unseen sites, with a novel proposed SMG-learning framework. First, we propose a Synchronous Gradient Alignment (SGA) objective, which not only promotes the network memorizability by enforcing coordinated optimization for a small exemplar set from previous sites (called replay buffer), but also enhances the generalizability by facilitating site-invariance under simulated domain shift. Second, to simplify the optimization of SGA objective, we design a Dual-Meta algorithm that approximates the SGA objective as dual meta-objectives for optimization without expensive computation overhead. Third, for efficient rehearsal, we configure the replay buffer comprehensively considering additional inter-site diversity to reduce redundancy. Experiments on prostate MRI data sequentially acquired from six institutes demonstrate that our method can simultaneously achieve higher memorizability and generalizability over state-of-the-art methods. Code is available at https://github.com/jingyzhang/SMG-Learning.

preprint, 2022, arXiv

Multi-Modal MRI Reconstruction Assisted with Spatial Alignment Network

In clinical practice, multi-modal magnetic resonance imaging (MRI) with different contrasts is usually acquired in a single study to assess different properties of the same region of interest in the human body. The whole acquisition process can be accelerated by having one or more modalities under-sampled in the k-space. Recent research has shown that, considering the redundancy between different modalities, a target MRI modality under-sampled in the k-space can be more efficiently reconstructed with a fully-sampled reference MRI modality. However, we find that the performance of the aforementioned multi-modal reconstruction can be negatively affected by subtle spatial misalignment between different modalities, which is actually common in clinical practice. In this paper, we improve the quality of multi-modal reconstruction by compensating for such spatial misalignment with a spatial alignment network. First, our spatial alignment network estimates the displacement between the fully-sampled reference and the under-sampled target images, and warps the reference image accordingly. Then, the aligned fully-sampled reference image joins the multi-modal reconstruction of the under-sampled target image. Also, considering the contrast difference between the target and reference images, we have designed a cross-modality-synthesis-based registration loss in combination with the reconstruction loss, to jointly train the spatial alignment network and the reconstruction network. The experiments on both clinical MRI and multi-coil k-space raw data demonstrate the superiority and robustness of the multi-modal MRI reconstruction empowered with our spatial alignment network. Our code is publicly available at https://github.com/woxuankai/SpatialAlignmentNetwork.

preprint, 2022, arXiv

Rectify ViT Shortcut Learning by Visual Saliency

Shortcut learning is common but harmful to deep learning models, leading to degenerated feature representations and consequently jeopardizing the model's generalizability and interpretability. However, shortcut learning in the widely used Vision Transformer framework is largely unknown. Meanwhile, introducing domain-specific knowledge is a major approach to rectifying the shortcuts, which are dominated by background-related factors. For example, in the medical imaging field, eye-gaze data from radiologists is an effective human visual prior knowledge that has great potential to guide deep learning models to focus on meaningful foreground regions of interest. However, obtaining eye-gaze data is time-consuming, labor-intensive and sometimes not even practical. In this work, we propose a novel and effective saliency-guided vision transformer (SGT) model to rectify shortcut learning in ViT in the absence of eye-gaze data. Specifically, a computational visual saliency model is adopted to predict saliency maps for input image samples. Then, the saliency maps are used to distil the most informative image patches. In the proposed SGT, the self-attention among image patches focuses only on the distilled informative ones. Considering that this distill operation may lead to a loss of global information, we further introduce, in the last encoder layer, a residual connection that captures the self-attention across all the image patches. The experiment results on four independent public datasets show that our SGT framework can effectively learn and leverage human prior knowledge without eye-gaze data and achieves much better performance than baselines. Meanwhile, it successfully rectifies the harmful shortcut learning and significantly improves the interpretability of the ViT model, demonstrating the promise of transferring visual saliency derived from human prior knowledge in rectifying shortcut learning.
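
The abstract says saliency maps are used to distil the most informative image patches, without giving the selection rule. A minimal sketch, assuming a simple top-k choice over one saliency score per patch, is shown here. The function name `select_salient_patches`, the `keep` parameter and the scores are hypothetical, and the SGT model itself may select patches differently.

```python
from typing import List, Sequence


def select_salient_patches(patch_saliency: Sequence[float], keep: int) -> List[int]:
    """Return the indices of the `keep` highest-saliency patches, in image order.

    Assumption: `patch_saliency` holds one score per patch, for example the
    mean of a predicted saliency map over that patch.
    """
    if not 0 < keep <= len(patch_saliency):
        raise ValueError("keep must be between 1 and the number of patches")
    ranked = sorted(
        range(len(patch_saliency)),
        key=lambda i: patch_saliency[i],
        reverse=True,
    )
    # Restore the original patch order so positional embeddings stay aligned.
    return sorted(ranked[:keep])


scores = [0.1, 0.8, 0.3, 0.9, 0.05, 0.6]
kept = select_salient_patches(scores, keep=3)
print(kept)  # [1, 3, 5]
```

Self-attention would then be restricted to the returned indices in all but the last encoder layer, where the abstract describes a residual connection over every patch.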

preprint, 2022, arXiv

Transformers in Medical Image Analysis: A Review

Transformers have dominated the field of natural language processing, and recently impacted the computer vision area. In the field of medical image analysis, Transformers have also been successfully applied to full-stack clinical applications, including image synthesis/reconstruction, registration, segmentation, detection, and diagnosis. Our paper aims to promote awareness and application of Transformers in the field of medical image analysis. Specifically, we first overview the core concepts of the attention mechanism built into Transformers and other basic components. Second, we review various Transformer architectures tailored for medical image applications and discuss their limitations. Within this review, we investigate key challenges revolving around the use of Transformers in different learning paradigms, improving the model efficiency, and their coupling with other techniques. We hope this review can give a comprehensive picture of Transformers to the readers in the field of medical image analysis.

preprint, 2022, arXiv

Two-Stage Mesh Deep Learning for Automated Tooth Segmentation and Landmark Localization on 3D Intraoral Scans

Accurately segmenting teeth and identifying the corresponding anatomical landmarks on dental mesh models are essential in computer-aided orthodontic treatment. Manually performing these two tasks is time-consuming, tedious, and, more importantly, highly dependent on orthodontists' experiences due to the abnormality and large-scale variance of patients' teeth. Some machine learning-based methods have been designed and applied in the orthodontic field to automatically segment dental meshes (e.g., intraoral scans). In contrast, the number of studies on tooth landmark localization is still limited. This paper proposes a two-stage framework based on mesh deep learning (called TS-MDL) for joint tooth labeling and landmark identification on raw intraoral scans. Our TS-MDL first adopts an end-to-end iMeshSegNet method (i.e., a variant of the existing MeshSegNet with both improved accuracy and efficiency) to label each tooth on the downsampled scan. Guided by the segmentation outputs, our TS-MDL further selects each tooth's region of interest (ROI) on the original mesh to construct a light-weight variant of the pioneering PointNet (i.e., PointNet-Reg) for regressing the corresponding landmark heatmaps. Our TS-MDL was evaluated on a real-clinical dataset, showing promising segmentation and localization performance. Specifically, iMeshSegNet in the first stage of TS-MDL reached an averaged Dice similarity coefficient (DSC) at 0.964 ± 0.054, significantly outperforming the original MeshSegNet. In the second stage, PointNet-Reg achieved a mean absolute error (MAE) of 0.597 ± 0.761 mm in distances between the prediction and ground truth for 66 landmarks, which is superior compared with other networks for landmark detection. All these results suggest the potential usage of our TS-MDL in orthodontics.

preprint, 2022, arXiv

Two-Stream Graph Convolutional Network for Intra-oral Scanner Image Segmentation

Precise segmentation of teeth from intra-oral scanner images is an essential task in computer-aided orthodontic surgical planning. The state-of-the-art deep learning-based methods often simply concatenate the raw geometric attributes (i.e., coordinates and normal vectors) of mesh cells to train a single-stream network for automatic intra-oral scanner image segmentation. However, since different raw attributes reveal completely different geometric information, the naive concatenation of different raw attributes at the (low-level) input stage may bring unnecessary confusion in describing and differentiating between mesh cells, thus hampering the learning of high-level geometric representations for the segmentation task. To address this issue, we design a two-stream graph convolutional network (i.e., TSGCN), which can effectively handle inter-view confusion between different raw attributes to more effectively fuse their complementary information and learn discriminative multi-view geometric representations. Specifically, our TSGCN adopts two input-specific graph-learning streams to extract complementary high-level geometric representations from coordinates and normal vectors, respectively. Then, these single-view representations are further fused by a self-attention module to adaptively balance the contributions of different views in learning more discriminative multi-view representations for accurate and fully automatic tooth segmentation. We have evaluated our TSGCN on a real-patient dataset of dental (mesh) models acquired by 3D intraoral scanners. Experimental results show that our TSGCN significantly outperforms state-of-the-art methods in 3D tooth (surface) segmentation. Github: https://github.com/ZhangLingMing1/TSGCNet.

preprint, 2021, arXiv

A novel multiple instance learning framework for COVID-19 severity assessment via data augmentation and self-supervised learning

How to assess the severity level of COVID-19 quickly and accurately is an essential problem when millions of people are suffering from the pandemic around the world. Currently, the chest CT is regarded as a popular and informative imaging tool for COVID-19 diagnosis. However, we observe that there are two issues, weak annotation and insufficient data, that may obstruct automatic COVID-19 severity assessment with CT images. To address these challenges, we propose a novel three-component method, i.e., 1) a deep multiple instance learning component with instance-level attention to jointly classify the bag and also weigh the instances, 2) a bag-level data augmentation component to generate virtual bags by reorganizing high-confidence instances, and 3) a self-supervised pretext component to aid the learning process. We have systematically evaluated our method on the CT images of 229 COVID-19 cases, including 50 severe and 179 non-severe cases. Our method could obtain an average accuracy of 95.8%, with 93.6% sensitivity and 96.4% specificity, which outperformed previous works.

preprint, 2021, arXiv

MetricUNet: Synergistic Image- and Voxel-Level Learning for Precise CT Prostate Segmentation via Online Sampling

Fully convolutional networks (FCNs), including UNet and VNet, are widely-used network architectures for semantic segmentation in recent studies. However, conventional FCN is typically trained by the cross-entropy or Dice loss, which only calculates the error between predictions and ground-truth labels for pixels individually. This often results in non-smooth neighborhoods in the predicted segmentation. To address this problem, we propose a two-stage framework, with the first stage to quickly localize the prostate region and the second stage to precisely segment the prostate by a multi-task UNet architecture. We introduce a novel online metric learning module through voxel-wise sampling in the multi-task network. Therefore, the proposed network has a dual-branch architecture that tackles two tasks: 1) a segmentation sub-network aiming to generate the prostate segmentation, and 2) a voxel-metric learning sub-network aiming to improve the quality of the learned feature space supervised by a metric loss. Specifically, the voxel-metric learning sub-network samples tuples (including triplets and pairs) in voxel-level through the intermediate feature maps. Unlike conventional deep metric learning methods that generate triplets or pairs in image-level before the training phase, our proposed voxel-wise tuples are sampled in an online manner and operated in an end-to-end fashion via multi-task learning. To evaluate the proposed method, we implement extensive experiments on a real CT image dataset consisting of 339 patients. The ablation studies show that our method can effectively learn more representative voxel-level features compared with the conventional learning methods with cross-entropy or Dice loss. And the comparisons show that the proposed method outperforms the state-of-the-art methods by a reasonable margin.

preprint, 2021, arXiv

Population-Guided Large Margin Classifier for High-Dimension Low-Sample-Size Problems

Various applications in different fields, such as gene expression analysis or computer vision, suffer from data sets with high-dimensional low-sample-size (HDLSS), which has posed significant challenges for standard statistical and modern machine learning methods. In this paper, we propose a novel linear binary classifier, denoted by population-guided large margin classifier (PGLMC), which is applicable to any sorts of data, including HDLSS. PGLMC is conceived with a projecting direction w given by the comprehensive consideration of local structural information of the hyperplane and the statistics of the training samples. Our proposed model has several advantages compared to those widely used approaches. First, it is not sensitive to the intercept term b. Second, it operates well with imbalanced data. Third, it is relatively simple to be implemented based on Quadratic Programming. Fourth, it is robust to the model specification for various real applications. The theoretical properties of PGLMC are proven. We conduct a series of evaluations on two simulated and six real-world benchmark data sets, including DNA classification, digit recognition, medical image analysis, and face recognition. PGLMC outperforms the state-of-the-art classification methods in most cases, or at least obtains comparable results.

preprint, 2020, arXiv

Adaptive Feature Selection Guided Deep Forest for COVID-19 Classification with Chest CT

Chest computed tomography (CT) has become an effective tool to assist the diagnosis of coronavirus disease-19 (COVID-19). Due to the outbreak of COVID-19 worldwide, using the computer-aided diagnosis technique for COVID-19 classification based on CT images could largely alleviate the burden of clinicians. In this paper, we propose an Adaptive Feature Selection guided Deep Forest (AFS-DF) for COVID-19 classification based on chest CT images. Specifically, we first extract location-specific features from CT images. Then, in order to capture the high-level representation of these features with the relatively small-scale data, we leverage a deep forest model to learn high-level representation of the features. Moreover, we propose a feature selection method based on the trained deep forest model to reduce the redundancy of features, where the feature selection could be adaptively incorporated with the COVID-19 classification model. We evaluated our proposed AFS-DF on a COVID-19 dataset with 1495 patients of COVID-19 and 1027 patients of community acquired pneumonia (CAP). The accuracy (ACC), sensitivity (SEN), specificity (SPE) and AUC achieved by our method are 91.79%, 93.05%, 89.95% and 96.35%, respectively. Experimental results on the COVID-19 dataset suggest that the proposed AFS-DF achieves superior performance in COVID-19 vs. CAP classification, compared with 4 widely used machine learning methods.
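
For readers less familiar with the reported metrics, accuracy, sensitivity and specificity all follow from the four confusion-matrix counts. The sketch below uses made-up counts purely for illustration; they are not taken from the paper, and the function name `classification_metrics` is hypothetical.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute accuracy, sensitivity and specificity from confusion counts.

    tp/fn count the positive class (e.g. COVID-19), tn/fp the negative class
    (e.g. CAP).
    """
    positives = tp + fn
    negatives = tn + fp
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    return {
        "accuracy": (tp + tn) / (positives + negatives),
        "sensitivity": tp / positives,  # true positive rate
        "specificity": tn / negatives,  # true negative rate
    }


# Illustrative counts only (not the study's data).
metrics = classification_metrics(tp=90, fp=20, tn=80, fn=10)
print(metrics)
```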

preprint, 2020, arXiv

An Auto-Context Deformable Registration Network for Infant Brain MRI

Deformable image registration is fundamental to longitudinal and population analysis. Geometric alignment of the infant brain MR images is challenging, owing to rapid changes in image appearance in association with brain development. In this paper, we propose an infant-dedicated deep registration network that uses the auto-context strategy to gradually refine the deformation fields to obtain highly accurate correspondences. Instead of training multiple registration networks, our method estimates the deformation fields by invoking a single network multiple times for iterative deformation refinement. The final deformation field is obtained by the incremental composition of the deformation fields. Experimental results in comparison with state-of-the-art registration methods indicate that our method achieves higher accuracy while at the same time preserves the smoothness of the deformation fields. Our implementation is available online.
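
The incremental composition mentioned in the abstract can be illustrated in one dimension. This is a sketch under stated assumptions, not the authors' implementation: displacements are in grid units, a warped image is read as I(x + u(x)), sampling is linear with clamping at the borders, and the helper names `sample_linear` and `compose_displacements` are hypothetical. Under that convention, applying a refinement `update` after an existing field `current` gives u(x) = update(x) + current(x + update(x)).

```python
import math
from typing import List, Sequence


def sample_linear(field: Sequence[float], x: float) -> float:
    """Linearly interpolate `field` at position x, clamping to the grid."""
    x = min(max(x, 0.0), len(field) - 1.0)
    lo = int(math.floor(x))
    hi = min(lo + 1, len(field) - 1)
    frac = x - lo
    return (1.0 - frac) * field[lo] + frac * field[hi]


def compose_displacements(current: Sequence[float], update: Sequence[float]) -> List[float]:
    """Compose two 1D displacement fields defined on the same grid.

    Result: u(x) = update(x) + current(x + update(x)).
    """
    if len(current) != len(update):
        raise ValueError("fields must share the same grid")
    return [
        update[x] + sample_linear(current, x + update[x])
        for x in range(len(update))
    ]


current = [0.0, 1.0, 2.0, 3.0]
update = [0.5, 0.5, 0.5, 0.5]
total = compose_displacements(current, update)
print(total)  # [1.0, 2.0, 3.0, 3.5]
```

An auto-context loop would call the same network repeatedly and fold each new `update` into `total` in this way.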

preprint, 2020, arXiv

Automatic Data Augmentation via Deep Reinforcement Learning for Effective Kidney Tumor Segmentation

Conventional data augmentation realized by performing simple pre-processing operations (e.g., rotation, crop, etc.) has been validated for its advantage in enhancing the performance for medical image segmentation. However, the data generated by these conventional augmentation methods are random and sometimes harmful to the subsequent segmentation. In this paper, we developed a novel automatic learning-based data augmentation method for medical image segmentation which models the augmentation task as a trial-and-error procedure using deep reinforcement learning (DRL). In our method, we innovatively combine the data augmentation module and the subsequent segmentation module in an end-to-end training manner with a consistent loss. Specifically, the best sequential combination of different basic operations is automatically learned by directly maximizing the performance improvement (i.e., Dice ratio) on the available validation set. We extensively evaluated our method on CT kidney tumor segmentation which validated the promising results of our method.
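
The Dice ratio used as the reward signal is the standard overlap score 2|A ∩ B| / (|A| + |B|) between a predicted and a reference mask. A minimal sketch on flattened binary masks follows; the function name `dice_coefficient` and the convention of returning 1.0 for two empty masks are assumptions for illustration.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice overlap between two flattened binary masks of equal length."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of elements")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    size_sum = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if size_sum == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / size_sum


pred_mask = [1, 1, 0, 0]
true_mask = [1, 0, 1, 0]
score = dice_coefficient(pred_mask, true_mask)
print(score)  # 0.5
```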

preprint, 2020, arXiv

Co-evolution of Functional Brain Network at Multiple Scales during Early Infancy

The human brain is organized into hierarchically modular networks facilitating efficient and stable information processing and supporting diverse cognitive processes during the course of development. While the remarkable reconfiguration of functional brain network has been firmly established in early life, all these studies investigated the network development from a "single-scale" perspective, which ignores the richness engendered by its hierarchical nature. To fill this gap, this paper leveraged a longitudinal infant resting-state functional magnetic resonance imaging dataset from birth to 2 years of age, and proposed an advanced methodological framework to delineate the multi-scale reconfiguration of functional brain network during early development. Our proposed framework consists of two parts. The first part developed a novel two-step multi-scale module detection method that could uncover efficient and consistent modular structure for longitudinal datasets from multiple scales in a completely data-driven manner. The second part designed a systematic approach that applied the linear mixed-effect model to four global and nodal module-related metrics to delineate scale-specific age-related changes of network organization. By applying our proposed methodological framework on the collected longitudinal infant dataset, we provided the first evidence that, in the first 2 years of life, the brain functional network co-evolves at different scales, where each scale displays a unique reconfiguration pattern in terms of modular organization.

preprint2020arXiv

Deep Modeling of Growth Trajectories for Longitudinal Prediction of Missing Infant Cortical Surfaces

Charting cortical growth trajectories is of paramount importance for understanding brain development. However, such analysis necessitates the collection of longitudinal data, which can be challenging due to subject dropouts and failed scans. In this paper, we will introduce a method for longitudinal prediction of cortical surfaces using a spatial graph convolutional neural network (GCNN), which extends conventional CNNs from Euclidean to curved manifolds. The proposed method is designed to model the cortical growth trajectories and jointly predict inner and outer cortical surfaces at multiple time points. Adopting a binary flag in loss calculation to deal with missing data, we fully utilize all available cortical surfaces for training our deep learning model, without requiring a complete collection of longitudinal data. Predicting the surfaces directly allows cortical attributes such as cortical thickness, curvature, and convexity to be computed for subsequent analysis. We will demonstrate with experimental results that our method is capable of capturing the nonlinearity of spatiotemporal cortical growth patterns and can predict cortical surfaces with improved accuracy.
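The binary-flag idea can be sketched as a masked loss in which time points with no acquired surface contribute nothing to the training objective. The squared-error form, the list-of-lists data layout, and the name `masked_mse` are assumptions for illustration; the abstract does not state the actual loss.

```python
def masked_mse(pred, target, available):
    """Mean squared error over available time points only.

    pred, target: one list of floats per time point (e.g. vertex coordinates).
    available: one 0/1 flag per time point; 0 marks a missing scan.
    """
    total, count = 0.0, 0
    for p, t, flag in zip(pred, target, available):
        if not flag:
            continue  # missing time point: skipped entirely, no penalty
        total += sum((a - b) ** 2 for a, b in zip(p, t))
        count += len(p)
    # With no available time points there is nothing to penalize.
    return total / count if count else 0.0


pred = [[1.0, 2.0], [0.0, 0.0]]
target = [[1.0, 4.0], [9.0, 9.0]]  # second time point is a placeholder
loss = masked_mse(pred, target, available=[1, 0])
```

Because the second time point is flagged as missing, its placeholder target values never enter the loss, so subjects with incomplete longitudinal data can still be used for training.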

preprint2020arXiv

Dual-Sampling Attention Network for Diagnosis of COVID-19 from Community Acquired Pneumonia

The coronavirus disease (COVID-19) is rapidly spreading all over the world, and has infected more than 1,436,000 people in more than 200 countries and territories as of April 9, 2020. Detecting COVID-19 at an early stage is essential to deliver proper healthcare to the patients and also to protect the uninfected population. To this end, we develop a dual-sampling attention network to automatically diagnose COVID-19 from the community acquired pneumonia (CAP) in chest computed tomography (CT). In particular, we propose a novel online attention module with a 3D convolutional network (CNN) to focus on the infection regions in lungs when making decisions of diagnoses. Note that there exists an imbalanced distribution of the sizes of the infection regions between COVID-19 and CAP, partially due to fast progress of COVID-19 after symptom onset. Therefore, we develop a dual-sampling strategy to mitigate the imbalanced learning. Our method is evaluated (to our best knowledge) upon the largest multi-center CT data for COVID-19 from 8 hospitals. In the training-validation stage, we collect 2186 CT scans from 1588 patients for a 5-fold cross-validation. In the testing stage, we employ another independent large-scale testing dataset including 2796 CT scans from 2057 patients. Results show that our algorithm can identify the COVID-19 images with the area under the receiver operating characteristic curve (AUC) value of 0.944, accuracy of 87.5%, sensitivity of 86.9%, specificity of 90.1%, and F1-score of 82.0%. With this performance, the proposed algorithm could potentially aid radiologists with COVID-19 diagnosis from CAP, especially in the early stage of the COVID-19 outbreak.

preprint2020arXiv

HF-UNet: Learning Hierarchically Inter-Task Relevance in Multi-Task U-Net for Accurate Prostate Segmentation

Accurate segmentation of the prostate is a key step in external beam radiation therapy treatments. In this paper, we tackle the challenging task of prostate segmentation in CT images by a two-stage network with 1) the first stage to fast localize, and 2) the second stage to accurately segment the prostate. To precisely segment the prostate in the second stage, we formulate prostate segmentation into a multi-task learning framework, which includes a main task to segment the prostate, and an auxiliary task to delineate the prostate boundary. Here, the second task is applied to provide additional guidance of unclear prostate boundary in CT images. Besides, the conventional multi-task deep networks typically share most of the parameters (i.e., feature representations) across all tasks, which may limit their data fitting ability, as the specificities of different tasks are inevitably ignored. By contrast, we solve them by a hierarchically-fused U-Net structure, namely HF-UNet. The HF-UNet has two complementary branches for two tasks, with the novel proposed attention-based task consistency learning block to communicate at each level between the two decoding branches. Therefore, HF-UNet endows the ability to learn hierarchically the shared representations for different tasks, and preserve the specificities of learned representations for different tasks simultaneously. We did extensive evaluations of the proposed method on a large planning CT image dataset, including images acquired from 339 patients. The experimental results show HF-UNet outperforms the conventional multi-task network architectures and the state-of-the-art methods.

preprint2020arXiv

Hypergraph Learning for Identification of COVID-19 with CT Imaging

The coronavirus disease, named COVID-19, has become the largest global public health crisis since it started in early 2020. CT imaging has been used as a complementary tool to assist early screening, especially for the rapid identification of COVID-19 cases from community acquired pneumonia (CAP) cases. The main challenge in early screening is how to model the confusing cases in the COVID-19 and CAP groups, with very similar clinical manifestations and imaging features. To tackle this challenge, we propose an Uncertainty Vertex-weighted Hypergraph Learning (UVHL) method to identify COVID-19 from CAP using CT images. In particular, multiple types of features (including regional features and radiomics features) are first extracted from CT image for each case. Then, the relationship among different cases is formulated by a hypergraph structure, with each case represented as a vertex in the hypergraph. The uncertainty of each vertex is further computed with an uncertainty score measurement and used as a weight in the hypergraph. Finally, a learning process of the vertex-weighted hypergraph is used to predict whether a new testing case belongs to COVID-19 or not. Experiments on a large multi-center pneumonia dataset, consisting of 2,148 COVID-19 cases and 1,182 CAP cases from five hospitals, are conducted to evaluate the performance of the proposed method. Results demonstrate the effectiveness and robustness of our proposed method on the identification of COVID-19 in comparison to state-of-the-art methods.

preprint2020arXiv

Large-Scale Screening of COVID-19 from Community Acquired Pneumonia using Infection Size-Aware Classification

The worldwide spread of coronavirus disease (COVID-19) has become a threatening risk for global public health. It is of great importance to rapidly and accurately screen patients with COVID-19 from community acquired pneumonia (CAP). In this study, a total of 1658 patients with COVID-19 and 1027 patients with CAP underwent thin-section CT. All images were preprocessed to obtain the segmentations of both infections and lung fields, which were used to extract location-specific features. An infection Size Aware Random Forest method (iSARF) was proposed, in which subjects were automatically categorized into groups with different ranges of infected lesion sizes, followed by random forests in each group for classification. Experimental results show that the proposed method yielded sensitivity of 0.907, specificity of 0.833, and accuracy of 0.879 under five-fold cross-validation. Large performance margins against comparison methods were achieved especially for the cases with infection size in the medium range, from 0.01% to 10%. The further inclusion of radiomics features shows slight improvement. It is anticipated that our proposed framework could assist clinical decision making.
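The three reported figures can be cross-checked against the cohort sizes, since accuracy is the class-size-weighted average of sensitivity and specificity. The sketch below assumes that COVID-19 is the positive class and that the metrics are pooled over all 2685 patients; neither assumption is stated explicitly in the abstract, and `pooled_accuracy` is a name chosen for this example.

```python
def pooled_accuracy(sensitivity, specificity, n_pos, n_neg):
    """Accuracy implied by sensitivity/specificity and the class sizes."""
    true_pos = sensitivity * n_pos   # correctly identified positive cases
    true_neg = specificity * n_neg   # correctly identified negative cases
    return (true_pos + true_neg) / (n_pos + n_neg)


# 1658 COVID-19 patients (assumed positive) and 1027 CAP patients.
acc = pooled_accuracy(0.907, 0.833, n_pos=1658, n_neg=1027)
print(round(acc, 3))  # 0.879
```

Under these assumptions the implied accuracy matches the reported 0.879, so the three numbers are mutually consistent.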

preprint2020arXiv

Learning-based Computer-aided Prescription Model for Parkinson's Disease: A Data-driven Perspective

In this paper, we study a novel problem: "automatic prescription recommendation for PD patients." To realize this goal, we first build a dataset by collecting 1) symptoms of PD patients, and 2) their prescription drug provided by neurologists. Then, we build a novel computer-aided prescription model by learning the relation between observed symptoms and prescription drug. Finally, for the new coming patients, we could recommend (predict) suitable prescription drug on their observed symptoms by our prescription model. From the methodology part, our proposed model, namely Prescription viA Learning lAtent Symptoms (PALAS), could recommend prescription using the multi-modality representation of the data. In PALAS, a latent symptom space is learned to better model the relationship between symptoms and prescription drug, as there is a large semantic gap between them. Moreover, we present an efficient alternating optimization method for PALAS. We evaluated our method using the data collected from 136 PD patients at Nanjing Brain Hospital, which can be regarded as a large dataset in PD research community. The experimental results demonstrate the effectiveness and clinical potential of our method in this recommendation task, if compared with other competing methods.

preprint2020arXiv

Multifold Acceleration of Diffusion MRI via Slice-Interleaved Diffusion Encoding (SIDE)

Diffusion MRI (dMRI) is a unique imaging technique for in vivo characterization of tissue microstructure and white matter pathways. However, its relatively long acquisition time implies greater motion artifacts when imaging, for example, infants and Parkinson's disease patients. To accelerate dMRI acquisition, we propose in this paper (i) a diffusion encoding scheme, called Slice-Interleaved Diffusion Encoding (SIDE), that interleaves each diffusion-weighted (DW) image volume with slices that are encoded with different diffusion gradients, essentially allowing the slice-undersampling of image volume associated with each diffusion gradient to significantly reduce acquisition time, and (ii) a method based on deep learning for effective reconstruction of DW images from the highly slice-undersampled data. Evaluation based on the Human Connectome Project (HCP) dataset indicates that our method can achieve a high acceleration factor of up to 6 with minimal information loss. Evaluation using dMRI data acquired with SIDE acquisition demonstrates that it is possible to accelerate the acquisition by as much as 50 folds when combined with multi-band imaging.

preprint2020arXiv

Non-local U-Net for Biomedical Image Segmentation

Deep learning has shown its great promise in various biomedical image segmentation tasks. Existing models are typically based on U-Net and rely on an encoder-decoder architecture with stacked local operators to aggregate long-range information gradually. However, only using the local operators limits the efficiency and effectiveness. In this work, we propose the non-local U-Nets, which are equipped with flexible global aggregation blocks, for biomedical image segmentation. These blocks can be inserted into U-Net as size-preserving processes, as well as down-sampling and up-sampling layers. We perform thorough experiments on the 3D multimodality isointense infant brain MR image segmentation task to evaluate the non-local U-Nets. Results show that our proposed models achieve top performances with fewer parameters and faster computation.

preprint2020arXiv

Probing Tissue Microarchitecture of the Baby Brain via Spherical Mean Spectrum Imaging

During the first years of life, the human brain undergoes dynamic spatially-heterogeneous changes, involving differentiation of neuronal types, dendritic arborization, axonal ingrowth, outgrowth and retraction, synaptogenesis, and myelination. To better quantify these changes, this article presents a method for probing tissue microarchitecture by characterizing water diffusion in a spectrum of length scales, factoring out the effects of intra-voxel orientation heterogeneity. Our method is based on the spherical means of the diffusion signal, computed over gradient directions for a fixed set of diffusion weightings (i.e., b-values). We decompose the spherical mean series at each voxel into a spherical mean spectrum (SMS), which essentially encodes the fractions of spin packets undergoing fine- to coarse-scale diffusion processes, characterizing hindered and restricted diffusion stemming respectively from extra- and intra-neurite water compartments. From the SMS, multiple orientation distribution invariant indices can be computed, allowing for example the quantification of neurite density, microscopic fractional anisotropy ($μ$FA), per-axon axial/radial diffusivity, and free/restricted isotropic diffusivity. We show maps of these indices for baby brains, demonstrating that microscopic tissue features can be extracted from the developing brain for greater sensitivity and specificity to development related changes. Also, we demonstrate that our method, called spherical mean spectrum imaging (SMSI), is fast, accurate, and can overcome the biases associated with other state-of-the-art microstructure models.
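The first step described here, taking the spherical mean of the diffusion signal over gradient directions at each b-value, can be sketched as a simple per-shell average. This assumes the gradient directions within each shell are roughly uniformly distributed on the sphere, so that an unweighted mean is a reasonable estimate; the function name and data layout are hypothetical, and the later decomposition into a spherical mean spectrum is not attempted.

```python
from collections import defaultdict


def spherical_means(signals, bvalues):
    """Average the diffusion signal of one voxel over directions, per b-value.

    signals: one measurement per diffusion-weighted volume.
    bvalues: the b-value (diffusion weighting) of each measurement.
    """
    shells = defaultdict(list)
    for signal, b in zip(signals, bvalues):
        shells[b].append(signal)
    # One mean per shell, ordered by increasing b-value.
    return {b: sum(vals) / len(vals) for b, vals in sorted(shells.items())}


signals = [1.0, 0.5, 0.7, 0.3, 0.1]
bvalues = [0, 1000, 1000, 2000, 2000]
means = spherical_means(signals, bvalues)
```

Averaging over directions removes the dependence on fiber orientation within the voxel, which is what lets the resulting series be analyzed independently of intra-voxel orientation heterogeneity.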

preprint2020arXiv

Real-Time Quality Assessment of Pediatric MRI via Semi-Supervised Deep Nonlocal Residual Neural Networks

In this paper, we introduce an image quality assessment (IQA) method for pediatric T1- and T2-weighted MR images. IQA is first performed slice-wise using a nonlocal residual neural network (NR-Net) and then volume-wise by agglomerating the slice QA results using random forest. Our method requires only a small amount of quality-annotated images for training and is designed to be robust to annotation noise that might occur due to rater errors and the inevitable mix of good and bad slices in an image volume. Using a small set of quality-assessed images, we pre-train NR-Net to annotate each image slice with an initial quality rating (i.e., pass, questionable, fail), which we then refine by semi-supervised learning and iterative self-training. Experimental results demonstrate that our method, trained using only samples of modest size, exhibits great generalizability and is capable of real-time (milliseconds per volume) large-scale IQA with near-perfect accuracy.

preprint2020arXiv

Review of Artificial Intelligence Techniques in Imaging Data Acquisition, Segmentation and Diagnosis for COVID-19

(This paper was submitted as an invited paper to IEEE Reviews in Biomedical Engineering on April 6, 2020.) The pandemic of coronavirus disease 2019 (COVID-19) is spreading all over the world. Medical imaging such as X-ray and computed tomography (CT) plays an essential role in the global fight against COVID-19, whereas the recently emerging artificial intelligence (AI) technologies further strengthen the power of the imaging tools and help medical specialists. We hereby review the rapid responses in the community of medical imaging (empowered by AI) toward COVID-19. For example, AI-empowered image acquisition can significantly help automate the scanning procedure and also reshape the workflow with minimal contact to patients, providing the best protection to the imaging technicians. Also, AI can improve work efficiency by accurate delineation of infections in X-ray and CT images, facilitating subsequent quantification. Moreover, the computer-aided platforms help radiologists make clinical decisions, i.e., for disease diagnosis, tracking, and prognosis. In this review paper, we thus cover the entire pipeline of medical imaging and analysis techniques involved with COVID-19, including image acquisition, segmentation, diagnosis, and follow-up. We particularly focus on the integration of AI with X-ray and CT, both of which are widely used in the frontline hospitals, in order to depict the latest progress of medical imaging and radiology fighting against COVID-19.

preprint2020arXiv

Severity Assessment of Coronavirus Disease 2019 (COVID-19) Using Quantitative Features from Chest CT Images

Background: Chest computed tomography (CT) is recognized as an important tool for COVID-19 severity assessment. As the number of affected patients increases rapidly, manual severity assessment becomes a labor-intensive task, and may lead to delayed treatment. Purpose: To use machine learning methods to realize automatic severity assessment (non-severe or severe) of COVID-19 based on chest CT images, and to explore the severity-related features from the resulting assessment model. Materials and Method: Chest CT images of 176 patients (age 45.3$\pm$16.5 years, 96 male and 80 female) with confirmed COVID-19 are used, from which 63 quantitative features, e.g., the infection volume/ratio of the whole lung and the volume of ground-glass opacity (GGO) regions, are calculated. A random forest (RF) model is trained to assess the severity (non-severe or severe) based on quantitative features. Importance of each quantitative feature, which reflects the correlation to the severity of COVID-19, is calculated from the RF model. Results: Using three-fold cross validation, the RF model shows promising results, i.e., 0.933 of true positive rate, 0.745 of true negative rate, 0.875 of accuracy, and 0.91 of area under receiver operating characteristic curve (AUC). The resulting importance of quantitative features shows that the volume and its ratio (with respect to the whole lung volume) of ground glass opacity (GGO) regions are highly related to the severity of COVID-19, and the quantitative features calculated from the right lung are more related to the severity assessment than those of the left lung. Conclusion: The RF based model can achieve automatic severity assessment (non-severe or severe) of COVID-19 infection, and the performance is promising. Several quantitative features, which have the potential to reflect the severity of COVID-19, were revealed.

preprint2020arXiv

Synergistic Learning of Lung Lobe Segmentation and Hierarchical Multi-Instance Classification for Automated Severity Assessment of COVID-19 in CT Images

Understanding chest CT imaging of the coronavirus disease 2019 (COVID-19) will help detect infections early and assess the disease progression. Especially, automated severity assessment of COVID-19 in CT images plays an essential role in identifying cases that are in great need of intensive clinical care. However, it is often challenging to accurately assess the severity of this disease in CT images, due to variable infection regions in the lungs, similar imaging biomarkers, and large inter-case variations. To this end, we propose a synergistic learning framework for automated severity assessment of COVID-19 in 3D CT images, by jointly performing lung lobe segmentation and multi-instance classification. Considering that only a few infection regions in a CT image are related to the severity assessment, we first represent each input image by a bag that contains a set of 2D image patches (with each cropped from a specific slice). A multi-task multi-instance deep network (called M$^2$UNet) is then developed to assess the severity of COVID-19 patients and also segment the lung lobe simultaneously. Our M$^2$UNet consists of a patch-level encoder, a segmentation sub-network for lung lobe segmentation, and a classification sub-network for severity assessment (with a unique hierarchical multi-instance learning strategy). Here, the context information provided by segmentation can be implicitly employed to improve the performance of severity assessment. Extensive experiments were performed on a real COVID-19 CT image dataset consisting of 666 chest CT images, with results suggesting the effectiveness of our proposed method compared to several state-of-the-art methods.

preprint2019arXiv

Brain Network Construction and Classification Toolbox (BrainNetClass)

Brain functional network has become an increasingly used approach in understanding brain functions and diseases. Many network construction methods have been developed, whereas the majority of the studies still used static pairwise Pearson's correlation-based functional connectivity. The goal of this work is to introduce a toolbox namely "Brain Network Construction and Classification" (BrainNetClass) to the field to promote more advanced brain network construction methods. It comprises various brain network construction methods, including some state-of-the-art methods that were recently developed to capture more complex interactions among brain regions along with connectome feature extraction, reduction, parameter optimization towards network-based individualized classification. BrainNetClass is a MATLAB-based, open-source, cross-platform toolbox with graphical user-friendly interfaces for cognitive and clinical neuroscientists to perform rigorous computer-aided diagnosis with interpretable result presentations even though they do not possess neuroimage computing and machine learning knowledge. We demonstrate the implementations of this toolbox on real resting-state functional MRI datasets. BrainNetClass (v1.0) can be downloaded from https://github.com/zzstefan/BrainNetClass.

preprint2019arXiv

Task Decomposition and Synchronization for Semantic Biomedical Image Segmentation

Semantic segmentation is essential to biomedical image analysis. Many recent works mainly focus on integrating the Fully Convolutional Network (FCN) architecture with sophisticated convolution implementation and deep supervision. In this paper, we propose to decompose the single segmentation task into three subsequent sub-tasks, including (1) pixel-wise image segmentation, (2) prediction of the class labels of the objects within the image, and (3) classification of the scene to which the image belongs. While these three sub-tasks are trained to optimize their individual loss functions of different perceptual levels, we propose to let them interact by the task-task context ensemble. Moreover, we propose a novel sync-regularization to penalize the deviation between the outputs of the pixel-wise segmentation and the class prediction tasks. These effective regularizations help FCN utilize context information comprehensively and attain accurate semantic segmentation, even though the number of the images for training may be limited in many biomedical applications. We have successfully applied our framework to three diverse 2D/3D medical image datasets, including Robotic Scene Segmentation Challenge 18 (ROBOT18), Brain Tumor Segmentation Challenge 18 (BRATS18), and Retinal Fundus Glaucoma Challenge (REFUGE18). We have achieved top-tier performance in all three challenges.