Researcher profile

Jinzheng Cai

Jinzheng Cai contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
10works
0followers
4topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

10 published item(s)

preprint2021arXiv

Learning from Multiple Datasets with Heterogeneous and Partial Labels for Universal Lesion Detection in CT

Large-scale datasets with high-quality labels are desired for training accurate deep learning models. However, due to the annotation cost, datasets in medical imaging are often either partially-labeled or small. For example, DeepLesion is such a large-scale CT image dataset with lesions of various types, but it also has many unlabeled lesions (missing annotations). When training a lesion detector on a partially-labeled dataset, the missing annotations will generate incorrect negative signals and degrade the performance. Besides DeepLesion, there are several small single-type datasets, such as LUNA for lung nodules and LiTS for liver tumors. These datasets have heterogeneous label scopes, i.e., different lesion types are labeled in different datasets with other types ignored. In this work, we aim to develop a universal lesion detection algorithm to detect a variety of lesions. The problem of heterogeneous and partial labels is tackled. First, we build a simple yet effective lesion detection framework named Lesion ENSemble (LENS). LENS can efficiently learn from multiple heterogeneous lesion datasets in a multi-task fashion and leverage their synergy by proposal fusion. Next, we propose strategies to mine missing annotations from partially-labeled datasets by exploiting clinical prior knowledge and cross-dataset knowledge transfer. Finally, we train our framework on four public lesion datasets and evaluate it on 800 manually-labeled sub-volumes in DeepLesion. Our method brings a relative improvement of 49% compared to the current state-of-the-art approach in the metric of average sensitivity. We have publicly released our manual 3D annotations of DeepLesion in https://github.com/viggin/DeepLesion_manual_test_set.

preprint2020arXiv

3D Semi-Supervised Learning with Uncertainty-Aware Multi-View Co-Training

While making a tremendous impact in various fields, deep neural networks usually require large amounts of labeled data for training which are expensive to collect in many applications, especially in the medical domain. Unlabeled data, on the other hand, is much more abundant. Semi-supervised learning techniques, such as co-training, could provide a powerful tool to leverage unlabeled data. In this paper, we propose a novel framework, uncertainty-aware multi-view co-training (UMCT), to address semi-supervised learning on 3D data, such as volumetric data from medical imaging. In our work, co-training is achieved by exploiting multi-viewpoint consistency of 3D data. We generate different views by rotating or permuting the 3D data and utilize asymmetrical 3D kernels to encourage diversified features in different sub-networks. In addition, we propose an uncertainty-weighted label fusion mechanism to estimate the reliability of each view's prediction with Bayesian deep learning. As one view requires the supervision from other views in co-training, our self-adaptive approach computes a confidence score for the prediction of each unlabeled sample in order to assign a reliable pseudo label. Thus, our approach can take advantage of unlabeled data during training. We show the effectiveness of our proposed semi-supervised method on several public datasets from medical image segmentation tasks (NIH pancreas & LiTS liver tumor dataset). Meanwhile, a fully-supervised method based on our approach achieved state-of-the-art performances on both the LiTS liver tumor segmentation and the Medical Segmentation Decathlon (MSD) challenge, demonstrating the robustness and value of our framework, even when fully supervised training is feasible.

preprint2020arXiv

Deep Volumetric Universal Lesion Detection using Light-Weight Pseudo 3D Convolution and Surface Point Regression

Identifying, measuring and reporting lesions accurately and comprehensively from patient CT scans are important yet time-consuming procedures for physicians. Computer-aided lesion/significant-findings detection techniques are at the core of medical imaging, which remain very challenging due to the tremendously large variability of lesion appearance, location and size distributions in 3D imaging. In this work, we propose a novel deep anchor-free one-stage VULD framework that incorporates (1) P3DC operators to recycle the architectural configurations and pre-trained weights from the off-the-shelf 2D networks, especially ones with large capacities to cope with data variance, and (2) a new SPR method to effectively regress the 3D lesion spatial extents by pinpointing their representative key points on lesion surfaces. Experimental validations are first conducted on the public large-scale NIH DeepLesion dataset where our proposed method delivers new state-of-the-art quantitative performance. We also test VULD on our in-house dataset for liver tumor detection. VULD generalizes well in both large-scale and small-sized tumor datasets in CT imaging.

preprint2020arXiv

Detecting Scatteredly-Distributed, Small, andCritically Important Objects in 3D OncologyImaging via Decision Stratification

Finding and identifying scatteredly-distributed, small, and critically important objects in 3D oncology images is very challenging. We focus on the detection and segmentation of oncology-significant (or suspicious cancer metastasized) lymph nodes (OSLNs), which has not been studied before as a computational task. Determining and delineating the spread of OSLNs is essential in defining the corresponding resection/irradiating regions for the downstream workflows of surgical resection and radiotherapy of various cancers. For patients who are treated with radiotherapy, this task is performed by experienced radiation oncologists that involves high-level reasoning on whether LNs are metastasized, which is subject to high inter-observer variations. In this work, we propose a divide-and-conquer decision stratification approach that divides OSLNs into tumor-proximal and tumor-distal categories. This is motivated by the observation that each category has its own different underlying distributions in appearance, size and other characteristics. Two separate detection-by-segmentation networks are trained per category and fused. To further reduce false positives (FP), we present a novel global-local network (GLNet) that combines high-level lesion characteristics with features learned from localized 3D image patches. Our method is evaluated on a dataset of 141 esophageal cancer patients with PET and CT modalities (the largest to-date). Our results significantly improve the recall from $45\%$ to $67\%$ at $3$ FPs per patient as compared to previous state-of-the-art methods. The highest achieved OSLN recall of $0.828$ is clinically relevant and valuable.

preprint2020arXiv

Harvesting, Detecting, and Characterizing Liver Lesions from Large-scale Multi-phase CT Data via Deep Dynamic Texture Learning

Non-invasive radiological-based lesion characterization and identification, e.g., to differentiate cancer subtypes, has long been a major aim to enhance oncological diagnosis and treatment procedures. Here we study a specific population of human subjects, with the hope of reducing the need for invasive surgical biopsies of liver cancer patients, which can cause many harmful side-effects. To this end, we propose a fully-automated and multi-stage liver tumor characterization framework designed for dynamic contrast computed tomography (CT). Our system comprises four sequential processes of tumor proposal detection, tumor harvesting, primary tumor site selection, and deep texture-based tumor characterization. Our main contributions are that, (1) we propose a 3D non-isotropic anchor-free detection method for liver lesions; (2) we present and validate spatially adaptivedeep texture (SaDT) learning, which allows for more precise characterization of liver lesions; (3) using a semi-automatic process, we bootstrap off of 200 gold standard annotations to curate another 1001 patients. Experimental evaluations demonstrate that our new data curation strategy, combined with the SaDT deep dynamic texture analysis, can effectively improve the mean F1 scores by >8.6% compared with baselines, in differentiating four major liver lesion types. Our F1 score of (hepatocellular carcinoma versus remaining subclasses) is 0.763, which is higher than reported human observer performance using dynamic CT and comparable to an advanced magnetic resonance imagery protocol. Apart from demonstrating the benefits of our data curation approach and physician-inspired workflow, these results also indicate that analyzing texture features, instead of standard object-based analysis, is a promising strategy for lesion differentiation.

preprint2020arXiv

JSSR: A Joint Synthesis, Segmentation, and Registration System for 3D Multi-Modal Image Alignment of Large-scale Pathological CT Scans

Multi-modal image registration is a challenging problem that is also an important clinical task for many real applications and scenarios. As a first step in analysis, deformable registration among different image modalities is often required in order to provide complementary visual information. During registration, semantic information is key to match homologous points and pixels. Nevertheless, many conventional registration methods are incapable in capturing high-level semantic anatomical dense correspondences. In this work, we propose a novel multi-task learning system, JSSR, based on an end-to-end 3D convolutional neural network that is composed of a generator, a registration and a segmentation component. The system is optimized to satisfy the implicit constraints between different tasks in an unsupervised manner. It first synthesizes the source domain images into the target domain, then an intra-modal registration is applied on the synthesized images and target images. The segmentation module are then applied on the synthesized and target images, providing additional cues based on semantic correspondences. The supervision from another fully-annotated dataset is used to regularize the segmentation. We extensively evaluate JSSR on a large-scale medical image dataset containing 1,485 patient CT imaging studies of four different contrast phases (i.e., 5,940 3D CT scans with pathological livers) on the registration, segmentation and synthesis tasks. The performance is improved after joint training on the registration and segmentation tasks by 0.9% and 1.9% respectively compared to a highly competitive and accurate deep learning baseline. The registration also consistently outperforms conventional state-of-the-art multi-modal registration methods.

preprint2020arXiv

Lymph Node Gross Tumor Volume Detection in Oncology Imaging via Relationship Learning Using Graph Neural Network

Determining the spread of GTV$_{LN}$ is essential in defining the respective resection or irradiating regions for the downstream workflows of surgical resection and radiotherapy for many cancers. Different from the more common enlarged lymph node (LN), GTV$_{LN}$ also includes smaller ones if associated with high positron emission tomography signals and/or any metastasis signs in CT. This is a daunting task. In this work, we propose a unified LN appearance and inter-LN relationship learning framework to detect the true GTV$_{LN}$. This is motivated by the prior clinical knowledge that LNs form a connected lymphatic system, and the spread of cancer cells among LNs often follows certain pathways. Specifically, we first utilize a 3D convolutional neural network with ROI-pooling to extract the GTV$_{LN}$'s instance-wise appearance features. Next, we introduce a graph neural network to further model the inter-LN relationships where the global LN-tumor spatial priors are included in the learning process. This leads to an end-to-end trainable network to detect by classifying GTV$_{LN}$. We operate our model on a set of GTV$_{LN}$ candidates generated by a preliminary 1st-stage method, which has a sensitivity of $>85\%$ at the cost of high false positive (FP) ($>15$ FPs per patient). We validate our approach on a radiotherapy dataset with 142 paired PET/RTCT scans containing the chest and upper abdominal body parts. The proposed method significantly improves over the state-of-the-art (SOTA) LN classification method by $5.5\%$ and $13.1\%$ in F1 score and the averaged sensitivity value at $2, 3, 4, 6$ FPs per patient, respectively.

preprint2020arXiv

Uncertainty-aware multi-view co-training for semi-supervised medical image segmentation and domain adaptation

Although having achieved great success in medical image segmentation, deep learning-based approaches usually require large amounts of well-annotated data, which can be extremely expensive in the field of medical image analysis. Unlabeled data, on the other hand, is much easier to acquire. Semi-supervised learning and unsupervised domain adaptation both take the advantage of unlabeled data, and they are closely related to each other. In this paper, we propose uncertainty-aware multi-view co-training (UMCT), a unified framework that addresses these two tasks for volumetric medical image segmentation. Our framework is capable of efficiently utilizing unlabeled data for better performance. We firstly rotate and permute the 3D volumes into multiple views and train a 3D deep network on each view. We then apply co-training by enforcing multi-view consistency on unlabeled data, where an uncertainty estimation of each view is utilized to achieve accurate labeling. Experiments on the NIH pancreas segmentation dataset and a multi-organ segmentation dataset show state-of-the-art performance of the proposed framework on semi-supervised medical image segmentation. Under unsupervised domain adaptation settings, we validate the effectiveness of this work by adapting our multi-organ segmentation model to two pathological organs from the Medical Segmentation Decathlon Datasets. Additionally, we show that our UMCT-DA model can even effectively handle the challenging situation where labeled source data is inaccessible, demonstrating strong potentials for real-world applications.

preprint2020arXiv

Universal Lesion Detection by Learning from Multiple Heterogeneously Labeled Datasets

Lesion detection is an important problem within medical imaging analysis. Most previous work focuses on detecting and segmenting a specialized category of lesions (e.g., lung nodules). However, in clinical practice, radiologists are responsible for finding all possible types of anomalies. The task of universal lesion detection (ULD) was proposed to address this challenge by detecting a large variety of lesions from the whole body. There are multiple heterogeneously labeled datasets with varying label completeness: DeepLesion, the largest dataset of 32,735 annotated lesions of various types, but with even more missing annotation instances; and several fully-labeled single-type lesion datasets, such as LUNA for lung nodules and LiTS for liver tumors. In this work, we propose a novel framework to leverage all these datasets together to improve the performance of ULD. First, we learn a multi-head multi-task lesion detector using all datasets and generate lesion proposals on DeepLesion. Second, missing annotations in DeepLesion are retrieved by a new method of embedding matching that exploits clinical prior knowledge. Last, we discover suspicious but unannotated lesions using knowledge transfer from single-type lesion detectors. In this way, reliable positive and negative regions are obtained from partially-labeled and unlabeled images, which are effectively utilized to train ULD. To assess the clinically realistic protocol of 3D volumetric ULD, we fully annotated 1071 CT sub-volumes in DeepLesion. Our method outperforms the current state-of-the-art approach by 29% in the metric of average sensitivity.

preprint2020arXiv

User-Guided Domain Adaptation for Rapid Annotation from User Interactions: A Study on Pathological Liver Segmentation

Mask-based annotation of medical images, especially for 3D data, is a bottleneck in developing reliable machine learning models. Using minimal-labor user interactions (UIs) to guide the annotation is promising, but challenges remain on best harmonizing the mask prediction with the UIs. To address this, we propose the user-guided domain adaptation (UGDA) framework, which uses prediction-based adversarial domain adaptation (PADA) to model the combined distribution of UIs and mask predictions. The UIs are then used as anchors to guide and align the mask prediction. Importantly, UGDA can both learn from unlabelled data and also model the high-level semantic meaning behind different UIs. We test UGDA on annotating pathological livers using a clinically comprehensive dataset of 927 patient studies. Using only extreme-point UIs, we achieve a mean (worst-case) performance of 96.1%(94.9%), compared to 93.0% (87.0%) for deep extreme points (DEXTR). Furthermore, we also show UGDA can retain this state-of-the-art performance even when only seeing a fraction of available UIs, demonstrating an ability for robust and reliable UI-guided segmentation with extremely minimal labor demands.