Researcher profile

Guang-Zhong Yang

Guang-Zhong Yang contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
22works
0followers
10topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

22 published item(s)

preprint2022arXiv

A Compacted Structure for Cross-domain learning on Monocular Depth and Flow Estimation

Accurate motion and depth recovery is important for many robot vision tasks including autonomous driving. Most previous studies have achieved cooperative multi-task interaction via either pre-defined loss functions or cross-domain prediction. This paper presents a multi-task scheme that achieves mutual assistance by means of our Flow to Depth (F2D), Depth to Flow (D2F), and Exponential Moving Average (EMA). F2D and D2F mechanisms enable multi-scale information integration between optical flow and depth domain based on differentiable shallow nets. A dual-head mechanism is used to predict optical flow for rigid and non-rigid motion based on a divide-and-conquer manner, which significantly improves the optical flow estimation performance. Furthermore, to make the prediction more robust and stable, EMA is used for our multi-task training. Experimental results on KITTI datasets show that our multi-task scheme outperforms other multi-task schemes and provide marked improvements on the prediction results.

preprint2022arXiv

A Long Short-term Memory Based Recurrent Neural Network for Interventional MRI Reconstruction

Interventional magnetic resonance imaging (i-MRI) for surgical guidance could help visualize the interventional process such as deep brain stimulation (DBS), improving the surgery performance and patient outcome. Different from retrospective reconstruction in conventional dynamic imaging, i-MRI for DBS has to acquire and reconstruct the interventional images sequentially online. Here we proposed a convolutional long short-term memory (Conv-LSTM) based recurrent neural network (RNN), or ConvLR, to reconstruct interventional images with golden-angle radial sampling. By using an initializer and Conv-LSTM blocks, the priors from the pre-operative reference image and intra-operative frames were exploited for reconstructing the current frame. Data consistency for radial sampling was implemented by a soft-projection method. To improve the reconstruction accuracy, an adversarial learning strategy was adopted. A set of interventional images based on the pre-operative and post-operative MR images were simulated for algorithm validation. Results showed with only 10 radial spokes, ConvLR provided the best performance compared with state-of-the-art methods, giving an acceleration up to 40 folds. The proposed algorithm has the potential to achieve real-time i-MRI for DBS and can be used for general purpose MR-guided intervention.

preprint2022arXiv

Faithful learning with sure data for lung nodule diagnosis

Recent evolution in deep learning has proven its value for CT-based lung nodule classification. Most current techniques are intrinsically black-box systems, suffering from two generalizability issues in clinical practice. First, benign-malignant discrimination is often assessed by human observers without pathologic diagnoses at the nodule level. We termed these data as "unsure data". Second, a classifier does not necessarily acquire reliable nodule features for stable learning and robust prediction with patch-level labels during learning. In this study, we construct a sure dataset with pathologically-confirmed labels and propose a collaborative learning framework to facilitate sure nodule classification by integrating unsure data knowledge through nodule segmentation and malignancy score regression. A loss function is designed to learn reliable features by introducing interpretability constraints regulated with nodule segmentation maps. Furthermore, based on model inference results that reflect the understanding from both machine and experts, we explore a new nodule analysis method for similar historical nodule retrieval and interpretable diagnosis. Detailed experimental results demonstrate that our approach is beneficial for achieving improved performance coupled with faithful model reasoning for lung cancer prediction. Extensive cross-evaluation results further illustrate the effect of unsure data for deep-learning-based methods in lung nodule classification.

preprint2022arXiv

Human-Robot Shared Control for Surgical Robot Based on Context-Aware Sim-to-Real Adaptation

Human-robot shared control, which integrates the advantages of both humans and robots, is an effective approach to facilitate efficient surgical operation. Learning from demonstration (LfD) techniques can be used to automate some of the surgical subtasks for the construction of the shared control framework. However, a sufficient amount of data is required for the robot to learn the manoeuvres. Using a surgical simulator to collect data is a less resource-demanding approach. With sim-to-real adaptation, the manoeuvres learned from a simulator can be transferred to a physical robot. To this end, we propose a sim-to-real adaptation method to construct a human-robot shared control framework for robotic surgery. In this paper, a desired trajectory is generated from a simulator using LfD method, while dynamic motion primitives (DMPs) based method is used to transfer the desired trajectory from the simulator to the physical robotic platform. Moreover, a role adaptation mechanism is developed such that the robot can adjust its role according to the surgical operation contexts predicted by a neural network model. The effectiveness of the proposed framework is validated on the da Vinci Research Kit (dVRK). Results of the user studies indicated that with the adaptive human-robot shared control framework, the path length of the remote controller, the total clutching number and the task completion time can be reduced significantly. The proposed method outperformed the traditional manual control via teleoperation.

preprint2022arXiv

Re-thinking and Re-labeling LIDC-IDRI for Robust Pulmonary Cancer Prediction

The LIDC-IDRI database is the most popular benchmark for lung cancer prediction. However, with subjective assessment from radiologists, nodules in LIDC may have entirely different malignancy annotations from the pathological ground truth, introducing label assignment errors and subsequent supervision bias during training. The LIDC database thus requires more objective labels for learning-based cancer prediction. Based on an extra small dataset containing 180 nodules diagnosed by pathological examination, we propose to re-label LIDC data to mitigate the effect of original annotation bias verified on this robust benchmark. We demonstrate in this paper that providing new labels by similar nodule retrieval based on metric learning would be an effective re-labeling strategy. Training on these re-labeled LIDC nodules leads to improved model performance, which is enhanced when new labels of uncertain nodules are added. We further infer that re-labeling LIDC is current an expedient way for robust lung cancer prediction while building a large pathological-proven nodule database provides the long-term solution.

preprint2022arXiv

Tackling Long-Tailed Category Distribution Under Domain Shifts

Machine learning models fail to perform well on real-world applications when 1) the category distribution P(Y) of the training dataset suffers from long-tailed distribution and 2) the test data is drawn from different conditional distributions P(X|Y). Existing approaches cannot handle the scenario where both issues exist, which however is common for real-world applications. In this study, we took a step forward and looked into the problem of long-tailed classification under domain shifts. We designed three novel core functional blocks including Distribution Calibrated Classification Loss, Visual-Semantic Mapping and Semantic-Similarity Guided Augmentation. Furthermore, we adopted a meta-learning framework which integrates these three blocks to improve domain generalization on unseen target domains. Two new datasets were proposed for this problem, named AWA2-LTS and ImageNet-LTS. We evaluated our method on the two datasets and extensive experimental results demonstrate that our proposed method can achieve superior performance over state-of-the-art long-tailed/domain generalization approaches and the combinations. Source codes and datasets can be found at our project page https://xiaogu.site/LTDS.

preprint2021arXiv

Learning Tubule-Sensitive CNNs for Pulmonary Airway and Artery-Vein Segmentation in CT

Training convolutional neural networks (CNNs) for segmentation of pulmonary airway, artery, and vein is challenging due to sparse supervisory signals caused by the severe class imbalance between tubular targets and background. We present a CNNs-based method for accurate airway and artery-vein segmentation in non-contrast computed tomography. It enjoys superior sensitivity to tenuous peripheral bronchioles, arterioles, and venules. The method first uses a feature recalibration module to make the best use of features learned from the neural networks. Spatial information of features is properly integrated to retain relative priority of activated regions, which benefits the subsequent channel-wise recalibration. Then, attention distillation module is introduced to reinforce representation learning of tubular objects. Fine-grained details in high-resolution attention maps are passing down from one layer to its previous layer recursively to enrich context. Anatomy prior of lung context map and distance transform map is designed and incorporated for better artery-vein differentiation capacity. Extensive experiments demonstrated considerable performance gains brought by these components. Compared with state-of-the-art methods, our method extracted much more branches while maintaining competitive overall segmentation performance. Codes and models are available at http://www.pami.sjtu.edu.cn/News/56

preprint2020arXiv

ACNN: a Full Resolution DCNN for Medical Image Segmentation

Deep Convolutional Neural Networks (DCNNs) are used extensively in medical image segmentation and hence 3D navigation for robot-assisted Minimally Invasive Surgeries (MISs). However, current DCNNs usually use down sampling layers for increasing the receptive field and gaining abstract semantic information. These down sampling layers decrease the spatial dimension of feature maps, which can be detrimental to image segmentation. Atrous convolution is an alternative for the down sampling layer. It increases the receptive field whilst maintains the spatial dimension of feature maps. In this paper, a method for effective atrous rate setting is proposed to achieve the largest and fully-covered receptive field with a minimum number of atrous convolutional layers. Furthermore, a new and full resolution DCNN - Atrous Convolutional Neural Network (ACNN), which incorporates cascaded atrous II-blocks, residual learning and Instance Normalization (IN) is proposed. Application results of the proposed ACNN to Magnetic Resonance Imaging (MRI) and Computed Tomography (CT) image segmentation demonstrate that the proposed ACNN can achieve higher segmentation Intersection over Unions (IoUs) than U-Net and Deeplabv3+, but with reduced trainable parameters.

preprint2020arXiv

Constrained-Space Optimization and Reinforcement Learning for Complex Tasks

Learning from Demonstration is increasingly used for transferring operator manipulation skills to robots. In practice, it is important to cater for limited data and imperfect human demonstrations, as well as underlying safety constraints. This paper presents a constrained-space optimization and reinforcement learning scheme for managing complex tasks. Through interactions within the constrained space, the reinforcement learning agent is trained to optimize the manipulation skills according to a defined reward function. After learning, the optimal policy is derived from the well-trained reinforcement learning agent, which is then implemented to guide the robot to conduct tasks that are similar to the experts' demonstrations. The effectiveness of the proposed method is verified with a robotic suturing task, demonstrating that the learned policy outperformed the experts' demonstrations in terms of the smoothness of the joint motion and end-effector trajectories, as well as the overall task completion time.

preprint2020arXiv

DDU-Nets: Distributed Dense Model for 3D MRI Brain Tumor Segmentation

Segmentation of brain tumors and their subregions remains a challenging task due to their weak features and deformable shapes. In this paper, three patterns (cross-skip, skip-1 and skip-2) of distributed dense connections (DDCs) are proposed to enhance feature reuse and propagation of CNNs by constructing tunnels between key layers of the network. For better detecting and segmenting brain tumors from multi-modal 3D MR images, CNN-based models embedded with DDCs (DDU-Nets) are trained efficiently from pixel to pixel with a limited number of parameters. Postprocessing is then applied to refine the segmentation results by reducing the false-positive samples. The proposed method is evaluated on the BraTS 2019 dataset with results demonstrating the effectiveness of the DDU-Nets while requiring less computational cost.

preprint2020arXiv

End-to-End Real-time Catheter Segmentation with Optical Flow-Guided Warping during Endovascular Intervention

Accurate real-time catheter segmentation is an important pre-requisite for robot-assisted endovascular intervention. Most of the existing learning-based methods for catheter segmentation and tracking are only trained on small-scale datasets or synthetic data due to the difficulties of ground-truth annotation. Furthermore, the temporal continuity in intraoperative imaging sequences is not fully utilised. In this paper, we present FW-Net, an end-to-end and real-time deep learning framework for endovascular intervention. The proposed FW-Net has three modules: a segmentation network with encoder-decoder architecture, a flow network to extract optical flow information, and a novel flow-guided warping function to learn the frame-to-frame temporal continuity. We show that by effectively learning temporal continuity, the network can successfully segment and track the catheters in real-time sequences using only raw ground-truth for training. Detailed validation results confirm that our FW-Net outperforms state-of-the-art techniques while achieving real-time performance.

preprint2020arXiv

Hybrid Data-Driven and Analytical Model for Kinematic Control of a Surgical Robotic Tool

Accurate kinematic models are essential for effective control of surgical robots. For tendon driven robots, which is common for minimally invasive surgery, intrinsic nonlinearities are important to consider. Traditional analytical methods allow to build the kinematic model of the system by making certain assumptions and simplifications on the nonlinearities. Machine learning techniques, instead, allow to recover a more complex model based on the acquired data. However, analytical models are more generalisable, but can be over-simplified; data-driven models, on the other hand, can cater for more complex models, but are less generalisable and the result is highly affected by the training dataset. In this paper, we present a novel approach to combining analytical and data-driven approaches to model the kinematics of nonlinear tendon-driven surgical robots. Gaussian Process Regression (GPR) is used for learning the data-driven model and the proposed method is tested on both simulated data and real experimental data.

preprint2020arXiv

Hybrid Robot-assisted Frameworks for Endomicroscopy Scanning in Retinal Surgeries

High-resolution real-time intraocular imaging of retina at the cellular level is very challenging due to the vulnerable and confined space within the eyeball as well as the limited availability of appropriate modalities. A probe-based confocal laser endomicroscopy (pCLE) system, can be a potential imaging modality for improved diagnosis. The ability to visualize the retina at the cellular level could provide information that may predict surgical outcomes. The adoption of intraocular pCLE scanning is currently limited due to the narrow field of view and the micron-scale range of focus. In the absence of motion compensation, physiological tremors of the surgeons' hand and patient movements also contribute to the deterioration of the image quality. Therefore, an image-based hybrid control strategy is proposed to mitigate the above challenges. The proposed hybrid control strategy enables a shared control of the pCLE probe between surgeons and robots to scan the retina precisely, with the absence of hand tremors and with the advantages of an image-based auto-focus algorithm that optimizes the quality of pCLE images. The hybrid control strategy is deployed on two frameworks - cooperative and teleoperated. Better image quality, smoother motion, and reduced workload are all achieved in a statistically significant manner with the hybrid control frameworks.

preprint2020arXiv

MIndGrasp: A New Training and Testing Framework for Motor Imagery Based 3-Dimensional Assistive Robotic Control

With increasing global age and disability assistive robots are becoming more necessary, and brain computer interfaces (BCI) are often proposed as a solution to understanding the intent of a disabled person that needs assistance. Most frameworks for electroencephalography (EEG)-based motor imagery (MI) BCI control rely on the direct control of the robot in Cartesian space. However, for 3-dimensional movement, this requires 6 motor imagery classes, which is a difficult distinction even for more experienced BCI users. In this paper, we present a simulated training and testing framework which reduces the number of motor imagery classes to 4 while still grasping objects in three-dimensional space. This is achieved through semi-autonomous eye-in-hand vision-based control of the robotic arm, while the user-controlled BCI achieves movement to the left and right, as well as movement toward and away from the object of interest. Additionally, the framework includes a method of training a BCI directly on the assistive robotic system, which should be more easily transferrable to a real-world assistive robot than using a standard training protocol such as Graz-BCI. Presented results do not consider real human EEG data, but are rather shown as a baseline for comparison with future human data and other improvements on the system.

preprint2020arXiv

Nonlinearity Compensation in a Multi-DoF Shoulder Sensing Exosuit for Real-Time Teleoperation

The compliant nature of soft wearable robots makes them ideal for complex multiple degrees of freedom (DoF) joints, but also introduce additional structural nonlinearities. Intuitive control of these wearable robots requires robust sensing to overcome the inherent nonlinearities. This paper presents a joint kinematics estimator for a bio-inspired multi-DoF shoulder exosuit capable of compensating the encountered nonlinearities. To overcome the nonlinearities and hysteresis inherent to the soft and compliant nature of the suit, we developed a deep learning-based method to map the sensor data to the joint space. The experimental results show that the new learning-based framework outperforms recent state-of-the-art methods by a large margin while achieving 12ms inference time using only a GPU-based edge-computing device. The effectiveness of our combined exosuit and learning framework is demonstrated through real-time teleoperation with a simulated NAO humanoid robot.

preprint2020arXiv

On-Orbit Operations Simulator for Workload Measurement during Telerobotic Training

Training for telerobotic systems often makes heavy use of simulated platforms, which ensure safe operation during the learning process. Outer space is one domain in which such a simulated training platform would be useful, as On-Orbit Operations (O3) can be costly, inefficient, or even dangerous if not performed properly. In this paper, we present a new telerobotic training simulator for the Canadarm2 on the International Space Station (ISS), which is able to modulate workload through the addition of confounding factors such as latency, obstacles, and time pressure. In addition, multimodal physiological data is collected from subjects as they perform a task from the simulator under these different conditions. As most current workload measures are subjective, we analyse objective measures from the simulator and EEG data that can provide a reliable measure. ANOVA of task data revealed which simulator-based performance measures could predict the presence of latency and time pressure. Furthermore, EEG classification using a Riemannian classifier and Leave-One-Subject-Out cross-validation showed promising classification performance and allowed for comparison of different channel configurations and preprocessing methods. Additionally, Riemannian distance and beta power of EEG data were investigated as potential cross-trial and continuous workload measures.

preprint2020arXiv

Z-Net: an Anisotropic 3D DCNN for Medical CT Volume Segmentation

Accurate volume segmentation from the Computed Tomography (CT) scan is a common prerequisite for pre-operative planning, intra-operative guidance and quantitative assessment of therapeutic outcomes in robot-assisted Minimally Invasive Surgery (MIS). 3D Deep Convolutional Neural Network (DCNN) is a viable solution for this task, but is memory intensive. Small isotropic patches are cropped from the original and large CT volume to mitigate this issue in practice, but it may cause discontinuities between the adjacent patches and severe class-imbalances within individual sub-volumes. This paper presents a new 3D DCNN framework, namely Z-Net, to tackle the discontinuity and class-imbalance issue by preserving a full field-of-view of the objects in the XY planes using anisotropic spatial separable convolutions. The proposed Z-Net can be seamlessly integrated into existing 3D DCNNs with isotropic convolutions such as 3D U-Net and V-Net, with improved volume segmentation Intersection over Union (IoU) - up to $12.6\%$. Detailed validation of Z-Net is provided for CT aortic, liver and lung segmentation, demonstrating the effectiveness and practical value of Z-Net for intra-operative 3D navigation in robot-assisted MIS.

preprint2019arXiv

Artificial Intelligence in Surgery

Artificial Intelligence (AI) is gradually changing the practice of surgery with the advanced technological development of imaging, navigation and robotic intervention. In this article, the recent successful and influential applications of AI in surgery are reviewed from pre-operative planning and intra-operative guidance to the integration of surgical robots. We end with summarizing the current state, emerging trends and major challenges in the future development of AI in surgery.

preprint2019arXiv

Real-time 3D Shape Instantiation for Partially-deployed Stent Segment from a Single 2D Fluoroscopic Image in Robot-assisted Fenestrated Endovascular Aortic Repair

In robot-assisted Fenestrated Endovascular Aortic Repair (FEVAR), accurate alignment of stent graft fenestrations or scallops with aortic branches is essential for establishing complete blood flow perfusion. Current navigation is largely based on 2D fluoroscopic images, which lacks 3D anatomical information, thus causing longer operation time as well as high risks of radiation exposure. Previously, 3D shape instantiation frameworks for real-time 3D shape reconstruction of fully-deployed or fully-compressed stent graft from a single 2D fluoroscopic image have been proposed for 3D navigation in robot-assisted FEVAR. However, these methods could not instantiate partially-deployed stent segments, as the 3D marker references are unknown. In this paper, an adapted Graph Convolutional Network (GCN) is proposed to predict 3D marker references from 3D fully-deployed markers. As original GCN is for classification, in this paper, the coarsening layers are removed and the softmax function at the network end is replaced with linear mapping for the regression task. The derived 3D and the 2D marker references are used to instantiate partially-deployed stent segment shape with the existing 3D shape instantiation framework. Validations were performed on three commonly used stent grafts and five patient-specific 3D printed aortic aneurysm phantoms. Comparable performances with average mesh distance errors of 1$\sim$3mm and average angular errors around 7degree were achieved.

preprint2018arXiv

Artificial Intelligence and Robotics

The recent successes of AI have captured the wildest imagination of both the scientific communities and the general public. Robotics and AI amplify human potentials, increase productivity and are moving from simple reasoning towards human-like cognitive abilities. Current AI technologies are used in a set area of applications, ranging from healthcare, manufacturing, transport, energy, to financial services, banking, advertising, management consulting and government agencies. The global AI market is around 260 billion USD in 2016 and it is estimated to exceed 3 trillion by 2024. To understand the impact of AI, it is important to draw lessons from it's past successes and failures and this white paper provides a comprehensive explanation of the evolution of AI, its current status and future directions.

preprint2008arXiv

A Semi-parametric Technique for the Quantitative Analysis of Dynamic Contrast-enhanced MR Images Based on Bayesian P-splines

Dynamic Contrast-enhanced Magnetic Resonance Imaging (DCE-MRI) is an important tool for detecting subtle kinetic changes in cancerous tissue. Quantitative analysis of DCE-MRI typically involves the convolution of an arterial input function (AIF) with a nonlinear pharmacokinetic model of the contrast agent concentration. Parameters of the kinetic model are biologically meaningful, but the optimization of the non-linear model has significant computational issues. In practice, convergence of the optimization algorithm is not guaranteed and the accuracy of the model fitting may be compromised. To overcome this problems, this paper proposes a semi-parametric penalized spline smoothing approach, with which the AIF is convolved with a set of B-splines to produce a design matrix using locally adaptive smoothing parameters based on Bayesian penalized spline models (P-splines). It has been shown that kinetic parameter estimation can be obtained from the resulting deconvolved response function, which also includes the onset of contrast enhancement. Detailed validation of the method, both with simulated and in vivo data, is provided.

preprint2007arXiv

A Bayesian Hierarchical Model for the Analysis of a Longitudinal Dynamic Contrast-Enhanced MRI Cancer Study

Imaging in clinical oncology trials provides a wealth of information that contributes to the drug development process, especially in early phase studies. This paper focuses on kinetic modeling in DCE-MRI, inspired by mixed-effects models that are frequently used in the analysis of clinical trials. Instead of summarizing each scanning session as a single kinetic parameter -- such as median $\ktrans$ across all voxels in the tumor ROI -- we propose to analyze all voxel time courses from all scans and across all subjects simultaneously in a single model. The kinetic parameters from the usual non-linear regression model are decomposed into unique components associated with factors from the longitudinal study; e.g., treatment, patient and voxel effects. A Bayesian hierarchical model provides the framework in order to construct a data model, a parameter model, as well as prior distributions. The posterior distribution of the kinetic parameters is estimated using Markov chain Monte Carlo (MCMC) methods. Hypothesis testing at the study level for an overall treatment effect is straightforward and the patient- and voxel-level parameters capture random effects that provide additional information at various levels of resolution to allow a thorough evaluation of the clinical trial. The proposed method is validated with a breast cancer study, where the subjects were imaged before and after two cycles of chemotherapy, demonstrating the clinical potential of this method to longitudinal oncology studies.