Researcher profile

Zhihong Chen

Zhihong Chen contributes to research discovery and scholarly infrastructure.

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
12 works
0 followers
9 topics
4 close collaborators

Published work

12 published items

preprint, 2022, arXiv

Cross-modal Memory Networks for Radiology Report Generation

Medical imaging plays a significant role in clinical diagnosis, where the text reports accompanying the images are essential for understanding them and facilitating later treatment. Generating these reports automatically can lighten the burden on radiologists and significantly promote clinical automation, and it has already attracted much attention in applying artificial intelligence to the medical domain. Previous studies mainly follow the encoder-decoder paradigm and focus on text generation, with few considering the importance of cross-modal mappings or explicitly exploiting such mappings to facilitate radiology report generation. In this paper, we propose cross-modal memory networks (CMN) to enhance the encoder-decoder framework for radiology report generation, where a shared memory is designed to record the alignment between images and texts so as to facilitate interaction and generation across modalities. Experimental results illustrate the effectiveness of our proposed model, which achieves state-of-the-art performance on two widely used benchmark datasets, IU X-Ray and MIMIC-CXR. Further analyses also show that our model better aligns information from radiology images and texts, helping to generate more accurate reports in terms of clinical indicators.

preprint, 2022, arXiv

Generating Radiology Reports via Memory-driven Transformer

Medical imaging is frequently used in clinical practice and trials for diagnosis and treatment. Writing imaging reports is time-consuming and can be error-prone for inexperienced radiologists. Therefore, automatically generating radiology reports is highly desired to lighten the workload of radiologists and accordingly promote clinical automation, which is an essential task in applying artificial intelligence to the medical domain. In this paper, we propose to generate radiology reports with a memory-driven Transformer, where a relational memory is designed to record key information of the generation process and a memory-driven conditional layer normalization is applied to incorporate the memory into the decoder of the Transformer. Experimental results on two prevailing radiology report datasets, IU X-Ray and MIMIC-CXR, show that our proposed approach outperforms previous models with respect to both language generation metrics and clinical evaluations. In particular, to the best of our knowledge, this is the first work reporting generation results on MIMIC-CXR. Further analyses also demonstrate that our approach is able to generate long reports with necessary medical terms as well as meaningful image-text attention mappings.

preprint, 2022, arXiv

Graph Enhanced Contrastive Learning for Radiology Findings Summarization

The impression section of a radiology report summarizes the most prominent observation from the findings section and is the most important section for radiologists to communicate to physicians. Summarizing findings is time-consuming and can be prone to error for inexperienced radiologists, and thus automatic impression generation has attracted substantial attention. With the encoder-decoder framework, most previous studies explore incorporating extra knowledge (e.g., static pre-defined clinical ontologies or extra background information). Yet, they encode such knowledge with a separate encoder and treat it as an extra input to their models, which limits how well its relations with the original findings can be leveraged. To address this limitation, we propose a unified framework for exploiting both extra knowledge and the original findings in an integrated way, so that the critical information (i.e., key words and their relations) can be extracted appropriately to facilitate impression generation. In detail, each input findings section is encoded by a text encoder, and a graph is constructed from its entities and dependency tree. Then, a graph encoder (e.g., graph neural networks (GNNs)) is adopted to model relation information in the constructed graph. Finally, to emphasize the key words in the findings, contrastive learning is introduced to map positive samples (constructed by masking non-key words) closer and push apart negative ones (constructed by masking key words). The experimental results on OpenI and MIMIC-CXR confirm the effectiveness of our proposed method.
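
The sample construction described at the end of this abstract can be illustrated with a minimal sketch. It only shows how positive and negative variants of a findings sentence might be built by masking; the example sentence, the key-word set, the `[MASK]` token and the function names are hypothetical, and the paper's own key-word selection and encoders are not reproduced here.

```python
from typing import List, Set, Tuple

MASK = "[MASK]"  # hypothetical mask token, not taken from the paper


def mask_tokens(tokens: List[str], key_words: Set[str], mask_key: bool) -> List[str]:
    """Mask key words when mask_key is True, otherwise mask the non-key words."""
    return [MASK if (tok.lower() in key_words) == mask_key else tok for tok in tokens]


def build_contrastive_samples(
    tokens: List[str], key_words: Set[str]
) -> Tuple[List[str], List[str]]:
    """Return (positive, negative) variants of one findings sentence."""
    positive = mask_tokens(tokens, key_words, mask_key=False)  # key words survive
    negative = mask_tokens(tokens, key_words, mask_key=True)   # key words are hidden
    return positive, negative


# Hypothetical findings sentence and key-word set, for illustration only.
findings = "mild cardiomegaly with small left pleural effusion".split()
key_words = {"cardiomegaly", "pleural", "effusion"}
positive, negative = build_contrastive_samples(findings, key_words)
print(positive)
print(negative)
```

In a full pipeline, a contrastive objective would then pull the encoding of the positive variant toward the original input and push the negative variant away; that step depends on the encoder and is left out of this sketch.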

preprint, 2022, arXiv

How to Report and Benchmark Emerging Field-Effect Transistors

Emerging low-dimensional nanomaterials have been studied for decades in device applications as field-effect transistors (FETs). However, properly reporting and comparing device performance has been challenging due to the involvement and interlinking of multiple device parameters. More importantly, the interdisciplinarity of this research community results in a lack of consistent reporting and benchmarking guidelines. Here we report a consensus among the authors regarding guidelines for reporting and benchmarking important FET parameters and performance metrics. We provide an example of this reporting and benchmarking process for a two-dimensional (2D) semiconductor FET. Our consensus will help promote an improved approach for assessing device performance in emerging FETs, thus aiding the field to progress more consistently and meaningfully.

preprint, 2020, arXiv

Attention-Guided Discriminative Region Localization and Label Distribution Learning for Bone Age Assessment

Bone age assessment (BAA) is clinically important as it can be used to diagnose endocrine and metabolic disorders during child development. Existing deep learning based methods for classifying bone age use the global image as input, or exploit local information by annotating extra bounding boxes or key points. However, training with the global image underutilizes discriminative local information, while providing extra annotations is expensive and subjective. In this paper, we propose an attention-guided approach to automatically localize the discriminative regions for BAA without any extra annotations. Specifically, we first train a classification model to learn the attention maps of the discriminative regions, finding the hand region, the most discriminative region (the carpal bones), and the next most discriminative region (the metacarpal bones). Guided by those attention maps, we then crop the informative local regions from the original image and aggregate different regions for BAA. Instead of taking BAA as a general regression task, which is suboptimal due to the label ambiguity problem in the age label space, we propose using joint age distribution learning and expectation regression, which makes use of the ordinal relationship among hand images with different individual ages and leads to more robust age estimation. Extensive experiments are conducted on the RSNA pediatric bone age dataset. Using no training annotations, our method achieves competitive results compared with existing state-of-the-art semi-automatic deep learning-based methods that require manual annotation. Code is available at https://github.com/chenchao666/Bone-Age-Assessment.
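
The idea of joint age distribution learning and expectation regression mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the Gaussian softening of the label, the width `sigma`, the 0 to 228 month bin range, the weight `lam`, and the choice of a KL term plus an L1 term on the expected age are all assumptions made here for the example.

```python
import numpy as np


def gaussian_label_distribution(age: float, bins: np.ndarray, sigma: float = 12.0) -> np.ndarray:
    """Soften a scalar age label into a normalized distribution over age bins (assumed Gaussian)."""
    weights = np.exp(-0.5 * ((bins - age) / sigma) ** 2)
    return weights / weights.sum()


def softmax(logits: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over a 1-D array of logits."""
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()


def joint_loss(logits, age, bins, sigma=12.0, lam=1.0, eps=1e-12):
    """KL(target || prediction) plus lam * |expected age - true age| (an assumed combination)."""
    target = gaussian_label_distribution(age, bins, sigma)
    pred = softmax(logits)
    kl = float(np.sum(target * (np.log(target + eps) - np.log(pred + eps))))
    expected_age = float(np.dot(pred, bins))  # expectation regression: mean of the predicted distribution
    return kl + lam * abs(expected_age - age), expected_age


bins = np.arange(0, 229, dtype=float)  # hypothetical age bins in months
uniform_logits = np.zeros_like(bins)   # an uninformative prediction
loss, expected_age = joint_loss(uniform_logits, age=120.0, bins=bins)
print(loss, expected_age)
```

The distribution term lets neighbouring ages share probability mass, which is one way to encode the ordinal relationship between labels, while the expectation term ties the predicted distribution back to a single age estimate.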

preprint, 2020, arXiv

Electrically-Tunable Stochasticity for Spin-based Neuromorphic Circuits: Self-Adjusting to Variation

Energy-efficient methods are addressed for leveraging low energy barrier nanomagnetic devices within neuromorphic architectures. Using a Magnetoresistive Random Access Memory (MRAM) probabilistic device (p-bit) as the basis of neuronal structures in Deep Belief Networks (DBNs), the impact of reducing the Magnetic Tunnel Junction's (MTJ's) energy barrier is assessed and optimized for the resulting stochasticity present in the learning system. This can mitigate the process variation sensitivity of stochastic DBNs, which encounter a sharp drop-off when energy barriers exceed near-zero kT. As evaluated for the MNIST dataset for energy barriers at near-zero kT to 2.0 kT in increments of 0.5 kT, it is shown that the stability factor changes by 5 orders of magnitude. The self-compensating circuit developed herein provides a compact, low-complexity approach to mitigating process variation impacts, toward practical implementation and fabrication.

preprint, 2020, arXiv

ESAM: Discriminative Domain Adaptation with Non-Displayed Items to Improve Long-Tail Performance

Most ranking models are trained only with displayed items (most of which are hot items), but they are used to retrieve items from the entire space, which consists of both displayed and non-displayed items (most of which are long-tail items). Due to this sample selection bias, the long-tail items lack sufficient records to learn good feature representations, i.e., they suffer from data sparsity and cold start problems. The resultant distribution discrepancy between displayed and non-displayed items causes poor long-tail performance. To this end, we propose an entire space adaptation model (ESAM) to address this problem from the perspective of domain adaptation (DA). ESAM regards displayed and non-displayed items as source and target domains respectively. Specifically, we design the attribute correlation alignment that considers the correlation between high-level attributes of the item to achieve distribution alignment. Furthermore, we introduce two effective regularization strategies, i.e., center-wise clustering and self-training, to improve the DA process. Without requiring any auxiliary information or auxiliary domains, ESAM transfers the knowledge from displayed items to non-displayed items to alleviate the distribution inconsistency. Experiments on two public datasets and a large-scale industrial dataset collected from Taobao demonstrate that ESAM achieves state-of-the-art performance, especially in the long-tail space. Besides, we deploy ESAM to the Taobao search engine, leading to significant improvement in online performance. The code is available at https://github.com/A-bone1/ESAM.git

preprint, 2020, arXiv

Hardware implementation of Bayesian network building blocks with stochastic spintronic devices

Bayesian networks are powerful statistical models to understand causal relationships in real-world probabilistic problems such as diagnosis, forecasting, computer vision, etc. For systems that involve complex causal dependencies among many variables, the associated Bayesian networks become computationally intractable. As a result, direct hardware implementation of these networks is one promising approach to reducing power consumption and execution time. However, the few hardware implementations of Bayesian networks presented in the literature rely on deterministic CMOS devices that are not efficient in representing the inherently stochastic variables in a Bayesian network. This work presents an experimental demonstration of a Bayesian network building block implemented with naturally stochastic spintronic devices. These devices are based on nanomagnets with perpendicular magnetic anisotropy, initialized to their hard axes by the spin orbit torque from a heavy metal under-layer utilizing the giant spin Hall effect, enabling stochastic behavior. We construct an electrically interconnected network of two stochastic devices and manipulate the correlations between their states by changing connection weights and biases. By mapping given conditional probability tables to the circuit hardware, we demonstrate that any two-node Bayesian network can be implemented by our stochastic network. We then present the stochastic simulation of an example case of a four-node Bayesian network using our proposed device, with parameters taken from the experiment. We view this work as a first step towards the large-scale hardware implementation of Bayesian networks.
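
How a connection weight can shape the correlation between two stochastic units can be illustrated with a small software simulation. This sketch assumes an idealized behavioral model in which each unit takes the value +1 with probability (1 + tanh(drive)) / 2; that model, the function names and all parameter values are assumptions for illustration and are not taken from the paper's devices or measurements.

```python
import numpy as np


def sample_two_units(weight, bias=(0.0, 0.0), steps=20000, seed=0):
    """Sequentially update two binary (+1/-1) stochastic units coupled by `weight`."""
    rng = np.random.default_rng(seed)
    state = np.array([1.0, 1.0])
    history = np.empty((steps, 2))
    for t in range(steps):
        for i in (0, 1):
            drive = weight * state[1 - i] + bias[i]  # input from the other unit plus a bias
            # Idealized stochastic unit: P(state = +1) = (1 + tanh(drive)) / 2
            state[i] = 1.0 if rng.uniform(-1.0, 1.0) < np.tanh(drive) else -1.0
        history[t] = state
    return history


def correlation(history):
    """Mean product of the two unit states: +1 fully correlated, -1 fully anti-correlated."""
    return float(np.mean(history[:, 0] * history[:, 1]))


for w in (-1.0, 0.0, 1.0):
    print(w, correlation(sample_two_units(w)))
```

In this idealized model a positive weight makes the two units tend to agree, a negative weight makes them tend to disagree, and a zero weight leaves them independent; the biases shift each unit's individual probability of being +1.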

preprint, 2020, arXiv

Ro-SOS: Metric Expression Network (MEnet) for Robust Salient Object Segmentation

Although deep CNNs have brought significant improvement to image saliency detection, most CNN-based models are sensitive to distortion such as compression and noise. In this paper, we propose an end-to-end generic salient object segmentation model called Metric Expression Network (MEnet) to perform saliency detection with tolerance to distortion. Within MEnet, a new topological metric space is constructed, whose implicit metric is determined by the deep network. As a result, we manage to group all the pixels in the observed image semantically within this latent space into two regions: a salient region and a non-salient region. With this architecture, all feature extractions are carried out at the pixel level, enabling fine granularity of output boundaries of the salient objects. Moreover, we attempt a general analysis of the noise robustness of the network in the sense of the Lipschitz and Jacobian literature. Experiments demonstrate that robust salient maps facilitating object segmentation can be generated by the proposed metric. Tests on several public benchmarks show that MEnet has achieved desirable performance. Furthermore, by direct computation and measuring the robustness, the proposed method outperforms previous CNN-based methods on distorted inputs.

preprint, 2020, arXiv

Selective Transfer with Reinforced Transfer Network for Partial Domain Adaptation

One crucial aspect of partial domain adaptation (PDA) is how to select the relevant source samples in the shared classes for knowledge transfer. Previous PDA methods tackle this problem by re-weighting the source samples based on their high-level information (deep features). However, because of the domain shift between source and target domains, using only the deep features for sample selection is inadequate. We argue that it is more reasonable to additionally exploit pixel-level information for the PDA problem, as the appearance difference between outlier source classes and target classes is significantly large. In this paper, we propose a reinforced transfer network (RTNet), which utilizes both high-level and pixel-level information for the PDA problem. Our RTNet is composed of a reinforced data selector (RDS) based on reinforcement learning (RL), which filters out the outlier source samples, and a domain adaptation model which minimizes the domain discrepancy in the shared label space. Specifically, in the RDS, we design a novel reward based on the reconstruction errors of selected source samples on the target generator, which introduces pixel-level information to guide the learning of the RDS. Besides, we develop a state containing high-level information, which is used by the RDS for sample selection. The proposed RDS is a general module, which can be easily integrated into existing DA models to make them fit the PDA situation. Extensive experiments indicate that RTNet can achieve state-of-the-art performance for PDA tasks on several benchmark datasets.

preprint, 2019, arXiv

Atomically Controlled Tunable Doping in High Performance WSe2 Devices

Two-dimensional transition metal dichalcogenide (TMD) field-effect transistors (FETs) are promising candidates for future electronic applications, owing to their excellent transport properties and potential for ultimate device scaling. However, it is widely acknowledged that substantial contact resistance associated with the contact-TMD interface has impeded device performance to a large extent. It has been discovered that O2 plasma treatment can convert WSe2 into WO3-x and substantially improve the contact resistance of p-type WSe2 devices through a thinner depletion width induced by strong doping. In this paper, we carefully study the temperature dependence of this conversion, demonstrating an oxidation process with precise monolayer control at room temperature and multilayer conversion at elevated temperatures. Furthermore, the lateral oxidation of WSe2 under the contact revealed by HR-STEM leads to potential unpinning of the metal Fermi level and Schottky barrier lowering, resulting in lower contact resistances. The p-doping effect is attributed to the high electron affinity of the formed WO3-x layer on top of the remaining WSe2 channel, and the doping level is found to be dependent on the WO3-x thickness that is controlled by the temperature. Comprehensive materials and electrical characterizations are presented, with a low contact resistance of ~528 ohm-um and record high on-state current of 320 uA/um at -1V bias being reported.

preprint, 2019, arXiv

Correlated fluctuations in spin orbit torque-coupled perpendicular nanomagnets

Low barrier nanomagnets have attracted a lot of research interest for their use as sources of high quality true random number generation. More recently, low barrier nanomagnets with tunable output have been shown to be a natural hardware platform for unconventional computing paradigms such as probabilistic spin logic. Efficient generation and tunability of high quality random bits is critical for these novel applications. However, current spintronic random number generators are based on superparamagnetic tunnel junctions (SMTJs) with tunability obtained through spin transfer torque (STT), which unavoidably leads to challenges in designing concatenated networks using these two-terminal devices. The more recent development of utilizing spin orbit torque (SOT) allows for a three-terminal device design, but can only tune in-plane magnetization freely, which is not very energy efficient due to the need to overcome a large demagnetization field. In this work, we experimentally demonstrate, for the first time, a stochastic device with perpendicular magnetic anisotropy (PMA) that is completely tunable by SOT without the aid of any external magnetic field. Our measurements lead us to hypothesize that a tilted anisotropy might be responsible for the observed tunability. We carry out stochastic Landau-Lifshitz-Gilbert (sLLG) simulations to confirm our experimental observation. Finally, we build an electrically coupled network of two such stochastic nanomagnet-based devices and demonstrate that finite correlation or anti-correlation can be established between their output fluctuations by a weak interconnection, despite a large difference in their natural fluctuation time scales. Simulations based on a newly developed dynamical model for autonomous circuits composed of low barrier nanomagnets show close agreement with the experimental results.