Source author record

Peng Yao

Peng Yao appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Computer Vision, Multimedia, physics.app-ph, physics.optics, Artificial Intelligence, Computation and Language, cond-mat.dis-nn, cond-mat.mtrl-sci, Emerging Technologies, Machine Learning

Catalog footprint

What is connected

7 works

10 topics

4 close collaborators

preprint · 2022 · arXiv

2022 Roadmap on Neuromorphic Computing and Engineering

Modern computation based on the von Neumann architecture is today a mature cutting-edge science. In the von Neumann architecture, processing and memory units are implemented as separate blocks interchanging data intensively and continuously. This data transfer is responsible for a large part of the power consumption. The next generation computer technology is expected to solve problems at the exascale with $10^{18}$ calculations each second. Even though these future computers will be incredibly powerful, if they are based on von Neumann type architectures, they will consume between 20 and 30 megawatts of power and will not have intrinsic physically built-in capabilities to learn or deal with complex data as our brain does. These needs can be addressed by neuromorphic computing systems which are inspired by the biological concepts of the human brain. This new generation of computers has the potential to be used for the storage and processing of large amounts of digital information with much lower power consumption than conventional processors. Among their potential future applications, an important niche is moving the control from data centers to edge devices. The aim of this Roadmap is to present a snapshot of the present state of neuromorphic technology and provide an opinion on the challenges and opportunities that the future holds in the major areas of neuromorphic technology, namely materials, devices, neuromorphic circuits, neuromorphic algorithms, applications, and ethics. The Roadmap is a collection of perspectives where leading researchers in the neuromorphic community provide their own view about the current state and the future challenges. We hope that this Roadmap will be a useful resource to readers outside this field, for those who are just entering the field, and for those who are well established in the neuromorphic community. https://doi.org/10.1088/2634-4386/ac4a83
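As a rough check on the figures quoted in the abstract above, the stated power draw and calculation rate imply an average energy budget per calculation. The sketch below only divides one quoted number by the other; the function and constant names are illustrative and do not come from the Roadmap.

```python
def energy_per_operation_pj(power_watts: float, ops_per_second: float) -> float:
    """Average energy spent per operation, in picojoules."""
    joules_per_op = power_watts / ops_per_second
    return joules_per_op * 1e12  # 1 J = 1e12 pJ


# Figures taken from the abstract: exascale rate and a 20 to 30 MW power draw.
EXASCALE_OPS_PER_SECOND = 1e18

low = energy_per_operation_pj(20e6, EXASCALE_OPS_PER_SECOND)
high = energy_per_operation_pj(30e6, EXASCALE_OPS_PER_SECOND)
print(f"{low:.0f} to {high:.0f} pJ per operation")
```

Under these numbers the budget works out to a few tens of picojoules per calculation, averaged over the whole machine.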

preprint · 2022 · arXiv

Integrated lithium niobate intensity modulator on a silicon handle with slow-wave electrodes

Segmented, or slow-wave, electrodes have emerged as an index-matching solution to improve the bandwidth of traveling-wave Mach-Zehnder and phase modulators on the thin-film lithium niobate on insulator platform. However, these devices require the use of a quartz handle or substrate removal, adding cost and additional processing. In this work, a high-speed dual-output electro-optic intensity modulator in the thin-film silicon nitride and lithium niobate material system that uses segmented electrodes for RF and optical index matching is presented. The device uses a silicon handle and does not require substrate removal. A silicon handle allows the use of larger wafer sizes to increase yield, and lends itself to processing in established silicon foundries that may not have the capability to process a quartz or fused silica wafer. The modulator has an interaction region of 10 mm, shows a DC half-wave voltage of 3.75 V, an ultra-high extinction ratio of roughly 45 dB consistent with previous work, and a fiber-to-fiber insertion loss of 7.47 dB with a 95 GHz 3 dB bandwidth.

preprint · 2022 · arXiv

Single Model Deep Learning on Imbalanced Small Datasets for Skin Lesion Classification

Deep convolutional neural network (DCNN) models have been widely explored for skin disease diagnosis, and some of them have achieved diagnostic outcomes comparable or even superior to those of dermatologists. However, broad implementation of DCNNs in skin disease detection is hindered by the small size and data imbalance of the publicly accessible skin lesion datasets. This paper proposes a novel single-model based strategy for classification of skin lesions on small and imbalanced datasets. First, various DCNNs are trained on different small and imbalanced datasets to verify that the models with moderate complexity outperform the larger models. Second, DropOut and DropBlock regularization are added to reduce overfitting, and a Modified RandAugment augmentation strategy is proposed to deal with the defects of sample underrepresentation in the small dataset. Finally, a novel Multi-Weighted New Loss (MWNL) function and an end-to-end cumulative learning strategy (CLS) are introduced to overcome the challenge of uneven sample size and classification difficulty and to reduce the impact of abnormal samples on training. By combining Modified RandAugment, MWNL and CLS, our single DCNN model method achieved classification accuracy comparable or superior to that of multiple ensembled models on different dermoscopic image datasets. Our study shows that this method is able to achieve a high classification performance at a low cost of computational resources and inference time, making it potentially suitable for implementation in mobile devices for automated screening of skin lesions and many other malignancies in low resource settings.

preprint · 2022 · arXiv

Ultra-high extinction dual-output thin-film lithium niobate intensity modulator

A low voltage, wide bandwidth compact electro-optic modulator is a key building block in the realization of tomorrow's communication and networking needs. Recent advances in the fabrication and application of thin-film lithium niobate, and its integration with photonic integrated circuits based in silicon make it an ideal platform for such a device. In this work, a high-extinction dual-output folded electro-optic Mach-Zehnder modulator in the silicon nitride and thin-film lithium niobate material system is presented. This modulator has an interaction region length of 11 mm and a physical length of 7.8 mm. The device demonstrates a fiber-to-fiber loss of roughly 12 dB using on-chip fiber couplers and DC half-wave voltage ($V_\pi$) of less than 3.0 V, or a modulation efficiency ($V_\pi \cdot L$) of 3.3 V$\cdot$cm. The device shows a 3 dB bandwidth of roughly 30 GHz. Notably, the device demonstrates a power extinction ratio over 45 dB at each output port without the use of cascaded directional couplers or additional control circuitry; roughly 31 times better than previously reported devices. Paired with a balanced photo-diode receiver, this modulator can be used in various photonic communication systems. Such a detecting scheme is compatible with complex modulation formats such as differential phase shift keying and differential quadrature phase shift keying, where a dual-output, ultra-high extinction device is fundamentally paramount to low-noise operation of the system.
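The figures in the abstract above can be cross-checked with a little arithmetic. The sketch below multiplies the quoted half-wave voltage bound by the quoted interaction length, and converts between decibels and linear power ratios. It assumes the "31 times" comparison refers to a linear power ratio, which the abstract does not state explicitly; the helper names are illustrative.

```python
import math


def db_to_power_ratio(db: float) -> float:
    """Convert a power ratio in decibels to a linear ratio."""
    return 10 ** (db / 10)


def power_ratio_to_db(ratio: float) -> float:
    """Convert a linear power ratio to decibels."""
    return 10 * math.log10(ratio)


def modulation_efficiency_v_cm(v_pi_volts: float, length_mm: float) -> float:
    """Half-wave voltage times interaction length, in V·cm."""
    return v_pi_volts * length_mm / 10.0  # 10 mm per cm


# Quoted values: V_pi below 3.0 V, 11 mm interaction length, 45 dB extinction.
v_pi_l = modulation_efficiency_v_cm(3.0, 11.0)
extinction_linear = db_to_power_ratio(45.0)
improvement_db = power_ratio_to_db(31.0)  # assumption: "31 times" is a linear ratio

print(f"V_pi * L upper bound: {v_pi_l:.1f} V·cm")
print(f"45 dB extinction as a linear ratio: {extinction_linear:,.0f}")
print(f"A factor of 31 in decibels: {improvement_db:.1f} dB")
```

Because the abstract gives the half-wave voltage as "less than 3.0 V", the product computed here is an upper bound that matches the quoted 3.3 V·cm.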

preprint · 2021 · arXiv

Low-cost and high-performance data augmentation for deep-learning-based skin lesion classification

Although deep convolutional neural networks (DCNNs) have achieved significant accuracy in skin lesion classification, comparable or even superior to that of dermatologists, practical implementation of these models for skin cancer screening in low resource settings is hindered by their limitations in computational cost and training dataset. To overcome these limitations, we propose a low-cost and high-performance data augmentation strategy that includes two consecutive stages of augmentation search and network search. At the augmentation search stage, the augmentation strategy is optimized in the search space of Low-Cost-Augment (LCA) under the criterion of balanced accuracy (BACC) with 5-fold cross validation. At the network search stage, the DCNNs are fine-tuned with the full training set in order to select the model with the highest BACC. The efficiency of the proposed data augmentation strategy is verified on the HAM10000 dataset using EfficientNets as a baseline. With the proposed strategy, we are able to reduce the search space to 60 and achieve a high BACC of 0.853 by using a single DCNN model without an external database, making it suitable for implementation in mobile devices for DCNN-based skin lesion detection in low resource settings.
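The abstract above selects models by balanced accuracy (BACC) but does not define it. The sketch below assumes the usual definition, the unweighted mean of per-class recall, and shows on made-up labels why it is stricter than plain accuracy on an imbalanced dataset. The class names and data are hypothetical and are not drawn from HAM10000.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall (assumed definition of BACC)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Hypothetical imbalanced toy data: 8 "common" samples and 2 "rare" samples.
y_true = ["common"] * 8 + ["rare"] * 2
# The classifier gets every "common" sample right but misses one "rare" sample.
y_pred = ["common"] * 8 + ["common", "rare"]

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
bacc = balanced_accuracy(y_true, y_pred)
print(f"accuracy={plain_accuracy:.2f}, balanced accuracy={bacc:.2f}")
```

On this toy data, plain accuracy is 0.90 while balanced accuracy is 0.75, because the single missed minority-class sample halves that class's recall.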

preprint · 2020 · arXiv

Normalized and Geometry-Aware Self-Attention Network for Image Captioning

Self-attention (SA) networks have shown profound value in image captioning. In this paper, we improve SA in two respects to improve the performance of image captioning. First, we propose Normalized Self-Attention (NSA), a reparameterization of SA that brings the benefits of normalization inside SA. While normalization was previously applied only outside SA, we introduce a novel normalization method and demonstrate that it is both possible and beneficial to perform it on the hidden activations inside SA. Second, to compensate for a major limitation of the Transformer, namely that it fails to model the geometric structure of the input objects, we propose a class of Geometry-aware Self-Attention (GSA) that extends SA to explicitly and efficiently consider the relative geometric relations between the objects in the image. To construct our image captioning model, we combine the two modules and apply them to the vanilla self-attention network. We extensively evaluate our proposals on the MS-COCO image captioning dataset, and superior results are achieved compared with state-of-the-art approaches. Further experiments on three challenging tasks, i.e. video captioning, machine translation, and visual question answering, show the generality of our methods.

preprint · 2020 · arXiv

Vatex Video Captioning Challenge 2020: Multi-View Features and Hybrid Reward Strategies for Video Captioning

This report describes our solution for the VATEX Captioning Challenge 2020, which requires generating descriptions for the videos in both English and Chinese. We identified three crucial factors that improve performance, namely multi-view features, hybrid reward, and diverse ensemble. Building on our method for the VATEX 2019 challenge, we achieved significant improvements this year with more advanced model architectures, a combination of appearance and motion features, and careful hyperparameter tuning. Our method achieves very competitive results on both the Chinese and English video captioning tracks.
