Researcher profile

Chenghao Feng

Chenghao Feng contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
8 works
0 followers
10 topics
4 close collaborators

Published work

8 published items

preprint | 2023 | arXiv

A Point-of-Care Biosensor for Rapid Detection and Differentiation of COVID-19 Virus (SARS-CoV-2) and Influenza Virus Using Subwavelength Grating Micro-ring Resonator

In the context of continued spread of coronavirus disease 2019 (COVID-19) caused by SARS-CoV-2 and the emergence of new variants, the demand for rapid, accurate, and frequent detection is increasing. Besides, the new predominant strain, Omicron variant, manifests more similar clinical features to those of other common respiratory infections. The concurrent detection of multiple potential pathogens helps distinguish SARS-CoV-2 infection from other diseases with overlapping symptoms, which is significant for patients to receive tailored treatment and containing the outbreak. Here, we report a lab-on-a-chip biosensing platform for SARS-CoV-2 detection based on subwavelength grating micro-ring resonator. The sensing surface is functionalized by specific antibody against SARS-CoV-2 spike protein, which could produce redshifts of resonant peaks by antigen-antibody combination, thus achieving quantitative detection. Additionally, the sensor chip is integrated with a microfluidic chip with an anti-backflow Y-shaped structure that enables the concurrent detection of two analytes. In this study, we realized the detection and differentiation of COVID-19 and influenza A H1N1. Experimental results show that the limit of detection of our device reaches 100 fg/mL (1.31 fM) within 15 min detecting time, and cross-reactivity tests manifest the specificity of the optical diagnostic assay. Further, the integrated packaging and streamlined workflow facilitate its use for clinical applications. Thus, the biosensing platform offers a promising solution to achieve ultrasensitive, selective, multiplexed, and quantitative point-of-care detection of COVID-19.
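
The two detection-limit figures quoted in the abstract above (100 fg/mL and 1.31 fM) can be cross-checked with a short unit conversion. The sketch below assumes both numbers describe the same analyte; the function and variable names are illustrative and do not come from the paper.

```python
def implied_molar_mass(mass_conc_g_per_l: float, molar_conc_mol_per_l: float) -> float:
    """Molar mass (g/mol) implied by a mass concentration and a molar concentration."""
    return mass_conc_g_per_l / molar_conc_mol_per_l


# 1 fg = 1e-15 g and 1 mL = 1e-3 L, so 1 fg/mL = 1e-12 g/L.
FG_PER_ML_TO_G_PER_L = 1e-15 / 1e-3

lod_mass = 100 * FG_PER_ML_TO_G_PER_L  # 100 fg/mL expressed in g/L
lod_molar = 1.31e-15                   # 1.31 fM expressed in mol/L

molar_mass = implied_molar_mass(lod_mass, lod_molar)
print(f"{molar_mass / 1000:.1f} kDa")  # prints "76.3 kDa"
```

The implied molar mass of roughly 76 kDa is only a consistency check on the quoted numbers; the abstract does not state which antigen fragment it refers to.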

preprint | 2023 | arXiv

Lightening-Transformer: A Dynamically-operated Optically-interconnected Photonic Transformer Accelerator

The wide adoption and significant computing resource of attention-based transformers, e.g., Vision Transformers and large language models (LLM), have driven the demand for efficient hardware accelerators. There is a growing interest in exploring photonics as an alternative technology to digital electronics due to its high energy efficiency and ultra-fast processing speed. Photonic accelerators have shown promising results for CNNs, which mainly rely on weight-static linear operations. However, they encounter issues when efficiently supporting Transformer architectures, questioning the applicability of photonics to advanced ML tasks. The primary hurdle lies in their inefficiency in handling unique workloads in Transformers, i.e., dynamic and full-range tensor multiplication. In this work, we propose Lightening-Transformer, the first light-empowered, high-performance, and energy-efficient photonic Transformer accelerator. To overcome prior designs' fundamental limitations, we introduce a novel dynamically-operated photonic tensor core, DPTC, a crossbar array of interference-based optical vector dot-product engines supporting highly parallel, dynamic, and full-range matrix multiplication. Furthermore, we design a dedicated accelerator that integrates our novel photonic computing cores with photonic interconnects for inter-core data broadcast, fully unleashing the power of optics. Comprehensive evaluations show that ours achieves >2.6x energy and >12x latency reductions compared to prior photonic accelerators and delivers the lowest energy cost and 2 to 3 orders of magnitude lower energy-delay product compared to electronic Transformer accelerators, all while maintaining digital-comparable accuracy. Our work highlights the immense potential of photonics for advanced ML workloads, such as Transformer-backboned LLM. Our work is available at https://github.com/zhuhanqing/Lightening-Transformer.

preprint | 2023 | arXiv

M3ICRO: Machine Learning-Enabled Compact Photonic Tensor Core based on PRogrammable Multi-Operand Multimode Interference

Photonic computing shows promise for transformative advancements in machine learning (ML) acceleration, offering ultra-fast speed, massive parallelism, and high energy efficiency. However, current photonic tensor core (PTC) designs based on standard optical components hinder scalability and compute density due to their large spatial footprint. To address this, we propose an ultra-compact PTC using customized programmable multi-operand multimode interference (MOMMI) devices, named M3ICRO. The programmable MOMMI leverages the intrinsic light propagation principle, providing a single-device programmable matrix unit beyond the conventional computing paradigm of one multiply-accumulate (MAC) operation per device. To overcome the optimization difficulty of customized devices that often requires time-consuming simulation, we apply ML for optics to predict the device behavior and enable a differentiable optimization flow. We thoroughly investigate the reconfigurability and matrix expressivity of our customized PTC, and introduce a novel block unfolding method to fully exploit the computing capabilities of a complex-valued PTC for near-universal real-valued linear transformations. Extensive evaluations demonstrate that M3ICRO achieves a 3.4-9.6x smaller footprint, 1.6-4.4x higher speed, 10.6-42x higher compute density, 3.7-12x higher system throughput, and superior noise robustness compared to state-of-the-art coherent PTC designs, while maintaining close-to-digital task accuracy across various ML benchmarks. Our code is open-sourced at https://github.com/JeremieMelo/M3ICRO-MOMMI.

preprint | 2022 | arXiv

A compact butterfly-style silicon photonic-electronic neural chip for hardware-efficient deep learning

The optical neural network (ONN) is a promising hardware platform for next-generation neurocomputing due to its high parallelism, low latency, and low energy consumption. Previous ONN architectures are mainly designed for general matrix multiplication (GEMM), leading to unnecessarily large area cost and high control complexity. Here, we move beyond classical GEMM-based ONNs and propose an optical subspace neural network (OSNN) architecture, which trades the universality of weight representation for lower optical component usage, area cost, and energy consumption. We devise a butterfly-style photonic-electronic neural chip to implement our OSNN with up to 7x fewer trainable optical components compared to GEMM-based ONNs. Additionally, a hardware-aware training framework is provided to minimize the required device programming precision, lessen the chip area, and boost the noise robustness. We experimentally demonstrate the utility of our neural chip in practical image recognition tasks, showing that a measured accuracy of 94.16% can be achieved in hand-written digit recognition tasks with 3-bit weight programming precision.
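
To make "3-bit weight programming precision" concrete: three bits give 2^3 = 8 programmable levels per weight. The sketch below uses a plain uniform quantizer over an assumed range of [-1, 1]. Both the uniform spacing and the range are assumptions for illustration; the abstract does not say how the chip's hardware-aware training maps weights to device settings.

```python
def quantize_uniform(w: float, bits: int = 3, lo: float = -1.0, hi: float = 1.0) -> float:
    """Snap w to the nearest of 2**bits evenly spaced levels in [lo, hi].

    Illustrative only: uniform spacing and the [lo, hi] range are assumptions.
    """
    levels = 2 ** bits                  # 8 levels for 3 bits
    step = (hi - lo) / (levels - 1)     # spacing between adjacent levels
    clipped = min(max(w, lo), hi)       # values outside the range saturate
    index = round((clipped - lo) / step)
    return lo + index * step


print(quantize_uniform(0.3))  # nearest level is 3/7, about 0.4286
```

With only eight levels the worst-case rounding error in this sketch is half a step (1/7 of the full-scale amplitude), which gives a sense of the error budget that a noise-robust training scheme has to tolerate.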

preprint | 2022 | arXiv

ADEPT: Automatic Differentiable DEsign of Photonic Tensor Cores

Photonic tensor cores (PTCs) are essential building blocks for optical artificial intelligence (AI) accelerators based on programmable photonic integrated circuits. PTCs can achieve ultra-fast and efficient tensor operations for neural network (NN) acceleration. Current PTC designs are either manually constructed or based on matrix decomposition theory, which lacks the adaptability to meet various hardware constraints and device specifications. To our best knowledge, automatic PTC design methodology is still unexplored. It will be promising to move beyond the manual design paradigm and "nurture" photonic neurocomputing with AI and design automation. Therefore, in this work, for the first time, we propose a fully differentiable framework, dubbed ADEPT, that can efficiently search PTC designs adaptive to various circuit footprint constraints and foundry PDKs. Extensive experiments show superior flexibility and effectiveness of the proposed ADEPT framework to explore a large PTC design space. On various NN models and benchmarks, our searched PTC topology outperforms prior manually-designed structures with competitive matrix representability, 2-30x higher footprint compactness, and better noise robustness, demonstrating a new paradigm in photonic neural chip design. The code of ADEPT is available at https://github.com/JeremieMelo/ADEPT using the https://github.com/JeremieMelo/pytorch-onn (TorchONN) library.

preprint | 2022 | arXiv

Joint Hybrid and Passive RIS-Assisted Beamforming for MmWave MIMO Systems Relying on Dynamically Configured Subarrays

Reconfigurable intelligent surface (RIS) assisted millimeter-wave (mmWave) communication systems relying on hybrid beamforming structures are capable of achieving high spectral efficiency at a low hardware complexity and low power consumption. In this paper, we propose an RIS-assisted mmWave point-to-point system relying on dynamically configured sub-array connected hybrid beamforming structures. More explicitly, an energy-efficient analog beamformer relying on twin-resolution phase shifters is proposed. Then, we conceive a successive interference cancelation (SIC) based method for jointly designing the hybrid beamforming matrix of the base station (BS) and the passive beamforming matrix of the RIS. Specifically, the associated bandwidth-efficiency maximization problem is transformed into a series of sub-problems, where the sub-array of phase shifters and RIS elements are jointly optimized for maximizing each sub-array's rate. Furthermore, a greedy method is proposed for determining the phase shifter configuration of each sub-array. We then propose to update the RIS elements relying on a complex circle manifold (CCM)-based method. The proposed dynamic sub-connected structure as well as the proposed joint hybrid and passive beamforming method strikes an attractive trade-off between the bandwidth efficiency and power consumption. Our simulation results demonstrate the superiority of the proposed method compared to its traditional counterparts.

preprint | 2022 | arXiv

Weighted Sum Rate Maximization of the mmWave Cell-Free MIMO Downlink Relying on Hybrid Precoding

The cell-free MIMO concept relying on hybrid precoding constitutes an innovative technique capable of dramatically increasing the network capacity of millimeter-wave (mmWave) communication systems. It dispenses with the cell boundary of conventional multi-cell MIMO systems, while drastically reducing the power consumption by limiting the number of radio frequency (RF) chains at the access points (APs). In this paper, we aim for maximizing the weighted sum rate (WSR) of mmWave cell-free MIMO systems by conceiving a low-complexity hybrid precoding algorithm. We formulate the WSR optimization problem subject to the transmit power constraint for each AP and the constant-modulus constraint for the phase shifters of the analog precoders. A block coordinate descent (BCD) algorithm is proposed for iteratively solving the problem. In each iteration, the classic Lagrangian multiplier method and the penalty dual decomposition (PDD) method are combined for obtaining near-optimal hybrid analog/digital precoding matrices. Furthermore, we extend our proposed algorithm for deriving closed-form expressions for the precoders of fully digital cell-free MIMO systems. Moreover, we present the convergency analysis and complexity analysis of our proposed method. Finally, our simulation results demonstrate the superiority of the algorithms proposed for both fully digital and hybrid precoding matrices.

preprint | 2020 | arXiv

Beam Selection for Wideband Millimeter Wave MIMO Relying on Lens Antenna Arrays

Beamspace multi-input multi-output (MIMO) relying on lens antenna arrays can significantly reduce the number of radio-frequency chains in millimeter-wave (mmWave) communication systems through beam selection. However, the beamforming gain is actually frequency-dependent in wideband mmWave MIMO systems. This phenomenon is called beam squint, which will deteriorate the system's performance when traditional beam selection methods are used. To solve this problem, we propose a wideband beam selection method for mmWave MIMO systems relying on lens antenna arrays. Firstly, we select one beam with the maximal energy averaged over the whole band for each user and then we sequentially select the beams that contribute the most to the sum-rate. Performance analysis of the proposed wideband beam selection method is also presented. Numerical results show that the proposed method achieves higher sum-rate and energy efficiency compared with its traditional counterparts.
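
The two-stage selection described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the channel layout `h[subcarrier][beam][user]`, the function names, and the `rate_fn` callback are all hypothetical, and the rate proxy used in the demo ignores inter-user interference, so it is a stand-in for a real sum-rate expression.

```python
import math
from typing import Callable, List, Sequence

Channel = Sequence[Sequence[Sequence[complex]]]  # h[subcarrier][beam][user]


def strongest_beam_per_user(h: Channel) -> List[int]:
    """Stage 1: for each user, the beam with the largest band-averaged energy."""
    n_sub, n_beams, n_users = len(h), len(h[0]), len(h[0][0])
    picks = []
    for u in range(n_users):
        avg_energy = [
            sum(abs(h[k][n][u]) ** 2 for k in range(n_sub)) / n_sub
            for n in range(n_beams)
        ]
        picks.append(max(range(n_beams), key=avg_energy.__getitem__))
    return picks


def select_beams(h: Channel, n_select: int,
                 rate_fn: Callable[[List[int]], float]) -> List[int]:
    """Stage 2: greedily add the beam that most increases rate_fn."""
    selected: List[int] = []
    for b in strongest_beam_per_user(h):
        if b not in selected:  # two users may share the same strongest beam
            selected.append(b)
    while len(selected) < n_select:
        candidates = [n for n in range(len(h[0])) if n not in selected]
        if not candidates:
            break
        selected.append(max(candidates, key=lambda n: rate_fn(selected + [n])))
    return selected


def make_energy_proxy(h: Channel, snr: float) -> Callable[[List[int]], float]:
    """Hypothetical rate proxy: band-averaged log2(1 + snr * captured energy)."""
    def rate(beams: List[int]) -> float:
        total = 0.0
        for sub in h:
            for u in range(len(sub[0])):
                energy = sum(abs(sub[n][u]) ** 2 for n in beams)
                total += math.log2(1 + snr * energy)
        return total / len(h)
    return rate


# Toy channel: 2 subcarriers, 4 beams, 2 users. Beam 0 is strong for user 0
# on the first subcarrier only, mimicking a frequency-dependent (squinted) gain.
h_demo = [
    [[1.0, 0.1], [0.9, 0.1], [0.1, 0.2], [0.1, 1.0]],
    [[0.1, 0.1], [0.9, 0.1], [0.1, 0.2], [0.1, 1.0]],
]
print(strongest_beam_per_user(h_demo))                            # [1, 3]
print(select_beams(h_demo, 3, make_energy_proxy(h_demo, 10.0)))   # [1, 3, 0]
```

In the toy channel, a per-subcarrier choice would pick beam 0 for user 0 on the first subcarrier, while averaging over the band picks beam 1, which is the effect the wideband method is designed to capture. Passing the rate expression in as a callback keeps the greedy loop independent of whichever precoder and sum-rate formula are actually used.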