Source author record

Can Zhao

Can Zhao appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher. Unclaimed source record

Computer Vision; eess.IV; Artificial Intelligence; cond-mat.mes-hall; Distributed, Parallel, and Cluster Computing; eess.SP; eess.SY; Information Retrieval; Machine Learning; Neural and Evolutionary Computing; Systems and Control

Catalog footprint

What is connected

6 works

11 topics

4 close collaborators

preprint, 2026, arXiv

Cross-Modal Attention Network with Dual Graph Learning in Multimodal Recommendation

Multimedia recommendation systems leverage user-item interactions and multimodal information to capture user preferences, enabling more accurate and personalized recommendations. Despite notable advancements, existing approaches still face two critical limitations: first, shallow modality fusion often relies on simple concatenation, failing to exploit rich synergistic intra- and inter-modal relationships; second, asymmetric feature treatment, in which users are characterized only by interaction IDs while items benefit from rich multimodal content, hinders the learning of a shared semantic space. To address these issues, we propose a Cross-modal Recursive Attention Network with dual graph Embedding (CRANE). To tackle shallow fusion, we design a core Recursive Cross-Modal Attention (RCA) mechanism that iteratively refines modality features based on cross-correlations in a joint latent space, effectively capturing high-order intra- and inter-modal dependencies. For symmetric multimodal learning, we explicitly construct users' multimodal profiles by aggregating features of their interacted items. Furthermore, CRANE integrates a symmetric dual-graph framework, comprising a heterogeneous user-item interaction graph and a homogeneous item-item semantic graph, unified by a self-supervised contrastive learning objective to fuse behavioral and semantic signals. Despite these complex modeling capabilities, CRANE maintains high computational efficiency. Theoretical and empirical analyses confirm its scalability and high practical efficiency, with faster convergence on small datasets and higher performance ceilings on large-scale ones. Comprehensive experiments on four public real-world datasets validate an average 5% improvement in key metrics over state-of-the-art baselines.
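The profile-construction step described in this abstract (building a user's multimodal profile from the features of interacted items) can be illustrated with a minimal sketch. The abstract does not say which aggregation is used, so mean pooling is an assumption here, and the function name and data layout are hypothetical rather than taken from CRANE.

```python
from typing import Dict, List


def build_user_profiles(
    interactions: Dict[str, List[str]],
    item_features: Dict[str, List[float]],
) -> Dict[str, List[float]]:
    """Mean-pool the modality features of each user's interacted items.

    Assumption: mean pooling stands in for the unspecified aggregation.
    """
    profiles: Dict[str, List[float]] = {}
    for user, items in interactions.items():
        vectors = [item_features[i] for i in items if i in item_features]
        if not vectors:
            continue  # no known items, so no profile can be built
        dim = len(vectors[0])
        profiles[user] = [
            sum(vec[d] for vec in vectors) / len(vectors) for d in range(dim)
        ]
    return profiles
```

With this layout, a user who interacted with items whose features are `[1.0, 3.0]` and `[3.0, 5.0]` receives the profile `[2.0, 4.0]`, which places users and items in the same feature space.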

preprint, 2022, arXiv

Auto-FedRL: Federated Hyperparameter Optimization for Multi-institutional Medical Image Segmentation

Federated learning (FL) is a distributed machine learning technique that enables collaborative model training while avoiding explicit data sharing. The inherent privacy-preserving property of FL algorithms makes them especially attractive to the medical field. However, in the case of heterogeneous client data distributions, standard FL methods are unstable and require intensive hyperparameter tuning to achieve optimal performance. Conventional hyperparameter optimization algorithms are impractical in real-world FL applications because they involve numerous training trials, which are often not affordable with limited compute budgets. In this work, we propose an efficient reinforcement learning (RL)-based federated hyperparameter optimization algorithm, termed Auto-FedRL, in which an online RL agent can dynamically adjust the hyperparameters of each client based on the current training progress. Extensive experiments are conducted to investigate different search strategies and RL agents. The effectiveness of the proposed method is validated on a heterogeneous data split of the CIFAR-10 dataset as well as two real-world medical image segmentation datasets for COVID-19 lesion segmentation in chest CT and pancreas segmentation in abdominal CT.

preprint, 2021, arXiv

A Negotiation-based Right-of-way Assignment Strategy to Ensure Traffic Safety and Efficiency in Lane Change

It is widely acknowledged that verifying the safety of autonomous driving strategies requires a substantial body of simulation testing and road testing. In recent years, the formal safety methods represented by Responsibility-Sensitive Safety (RSS) have encouraged low-cost autonomous driving safety research, benefitting from their accurate assessment of safety and clear division of responsibilities. However, how to maintain traffic efficiency while ensuring safety remains a challenge. To address this problem, this paper proposes a formulized negotiation-based lane-changing strategy that makes a trade-off between safety and efficiency. Both theoretical analysis and numerical experimental results show that, compared with RSS, our strategy can noticeably improve the success rate of changing lanes on the premise of safety.
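For context on the RSS baseline named in this abstract, the sketch below computes the minimum safe longitudinal gap as defined in the published RSS model. It is not the negotiation strategy proposed in the paper, and the parameter names and the example values are illustrative assumptions.

```python
def rss_min_longitudinal_gap(
    v_rear: float,
    v_front: float,
    response_time: float,
    a_max_accel: float,
    a_min_brake: float,
    a_max_brake: float,
) -> float:
    """Minimum safe gap (m) between a rear and a front vehicle under RSS.

    Worst case: the rear car accelerates at a_max_accel during its response
    time and then brakes at a_min_brake, while the front car brakes at
    a_max_brake. Speeds are in m/s and accelerations in m/s^2.
    """
    # Rear-car speed at the end of the response time
    v_after = v_rear + response_time * a_max_accel
    gap = (
        v_rear * response_time
        + 0.5 * a_max_accel * response_time ** 2
        + v_after ** 2 / (2.0 * a_min_brake)
        - v_front ** 2 / (2.0 * a_max_brake)
    )
    # A negative value means any non-negative gap is safe
    return max(gap, 0.0)
```

For example, with both cars at 20 m/s, a 1 s response time, and accelerations of 2, 4 and 8 m/s^2, the required gap is 56.5 m. A gap requirement of this size is the kind of conservatism that can limit lane-change success rates.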

preprint, 2020, arXiv

Distributed Brillouin frequency shift extraction via a convolutional neural network

Distributed optical fiber Brillouin sensors detect the temperature and strain along a fiber according to the local Brillouin frequency shift, which is usually calculated from the measured Brillouin spectrum using Lorentzian curve fitting. In addition, cross-correlation, principal component analysis, and machine learning methods have been proposed for the more efficient extraction of Brillouin frequency shifts. However, existing methods only process each Brillouin spectrum individually, ignoring the correlation in the time domain, indicating that there is still room for improvement. Here, we propose and experimentally demonstrate a fully convolutional neural network to extract the distributed Brillouin frequency shift directly from the measured two-dimensional data. Simulated ideal Brillouin spectra with various parameters are used to train the network. Both the simulation and experimental results show that the extraction accuracy of the network is better than that of the traditional curve fitting algorithm with a much shorter processing time. This network has good universality and robustness and can effectively improve the performance of existing Brillouin sensors.
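The per-spectrum baseline mentioned in this abstract, Lorentzian curve fitting, can be sketched as follows. This is a simplified, linearized variant chosen for brevity, not the fitting routine or the network used in the paper: it relies on the fact that the reciprocal of a Lorentzian is a quadratic in frequency, so the peak follows from an ordinary least-squares parabola fit. All function names and the GHz values are hypothetical, and the reciprocal trick would amplify noise in the wings of a real spectrum.

```python
from typing import List, Sequence


def lorentzian(freq: float, peak: float, linewidth: float, gain: float = 1.0) -> float:
    """Lorentzian gain profile with full width at half maximum `linewidth`."""
    return gain / (1.0 + 4.0 * (freq - peak) ** 2 / linewidth ** 2)


def _det3(m: List[List[float]]) -> float:
    """Determinant of a 3x3 matrix by cofactor expansion."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


def estimate_bfs(freqs: Sequence[float], gains: Sequence[float]) -> float:
    """Estimate the peak frequency by fitting 1/gain = a*x^2 + b*x + c."""
    center = sum(freqs) / len(freqs)
    x = [f - center for f in freqs]  # centre the axis for better conditioning
    y = [1.0 / g for g in gains]     # reciprocal of a Lorentzian is a parabola
    n = float(len(x))
    s1, s2 = sum(x), sum(v ** 2 for v in x)
    s3, s4 = sum(v ** 3 for v in x), sum(v ** 4 for v in x)
    t0 = sum(y)
    t1 = sum(v * w for v, w in zip(x, y))
    t2 = sum(v * v * w for v, w in zip(x, y))
    # Solve the 3x3 normal equations with Cramer's rule
    det = _det3([[s4, s3, s2], [s3, s2, s1], [s2, s1, n]])
    a = _det3([[t2, s3, s2], [t1, s2, s1], [t0, s1, n]]) / det
    b = _det3([[s4, t2, s2], [s3, t1, s1], [s2, t0, n]]) / det
    return center - b / (2.0 * a)    # vertex of the fitted parabola
```

Applied to one noise-free synthetic spectrum, `estimate_bfs` recovers the peak that was used to generate it. Like the existing methods described above, it treats every spectrum independently and ignores correlation between neighbouring traces.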

preprint, 2020, arXiv

LAMP: Large Deep Nets with Automated Model Parallelism for Image Segmentation

Deep Learning (DL) models are becoming larger, because an increase in model size might offer a significant accuracy gain. To enable the training of large deep networks, data parallelism and model parallelism are two well-known approaches for parallel training. However, data parallelism does not help reduce the memory footprint per device. In this work, we introduce Large deep 3D ConvNets with Automated Model Parallelism (LAMP) and investigate the impact of both the input size and the size of deep 3D ConvNets on segmentation accuracy. Through automated model parallelism, it is feasible to train large deep 3D ConvNets with a large input patch, even the whole image. Extensive experiments demonstrate that, facilitated by the automated model parallelism, the segmentation accuracy can be improved by increasing model size and input context size, and that a large input yields a significant inference speedup compared with a sliding window of small patches. Code is available at https://monai.io/research/lamp-automated-model-parallelism.

preprint, 2014, arXiv

Compact Model of Nanowire Tunneling FETs Including Phonon-Assisted Tunneling and Quantum Capacitance

A physics-based compact model for silicon gate-all-around (GAA) nanowire tunneling FETs (NW-tFETs) with good accuracy has been developed by considering Phonon-Assisted Tunneling (PAT) and the transition from the Quantum Capacitance Limit (QCL) to the Classical Limit (CL) during device-size scaling. The impact of PAT results in the broadening of a single electron-energy level into an energy band with a density-of-states (DOS) distribution of Lorentzian shape. As a consequence, the tunneling probability at the edge of the tunneling window no longer changes abruptly from zero to a finite value. By adjusting the parameters in the Lorentzian function, an accurate fitting to the measured transfer characteristics in the subthreshold region is made possible. Besides, with an analytical formula to calculate the channel potential, the model is able to cover naturally the transition from the QCL to the CL regime when the device size is scaled. Furthermore, an on-voltage is defined to facilitate the modeling and fitting processes. Comparisons with the experimental data demonstrate the model accuracy across all device operation regions, and the flexibility in model parameter extraction is also shown.
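The Lorentzian broadening described in this abstract can be written in its generic normalized form. The symbols below are assumptions for illustration and are not taken from the paper: $E_0$ is the unbroadened level and $\Gamma$ is the full width at half maximum, which plays the role of an adjustable fitting parameter.

```latex
D(E) = \frac{1}{\pi}\,\frac{\Gamma/2}{(E - E_0)^2 + (\Gamma/2)^2},
\qquad
\int_{-\infty}^{\infty} D(E)\,\mathrm{d}E = 1 .
```

Because $D(E)$ has tails that decay as $1/(E - E_0)^2$ rather than dropping to zero, states are available slightly outside the nominal tunneling window, which is consistent with the smooth rather than abrupt turn-on at the window edge.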