Researcher profile

Can Zhao

Can Zhao contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 19 - Unverified
Verification L1
Unclaimed author
5 works
0 followers
10 topics
4 close collaborators

Published work

5 published items

preprint, 2026, arXiv

Cross-Modal Attention Network with Dual Graph Learning in Multimodal Recommendation

Multimedia recommendation systems leverage user-item interactions and multimodal information to capture user preferences, enabling more accurate and personalized recommendations. Despite notable advances, existing approaches still face two critical limitations. First, shallow modality fusion often relies on simple concatenation, failing to exploit rich synergistic intra- and inter-modal relationships. Second, asymmetric feature treatment, in which users are characterized only by interaction IDs while items benefit from rich multimodal content, hinders the learning of a shared semantic space. To address these issues, we propose a Cross-modal Recursive Attention Network with dual graph Embedding (CRANE). To tackle shallow fusion, we design a core Recursive Cross-Modal Attention (RCA) mechanism that iteratively refines modality features based on cross-correlations in a joint latent space, effectively capturing high-order intra- and inter-modal dependencies. For symmetric multimodal learning, we explicitly construct users' multimodal profiles by aggregating features of their interacted items. Furthermore, CRANE integrates a symmetric dual-graph framework, comprising a heterogeneous user-item interaction graph and a homogeneous item-item semantic graph, unified by a self-supervised contrastive learning objective to fuse behavioral and semantic signals. Despite this modeling complexity, CRANE remains computationally efficient: theoretical and empirical analyses confirm its scalability, with faster convergence on small datasets and higher performance ceilings on large-scale ones. Comprehensive experiments on four public real-world datasets show an average 5% improvement in key metrics over state-of-the-art baselines.
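The abstract says user multimodal profiles are built by aggregating the features of the items each user interacted with, but it does not name the aggregation function. The sketch below assumes plain mean pooling over one modality; `build_user_profiles`, the dictionary layout, and the toy feature vectors are hypothetical and are not taken from the CRANE code.

```python
from typing import Dict, List


def build_user_profiles(
    interactions: Dict[str, List[str]],
    item_features: Dict[str, List[float]],
) -> Dict[str, List[float]]:
    """Average the modality features of each user's interacted items.

    Mean pooling is an assumption; the abstract only says "aggregating".
    """
    profiles: Dict[str, List[float]] = {}
    for user, items in interactions.items():
        vectors = [item_features[i] for i in items if i in item_features]
        if not vectors:
            continue  # no usable items: leave this user without a profile
        dim = len(vectors[0])
        profiles[user] = [
            sum(vec[d] for vec in vectors) / len(vectors) for d in range(dim)
        ]
    return profiles


# Toy visual features for three items (illustrative values only).
item_visual = {"i1": [1.0, 0.0], "i2": [0.0, 1.0], "i3": [1.0, 1.0]}
history = {"u1": ["i1", "i2"], "u2": ["i3"]}
profiles = build_user_profiles(history, item_visual)
```

With profiles of this kind, users and items live in the same feature space for each modality, which is the symmetry the abstract is after.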

preprint, 2022, arXiv

Auto-FedRL: Federated Hyperparameter Optimization for Multi-institutional Medical Image Segmentation

Federated learning (FL) is a distributed machine learning technique that enables collaborative model training while avoiding explicit data sharing. The inherent privacy-preserving property of FL algorithms makes them especially attractive to the medical field. However, when client data distributions are heterogeneous, standard FL methods are unstable and require intensive hyperparameter tuning to achieve optimal performance. Conventional hyperparameter optimization algorithms are impractical in real-world FL applications because they involve numerous training trials, which are often not affordable with limited compute budgets. In this work, we propose an efficient reinforcement learning (RL)-based federated hyperparameter optimization algorithm, termed Auto-FedRL, in which an online RL agent dynamically adjusts the hyperparameters of each client based on the current training progress. Extensive experiments are conducted to investigate different search strategies and RL agents. The effectiveness of the proposed method is validated on a heterogeneous data split of the CIFAR-10 dataset as well as two real-world medical image segmentation datasets for COVID-19 lesion segmentation in chest CT and pancreas segmentation in abdominal CT.
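To make the role of per-client hyperparameters concrete, here is a minimal sketch of one federated round in which each client takes a local gradient step with its own learning rate and the server averages the results weighted by sample count. FedAvg-style aggregation is an assumption, since the abstract does not name the aggregation rule, and the RL agent that would choose the learning rates is not modeled. All function names and numbers are hypothetical.

```python
from typing import List


def local_update(weights: List[float], grads: List[float], lr: float) -> List[float]:
    """One plain gradient step on a client, using that client's learning rate."""
    return [w - lr * g for w, g in zip(weights, grads)]


def federated_average(
    client_weights: List[List[float]], client_sizes: List[int]
) -> List[float]:
    """Sample-count-weighted average of client models (FedAvg-style, assumed)."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[d] * n for w, n in zip(client_weights, client_sizes)) / total
        for d in range(dim)
    ]


global_model = [0.0, 0.0]
client_grads = [[1.0, -1.0], [2.0, 0.0]]  # illustrative local gradients
client_lrs = [0.5, 0.25]                  # per-client rates a tuner would pick
client_sizes = [1, 3]                     # local sample counts

local_models = [
    local_update(global_model, g, lr) for g, lr in zip(client_grads, client_lrs)
]
new_global = federated_average(local_models, client_sizes)
```

In the setting the abstract describes, `client_lrs` would be adjusted online between rounds rather than fixed up front.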

preprint, 2021, arXiv

A Negotiation-based Right-of-way Assignment Strategy to Ensure Traffic Safety and Efficiency in Lane Change

It is widely acknowledged that verifying the safety of autonomous driving strategies requires a substantial body of simulation testing and road testing. In recent years, formal safety methods, represented by Responsibility-Sensitive Safety (RSS), have encouraged low-cost autonomous driving safety research, benefiting from their accurate assessment of safety and clear division of responsibilities. However, how to maintain traffic efficiency while ensuring safety remains a challenge. To address this problem, this paper proposes a formalized negotiation-based lane-changing strategy that makes a trade-off between safety and efficiency. Both theoretical analysis and numerical experimental results show that, compared to RSS, our strategy can noticeably improve the success rate of changing lanes on the premise of safety.
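For context on the baseline, RSS defines a minimum safe longitudinal gap between a rear vehicle and the vehicle in front of it when both travel in the same direction. The sketch below evaluates that standard RSS formula; it is not the negotiation strategy proposed in the paper, and the parameter names and example values are illustrative assumptions.

```python
def rss_min_longitudinal_gap(
    v_rear: float,
    v_front: float,
    response_time: float,
    a_max_accel: float,
    a_min_brake: float,
    a_max_brake: float,
) -> float:
    """Minimum safe gap (m) under the RSS longitudinal rule.

    Worst case assumed by RSS: the rear car accelerates at a_max_accel during
    its response time and then brakes at only a_min_brake, while the front
    car brakes at a_max_brake. Speeds in m/s, accelerations in m/s^2.
    """
    v_after_response = v_rear + response_time * a_max_accel
    gap = (
        v_rear * response_time
        + 0.5 * a_max_accel * response_time**2
        + v_after_response**2 / (2.0 * a_min_brake)
        - v_front**2 / (2.0 * a_max_brake)
    )
    return max(gap, 0.0)  # a negative value means any gap is safe


# Both cars at 20 m/s, 1 s response time (illustrative parameters).
gap = rss_min_longitudinal_gap(20.0, 20.0, 1.0, 2.0, 4.0, 8.0)
```

Because the rule assumes the worst case for both vehicles, the required gap can be large at highway speeds, which is the efficiency cost the abstract sets out to reduce.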

preprint, 2020, arXiv

Distributed Brillouin frequency shift extraction via a convolutional neural network

Distributed optical fiber Brillouin sensors detect the temperature and strain along a fiber according to the local Brillouin frequency shift, which is usually calculated from the measured Brillouin spectrum using Lorentzian curve fitting. In addition, cross-correlation, principal component analysis, and machine learning methods have been proposed for more efficient extraction of Brillouin frequency shifts. However, existing methods process each Brillouin spectrum individually and ignore the correlation in the time domain, which leaves room for improvement. Here, we propose and experimentally demonstrate a fully convolutional neural network that extracts the distributed Brillouin frequency shift directly from the measured two-dimensional data. Simulated ideal Brillouin spectra with various parameters are used to train the network. Both the simulation and experimental results show that the extraction accuracy of the network is better than that of the traditional curve fitting algorithm, with a much shorter processing time. This network has good universality and robustness and can effectively improve the performance of existing Brillouin sensors.
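The per-spectrum baseline mentioned in the abstract, Lorentzian curve fitting, can be sketched without an iterative solver: the reciprocal of a Lorentzian is a quadratic in frequency, so an ordinary least-squares parabola fitted to 1/gain has its vertex at the Brillouin frequency shift. This linearized fit is a simplification chosen for the example and is neither the paper's network nor a full nonlinear fit; all names and spectrum parameters below are hypothetical.

```python
from typing import List


def lorentzian(freq: float, peak_gain: float, center: float, fwhm: float) -> float:
    """Ideal Brillouin gain spectrum value at one frequency."""
    return peak_gain / (1.0 + ((freq - center) / (fwhm / 2.0)) ** 2)


def solve3(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    x = [0.0, 0.0, 0.0]
    for r in range(2, -1, -1):
        x[r] = (m[r][3] - sum(m[r][c] * x[c] for c in range(r + 1, 3))) / m[r][r]
    return x


def estimate_bfs(freqs: List[float], gains: List[float]) -> float:
    """Fit 1/gain = a*x^2 + b*x + c (x = centered frequency); return the vertex."""
    mean = sum(freqs) / len(freqs)
    xs = [f - mean for f in freqs]      # centering keeps the system well conditioned
    ys = [1.0 / g for g in gains]       # requires strictly positive gains
    s = [sum(x**k for x in xs) for k in range(5)]
    t = [sum(y * x**k for x, y in zip(xs, ys)) for k in range(3)]
    normal = [[s[4], s[3], s[2]], [s[3], s[2], s[1]], [s[2], s[1], s[0]]]
    a, b, _ = solve3(normal, [t[2], t[1], t[0]])
    return mean - b / (2.0 * a)


# Noise-free synthetic spectrum in GHz (illustrative parameters).
freqs = [10.70 + 0.002 * k for k in range(101)]
gains = [lorentzian(f, 1.0, 10.823, 0.030) for f in freqs]
bfs = estimate_bfs(freqs, gains)
```

On noisy data the reciprocal amplifies noise in the low-gain wings, so a practical version would restrict or weight the fit toward the peak. Either way, the fit treats one spectrum at a time, which is the limitation the abstract points to.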

preprint, 2020, arXiv

LAMP: Large Deep Nets with Automated Model Parallelism for Image Segmentation

Deep Learning (DL) models are becoming larger because an increase in model size can offer significant accuracy gains. To enable the training of large deep networks, data parallelism and model parallelism are two well-known approaches for parallel training. However, data parallelism does not help reduce the memory footprint per device. In this work, we introduce Large deep 3D ConvNets with Automated Model Parallelism (LAMP) and investigate the impact of both input size and deep 3D ConvNet size on segmentation accuracy. Through automated model parallelism, it is feasible to train large deep 3D ConvNets with a large input patch, or even the whole image. Extensive experiments demonstrate that, facilitated by automated model parallelism, segmentation accuracy can be improved by increasing model size and input context size, and that a large input yields a significant inference speedup compared with sliding-window inference over small patches. Code is available at https://monai.io/research/lamp-automated-model-parallelism.