Researcher profile

Yi Gao

Yi Gao contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
8 works
0 followers
10 topics
4 close collaborators

Published work

8 published item(s)

preprint, 2026, arXiv

DepRadar: Agentic Coordination for Context Aware Defect Impact Analysis in Deep Learning Libraries

Deep learning libraries like Transformers and Megatron are now widely adopted in modern AI programs. However, when these libraries introduce defects, ranging from silent computation errors to subtle performance regressions, it is often challenging for downstream users to assess whether their own programs are affected. Such impact analysis requires not only understanding the defect semantics but also checking whether the client code satisfies complex triggering conditions involving configuration flags, runtime environments, and indirect API usage. We present DepRadar, an agent coordination framework for fine-grained defect and impact analysis in DL library updates. DepRadar coordinates four specialized agents across three steps: (1) the PR Miner and Code Diff Analyzer extract structured defect semantics from commits or pull requests, (2) the Orchestrator Agent synthesizes these signals into a unified defect pattern with trigger conditions, and (3) the Impact Analyzer checks downstream programs to determine whether the defect can be triggered. To improve accuracy and explainability, DepRadar integrates static analysis with DL-specific domain rules for defect reasoning and client-side tracing. We evaluate DepRadar on 157 PRs and 70 commits across two representative DL libraries. It achieves 90% precision in defect identification and generates high-quality structured fields (average field score 1.6). On 122 client programs, DepRadar identifies affected cases with 90% recall and 80% precision, substantially outperforming other baselines.
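The final step described above, checking whether a client program meets a defect's trigger conditions, can be illustrated with a minimal sketch. This is not DepRadar's implementation: the abstract describes agents combined with static analysis, whereas the code below is a plain rule check. The class names, field names, API name, version strings, and flag are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DefectPattern:
    """Hypothetical structured defect record with its trigger conditions."""
    api: str                      # library API that contains the defect
    affected_versions: set        # library versions that ship the defect
    required_flags: dict = field(default_factory=dict)  # config values needed to trigger it


@dataclass
class ClientProgram:
    """Hypothetical summary of what a downstream program uses."""
    library_version: str
    called_apis: set              # APIs reached directly or indirectly
    config: dict


def is_affected(defect: DefectPattern, client: ClientProgram) -> bool:
    """Return True only if every trigger condition holds for the client."""
    if client.library_version not in defect.affected_versions:
        return False
    if defect.api not in client.called_apis:
        return False
    # Every required flag must be present in the client config with the same value.
    return all(client.config.get(k) == v for k, v in defect.required_flags.items())


# Illustrative data only; none of these names come from a real library.
defect = DefectPattern(
    api="lib.attention",
    affected_versions={"1.2.0", "1.2.1"},
    required_flags={"use_cache": True},
)
hit = ClientProgram("1.2.1", {"lib.attention", "lib.load"}, {"use_cache": True})
miss = ClientProgram("1.2.1", {"lib.attention"}, {"use_cache": False})
```

Here `is_affected(defect, hit)` is true and `is_affected(defect, miss)` is false, because the second client never enables the flag that the defect depends on.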

preprint, 2022, arXiv

Cluster-based Contrastive Disentangling for Generalized Zero-Shot Learning

Generalized Zero-Shot Learning (GZSL) aims to recognize both seen and unseen classes by training only on the seen classes, so predictions for instances of unseen classes tend to be biased towards the seen classes. In this paper, we propose a Cluster-based Contrastive Disentangling (CCD) method to improve GZSL by alleviating the semantic gap and domain shift problems. Specifically, we first cluster the batch data to form several sets containing similar classes. Then, we disentangle the visual features into semantic-unspecific and semantic-matched variables, and further disentangle the semantic-matched variables into class-shared and class-unique variables according to the clustering results. The disentangled learning module, with random swapping and semantic-visual alignment, bridges the semantic gap. Moreover, we introduce contrastive learning on semantic-matched and class-unique variables to learn high intra-set and intra-class similarity, as well as inter-set and inter-class discriminability. The generated visual features then conform to the underlying characteristics of general images and carry strong discriminative information, which alleviates the domain shift problem. We evaluate our proposed method on four datasets and achieve state-of-the-art results in both conventional and generalized settings.

preprint, 2022, arXiv

Non-neglectable entropy effect on sintering of supported nanoparticles

Sintering refers to particle coalescence by heat, which has been known for centuries as a thermal phenomenon involving all aspects of natural science. It is particularly important in heterogeneous catalysis because sintering normally results in deactivation of the catalysts. In previous studies, the enthalpy contribution was considered dominant in sintering and the entropy effect was generally considered negligible. However, we unambiguously demonstrate in this work, through designed experiments and an improved theoretical framework, that entropy can prevail over the enthalpy contribution and dominate the sintering behavior of supported nanoparticles (NPs). Using in situ Cs-corrected environmental scanning transmission electron microscopy and synchrotron-based ambient pressure X-ray photoelectron spectroscopy, we observe an unprecedented entropy-driven phenomenon: supported NPs reversibly redisperse upon heating and sinter upon cooling in three systems (Pd-CeO2, Cu-TiO2, Ag-TiO2). We quantitatively show that the configurational entropy of highly dispersed ad-atoms is large enough to reverse their sintering tendency at elevated temperature. This work reshapes the basic understanding of sintering at the nanoscale and opens the door for various de novo designs of thermodynamically stable nanocatalysts.
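The competition between enthalpy and entropy described above can be written out with textbook thermodynamics. The relations below are a sketch under stated assumptions, not the paper's own framework: the enthalpy cost of redispersion, \Delta H > 0, and the configurational entropy gain, \Delta S_conf > 0, are treated as temperature independent, and the ad-atoms are modeled as an ideal lattice gas of n atoms on N support sites with coverage \theta = n/N.

```latex
% Free energy change for redispersing a particle into ad-atoms
\Delta G_{\mathrm{redisp}}(T) = \Delta H - T\,\Delta S_{\mathrm{conf}}

% Redispersion is favored once the entropy term outweighs the enthalpy cost
\Delta G_{\mathrm{redisp}} < 0
\iff
T > T^{*} = \frac{\Delta H}{\Delta S_{\mathrm{conf}}}

% Ideal lattice-gas configurational entropy (Stirling approximation)
S_{\mathrm{conf}} = k_B \ln \binom{N}{n}
\approx -k_B N \left[\theta \ln\theta + (1-\theta)\ln(1-\theta)\right],
\qquad \theta = \frac{n}{N}
```

Under these assumptions the sign change of \Delta G at T^{*} is consistent with the reported behavior: redispersion upon heating above T^{*} and sintering upon cooling below it.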

preprint, 2021, arXiv

Dynamic Normalization

Batch Normalization (BN) has become one of the essential components of CNNs. It allows the network to use a higher learning rate, speeds up training, and removes the need for careful initialization. However, in our work, we find that a simple extension of BN can increase the performance of the network. First, we extend BN to adaptively generate scale and shift parameters for each mini-batch of data, called DN-C (Batch-shared and Channel-wise). We use the statistical characteristics of the mini-batch data ($E[X], Std[X]\in\mathbb{R}^{c}$) as the input of the SC module. Then we extend BN to adaptively generate scale and shift parameters for each channel of each sample, called DN-B (Batch and Channel-wise). Our experiments show that the DN-C model fails to train normally, whereas the DN-B model is very robust. In the classification task, DN-B improves the accuracy of MobileNetV2 on ImageNet-100 by more than 2% with only 0.6% additional Mult-Adds. In the detection task, DN-B improves the accuracy of SSDLite on MS-COCO by nearly 4% mAP with the same settings. Compared with BN, DN-B has stable performance when using a higher learning rate or a smaller batch size.
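The idea of generating scale and shift parameters per sample and per channel can be sketched in plain Python. The abstract does not specify the SC module, so the code assumes it is a single linear map from each sample's per-channel mean and standard deviation to a scale and a shift. The function name `dn_b`, the weight layout, and the choice of per-sample statistics as input are all assumptions made for illustration, not the authors' design.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def _std(values, eps=1e-5):
    m = _mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values) + eps)


def dn_b(x, w_scale, w_shift, eps=1e-5):
    """Sketch of per-sample, per-channel scale and shift on top of batch norm.

    x        : nested list shaped [N][C][L] (samples, channels, flattened spatial)
    w_scale  : [C][2C] weights of the hypothetical SC module for the scale
    w_shift  : [C][2C] weights of the hypothetical SC module for the shift
    """
    n, c = len(x), len(x[0])

    # Ordinary batch-norm statistics: one mean and std per channel,
    # pooled over the whole mini-batch and all spatial positions.
    bn_mean, bn_std = [], []
    for j in range(c):
        pooled = [v for i in range(n) for v in x[i][j]]
        bn_mean.append(_mean(pooled))
        bn_std.append(_std(pooled, eps))

    out = []
    for i in range(n):
        # Assumed SC-module input: this sample's channel means and stds (length 2C).
        stats = [_mean(x[i][j]) for j in range(c)] + [_std(x[i][j], eps) for j in range(c)]
        sample = []
        for j in range(c):
            # Scale is parameterized around 1 so zero weights recover plain BN.
            gamma = 1.0 + sum(w * s for w, s in zip(w_scale[j], stats))
            beta = sum(w * s for w, s in zip(w_shift[j], stats))
            sample.append([gamma * (v - bn_mean[j]) / bn_std[j] + beta for v in x[i][j]])
        out.append(sample)
    return out


x = [[[1.0, 2.0], [0.0, 4.0]], [[3.0, 6.0], [2.0, 2.0]]]  # N=2, C=2, L=2
zeros = [[0.0] * 4 for _ in range(2)]
y = dn_b(x, zeros, zeros)  # with zero SC weights this is ordinary batch norm
```

With all SC weights set to zero the sketch reduces to standard batch normalization, which gives a simple sanity check; nonzero weights let each sample adjust its own scale and shift.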

preprint, 2020, arXiv

FEA-Net: A Physics-guided Data-driven Model for Efficient Mechanical Response Prediction

This paper proposes a physics-guided learning algorithm for predicting the mechanical response of materials and structures. The key concept is that physics models are governed by partial differential equations (PDEs), whose loading/response mapping can be solved using Finite Element Analysis (FEA). Based on this, a special type of deep convolutional neural network (DCNN) is proposed that takes advantage of prior knowledge in physics to build data-driven models whose architectures are physically meaningful. This type of network is named FEA-Net and is used to solve for the mechanical response under external loading. Thus, the identification of a mechanical system's parameters and the computation of its responses are treated as the learning and inference of FEA-Net, respectively. Case studies on multi-physics problems (e.g., coupled mechanical-thermal analysis) and multi-phase problems (e.g., composite materials with random micro-structures) are used to demonstrate and verify the theoretical and computational advantages of the proposed method.
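One way to see why a finite element solve can resemble a convolutional network is a toy one-dimensional bar. The abstract does not describe the FEA-Net architecture, so the sketch below only illustrates a general observation: on a uniform mesh, applying the stiffness matrix is a convolution with a fixed stencil, and a Jacobi solver repeats that convolution layer by layer. The bar problem, the function names, and the parameter values are illustrative assumptions.

```python
def apply_stencil(u, stencil):
    """Apply a 3-point stencil to u; zero padding stands in for fixed ends."""
    padded = [0.0] + u + [0.0]
    return [
        sum(w * padded[i + k] for k, w in enumerate(stencil))
        for i in range(len(u))
    ]


def jacobi_solve(rhs, stencil, iters):
    """Solve K u = rhs by Jacobi iteration, where K acts as the stencil."""
    diag = stencil[1]  # diagonal entry of the stiffness matrix
    u = [0.0] * len(rhs)
    for _ in range(iters):
        # Each sweep is one "layer": convolve, form the residual, update.
        residual = [r - ku for r, ku in zip(rhs, apply_stencil(u, stencil))]
        u = [ui + ri / diag for ui, ri in zip(u, residual)]
    return u


# Unit-length bar, both ends fixed, uniform load, unit stiffness (all assumed).
n = 9                      # number of interior nodes
h = 1.0 / (n + 1)          # element length
stiffness = 1.0            # EA, the "material parameter" carried by the stencil
stencil = [-stiffness / h, 2.0 * stiffness / h, -stiffness / h]
rhs = [1.0 * h] * n        # consistent nodal load for a unit distributed load
u = jacobi_solve(rhs, stencil, iters=500)
```

For this problem the exact displacement is u(x) = x(1 - x)/2, so the midpoint value `u[4]` should approach 0.125. In this picture the stencil weights play the role of the physical parameters, which is the sense in which identifying them resembles learning and running the solver resembles inference.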

preprint, 2020, arXiv

MCMC Guided CNN Training and Segmentation for Pancreas Extraction

Efficient organ segmentation is a precondition for various quantitative analyses. Segmenting the pancreas from abdominal CT images is a challenging task because of its high anatomical variability in shape, size and location. Moreover, the pancreas occupies only a small portion of the abdomen, and the organ border is very fuzzy. All these factors make segmentation methods designed for other organs less suitable for pancreas segmentation. In this report, we propose a Markov Chain Monte Carlo (MCMC) sampling guided convolutional neural network (CNN) approach to handle such morphological and photometric variability. Specifically, the proposed method mainly contains three steps. First, registration is carried out to mitigate the body weight and location variability. Second, MCMC sampling is employed to guide the sampling of 3D patches, which are fed to the CNN for training; at the same time, the pancreas distribution is learned for the subsequent segmentation. Third, an MCMC process, sampling from the learned distribution, guides the segmentation. Lastly, the patch-based segmentations are fused using a Bayesian voting scheme. The method is evaluated on the NIH pancreas dataset, which contains 82 abdominal contrast-enhanced CT volumes. We achieved a competitive result of 78.13% Dice Similarity Coefficient and 82.65% Recall on the testing data.
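The two reported metrics have standard definitions that are easy to check by hand. The sketch below computes the Dice Similarity Coefficient, 2|P ∩ T| / (|P| + |T|), and recall, |P ∩ T| / |T|, on flattened binary masks. The function names and the toy masks are illustrative and are not taken from the paper's evaluation code.

```python
def dice_coefficient(pred, truth):
    """Dice = 2 * |pred AND truth| / (|pred| + |truth|) for binary masks."""
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * overlap / total


def recall(pred, truth):
    """Recall = |pred AND truth| / |truth|: fraction of true voxels recovered."""
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    positives = sum(1 for t in truth if t)
    return 1.0 if positives == 0 else overlap / positives


# Toy flattened masks: 1 marks a voxel labeled as pancreas.
pred = [1, 1, 0, 0, 1]
truth = [1, 0, 0, 1, 1]
dice = dice_coefficient(pred, truth)   # overlap 2, sizes 3 and 3
rec = recall(pred, truth)              # 2 of 3 true voxels found
```

For these masks the overlap is 2 voxels and each mask has 3, so Dice is 4/6 and recall is 2/3. Dice penalizes both missed and spurious voxels, while recall ignores spurious ones, which is why the two reported values can differ.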

preprint, 2020, arXiv

Spin excitations in nickelate superconductors

We theoretically study spin excitations in the newly discovered nickelate superconductors based on a single-band model and the random phase approximation. The spin excitations are found to be incommensurate in the low-energy region. A spin resonance phenomenon is revealed as the excitation energy increases. The maximum intensity may be at the incommensurate momentum or the commensurate momentum, depending on the out-of-plane momentum. The spin excitations become incommensurate again at higher energies. The similarities and differences of the spin excitations between nickelate and cuprate superconductors are addressed. Our predictions can be tested by future inelastic neutron scattering experiments.

preprint, 2020, arXiv

Unidirectional Oriented Water Wire in Short Nanotube

The orientation of water molecules is the key factor for the fast transport of water in small nanotubes. It has been accepted that the bidirectional water burst in short nanotubes can be transformed into unidirectional transport when the orientation of water molecules is maintained in long nanotubes under an external field. In this work, based on molecular dynamics simulations and first-principles calculations, we show that, without an external field, only 21 water molecules are needed to intrinsically maintain unidirectional single-file water in a carbon nanotube on a timescale of seconds. Detailed analysis indicates that this surprising result comes from the step-by-step process by which the water chain flips, which differs from the commonly assumed concerted mechanism. Considering that the thickness of a cell membrane (normally 5-10 nm) is larger than the length threshold of the unidirectional water wire, this study suggests that an external field may not be needed to maintain unidirectional flow in a water channel at the macroscopic timescale.