Researcher profile

Zhi Yu

Zhi Yu contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 (Emerging) · Verification L1 · Unclaimed author
7 works
0 followers
5 topics
4 close collaborators

Published work

7 published items

preprint · 2026 · arXiv

Beyond Euclidean Prototypes: Spectral Disentanglement and Geodesic Matching for Few-Shot Medical Image Segmentation

Few-Shot Medical Image Segmentation (FSMIS) aims to delineate novel anatomical targets from one or a few annotated support images, addressing the annotation scarcity in medical imaging. Notwithstanding recent advancements, current prototype-based methods are bottlenecked by two coupled limitations: 1) cue entanglement, where a single spatial-domain prototype is forced to summarise organ silhouette, parenchymal texture and boundary appearance simultaneously, so any support-query mismatch on one cue propagates indiscriminately to the others; and 2) topology-blind matching, where cosine similarity measures distance in the ambient Euclidean space and ignores the connectivity of the underlying feature manifold, causing fragmented activations inside low-contrast organs and leakage into neighbouring tissues. To this end, we propose Spectral-Geodesic Prototype Network (SGP-Net), built around a Spectral-Geodesic Prototype Module with two coupled components. A Spectral Prototype Bank (SPB) decomposes support and query features into low-, mid- and high-frequency bands via learnable radial Fourier filters, yielding three disentangled prototypes per class that separately encode shape, texture and boundary cues. A Geodesic Matcher (GM) then replaces cosine similarity with a differentiable heat-diffusion approximation of geodesic distance, propagating matching signals along a feature affinity graph so that on-manifold pixels accumulate consistent responses while off-manifold look-alikes are suppressed. Experiments on three public FSMIS benchmarks demonstrate that SGP-Net achieves competitive performance against recent state-of-the-art methods.
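The contrast between feature-space similarity and matching along a feature affinity graph can be illustrated with a toy example. The sketch below is not the SGP-Net Geodesic Matcher; it is a minimal explicit heat-diffusion step on a small graph with scalar features, and every name in it (`diffuse_from_seed`, `sigma`, `dt`) is a hypothetical choice made for illustration. It only shows the diffusion response, not a geodesic distance derived from it.

```python
import math

def diffuse_from_seed(features, edges, seed, steps=50, dt=0.25, sigma=1.0):
    """Spread unit heat from `seed` over a feature affinity graph.

    Edge weights are Gaussian affinities of the feature difference, so heat
    moves easily between similar neighbours and barely crosses sharp changes.
    dt must satisfy dt * (sum of weights at any node) <= 1 for stability.
    """
    n = len(features)
    nbrs = [[] for _ in range(n)]
    for i, j in edges:
        w = math.exp(-((features[i] - features[j]) ** 2) / sigma)
        nbrs[i].append((j, w))
        nbrs[j].append((i, w))

    u = [0.0] * n
    u[seed] = 1.0
    for _ in range(steps):
        # Explicit Euler step of du/dt = -L u with the unnormalised Laplacian.
        u = [u[i] + dt * sum(w * (u[j] - u[i]) for j, w in nbrs[i])
             for i in range(n)]
    return u

# Chain of six nodes: 0-2 form one region, 3 is a sharp boundary,
# 4-5 are look-alikes on the far side of that boundary.
features = [0.0, 0.1, 0.2, 5.0, 0.1, 0.0]
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
response = diffuse_from_seed(features, edges, seed=0)
```

In this toy setting node 5 has exactly the same feature as the seed, so a purely feature-space comparison would rank it as the best match, while the diffusion response stays concentrated on nodes 0 to 2 because almost no heat crosses the low-affinity edges around node 3.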

preprint · 2024 · arXiv

LORE++: Logical Location Regression Network for Table Structure Recognition with Pre-training

Table structure recognition (TSR) aims at extracting tables in images into machine-understandable formats. Recent methods solve this problem by predicting the adjacency relations of detected cell boxes or learning to directly generate the corresponding markup sequences from the table images. However, existing approaches either count on additional heuristic rules to recover the table structures, or face challenges in capturing long-range dependencies within tables, resulting in increased complexity. In this paper, we propose an alternative paradigm. We model TSR as a logical location regression problem and propose a new TSR framework called LORE, standing for LOgical location REgression network, which for the first time regresses logical location as well as spatial location of table cells in a unified network. Our proposed LORE is conceptually simpler, easier to train, and more accurate than other paradigms of TSR. Moreover, inspired by the persuasive success of pre-trained models on a number of computer vision and natural language processing tasks, we propose two pre-training tasks to enrich the spatial and logical representations at the feature level of LORE, resulting in an upgraded version called LORE++. The incorporation of pre-training in LORE++ has proven to enjoy significant advantages, leading to a substantial enhancement in terms of accuracy, generalization, and few-shot capability compared to its predecessor. Experiments on standard benchmarks against methods of previous paradigms demonstrate the superiority of LORE++, which highlights the potential and promising prospect of the logical location regression paradigm for TSR.
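To make the idea of a logical location concrete, the sketch below assumes each cell's logical location is a tuple of start and end row and column indices, and shows how such tuples determine the table grid without adjacency heuristics. The `Cell` type and `to_grid` helper are hypothetical names for illustration; the abstract does not specify the representation, and this is not the LORE code.

```python
from typing import List, NamedTuple, Optional

class Cell(NamedTuple):
    text: str
    row_start: int  # inclusive, 0-based (assumed convention)
    row_end: int    # inclusive
    col_start: int  # inclusive
    col_end: int    # inclusive

def to_grid(cells: List[Cell]) -> List[List[Optional[str]]]:
    """Place each cell's text into every grid slot its logical span covers."""
    n_rows = max(c.row_end for c in cells) + 1
    n_cols = max(c.col_end for c in cells) + 1
    grid: List[List[Optional[str]]] = [[None] * n_cols for _ in range(n_rows)]
    for c in cells:
        for r in range(c.row_start, c.row_end + 1):
            for k in range(c.col_start, c.col_end + 1):
                if grid[r][k] is not None:
                    raise ValueError(f"cells overlap at row {r}, column {k}")
                grid[r][k] = c.text
    return grid

cells = [
    Cell("Header", 0, 0, 0, 1),  # spans two columns
    Cell("a", 1, 1, 0, 0),
    Cell("b", 1, 1, 1, 1),
]
grid = to_grid(cells)
```

Once every cell carries such indices, row and column spans and the full structure follow directly, which is the appeal of regressing logical locations rather than recovering them from pairwise relations.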

preprint · 2022 · arXiv

A data management system for machine learning research of tokamak

In recent years, machine learning (ML) research methods have received increasing attention in the tokamak community. The conventional database (i.e., MDSplus for tokamak) of experimental data was designed for small-group consumption and is mainly aimed at simultaneous visualization of a small amount of data. ML data access patterns differ fundamentally from traditional data access patterns, and the typical MDSplus database is increasingly showing its limitations. We developed a new data management system suitable for tokamak machine learning research based on Experimental Advanced Superconducting Tokamak (EAST) data. The data management system is based on MongoDB and Hierarchical Data Format version 5 (HDF5). Currently, the entire data management system holds more than 3000 channels of data. The system can provide highly reliable concurrent access. The system includes error correction, MDSplus original data conversion, and high-performance sequence data output. Further, some valuable functions are implemented to accelerate ML model training for fusion, such as a bucketing generator, a concatenating buffer, and distributed sequence generation. This data management system is more suitable for fusion machine learning model R&D than MDSplus, but it cannot replace the MDSplus database. The MDSplus database is still the backend for EAST tokamak data acquisition and storage.
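The abstract does not describe how its bucketing generator works. As a rough illustration of the general technique, the sketch below groups variable-length sequences into length buckets and pads each batch only to the longest sequence in that batch. The function name and parameters (`bucketed_batches`, `bucket_width`, `pad_value`) are hypothetical and are not taken from the system described.

```python
from collections import defaultdict
from typing import Iterator, List, Sequence

def bucketed_batches(
    sequences: Sequence[Sequence[float]],
    bucket_width: int,
    batch_size: int,
    pad_value: float = 0.0,
) -> Iterator[List[List[float]]]:
    """Yield padded batches whose members have similar lengths."""
    buckets = defaultdict(list)
    for seq in sequences:
        # Lengths 1..bucket_width share bucket 0, the next range bucket 1, etc.
        buckets[(len(seq) - 1) // bucket_width].append(seq)

    for key in sorted(buckets):
        group = buckets[key]
        for start in range(0, len(group), batch_size):
            batch = group[start:start + batch_size]
            width = max(len(s) for s in batch)
            # Pad only up to the longest sequence in this batch.
            yield [list(s) + [pad_value] * (width - len(s)) for s in batch]

signals = [[1, 2], [1], [1, 2, 3, 4, 5], [1, 2, 3, 4]]
batches = list(bucketed_batches(signals, bucket_width=2, batch_size=2))
```

Grouping by length first keeps short discharge sequences from being padded out to the longest one in the dataset, which is the usual motivation for bucketing when training sequence models.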

preprint · 2022 · arXiv

City-Scale Holographic Traffic Flow Data based on Vehicular Trajectory Resampling

Despite abundant accessible traffic data, research on traffic flow estimation and optimization still faces a dilemma between detail and integrity in measurement. A dataset of city-scale continuous vehicular trajectories featuring the finest resolution and integrity, also known as holographic traffic data, would be a breakthrough, for it could reproduce every detail of the traffic flow evolution and reveal personal mobility patterns within the city. Owing to the high coverage of Automatic Vehicle Identification (AVI) devices in Xuancheng city, we constructed one month of continuous trajectories for 80,000 vehicles daily in the city, with accurate intersection passing times and no travel path estimation bias. With such holographic traffic data, it is possible to reproduce every detail of the traffic flow evolution. We present a set of traffic flow data based on resampling the holographic trajectories, covering all 482 road segments in the city around the clock, including stationary average speed and flow data at 5-minute intervals and dynamic floating car data.
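One plausible way to derive interval-level flow and average speed from segment traversals is sketched below. The record layout (segment id, entry time, exit time in seconds), the use of entry time to assign an interval, and the arithmetic mean of per-vehicle speeds are all assumptions for illustration; the abstract does not state how the dataset's resampling is defined.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

def aggregate_traversals(
    traversals: Iterable[Tuple[str, float, float]],
    segment_length_m: Dict[str, float],
    interval_s: int = 300,
) -> Dict[Tuple[str, int], Dict[str, float]]:
    """Count vehicles and average their speeds per segment and time interval."""
    sums = defaultdict(lambda: [0, 0.0])  # (segment, interval) -> [count, speed sum]
    for segment, t_in, t_out in traversals:
        if t_out <= t_in:
            continue  # skip records with non-positive travel time
        key = (segment, int(t_in // interval_s))
        sums[key][0] += 1
        sums[key][1] += segment_length_m[segment] / (t_out - t_in)
    return {
        key: {"flow": count, "mean_speed_mps": total / count}
        for key, (count, total) in sums.items()
    }

records = [("A", 0, 50), ("A", 100, 200), ("A", 310, 360)]
stats = aggregate_traversals(records, {"A": 500.0})
```

Here two vehicles enter segment A in the first 5-minute interval with speeds of 10 m/s and 5 m/s, and one enters in the second interval at 10 m/s.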

preprint · 2020 · arXiv

Adaptive-Step Graph Meta-Learner for Few-Shot Graph Classification

Graph classification aims to extract accurate information from graph-structured data for classification and is becoming more and more important in the graph learning community. Although Graph Neural Networks (GNNs) have been successfully applied to graph classification tasks, most of them overlook the scarcity of labeled graph data in many applications. For example, in bioinformatics, obtaining protein graph labels usually requires laborious experiments. Recently, few-shot learning has been explored to alleviate this problem given only a few labeled graph samples of the test classes. The sub-structures shared between training classes and test classes are essential in few-shot graph classification. Existing methods assume that the test classes belong to the same set of super-classes clustered from the training classes. However, according to our observations, the label spaces of training classes and test classes usually do not overlap in real-world scenarios. As a result, the existing methods do not capture the local structures of unseen test classes well. To overcome this limitation, in this paper we propose a direct method to capture the sub-structures with a well-initialized meta-learner within a few adaptation steps. More specifically, (1) we propose a novel framework consisting of a graph meta-learner, which uses GNN-based modules for fast adaptation on graph data, and a step controller for the robustness and generalization of the meta-learner; (2) we provide a quantitative analysis of the framework and give a graph-dependent upper bound on the generalization error based on our framework; (3) extensive experiments on real-world datasets demonstrate that our framework achieves state-of-the-art results on several few-shot graph classification tasks compared to baselines.

preprint · 2020 · arXiv

Experiment data-driven modeling of tokamak discharge in EAST

A deep learning model of tokamak discharge has been developed on a superconducting long-pulse tokamak (EAST). This model can use the control signals (e.g., Neutral Beam Injection (NBI), Ion Cyclotron Resonance Heating (ICRH)) to model normal discharge without the need for real experiments. Using a data-driven methodology, we exploit the temporal sequence of control signals for a large set of EAST discharges to develop a deep learning model of discharge diagnostic signals, such as electron density $n_{e}$, stored energy $W_{mhd}$ and loop voltage $V_{loop}$. In contrast to similar methodologies, we use machine learning techniques to develop a data-driven model for discharge modeling rather than disruption prediction. Up to 95% similarity was achieved for $W_{mhd}$. This first attempt showed promising results for modeling tokamak discharge with a data-driven methodology. The data-driven methodology provides an alternative to physics-driven modeling of tokamak discharge.

preprint · 2020 · arXiv

Matching Text with Deep Mutual Information Estimation

Text matching is a core natural language processing research problem. How to retain sufficient information on both content and structure is an important challenge. In this paper, we present a neural approach for general-purpose text matching that incorporates deep mutual information estimation. Our approach, Text matching with Deep Info Max (TIM), is integrated with a procedure for unsupervised representation learning that maximizes the mutual information between the input and output of the text matching neural network. We use both global and local mutual information to learn text representations. We evaluate our text matching approach on several tasks including natural language inference, paraphrase identification, and answer selection. Compared to state-of-the-art approaches, the experiments show that our method integrated with mutual information estimation learns better text representations and achieves better experimental results on text matching tasks without exploiting pretraining on external data.