Researcher profile

Zhifeng Hao

Zhifeng Hao contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
10works
0followers
9topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

10 published item(s)

preprint2026arXiv

Provable Sparse Inversion and Token Relabel Enhanced One-shot Federated Learning with ViTs

One-Shot Federated Learning, where a central server learns a global model in a single communication round, has emerged as a promising paradigm. However, under extremely non-IID settings, existing data-free methods often generate low-quality data that suffers from severe semantic misalignment with ground-truth labels. To overcome these issues, we propose a novel Federated Model Inversion and Token Relabel (FedMITR) framework, which trains the global model by fully exploiting all patches of synthetic images. Specifically, FedMITR employs sparse model inversion during data generation, selectively inverting semantic foregrounds while halting the inversion of uninformative backgrounds. To address semantically meaningless tokens that hinder ViT predictions, we implement a differentiated strategy: patches with high information density utilize generated pseudo-labels, while patches with low information density are relabeled via ensemble models for robust distillation. Theoretically, our analysis based on algorithmic stability reveals that Sparse Model Inversion eliminates gradient instability arising from background noise, while Token Relabel effectively reduces gradient variance, collectively guaranteeing a tighter generalization bound. Empirically, extensive experimental results demonstrate that FedMITR substantially outperforms existing baselines under various settings.

preprint2026arXiv

SAM-NER: Semantic Archetype Mediation for Zero-Shot Named Entity Recognition

Zero-shot Named Entity Recognition (ZS-NER) remains brittle under domain and schema shifts, where unseen label definitions often misalign with a large language model's (LLM's) intrinsic semantic organization. As a result, directly mapping entity mentions to fine-grained target labels can induce systematic semantic drift, especially when target schemas are novel or semantically overlapping. We propose \textbf{SAM-NER}, a three-stage framework based on \emph{Semantic Archetype Mediation} that stabilizes cross-domain transfer through an intermediate, domain-invariant archetype space. SAM-NER: (i) performs \emph{Entity Discovery} via cooperative extraction and consensus-based denoising to obtain high-coverage, high-fidelity entity spans; (ii) conducts \emph{Abstract Mediation} by projecting entities into a compact set of universal semantic archetypes distilled from high-level ontological abstractions; and (iii) applies \emph{Semantic Calibration} to resolve archetype-level predictions into target-domain types through constrained, definition-aligned inference with a frozen LLM. Experiments on the CrossNER benchmark show that SAM-NER consistently outperforms strong prior ZS-NER baselines in cross-domain settings. Our implementation will be open-sourced at https://github.com/DMIRLAB-Group/SAM-NER.

preprint2026arXiv

SERE: Structural Example Retrieval for Enhancing LLMs in Event Causality Identification

Event Causality Identification (ECI) requires models to determine whether a given pair of events in a context exhibits a causal relationship. While Large Language Models (LLMs) have demonstrated strong performance across various NLP tasks, their effectiveness in ECI remains limited due to biases in causal reasoning, often leading to overprediction of causal relationships (causal hallucination). To mitigate these issues and enhance LLM performance in ECI, we propose SERE, a structural example retrieval framework that leverages LLMs' few-shot learning capabilities. SERE introduces an innovative retrieval mechanism based on three structural concepts: (i) Conceptual Path Metric, which measures the conceptual relationship between events using edit distance in ConceptNet; (ii) Syntactic Metric, which quantifies structural similarity through tree edit distance on syntactic trees; and (iii) Causal Pattern Filtering, which filters examples based on predefined causal structures using LLMs. By integrating these structural retrieval strategies, SERE selects more relevant examples to guide LLMs in causal reasoning, mitigating bias and improving accuracy in ECI tasks. Extensive experiments on multiple ECI datasets validate the effectiveness of SERE. The source code is publicly available at https://github.com/DMIRLAB-Group/SERE.

preprint2025arXiv

Biochemical Computing Mode for Sequential Logic

Recent years have witnessed the growing scholarly interest in the next-generation general-purpose computers. Various innovative computing modes have been proposed, such as optical, quantum phenomena, and DNA-based modes. Sequential logic circuits are a critical factor that enables these modes to function as general-purpose computers, given their essential role in facilitating continuous computation and memory storage through their ability to store states. However, compared to computability, it is often overlooked due to the difficulty of its implementation. In this paper, we first demonstrate sequential mapping, a crucial necessary condition for electronic computers to realize sequential logic circuits, and highlight this distinctive property of general-purpose computers in the context of logic gate circuits. To achieve computational functionalities comparable to those of electronic computers, we utilize the control effect of enzymes on enzymatic reactions to design a logic gate model that is composed of small molecules and driven by enzymes, subsequently propose a biochemical computing mode. Furthermore, we mathematically analyze the static and dynamic input-output properties of biochemical logic gate components and prove that the biochemical computing mode satisfies sequential mapping similar to electronic computers. When combined with the storage characteristics of NOT-AND gates, it can realize sequential logic circuits. The findings can serve as a theoretical foundation for developing general-purpose biochemical computers.

preprint2024arXiv

Homophily-Related: Adaptive Hybrid Graph Filter for Multi-View Graph Clustering

Recently there is a growing focus on graph data, and multi-view graph clustering has become a popular area of research interest. Most of the existing methods are only applicable to homophilous graphs, yet the extensive real-world graph data can hardly fulfill the homophily assumption, where the connected nodes tend to belong to the same class. Several studies have pointed out that the poor performance on heterophilous graphs is actually due to the fact that conventional graph neural networks (GNNs), which are essentially low-pass filters, discard information other than the low-frequency information on the graph. Nevertheless, on certain graphs, particularly heterophilous ones, neglecting high-frequency information and focusing solely on low-frequency information impedes the learning of node representations. To break this limitation, our motivation is to perform graph filtering that is closely related to the homophily degree of the given graph, with the aim of fully leveraging both low-frequency and high-frequency signals to learn distinguishable node embedding. In this work, we propose Adaptive Hybrid Graph Filter for Multi-View Graph Clustering (AHGFC). Specifically, a graph joint process and graph joint aggregation matrix are first designed by using the intrinsic node features and adjacency relationship, which makes the low and high-frequency signals on the graph more distinguishable. Then we design an adaptive hybrid graph filter that is related to the homophily degree, which learns the node embedding based on the graph joint aggregation matrix. After that, the node embedding of each view is weighted and fused into a consensus embedding for the downstream task. Experimental results show that our proposed model performs well on six datasets containing homophilous and heterophilous graphs.

preprint2022arXiv

Causal Discovery with Multi-Domain LiNGAM for Latent Factors

Discovering causal structures among latent factors from observed data is a particularly challenging problem. Despite some efforts for this problem, existing methods focus on the single-domain data only. In this paper, we propose Multi-Domain Linear Non-Gaussian Acyclic Models for Latent Factors (MD-LiNA), where the causal structure among latent factors of interest is shared for all domains, and we provide its identification results. The model enriches the causal representation for multi-domain data. We propose an integrated two-phase algorithm to estimate the model. In particular, we first locate the latent factors and estimate the factor loading matrix. Then to uncover the causal structure among shared latent factors of interest, we derive a score function based on the characterization of independence relations between external influences and the dependence relations between multi-domain latent factors and latent factors of interest. We show that the proposed method provides locally consistent estimators. Experimental results on both synthetic and real-world data demonstrate the efficacy and robustness of our approach.

preprint2022arXiv

SADGA: Structure-Aware Dual Graph Aggregation Network for Text-to-SQL

The Text-to-SQL task, aiming to translate the natural language of the questions into SQL queries, has drawn much attention recently. One of the most challenging problems of Text-to-SQL is how to generalize the trained model to the unseen database schemas, also known as the cross-domain Text-to-SQL task. The key lies in the generalizability of (i) the encoding method to model the question and the database schema and (ii) the question-schema linking method to learn the mapping between words in the question and tables/columns in the database schema. Focusing on the above two key issues, we propose a Structure-Aware Dual Graph Aggregation Network (SADGA) for cross-domain Text-to-SQL. In SADGA, we adopt the graph structure to provide a unified encoding model for both the natural language question and database schema. Based on the proposed unified modeling, we further devise a structure-aware aggregation method to learn the mapping between the question-graph and schema-graph. The structure-aware aggregation method is featured with Global Graph Linking, Local Graph Linking, and Dual-Graph Aggregation Mechanism. We not only study the performance of our proposal empirically but also achieved 3rd place on the challenging Text-to-SQL benchmark Spider at the time of writing.

preprint2022arXiv

THP: Topological Hawkes Processes for Learning Causal Structure on Event Sequences

Learning causal structure among event types on multi-type event sequences is an important but challenging task. Existing methods, such as the Multivariate Hawkes processes, mostly assumed that each sequence is independent and identically distributed. However, in many real-world applications, it is commonplace to encounter a topological network behind the event sequences such that an event is excited or inhibited not only by its history but also by its topological neighbors. Consequently, the failure in describing the topological dependency among the event sequences leads to the error detection of the causal structure. By considering the Hawkes processes from the view of temporal convolution, we propose a Topological Hawkes process (THP) to draw a connection between the graph convolution in the topology domain and the temporal convolution in time domains. We further propose a causal structure learning method on THP in a likelihood framework. The proposed method is featured with the graph convolution-based likelihood function of THP and a sparse optimization scheme with an Expectation-Maximization of the likelihood function. Theoretical analysis and experiments on both synthetic and real-world data demonstrate the effectiveness of the proposed method

preprint2022arXiv

Vision-based Distributed Multi-UAV Collision Avoidance via Deep Reinforcement Learning for Navigation

Online path planning for multiple unmanned aerial vehicle (multi-UAV) systems is considered a challenging task. It needs to ensure collision-free path planning in real-time, especially when the multi-UAV systems can become very crowded on certain occasions. In this paper, we presented a vision-based decentralized collision-avoidance policy for multi-UAV systems, which takes depth images and inertial measurements as sensory inputs and outputs UAV's steering commands. The policy is trained together with the latent representation of depth images using a policy gradient-based reinforcement learning algorithm and autoencoder in the multi-UAV threedimensional workspaces. Each UAV follows the same trained policy and acts independently to reach the goal without colliding or communicating with other UAVs. We validate our policy in various simulated scenarios. The experimental results show that our learned policy can guarantee fully autonomous collision-free navigation for multi-UAV in the three-dimensional workspaces with good robustness and scalability.

preprint2019arXiv

General Multi-State Rework Network and Reliability Algorithm

A rework network is a common manufacturing system, in which flows (products) are processed in a sequence of workstations (nodes), which often results in defective products. To improve the productivity and utility of the system, the rework network allows some of the defective products to go back to the "as normal" condition after the rework process. In a recent study, Song proposed an algorithm to correct more than 21 archive publications regarding the rework network reliability problem, which is an important real-life problem. However, we prove that Song's proposed algorithm is still incorrect. Additionally, we provide an accurate general model based on the novel state distribution with a smaller number of limitations. Furthermore, we propose an algorithm to calculate the reliability of the multi-state rework networks using the proposed novel state distributions.