Source author record

Zhen Tan

Zhen Tan appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Artificial Intelligence cond-mat.mtrl-sci Computation and Language Machine Learning Computer Vision Cryptography and Security Software Engineering

Catalog footprint

What is connected

9works

7topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Metacognitive Self-Correction for Multi-Agent System via Prototype-Guided Next-Execution Reconstruction

Large Language Model based multi-agent systems (MAS) excel at collaborative problem solving but remain brittle to cascading errors: a single faulty step can propagate across agents and disrupt the trajectory. In this paper, we present MASC, a metacognitive framework that endows MAS with real-time, unsupervised, step-level error detection and self-correction. MASC rethinks detection as history-conditioned anomaly scoring via two complementary designs: (1) Next-Execution Reconstruction, which predicts the embedding of the next step from the query and interaction history to capture causal consistency, and (2) Prototype-Guided Enhancement, which learns a prototype prior over normal-step embeddings and uses it to stabilize reconstruction and anomaly scoring under sparse context (e.g., early steps). When an anomaly step is flagged, MASC triggers a correction agent to revise the acting agent's output before information flows downstream. On the Who&When benchmark, MASC consistently outperforms all baselines, improving step-level error detection by up to 8.47% AUC-ROC ; When plugged into diverse MAS frameworks, it delivers consistent end-to-end gains across architectures, confirming that our metacognitive monitoring and targeted correction can mitigate error propagation with minimal overhead.

preprint2026arXiv

To See is Not to Learn: Protecting Multimodal Data from Unauthorized Fine-Tuning of Large Vision-Language Model

The rapid advancement of Large Vision-Language Models (LVLMs) is increasingly accompanied by unauthorized scraping and training on multimodal web data, posing severe copyright and privacy risks to data owners. Existing countermeasures, such as machine unlearning and watermarks, are inherent post-hoc approaches that act only after intellectual property infringement has already occurred. In this work, we propose MMGuard to empower data owners to proactively protect their multimodal data against unauthorized LVLM fine-tuning. MMGuard generates unlearnable examples by injecting human-imperceptible perturbations that actively exploit the learning dynamics of LVLMs. By minimizing the training loss, the perturbation creates an optimization shortcut, causing the model to overfit to the noise and thereby degrading downstream performance when the perturbation is absent during inference. To further strengthen this defense, MMGuard introduces a cross-modal binding disruption, strategically shifting LVLM attention to enforce a spurious correlation between the noise and the training target with theoretical guarantees. Enhanced by an ensemble learning strategy for cross-model transferability, MMGuard is evaluated against nine open-source LVLMs across six datasets. Our comprehensive results demonstrate effective, stealthy, and robust protection under white-box, gray-box, and black-box threat models, establishing a mechanistic advantage in proactively defending against aggressive fine-tuning exploitation.

preprint2026arXiv

ToolPRMBench: Evaluating and Advancing Process Reward Models for Tool-using Agents

Reward-guided search methods have demonstrated strong potential in enhancing tool-using agents by effectively guiding sampling and exploration over complex action spaces. As a core design, those search methods utilize process reward models (PRMs) to provide step-level rewards, enabling more fine-grained monitoring. However, there is a lack of systematic and reliable evaluation benchmarks for PRMs in tool-using settings. In this paper, we introduce ToolPRMBench, a large-scale benchmark specifically designed to evaluate PRMs for tool-using agents. ToolPRMBench is built on top of several representative tool-using benchmarks and converts agent trajectories into step-level test cases. Each case contains the interaction history, a correct action, a plausible but incorrect alternative, and relevant tool metadata. We respectively utilize offline sampling to isolate local single-step errors and online sampling to capture realistic multi-step failures from full agent rollouts. A multi-LLM verification pipeline is proposed to reduce label noise and ensure data quality. We conduct extensive experiments across large language models, general PRMs, and tool-specialized PRMs on ToolPRMBench. The results reveal clear differences in PRM effectiveness and highlight the potential of specialized PRMs for tool-using. Code and data will be released at https://github.com/David-Li0406/ToolPRMBench.

preprint2026arXiv

TRUST: A Framework for Decentralized AI Service v.0.1

Large Reasoning Models (LRMs) and Multi-Agent Systems (MAS) in high-stakes domains demand reliable verification, yet centralized approaches suffer four limitations: (1) Robustness, with single points of failure vulnerable to attacks and bias; (2) Scalability, as reasoning complexity creates bottlenecks; (3) Opacity, as hidden auditing erodes trust; and (4) Privacy, as exposed reasoning traces risk model theft. We introduce TRUST (Transparent, Robust, and Unified Services for Trustworthy AI), a decentralized framework with three innovations: (i) Hierarchical Directed Acyclic Graphs (HDAGs) that decompose Chain-of-Thought reasoning into five abstraction levels for parallel distributed auditing; (ii) the DAAN protocol, which projects multi-agent interactions into Causal Interaction Graphs (CIGs) for deterministic root-cause attribution; and (iii) a multi-tier consensus mechanism among computational checkers, LLM evaluators, and human experts with stake-weighted voting that guarantees correctness under 30% adversarial participation. We prove a Safety-Profitability Theorem ensuring honest auditors profit while malicious actors incur losses. All decisions are recorded on-chain, while privacy-by-design segmentation prevents reconstruction of proprietary logic. Across multiple LLMs and benchmarks, TRUST attains 72.4% accuracy (4-18% above baselines) and remains resilient against 20% corruption. DAAN reaches 70% root-cause attribution (vs. 54-63% for standard methods) with 60% token savings. Human studies validate the design (F1 = 0.89, Brier = 0.074). The framework supports (A1) decentralized auditing, (A2) tamper-proof leaderboards, (A3) trustless data annotation, and (A4) governed autonomous agents, pioneering decentralized AI auditing for safe, accountable deployment of reasoning-capable systems.

preprint2022arXiv

Supervised Graph Contrastive Learning for Few-shot Node Classification

Graphs are present in many real-world applications, such as financial fraud detection, commercial recommendation, and social network analysis. But given the high cost of graph annotation or labeling, we face a severe graph label-scarcity problem, i.e., a graph might have a few labeled nodes. One example of such a problem is the so-called \textit{few-shot node classification}. A predominant approach to this problem resorts to \textit{episodic meta-learning}. In this work, we challenge the status quo by asking a fundamental question whether meta-learning is a must for few-shot node classification tasks. We propose a new and simple framework under the standard few-shot node classification setting as an alternative to meta-learning to learn an effective graph encoder. The framework consists of supervised graph contrastive learning with novel mechanisms for data augmentation, subgraph encoding, and multi-scale contrast on graphs. Extensive experiments on three benchmark datasets (CoraFull, Reddit, Ogbn) show that the new framework significantly outperforms state-of-the-art meta-learning based methods.

preprint2020arXiv

Degree-Aware Alignment for Entities in Tail

Entity alignment (EA) is to discover equivalent entities in knowledge graphs (KGs), which bridges heterogeneous sources of information and facilitates the integration of knowledge. Existing EA solutions mainly rely on structural information to align entities, typically through KG embedding. Nonetheless, in real-life KGs, only a few entities are densely connected to others, and the rest majority possess rather sparse neighborhood structure. We refer to the latter as long-tail entities, and observe that such phenomenon arguably limits the use of structural information for EA. To mitigate the issue, we revisit and investigate into the conventional EA pipeline in pursuit of elegant performance. For pre-alignment, we propose to amplify long-tail entities, which are of relatively weak structural information, with entity name information that is generally available (but overlooked) in the form of concatenated power mean word embeddings. For alignment, under a novel complementary framework of consolidating structural and name signals, we identify entity's degree as important guidance to effectively fuse two different sources of information. To this end, a degree-aware co-attention network is conceived, which dynamically adjusts the significance of features in a degree-aware manner. For post-alignment, we propose to complement original KGs with facts from their counterparts by using confident EA results as anchors via iterative training. Comprehensive experimental evaluations validate the superiority of our proposed techniques.

preprint2014arXiv

Temperature dependent characteristics of GaSb p-channel MOSFETs with Si-implanted source and drain

GaSb p-channel MOSFETs with an atomic layer deposited Al2O3 gate dielectric and a self-aligned Si implanted source/drain are demonstrated. Thermal anneal conditions are optimized for the source/drain impurity activation. Temperature dependent electrical characteristics are investigated. Different electrical behaviors are observed in two different temperature regions and the mechanisms underneath are proposed. Off-state drain current is generation current dominated in the low temperature regions and is diffusion current dominated in the high temperature regions.

preprint2013arXiv

Effects of ozone post deposition treatment on interfacial and electrical characteristics of atomic-layer-deposited Al2O3 and HfO2 films on GaSb substrates

Atomic-layer-deposited Al2O3 and HfO2 films on GaSb substrates were treated by in-situ ozone post deposition treatment (PDT). The effects of ozone PDT on the interfacial and electrical properties of Al2O3 and HfO2 gate dielectric films on GaSb substrates were investigated carefully. It is found that the dielectric quality and the interfacial properties of the Al2O3 and HfO2 films are improved by ozone PDT. After in-situ ozone PDT for 5 min, the Al2O3 and HfO2 films on GaSb substrates exhibit improved electrical and interfacial properties, such as reduced frequency dispersion, gate leakage current, border traps and interface traps. Interface trap density is reduced by ~24% for the Al2O3/GaSb stacks and ~27% for the HfO2/GaSb stacks. In-situ ozone PDT is proved to be a promising technique in improving the quality of high-k gate stacks on GaSb substrates.

preprint2013arXiv

Improved Interfacial and Electrical Properties of GaSb Metal Oxide Semiconductor Devices Passivated with Acidic (NH4)2S Solution

Surface passivation with acidic (NH4)2S solution is shown to be effective in improving the interfacial and electrical properties of HfO2/GaSb metal oxide semiconductor devices. Compared with control samples, those treated with acidic (NH4)2S solution showed great improvements in frequency dispersion, gate leakage current and interface trap density. These improvements were attributed to the acidic (NH4)2S solution enhancing passivation of the substrates, which was analyzed from the perspective of chemical mechanism and confirmed by X-ray photoelectron spectroscopy and high-resolution cross-sectional transmission electron microscopy.

Zhen Tan

What is connected

Connect this record

See the researcher in context

Building this map preview

9 published item(s)

Metacognitive Self-Correction for Multi-Agent System via Prototype-Guided Next-Execution Reconstruction

To See is Not to Learn: Protecting Multimodal Data from Unauthorized Fine-Tuning of Large Vision-Language Model

ToolPRMBench: Evaluating and Advancing Process Reward Models for Tool-using Agents

TRUST: A Framework for Decentralized AI Service v.0.1

Supervised Graph Contrastive Learning for Few-shot Node Classification

Degree-Aware Alignment for Entities in Tail

Temperature dependent characteristics of GaSb p-channel MOSFETs with Si-implanted source and drain

Effects of ozone post deposition treatment on interfacial and electrical characteristics of atomic-layer-deposited Al2O3 and HfO2 films on GaSb substrates

Improved Interfacial and Electrical Properties of GaSb Metal Oxide Semiconductor Devices Passivated with Acidic (NH4)2S Solution