Source author record

Yun Zhu

Yun Zhu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Artificial Intelligence Computation and Language nlin.CD Computer Vision Cryptography and Security

Catalog footprint

What is connected

7works

5topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

MindWatcher: Toward Smarter Multimodal Tool-Integrated Reasoning

Traditional workflow-based agents exhibit limited intelligence when addressing real-world problems requiring tool invocation. Tool-integrated reasoning (TIR) agents capable of autonomous reasoning and tool invocation are rapidly emerging as a powerful approach for complex decision-making tasks involving multi-step interactions with external environments. In this work, we introduce MindWatcher, a TIR agent integrating interleaved thinking and multimodal chain-of-thought (CoT) reasoning. MindWatcher can autonomously decide whether and how to invoke diverse tools and coordinate their use, without relying on human prompts or workflows. The interleaved thinking paradigm enables the model to switch between thinking and tool calling at any intermediate stage, while its multimodal CoT capability allows manipulation of images during reasoning to yield more precise search results. We implement automated data auditing and evaluation pipelines, complemented by manually curated high-quality datasets for training, and we construct a benchmark, called MindWatcher-Evaluate Bench (MWE-Bench), to evaluate its performance. MindWatcher is equipped with a comprehensive suite of auxiliary reasoning tools, enabling it to address broad-domain multimodal problems. A large-scale, high-quality local image retrieval database, covering eight categories including cars, animals, and plants, endows model with robust object recognition despite its small size. Finally, we design a more efficient training infrastructure for MindWatcher, enhancing training speed and hardware utilization. Experiments not only demonstrate that MindWatcher matches or exceeds the performance of larger or more recent models through superior tool invocation, but also uncover critical insights for agent training, such as the genetic inheritance phenomenon in agentic RL.

preprint2025arXiv

Closing the Data Loop: Using OpenDataArena to Engineer Superior Training Datasets

The construction of Supervised Fine-Tuning (SFT) datasets is a critical yet under-theorized stage in the post-training of Large Language Models (LLMs), as prevalent practices often rely on heuristic aggregation without a systematic understanding of how individual samples contribute to model performance. In this report, we propose a paradigm shift from ad-hoc curation to a closed-loop dataset engineering framework using OpenDataArena (ODA), which leverages value-anchored rankings and multi-dimensional analysis to transform value benchmarking into feedback signals guiding dataset construction. We instantiate this methodology through two new datasets: \textbf{ODA-Math-460k}, a specialized mathematics reasoning dataset that utilizes a novel two-stage difficulty-aware pipeline to achieve State-of-the-Art (SOTA) results on benchmarks such as AIME and HMMT, and \textbf{ODA-Mixture (100k \& 500k)}, a series of multi-domain instruction datasets built via an ``Anchor-and-Patch'' strategy that outperforms significantly larger open-source baselines. Our empirical results demonstrate that ODA-driven datasets significantly improve both domain-specific reasoning and general utility while achieving superior data efficiency, validating a transition toward data-centric AI where transparent evaluation serves as the primary engine for engineering high-quality training data.

preprint2022arXiv

Solving the Capsulation Attack against Backdoor-based Deep Neural Network Watermarks by Reversing Triggers

Backdoor-based watermarking schemes were proposed to protect the intellectual property of artificial intelligence models, especially deep neural networks, under the black-box setting. Compared with ordinary backdoors, backdoor-based watermarks need to digitally incorporate the owner's identity, which fact adds extra requirements to the trigger generation and verification programs. Moreover, these concerns produce additional security risks after the watermarking scheme has been published for as a forensics tool or the owner's evidence has been eavesdropped on. This paper proposes the capsulation attack, an efficient method that can invalidate most established backdoor-based watermarking schemes without sacrificing the pirated model's functionality. By encapsulating the deep neural network with a rule-based or Bayes filter, an adversary can block ownership probing and reject the ownership verification. We propose a metric, CAScore, to measure a backdoor-based watermarking scheme's security against the capsulation attack. This paper also proposes a new backdoor-based deep neural network watermarking scheme that is secure against the capsulation attack by reversing the encoding process and randomizing the exposure of triggers.

preprint2020arXiv

Fast Coherent Point Drift

Nonrigid point set registration is widely applied in the tasks of computer vision and pattern recognition. Coherent point drift (CPD) is a classical method for nonrigid point set registration. However, to solve spatial transformation functions, CPD has to compute inversion of a M*M matrix per iteration with time complexity O(M3). By introducing a simple corresponding constraint, we develop a fast implementation of CPD. The most advantage of our method is to avoid matrix-inverse operation. Before the iteration begins, our method requires to take eigenvalue decomposition of a M*M matrix once. After iteration begins, our method only needs to update a diagonal matrix with linear computational complexity, and perform matrix multiplication operation with time complexity approximately O(M2) in each iteration. Besides, our method can be further accelerated by the low-rank matrix approximation. Experimental results in 3D point cloud data show that our method can significantly reduce computation burden of the registration process, and keep comparable performance with CPD on accuracy.

preprint2016arXiv

Semantic Similarity Strategies for Job Title Classification

Automatic and accurate classification of items enables numerous downstream applications in many domains. These applications can range from faceted browsing of items to product recommendations and big data analytics. In the online recruitment domain, we refer to classifying job ads to pre-defined or custom occupation categories as job title classification. A large-scale job title classification system can power various downstream applications such as semantic search, job recommendations and labor market analytics. In this paper, we discuss experiments conducted to improve our in-house job title classification system. The classification component of the system is composed of a two-stage coarse and fine level classifier cascade that classifies input text such as job title and/or job ads to one of the thousands of job titles in our taxonomy. To improve classification accuracy and effectiveness, we experiment with various semantic representation strategies such as average W2V vectors and document similarity measures such as Word Movers Distance (WMD). Our initial results show an overall improvement in accuracy of Carotene[1].

preprint2011arXiv

A novel type of spiral wave with trapped ions

Pattern formation in ultra-cold quantum systems has recently received a great deal of attention.In this work, we investigate a two-dimensional model system accounting for the dynamics of trapped ions. We find a novel spiral wave which is rigidly rotating but with a peculiar core region in which adjacent ions oscillate in anti-phase. The formation of this novel spiral wave is ascribed to the novel excitability reported by Lee and Cross. The breakup of the novel spiral wave is probed and, especially, one extraordinary scenario of the disappearance of spiral wave caused by spontaneous expansion of the anti-phased core is unveiled.

preprint2011arXiv

The oscillating two-cluster chimera state in non-locally coupled phase oscillators

We investigate an array of identical phase oscillators non-locally coupled without time delay, and find that chimera state with two coherent clusters exists which is only reported in delay-coupled systems previously. Moreover, we find that the chimera state is not stationary for any finite number of oscillators. The existence of the two-cluster chimera state and its time-dependent behaviors for finite number of oscillators are confirmed by the theoretical analysis based on the self-consistency treatment and the Ott-Antonsen ansatz.

Yun Zhu

What is connected

Connect this record

See the researcher in context

Building this map preview

7 published item(s)

MindWatcher: Toward Smarter Multimodal Tool-Integrated Reasoning

Closing the Data Loop: Using OpenDataArena to Engineer Superior Training Datasets

Solving the Capsulation Attack against Backdoor-based Deep Neural Network Watermarks by Reversing Triggers

Fast Coherent Point Drift

Semantic Similarity Strategies for Job Title Classification

A novel type of spiral wave with trapped ions

The oscillating two-cluster chimera state in non-locally coupled phase oscillators