Source author record

Leo Zhang

Leo Zhang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Artificial Intelligence Computation and Language Cryptography and Security Hardware Architecture Machine Learning Software Engineering

Catalog footprint

What is connected

4works

6topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Agent Skills in the Wild: An Empirical Study of Security Vulnerabilities at Scale

The rise of AI agent frameworks has introduced agent skills, modular packages containing instructions and executable code that dynamically extend agent capabilities. While this architecture enables powerful customization, skills execute with implicit trust and minimal vetting, creating a significant yet uncharacterized attack surface. We conduct the first large-scale empirical security analysis of this emerging ecosystem, collecting 42,447 skills from two major marketplaces and systematically analyzing 31,132 using SkillScan, a multi-stage detection framework integrating static analysis with LLM-based semantic classification. Our findings reveal pervasive security risks: 26.1% of skills contain at least one vulnerability, spanning 14 distinct patterns across four categories: prompt injection, data exfiltration, privilege escalation, and supply chain risks. Data exfiltration (13.3%) and privilege escalation (11.8%) are most prevalent, while 5.2% of skills exhibit high-severity patterns strongly suggesting malicious intent. We find that skills bundling executable scripts are 2.12x more likely to contain vulnerabilities than instruction-only skills (OR=2.12, p<0.001). Our contributions include: (1) a grounded vulnerability taxonomy derived from 8,126 vulnerable skills, (2) a validated detection methodology achieving 86.7% precision and 82.5% recall, and (3) an open dataset and detection toolkit to support future research. These results demonstrate an urgent need for capability-based permission systems and mandatory security vetting before this attack vector is further exploited.

preprint2026arXiv

Memory-Guided Unified Hardware Accelerator for Mixed-Precision Scientific Computing

Recent hardware acceleration advances have enabled powerful specialized accelerators for finite element computations, spiking neural network inference, and sparse tensor operations. However, existing approaches face fundamental limitations: (1) finite element methods lack comprehensive rounding error analysis for reduced-precision implementations and use fixed precision assignment strategies that cannot adapt to varying numerical conditioning; (2) spiking neural network accelerators cannot handle non-spike operations and suffer from bit-width escalation as network depth increases; and (3) FPGA tensor accelerators optimize only for dense computations while requiring manual configuration for each sparsity pattern. To address these challenges, we introduce \textbf{Memory-Guided Unified Hardware Accelerator for Mixed-Precision Scientific Computing}, a novel framework that integrates three enhanced modules with memory-guided adaptation for efficient mixed-workload processing on unified platforms. Our approach employs memory-guided precision selection to overcome fixed precision limitations, integrates experience-driven bit-width management and dynamic parallelism adaptation for enhanced spiking neural network acceleration, and introduces curriculum learning for automatic sparsity pattern discovery. Extensive experiments on FEniCS, COMSOL, ANSYS benchmarks, MNIST, CIFAR-10, CIFAR-100, DVS-Gesture datasets, and COCO 2017 demonstrate 2.8\% improvement in numerical accuracy, 47\% throughput increase, 34\% energy reduction, and 45-65\% throughput improvement compared to specialized accelerators. Our work enables unified processing of finite element methods, spiking neural networks, and sparse computations on a single platform while eliminating data transfer overhead between separate units.

preprint2026arXiv

Sampling from Flow Language Models via Marginal-Conditioned Bridges

Flow Language Models (FLMs) are a recently introduced class of language models which adapt continuous flow matching for one-hot encoded token sequences. Their denoisers have a special structure absent from generic continuous diffusion models: each block of the denoising mean is a posterior marginal distribution over the clean token at that position. Standard DDPM-style samplers collapse these marginals to a single conditional-mean endpoint and bridge toward this simplex-valued point, which is generally not a valid one-hot sequence. We argue that the natural sampler for an FLM is instead posterior-predictive. At each reverse step, we sample a clean one-hot endpoint from the factorized posterior defined by the FLM token marginals, and then sample the next continuous state from the analytic Ornstein--Uhlenbeck bridge conditioned on that endpoint. The method is training-free, uses the same model evaluations as standard sampling, and gives a principled interface for token-level decoding controls such as temperature scaling and nucleus truncation. We show that, under exact posterior marginals, the endpoint approximation error is exactly the conditional multi-information among token positions. The induced one-step bridge kernel preserves all token-wise posterior-predictive marginals and loses only the residual cross-position dependence. Finally, we prove a Girsanov path-space comparison showing that the marginal-conditioned bridge has a no-larger denoising-error term than the frozen conditional-mean bridge, with strict improvement whenever intermediate coordinate-wise bridge observations reveal additional information about the clean token. Experiments with FLMs show that the sampler improves the quality--diversity tradeoff. Code is available at: github.com/imbirik/mcb.

preprint2026arXiv

Yuan3.0 Flash: An Open Multimodal Large Language Model for Enterprise Applications

We introduce Yuan3.0 Flash, an open-source Mixture-of-Experts (MoE) MultiModal Large Language Model featuring 3.7B activated parameters and 40B total parameters, specifically designed to enhance performance on enterprise-oriented tasks while maintaining competitive capabilities on general-purpose tasks. To address the overthinking phenomenon commonly observed in Large Reasoning Models (LRMs), we propose Reflection-aware Adaptive Policy Optimization (RAPO), a novel RL training algorithm that effectively regulates overthinking behaviors. In enterprise-oriented tasks such as retrieval-augmented generation (RAG), complex table understanding, and summarization, Yuan3.0 Flash consistently achieves superior performance. Moreover, it also demonstrates strong reasoning capabilities in domains such as mathematics, science, etc., attaining accuracy comparable to frontier model while requiring only approximately 1/4 to 1/2 of the average tokens. Yuan3.0 Flash has been fully open-sourced to facilitate further research and real-world deployment: https://github.com/Yuan-lab-LLM/Yuan3.0.

Leo Zhang

What is connected

Connect this record

See the researcher in context

Building this map preview

4 published item(s)

Agent Skills in the Wild: An Empirical Study of Security Vulnerabilities at Scale

Memory-Guided Unified Hardware Accelerator for Mixed-Precision Scientific Computing

Sampling from Flow Language Models via Marginal-Conditioned Bridges

Yuan3.0 Flash: An Open Multimodal Large Language Model for Enterprise Applications