Researcher profile

Mingliang Li

Mingliang Li contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 13 - Unverified
Verification L1
Unclaimed author
2 works
0 followers
2 topics
4 close collaborators

Published work

2 published items

preprint, 2026, arXiv

STEP3-VL-10B Technical Report

We present STEP3-VL-10B, a lightweight open-source foundation model designed to redefine the trade-off between compact efficiency and frontier-level multimodal intelligence. STEP3-VL-10B is realized through two strategic shifts: first, a unified, fully unfrozen pre-training strategy on 1.2T multimodal tokens that integrates a language-aligned Perception Encoder with a Qwen3-8B decoder to establish intrinsic vision-language synergy; and second, a scaled post-training pipeline featuring over 1k iterations of reinforcement learning. Crucially, we implement Parallel Coordinated Reasoning (PaCoRe) to scale test-time compute, allocating resources to scalable perceptual reasoning that explores and synthesizes diverse visual hypotheses. Consequently, despite its compact 10B footprint, STEP3-VL-10B rivals or surpasses models 10×-20× larger (e.g., GLM-4.6V-106B, Qwen3-VL-235B) and top-tier proprietary flagships like Gemini 2.5 Pro and Seed-1.5-VL. Delivering best-in-class performance, it records 92.2% on MMBench and 80.11% on MMMU, while excelling in complex reasoning with 94.43% on AIME2025 and 75.95% on MathVision. We release the full model suite to provide the community with a powerful, efficient, and reproducible baseline.

preprint, 2025, arXiv

CoHalLo: code hallucination localization via probing hidden layer vector

The localization of code hallucinations aims to identify specific lines of code containing hallucinations, helping developers to improve the reliability of AI-generated code more efficiently. Although recent studies have adopted several methods to detect code hallucination, most of these approaches remain limited to coarse-grained detection and lack specialized techniques for fine-grained hallucination localization. This study introduces a novel method, called CoHalLo, which achieves line-level code hallucination localization by probing the hidden-layer vectors from hallucination detection models. CoHalLo uncovers the key syntactic information driving the model's hallucination judgments and locates the hallucinating code lines accordingly. Specifically, we first fine-tune the hallucination detection model on manually annotated datasets to ensure that it learns features pertinent to code syntactic information. Subsequently, we designed a probe network that projects high-dimensional latent vectors onto a low-dimensional syntactic subspace, generating vector tuples and reconstructing the predicted abstract syntax tree (P-AST). By comparing P-AST with the original abstract syntax tree (O-AST) extracted from the input AI-generated code, we identify the key syntactic structures associated with hallucinations. This information is then used to pinpoint hallucinated code lines. To evaluate CoHalLo's performance, we manually collected a dataset of code hallucinations. The experimental results show that CoHalLo achieves a Top-1 accuracy of 0.4253, Top-3 accuracy of 0.6149, Top-5 accuracy of 0.7356, Top-10 accuracy of 0.8333, IFA of 5.73, Recall@1% Effort of 0.052721, and Effort@20% Recall of 0.155269, which outperforms the baseline methods.
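
The ranking metrics named in the abstract above can be illustrated with a minimal sketch. The abstract does not give the formulas, so the definitions here are assumptions based on how these metrics are commonly used for line-level localization: Top-k accuracy is taken as the fraction of samples with at least one truly hallucinated line among the k highest-ranked lines, and IFA (initial false alarm) as the number of clean lines inspected before the first hallucinated line is reached. The function names and the toy data are hypothetical and are not taken from CoHalLo.

```python
from typing import List, Sequence, Set, Tuple

# One sample = (line numbers ranked from most to least suspicious,
#               set of line numbers that are truly hallucinated).
Sample = Tuple[Sequence[int], Set[int]]


def top_k_accuracy(samples: List[Sample], k: int) -> float:
    """Fraction of samples with a hallucinated line in the top-k ranked lines.

    Assumed definition; the paper may define it differently.
    """
    hits = sum(
        any(line in truth for line in ranked[:k]) for ranked, truth in samples
    )
    return hits / len(samples)


def initial_false_alarm(ranked: Sequence[int], truth: Set[int]) -> int:
    """Number of clean lines inspected before the first hallucinated line.

    Returns len(ranked) if no hallucinated line appears in the ranking.
    """
    for inspected, line in enumerate(ranked):
        if line in truth:
            return inspected
    return len(ranked)


def mean_ifa(samples: List[Sample]) -> float:
    """Average IFA over all samples."""
    return sum(initial_false_alarm(r, t) for r, t in samples) / len(samples)


# Toy data (hypothetical): three generated snippets with one bad line each.
samples: List[Sample] = [
    ([3, 1, 2, 4], {1}),  # bad line ranked 2nd
    ([2, 5, 1], {2}),     # bad line ranked 1st
    ([4, 3, 2, 1], {1}),  # bad line ranked 4th
]

top1 = top_k_accuracy(samples, 1)  # 1 of 3 samples hit
top3 = top_k_accuracy(samples, 3)  # 2 of 3 samples hit
ifa = mean_ifa(samples)            # (1 + 0 + 3) / 3
```

Under these assumed definitions, a higher Top-k accuracy and a lower IFA both indicate that hallucinated lines are ranked closer to the top of the inspection list.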