Researcher profile

Pengxiang Zhao

Pengxiang Zhao contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 17 - Unverified · Verification L1 · Unclaimed author
4 works
0 followers
5 topics
4 close collaborators

Published work

4 published items

preprint · 2026 · arXiv

A3: Android Agent Arena for Mobile GUI Agents with Essential-State Procedural Evaluation

The advancement of Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs) has catalyzed the development of mobile graphical user interface (GUI) AI agents, which are designed to autonomously perform tasks on mobile devices. However, a significant gap persists in mobile GUI agent evaluation: existing benchmarks predominantly rely on either static frame assessments such as AndroidControl or offline static apps such as AndroidWorld, and thus fail to capture agent performance in dynamic, real-world online mobile apps. To address this gap, we present Android Agent Arena (A3), a novel "essential-state" based procedural evaluation system for mobile GUI agents. A3 introduces a benchmark of 100 tasks derived from 20 widely used, dynamic online apps across 20 categories from the Google Play Store, ensuring comprehensive evaluation. A3 also presents a novel "essential-state" based procedural evaluation method that leverages MLLMs as reward models to progressively verify task completion and process achievement. This evaluation approach addresses the limitations of traditional function-based evaluation methods on online dynamic apps. Furthermore, A3 includes a toolkit to streamline Android device interaction, reset online environments and apps, and facilitate data collection from both human and agent demonstrations. The complete A3 system, including the benchmark and tools, will be publicly released to provide a robust foundation for future research and development in mobile GUI agents.

preprint · 2026 · arXiv

How Order-Sensitive Are LLMs? OrderProbe for Deterministic Structural Reconstruction

Large language models (LLMs) excel at semantic understanding, yet their ability to reconstruct internal structure from scrambled inputs remains underexplored. Sentence-level restoration is ill-posed for automated evaluation because multiple valid word orders often exist. We introduce OrderProbe, a deterministic benchmark for structural reconstruction using fixed four-character expressions in Chinese, Japanese, and Korean, which have a unique canonical order and thus support exact-match scoring. We further propose a diagnostic framework that evaluates models beyond recovery accuracy, including semantic fidelity, logical validity, consistency, robustness sensitivity, and information density. Experiments on twelve widely used LLMs show that structural reconstruction remains difficult even for frontier systems: zero-shot recovery frequently falls below 35%. We also observe a consistent dissociation between semantic recall and structural planning, suggesting that structural robustness is not an automatic byproduct of semantic competence.

preprint · 2022 · arXiv

Network Bandwidth Allocation Problem For Cloud Computing

Cloud computing enables ubiquitous, convenient, and on-demand network access to a shared pool of computing resources. Cloud computing technologies create tremendous commercial value in various areas, and many scientific challenges have arisen accordingly. Transmitting data through networks has several distinctive characteristics: nonlinear, nonconvex, and even noncontinuous cost functions generated by pricing schemes; a periodically updated network topology; and replicable data within network nodes. Because of these characteristics, data transfer scheduling is a very challenging problem in both engineering and scientific terms. Moreover, bandwidth cost is a major component of the operating cost for cloud providers, so reducing it is extremely important for them to supply service at minimal cost. We propose the Network Bandwidth Allocation (NBA) problem for cloud computing and formulate it as an integer programming model at a high level, with which more comprehensive and rigorous scientific studies become possible. We also show that the NBA problem captures some of the major cloud computing scenarios, including the content delivery network (CDN), the live video delivery network (LVDN), the real-time communication network (RTCN), and the cloud wide area network (Cloud-WAN).

preprint · 2020 · arXiv

Mitochondria in higher plants possess H2 evolving activity which is closely related to complex I

Hydrogenases occupy a central place in the energy metabolism of anaerobic bacteria. Although the structure of mitochondrial complex I is similar to that of hydrogenase, whether it has hydrogen metabolic activity remains unclear. Here, we show that an H2-evolving activity exists in the mitochondria of higher plants and is closely related to complex I, especially the region around the ubiquinone-binding site. H2 production could be inhibited by rotenone and ubiquinone. Hypoxia could simultaneously promote H2 evolution and succinate accumulation. The redox properties of the quinone pool, adjusted by NADH or succinate according to oxygen concentration, act as a valve to control the flow of protons and electrons and the production of H2. The coupling of the H2-evolving activity of mitochondrial complex I with metabolic regulation reveals a more effective mechanism for regulating redox homeostasis. Considering the ubiquity of mitochondria in eukaryotes, H2 metabolism might be an innate function of higher organisms. This may serve to explain, at least in part, the broad physiological effects of H2.