Researcher profile

Xiaoxu Zhang

Xiaoxu Zhang contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 13 - Unverified | Verification L1 | Unclaimed author
2 works
0 followers
4 topics
4 close collaborators

Published work

2 published items

Preprint | 2026 | arXiv

NL2Repo-Bench: Towards Long-Horizon Repository Generation Evaluation of Coding Agents

Recent advances in coding agents suggest rapid progress toward autonomous software development, yet existing benchmarks fail to rigorously evaluate the long-horizon capabilities required to build complete software systems. Most prior evaluations focus on localized code generation, scaffolded completion, or short-term repair tasks, leaving open the question of whether agents can sustain coherent reasoning, planning, and execution over the extended horizons demanded by real-world repository construction. To address this gap, we present NL2Repo-Bench, a benchmark explicitly designed to evaluate the long-horizon repository generation ability of coding agents. Given only a single natural-language requirements document and an empty workspace, agents must autonomously design the architecture, manage dependencies, implement multi-module logic, and produce a fully installable Python library. Our experiments across state-of-the-art open- and closed-source models reveal that long-horizon repository generation remains largely unsolved: even the strongest agents achieve below 40% average test pass rates and rarely complete an entire repository correctly. Detailed analysis uncovers fundamental long-horizon failure modes, including premature termination, loss of global coherence, fragile cross-file dependencies, and inadequate planning over hundreds of interaction steps. NL2Repo-Bench establishes a rigorous, verifiable testbed for measuring sustained agentic competence and highlights long-horizon reasoning as a central bottleneck for the next generation of autonomous coding agents.

Preprint | 2020 | arXiv

Modeling, Analysis, and Optimization of Grant-Free NOMA in Massive MTC via Stochastic Geometry

Massive machine-type communications (mMTC) is a crucial scenario for supporting booming Internet of Things (IoT) applications. In mMTC, although a large number of devices are registered to an access point (AP), very few of them are active with uplink short-packet transmission at the same time, which requires novel protocol and receiver designs to enable efficient data transmission and accurate multi-user detection (MUD). To address this problem, the grant-free non-orthogonal multiple access (GF-NOMA) protocol has been proposed. In GF-NOMA, active devices can directly transmit their preambles and data symbols together within one time frame, without a grant from the AP. Compressive sensing (CS)-based receivers are adopted for non-orthogonal preamble (NOP)-based MUD, and successive interference cancellation is exploited to decode the superimposed data signals. In this paper, we model, analyze, and optimize the CS-based GF-NOMA mMTC system via stochastic geometry (SG), from the perspective of network deployment. Based on the SG network model, we first analyze the success probability as well as the channel estimation error of the CS-based MUD in the preamble phase and then analyze the average aggregate data rate in the data phase. As IoT applications demand low energy consumption, low infrastructure cost, and flexible deployment, we optimize the energy efficiency and AP coverage efficiency of GF-NOMA via numerical methods. The validity of our analysis is verified via Monte Carlo simulations. Simulation results also show that CS-based GF-NOMA with NOP yields better MUD and data rate performance than contention-based GF-NOMA with orthogonal preambles and CS-based grant-free orthogonal multiple access.