Researcher profile

Zhiwei Deng

Zhiwei Deng contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 17 - UnverifiedVerification L1Unclaimed author
4works
0followers
4topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

4 published item(s)

preprint2026arXiv

Formalize, Don't Optimize: The Heuristic Trap in LLM-Generated Combinatorial Solvers

Large Language Models (LLMs) struggle to solve complex combinatorial problems through direct reasoning, so recent neuro-symbolic systems increasingly use them to synthesize executable solvers. A central design question is how the LLM should represent the solver, and whether it should also attempt to optimize search. We introduce CP-SynC-XL, a benchmark of 100 combinatorial problems (4,577 instances), and evaluate three solver-construction paradigms: native algorithmic search (Python), constraint modeling through a Python solver API (Python + OR-Tools), and declarative constraint modeling (MiniZinc + OR-Tools). We find a consistent representational divergence: Python + OR-Tools attains the highest correctness across LLMs, while MiniZinc + OR-Tools has lower absolute coverage despite using the same OR-Tools back-end. Native Python is the most likely to return a schema-valid solution that fails verification, whereas solver-backed paths preserve higher conditional fidelity. On the heuristic axis, prompting for search optimization yields only small median speed-ups (1.03-1.12x) and a strongly bimodal effect: many instances slow down, and correctness drops sharply on a long tail of problems. A paired code-level audit traces these regressions to a recurring heuristic trap. Under an efficiency-oriented prompt, the LLM may replace complete search with local approximations (Python), inject unverified bounds (Python + OR-Tools), or add redundant declarative machinery that overwhelms or over-constrains the model (MiniZinc + OR-Tools). These findings support a conservative design principle for LLM-generated combinatorial solvers: use the LLM primarily to formalize variables, constraints, and objectives for verified solvers, and separately check any LLM-authored search optimization before use.

preprint2020arXiv

BabyWalk: Going Farther in Vision-and-Language Navigation by Taking Baby Steps

Learning to follow instructions is of fundamental importance to autonomous agents for vision-and-language navigation (VLN). In this paper, we study how an agent can navigate long paths when learning from a corpus that consists of shorter ones. We show that existing state-of-the-art agents do not generalize well. To this end, we propose BabyWalk, a new VLN agent that is learned to navigate by decomposing long instructions into shorter ones (BabySteps) and completing them sequentially. A special design memory buffer is used by the agent to turn its past experiences into contexts for future steps. The learning process is composed of two phases. In the first phase, the agent uses imitation learning from demonstration to accomplish BabySteps. In the second phase, the agent uses curriculum-based reinforcement learning to maximize rewards on navigation tasks with increasingly longer instructions. We create two new benchmark datasets (of long navigation tasks) and use them in conjunction with existing ones to examine BabyWalk's generalization ability. Empirical results show that BabyWalk achieves state-of-the-art results on several metrics, in particular, is able to follow long instructions better. The codes and the datasets are released on our project page https://github.com/Sha-Lab/babywalk.

preprint2020arXiv

Evolving Graphical Planner: Contextual Global Planning for Vision-and-Language Navigation

The ability to perform effective planning is crucial for building an instruction-following agent. When navigating through a new environment, an agent is challenged with (1) connecting the natural language instructions with its progressively growing knowledge of the world; and (2) performing long-range planning and decision making in the form of effective exploration and error correction. Current methods are still limited on both fronts despite extensive efforts. In this paper, we introduce the Evolving Graphical Planner (EGP), a model that performs global planning for navigation based on raw sensory input. The model dynamically constructs a graphical representation, generalizes the action space to allow for more flexible decision making, and performs efficient planning on a proxy graph representation. We evaluate our model on a challenging Vision-and-Language Navigation (VLN) task with photorealistic images and achieve superior performance compared to previous navigation architectures. For instance, we achieve a 53% success rate on the test split of the Room-to-Room navigation task through pure imitation learning, outperforming previous navigation architectures by up to 5%.

preprint2020arXiv

Take the Scenic Route: Improving Generalization in Vision-and-Language Navigation

In the Vision-and-Language Navigation (VLN) task, an agent with egocentric vision navigates to a destination given natural language instructions. The act of manually annotating these instructions is timely and expensive, such that many existing approaches automatically generate additional samples to improve agent performance. However, these approaches still have difficulty generalizing their performance to new environments. In this work, we investigate the popular Room-to-Room (R2R) VLN benchmark and discover that what is important is not only the amount of data you synthesize, but also how you do it. We find that shortest path sampling, which is used by both the R2R benchmark and existing augmentation methods, encode biases in the action space of the agent which we dub as action priors. We then show that these action priors offer one explanation toward the poor generalization of existing works. To mitigate such priors, we propose a path sampling method based on random walks to augment the data. By training with this augmentation strategy, our agent is able to generalize better to unknown environments compared to the baseline, significantly improving model performance in the process.