Source author record

Pengyu Zhao

Pengyu Zhao appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Computer Vision Machine Learning Artificial Intelligence Computation and Language cond-mat.stat-mech

Catalog footprint

What is connected

4works

5topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

OctoBench: Benchmarking Scaffold-Aware Instruction Following in Repository-Grounded Agentic Coding

Modern coding scaffolds turn LLMs into capable software agents, but their ability to follow scaffold-specified instructions remains under-examined, especially when constraints are heterogeneous and persist across interactions. To fill this gap, we introduce OctoBench, which benchmarks scaffold-aware instruction following in repository-grounded agentic coding. OctoBench includes 34 environments and 217 tasks instantiated under three scaffold types, and is paired with 7,098 objective checklist items. To disentangle solving the task from following the rules, we provide an automated observation-and-scoring toolkit that captures full trajectories and performs fine-grained checks. Experiments on eight representative models reveal a systematic gap between task-solving and scaffold-aware compliance, underscoring the need for training and evaluation that explicitly targets heterogeneous instruction following. We release the benchmark to support reproducible benchmarking and to accelerate the development of more scaffold-aware coding agents.

preprint2022arXiv

DAIS: Automatic Channel Pruning via Differentiable Annealing Indicator Search

The convolutional neural network has achieved great success in fulfilling computer vision tasks despite large computation overhead against efficient deployment. Structured (channel) pruning is usually applied to reduce the model redundancy while preserving the network structure, such that the pruned network can be easily deployed in practice. However, existing structured pruning methods require hand-crafted rules which may lead to tremendous pruning space. In this paper, we introduce Differentiable Annealing Indicator Search (DAIS) that leverages the strength of neural architecture search in the channel pruning and automatically searches for the effective pruned model with given constraints on computation overhead. Specifically, DAIS relaxes the binarized channel indicators to be continuous and then jointly learns both indicators and model parameters via bi-level optimization. To bridge the non-negligible discrepancy between the continuous model and the target binarized model, DAIS proposes an annealing-based procedure to steer the indicator convergence towards binarized states. Moreover, DAIS designs various regularizations based on a priori structural knowledge to control the pruning sparsity and to improve model performance. Experimental results show that DAIS outperforms state-of-the-art pruning methods on CIFAR-10, CIFAR-100, and ImageNet.

preprint2022arXiv

Site and bond percolation on four-dimensional simple hypercubic lattices with extended neighborhoods

The asymptotic behavior of the percolation threshold $p_c$ and its dependence upon coordination number $z$ is investigated for both site and bond percolation on four-dimensional lattices with compact extended neighborhoods. Simple hypercubic lattices with neighborhoods up to 9th nearest neighbors are studied to high precision by means of Monte-Carlo simulations based upon a single-cluster growth algorithm. For site percolation, an asymptotic analysis confirms the predicted behavior $zp_c \sim 16 η_c = 2.086$ for large $z$, and finite-size corrections are accounted for by forms $p_c \sim 16 η_c/(z+b)$ and $p_c \sim 1- \exp(-16 η_c/z)$ where $η_c \approx 0.1304$ is the continuum percolation threshold of four-dimensional hyperspheres. For bond percolation, the finite-$z$ correction is found to be consistent with the prediction of Frei and Perkins, $zp_{c} - 1 \sim a_{1} (\ln z)/z$, although the behavior $zp_{c} - 1 \sim a_1 z^{-3/4}$ cannot be ruled out.

preprint2020arXiv

Differentiable Feature Aggregation Search for Knowledge Distillation

Knowledge distillation has become increasingly important in model compression. It boosts the performance of a miniaturized student network with the supervision of the output distribution and feature maps from a sophisticated teacher network. Some recent works introduce multi-teacher distillation to provide more supervision to the student network. However, the effectiveness of multi-teacher distillation methods are accompanied by costly computation resources. To tackle with both the efficiency and the effectiveness of knowledge distillation, we introduce the feature aggregation to imitate the multi-teacher distillation in the single-teacher distillation framework by extracting informative supervision from multiple teacher feature maps. Specifically, we introduce DFA, a two-stage Differentiable Feature Aggregation search method that motivated by DARTS in neural architecture search, to efficiently find the aggregations. In the first stage, DFA formulates the searching problem as a bi-level optimization and leverages a novel bridge loss, which consists of a student-to-teacher path and a teacher-to-student path, to find appropriate feature aggregations. The two paths act as two players against each other, trying to optimize the unified architecture parameters to the opposite directions while guaranteeing both expressivity and learnability of the feature aggregation simultaneously. In the second stage, DFA performs knowledge distillation with the derived feature aggregation. Experimental results show that DFA outperforms existing methods on CIFAR-100 and CINIC-10 datasets under various teacher-student settings, verifying the effectiveness and robustness of the design.