Researcher profile

Zheng Sun

Zheng Sun contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
12 works
0 followers
13 topics
4 close collaborators

Published work

12 published items

Preprint, 2026, arXiv

PAGER: Bridging the Semantic-Execution Gap in Point-Precise Geometric GUI Control

Large vision-language models have significantly advanced GUI agents, enabling executable interaction across web, mobile, and desktop interfaces. Yet these gains largely rely on a forgiving region-tolerant paradigm, where many nearby pixels inside the same component remain valid. Precise geometric construction breaks this assumption: actions must land on points in continuous canvas space rather than tolerant regions. Because geometric primitives carry ontological dependencies, a local coordinate error can induce cascading topological failures that distort downstream objects and invalidate the final construction. We identify this regime as precision-sensitive GUI tasks, requiring point-level accuracy, geometry-aware verification, and robustness to dependency-driven error propagation. To benchmark it, we introduce PAGE Bench, with 4,906 problems and over 224K process-supervised, pixel-level GUI actions. We further propose PAGER, a topology-aware agent that decomposes construction into dependency-structured planning and pixel-level execution. Pixel-grounded supervised tuning establishes executable action grammar, while precision-aligned reinforcement learning mitigates rollout-induced exposure bias through state-conditioned geometric feedback. Experiments reveal a pronounced Semantic-Execution Gap: general multimodal models can exceed 88% action type accuracy yet remain below 6% task success. PAGER closes this gap, delivering 4.1x higher task success than the strongest evaluated general baseline and raising step success rate from below 9% for GUI-specialized agents to over 62%, establishing a new state of the art for point-precise GUI control.

Preprint, 2022, arXiv

On Energy Laws and Stability of Runge--Kutta Methods for Linear Seminegative Problems

This paper presents a systematic theoretical framework to derive the energy identities of general implicit and explicit Runge--Kutta (RK) methods for linear seminegative systems. It generalizes the stability analysis of explicit RK methods in [Z. Sun and C.-W. Shu, SIAM J. Numer. Anal., 57 (2019), pp. 1158-1182]. The established energy identities provide a precise characterization of whether and how the energy dissipates in the RK discretization, thereby leading to weak and strong stability criteria of RK methods. Furthermore, we discover a unified energy identity for all the diagonal Padé approximations, based on an analytical Cholesky-type decomposition of a class of symmetric matrices. The structure of the matrices is very complicated, rendering the discovery of the unified energy identity and the proof of the decomposition highly challenging. Our proofs involve the construction of technical combinatorial identities and novel techniques from the theory of hypergeometric series. Our framework is motivated by a discrete analogue of the integration-by-parts technique and a series expansion of the continuous energy law. In some special cases, our analyses establish a close connection between the continuous and discrete energy laws, enhancing our understanding of their intrinsic mechanisms. Several specific examples of implicit methods are given to illustrate the discrete energy laws. A few numerical examples further confirm the theoretical properties.

Preprint, 2022, arXiv

Supersymmetry and R-symmetries in Wess-Zumino models: properties and model dataset construction

The Nelson-Seiberg theorem and its extensions relate supersymmetry breaking and R-symmetries in Wess-Zumino models. However, their applicability may be limited by previously found non-generic counterexamples. Constructing a dataset of R-symmetric Wess-Zumino models is useful for studying the occurrence of such counterexamples as well as for other purposes. This work gives a pedagogical review of the basics of supersymmetry in (3+1) dimensions, Wess-Zumino models and their supergravity extensions, the Nelson-Seiberg theorem and its extensions. We present a preliminary construction of the dataset of R-symmetric Wess-Zumino models with up to 5 chiral fields. Among 925 models in total, 20 of them with non-generic R-charges are counterexamples to both the Nelson-Seiberg theorem and its extensions. Thus the dataset gives an estimate of the accuracy of the field counting method based on these theorems. More constructions and applications of the dataset are expected in future work.

Preprint, 2022, arXiv

TSRFormer: Table Structure Recognition with Transformers

We present a new table structure recognition (TSR) approach, called TSRFormer, to robustly recognize the structures of complex tables with geometrical distortions from various table images. Unlike previous methods, we formulate table separation line prediction as a line regression problem instead of an image segmentation problem and propose a new two-stage DETR-based separator prediction approach, dubbed Separator REgression TRansformer (SepRETR), to predict separation lines from table images directly. To make the two-stage DETR framework work efficiently and effectively for the separation line prediction task, we propose two improvements: 1) a prior-enhanced matching strategy to solve the slow convergence issue of DETR; 2) a new cross-attention module to sample features from a high-resolution convolutional feature map directly so that high localization accuracy is achieved with low computational cost. After separation line prediction, a simple relation-network-based cell merging module is used to recover spanning cells. With these new techniques, our TSRFormer achieves state-of-the-art performance on several benchmark datasets, including SciTSR, PubTabNet and WTW. Furthermore, we have validated the robustness of our approach to tables with complex structures, borderless cells, large blank spaces, empty or spanning cells as well as distorted or even curved shapes on a more challenging real-world in-house dataset.

Preprint, 2021, arXiv

A formal notion of genericity and term-by-term vanishing superpotentials at supersymmetric vacua from R-symmetric Wess-Zumino models

It is known in previous literature that if a Wess-Zumino model with an R-symmetry gives a supersymmetric vacuum, the superpotential vanishes at the vacuum. In this work, we establish a formal notion of genericity, and show that if the R-symmetric superpotential has generic coefficients, the superpotential vanishes term-by-term at a supersymmetric vacuum. This result constrains the form of the superpotential which leads to a supersymmetric vacuum. It may contribute to a refined classification of R-symmetric Wess-Zumino models, and find applications in string constructions of vacua with small superpotentials. A similar result for a scalar potential system with a scaling symmetry is discussed.

Preprint, 2021, arXiv

Femtosecond dynamics of a polariton bosonic cascade at room temperature

Whispering gallery modes in a microwire are characterized by a nearly equidistant energy spectrum. In the strong exciton-photon coupling regime, this system represents a bosonic cascade: a ladder of discrete energy levels that sustains stimulated transitions between neighboring steps. In this work, using a femtosecond angle-resolved spectroscopic imaging technique, the ultrafast dynamics of polaritons in a bosonic cascade based on a one-dimensional ZnO whispering gallery microcavity is explicitly visualized. A clear ladder-form build-up process of the polariton condensates from higher to lower energy branches is observed, which is well reproduced by modeling using rate equations. Moreover, the polariton parametric scattering dynamics are distinguished on a timescale of hundreds of femtoseconds. Our understanding of the femtosecond condensation and scattering dynamics paves the way towards ultrafast coherent control of polaritons at room temperature, which is promising for high-speed all-optical integrated applications.

Preprint, 2020, arXiv

Error analysis of Runge--Kutta discontinuous Galerkin methods for linear time-dependent partial differential equations

In this paper, we present error estimates of fully discrete Runge--Kutta discontinuous Galerkin (DG) schemes for linear time-dependent partial differential equations. The analysis applies to explicit Runge--Kutta time discretizations of any order. For spatial discretization, a general discrete operator is considered, which covers various DG methods, such as the upwind-biased DG method, the central DG method, the local DG method and the ultra-weak DG method. We obtain error estimates for stable and consistent fully discrete schemes, if the solution is sufficiently smooth and a spatial operator with certain properties exists. Applications to schemes for hyperbolic conservation laws, the heat equation, the dispersive equation and the wave equation are discussed. In particular, we provide an alternative proof of optimal error estimates of local DG methods for equations with high order derivatives in one dimension, which does not rely on energy inequalities of auxiliary unknowns.

Preprint, 2020, arXiv

Joint Bilateral Learning for Real-time Universal Photorealistic Style Transfer

Photorealistic style transfer is the task of transferring the artistic style of an image onto a content target, producing a result that is plausibly taken with a camera. Recent approaches, based on deep neural networks, produce impressive results but are either too slow to run at practical resolutions, or still contain objectionable artifacts. We propose a new end-to-end model for photorealistic style transfer that is both fast and inherently generates photorealistic results. The core of our approach is a feed-forward neural network that learns local edge-aware affine transforms that automatically obey the photorealism constraint. When trained on a diverse set of images and a variety of styles, our model can robustly apply style transfer to an arbitrary pair of input images. Compared to the state of the art, our method produces visually superior results and is three orders of magnitude faster, enabling real-time performance at 4K on a mobile phone. We validate our method with ablation and user studies.

Preprint, 2020, arXiv

Observation of the Interlayer Exciton Gases in WSe$_2$ -p: WSe$_2$ Heterostructures

Interlayer excitons (IXs) possess a much longer lifetime than intralayer excitons due to the spatial separation of the electrons and holes; hence, they have been pursued to create exciton condensates for decades. The recent emergence of two-dimensional (2D) materials, such as transition metal dichalcogenides (TMDs), and of their van der Waals heterostructures (HSs), in which two different 2D materials are layered together, has created new opportunities to study IXs. Here we present the observation of IX gases within two stacked structures consisting of hBN/WSe$_2$/hBN/p: WSe$_2$/hBN. The IX energy of the two different structures differed by 82 meV due to the different thickness of the hBN spacer layer between the TMD layers. We demonstrate that the lifetime of the IXs is shortened when the temperature and the pump power increase. We attribute this nonlinear behavior to an Auger process.

Preprint, 2020, arXiv

Predicting quantum many-body dynamics with transferable neural networks

Machine learning (ML) architectures such as convolutional neural networks (CNNs) have garnered considerable recent attention in the study of quantum many-body systems. However, advanced ML approaches such as transfer learning have seldom been applied to such contexts. Here we demonstrate that an efficient and transferable sequence learning framework based on the simple recurrent unit (SRU) is capable of learning and accurately predicting the time evolution of the one-dimensional (1D) Ising model with simultaneous transverse and parallel magnetic fields, as quantitatively corroborated by relative entropy measurements and magnetization between the predicted and exact state distributions. At a cost of constant computational complexity, a larger many-body state evolution was predicted in an autoregressive way from just one initial state, without any guidance or knowledge of any Hamiltonian. Our work paves the way for future applications of advanced ML methods in quantum many-body dynamics using only knowledge from a smaller system.

Preprint, 2020, arXiv

The Nelson-Seiberg theorem generalized with nonpolynomial superpotentials

The Nelson-Seiberg theorem relates R-symmetries to F-term supersymmetry breaking, and provides a guiding rule for new physics model building beyond the Standard Model. A revision of the theorem gives a necessary and sufficient condition for supersymmetry breaking in models with polynomial superpotentials. This work revisits the theorem to include models with nonpolynomial superpotentials. With a generic R-symmetric superpotential, a singularity at the origin of the field space implies both R-symmetry breaking and supersymmetry breaking. We give a generalized necessary and sufficient condition for supersymmetry breaking which applies to both perturbative and nonperturbative models.

Preprint, 2019, arXiv

On structure-preserving discontinuous Galerkin methods for Hamiltonian partial differential equations: Energy conservation and multi-symplecticity

In this paper, we present and study discontinuous Galerkin (DG) methods for one-dimensional multi-symplectic Hamiltonian partial differential equations. We particularly focus on semi-discrete schemes with spatial discretization only, and show that the proposed DG methods can simultaneously preserve the multi-symplectic structure and energy conservation with a general class of numerical fluxes, which includes the well-known central and alternating fluxes. Applications to the wave equation, the Benjamin-Bona-Mahony equation, the Camassa-Holm equation, the Korteweg-de Vries equation and the nonlinear Schrödinger equation are discussed. Some numerical results are provided to demonstrate the accuracy and long time behavior of the proposed methods. Numerically, we observe that certain choices of numerical fluxes in the discussed class may help achieve better accuracy compared with the commonly used ones including the central fluxes.