Researcher profile

Yadong Lu

Yadong Lu contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 17 - UnverifiedVerification L1Unclaimed author
4works
0followers
8topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

4 published item(s)

preprint2025arXiv

Beyond Clicking:A Step Towards Generalist GUI Grounding via Text Dragging

Graphical user interface (GUI) grounding, the process of mapping human instructions to GUI actions, serves as a fundamental basis to autonomous GUI agents. While existing grounding models achieve promising performance to simulate the mouse click action on various click-based benchmarks, another essential mode of mouse interaction, namely dragging, remains largely underexplored. Yet, dragging the mouse to select and manipulate textual content represents a prevalent and important usage in practical GUI scenarios. To narrow this gap, we first introduce GUI-Drag, a diverse dataset of 161K text dragging examples synthesized through a scalable pipeline. To support systematic and robust evaluation, we further construct ScreenDrag, a benchmark with 5,333 examples spanning three levels of interface context, together with three dedicated metrics designed for assessing text dragging capability. Models trained on GUI-Drag with an efficient continual training strategy achieve substantial improvements on ScreenDrag, while preserving the original click-based performance on ScreenSpot, ScreenSpot-v2, and OSWorld-G. Our work encourages further research on broader GUI grounding beyond just clicking and paves way toward a truly generalist GUI grounding model. All benchmark, data, checkpoints, and code are open-sourced and available at https://osu-nlp-group.github.io/GUI-Drag.

preprint2022arXiv

Resolving Extreme Jet Substructure

We study the effectiveness of theoretically-motivated high-level jet observables in the extreme context of jets with a large number of hard sub-jets (up to $N=8$). Previous studies indicate that high-level observables are powerful, interpretable tools to probe jet substructure for $N\le 3$ hard sub-jets, but that deep neural networks trained on low-level jet constituents match or slightly exceed their performance. We extend this work for up to $N=8$ hard sub-jets, using deep particle-flow networks (PFNs) and Transformer based networks to estimate a loose upper bound on the classification performance. A fully-connected neural network operating on a standard set of high-level jet observables, 135 $\textrm{N}$-subjetiness observables and jet mass, reach classification accuracy of 86.90\%, but fall short of the PFN and Transformer models, which reach classification accuracies of 89.19\% and 91.27\% respectively, suggesting that the constituent networks utilize information not captured by the set of high-level observables. We then identify additional high-level observables which are able to narrow this gap, and utilize LASSO regularization for feature selection to identify and rank the most relevant observables and provide further insights into the learning strategies used by the constituent-based neural networks. The final model contains only 31 high-level observables and is able to match the performance of the PFN and approximate the performance of the Transformer model to within 2\%.

preprint2021arXiv

Progressive Neural Image Compression with Nested Quantization and Latent Ordering

We present PLONQ, a progressive neural image compression scheme which pushes the boundary of variable bitrate compression by allowing quality scalable coding with a single bitstream. In contrast to existing learned variable bitrate solutions which produce separate bitstreams for each quality, it enables easier rate-control and requires less storage. Leveraging the latent scaling based variable bitrate solution, we introduce nested quantization, a method that defines multiple quantization levels with nested quantization grids, and progressively refines all latents from the coarsest to the finest quantization level. To achieve finer progressiveness in between any two quantization levels, latent elements are incrementally refined with an importance ordering defined in the rate-distortion sense. To the best of our knowledge, PLONQ is the first learning-based progressive image coding scheme and it outperforms SPIHT, a well-known wavelet-based progressive image codec.

preprint2021arXiv

SARM: Sparse Autoregressive Model for Scalable Generation of Sparse Images in Particle Physics

Generation of simulated data is essential for data analysis in particle physics, but current Monte Carlo methods are very computationally expensive. Deep-learning-based generative models have successfully generated simulated data at lower cost, but struggle when the data are very sparse. We introduce a novel deep sparse autoregressive model (SARM) that explicitly learns the sparseness of the data with a tractable likelihood, making it more stable and interpretable when compared to Generative Adversarial Networks (GANs) and other methods. In two case studies, we compare SARM to a GAN model and a non-sparse autoregressive model. As a quantitative measure of performance, we compute the Wasserstein distance ($W_p$) between the distributions of physical quantities calculated on the generated images and on the training images. In the first study, featuring images of jets in which 90% of the pixels are zero-valued, SARM produces images with $W_p$ scores that are 24-52% better than the scores obtained with other state-of-the-art generative models. In the second study, on calorimeter images in the vicinity of muons where 98% of the pixels are zero-valued, SARM produces images with $W_p$ scores that are 66-68% better. Similar observations made with other metrics confirm the usefulness of SARM for sparse data in particle physics. Original data and software will be made available upon acceptance of the manuscript from the UCI Machine Learning in Physics web portal at: http://mlphysics.ics.uci.edu/.