Source author record

Dianyu Liu

Dianyu Liu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

hep-ex, hep-ph, Artificial Intelligence, Machine Learning

Catalog footprint

What is connected

4 works

4 topics

4 close collaborators

preprint, 2026, arXiv

SciHorizon-DataEVA: An Agentic System for AI-Readiness Evaluation of Heterogeneous Scientific Data

AI-for-Science (AI4Science) is increasingly transforming scientific discovery by embedding machine learning models into prediction, simulation, and hypothesis generation workflows across domains. However, the effectiveness of these models is fundamentally constrained by the AI-readiness of scientific data, for which no scalable and systematic evaluation mechanism currently exists. In this work, we propose SciHorizon-DataEVA, a novel agentic system for scalable AI-readiness evaluation of heterogeneous scientific data. At the evaluation-criteria level, we introduce the Sci-TQA2 principles, which organize AI-readiness into four complementary dimensions: Governance Trustworthiness, Data Quality, AI Compatibility, and Scientific Adaptability. Each dimension is decomposed into measurable atomic elements that enable fine-grained and executable assessment. To operationalize these principles at scale, we develop Sci-TQA2-Eval, a hierarchical multi-agent evaluation approach orchestrated through a directed, cyclic workflow. Sci-TQA2-Eval dynamically constructs dataset-aware evaluation specifications by combining lightweight dataset profiling, applicability-aware metric activation, and knowledge-augmented planning grounded in domain constraints and dataset-paper signals. These specifications are executed through an adaptive, tool-centric evaluation mechanism with built-in verification and self-correction, enabling scalable and reliable assessment across heterogeneous scientific data. Extensive experiments on scientific datasets spanning multiple domains demonstrate the effectiveness and generality of SciHorizon-DataEVA for principled AI-readiness evaluation.

preprint, 2022, arXiv

Machine learning of log-likelihood functions in global analysis of parton distributions

Modern analysis of parton distribution functions (PDFs) requires calculations of the log-likelihood functions from thousands of experimental data points, and scans of multi-dimensional parameter space with tens of degrees of freedom. In conventional analysis the Hessian approximation has been widely used for the estimation of the PDF uncertainties. The Lagrange Multiplier (LM) scan, while being a more faithful method, is less used due to computational limitations, and is the main focus of this study. We propose to use Neural Networks (NNs) and machine learning techniques to model the profile of the log-likelihood functions or cross sections for multi-dimensional parameter space in order to overcome those limitations; the resulting models work beyond the quadratic approximation while ensuring efficient scans of the full parameter space. We demonstrate the efficiency of the new approach in the framework of the CT18 global analysis of PDFs by constructing NNs for various target functions, and performing LM scans on PDFs and cross sections at hadron colliders. We further study the impact of the NOMAD dimuon data on constraining PDFs with the new approach, and find enhanced strange-quark distributions and reduced PDF uncertainties. Moreover, we show how the approach can be used to constrain new physics beyond the Standard Model (BSM) by a joint fit of both PDFs and Wilson coefficients of operators in the SM effective field theory.
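
To illustrate the contrast this abstract draws between the Hessian approximation and a Lagrange Multiplier scan, the following is a minimal toy sketch, not the paper's method or code. Everything in it is an assumption made for illustration: the two-parameter `chi2` with a quartic term, the linear observable X(a) = a[0], the tolerance of Delta chi^2 = 1, and all function names. No neural network is involved; the analytic toy function stands in for whatever model supplies the log-likelihood. For each multiplier value, the scan minimizes Psi(a) = chi2(a) + lambda * X(a) and records the resulting pair (X, chi2).

```python
import numpy as np

# Hypothetical toy chi^2: quadratic near its minimum at a = 0,
# plus a quartic term so the quadratic (Hessian) picture is inexact.
def chi2(a):
    return a[0] ** 2 + a[1] ** 2 + 0.5 * a[0] ** 4

def chi2_grad(a):
    return np.array([2 * a[0] + 2 * a[0] ** 3, 2 * a[1]])

def chi2_hess(a):
    return np.array([[2 + 6 * a[0] ** 2, 0.0], [0.0, 2.0]])

# Hypothetical linear observable X(a) = a[0]; X_GRAD is its gradient.
X_GRAD = np.array([1.0, 0.0])

def observable(a):
    return float(X_GRAD @ a)

def minimize_psi(lam, start, steps=50):
    """Newton iterations on Psi(a) = chi2(a) + lam * X(a)."""
    a = np.array(start, dtype=float)
    for _ in range(steps):
        grad = chi2_grad(a) + lam * X_GRAD
        a = a - np.linalg.solve(chi2_hess(a), grad)
    return a

def lm_scan(lambdas):
    """Return (X, chi2) at the constrained minimum for each multiplier."""
    points = []
    for lam in lambdas:
        a = minimize_psi(lam, start=[0.0, 0.0])
        points.append((observable(a), chi2(a)))
    return points

def hessian_delta_x(tolerance_sq=1.0):
    """Upper shift of X under chi2 ~ 0.5 * a^T H a around a = 0."""
    h = chi2_hess(np.zeros(2))
    return float(np.sqrt(tolerance_sq * X_GRAD @ np.linalg.solve(0.5 * h, X_GRAD)))

def lm_delta_x(tolerance_sq=1.0, n=2001):
    """Upper shift of X at which the scanned chi2 reaches the tolerance."""
    pts = lm_scan(np.linspace(0.0, -10.0, n))
    xs = np.array([p[0] for p in pts])
    chis = np.array([p[1] for p in pts])
    return float(np.interp(tolerance_sq, chis, xs))

if __name__ == "__main__":
    print("Hessian estimate:", hessian_delta_x())
    print("LM scan estimate:", lm_delta_x())
```

In this toy the quartic term makes chi2 rise faster than its quadratic expansion, so the scan yields a narrower allowed range for X than the Hessian estimate. The direction and size of such a difference depend entirely on the assumed chi2 and say nothing about the CT18 results.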

preprint, 2022, arXiv

Understanding PDF uncertainty on the $W$ boson mass measurements in CT18 global analysis

We study the dependence of the transverse mass distribution of the charged lepton and the missing energies on the parton distributions (PDFs) adapted to the $W$ boson mass measurements at the CDF and ATLAS experiments. We compare the shape variations of the distribution induced by different PDFs and find that the spread of predictions from different PDF sets can be much larger than the PDF uncertainty predicted by a specific PDF set. We suggest analyzing the experimental data using up-to-date PDFs for a better understanding of the PDF uncertainties in the $W$ boson mass measurements. We further carry out a series of Lagrange multiplier scans to identify the constraints on the transverse mass distribution imposed by individual data sets in the CT18 global analysis. In the case of the CDF measurement, the distribution is mostly sensitive to the $d$-quark PDFs in the intermediate-$x$ region, which is largely constrained by the DIS and Drell-Yan data on the deuteron target, as well as the Tevatron lepton charge asymmetry data.

preprint, 2021, arXiv

Constraints on neutrino non-standard interactions from LHC data with large missing transverse momentum

The possible non-standard interactions (NSIs) of neutrinos with matter play an important role in the global determination of neutrino properties. In our study we select various data sets from LHC measurements at 13 TeV with integrated luminosities of $35 \sim 139$ fb$^{-1}$, including production of a single jet, photon, $W/Z$ boson, or charged lepton accompanied by large missing transverse momentum. We derive constraints on neutral-current NSIs with quarks imposed by different data sets in a framework of either effective operators or simplified $Z'$ models. We use theoretical predictions of production induced by NSIs at next-to-leading order in QCD matched with parton showering, which stabilizes the theory predictions and results in more robust constraints. In a simplified $Z'$ model we obtain a 95% CLs upper limit on the conventional NSI strength $\epsilon$ of 0.042 and 0.0028 for a $Z'$ mass of 0.2 and 2 TeV, respectively. We also discuss possible improvements from future runs of the LHC with higher luminosities.