Source author record

Zhao Li

Zhao Li appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher | Unclaimed source record

hep-ph; Artificial Intelligence; hep-ex; Machine Learning; Information Retrieval; physics.ins-det; astro-ph.GA; astro-ph.IM; Computation and Language; Computer Vision; Databases; Distributed, Parallel, and Cluster Computing; gr-qc; Neural and Evolutionary Computing; physics.app-ph; quant-ph

Catalog footprint

What is connected

35 works

16 topics

4 close collaborators

preprint 2026 arXiv

Multi-domain Multi-modal Document Classification Benchmark with a Multi-level Taxonomy

Document classification forms the backbone of modern enterprise content management, yet existing benchmarks remain trapped in oversimplified paradigms -- single domain settings with flat label structures -- that bear little resemblance to the hierarchical, multi-modal, and cross-domain nature of real-world business documents. This gap not only misrepresents practical complexity but also stifles progress toward industrially viable document intelligence. To bridge this gap, we construct the first Multi-level, Multi-domain, Multi-modal document classification Benchmark (MMM-Bench). MMM-Bench includes (1) a deeply hierarchical taxonomy spanning five levels that capture the authentic organizational logic of business documentation; and (2) 5,990 real-world multi-modal documents meticulously curated from 12 commercial domains in Alibaba. Each document is manually annotated with a complete hierarchical path by domain experts. We establish comprehensive baselines on MMM-Bench, which consists of open-weight models and API-based models. Through systematic experiments, we identify four fundamental challenges within MMM-Bench and propose corresponding insights. To provide a solid foundation for advancing research in multi-level, multi-domain document classification, we release all of the data and the evaluation toolkit of MMM-Bench at https://github.com/MMMDC-Bench/MMMDC-Bench.
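
The abstract describes documents annotated with a complete path through a five-level taxonomy. One way such predictions could be scored is sketched below as a minimal illustration; the per-level prefix accuracy metric, the function name `level_accuracy`, and the example labels are all assumptions made here, not the evaluation toolkit released with MMM-Bench.

```python
from typing import List, Sequence


def level_accuracy(
    gold_paths: Sequence[Sequence[str]],
    pred_paths: Sequence[Sequence[str]],
    depth: int = 5,
) -> List[float]:
    """Return prefix accuracy at each taxonomy level (hypothetical metric).

    A prediction counts as correct at level k only if its first k labels
    all match the gold path, so accuracy can never increase with depth.
    Every path is assumed to contain exactly `depth` labels.
    """
    if len(gold_paths) != len(pred_paths):
        raise ValueError("gold and predicted path lists differ in length")
    if not gold_paths:
        raise ValueError("at least one document is required")

    correct = [0] * depth
    for gold, pred in zip(gold_paths, pred_paths):
        for k in range(1, depth + 1):
            if tuple(gold[:k]) == tuple(pred[:k]):
                correct[k - 1] += 1
    return [count / len(gold_paths) for count in correct]


# Illustrative labels only; they are not taken from the benchmark.
gold = [("a", "b", "c", "d", "e"), ("a", "x", "y", "z", "w")]
pred = [("a", "b", "c", "d", "q"), ("a", "x", "k", "z", "w")]
print(level_accuracy(gold, pred))
```

Scoring path prefixes rather than individual labels means an error near the root is penalized at every deeper level, which is one common design choice for hierarchical classification.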

preprint 2025 arXiv

Gravitational Lensing of Gravitational Waves: Spin-wave Optics through Black Hole Scattering

Gravitational-wave (GW) scattering in strong gravitational fields is a central problem in GW lensing. Yet, conventional treatments based on asymptotic expansions suffer from divergences and become unreliable near the optical axis. In this work, we present a rigorous calculation of GW scattering by a Schwarzschild black hole (BH) within the BH perturbation theory. By placing the observer at a finite distance and abandoning the asymptotic expansion of radial wave functions, we obtain a well-convergent partial-wave description without invoking any regularization scheme, thereby naturally resolving the divergences of the partial-wave series and the Poisson spot. We numerically computed the scattered GW waveforms by reconstructing the physical $+$ and $\times$ polarizations from the master variables, revealing the formation of the Poisson spot and pronounced wavefront distortions. A systematic comparison with conventional asymptotic approaches shows that they reproduce only qualitative features at large scattering angles and fail in the forward-scattering region. We further compare the frequency-domain transmission factors derived from the scattering formalism with those obtained from the Kirchhoff diffraction integral, finding significant discrepancies at high frequencies due to the latter's neglect of long-range gravitational effects and polarization evolution. Our results establish a stable and physically transparent framework for GW scattering in strong-field regimes and provide a solid foundation for accurate modeling of GW lensing beyond standard approximations.

preprint 2023 arXiv

GraphTheta: A Distributed Graph Neural Network Learning System With Flexible Training Strategy

Graph neural networks (GNNs) have been demonstrated as a powerful tool for analyzing non-Euclidean graph data. However, the lack of efficient distributed graph learning systems severely hinders applications of GNNs, especially when graphs are big and GNNs are relatively deep. Herein, we present GraphTheta, the first distributed and scalable graph learning system built upon vertex-centric distributed graph processing with neural network operators implemented as user-defined functions. This system supports multiple training strategies and enables efficient and scalable big-graph learning on distributed (virtual) machines with low memory. To facilitate graph convolutions, GraphTheta puts forward a new graph learning abstraction named NN-TGAR to bridge the gap between graph processing and graph deep learning. A distributed graph engine is proposed to conduct the stochastic gradient descent optimization with a hybrid-parallel execution, and a new cluster-batched training strategy is supported. We evaluate GraphTheta using several datasets with network sizes ranging from small-, modest- to large-scale. Experimental results show that GraphTheta can scale well to 1,024 workers for training an in-house developed GNN on an industry-scale Alipay dataset of 1.4 billion nodes and 4.1 billion attributed edges, with a cluster of CPU virtual machines (dockers) of small memory each (5$\sim$12GB). Moreover, GraphTheta can outperform DistDGL by up to $2.02\times$, with better scalability, and GraphLearn by up to $30.56\times$. As for model accuracy, GraphTheta is capable of learning as good GNNs as existing frameworks. To the best of our knowledge, this work presents the largest edge-attributed GNN learning task in the literature.

preprint 2022 arXiv

A Comprehensive Survey on Deep Clustering: Taxonomy, Challenges, and Future Directions

Clustering is a fundamental machine learning task which has been widely studied in the literature. Classic clustering methods follow the assumption that data are represented as features in a vectorized form through various representation learning techniques. As the data become increasingly complicated and complex, the shallow (traditional) clustering methods can no longer handle the high-dimensional data type. With the huge success of deep learning, especially the deep unsupervised learning, many representation learning techniques with deep architectures have been proposed in the past decade. Recently, the concept of Deep Clustering, i.e., jointly optimizing the representation learning and clustering, has been proposed and hence attracted growing attention in the community. Motivated by the tremendous success of deep learning in clustering, one of the most fundamental machine learning tasks, and the large number of recent advances in this direction, in this paper we conduct a comprehensive survey on deep clustering by proposing a new taxonomy of different state-of-the-art approaches. We summarize the essential components of deep clustering and categorize existing methods by the ways they design interactions between deep representation learning and clustering. Moreover, this survey also provides the popular benchmark datasets, evaluation metrics and open-source implementations to clearly illustrate various experimental settings. Last but not least, we discuss the practical applications of deep clustering and suggest challenging topics deserving further investigations as future directions.

preprint 2022 arXiv

Community Trend Prediction on Heterogeneous Graph in E-commerce

In online shopping, ever-changing fashion trends require merchants to prepare more differentiated products to meet diversified demands, and e-commerce platforms need to capture market trends with foresight. For trend prediction, the attribute tags, as the essential description of items, can genuinely reflect the decision basis of consumers. However, few existing works explore the attribute trend in a specific community for e-commerce. In this paper, we focus on community trend prediction on the item attribute and propose a unified framework that combines the dynamic evolution of two graph patterns to predict the attribute trend in a specific community. Specifically, we first design a community-attribute bipartite graph at each time step to learn the collaboration of different communities. Next, we transform the bipartite graph into a hypergraph to exploit the associations of different attribute tags in one community. Lastly, we introduce a dynamic evolution component based on recurrent neural networks to capture the fashion trend of attribute tags. Extensive experiments on three real-world datasets in a large e-commerce platform show the superiority of the proposed approach over several strong alternatives and demonstrate the ability to discover the community trend in advance.

preprint 2022 arXiv

Defending Against Backdoor Attack on Graph Nerual Network by Explainability

Backdoor attacks are a powerful class of attacks on deep learning models. Recently, the vulnerability of GNNs to backdoor attacks has been demonstrated, especially on the graph classification task. In this paper, we propose the first backdoor detection and defense method for GNNs. Most backdoor attacks depend on injecting a small but influential trigger into clean samples. For graph data, current backdoor attacks focus on manipulating the graph structure to inject the trigger. We find that there are apparent differences between benign samples and malicious samples in some explanatory evaluation metrics, such as fidelity and infidelity. After identifying a malicious sample, the explainability of the GNN model can help us capture the most significant subgraph, which is probably the trigger in a trojan graph. We use various datasets and different attack settings to demonstrate the effectiveness of our defense method. In all settings, the attack success rate decreases considerably.

preprint 2022 arXiv

GIFT: Graph-guIded Feature Transfer for Cold-Start Video Click-Through Rate Prediction

Short video has witnessed rapid growth in the past few years on e-commerce platforms like Taobao. To ensure the freshness of the content, platforms need to release a large number of new videos every day, making conventional click-through rate (CTR) prediction methods suffer from the item cold-start problem. In this paper, we propose GIFT, an efficient Graph-guIded Feature Transfer system, to take full advantage of the rich information of warmed-up videos to compensate for the cold-start ones. Specifically, we establish a heterogeneous graph that contains physical and semantic linkages to guide the feature transfer process from warmed-up videos to cold-start videos. The physical linkages represent explicit relationships, while the semantic linkages measure the proximity of multi-modal representations of two videos. We carefully design the feature transfer function to be aware of different types of transferred features (e.g., id representations and historical statistics) from different metapaths on the graph. We conduct extensive experiments on a large real-world dataset, and the results show that our GIFT system outperforms SOTA methods significantly and brings a 6.82% lift on CTR on the homepage of the Taobao App.

preprint 2022 arXiv

One-loop squared amplitudes for hadronic $tW$ production at next-to-next-to-leading order in QCD

We present the analytic results of one-loop squared amplitudes for $tW$ production at a hadron collider. The calculation is performed using the method of differential equations. After renormalization, we have checked that the infrared divergences agree with the general structure predicted by anomalous dimensions. The finite remainder contributes to the next-to-next-to-leading order hard function, one of the essential ingredients in the factorization formula of the cross section near the infrared region, which can be used in resummation of all-order soft gluon effects or a differential next-to-next-to-leading order calculation based on the phase space slicing method.

preprint 2022 arXiv

One-off Negative Sequential Pattern Mining

Negative sequential pattern mining (SPM) is an important SPM research topic. Unlike positive SPM, negative SPM can discover events that should have occurred but have not occurred, and it can be used for financial risk management and fraud detection. However, existing methods generally ignore the repetitions of the pattern and do not consider gap constraints, which can lead to mining results containing a large number of patterns that users are not interested in. To solve this problem, this paper discovers frequent one-off negative sequential patterns (ONPs). This problem has the following two characteristics. First, the support is calculated under the one-off condition, which means that any character in the sequence can only be used once at most. Second, the gap constraint can be given by the user. To efficiently mine patterns, this paper proposes the ONP-Miner algorithm, which employs depth-first and backtracking strategies to calculate the support. Therefore, ONP-Miner can effectively avoid creating redundant nodes and parent-child relationships. Moreover, to effectively reduce the number of candidate patterns, ONP-Miner uses pattern join and pruning strategies to generate and further prune the candidate patterns, respectively. Experimental results show that ONP-Miner not only improves the mining efficiency, but also has better mining performance than the state-of-the-art algorithms. More importantly, ONP mining can find more interesting patterns in traffic volume data to predict future traffic.

preprint 2022 arXiv

Re-weighting Negative Samples for Model-Agnostic Matching

Recommender Systems (RS), as an efficient tool to discover items of interest to users from a very large corpus, have attracted more and more attention from academia and industry. As the initial stage of RS, large-scale matching is fundamental yet challenging. A typical recipe is to learn user and item representations with a two-tower architecture and then calculate the similarity score between both representation vectors, which however still struggles with how to properly deal with negative samples. In this paper, we find that the common practice of randomly sampling negative samples from the entire space and treating them equally is not an optimal choice, since the negative samples from different sub-spaces at different stages have different importance to a matching model. To address this issue, we propose a novel method named Unbiased Model-Agnostic Matching Approach (UMA$^2$). It consists of two basic modules including 1) General Matching Model (GMM), which is model-agnostic and can be implemented as any embedding-based two-tower model; and 2) Negative Samples Debias Network (NSDN), which discriminates negative samples by borrowing the idea of Inverse Propensity Weighting (IPW) and re-weights the loss in GMM. UMA$^2$ seamlessly integrates these two modules in an end-to-end multi-task learning framework. Extensive experiments on both a real-world offline dataset and an online A/B test demonstrate its superiority over state-of-the-art methods.
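
The abstract mentions re-weighting the loss on negative samples with Inverse Propensity Weighting. The sketch below shows the generic IPW idea only: each negative sample's loss is scaled by the inverse of an estimated propensity, so under-sampled regions count more. It is not the NSDN module from the paper; the function names, the clipping threshold `clip`, and the self-normalized averaging are assumptions made for illustration.

```python
import math
from typing import Sequence


def negative_log_loss(score: float) -> float:
    """Cross-entropy for a negative label: -log(1 - sigmoid(score)).

    Written as softplus(score) in a numerically stable form.
    """
    return max(score, 0.0) + math.log1p(math.exp(-abs(score)))


def ipw_negative_loss(
    scores: Sequence[float],
    propensities: Sequence[float],
    clip: float = 0.05,
) -> float:
    """Self-normalized IPW average of the negative-sample losses.

    `propensities` are assumed to be estimated probabilities that each
    negative was sampled; they are clipped from below (hypothetical
    threshold) to keep individual weights bounded.
    """
    if len(scores) != len(propensities) or not scores:
        raise ValueError("scores and propensities must be non-empty and aligned")
    weights = [1.0 / max(p, clip) for p in propensities]
    weighted = sum(w * negative_log_loss(s) for w, s in zip(weights, scores))
    return weighted / sum(weights)


# A rarely sampled negative (propensity 0.1) dominates the average.
print(ipw_negative_loss([2.0, -2.0], [0.1, 0.9]))
```

With equal propensities the weights cancel and the result reduces to the ordinary mean loss, which is a quick sanity check on any re-weighting scheme of this kind.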

preprint 2022 arXiv

RMGN: A Regional Mask Guided Network for Parser-free Virtual Try-on

Virtual try-on (VTON) aims at fitting target clothes to reference person images and is widely adopted in e-commerce. Existing VTON approaches can be categorized into Parser-Based (PB) and Parser-Free (PF) methods, depending on whether they rely on parser information to mask the person's clothes and synthesize try-on images. Although abandoning parser information has improved the applicability of PF methods, the ability to synthesize details has also been sacrificed. As a result, distraction from the original clothes may persist in synthesized images, especially in complicated postures and high-resolution applications. To address this issue, we propose a novel PF method named Regional Mask Guided Network (RMGN). More specifically, a regional mask is proposed to explicitly fuse the features of target clothes and reference persons so that the persistent distraction can be eliminated. A posture awareness loss and a multi-level feature extractor are further proposed to handle complicated postures and synthesize high-resolution images. Extensive experiments demonstrate that our proposed RMGN outperforms both state-of-the-art PB and PF methods. Ablation studies further verify the effectiveness of the modules in RMGN.

preprint 2021 arXiv

Method and Dataset Entity Mining in Scientific Literature: A CNN + Bi-LSTM Model with Self-attention

Literature analysis helps researchers acquire a good understanding of the development of science and technology. Traditional literature analysis focuses largely on literature metadata such as topics, authors, abstracts, keywords, references, etc., and little attention has been paid to the main content of papers. In many scientific domains such as science, computing, engineering, etc., the methods and datasets involved in the scientific papers published in those domains carry important information and are quite useful for domain analysis as well as algorithm and dataset recommendation. In this paper, we propose a novel entity recognition model, called MDER, which is able to effectively extract the method and dataset entities from the main textual content of scientific papers. The model utilizes rule embedding and adopts a parallel structure of CNN and Bi-LSTM with the self-attention mechanism. We evaluate the proposed model on datasets constructed from the published papers of four research areas in computer science, i.e., NLP, CV, Data Mining and AI. The experimental results demonstrate that our model performs well in all four areas and has a good learning capacity for cross-area learning and recognition. We also conduct experiments to evaluate the effectiveness of the different building modules within our model, which indicate the importance of each module in collectively contributing to the good entity recognition performance as a whole. The data augmentation experiments on our model demonstrate that data augmentation positively contributes to model training, making our model much more robust in scenarios where only a small number of training samples are available. Finally, we apply our model to PAKDD papers published from 2009 to 2019 to mine insightful results from scientific papers published over a longer time span.

preprint 2021 arXiv

Ultra-wideband electrostrictive mechanical antenna

Conventional mechanical antennas provide a strategy for long-wave communication with a surprisingly compact size below 1/1,000 of the wavelength. However, their narrow bandwidth and weak field intensity seriously hamper practical applications. Here, we present a mechanical antenna based on the electrostrictive effect of PMN-PT-based relaxor ferroelectric ceramic to improve radiation capacity and achieve ultra-wideband characteristics (10 kHz - 1 MHz; the relative bandwidth is beyond 196%). Owing to its different underlying mechanism, the mechanical antenna based on the electrostrictive effect exhibits excellent communication properties compared with traditional mechanical antennas. The functions of signal coding, transmitting, receiving, and decoding were experimentally demonstrated. This approach offers a promising way of constructing mechanical antennas for long-wave communication.
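
The quoted figures can be checked with a short calculation. The sketch below assumes the usual fractional-bandwidth definition (band width divided by the arithmetic center frequency); the abstract does not state which definition the authors used, and the function names are illustrative.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def fractional_bandwidth(f_low_hz: float, f_high_hz: float) -> float:
    """Bandwidth divided by the arithmetic center frequency (assumed definition)."""
    if not 0.0 < f_low_hz < f_high_hz:
        raise ValueError("expected 0 < f_low_hz < f_high_hz")
    center_hz = (f_low_hz + f_high_hz) / 2.0
    return (f_high_hz - f_low_hz) / center_hz


def wavelength_m(frequency_hz: float) -> float:
    """Free-space wavelength in metres."""
    return SPEED_OF_LIGHT_M_PER_S / frequency_hz


# Band quoted in the abstract: 10 kHz to 1 MHz.
ratio = fractional_bandwidth(10e3, 1e6)
print(f"relative bandwidth: {ratio:.2%}")
print(f"wavelength at 10 kHz: {wavelength_m(10e3) / 1000:.1f} km")
```

Under this definition the band gives a ratio slightly above 1.96, consistent with the "beyond 196%" statement, and the free-space wavelength at 10 kHz is about 30 km, so 1/1,000 of a wavelength at that frequency corresponds to roughly 30 m.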

preprint 2020 arXiv

Deep Representation Learning of Patient Data from Electronic Health Records (EHR): A Systematic Review

Patient representation learning refers to learning a dense mathematical representation of a patient that encodes meaningful information from Electronic Health Records (EHRs). This is generally performed using advanced deep learning methods. This study presents a systematic review of this field and provides both qualitative and quantitative analyses from a methodological perspective. We identified studies developing patient representations from EHRs with deep learning methods from MEDLINE, EMBASE, Scopus, the Association for Computing Machinery (ACM) Digital Library, and Institute of Electrical and Electronics Engineers (IEEE) Xplore Digital Library. After screening 363 articles, 49 papers were included for a comprehensive data collection. We noticed a typical workflow starting with feeding raw data, applying deep learning models, and ending with clinical outcome predictions as evaluations of the learned representations. Specifically, learning representations from structured EHR data was dominant (37 out of 49 studies). Recurrent Neural Networks were widely applied as the deep learning architecture (LSTM: 13 studies, GRU: 11 studies). Disease prediction was the most common application and evaluation (31 studies). Benchmark datasets were mostly unavailable (28 studies) due to privacy concerns of EHR data, and code availability was assured in 20 studies. We show the importance and feasibility of learning comprehensive representations of patient EHR data through a systematic review. Advances in patient representation learning techniques will be essential for powering patient-level EHR analyses. Future work will still be devoted to leveraging the richness and potential of available EHR data. Knowledge distillation and advanced learning techniques will be exploited to assist the capability of learning patient representation further.

preprint 2020 arXiv

Direct reduction of multiloop multiscale scattering amplitudes

We propose an alternative approach based on series representation to directly reduce multi-loop multi-scale scattering amplitudes into a set of freely chosen master integrals. This approach avoids complicated calculations of inverse matrices and dimension shifts in tensor reduction. During this procedure we further utilize the Feynman parameterization to calculate the coefficients of the series representation and obtain the form factors. Conventional methodologies are used only for scalar vacuum bubble integrals to finalize the result in series representation form. Finally, we illustrate our approach by presenting the reduction of a typical two-loop amplitude for W boson production.

preprint 2020 arXiv

Improving the measurement of the Higgs boson-gluon coupling using convolutional neural networks at $e^+e^-$ colliders

In this paper we propose to use convolutional neural networks (CNNs) to improve the precision measurement of the Higgs boson-gluon effective coupling at lepton colliders. The CNN is employed to recognize the Higgs boson and a $Z$ boson associated production process, with the Higgs boson decaying to a gluon pair and the $Z$ boson decaying to a lepton pair at the center-of-mass energy 250 GeV and integrated luminosity 5 ab$^{-1}$. By using CNNs, the uncertainty of the effective coupling measurement can be decreased from $1.94\%$ to about $1.28\%$ using the PYTHIA data and from $1.82\%$ to about $1.22\%$ using the HERWIG data in the Monte Carlo simulation. Moreover, the performance of CNNs using different final state constituents shows that the energy distributions of the leading and subleading jets constituents play a major role in the identification and the optimal uncertainty of effective coupling using CNNs is reduced by about $35\%$ compared to that using conventional method.

preprint 2019 arXiv

Extended Projection Method for Massive Fermion

Tensor reduction is important for multi-loop amplitude calculations, and the projection method is one of the most popular approaches for tensor reduction. However, the projection method can be problematic for amplitudes with massive fermions due to the inconsistency between helicity and chirality. We propose an approach to extend the projection method to reduce loop amplitudes containing a fermion chain with two massive spinors. The extension is achieved by decomposing one of the massive spinors into two specific massless spinors, the "null spinor" and the "reference spinor". Then the extended projection method can be safely implemented for all processes, including the production of massive fermions. Finally, we present the tensor reduction for a virtual Z boson decaying to a top-quark pair to demonstrate our approach.

preprint 2016 arXiv

Efficient Numerical Evaluation of Feynman Integral

Feynman loop integrals are a key ingredient for the calculation of higher order radiation effects, and are responsible for reliable and accurate theoretical prediction. We improve the efficiency of numerical integration in sector decomposition by implementing a quasi-Monte Carlo method associated with the CUDA/GPU technique. For demonstration we present the results of several Feynman integrals up to two loops in both Euclidean and physical kinematic regions in comparison with those obtained from FIESTA3. It is shown that both planar and non-planar two-loop master integrals in the physical kinematic region can be evaluated in less than half a minute with $\mathcal{O}(10^{-3})$ accuracy, which makes the direct numerical approach viable for precise investigation of higher order effects in multi-loop processes, e.g. the next-to-leading order QCD effect in Higgs pair production via gluon fusion with a finite top quark mass.
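
The core idea of quasi-Monte Carlo integration, replacing pseudo-random samples with a low-discrepancy point set, can be sketched in a few lines. The example below uses a Halton sequence on the unit square as a stand-in; the abstract does not say which quasi-Monte Carlo rule the authors implemented, and neither sector decomposition nor the CUDA/GPU acceleration is reproduced here. All function names are illustrative.

```python
from typing import Callable, Sequence, Tuple


def radical_inverse(index: int, base: int) -> float:
    """Mirror the base-`base` digits of `index` about the radix point."""
    result = 0.0
    fraction = 1.0 / base
    while index > 0:
        result += (index % base) * fraction
        index //= base
        fraction /= base
    return result


def halton_point(index: int, bases: Sequence[int] = (2, 3)) -> Tuple[float, ...]:
    """Return the `index`-th Halton point, one coordinate per prime base."""
    return tuple(radical_inverse(index, base) for base in bases)


def qmc_integrate(
    func: Callable[..., float],
    n_points: int,
    bases: Sequence[int] = (2, 3),
) -> float:
    """Estimate the integral of `func` over the unit hypercube.

    The estimate is the plain average of `func` over the first
    `n_points` Halton points (index 0, the origin, is skipped).
    """
    total = sum(func(*halton_point(i, bases)) for i in range(1, n_points + 1))
    return total / n_points


# Toy integrand with a known answer: the integral of x*y over [0,1]^2 is 1/4.
print(qmc_integrate(lambda x, y: x * y, 4096))
```

For smooth integrands a low-discrepancy set of this kind typically converges faster than the inverse-square-root rate of ordinary Monte Carlo, which is the general motivation for using quasi-Monte Carlo in numerical loop integration.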

preprint 2016 arXiv

The most-luminous heavily-obscured quasars have a high merger fraction: morphological study of WISE-selected hot dust-obscured galaxies

Previous studies have shown that WISE-selected hyperluminous, hot dust-obscured galaxies (Hot DOGs) are powered by highly dust-obscured, possibly Compton-thick AGNs. High obscuration provides us a good chance to study the host morphology of the most luminous AGNs directly. We analyze the host morphology of 18 Hot DOGs at $z\sim3$ using Hubble Space Telescope/WFC3 imaging. We find that Hot DOGs have a high merger fraction ($62\pm 14 \%$). By fitting the surface brightness profiles, we find that the distribution of Sérsic indices in our Hot DOG sample peaks around 2, which suggests that most of Hot DOGs have transforming morphologies. We also derive the AGN bolometric luminosity ($\sim10^{14}L_\odot$) of our Hot DOG sample by using IR SEDs decomposition. The derived merger fraction and AGN bolometric luminosity relation is well consistent with the variability-based model prediction (Hickox et al. 2014). Both the high merger fraction in IR-luminous AGN sample and relatively low merger fraction in UV/optical-selected, unobscured AGN sample can be expected in the merger-driven evolutionary model. Finally, we conclude that Hot DOGs are merger-driven and may represent a transit phase during the evolution of massive galaxies, transforming from the dusty starburst dominated phase to the unobscured QSO phase.

preprint 2016 arXiv

Tournament selection in zeroth-level classifier systems based on average reward reinforcement learning

As a genetics-based machine learning technique, the zeroth-level classifier system (ZCS) is based on a discounted reward reinforcement learning algorithm, the bucket-brigade algorithm, which optimizes the discounted total reward received by an agent but is not suitable for all multi-step problems, especially large-size ones. There are some undiscounted reinforcement learning methods available, such as R-learning, which optimize the average reward per time step. In this paper, R-learning is used as the reinforcement learning employed by ZCS, to replace its discounted reward reinforcement learning approach, and tournament selection is used to replace roulette wheel selection in ZCS. The modification results in classifier systems that can support long action chains and are thus able to solve large multi-step problems.
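
The two selection schemes named in the abstract differ in how strongly they favor fit individuals. The sketch below contrasts generic textbook versions of both; it is not the ZCS implementation from the paper, and the function names, the fitness list, and the tournament size are illustrative assumptions. Roulette wheel selection here assumes non-negative fitness values with a positive sum.

```python
import random
from typing import Sequence


def roulette_wheel_select(fitnesses: Sequence[float], rng: random.Random) -> int:
    """Pick an index with probability proportional to its fitness."""
    total = sum(fitnesses)
    pick = rng.uniform(0.0, total)
    running = 0.0
    for index, fitness in enumerate(fitnesses):
        running += fitness
        if pick <= running:
            return index
    return len(fitnesses) - 1  # guard against floating-point round-off


def tournament_select(
    fitnesses: Sequence[float],
    tournament_size: int,
    rng: random.Random,
) -> int:
    """Draw `tournament_size` distinct contenders and return the fittest one."""
    contenders = rng.sample(range(len(fitnesses)), tournament_size)
    return max(contenders, key=lambda i: fitnesses[i])


rng = random.Random(0)
fitnesses = [1.0, 5.0, 2.0, 0.5]
print(roulette_wheel_select(fitnesses, rng))
print(tournament_select(fitnesses, 2, rng))
```

Tournament selection depends only on the ranking of fitness values within each tournament, not on their magnitudes, so its selection pressure is controlled by the tournament size rather than by the scale of the fitness values.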

preprint 2015 arXiv

Probe Higgs boson pair production via the $3 \ell 2 j$ + missing $E_T$ mode

We perform a detailed hadron-level study on the sensitivity of Higgs boson pair production via the $WW^{*}WW^{*}$ channel with the final state $3 \ell 2 j$ + missing $E_T$ at the LHC with the collision energy $\sqrt{S} = 14$ TeV and a future 100 TeV collider. To avoid the huge background from $pp \to Z W + \textrm{jets}$ processes, we restrict our analysis to the four lepton patterns: $e^\pm e^\pm \mu^\mp$ and $\mu^\pm \mu^\pm e^\mp$. We propose a partial reconstruction method to determine the most reliable combination. After that, we examine a few crucial observables which can efficiently discriminate signal from background events; in particular, we notice that the observable $m_{\rm T2}$ is very efficient. For the LHC 14 TeV collisions, with an accumulated 3000 fb$^{-1}$ dataset, we find that the sensitivity of this mode can reach up to 1.5 $\sigma$ for the Standard Model and the triple coupling of Higgs boson $\lambda_3$ in the simplest effective theory can be constrained into the range [-1, 8] at $95\%$ confidence level; at a 100 TeV collider with the integrated luminosity 3000 fb$^{-1}$, the sensitivity can reach up to 13 $\sigma$ for the Standard Model and we find that all values of $\lambda_3$ in the effective theory can be covered up to 3$\sigma$ even without optimising signals. To precisely measure the triple coupling of Higgs boson $\lambda_3=1$ of the Standard Model at a 100 TeV collider, by using the invariant mass of three leptons, which is robust against the contamination of underlying events and pileup effects, and by performing a $\chi^2$ analysis, we find that it can be determined into a range [0.8, 1.5] at $95\%$ confidence level.

preprint 2014 arXiv

The CT10 NNLO Global Analysis of QCD

We present next-to-next-to-leading order (NNLO) parton distribution functions (PDFs) from the CTEQ-TEA group. The CT10NNLO PDF fit is based on essentially the same global data sets used in the CT10 and CT10W NLO PDF analyses. After exploring the goodness of the fits to the HERA combined data and the Tevatron jet data, we present various predictions at NNLO accuracy for both existing and forthcoming precision measurements from the CERN Large Hadron Collider. The range of variations in the gluon distribution introduced by correlated systematic effects in inclusive jet production is also examined.

preprint 2013 arXiv

Discriminating Higgs production mechanisms using jet energy profiles

We present a new tool for precision measurements of the Higgs boson production mechanisms at the LHC. We study events with a Higgs boson produced with two forward jets. Even with fairly stringent cuts, one expects a significant contamination of gluon fusion (GF) in addition to vector-boson fusion (VBF) in the event sample. By measuring the jet energy profile of the most central jet, we find that SM production can be distinguished from either pure VBF or pure GF at the $5σ$ level with 100 fb$^{-1}$ of luminosity at the 14 TeV LHC. Moreover, this discrimination technique can be used to validate or rule out new physics models that predict similar observable branching fractions as the 125 GeV SM Higgs but have different production mechanisms.

preprint 2013 arXiv

QCD resummation for light-particle jets

We construct an evolution equation for the invariant-mass distribution of light-quark and gluon jets in the framework of QCD resummation. The solution of the evolution equation exhibits a behavior consistent with Tevatron CDF data: the jet distribution vanishes in the small invariant-mass limit, and its peak moves toward the high invariant-mass region with the jet energy. We also construct an evolution equation for the energy profile of the light-quark and gluon jets in the similar framework. The solution shows that the energy accumulates faster within a light-quark jet cone than within a gluon jet cone. The jet energy profile convoluted with hard scattering and parton distribution functions matches well with the Tevatron CDF and the large-hadron-collider (LHC) CMS data. Moreover, comparison with the CDF and CMS data implies that jets with large (small) transverse momentum are mainly composed of the light-quark (gluon) jets. At last, we discuss the application of the above solutions for the light-particle jets to the identification of highly-boosted heavy particles produced at LHC.

preprint 2012 arXiv

Discovery and Identification of W' and Z' in SU(2) x SU(2) x U(1) Models at the LHC

We explore the discovery potential of W' and Z' boson searches for various SU(2) x SU(2) x U(1) models at the Large Hadron Collider (LHC), after taking into account the constraints from low energy precision measurements and direct searches at both the Tevatron (1.96 TeV) and the LHC (7 TeV). In such models, the W' and Z' bosons emerge after the electroweak symmetry is spontaneously broken. Two patterns of the symmetry breaking are considered in this work: one is SU(2)_L x SU(2)_2 x U(1)_X to SU(2)_L x U(1)_Y (BP-I), another is SU(2)_1 x SU(2)_2 x U(1)_Y to SU(2)_L x U(1)_Y (BP-II). Examining the single production channel of W' and Z' with their subsequent leptonic decays, we find that the probability of detecting W' and Z' bosons in the considered models at the LHC (with 14 TeV) is highly limited by the low energy precision data constraints. We show that observing Z' alone, without seeing a W', does not rule out new physics models with non-Abelian gauge extension, such as the phobic models in BP-I. Models in BP-II would predict the discovery of degenerate W' and Z' bosons at the LHC.

preprint 2012 arXiv

Improved resummation prediction on Higgs boson production at hadron colliders

We improve the resummation calculations in the ResBos program for Higgs boson production via gluon-gluon fusion by including the NNLO Wilson coefficient functions and G-functions. The improvement increases the total cross section predictions of the new ResBos program, dubbed ResBos2, for Higgs boson production by about 8% and 6% at the Tevatron and the LHC, respectively, compared with the old ResBos program. Furthermore, the improved predictions are compared with those from the programs HNNLO and HqT2. We find that they agree well for the total cross sections but differ slightly for the transverse momentum $Q_T$ distributions. With ResBos2, we present the distributions of the two variables $a_T$ and $ϕ^{*}$, which can have better experimental resolutions than $Q_T$, for the process of a Higgs boson decaying into a photon pair. Theoretical uncertainties of the ResBos2 predictions are also discussed.

preprint 2012 arXiv

One-way information reconciliation schemes of quantum key distribution

Information reconciliation (IR) is a basic step of quantum key distribution (QKD). Classical message interaction is necessary in a practical IR scheme, and the communication complexity has become a bottleneck of QKD's development. Here we propose a concatenated IR scheme that requires only a single one-way communication to achieve any given error rate level. A QKD scheme with the concatenated IR can work without a separate interaction for error rate estimation.

preprint 2012 arXiv

Progress in CTEQ-TEA PDF analysis

Recent developments in the CTEQ-TEA global QCD analysis are presented. The parton distribution functions CT10-NNLO are described, constructed by comparing data from many experiments to NNLO approximations of QCD.

preprint 2012 arXiv

The Cryogenic System for the Panda-X Dark Matter Search Experiment

Panda-X is a liquid xenon dual-phase detector for the Dark Matter Search. The first modestly-sized module will soon be installed in the China JinPing Deep Underground Laboratory in Sichuan province, P.R. China. The cryogenics system is designed to handle much larger detectors, even the final version in the ton scale. Special attention has been paid to the reliability, serviceability, and adaptability to the requirements of a growing experiment. The system is cooled by a single Iwatani PC150 Pulse Tube Refrigerator. After subtracting all thermal losses, the remaining cooling power is still 82 W. The fill speed was 9 SLPM, but could be boosted by LN2-assisted cooling to 40 SLPM. For the continuous recirculation and purification through a hot getter, a heat exchanger was employed to reduce the required cooling power. The recirculation speed is limited to 35 SLPM by the gas pump. At this speed, recirculation only adds 18.5 W to the heat load of the system, corresponding to a 95.2% efficiency of the heat exchanger.

preprint 2011 arXiv

Improved Predictions for Higgs Q_T at the Tevatron and the LHC

The search for the Higgs boson at the Tevatron and the LHC relies on detailed calculations of the kinematics of Higgs boson production and decay. In this paper, we improve the calculation of the distribution in transverse momentum, $Q_T$, of the Higgs boson in the gluon fusion production process, $gg\to H$, by matching the resummed distribution at small $Q_T$ with the ${\cal O}(α_{s}^4)$ fixed-order perturbative calculation at high $Q_T$ in the ResBos Monte Carlo program. The distribution is higher at large $Q_T$ than with the old ${\cal O}(α_{s}^3)$ fixed-order perturbative calculation, and the matching with the resummed calculation is much smoother. The total cross section is also increased, more in line with next-to-next-to-leading-order calculations. We also study the effect of the new calculation on the distribution of $Δϕ_{\ell\ell}$ in the overall process $gg\to H\to W^{+} W^{-}\to\ell^{+}\ell^{-}ν\bar{ν}$, and the effect of PDF uncertainties on the distributions at the Tevatron and the LHC.

preprint 2011 arXiv

QCD resummation for jet substructures

We provide a novel development in jet physics by predicting the energy profiles of light-quark and gluon jets in the framework of perturbative QCD. Resumming large logarithmic contributions to all orders in the coupling constant, our predictions are shown to agree well with Tevatron CDF and Large-Hadron-Collider CMS data. We also extend our resummation formalism to the invariant mass distributions of light-quark and gluon jets produced in hadron collisions. The predicted peak positions and heights in jet mass distributions are consistent with CDF data within uncertainties induced by parton distribution functions.

preprint 2010 arXiv

New parton distributions for collider physics

We extract new parton distribution functions (PDFs) of the proton by global analysis of hard scattering data in the general-mass framework of perturbative quantum chromodynamics. Our analysis includes new theoretical developments together with the most recent collider data from deep-inelastic scattering, vector boson production, and single-inclusive jet production. Due to the difficulty in fitting both the D0 Run-II W lepton asymmetry data and some fixed-target DIS data, we present two families of PDFs, CT10 and CT10W, without and with these high-luminosity W lepton asymmetry data included in the global analysis. With both sets of PDFs, we study theoretical predictions and uncertainties for a diverse selection of processes at the Fermilab Tevatron and the CERN Large Hadron Collider.

preprint 2010 arXiv

Uncertainty induced by QCD coupling in the CTEQ global analysis of parton distributions

We examine the dependence of parton distribution functions (PDFs) on the value of the QCD coupling strength $α_{s}(M_{Z})$. We explain a simple method that is rigorously valid in the quadratic approximation normally applied in PDF fitting, and fully reproduces the correlated dependence of theoretical cross sections on $α_s$ and PDF parameters. This method is based on a statistical relation that allows one to add the uncertainty produced by $α_s$, computed with some special PDF sets, in quadrature with the PDF uncertainty obtained for the fixed $α_s$ value (such as the CTEQ6.6 PDF set). A series of four CTEQ6.6AS PDFs realizing this approach, for $α_s$ values in the interval $0.116 \leq α_{s}(M_{Z}) \leq 0.120$, is presented. Using these PDFs, the combined $α_{s}$ and PDF uncertainty is assessed for theoretical predictions at the Fermilab Tevatron and Large Hadron Collider.

preprint 2009 arXiv

A Dark Matter Model with Non-Abelian Gauge Symmetry

We propose a dark matter model in which the dark sector is gauged under a new SU(2) group. The dark sector consists of SU(2) dark gauge fields, two triplet dark Higgs fields, and two dark fermion doublets (dark matter candidates in this model). The dark sector interacts with the SM sector through kinetic and mass mixing operators. The model explains both PAMELA and Fermi LAT data very well and also satisfies constraints from both the DM relic density and Standard Model precision observables. The phenomenology of the model at the LHC is also explored.

preprint 2009 arXiv

Threshold Resummation Effects in Neutral Higgs Boson Production by Bottom Quark Fusion at the CERN Large Hadron Collider

We investigate the QCD effects in the production of neutral Higgs bosons via bottom quark fusion in both the standard model and the minimal supersymmetric standard model at the CERN Large Hadron Collider. We include the next-to-leading order (NLO) QCD corrections (including supersymmetric QCD) and the threshold resummation effects. We use the soft-collinear effective theory to resum the large logarithms near threshold from soft gluon emission. Our results show that the resummation effects can enhance the total cross sections by about 5% compared with the NLO results.

35 published items

Multi-domain Multi-modal Document Classification Benchmark with a Multi-level Taxonomy

Gravitational Lensing of Gravitational Waves: Spin-wave Optics through Black Hole Scattering

GraphTheta: A Distributed Graph Neural Network Learning System With Flexible Training Strategy

A Comprehensive Survey on Deep Clustering: Taxonomy, Challenges, and Future Directions

Community Trend Prediction on Heterogeneous Graph in E-commerce

Defending Against Backdoor Attack on Graph Nerual Network by Explainability

GIFT: Graph-guIded Feature Transfer for Cold-Start Video Click-Through Rate Prediction

One-loop squared amplitudes for hadronic $tW$ production at next-to-next-to-leading order in QCD

One-off Negative Sequential Pattern Mining

Re-weighting Negative Samples for Model-Agnostic Matching

RMGN: A Regional Mask Guided Network for Parser-free Virtual Try-on

Method and Dataset Entity Mining in Scientific Literature: A CNN + Bi-LSTM Model with Self-attention

Ultra-wideband electrostrictive mechanical antenna

Deep Representation Learning of Patient Data from Electronic Health Records (EHR): A Systematic Review

Direct reduction of multiloop multiscale scattering amplitudes

Improving the measurement of the Higgs boson-gluon coupling using convolutional neural networks at $e^+e^-$ colliders

Extended Projection Method for Massive Fermion

Efficient Numerical Evaluation of Feynman Integral

The most-luminous heavily-obscured quasars have a high merger fraction: morphological study of WISE-selected hot dust-obscured galaxies

Tournament selection in zeroth-level classifier systems based on average reward reinforcement learning

Probe Higgs boson pair production via the $3 \ell 2 j$ + missing $E_T$ mode

The CT10 NNLO Global Analysis of QCD

Discriminating Higgs production mechanisms using jet energy profiles

QCD resummation for light-particle jets

Discovery and Identification of W' and Z' in SU(2) x SU(2) x U(1) Models at the LHC

Improved resummation prediction on Higgs boson production at hadron colliders

One-way information reconciliation schemes of quantum key distribution

Progress in CTEQ-TEA PDF analysis

The Cryogenic System for the Panda-X Dark Matter Search Experiment

Improved Predictions for Higgs Q_T at the Tevatron and the LHC

QCD resummation for jet substructures

New parton distributions for collider physics

Uncertainty induced by QCD coupling in the CTEQ global analysis of parton distributions

A Dark Matter Model with Non-Abelian Gauge Symmetry

Threshold Resummation Effects in Neutral Higgs Boson Production by Bottom Quark Fusion at the CERN Large Hadron Collider