Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
42 works
0 followers
21 topics
4 close collaborators

Published work

42 published item(s)

preprint, 2026, arXiv

How Far Is Document Parsing from Solved? PureDocBench: A Source-Traceable Benchmark across Clean, Degraded, and Real-World Settings

The past year has seen over 20 open-source document parsing models, yet the field still benchmarks almost exclusively on OmniDocBench, a 1,355-page manually annotated dataset whose top scores have saturated above 90%. A three-stage audit pipeline we run on OmniDocBench screens its 21,353 evaluator-scored blocks and confirms 2,580 errors (12.08%); combined with over a year of public availability, both annotation quality and contamination risk call its rankings into question. To address these issues, we present PureDocBench, a programmatically generated, source-traceable benchmark that renders document images from HTML/CSS and produces verifiable annotations from the same source, covering 10 domains, 66 subcategories, and 1,475 pages, each in three versions: clean, digitally degraded, and real-degraded (4,425 images total). Evaluating 40 models spanning pipeline specialists, end-to-end specialists, and general-purpose VLMs, we find: (i) document parsing is far from solved: the best model scores only ~74 out of 100, with a 44.6-point gap between the strongest and weakest models; (ii) specialist parsers with <=4B parameters rival or surpass general VLMs that are 5-100x larger, yet formula recognition remains a shared bottleneck where no model exceeds 67% when averaging the formula metric across all three tracks; (iii) general VLMs lose only 0.99/8.52 Overall points under digital/real degradation versus 4.90/14.21 for pipeline specialists, producing ranking reversals that make clean-only evaluation misleading for deployment. All data, code, and artifacts are publicly released.

preprint, 2024, arXiv

Automated calculation of Jet fragmentation at NLO in QCD

We present FMNLO, a framework to combine general-purpose Monte Carlo generators and fragmentation functions (FFs). It is based on a hybrid scheme of the phase-space slicing method and the local subtraction method, and is accurate to next-to-leading order (NLO) in QCD. The new framework has been interfaced to MG5_aMC@NLO and made publicly available in this work. We demonstrate its unique ability by giving theoretical predictions of various fragmentation measurements at the LHC, followed by comparison with the data. With the help of interpolation techniques, FMNLO allows for fast calculation of fragmentation processes for a large number of different FFs, which makes it a promising tool for future fits of FFs. As an example, we perform an NLO fit of parton fragmentation functions to unidentified charged hadrons using measurements at the LHC. We find that the ATLAS data from inclusive dijet production show a strong constraining power. Notable disparities are found between our gluon FF and those of BKK, DSS and NNFF, indicating the need for additional constraints and data on the gluon fragmentation function.

preprint, 2023, arXiv

Many Hamiltonian subsets in large graphs with given density

A set of vertices in a graph is a Hamiltonian subset if it induces a subgraph containing a Hamiltonian cycle. Kim, Liu, Sharifzadeh and Staden proved that among all graphs with minimum degree $d$, $K_{d+1}$ minimises the number of Hamiltonian subsets. We prove a near optimal lower bound that also takes the order and the structure of a graph into account. For many natural graph classes, it provides a much better bound than the extremal one ($\approx 2^{d+1}$). Among others, our bound implies that an $n$-vertex $C_4$-free graph with minimum degree $d$ contains at least $n2^{d^{2-o(1)}}$ Hamiltonian subsets.

preprint, 2023, arXiv

Mask-then-Fill: A Flexible and Effective Data Augmentation Framework for Event Extraction

We present Mask-then-Fill, a flexible and effective data augmentation framework for event extraction. Our approach allows for more flexible manipulation of text and thus can generate more diverse data while keeping the original event structure unchanged as much as possible. Specifically, it first randomly masks out an adjunct sentence fragment and then infills a variable-length text span with a fine-tuned infilling model. The main advantage lies in that it can replace a fragment of arbitrary length in the text with another fragment of variable length, compared to the existing methods which can only replace a single word or a fixed-length fragment. On trigger and argument extraction tasks, the proposed framework is more effective than baseline methods and it demonstrates particularly strong results in the low-resource setting. Our further analysis shows that it achieves a good balance between diversity and distributional similarity.

preprint, 2022, arXiv

Decay of the charged Higgs boson and the top quark in two-Higgs-doublet model at NNLO in QCD

We present numerical calculations of the partial width of the charged Higgs boson decay into a top quark, $H^- \rightarrow \bar{t} + b + X$, and the partial width of the top quark decay into a light charged Higgs boson, $t \rightarrow H^+ + b + X$, at next-to-next-to-leading order (NNLO) in QCD, based on a factorization formula of the jet mass. The NNLO corrections significantly reduce the renormalization scale dependence of the partial decay width in both cases. We show the relative size of the NNLO corrections for different charged Higgs boson masses and for different renormalization scales. The NNLO corrections are about 16% (1%) of the leading order widths for a charged Higgs boson mass of 200 GeV (2000 GeV), while they are quite small for the top quark decay. Our analyses are independent of the detailed structure of the Yukawa couplings, and can be applied to various new physics models, as demonstrated by the decay branching ratio in different types of the two-Higgs-doublet models.

preprint, 2022, arXiv

Empathetic Response Generation with State Management

A good empathetic dialogue system should first track and understand a user's emotion and then reply with an appropriate emotion. However, current approaches to this task either focus on improving the understanding of users' emotion or on proposing better responding strategies, and very few works consider both at the same time. Our work attempts to fill this vacancy. Inspired by task-oriented dialogue systems, we propose a novel empathetic response generation model with emotion-aware dialogue management. The emotion-aware dialogue management contains two parts: (1) Emotion state tracking maintains the current emotion state of the user and (2) Empathetic dialogue policy selection predicts a target emotion and a user's intent based on the results of the emotion state tracking. The predicted information is then used to guide the generation of responses. Experimental results show that dynamically managing different information can help the model generate more empathetic responses compared with several baselines under both automatic and human evaluations.

preprint, 2022, arXiv

Fine Detailed Texture Learning for 3D Meshes with Generative Models

This paper presents a method to reconstruct high-quality textured 3D models from both multi-view and single-view images. The reconstruction is posed as an adaptation problem and is done progressively: in the first stage, we focus on learning accurate geometry, whereas in the second stage, we focus on learning the texture with a generative adversarial network. In the generative learning pipeline, we propose two improvements. First, since the learned textures should be spatially aligned, we propose an attention mechanism that relies on the learnable positions of pixels. Second, since the discriminator receives aligned texture maps, we augment its input with a learnable embedding, which improves the feedback to the generator. We achieve significant improvements on multi-view sequences from the Tripod dataset as well as on the single-view image datasets Pascal 3D+ and CUB. We demonstrate that our method achieves superior 3D textured models compared to previous works. Please visit our web-page for 3D visuals.

preprint, 2022, arXiv

General heavy-flavor mass scheme for charged-current DIS at NNLO and beyond

Incompleteness in current knowledge of neutrino interactions with nuclear matter imposes a primary limitation in searches for leptonic CP violation carried out at long-baseline neutrino experiments. In this paper, we present a new computation that elevates the theoretical accuracy to next-to-next-to-leading order (NNLO) in QCD for charged-current deeply-inelastic scattering (DIS) processes relevant for ongoing and future neutrino programs. Mass-dependent quark contributions are consistently included across a wide range of momentum transfers in the SACOT-$χ$ general-mass scheme. When appropriate, we further include N$^3$LO corrections in the zero-mass scheme. We show theoretical predictions for several experiments with neutrinos over a wide range of energies and at the upcoming Electron-Ion Collider. Our prediction reduces perturbative uncertainties to $\sim\!1\%$, sufficient for the high-precision objectives of future charged-current DIS measurements, and provides important theoretical inputs to experimental studies of leptonic mixing and CP violations.

preprint, 2022, arXiv

Improving Event Representation via Simultaneous Weakly Supervised Contrastive Learning and Clustering

Representations of events described in text are important for various tasks. In this work, we present SWCC: a Simultaneous Weakly supervised Contrastive learning and Clustering framework for event representation learning. SWCC learns event representations by making better use of co-occurrence information of events. Specifically, we introduce a weakly supervised contrastive learning method that allows us to consider multiple positives and multiple negatives, and a prototype-based clustering method that avoids semantically related events being pulled apart. For model training, SWCC learns representations by simultaneously performing weakly supervised contrastive learning and prototype-based clustering. Experimental results show that SWCC outperforms other baselines on Hard Similarity and Transitive Sentence Similarity tasks. In addition, a thorough analysis of the prototype-based clustering method demonstrates that the learned prototype vectors are able to implicitly capture various relations between events.

preprint, 2022, arXiv

Improving Semantic Segmentation in Transformers using Hierarchical Inter-Level Attention

Existing transformer-based image backbones typically propagate feature information in one direction, from lower to higher levels. This may not be ideal, since the localization ability to delineate accurate object boundaries is most prominent in the lower, high-resolution feature maps, while the semantics that can disambiguate image signals belonging to one object vs. another typically emerge at a higher level of processing. We present Hierarchical Inter-Level Attention (HILA), an attention-based method that captures Bottom-Up and Top-Down Updates between features of different levels. HILA extends hierarchical vision transformer architectures by adding local connections between features of higher and lower levels to the backbone encoder. In each iteration, we construct a hierarchy by having higher-level features compete for assignments to update lower-level features belonging to them, iteratively resolving object-part relationships. These improved lower-level features are then used to re-update the higher-level features. HILA can be integrated into the majority of hierarchical architectures without requiring any changes to the base model. We add HILA into SegFormer and the Swin Transformer and show notable improvements in accuracy in semantic segmentation with fewer parameters and FLOPS. Project website and code: https://www.cs.toronto.edu/~garyleung/hila/

preprint, 2022, arXiv

Interpretable Proof Generation via Iterative Backward Reasoning

We present IBR, an Iterative Backward Reasoning model to solve the proof generation tasks on rule-based Question Answering (QA), where models are required to reason over a series of textual rules and facts to find the related proof path and derive the final answer. We address the limitations of existing works in two ways: 1) we enhance the interpretability of reasoning procedures with detailed tracking, by predicting nodes and edges in the proof path iteratively backward from the question; 2) we promote efficiency and accuracy by reasoning on the elaborate representations of nodes and history paths, without any intermediate texts that may introduce external noise during proof generation. There are three main modules in IBR: QA and proof strategy prediction, to obtain the answer and offer guidance for the following procedure; parent node prediction, to determine a node in the existing proof that a new child node will link to; and child node prediction, to find out which new node will be added to the proof. Experiments on both synthetic and paraphrased datasets demonstrate that IBR has better in-domain performance as well as cross-domain transferability than several strong baselines. Our code and models are available at https://github.com/find-knowledge/IBR .

preprint, 2022, arXiv

JuCify: A Step Towards Android Code Unification for Enhanced Static Analysis

Native code is now commonplace within Android app packages where it co-exists and interacts with Dex bytecode through the Java Native Interface to deliver rich app functionalities. Yet, state-of-the-art static analysis approaches have mostly overlooked the presence of such native code, which, however, may implement some key sensitive, or even malicious, parts of the app behavior. This limitation of the state of the art is a severe threat to validity in a large range of static analyses that do not have a complete view of the executable code in apps. To address this issue, we propose a new advance in the ambitious research direction of building a unified model of all code in Android apps. The JuCify approach presented in this paper is a significant step towards such a model, where we extract and merge call graphs of native code and bytecode to make the final model readily-usable by a common Android analysis framework: in our implementation, JuCify builds on the Soot internal intermediate representation. We performed empirical investigations to highlight how, without the unified model, a significant amount of Java methods called from the native code are "unreachable" in apps' call-graphs, both in goodware and malware. Using JuCify, we were able to enable static analyzers to reveal cases where malware relied on native code to hide invocation of payment library code or of other sensitive code in the Android framework. Additionally, JuCify's model enables state-of-the-art tools to achieve better precision and recall in detecting data leaks through native code. Finally, we show that by using JuCify we can find sensitive data leaks that pass through native code.

preprint, 2022, arXiv

Machine learning of log-likelihood functions in global analysis of parton distributions

Modern analysis of parton distribution functions (PDFs) requires calculations of the log-likelihood functions from thousands of experimental data points, and scans of multi-dimensional parameter space with tens of degrees of freedom. In conventional analysis the Hessian approximation has been widely used for the estimation of the PDF uncertainties. The Lagrange Multiplier (LM) scan, while being a more faithful method, is less used due to computational limitations, and is the main focus of this study. We propose to use Neural Networks (NNs) and machine learning techniques to model the profile of the log-likelihood functions or cross sections in multi-dimensional parameter space in order to overcome those limitations; the approach works beyond the quadratic approximation while ensuring efficient scans of the full parameter space. We demonstrate the efficiency of the new approach in the framework of the CT18 global analysis of PDFs by constructing NNs for various target functions, and performing LM scans on PDFs and cross sections at hadron colliders. We further study the impact of the NOMAD dimuon data on constraining PDFs with the new approach, and find enhanced strange-quark distributions and reduced PDF uncertainties. Moreover, we show how the approach can be used to constrain new physics beyond the Standard Model (BSM) by a joint fit of both PDFs and Wilson coefficients of operators in the SM effective field theory.

preprint, 2022, arXiv

Model Degradation Hinders Deep Graph Neural Networks

Graph Neural Networks (GNNs) have achieved great success in various graph mining tasks. However, drastic performance degradation is always observed when a GNN is stacked with many layers. As a result, most GNNs only have shallow architectures, which limits their expressive power and exploitation of deep neighborhoods. Most recent studies attribute the performance degradation of deep GNNs to the over-smoothing issue. In this paper, we disentangle the conventional graph convolution operation into two independent operations: Propagation (P) and Transformation (T). Following this, the depth of a GNN can be split into the propagation depth ($D_p$) and the transformation depth ($D_t$). Through extensive experiments, we find that the major cause for the performance degradation of deep GNNs is the model degradation issue caused by large $D_t$ rather than the over-smoothing issue mainly caused by large $D_p$. Further, we present Adaptive Initial Residual (AIR), a plug-and-play module compatible with all kinds of GNN architectures, to alleviate the model degradation issue and the over-smoothing issue simultaneously. Experimental results on six real-world datasets demonstrate that GNNs equipped with AIR outperform most GNNs with shallow architectures owing to the benefits of both large $D_p$ and $D_t$, while the time costs associated with AIR can be ignored.

preprint, 2022, arXiv

NNLO constraints on proton PDFs from the SeaQuest and STAR experiments and other developments in the CTEQ-TEA global analysis

We review progress in the global QCD analysis by the CTEQ-TEA group since the publication of CT18 parton distribution functions (PDFs) in the proton. Specifically, we discuss comparisons of CT18 NNLO predictions with the LHC 13 TeV measurements as well as with the FNAL SeaQuest and BNL STAR data on lepton pair production. The specialized CT18X PDFs approximating saturation effects are compared with the CT18sx PDFs obtained using NLL/NLO small-$x$ resummation. Short summaries are presented for the special CT18 parton distributions with fitted charm and with lattice QCD inputs. A recent comparative analysis of the impact of deuteron nuclear effects on the parton distributions by the CTEQ-JLab and CTEQ-TEA groups is summarized.

preprint, 2022, arXiv

REAM$\sharp$: An Enhancement Approach to Reference-based Evaluation Metrics for Open-domain Dialog Generation

The lack of reliable automatic evaluation metrics is a major impediment to the development of open-domain dialogue systems. Various reference-based metrics have been proposed to calculate a score between a predicted response and a small set of references. However, these metrics show unsatisfactory correlations with human judgments. For a reference-based metric, its reliability mainly depends on two factors: its ability to measure the similarity between the predicted response and the reference response, as well as the reliability of the given reference set. Yet, there are few discussions on the latter. Our work attempts to fill this vacancy. We first clarify an assumption on reference-based metrics that, if more high-quality references are added into the reference set, the reliability of the metric will increase. Next, we present REAM$\sharp$: an enhancement approach to Reference-based EvAluation Metrics for open-domain dialogue systems. A prediction model is designed to estimate the reliability of the given reference set. We show how its predicted results can be helpful to augment the reference set, and thus improve the reliability of the metric. Experiments validate both the effectiveness of our prediction model and that the reliability of reference-based metrics improves with the augmented reference sets.

preprint, 2022, arXiv

Self-Supervised Light Field Depth Estimation Using Epipolar Plane Images

Exploiting light field data makes it possible to obtain dense and accurate depth maps. However, synthetic scenes with a limited disparity range cannot capture the diversity of real scenes. Because they are trained on synthetic data, current learning-based methods do not perform well in real scenes. In this paper, we propose a self-supervised learning framework for light field depth estimation. Unlike existing end-to-end training methods that use a disparity label per pixel, our approach implements network training by estimating the EPI disparity shift after refocusing, which extends the disparity range of epipolar lines. To reduce the sensitivity of EPI to noise, we propose a new input mode called EPI-Stack, which stacks EPIs in the view dimension. This method is less sensitive to noisy scenes than the traditional input mode and improves the efficiency of estimation. Compared with other state-of-the-art methods, the proposed method can also obtain higher quality results in real-world scenarios, especially in regions of complex occlusion and depth discontinuity.

preprint, 2022, arXiv

Thrust distribution in Higgs decays at the next-to-leading order and beyond

We present predictions for the thrust distribution in hadronic decays of the Higgs boson at the next-to-leading order and the approximate next-to-next-to-leading order. The approximate NNLO corrections are derived from a factorization formula in the soft/collinear phase-space regions. We find large corrections, especially for the gluon channel. The scale variations at the lowest orders tend to underestimate the genuine higher order contributions. The results of this paper are therefore necessary to control the perturbative uncertainties of the theoretical predictions. We also discuss possible improvements to our results, such as a soft-gluon resummation for the 2-jets limit, and an exact next-to-next-to-leading order calculation for the multi-jets region.

preprint, 2022, arXiv

Understanding PDF uncertainty on the $W$ boson mass measurements in CT18 global analysis

We study the dependence of the transverse mass distribution of the charged lepton and the missing energies on the parton distributions (PDFs) adapted to the $W$ boson mass measurements at the CDF and ATLAS experiments. We compare the shape variations of the distribution induced by different PDFs and find that the spread of predictions from different PDF sets can be much larger than the PDF uncertainty predicted by a specific PDF set. We suggest analyzing the experimental data using up-to-date PDFs for a better understanding of the PDF uncertainties in the $W$ boson mass measurements. We further carry out a series of Lagrange multiplier scans to identify the constraints on the transverse mass distribution imposed by individual data sets in the CT18 global analysis. In the case of the CDF measurement, the distribution is mostly sensitive to the $d$-quark PDFs in the intermediate $x$ region, which is largely constrained by the DIS and Drell-Yan data on the deuteron target, as well as the Tevatron lepton charge asymmetry data.

preprint, 2021, arXiv

A unified proof of conjectures on cycle lengths in graphs

In this paper, we prove a tight minimum degree condition in general graphs for the existence of paths between two given endpoints, whose lengths form a long arithmetic progression with common difference one or two. This allows us to obtain a number of exact and optimal results on cycle lengths in graphs of given minimum degree, connectivity or chromatic number. More precisely, we prove the following statements by a unified approach. (1) Every graph $G$ with minimum degree at least $k+1$ contains cycles of all even lengths modulo $k$; in addition, if $G$ is 2-connected and non-bipartite, then it contains cycles of all lengths modulo $k$. (2) For all $k\geq 3$, every $k$-connected graph contains a cycle of length zero modulo $k$. (3) Every 3-connected non-bipartite graph with minimum degree at least $k+1$ contains $k$ cycles of consecutive lengths. (4) Every graph with chromatic number at least $k+2$ contains $k$ cycles of consecutive lengths. The first statement is a conjecture of Thomassen, the second is a conjecture of Dean, the third is a tight answer to a question of Bondy and Vince, and the fourth is a conjecture of Sudakov and Verstraëte. All of the above results are best possible.

preprint, 2021, arXiv

Constraints on neutrino non-standard interactions from LHC data with large missing transverse momentum

The possible non-standard interactions (NSIs) of neutrinos with matter play an important role in the global determination of neutrino properties. In our study we select various data sets from LHC measurements at 13 TeV with integrated luminosities of $35 \sim 139$ fb$^{-1}$, including production of a single jet, photon, $W/Z$ boson, or charged lepton accompanied by large missing transverse momentum. We derive constraints on neutral-current NSIs with quarks imposed by different data sets in a framework of either effective operators or simplified $Z'$ models. We use theoretical predictions of productions induced by NSIs at next-to-leading order in QCD matched with parton showering, which stabilize the theory predictions and result in more robust constraints. In a simplified $Z'$ model we obtain a 95% CLs upper limit on the conventional NSI strength $ε$ of 0.042 and 0.0028 for a $Z'$ mass of 0.2 and 2 TeV respectively. We also discuss possible improvements from future runs of the LHC with higher luminosities.

preprint, 2021, arXiv

Energy-energy correlation in hadronic Higgs decays: analytic results and phenomenology at NLO

In this work we complete the investigation of the recently introduced energy-energy correlation (EEC) function in hadronic Higgs decays at next-to-leading order (NLO) in fixed-order perturbation theory in the limit of vanishing light quark masses. The full analytic NLO result for the previously unknown EEC in the $H \to q \bar{q} + X$ channel is given in terms of classical polylogarithms and cross-checked against a numerical calculation. In addition to that, we discuss further corrections to predictions of the Higgs EEC event shape variable, including quark mass corrections, effects of parton shower and hadronization. We also estimate the statistical error on the measurements of the Higgs EEC at future Higgs factories and compare with the current perturbative uncertainty.

preprint, 2021, arXiv

Minimizing the number of edges in $\mathcal{C}_{\ge r}$-saturated graphs

Given a family of graphs $\mathcal{F}$, a graph $G$ is said to be $\mathcal{F}$-saturated if $G$ does not contain a copy of $F$ as a subgraph for any $F\in\mathcal{F}$ but the addition of any edge $e\notin E(G)$ creates at least one copy of some $F\in\mathcal{F}$ within $G$. The minimum size of an $\mathcal{F}$-saturated graph on $n$ vertices is called the saturation number, denoted by $\mathrm{sat}(n, \mathcal{F})$. Let $\mathcal{C}_{\ge r}$ be the family of cycles of length at least $r$. Ferrara et al. (2012) gave lower and upper bounds of $\mathrm{sat}(n, \mathcal{C}_{\ge r})$ and determined the exact values of $\mathrm{sat}(n, \mathcal{C}_{\ge r})$ for $3\le r\le 5$. In this paper, we determine the exact value of $\mathrm{sat}(n,\mathcal{C}_{\ge r})$ for $r=6$ and $28\le \frac{n}2\le r\le n$ and give new upper and lower bounds for the other cases.

preprint, 2020, arXiv

128 Identical Quantum Sources Integrated on a Single Silica Chip

Quantum technology is playing an increasingly important role due to the intrinsic parallel processing capabilities endorsed by quantum superposition, exceeding upper limits of classical performances in diverse fields. Integrated photonic chips offer an elegant way to construct large-scale quantum systems in a physically scalable fashion; however, nonuniformity of quantum sources prevents all the elements from being connected coherently for an exponentially increasing Hilbert space. Here, we experimentally demonstrate 128 identical quantum sources integrated on a single silica chip. By actively controlling the light-matter interaction in femtosecond laser direct writing, we are able to unify the properties of waveguides comprehensively and therefore the spontaneous four-wave mixing process for quantum sources. We verify the indistinguishability of the on-chip sources by a series of heralded two-source Hong-Ou-Mandel interference measurements, with all the dip visibilities above 90%. In addition, the brightness of the sources is found to easily reach MHz and to be applicable to both discrete-variable and continuous-variable platforms, showing either a clear anti-bunching feature or a large squeezing parameter under different pumping regimes. The demonstrated scalability and uniformity of quantum sources, together with integrated photonic network and detection, will enable large-scale all-on-chip quantum processors for real-life applications.

preprint, 2020, arXiv

A Scalable Photonic Computer Solving the Subset Sum Problem

The subset sum problem is a typical NP-complete problem that is hard to solve efficiently in time due to the intrinsic superpolynomial-scaling property. Increasing the problem size results in vast time consumption on conventionally available computers. Photons possess the unique features of extremely high propagation speed, weak interaction with the environment and low detectable energy level, and can therefore be a promising candidate to meet the challenge through the construction of a photonic computer. However, most optical computing schemes, like Fourier transformation, require very high operation precision and are hard to scale up. Here, we present a chip built-in photonic computer to efficiently solve the subset sum problem. We successfully map the problem into a waveguide network in three dimensions by using the femtosecond laser direct writing technique. We show that the photons are able to sufficiently dissipate into the networks and search all the possible paths for solutions in parallel. In the case of successive primes the proposed approach exhibits a dominant superiority in time consumption even compared with supercomputers. Our results confirm the ability of light to realize a complicated computational function that is intractable with conventional computers, and suggest the subset sum problem as a good benchmarking platform for the race between photonic and conventional computers on the way towards "photonic supremacy".

preprint, 2020, arXiv

Anchor: Locating Android Framework-specific Crashing Faults

Android framework-specific app crashes are hard to debug. Indeed, the callback-based event-driven mechanism of Android challenges crash localization techniques that are developed for traditional Java programs. The key challenge stems from the fact that the buggy code location may not even be listed within the stack trace. For example, our empirical study on 500 framework-specific crashes from an open benchmark has revealed that 37 percent of the crash types are related to bugs that are outside the stack traces. Moreover, Android programs are a mixture of code and extra-code artifacts such as the Manifest file. The fact that any artifact can lead to failures in the app execution creates the need to position the localization target beyond the code realm. In this paper, we propose Anchor, a two-phase suspicious bug location suggestion tool. Anchor specializes in finding crash-inducing bugs outside the stack trace. Anchor is lightweight and source code independent since it only requires the crash message and the apk file to locate the fault. Experimental results, collected via cross-validation and in-the-wild dataset evaluation, show that Anchor is effective in locating Android framework-specific crashing faults.

preprint2020arXiv

Decoy-State Quantum Key Distribution over a Long-Distance High-Loss Underwater Free-Space Channel

Atmospheric free space and fiber have been widely exploited as the channels for quantum communication, and have enabled inter-continent and inter-city applications. The air-sea free-space channel, being capable of linking the satellite-based quantum resource and underwater vehicle, has now become the last piece of the puzzle in building a global quantum communication network. However, long-distance quantum communication penetrating water up to tens to hundreds of meters is extremely challenging due to the inevitable high loss. Here, we present an experimental demonstration of underwater decoy-state quantum key distribution against high loss, while keeping a low quantum bit error rate of less than 2.5% for different distances. By directly modulating blue-green lasers at a high speed of 50 MHz and using the decoy-state protocol, we are able to reach, for the first time, a long-distance quantum key distribution that is unconditionally secure and can enable real-life air-sea quantum communication tasks. The demonstrated distance, even in coastal water of Jerlov type 2C, is up to 30 meters, about one order of magnitude better than the proof-in-principle demonstrations in previous experiments, and the channel loss is equivalent to 345-meter-long clean seawater of Jerlov type I, representing a key step forward to practical air-sea quantum communication.

preprint2020arXiv

Differential Distributions for t-channel Single Top-Quark Production and Decay at Next-to-Next-to-Leading Order in QCD

We present a detailed phenomenological study of the next-to-next-to-leading order (NNLO) QCD corrections for $t$-channel single top (anti-)quark production and its semi-leptonic decay at the CERN Large Hadron Collider (LHC). We find the NNLO corrections for the total inclusive rates at the LHC with different center of mass energies are generally smaller than the NLO corrections, indicative of improved convergence. However, they can be large for differential distributions, reaching a level of $10\%$ or more in certain regions of the transverse momentum distributions of the top (anti-)quark and the pseudo-rapidity distributions of the leading jet in the event. In all cases the perturbative hard scale uncertainties are greatly reduced after the NNLO corrections are included. We also show a comparison of the normalized parton-level distributions to recent data from the 8 TeV measurement of the ATLAS Collaboration. The NNLO corrections tend to shift the theoretical predictions closer to the measured transverse momentum distribution of the top (anti)-quark. Importantly, for the LHC at 13 TeV, we present NNLO cross sections in a fiducial volume with decays of the top quark included.

preprint2020arXiv

Direct Observation of Quantum Percolation Dynamics

Percolation, describing critical behaviors of phase transition in a geometrical context, prompts wide investigations in natural and social networks as a fundamental model. The introduction of quantum-intrinsic interference and tunneling brings percolation into quantum regime with more fascinating phenomena and unique features, which, however, hasn't been experimentally explored yet. Here we present an experimental demonstration of quantum transport in hexagonal percolation lattices by successfully mapping such large-scale porous structures into a photonic chip using femtosecond laser direct writing techniques. A quantum percolation threshold of 80% is observed in the prototyped laser-written lattices with up to 1,600 waveguides, which is significantly larger than the classical counterpart of 63%. We also investigate the spatial confinement by localization parameters and exhibit the transition from ballistic to diffusive propagation with the decrease of the occupation probability. Direct observation of quantum percolation may deepen the understanding of the relation among materials, quantum transport, geometric quenching, disorder and localization, and inspire applications for quantum technologies.

preprint2020arXiv

EPI-based Oriented Relation Networks for Light Field Depth Estimation

Light field cameras record not only the spatial information of observed scenes but also the directions of all incoming light rays. The spatial and angular information implicitly contain geometrical characteristics such as multi-view or epipolar geometry, which can be exploited to improve the performance of depth estimation. An Epipolar Plane Image (EPI), the unique 2D spatial-angular slice of the light field, contains patterns of oriented lines. The slope of these lines is associated with the disparity. Benefiting from this property of EPIs, some representative methods estimate depth maps by analyzing the disparity of each line in EPIs. However, these methods often extract the optimal slope of the lines from EPIs while ignoring the relationship between neighboring pixels, which leads to inaccurate depth map predictions. Based on the observation that an oriented line and its neighboring pixels in an EPI share a similar linear structure, we propose an end-to-end fully convolutional network (FCN) to estimate the depth value of the intersection point on the horizontal and vertical EPIs. Specifically, we present a new feature-extraction module, called Oriented Relation Module (ORM), that constructs the relationship between the line orientations. To facilitate training, we also propose a refocusing-based data augmentation method to obtain different slopes from EPIs of the same scene point. Extensive experiments verify the efficacy of learning relations and show that our approach is competitive to other state-of-the-art methods. The code and the trained models are available at https://github.com/lkyahpu/EPI_ORM.git.

preprint2020arXiv

Experimental Test of Tracking the King Problem

In quantum theory, the retrodiction problem is not as clear as its classical counterpart because of the uncertainty principle of quantum mechanics. In classical physics, the measurement outcomes of the present state can be used directly for predicting the future events and inferring the past events which is known as retrodiction. However, as a probabilistic theory, quantum-mechanical retrodiction is a nontrivial problem that has been investigated for a long time, of which the Mean King Problem is one of the most extensively studied issues. Here, we present the first experimental test of a variant of the Mean King Problem, which has a more stringent regulation and is termed "Tracking the King". We demonstrate that Alice, by harnessing the shared entanglement and controlled-not gate, can successfully retrodict the choice of King's measurement without knowing any measurement outcome. Our results also provide a counterintuitive quantum communication to deliver information hidden in the choice of measurement.

preprint2020arXiv

Fast Correlated-Photon Imaging Enhanced by Deep Learning

Correlated photon pairs, carrying strong quantum correlations, have been harnessed to bring quantum advantages to various fields from biological imaging to range finding. Such inherent non-classical properties support extracting more valid signals to build photon-limited images even at low flux levels, where the shot noise becomes dominant as the light source decreases to single-photon level. Optimization by numerical reconstruction algorithms is possible but requires thousands of photon-sparse frames, and is thus unavailable in real time. Here, we present an experimental fast correlated-photon imaging enhanced by deep learning, showing an intelligent computational strategy to discover deeper structure in big data. A convolutional neural network is found to be able to efficiently solve image inverse problems associated with strong shot noise and background noise (electronic noise, scattered light). Our results fill the key gap in incompatibility between imaging speed and image quality by pushing low-light imaging technique to the regime of real-time and single-photon level, opening up an avenue to deep learning-enhanced quantum imaging for real-life applications.

preprint2020arXiv

Hacking Quantum Key Distribution via Injection Locking

Unconditionally secure communication, being pursued for thousands of years, however, hasn't been reached yet due to continuous competitions between encryption and hacking. Quantum key distribution (QKD), harnessing the quantum mechanical nature of superposition and non-cloning, may promise unconditional security by incorporating the one-time pad algorithm rigorously proved by Claude Shannon. Massive efforts have been made in building practical and commercial QKD systems, in particular, decoy states are employed to detect photon-number splitting attack against single-photon source loophole, and measurement-device-independent (MDI) QKD has further closed all loopholes in detection side, which leads to a seemingly real-life application. Here, we propose and experimentally demonstrate an MDI-QKD hacking strategy on the trusted source assumption by using injection locking technique. Eve injects near off-resonance photons in randomly chosen polarization into sender's laser, where injection locking in a shifted frequency can happen only when Eve's choice matches with sender's state. By setting a shifted window and switching the frequency of photons back afterwards, Eve in principle can obtain all the keys without terminating the real-time QKD. We observe the dynamics of a semiconductor laser with injected photons, and obtain a hacking success rate reaching 60.0% of raw keys. Our results suggest that the spear-and-shield competitions on unconditional security may continue until all potential loopholes are discovered and closed ultimately.

preprint2020arXiv

Information Laundering for Model Privacy

In this work, we propose information laundering, a novel framework for enhancing model privacy. Unlike data privacy that concerns the protection of raw data information, model privacy aims to protect an already-learned model that is to be deployed for public use. The private model can be obtained from general learning methods, and its deployment means that it will return a deterministic or random response for a given input query. An information-laundered model consists of probabilistic components that deliberately maneuver the intended input and output for queries to the model, so the model's adversarial acquisition is less likely. Under the proposed framework, we develop an information-theoretic principle to quantify the fundamental tradeoffs between model utility and privacy leakage and derive the optimal design.

preprint2020arXiv

Investigating Bottom-Quark Yukawa Interaction at Higgs Factory

Measuring the fermion Yukawa coupling constants is important for understanding the origin of the fermion masses and its relationship to spontaneous electroweak symmetry breaking. On the other hand, some new physics models will change the Lorentz structure of the Yukawa interactions between the standard model (SM) fermions and the SM-like Higgs boson even in their decoupling limit. Thus the precise measurement of the fermion Yukawa interactions is a powerful tool for new physics searches in the decoupling limit. In this work, we show the possibility of investigating the Lorentz structure of the bottom-quark Yukawa interaction with the 125 GeV SM-like Higgs boson at future $e^+e^-$ colliders.

preprint2020arXiv

Multipartite Entanglement of Billions of Motional Atoms Heralded by Single Photon

Quantum entanglement is of central importance to quantum computing, quantum metrology, quantum information as well as the nature of quantum physics. Quantum theory does not prevent entanglement from being created and observed in macroscopic physical systems; in reality, however, the accessible scale of entanglement is still very limited due to decoherence effects. Recently, entanglement has been observed among atoms from thousands to millions level in extremely low-temperature and well-isolated systems. Here, we create multipartite entanglement of billions of motional atoms in a quantum memory at room temperature, and certify the genuine entanglement via $M$-separability witness associated with photon statistics. The information contained in a single photon is found strongly correlated with the excitation shared by the motional atoms, which intrinsically addresses the large system and therefore stimulates the multipartite entanglement. Remarkably, our heralded and quantum memory built-in entanglement generation allows us to directly observe the dynamic evolution of entanglement depth and further to reveal the effects of decoherence. Our results verify the existence of genuine multipartite entanglement among billions of motional atoms at ambient condition, significantly extending the boundary of the accessible scale of entanglement. Besides probing the quantum-to-classical transition in an entirely new realm, the developed abilities of manipulating such a large-scale entanglement may enhance a wide spectrum of applications for emerging quantum technologies.

preprint2020arXiv

New CTEQ global analysis of quantum chromodynamics with high-precision data from the LHC

We present the new parton distribution functions (PDFs) from the CTEQ-TEA collaboration, obtained using a wide variety of high-precision Large Hadron Collider (LHC) data, in addition to the combined HERA I+II deep-inelastic scattering data set, along with the data sets present in the CT14 global QCD analysis. New LHC measurements in single-inclusive jet production with the full rapidity coverage, as well as production of Drell-Yan pairs, top-quark pairs, and high-$p_T$ $Z$ bosons, are included to achieve the greatest sensitivity to the PDFs. The parton distributions are determined at NLO and NNLO, with each of these PDFs accompanied by error sets determined using the Hessian method. Fast PDF survey techniques, based on the Hessian representation and the Lagrange Multiplier method, are used to quantify the preference of each data set to quantities such as $α_s(m_Z)$, and the gluon and strange quark distributions. We designate the main resulting PDF set as CT18. The ATLAS 7 TeV precision $W/Z$ data are not included in CT18, due to their tension with other data sets in the global fit. Alternate PDF sets are generated including the ATLAS precision 7 TeV $W/Z$ data (CT18A), a new scale choice for low-$x$ DIS data (CT18X), or all of the above with a slightly higher choice for the charm mass (CT18Z). Theoretical calculations of standard candle cross sections at the LHC (such as the $gg$ fusion Higgs boson cross section) are presented.

preprint2020arXiv

Observing Movement of Dirac Cones from Single-Photon Dynamics

Graphene with honeycomb structure, being critically important in understanding physics of matter, exhibits exceptionally unusual half-integer quantum Hall effect and unconventional electronic spectrum with quantum relativistic phenomena. Particularly, graphene-like structure can be used for realizing topological insulator which inspires an intrinsic topological protection mechanism with strong immunity for maintaining coherence of quantum information. These various peculiar physics arise from the unique properties of Dirac cones which show high hole degeneracy, massless charge carriers and linear intersection of bands. Experimental observation of Dirac cones conventionally focuses on the energy-momentum space with bulk measurement. Recently, the wave function and band structure have been mapped into the real-space in photonic system, and made flexible control possible. Here, we demonstrate a direct observation of the movement of Dirac cones from single-photon dynamics in photonic graphene under different biaxial strains. Sharing the same spirit of wave-particle nature in quantum mechanics, we identify the movement of Dirac cones by dynamically detecting the edge modes and extracting the diffusing distance of the packets with accumulation and statistics on individual single-particle registrations. Our results of observing movement of Dirac cones from single-photon dynamics, together with the method of direct observation in real space by mapping the band structure defined in momentum space, pave the way to understand a variety of artificial structures in quantum regime.

preprint2020arXiv

Protecting Quantum Superposition and Entanglement with Photonic Higher-Order Topological Crystalline Insulator

Higher-order topological insulator, as a newly found non-trivial material and structure, possesses a topological phase beyond the bulk-boundary correspondence. Here, we present an experimental observation of photonic higher-order topological crystalline insulator and its topological protection to quantum superposition and entanglement in a two-dimensional lattice. By freely writing the insulator structure with femtosecond laser and directly measuring evolution dynamics with single-photon imaging techniques, we are able to observe the distinct features of the topological corner states in C_4 and C_2 photonic lattice symmetry. Especially, we propose and experimentally identify the topological corner states by exciting the photonic lattice with single-photon superposition state, and we examine the protection impact of topology on quantum entanglement for entangled photon states. The single-photon dynamics and the protected entanglement reveal an intrinsic topological protection mechanism isolating multi-partite quantum states from diffusion-induced decoherence. The higher-order topological crystalline insulator, built-in superposition state generation, heralded single-photon imaging and quantum entanglement demonstrated here link topology, material, and quantum physics, opening the door to wide investigations of higher-order topology and applications of topological enhancement in genuine quantum regime.

preprint2020arXiv

Quantum Go Machine

Go has long been considered as a testbed for artificial intelligence. By introducing certain quantum features, such as superposition and collapse of wavefunction, we experimentally demonstrate a quantum version of Go by using correlated photon pairs entangled in polarization degree of freedom. The total dimension of Hilbert space of the generated states grows exponentially as two players take turns to place the stones in time series. As nondeterministic and imperfect information games are more difficult to solve using today's technology, we excitedly find that the inherent randomness in quantum physics can give the game a nondeterministic trait, which does not exist in the classical counterpart. Some quantum resources, like coherence or entanglement, can also be encoded to represent the state of quantum stones. Adjusting the quantum resource may vary the average imperfect information (by comparison, classical Go is a perfect information game) of a single game. We further verify its non-deterministic feature by showing the unpredictability of the time series data obtained from different classes of quantum state. Finally, by comparing quantum Go with a few typical games that are widely studied in artificial intelligence, we find that quantum Go can cover a wide range of game difficulties rather than a single point. Our results establish a paradigm of inventing new games with quantum-enabled difficulties by harnessing inherent quantum features and resources, and provide a versatile platform for the test of new algorithms to both classical and quantum machine learning.

preprint2020arXiv

Two-Dimensional Quantum Walk of Correlated Photons

Quantum walks in an elaborately designed graph are a powerful tool for simulating physical and topological phenomena, constructing analog quantum algorithms and realizing universal quantum computing. Integrated photonics technology has emerged as a versatile platform to implement various quantum information tasks and a promising candidate to perform large-scale quantum walks. Both extending physical dimensions and involving more particles will increase the complexity of the evolving systems and the desired quantum resources. Pioneering works have demonstrated single particle walking on two-dimensional (2D) lattices and multiple walkers interfering on a one-dimensional structure. However, 2D multi-particle quantum walk, genuinely being not classically simulatable, has been a vacancy for nearly ten years. Here, we present a genuine 2D quantum walk with correlated photons on a triangular photonic lattice, which can be mapped to a state space up to 37×37 dimensions. This breaks through the physical restriction of single-particle evolution, which can encode information in a large space and constitute high-dimensional graphs indeed beneficial to quantum information processing. A site-by-site addressing between the chip facet and the 2D fanout interface enables an observation of over 600 non-classical interferences simultaneously, violating a classical limit up to 57 standard deviations. Our platform offers a promising prospect for multi-photon quantum walks in a large-scale 2D arrangement, paving the way for practical quantum simulation and quantum computation beyond classical regime.

preprint2020arXiv

Vector Vortex Beam Emitter Embedded in a Photonic Chip

Vector vortex beams simultaneously carrying spin and orbital angular momentum of light promise additional degrees of freedom for modern optics and emerging resources for both classical and quantum information technologies. The inherently infinite dimensions can be exploited to enhance data capacity for sustaining the unprecedented growth in big data and internet traffic, and can be encoded to build quantum computing machines in high-dimensional Hilbert space. So far much progress has been made in the emission of vector vortex beams from a chip surface into free space, however, the generation of vector vortex beams inside a photonic chip hasn't been realized yet. Here, we demonstrate the first vector vortex beam emitter embedded in a photonic chip by using femtosecond laser direct writing. We achieve a conversion of vector vortex beams with an efficiency up to 30% and scalar vortex beams with an efficiency up to 74% from Gaussian beams. We also present an expanded coupled-mode model for understanding the mode conversion and the influence of the imperfection in fabrication. The fashion of embedded generation makes vector vortex beams directly ready for further transmission, manipulation and emission without any additional interconnection. Together with the ability to be integrated as an array, our results may enable vector vortex beams to become accessible inside a photonic chip for high-capacity communication and high-dimensional quantum information processing.