Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
38 works
0 followers
22 topics
4 close collaborators

Published work

38 published items

preprint, 2026, arXiv

Data-Driven Flow Initialization Framework for CFD Acceleration of Underwater Vehicle in Vertical-Plane Oblique Motion

Accurate prediction of flow fields around underwater vehicles undergoing vertical-plane oblique motions is critical for hydrodynamic analysis, but it often requires computationally expensive CFD simulations. This study proposes a Data-Driven Flow Initialization (DDFI) framework that accelerates CFD simulation by integrating a deep neural network (DNN) to predict full-domain flow fields. Using the SUBOFF hull under various inlet velocities and angles of attack as an example, a DNN is trained to predict velocity, pressure, and turbulent quantities based on mesh geometry, operating conditions, and hybrid vectors. The DNN provides reasonably accurate predictions, with a relative error of about 3.3%. To enhance numerical accuracy while maintaining physical consistency, the DNN-predicted flow fields are used as initial solutions for the CFD solver, achieving up to 3.5-fold and 2.0-fold speedups at residual thresholds of $5\times10^{-6}$ and $5\times10^{-8}$, respectively. This method maintains physical consistency by refining neural network outputs via traditional CFD solvers, balancing computational efficiency and accuracy. Notably, reducing the size of the training set does not substantially affect acceleration performance. In addition, the method exhibits cross-mesh generalization capability. Overall, the proposed hybrid approach offers a new pathway for high-fidelity and efficient full-domain flow field predictions around complex underwater vehicles.
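The idea of using a predicted field as the solver's initial solution can be illustrated with a toy stand-in. The sketch below is not the paper's CFD setup: it assumes a small linear system solved by Jacobi iteration, and the function name `jacobi_iterations`, the matrix, and the "predicted" starting vector are all hypothetical. It only shows that an iterative solver started near the solution tends to reach a given residual threshold in fewer iterations than one started from zero.

```python
def jacobi_iterations(A, b, x0, tol=1e-8, max_iter=1000):
    """Run Jacobi iteration from x0; return (solution, iterations used)."""
    n = len(b)
    x = list(x0)
    for it in range(max_iter + 1):
        # Largest absolute residual component of b - A x.
        residual = max(
            abs(b[i] - sum(A[i][j] * x[j] for j in range(n))) for i in range(n)
        )
        if residual < tol:
            return x, it
        # One Jacobi sweep: solve each row for its diagonal unknown.
        x = [
            (b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
            for i in range(n)
        ]
    return x, max_iter


# Hypothetical diagonally dominant system standing in for the discretized flow equations.
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]

# Cold start: zero initial field.
cold_x, cold_iters = jacobi_iterations(A, b, [0.0, 0.0])

# Warm start: an approximate "predicted" field close to the true solution (1/11, 7/11).
warm_x, warm_iters = jacobi_iterations(A, b, [0.09, 0.64])

print(cold_iters, warm_iters)
```

In this toy case the final answer is the same for both runs, because the solver itself enforces the residual threshold; the starting guess only changes how long convergence takes.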

preprint, 2026, arXiv

FlowAct-R1: Towards Interactive Humanoid Video Generation

Interactive humanoid video generation aims to synthesize lifelike visual agents that can engage with humans through continuous and responsive video. Despite recent advances in video synthesis, existing methods often grapple with the trade-off between high-fidelity synthesis and real-time interaction requirements. In this paper, we propose FlowAct-R1, a framework specifically designed for real-time interactive humanoid video generation. Built upon an MMDiT architecture, FlowAct-R1 enables the streaming synthesis of video with arbitrary durations while maintaining low-latency responsiveness. We introduce a chunkwise diffusion forcing strategy, complemented by a novel self-forcing variant, to alleviate error accumulation and ensure long-term temporal consistency during continuous interaction. By leveraging efficient distillation and system-level optimizations, our framework achieves a stable 25 fps at 480p resolution with a time-to-first-frame (TTFF) of only around 1.5 seconds. The proposed method provides holistic and fine-grained full-body control, enabling the agent to transition naturally between diverse behavioral states in interactive scenarios. Experimental results demonstrate that FlowAct-R1 achieves exceptional behavioral vividness and perceptual realism, while maintaining robust generalization across diverse character styles.

preprint, 2026, arXiv

Motion-Aware Caching for Efficient Autoregressive Video Generation

Autoregressive video generation paradigms offer theoretical promise for long video synthesis, yet their practical deployment is hindered by the computational burden of sequential iterative denoising. While cache reuse strategies can accelerate generation by skipping redundant denoising steps, existing methods rely on coarse-grained chunk-level skipping that fails to capture fine-grained pixel dynamics. This oversight is critical: pixels with high motion require more denoising steps to prevent error accumulation, while static pixels tolerate aggressive skipping. We formalize this insight theoretically by linking cache errors to residual instability, and propose MotionCache, a motion-aware cache framework that exploits inter-frame differences as a lightweight proxy for pixel-level motion characteristics. MotionCache employs a coarse-to-fine strategy: an initial warm-up phase establishes semantic coherence, followed by motion-weighted cache reuse that dynamically adjusts update frequencies per token. Extensive experiments on state-of-the-art models like SkyReels-V2 and MAGI-1 demonstrate that MotionCache achieves significant speedups of $6.28\times$ and $1.64\times$ respectively, while effectively preserving generation quality (VBench: $1\%\downarrow$ and $0.01\%\downarrow$ respectively). The code is available at https://github.com/ywlq/MotionCache.
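The motion-weighted reuse described above can be pictured with a very small sketch. This is not the MotionCache implementation (see the linked repository for that): the function names, the fixed threshold, and the scalar "tokens" are assumptions made purely for illustration. It shows inter-frame differences being used as a per-token motion proxy, with high-motion tokens recomputed and low-motion tokens served from a cache.

```python
def motion_scores(prev_frame, curr_frame):
    """Absolute inter-frame difference per token, used as a cheap motion proxy."""
    return [abs(c - p) for p, c in zip(prev_frame, curr_frame)]


def refresh_mask(scores, threshold):
    """True where a token moved enough that its cached value should be recomputed."""
    return [s > threshold for s in scores]


def cached_step(tokens, cache, mask, denoise_fn):
    """Recompute only the tokens flagged in mask; reuse cached outputs elsewhere."""
    return [
        denoise_fn(tok) if refresh else cached
        for tok, cached, refresh in zip(tokens, cache, mask)
    ]


# Hypothetical one-dimensional "frames" with three tokens each.
prev_frame = [0.0, 0.5, 1.0]
curr_frame = [0.0, 0.9, 1.05]

scores = motion_scores(prev_frame, curr_frame)
mask = refresh_mask(scores, threshold=0.1)  # threshold value is an assumption

# Stand-in for an expensive denoising call: here it simply doubles the token.
cache = [10.0, 20.0, 30.0]
output = cached_step(curr_frame, cache, mask, denoise_fn=lambda t: 2.0 * t)
print(mask, output)
```

Only the middle token exceeds the threshold here, so it is the only one that triggers the expensive call; the other two reuse their cached values.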

preprint, 2026, arXiv

NL2Dashboard: A Lightweight and Controllable Framework for Generating Dashboards with LLMs

While Large Language Models (LLMs) have demonstrated remarkable proficiency in generating standalone charts, synthesizing comprehensive dashboards remains a formidable challenge. Existing end-to-end paradigms, which typically treat dashboard generation as a direct code generation task (e.g., raw HTML), suffer from two fundamental limitations: representation redundancy due to massive tokens spent on visual rendering, and low controllability caused by the entanglement of analytical reasoning and presentation. To address these challenges, we propose NL2Dashboard, a lightweight framework grounded in the principle of Analysis-Presentation Decoupling. We introduce a structured intermediate representation (IR) that encapsulates the dashboard's content, layout, and visual elements. Therefore, it confines the LLM's role to data analysis and intent translation, while offloading visual synthesis to a deterministic rendering engine. Building upon this framework, we develop a multi-agent system in which the IR-driven algorithm is instantiated as a suite of tools. Comprehensive experiments conducted with this system demonstrate that NL2Dashboard significantly outperforms state-of-the-art baselines across diverse domains, achieving superior visual quality, significantly higher token efficiency, and precise controllability in both generation and modification tasks.

preprint, 2022, arXiv

Asynchronous Hierarchical Federated Learning

Federated Learning is a rapidly growing area of research with various benefits and industry applications. Typical federated patterns have some intrinsic issues such as heavy server traffic, long periods of convergence, and unreliable accuracy. In this paper, we address these issues by proposing asynchronous hierarchical federated learning, in which the central server uses either the network topology or some clustering algorithm to assign clusters for workers (i.e., client devices). In each cluster, a special aggregator device is selected to enable hierarchical learning, which leads to efficient communication between server and workers, so that the burden on the server can be significantly reduced. In addition, an asynchronous federated learning schema is used to tolerate heterogeneity of the system and achieve fast convergence: the server aggregates the gradients from the workers, weighted by a staleness parameter, to update the global model, and regularized stochastic gradient descent is performed in workers, so that the instability of asynchronous learning can be alleviated. We evaluate the proposed algorithm on the CIFAR-10 image classification task; the experimental results demonstrate the effectiveness of asynchronous hierarchical federated learning.
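A minimal sketch of staleness-weighted aggregation may help make the server-side update concrete. The abstract does not give the weighting function, so the polynomial decay used here, the parameter names `decay` and `lr`, and the function names are assumptions rather than the paper's algorithm.

```python
def staleness_weight(staleness, decay=0.5):
    """Down-weight an update by how many global versions it lags behind (assumed polynomial decay)."""
    return 1.0 / (1.0 + staleness) ** decay


def apply_update(global_model, gradient, staleness, lr=0.1, decay=0.5):
    """Apply one worker gradient to the global model, scaled by its staleness weight."""
    w = staleness_weight(staleness, decay)
    return [p - lr * w * g for p, g in zip(global_model, gradient)]


model = [1.0, 2.0]

# A fresh gradient (staleness 0) is applied at full strength.
fresh = apply_update(model, [1.0, 1.0], staleness=0)

# A gradient computed three global versions ago gets weight 1 / sqrt(4) = 0.5.
stale = apply_update(model, [1.0, 1.0], staleness=3)

print(fresh, stale)
```

The design choice being illustrated is that slow workers still contribute, but their outdated gradients move the global model less than fresh ones.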

preprint, 2022, arXiv

Axion-Like Particles at High Energy Muon Colliders -- A White paper for Snowmass 2021

We study the discovery potential for heavy axion-like particles (ALPs) and the perspectives for determining their coupling properties at a muon collider. Focusing on their couplings to the Standard Model (SM) gauge bosons $γ, Z, W^\pm$, we show that a high-energy muon collider can substantially extend the mass coverage, essentially reaching the kinematic limit of the collider energy. The unique kinematics allow for non-ambiguous determination of the individual coupling strengths. The associated production via $μ^+μ^-$ annihilation and the VBF processes with the tagged outgoing muons can be utilized to verify the CP property of the ALPs. We illustrate our results for a muon collider running at 3 TeV and 10 TeV.

preprint, 2022, arXiv

Bridging the Data Gap between Training and Inference for Unsupervised Neural Machine Translation

Back-translation is a critical component of Unsupervised Neural Machine Translation (UNMT), which generates pseudo parallel data from target monolingual data. A UNMT model is trained on the pseudo parallel data with translated source, and translates natural source sentences in inference. The source discrepancy between training and inference hinders the translation performance of UNMT models. By carefully designing experiments, we identify two representative characteristics of the data gap in source: (1) style gap (i.e., translated vs. natural text style) that leads to poor generalization capability; (2) content gap that induces the model to produce hallucination content biased towards the target language. To narrow the data gap, we propose an online self-training approach, which simultaneously uses the pseudo parallel data {natural source, translated target} to mimic the inference scenario. Experimental results on several widely-used language pairs show that our approach outperforms two strong baselines (XLM and MASS) by remedying the style and content gaps.

preprint, 2022, arXiv

Energy dissipation from confined states in nanoporous molecular networks

Crystalline nanoporous molecular networks are assembled on the Ag(111) surface, where the pores confine electrons originating from the surface state of the metal. Depending on the pore sizes and their coupling, an antibonding level is shifted upwards by 0.1 to 0.3 eV as measured by scanning tunneling microscopy. On molecular sites, a down-shifted bonding state is observed, which is occupied under equilibrium conditions. Low-temperature force spectroscopy reveals energy dissipation peaks and jumps of frequency shifts at bias voltages, which are related to the confined states. The dissipation maps show delocalization on the supra-molecular assembly and a weak distance-dependence of the dissipation peaks. These observations indicate that two-dimensional arrays of coupled quantum dots are formed, which are quantitatively characterized by their quantum capacitances and resonant tunneling rates. Our work provides a method for studying the capacitive and dissipative response of quantum materials with nanomechanical oscillators.

preprint, 2022, arXiv

Factorization and Sudakov Resummation in Leptonic Radiative $B$ Decay -- A Reappraisal

The $B$-meson light-cone distribution amplitude is an important non-perturbative quantity arising in the factorization of the amplitudes for many exclusive decays of $B$ mesons, such as $B^-\toγ\,\ell^-\barν$. We reconsider the renormalization-group (RG) equation satisfied by this function and present its solution at next-to-leading order (NLO) in RG-improved perturbation theory in Laplace space and, for the first time, in momentum space and the so-called diagonal (or dual) space. Since the information needed to describe the $B$ decay processes at leading order in $Λ_{\rm QCD}/m_b$ is most directly contained in the distribution amplitude in Laplace space evaluated near the origin, we propose an unbiased parameterization of this object in terms of a small set of uncorrelated hadronic parameters. Using recent results on the three-loop anomalous dimension for heavy-light current operators, we derive an expression for the convolution integral appearing in the $B^-\toγ\,\ell^-\barν$ factorization formula that is explicitly scale independent, and we evaluate this formula at (approximate) NNLO.

preprint, 2022, arXiv

MoCoViT: Mobile Convolutional Vision Transformer

Recently, Transformer networks have achieved impressive results on a variety of vision tasks. However, most of them are computationally expensive and not suitable for real-world mobile applications. In this work, we present Mobile Convolutional Vision Transformer (MoCoViT), which improves performance and efficiency by introducing transformers into mobile convolutional networks to leverage the benefits of both architectures. Different from recent works on vision transformers, the mobile transformer block in MoCoViT is carefully designed for mobile devices and is very lightweight, accomplished through two primary modifications: the Mobile Self-Attention (MoSA) module and the Mobile Feed Forward Network (MoFFN). MoSA simplifies the calculation of the attention map through a Branch Sharing scheme, while MoFFN serves as a mobile version of the MLP in the transformer, further reducing the computation by a large margin. Comprehensive experiments verify that our proposed MoCoViT family outperforms state-of-the-art portable CNNs and transformer neural architectures on various vision tasks. On ImageNet classification, it achieves 74.5% top-1 accuracy at 147M FLOPs, gaining 1.2% over MobileNetV3 with less computation. On the COCO object detection task, MoCoViT outperforms GhostNet by 2.1 AP in the RetinaNet framework.

preprint, 2022, arXiv

Next-ViT: Next Generation Vision Transformer for Efficient Deployment in Realistic Industrial Scenarios

Due to the complex attention mechanisms and model design, most existing vision Transformers (ViTs) cannot perform as efficiently as convolutional neural networks (CNNs) in realistic industrial deployment scenarios, e.g. TensorRT and CoreML. This poses a distinct challenge: Can a visual neural network be designed to infer as fast as CNNs and perform as powerfully as ViTs? Recent works have tried to design CNN-Transformer hybrid architectures to address this issue, yet the overall performance of these works is far from satisfactory. To this end, we propose a next generation vision Transformer for efficient deployment in realistic industrial scenarios, namely Next-ViT, which dominates both CNNs and ViTs from the perspective of latency/accuracy trade-off. In this work, the Next Convolution Block (NCB) and Next Transformer Block (NTB) are respectively developed to capture local and global information with deployment-friendly mechanisms. Then, Next Hybrid Strategy (NHS) is designed to stack NCB and NTB in an efficient hybrid paradigm, which boosts performance in various downstream tasks. Extensive experiments show that Next-ViT significantly outperforms existing CNNs, ViTs and CNN-Transformer hybrid architectures with respect to the latency/accuracy trade-off across various vision tasks. On TensorRT, Next-ViT surpasses ResNet by 5.5 mAP (from 40.4 to 45.9) on COCO detection and 7.7% mIoU (from 38.8% to 46.5%) on ADE20K segmentation under similar latency. Meanwhile, it achieves comparable performance with CSWin, while the inference speed is accelerated by 3.6x. On CoreML, Next-ViT surpasses EfficientFormer by 4.6 mAP (from 42.6 to 47.2) on COCO detection and 3.5% mIoU (from 45.1% to 48.6%) on ADE20K segmentation under similar latency. Our code and models are made public at: https://github.com/bytedance/Next-ViT

preprint, 2022, arXiv

Reliable and Efficient Broadcast Routing Using Multipoint Relays Over VANET For Vehicle Platooning

In this paper, we design and implement a reliable broadcast algorithm over a VANET for supporting multi-hop forwarding of vehicle sensor and control packets that will enable vehicles to platoon with each other in order to form a road train behind the lead truck. In particular, we use multipoint relays (MPRs) for packet transmission, which leads to more efficient communication in a VANET. We evaluate the performance based on simulation by running a platooning simulation application program, and show that with MPRs, the communication in the VANET to form a road train is more efficient and reliable.

preprint, 2022, arXiv

Sepsis Prediction with Temporal Convolutional Networks

We design and implement a temporal convolutional network model to predict sepsis onset. Our model is trained on data extracted from the MIMIC-III database, based on a retrospective analysis of patients admitted to the intensive care unit who did not fall under the definition of sepsis at the time of admission. Benchmarked against several machine learning models, our model is superior on this binary classification task, which demonstrates the predictive power of convolutional networks for temporal patterns and shows the significant impact of a longer look-back time on sepsis prediction.

preprint, 2022, arXiv

Sharp $L^p$ estimates and size of nodal sets for generalized Steklov eigenfunctions

We prove sharp $L^p$ estimates for the Steklov eigenfunctions on compact manifolds with boundary in terms of their $L^2$ norms on the boundary. We prove it by establishing $L^p$ bounds for the harmonic extension operators as well as the spectral projection operators on the boundary. Moreover, we derive lower bounds on the size of nodal sets for a variation of the Steklov spectral problem. We consider a generalized version of the Steklov problem by adding a non-smooth potential on the boundary but some of our results are new even without potential.

preprint, 2022, arXiv

Single Vector-Like top quark production via chromomagnetic interactions at present and future hadron colliders -- A Snowmass 2021 White Paper

In our recent paper, we have investigated the potential for the LHC to discover vector-like quark partner states singly produced via their chromomagnetic moment interactions. These production mechanisms extend traditional searches which rely on pair-production of top-quark partner states or on the single production of these states through electroweak interactions, in the sense of providing greatly increased reach in parameter space regions where traditional searches are insensitive. In this study we determine the potential of both the 14 TeV high-luminosity LHC (HL-LHC) and a 100 TeV proton-proton collider to probe new vector-like quarks produced in this mode. We focus on the single production of a top-quark partner in association with an ordinary top-quark, as well as on the resonant production of the bottom-quark partner with its subsequent decay to a top-quark partner and a $W$ boson. For both cases we consider a top-partner decay to the Higgs boson and an ordinary top-quark. We find that HL-LHC and a future 100 TeV proton collider can probe vector-like partner masses up to about 3 TeV and 15-20 TeV respectively, visibly extending the range of the traditional vector-like quark partner searches.

preprint, 2022, arXiv

STAD: Self-Training with Ambiguous Data for Low-Resource Relation Extraction

We present a simple yet effective self-training approach, named STAD, for low-resource relation extraction. The approach first classifies the auto-annotated instances into two groups, confident instances and uncertain instances, according to the probabilities predicted by a teacher model. In contrast to most previous studies, which mainly use only the confident instances for self-training, we also make use of the uncertain instances. To this end, we propose a method to identify ambiguous but useful instances from the uncertain instances and then divide the relations into a candidate-label set and a negative-label set for each ambiguous instance. Next, we propose a set-negative training method on the negative-label sets for the ambiguous instances and a positive training method for the confident instances. Finally, a joint-training method is proposed to build the final relation extraction system on all data. Experimental results on two widely used datasets, SemEval2010 Task-8 and Re-TACRED, with low-resource settings demonstrate that this new self-training approach achieves significant and consistent improvements compared to several competitive self-training systems. Code is publicly available at https://github.com/jjyunlp/STAD
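The grouping step described above can be sketched roughly as follows. This is not the STAD code (see the linked repository for that): the confidence threshold, the top-k rule for building the candidate-label set, and the function names are all assumptions chosen for illustration.

```python
def split_instances(probs, confident_threshold=0.9):
    """Split instance indices into confident and uncertain groups by the teacher's top probability."""
    confident, uncertain = [], []
    for idx, p in enumerate(probs):
        if max(p) >= confident_threshold:
            confident.append(idx)
        else:
            uncertain.append(idx)
    return confident, uncertain


def label_sets(p, top_k=2):
    """For one ambiguous instance, return (candidate-label set, negative-label set)."""
    ranked = sorted(range(len(p)), key=lambda r: p[r], reverse=True)
    return set(ranked[:top_k]), set(ranked[top_k:])


# Hypothetical teacher probabilities over three relation labels for two instances.
teacher_probs = [
    [0.95, 0.03, 0.02],  # clear winner: treated as confident
    [0.50, 0.40, 0.10],  # two plausible labels: treated as uncertain
]

confident, uncertain = split_instances(teacher_probs)
candidates, negatives = label_sets(teacher_probs[uncertain[0]])
print(confident, uncertain, candidates, negatives)
```

The negative-label set is the useful part for the ambiguous instance: even when the teacher cannot pick one relation, it can still say which relations the instance almost certainly does not express.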

preprint, 2022, arXiv

Supersymmetry and Sum Rules in the Goldberger-Wise Model

In this work we demonstrate that the mixed gravitational and scalar sectors of the five-dimensional Goldberger-Wise (GW) model, in which the size of a warped extra dimension is dynamically determined, have a "hidden" dual $N=2$ supersymmetric structure. This symmetry structure, a generalization of one found in the unstabilized Randall-Sundrum model, is a result of the spontaneously broken five-dimensional diffeomorphism invariance of the underlying gravitational theory. The supersymmetries relate the properties of the spin-1 and spin-0 modes "eaten" by the massive spin-2 Kaluza-Klein states of the theory to the mode functions of the spin-2 modes. Because the symmetries relate the couplings and masses of the massive spin-2 states to those of the tower of physical spin-0 states of the GW model, they enable us to analytically prove the sum rule relations which ensure the tree-level scattering amplitudes of the massive spin-2 states will grow no faster than ${\cal O}(s)$. The analysis given here also explains the unconventional forms of the spin-0 mode equation, boundary condition(s), and normalization found in the GW model.

preprint, 2022, arXiv

TRT-ViT: TensorRT-oriented Vision Transformer

We revisit the existing excellent Transformers from the perspective of practical application. Most of them are not even as efficient as the basic ResNet series and deviate from realistic deployment scenarios. This may be because the current criteria for measuring computational efficiency, such as FLOPs or parameter counts, are one-sided, sub-optimal, and hardware-insensitive. Thus, this paper directly treats the TensorRT latency on the specific hardware as an efficiency metric, which provides more comprehensive feedback involving computational capacity, memory cost, and bandwidth. Based on a series of controlled experiments, this work derives four practical guidelines for TensorRT-oriented and deployment-friendly network design, e.g., early CNN and late Transformer at stage-level, early Transformer and late CNN at block-level. Accordingly, a family of TensorRT-oriented Transformers is presented, abbreviated as TRT-ViT. Extensive experiments demonstrate that TRT-ViT significantly outperforms existing ConvNets and vision Transformers with respect to the latency/accuracy trade-off across diverse visual tasks, e.g., image classification, object detection and semantic segmentation. For example, at 82.7% ImageNet-1k top-1 accuracy, TRT-ViT is 2.7$\times$ faster than CSWin and 2.0$\times$ faster than Twins. On the MS-COCO object detection task, TRT-ViT achieves comparable performance with Twins, while the inference speed is increased by 2.8$\times$.

preprint, 2022, arXiv

Understanding and Improving Sequence-to-Sequence Pretraining for Neural Machine Translation

In this paper, we present a substantial step in better understanding the SOTA sequence-to-sequence (Seq2Seq) pretraining for neural machine translation (NMT). We focus on studying the impact of the jointly pretrained decoder, which is the main difference between Seq2Seq pretraining and previous encoder-based pretraining approaches for NMT. By carefully designing experiments on three language pairs, we find that Seq2Seq pretraining is a double-edged sword: On one hand, it helps NMT models to produce more diverse translations and reduce adequacy-related translation errors. On the other hand, the discrepancies between Seq2Seq pretraining and NMT finetuning limit the translation quality (i.e., domain discrepancy) and induce the over-estimation issue (i.e., objective discrepancy). Based on these observations, we further propose simple and effective strategies, named in-domain pretraining and input adaptation to remedy the domain and objective discrepancies, respectively. Experimental results on several language pairs show that our approach can consistently improve both translation performance and model robustness upon Seq2Seq pretraining.

preprint, 2022, arXiv

WIMP Dark Matter at High Energy Muon Colliders -- A White Paper for Snowmass 2021

In a previous publication, we showed that a high energy muon collider can make decisive statements about the electroweak (WIMP) Dark Matter (DM), reaching a DM mass which could give the observed thermal relic abundance. In this document, we report new studies of the spin-$0$ minimal WIMP DM at high energy muon colliders, and update our results on the fermionic spin-$1/2$ case. We find that, by combining multiple inclusive missing mass search channels, it is possible to fully cover the thermal targets of fermionic and scalar doublets, and Dirac triplet, with a 10 TeV muon collider. Higher energies, 14 TeV to 30 TeV, would be able to cover the thermal targets of Majorana and scalar triplet. For direct discovery of the higher EW multiplets with $n\geq5$, one may need to go beyond a 30 TeV muon collider to fully cover their thermal mass expectation.

preprint, 2021, arXiv

Benchmarking Graph Neural Networks on Link Prediction

In this paper, we benchmark several existing graph neural network (GNN) models on different datasets for link prediction. In particular, the graph convolutional network (GCN), GraphSAGE, graph attention network (GAT), and variational graph auto-encoder (VGAE) are implemented specifically for link prediction tasks, in-depth analysis is performed, results from several different papers are replicated, and a fairer and more systematic comparison is provided. Our experiments show that these GNN architectures perform similarly on various benchmarks for link prediction tasks.

preprint, 2021, arXiv

Factorization at Subleading Power and Endpoint Divergences in $h\toγγ$ Decay: II. Renormalization and Scale Evolution

Building on the recent derivation of a bare factorization theorem for the $b$-quark induced contribution to the $h\toγγ$ decay amplitude based on soft-collinear effective theory, we derive the first renormalized factorization theorem for a process described at subleading power in scale ratios, where $λ=m_b/M_h\ll 1$ in our case. We prove two refactorization conditions for a matching coefficient and an operator matrix element in the endpoint region, where they exhibit singularities giving rise to divergent convolution integrals. The refactorization conditions ensure that the dependence of the decay amplitude on the rapidity regulator, which regularizes the endpoint singularities, cancels out to all orders of perturbation theory. We establish the renormalized form of the factorization formula, proving that extra contributions arising from the fact that "endpoint regularization" does not commute with renormalization can be absorbed, to all orders, by a redefinition of one of the matching coefficients. We derive the renormalization-group evolution equations satisfied by all quantities in the factorization formula and use them to predict the large logarithms of order $α\,α_s^2\,L^k$ in the three-loop decay amplitude, where $L=\ln(-M_h^2/m_b^2)$ and $k=6,5,4,3$. We find perfect agreement with existing numerical results for the amplitude and analytical results for the three-loop contributions involving a massless quark loop. On the other hand, we disagree with the results of previous attempts to predict the series of subleading logarithms $\simα\,α_s^n\,L^{2n+1}$.

preprint, 2021, arXiv

Geometry-Aware Fruit Grasping Estimation for Robotic Harvesting in Orchards

Field robotic harvesting is a promising technique in the recent development of the agricultural industry. It is vital for robots to recognise and localise fruits before harvesting in natural orchards. However, the workspace of harvesting robots in orchards is complex: many fruits are occluded by branches and leaves. It is important to estimate a proper grasping pose for each fruit before performing the manipulation. In this study, a geometry-aware network, A3N, is proposed to perform end-to-end instance segmentation and grasping estimation using both color and geometry sensory data from an RGB-D camera. In addition, workspace geometry modelling is applied to assist the robotic manipulation. Moreover, we implement a global-to-local scanning strategy, which enables robots to accurately recognise and retrieve fruits in field environments with two consumer-level RGB-D cameras. We also comprehensively evaluate the accuracy and robustness of the proposed network in experiments. The experimental results show that A3N achieves 0.873 on instance segmentation accuracy, with an average computation time of 35 ms. The average accuracy of grasping estimation is 0.61 cm and 4.8$^{\circ}$ in centre and orientation, respectively. Overall, the robotic system that utilizes the global-to-local scanning and A3N achieves a harvesting success rate ranging from 70% to 85% in field harvesting experiments.

preprint, 2021, arXiv

Invariant-mass distribution of top-quark pairs and top-quark mass determination

We investigate the invariant-mass distribution of top-quark pairs near the $2m_t$ threshold, which has strong impact on the determination of the top-quark mass $m_t$. We show that higher-order non-relativistic corrections lead to large contributions which are not included in the state-of-the-art theoretical predictions. We derive a factorization formula to resum such corrections to all orders in the strong-coupling, and calculate necessary ingredients to perform the resummation at next-to-leading power. We combine the resummation with fixed-order results and present phenomenologically relevant numeric results. We find that the resummation effect significantly enhances the differential cross section in the threshold region, and makes the theoretical prediction more compatible with experimental data. We estimate that using our prediction in the determination of $m_t$ will lead to a value closer to the result of direct measurement.

preprint, 2021, arXiv

Quantum spin Hall effect in two-dimensional transition-metal chalcogenides

Based on first-principles calculations, we have found that a family of 2D transition-metal (TM) chalcogenides MX5 (M = Zr, Hf and X = S, Se and Te) can host the quantum spin Hall (QSH) effect. The molecular dynamics simulations indicate that they are all thermal-dynamically stable at room temperature; the largest band gap is 0.19 eV. We have investigated the electronic properties of MX5 and found that they are very similar. The single-layer ZrX5 are all gapless semimetals without consideration of spin-orbit coupling (SOC). The consideration of SOC results in insulating phases with band gaps of 0.05 eV (direct), 0.18 eV (direct) and 0.13 eV (indirect) for ZrS5, ZrSe5 and ZrTe5, respectively. The evolution of Wannier charge centers and edge states confirms that they are all QSH insulators. The mechanisms for the QSH effect in ZrX5 originate from the special nonsymmorphic space group features. In addition, the QSH state of ZrS5 survives over a large range of strain as long as the interchain coupling is not strong enough to reverse the band ordering. The single-layer ZrS5 undergoes a topological insulator (TI)-to-semimetal (metal) or metal-to-semimetal transition under certain strain. Monolayer MX5 expand the TI materials based on TM chalcogenides and may open up a new way to fabricate novel low power spintronic devices at room temperature.

preprint2021arXiv

Radiative Quark Jet Function with an External Gluon

Factorization theorems in soft-collinear effective theory at subleading order in power counting involve "radiative jet functions", defined in terms of matrix elements of collinear fields with a soft momentum emitted from inside the jet. Of particular importance are the radiative quark jet functions with an external photon or gluon, which arise e.g. in the factorization theorems for the Higgs-boson amplitudes $h\to\gamma\gamma$, $h\to gg$ and $gg\to h$ induced by light-quark loops. While the photon case has been studied extensively in previous work, we present here a detailed study of the radiative jet function with an external gluon. We calculate this jet function at one- and two-loop order, derive its one-loop anomalous dimension and study its renormalization-group evolution.

preprint2021arXiv

Top quark pair production near threshold: single/double distributions and mass determination

We investigate top quark pair production near the threshold where the pair invariant mass $M_{t\bar{t}}$ approaches $2m_t$, which provides sensitive observables to extract the top quark mass $m_t$. Using effective field theory methods, we derive a factorization and resummation formula for kinematic distributions in the threshold limit up to the next-to-leading power, which resums higher order Coulomb corrections to all orders in the strong coupling constant. Our formula is similar to those in the literature but differs in several important aspects. We apply our formula to the $M_{t\bar{t}}$ distribution, as well as to the double differential cross section with respect to $M_{t\bar{t}}$ and the rapidity of the $t\bar{t}$ pair. We find that the resummation effects significantly increase the cross sections near the threshold, and lead to predictions more compatible with experimental data than the fixed-order ones. We demonstrate that incorporating resummation effects in the top quark mass determination can shift the extracted value of $m_t$ by as much as 1.4 GeV. The shift is much larger than the estimated uncertainties in previous experimental studies, and leads to a value of the top quark pole mass more consistent with the current world average.

preprint2020arXiv

Assessing the Bilingual Knowledge Learned by Neural Machine Translation Models

Machine translation (MT) systems translate text between different languages by automatically learning in-depth knowledge of bilingual lexicons, grammar and semantics from the training examples. Although neural machine translation (NMT) has led the field of MT, we have a poor understanding of how and why it works. In this paper, we bridge the gap by assessing the bilingual knowledge learned by NMT models with a phrase table -- an interpretable table of bilingual lexicons. We extract the phrase table from the training examples that an NMT model correctly predicts. Extensive experiments on widely-used datasets show that the phrase table is reasonable and consistent across language pairs and random seeds. Equipped with the interpretable phrase table, we find that NMT models learn patterns from simple to complex and distill essential bilingual knowledge from the training examples. We also revisit some advances that potentially affect the learning of bilingual knowledge (e.g., back-translation), and report some interesting findings. We believe this work opens a new angle to interpret NMT with statistical models, and provides empirical support for recent advances in improving NMT models.

preprint2020arXiv

Controllability and Accessibility on Graphs for Bilinear Systems over Lie Groups

This paper presents graph-theoretic conditions for the controllability and accessibility of bilinear systems over the special orthogonal group, the special linear group and the general linear group, respectively, in the presence of drift terms. Such bilinear systems naturally induce two interaction graphs: one graph from the drift, and another from the controlled dynamics. As a result, the system controllability or accessibility becomes a property of the two graphs in view of the classical Lie algebra rank condition. We establish a systematic way of transforming the Lie bracket operations in the underlying Lie algebra into specific operations of removing or creating links over the drift and controlled interaction graphs. As a result, we establish a series of graphical conditions for the controllability and accessibility of such bilinear systems, which rely only on the connectivity of the union of the drift and controlled interaction graphs. We present examples to illustrate the validity of the established results, and show that the proposed conditions are in fact considerably tight.

preprint2020arXiv

Controlling light absorption of graphene at critical coupling through magnetic dipole quasi-bound states in the continuum resonance

Enhancing the light-matter interaction in two-dimensional (2D) materials with high-$Q$ resonances in photonic structures has boosted the development of optical and photonic devices. Herein, we intend to build a bridge between radiation engineering and bound states in the continuum (BIC), and present a general method to control light absorption at critical coupling through the quasi-BIC resonance. In a single-mode two-port system composed of graphene coupled with silicon nanodisk metasurfaces, the maximum absorption of 0.5 can be achieved when the radiation rate of the magnetic dipole resonance equals the dissipative loss rate of graphene. Furthermore, the absorption bandwidth can be adjusted by more than two orders of magnitude, from 0.9 nm to 94 nm, by simultaneously changing the asymmetric parameter of the metasurfaces, the Fermi level and the layer number of graphene. This work reveals the essential role of BIC in radiation engineering and provides promising strategies for controlling light absorption of 2D materials for next-generation optical and photonic devices, e.g., light emitters, detectors, modulators, and sensors.

preprint2020arXiv

Electroweak Couplings of the Higgs Boson at a Multi-TeV Muon Collider

We estimate the expected precision at a multi-TeV muon collider for measuring the Higgs boson couplings with electroweak gauge bosons, $HVV$ and $HHVV\ (V=W^\pm,Z)$, as well as the trilinear Higgs self-coupling $HHH$. At very high energies both single and double Higgs productions rely on the vector-boson fusion (VBF) topology. The outgoing remnant particles have a strong tendency to stay in the very forward region, leading to the configuration of the "inclusive process" and making it difficult to isolate $ZZ$ fusion events from the $WW$ fusion. In the single Higgs channel, we perform a maximum likelihood analysis on $HWW$ and $HZZ$ couplings using two categories: the inclusive Higgs production and the 1-muon exclusive signal. In the double Higgs channel, we consider the inclusive production and study the interplay of the trilinear $HHH$ and the quartic $VVHH$ couplings, by utilizing kinematic information in the invariant mass spectrum. We find that at a centre-of-mass energy of 10 TeV (30 TeV) with an integrated luminosity of 10 ab$^{-1}$ (90 ab$^{-1}$), one may reach a 95\% confidence level sensitivity of 0.073\% (0.023\%) for $WWH$ coupling, 0.61\% (0.21\%) for $ZZH$ coupling, 0.62\% (0.20\%) for $WWHH$ coupling, and 5.6\% (2.0\%) for $HHH$ coupling. For dim-6 operators contributing to the processes, these sensitivities could probe the new physics scale $Λ$ in the order of $1-10$ ($2-20$) TeV at a 10 TeV (30 TeV) muon collider.

preprint2020arXiv

Hierarchical Representation via Message Propagation for Robust Model Fitting

In this paper, we propose a novel hierarchical representation via message propagation (HRMP) method for robust model fitting, which simultaneously takes advantage of both the consensus analysis and the preference analysis to estimate the parameters of multiple model instances from data corrupted by outliers. Instead of analyzing the information of each data point or each model hypothesis independently, we formulate the consensus information and the preference information as a hierarchical representation to alleviate the sensitivity to gross outliers. Specifically, we first construct a hierarchical representation, which consists of a model hypothesis layer and a data point layer. The model hypothesis layer is used to remove insignificant model hypotheses and the data point layer is used to remove gross outliers. Then, based on the hierarchical representation, we propose an effective hierarchical message propagation (HMP) algorithm and an improved affinity propagation (IAP) algorithm to prune insignificant vertices and cluster the remaining data points, respectively. The proposed HRMP can not only accurately estimate the number and parameters of multiple model instances, but also handle multi-structural data contaminated with a large number of outliers. Experimental results on both synthetic data and real images show that the proposed HRMP significantly outperforms several state-of-the-art model fitting methods in terms of fitting accuracy and speed.

preprint2020arXiv

How Does Selective Mechanism Improve Self-Attention Networks?

Self-attention networks (SANs) with a selective mechanism have produced substantial improvements in various NLP tasks by concentrating on a subset of input words. However, the underlying reasons for their strong performance have not been well explained. In this paper, we bridge the gap by assessing the strengths of selective SANs (SSANs), which are implemented with a flexible and universal Gumbel-Softmax. Experimental results on several representative NLP tasks, including natural language inference, semantic role labelling, and machine translation, show that SSANs consistently outperform the standard SANs. Through well-designed probing experiments, we empirically validate that the improvement of SSANs can be attributed in part to mitigating two commonly-cited weaknesses of SANs: word order encoding and structure modeling. Specifically, the selective mechanism improves SANs by paying more attention to content words that contribute to the meaning of the sentence. The code and data are released at https://github.com/xwgeng/SSAN.

preprint2020arXiv

Renormalization and Scale Evolution of the Soft-Quark Soft Function

Soft functions defined in terms of matrix elements of soft fields dressed by Wilson lines are central components of factorization theorems for cross sections and decay rates in collider and heavy-quark physics. While in many cases the relevant soft functions are defined in terms of gluon operators, at subleading order in power counting soft functions containing quark fields appear. We present a detailed discussion of the properties of the soft-quark soft function consisting of a quark propagator dressed by two finite-length Wilson lines connecting at one point. This function enters in the factorization theorem for the Higgs-boson decay amplitude of the $h\to\gamma\gamma$ process mediated by light-quark loops. We perform the renormalization of this soft function at one-loop order, derive its two-loop anomalous dimension and discuss solutions to its renormalization-group evolution equation in momentum space, in Laplace space and in the "diagonal space", where the evolution is strictly multiplicative.

preprint2020arXiv

Sharp endpoint estimates for eigenfunctions restricted to submanifolds of codimension 2

Burq-Gérard-Tzvetkov and Hu established $L^p$ estimates ($2\le p\le \infty$) for the restriction of eigenfunctions to submanifolds. The estimates are sharp, except for the log loss at the endpoint $L^2$ estimates for submanifolds of codimension 2. It has long been believed that the log loss at the endpoint can be removed in general, while the problem is still open. So this paper is devoted to the study of sharp endpoint restriction estimates for eigenfunctions in this case. Chen and Sogge removed the log loss for the geodesics on 3-dimensional manifolds. In this paper, we generalize their result to higher dimensions and prove that the log loss can be removed for totally geodesic submanifolds of codimension 2. Moreover, on 3-dimensional manifolds, we can remove the log loss for curves with nonvanishing geodesic curvatures, and more general finite type curves. The problem in 3D is essentially related to Hilbert transforms along curves in the plane and a class of singular oscillatory integrals studied by Phong-Stein, Ricci-Stein, Pan, Seeger, Carbery-Pérez.

preprint2020arXiv

The Search for Electroweakinos

In this review, we consider a general theoretical framework for fermionic color-singlet states, including a singlet, a doublet and a triplet under the standard model SU(2)$_{\rm L}$ gauge symmetry, corresponding to the Bino, Higgsino and Wino in Supersymmetric theories, generically dubbed as "electroweakinos" for their mass eigenstates. Depending on the relations among their three mass parameters and the mixings after the electroweak symmetry breaking, this sector leads to rich phenomenology potentially accessible at the current and near-future experiments. We discuss the decay patterns of the electroweakinos and their observable signatures at colliders. We review the existing bounds on the model parameters. We summarize the current status for the comprehensive searches from the ATLAS and CMS experiments at the LHC. We comment on the prospects for future colliders. An important feature of the theory is that the lightest neutral electroweakino can be identified as a WIMP cold dark matter candidate. We take into account the existing bounds on the parameters from the dark matter direct detection experiments and discuss the complementarity for the electroweakino searches at colliders.

preprint2020arXiv

Top-quark pair production at complete-NLO accuracy with NNLO+NNLL$'$ corrections in QCD

We describe predictions for top-quark pair differential distributions at hadron colliders, which combine state-of-the-art NNLO QCD calculations and NLO electroweak corrections together with double resummation at NNLL$'$ accuracy of threshold logarithms and small-mass logarithms. This is the first time that such a combination has appeared in the literature. Numerical results are presented for the invariant-mass distribution, the transverse-momentum distribution as well as rapidity distributions.

preprint2019arXiv

Transverse Parton Distribution and Fragmentation Functions at NNLO: the Quark Case

We revisit the calculation of perturbative quark transverse momentum dependent parton distribution functions and fragmentation functions using the exponential regulator for rapidity divergences. We show that the exponential regulator provides a consistent framework for the calculation of various ingredients in transverse momentum dependent factorization. Compared to existing regulators in the literature, the exponential regulator has a couple of advantages which we explain in detail. As a result, the calculation is greatly simplified and we are able to obtain the next-to-next-to-leading order results up to $\mathcal{O}(ε^2)$ in dimensional regularization. These terms are necessary for a higher order calculation which is made possible with the simplification brought by the new regulator. As a by-product, we have obtained the two-loop quark jet function for the Energy-Energy Correlator in the back-to-back limit, which is the last missing ingredient for its N$^3$LL resummation.