Trust snapshot

Quick read

Trust 21 - Emerging; Verification L1; Unclaimed author
46 works
0 followers
29 topics
4 close collaborators

Published work

46 published item(s)

preprint 2026 arXiv

Bridging Passive and Active: Enhancing Conversation Starter Recommendation via Active Expression Modeling

Large Language Model (LLM)-driven conversational search is shifting information retrieval from reactive keyword matching to proactive, open-ended dialogues. In this context, Conversation Starters are widely deployed to provide personalized query recommendations that help users initiate dialogues. Conventionally, recommending these starters relies on a closed "exposure-click" loop. Yet, this feedback loop mechanism traps the system in an echo chamber where, compounded by data sparsity, it fails to capture the dynamic nature of conversational search intents shaped by the open world. As a result, the system skews towards popular but generic suggestions. In this work, we uncover an untapped paradigm shift to shatter this harmful feedback loop: harnessing user "free will" through active user expressions. Unlike traditional recommendations, conversational search empowers users to bypass menus entirely through manually typed queries. The open-world intents in active queries hold the key to breaking this loop. However, incorporating them is non-trivial: (1) there exists an inherent distribution shift between active queries and formulated starters. (2) Furthermore, the "non-ID-able" nature of open text renders traditional item-based popularity statistics ineffective for large-scale industrial streaming training. To this end, we propose Passive-Active Bridge (PA-Bridge), a novel framework that employs an adversarial distribution aligner to bridge the distributional gap between passively recommended starters and active expressions. Moreover, we introduce a semantic discretizer to enable the deployment of popularity debiasing algorithms. Online A/B tests on our platform demonstrate that PA-Bridge significantly boosts the Feature Penetration Rate by 0.54% and User Active Days

preprint 2026 arXiv

Earth-o1: A Grid-free Observation-native Atmospheric World Model

Despite the unprecedented volume of multimodal data provided by modern Earth observation systems, our ability to model atmospheric dynamics remains constrained. Traditional modeling frameworks force heterogeneous measurements into predefined spatial grids, inherently limiting the full exploitation of raw sensor data and creating severe computational bottlenecks. Here we present Earth-o1, an observation-native atmospheric world model that overcomes these structural limitations. Rather than relying on conventional atmospheric dynamical modeling systems or traditional data assimilation, Earth-o1 directly learns the continuous, three-dimensional physical evolution of the Earth system from ungridded observational data. By integrating diverse sensor inputs into a unified, grid-free dynamical field, the model autonomously advances the atmospheric state in space and time. We show that this fundamentally distinct paradigm enables direct, real-time forecasting and cross-sensor inference without the overhead of explicit numerical solvers. In hindcast evaluations, Earth-o1 achieves surface forecast skill comparable to the operational Integrated Forecasting System (IFS). These results establish that continuous, observation-driven world models -- a new class of fully observation-native geophysical simulators -- can match the fidelity of established physical frameworks, providing a scalable data-driven foundation for a digital twin of the Earth.

preprint 2026 arXiv

Efficient Paths and Dense Rewards: Probabilistic Flow Reasoning for Large Language Models

High-quality chain-of-thought has demonstrated strong potential for unlocking the reasoning capabilities of large language models. However, current paradigms typically treat the reasoning process as an indivisible sequence, lacking an intrinsic mechanism to quantify step-wise information gain. This granularity gap manifests in two limitations: inference inefficiency from redundant exploration without explicit guidance, and optimization difficulty due to sparse outcome supervision or costly external verifiers. In this work, we propose CoT-Flow, a framework that reconceptualizes discrete reasoning steps as a continuous probabilistic flow, quantifying the contribution of each step toward the ground-truth answer. Built on this formulation, CoT-Flow enables two complementary methodologies: flow-guided decoding, which employs a greedy flow-based decoding strategy to extract information-efficient reasoning paths, and flow-based reinforcement learning, which constructs a verifier-free dense reward function. Experiments on challenging benchmarks demonstrate that CoT-Flow achieves a superior balance between inference efficiency and reasoning performance.

preprint 2026 arXiv

Evolving-RL: End-to-End Optimization of Experience-Driven Self-Evolving Capability within Agents

Experience-driven self-evolving agents aim to overcome the static nature of large language models by distilling reusable experience from past interactions, thus enabling adaptation to novel tasks at deployment time. This process places substantial demands on the foundation model's capacities for abstraction, generalization, and in-context learning. However, most existing studies focus primarily on system-level design choices, such as how experience is represented and managed, neglecting the inherent capabilities of the underlying model. While some recent works have started to optimize the experience utilization stage via reinforcement learning, they still fail to treat self-evolution as a unified process to be jointly optimized. To this end, we propose Evolving-RL, an efficient algorithmic framework that jointly improves the experience extraction and utilization capabilities required for self-evolution. Specifically, we center the learning process on experience extraction and evaluation, using the two supervisory signals derived from evaluation to optimize the extractor and solver separately and thus enable their coordinated co-evolution. Experiments on ALFWorld and Mind2Web show that Evolving-RL effectively enhances LLMs' ability to extract and reuse experience, leading to strong performance gains on out-of-distribution tasks (up to 98.7% relative improvement over the GRPO baseline on ALFWorld unseen tasks and 35.8% on Mind2Web), and these gains are fully unlocked only through the coordinated co-evolution of experience extraction and utilization. Furthermore, Evolving-RL inherently functions as an experience-augmented RL algorithm. By internalizing reusable experience patterns directly into model parameters, it achieves remarkable performance gains over standard baselines on both seen and unseen tasks, even in the absence of test-time experience accumulation.

preprint 2026 arXiv

LRCP: Low-Rank Compressibility Guided Visual Token Pruning for Efficient LVLMs

Large vision-language models (LVLMs) achieve strong multimodal understanding, but their inference cost grows rapidly with the number of visual tokens, especially for high-resolution images and long videos. Existing attention-based methods estimate token importance from attention scores, which may introduce positional bias, while representation-based methods reduce visual redundancy based on feature relations or reconstruction errors, overlooking the global structure of the visual token set. In this paper, we revisit visual token compression from the perspective of low-rank compressibility. Across models and datasets, we observe that visual token representations exhibit a pronounced low-rank structure, with a dominant subspace that remains stable even after a large fraction of tokens is randomly removed. Motivated by this finding, we propose LRCP, a training-free compression framework that first estimates the dominant low-rank subspace of visual tokens via PCA, and then scores each token by its projection residual onto this subspace, retaining tokens that are poorly explained by the low-rank background. Extensive experiments show that LRCP achieves superior results, preserving 94.7% of the original image-understanding performance with an 88.9% token reduction and 97.8% of the average video-understanding accuracy with an 87.5% token reduction.
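The scoring idea in this abstract (estimate a dominant subspace with PCA, then rank tokens by projection residual) can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the function name `lowrank_residual_prune`, the mean-centering step, the fixed `rank`, the top-`keep` selection rule and the synthetic data are all assumptions made here.

```python
import numpy as np


def lowrank_residual_prune(tokens, rank, keep):
    """Score tokens by their residual from a rank-`rank` PCA subspace.

    tokens: array of shape (n_tokens, dim).
    Returns (sorted indices of the `keep` highest-residual tokens, residuals).
    """
    # Center the token set (an assumption of this sketch).
    centered = tokens - tokens.mean(axis=0, keepdims=True)
    # PCA via SVD: the rows of vt are the principal directions.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:rank]                                # (rank, dim)
    projected = centered @ basis.T @ basis           # part explained by the subspace
    residual = np.linalg.norm(centered - projected, axis=1)
    # Keep the tokens that the low-rank background explains worst.
    keep_idx = np.sort(np.argsort(residual)[-keep:])
    return keep_idx, residual


# Synthetic check: a 2-D "background" plus three tokens that leave the subspace.
rng = np.random.default_rng(0)
tokens = np.zeros((64, 16))
tokens[:, :2] = rng.normal(size=(64, 2))
for row, col in [(5, 5), (20, 6), (40, 7)]:
    tokens[row, col] = 3.0

kept, residual = lowrank_residual_prune(tokens, rank=2, keep=3)
print(kept)
```

In this toy setup the retained indices should be the three tokens that carry an off-subspace component, which is the behaviour the abstract describes as keeping tokens "poorly explained by the low-rank background".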

preprint 2026 arXiv

Quantum Dynamics Simulation of the Advection-Diffusion Equation

The advection-diffusion equation is simulated on a superconducting quantum computer via several quantum algorithms. Three formulations are considered: (1) Trotterization, (2) variational quantum time evolution (VarQTE), and (3) adaptive variational quantum dynamics simulation (AVQDS). These schemes were originally developed for the Hamiltonian simulation of many-body quantum systems. The finite-difference discretized operator of the transport equation is formulated as a Hamiltonian and solved without the need for ancillary qubits. Computations are conducted on a quantum simulator (IBM Qiskit Aer) and on actual quantum hardware (IBM Fez). The former emulates the latter without the noise. The predicted results are compared with direct numerical simulation (DNS) data with infidelities of the order $10^{-5}$. In the quantum simulator, Trotterization is observed to have the lowest infidelity and is suitable for fault-tolerant computation. The AVQDS algorithm requires the lowest gate count and the lowest circuit depth. The VarQTE algorithm is the next best in terms of gate counts, but the number of its optimization variables is directly proportional to the number of qubits. Due to current hardware limitations, Trotterization cannot be implemented, as it has an overwhelmingly large number of operations. Meanwhile, AVQDS and VarQTE can be executed, but suffer from large errors due to significant hardware noise. These algorithms present a new paradigm for computational transport phenomena on quantum computers.

preprint 2026 arXiv

RAVE: Re-Allocating Visual Attention in Large Multimodal Models

Large multimodal models (LMMs) inherit the self-attention mechanism of pretrained language backbones, yet standard attention can exhibit suboptimal allocation, including cross-modal misallocation between textual and visual evidence and intra-visual imbalance among visual tokens. We propose RAVE (Re-Allocating Visual Attention), a lightweight pair-gating mechanism that adds a learned query--key bias to pre-softmax attention scores over visual keys, derived from pre-RoPE query and key features. RAVE requires no architectural modification to the backbone and can be trained end-to-end with the rest of the model. Across a suite of multimodal benchmarks, RAVE improves over standard attention by an average of 3 points, with the largest gains on perception-intensive tasks -- including multilingual OCR, chart understanding, document VQA, and scene text VQA -- where accurate visual grounding is critical.

preprint 2026 arXiv

Revisiting Reinforcement Learning with Verifiable Rewards from a Contrastive Perspective

RLVR has become a widely adopted paradigm for improving LLMs' reasoning capabilities, and GRPO is one of its most representative algorithms. In this paper, we first show that GRPO admits an equivalent discriminative reformulation as a weighted positive-negative score difference. Under this view, GRPO increases sequence-level scores of verified positive rollouts and decreases those of negative rollouts, where the scores are averages of clipped token-level importance sampling ratios. This reformulation reveals two structural limitations of GRPO: likelihood-misaligned scoring, where clipped ratio-based surrogate scores are optimized instead of generation likelihoods, and score-insensitive credit assignment, where rollout-level credit is assigned without accounting for relative score gaps between positive and negative rollouts in the same group. To address these limitations, we propose ConSPO, a framework for Contrastive Sequence-level Policy Optimization in RLVR. ConSPO replaces GRPO's clipped ratio-based scores with length-normalized sequence log-probabilities, aligning the optimized rollout scores with the likelihoods used in autoregressive generation. It then optimizes a group-wise InfoNCE-style objective that contrasts each positive rollout against negative distractors from the same group, enabling credit assignment to depend on their relative scores. This contrastive formulation amplifies updates for poorly separated positives while concentrating suppressive updates on high-scoring negatives. Moreover, ConSPO introduces a curriculum-scheduled margin, guiding optimization from coarse positive-negative ordering in early training toward stronger separation in later stages. Extensive evaluations across diverse backbone models, parameter scales, and training datasets show that ConSPO consistently outperforms several strong RLVR baselines on challenging mathematical reasoning benchmarks.
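To make the contrastive view concrete, the following is a rough sketch of a group-wise InfoNCE-style loss over length-normalized sequence log-probabilities. It is not the ConSPO objective as published: the function name `group_contrastive_loss`, the `temperature` parameter, the way `margin` is subtracted from the positive score, and the handling of groups without positives or negatives are assumptions made for illustration only.

```python
import numpy as np


def group_contrastive_loss(token_logps, is_positive, temperature=1.0, margin=0.0):
    """InfoNCE-style loss for one rollout group (illustrative sketch).

    token_logps: list of per-token log-probability lists, one per rollout.
    is_positive: list of booleans, True for verified-correct rollouts.
    """
    # Length-normalized sequence score: mean token log-probability.
    scores = np.array([np.mean(lp) for lp in token_logps])
    mask = np.asarray(is_positive, dtype=bool)
    pos, neg = scores[mask], scores[~mask]
    if pos.size == 0 or neg.size == 0:
        return 0.0  # nothing to contrast in this group (assumed convention)

    losses = []
    for s in pos:
        # Each positive is contrasted against all negatives of the same group.
        logits = np.concatenate(([s - margin], neg)) / temperature
        shifted = logits - logits.max()  # numerically stable log-sum-exp
        log_denominator = np.log(np.exp(shifted).sum()) + logits.max()
        losses.append(log_denominator - logits[0])
    return float(np.mean(losses))


well_separated = group_contrastive_loss(
    [[-0.1, -0.2], [-2.0, -2.5, -3.0], [-1.8, -2.2]], [True, False, False]
)
poorly_separated = group_contrastive_loss(
    [[-2.0, -2.4], [-0.1, -0.3, -0.2], [-1.8, -2.2]], [True, False, False]
)
print(well_separated, poorly_separated)
```

Under this form, a positive rollout that scores below its negatives produces a larger loss than one that is already well separated, and raising `margin` increases the loss for a fixed group, which mirrors the qualitative behaviour described in the abstract.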

preprint 2026 arXiv

StellarF: A Physics-Informed LoRA Framework for Stellar Flare Forecasting with Historical & Statistical Data

Stellar flare forecasting represents a critical frontier in astrophysics, offering profound insights into stellar activity mechanisms and exoplanetary habitability assessments. Yet the inherent unpredictability of flare activity, rooted in stellar diversity and evolutionary stages, underpins the field's core challenges: (1) sparse, incomplete, noisy lightcurve data from traditional observations; (2) ineffective multi-scale flare evolution capture via single representations; (3) poor physical interpretability in data-driven models lacking physics-informed priors. To address these challenges, we propose StellarF, a physics-informed framework synergizing general AI with astrophysical domain knowledge via three core components: a unified preprocessing pipeline for lightcurve refinement (missing-value imputation, temporal patch partitioning, adaptive sample filtering); a Low-Rank Adaptation (LoRA)-finetuned large language model (LLM) backbone enhanced by first-order difference augmentation, flare statistical information, and flare historical record modules for multimodal fusion instead of only simple representations; and a novel physics-informed loss embedding a minimum rising rate prior, appended to the cross-entropy loss, to align with flare physics. Extensive experiments on Kepler and TESS datasets show StellarF achieves state-of-the-art performance across key metrics, setting new benchmarks for flare forecasting. This work bridges general AI with astrophysics, offering a practical, physically interpretable paradigm for transient event forecasting in time-domain astronomy.

preprint 2026 arXiv

Toward Scalable Terminal Task Synthesis via Skill Graphs

Terminal agents have demonstrated strong potential for autonomous command-line execution, yet their training remains constrained by the scarcity of high-quality and diverse execution trajectories. Existing approaches mitigate this bottleneck by synthesizing large-scale terminal task instances for trajectory sampling. However, they primarily focus on scaling the number of tasks while providing limited control over the diversity of execution trajectories that agents actually experience during training. In this paper, we present SkillSynth, an automated framework for terminal task synthesis built on a scenario-mediated skill graph. SkillSynth first constructs a large-scale skill graph, where scenarios serve as intermediate transition nodes that connect diverse command-line skills. It then samples paths from this graph as abstractions of real-world workflows, and uses a multi-agent harness to instantiate them into executable task instances. By grounding task synthesis in graph-sampled workflow paths, SkillSynth explicitly controls the diversity of minimal execution trajectories required to solve the synthesized tasks. Experiments on Terminal-Bench demonstrate the effectiveness of SkillSynth. Moreover, task instances synthesized by SkillSynth have been adopted to train Hy3 Preview, contributing to its enhanced agentic capabilities in terminal-based settings.

preprint 2026 arXiv

UserLM-R1: Modeling Human Reasoning in User Language Models with Multi-Reward Reinforcement Learning

User simulators serve as the critical interactive environment for agent post-training, and an ideal user simulator generalizes across domains and proactively engages in negotiation by challenging or bargaining. However, current methods exhibit two issues. They rely on static and context-unaware profiles, necessitating extensive manual redesign for new scenarios, thus limiting generalizability. Moreover, they neglect human strategic thinking, leading to vulnerability to agent manipulation. To address these issues, we propose UserLM-R1, a novel user language model with reasoning capability. Specifically, we first construct comprehensive user profiles with both static roles and dynamic scenario-specific goals for adaptation to diverse scenarios. Then, we propose a goal-driven decision-making policy to generate high-quality rationales before producing responses, and further refine the reasoning and improve strategic capabilities with supervised fine-tuning and multi-reward reinforcement learning. Extensive experimental results demonstrate that UserLM-R1 outperforms competitive baselines, particularly on the more challenging adversarial set.

preprint 2025 arXiv

Ab initio superionic-liquid phase diagram of Fe1-xOx under Earth's inner core conditions

The superionic state is a phase of matter in which liquid-like ionic mobility coexists with a solid crystalline lattice. Recently identified in Earth's inner core (IC), this state has attracted considerable attention for its unique kinetic behavior and geophysical implications. However, the ab initio phase diagram describing the equilibrium between the superionic phase and the liquid solution under core conditions remains largely unexplored. Here, we present a thermodynamic approach to compute the Gibbs free energy and construct the ab initio superionic-liquid phase diagram for the Fe1-xOx system under IC conditions. We find that oxygen forms superionic states in both hcp and bcc Fe phases, with a pronounced influence on cooperative diffusion of iron in the bcc lattice. The stability fields of these superionic phases are sensitive to oxygen stoichiometry. The presence of superionic states leads to a higher oxygen concentration in the IC than previously estimated. Our work establishes a framework for investigating superionic-liquid equilibria under extreme conditions.

preprint 2025 arXiv

SHIELD: Spherical-Projection Hybrid-Frontier Integration for Efficient LiDAR-based Drone Exploration

This paper introduces SHIELD, a Spherical-Projection Hybrid-Frontier Integration for Efficient LiDAR-based Drone exploration method. Although LiDAR offers the advantage of a wide field of view, its application in UAV exploration still faces several challenges. The observation quality of LiDAR point clouds is generally inferior to that of depth cameras. Traditional frontier methods based on known and unknown regions impose a heavy computational burden, especially when handling the wide field of view of LiDAR. In addition, regions without point clouds are also difficult to classify as free space through ray-casting. To address these problems, SHIELD is proposed. It maintains an observation-quality occupancy map and performs ray-casting on this map to address the issue of inconsistent point-cloud quality during exploration. A hybrid frontier method is used to tackle both the computational burden and the limitations of point-cloud quality exploration. In addition, an outward spherical-projection ray-casting strategy is proposed to jointly ensure flight safety and exploration efficiency in open areas. Simulations and flight experiments prove the effectiveness of SHIELD. This work will be open-sourced to contribute to the research community.

preprint 2025 arXiv

Ultrahigh-Energy Gamma-ray Emission Associated with Black Hole-Jet Systems

Black holes (BHs), among the most intriguing objects in the universe, can manifest themselves through electromagnetic radiation initiated by the accretion flow. Some stellar-mass BHs drive relativistic jets when accreting matter from their companion stars, forming microquasars. Non-thermal emission from the radio to tera-electronvolt (TeV) gamma-ray band has been observed from microquasars, indicating the acceleration of relativistic particles. Here we report detection of four microquasars (SS 433, V4641 Sgr, GRS 1915+105, MAXI J1820+070) with spectra extending to the ultrahigh-energy (UHE; photon energy $E>100$ TeV) band and one microquasar (Cygnus X-1) with a spectrum approaching 100 TeV, using the Large High Altitude Air Shower Observatory (LHAASO). Notably, the total emission associated with SS 433 cannot be interpreted with a single leptonic component. In the UHE band, its emission is in spatial coincidence with a giant atomic cloud, which is consistent with a hadronic origin. An elongated source is discovered from V4641 Sgr with the spectrum continuing up to 800 TeV. The detection of UHE gamma rays demonstrates that accreting BHs and their environments can operate as extremely efficient accelerators of particles out of 1 peta-electronvolt (PeV), suggesting microquasars to be important contributors to Galactic cosmic rays especially around the 'knee' region.

preprint 2024 arXiv

Lookup Table meets Local Laplacian Filter: Pyramid Reconstruction Network for Tone Mapping

Tone mapping aims to convert high dynamic range (HDR) images to low dynamic range (LDR) representations, a critical task in the camera imaging pipeline. In recent years, 3-Dimensional LookUp Table (3D LUT) based methods have gained attention due to their ability to strike a favorable balance between enhancement performance and computational efficiency. However, these methods often fail to deliver satisfactory results in local areas since the look-up table is a global operator for tone mapping, which works based on pixel values and fails to incorporate crucial local information. To this end, this paper aims to address this issue by exploring a novel strategy that integrates global and local operators by utilizing closed-form Laplacian pyramid decomposition and reconstruction. Specifically, we employ image-adaptive 3D LUTs to manipulate the tone in the low-frequency image by leveraging the specific characteristics of the frequency information. Furthermore, we utilize local Laplacian filters to refine the edge details in the high-frequency components in an adaptive manner. Local Laplacian filters are widely used to preserve edge details in photographs, but their conventional usage involves manual tuning and fixed implementation within camera imaging pipelines or photo editing tools. We propose to learn parameter value maps progressively for local Laplacian filters from annotated data using a lightweight network. Our model achieves simultaneous global tone manipulation and local edge detail preservation in an end-to-end manner. Extensive experimental results on two benchmark datasets demonstrate that the proposed method performs favorably against state-of-the-art methods.

preprint 2023 arXiv

Benchmarks and results of the two-band Hubbard model from the Gutzwiller conjugate gradient minimization theory

Ground-state properties, such as energies and double occupancies, of a one-dimensional two-band Hubbard model are calculated using a first principles Gutzwiller conjugate gradient minimization theory. The favorable agreement with the results from the density matrix renormalization group theory demonstrates the accuracy of our method. A rotationally invariant approach is further incorporated into the method to greatly reduce the computational complexity with a speedup of 300 times. Moreover, we investigate the Mott transition between a metal and a Mott insulator by evaluating the charge gap. With greatly reduced computational effort, our method reproduces the phase diagram in reasonable agreement with the density matrix renormalization group theory.

preprint 2022 arXiv

A new approach for amplitudes with multiple fermion lines

A new approach for tree-level amplitudes with multiple fermion lines is presented. It mainly focuses on the simplification of fermion lines. By calculating two vectors recursively without any matrix multiplications, the result of a fermion line is reduced to a very compact form depending only on the two vectors. The comparisons with other packages are presented, and the results show that our package FDC gives a very good performance in the processes of multiple fermion lines with this new approach and some other improvements. A further comparison with WHIZARD shows that this new approach has a competitive efficiency in computing pure amplitude square without phase space integration.

preprint 2022 arXiv

A Simple Meta-learning Paradigm for Zero-shot Intent Classification with Mixture Attention Mechanism

Zero-shot intent classification is a vital and challenging task in dialogue systems, which aims to deal with numerous fast-emerging unacquainted intents without annotated training data. To obtain more satisfactory performance, the crucial points lie in two aspects: extracting better utterance features and strengthening the model generalization ability. In this paper, we propose a simple yet effective meta-learning paradigm for zero-shot intent classification. To learn better semantic representations for utterances, we introduce a new mixture attention mechanism, which encodes the pertinent word occurrence patterns by leveraging the distributional signature attention and multi-layer perceptron attention simultaneously. To strengthen the transfer ability of the model from seen classes to unseen classes, we reformulate zero-shot intent classification with a meta-learning strategy, which trains the model by simulating multiple zero-shot classification tasks on seen categories, and promotes the model generalization ability with a meta-adapting procedure on mimic unseen categories. Extensive experiments on two real-world dialogue datasets in different languages show that our model outperforms other strong baselines on both standard and generalized zero-shot intent classification tasks.

preprint 2022 arXiv

Adaptive variational quantum eigensolvers for highly excited states

Highly excited states of quantum many-body systems are central objects in the study of quantum dynamics and thermalization that challenge classical computational methods due to their volume-law entanglement content. In this work, we explore the potential of variational quantum algorithms to approximate such states. We propose an adaptive variational algorithm, adaptive VQE-X, that self-generates a variational ansatz for arbitrary eigenstates of a many-body Hamiltonian $H$ by attempting to minimize the energy variance with respect to $H$. We benchmark the method by applying it to an Ising spin chain with integrable and nonintegrable regimes, where we calculate various quantities of interest, including the total energy, magnetization density, and entanglement entropy. We also compare the performance of adaptive VQE-X to an adaptive variant of the folded-spectrum method. For both methods, we find a strong dependence of the algorithm's performance on the choice of operator pool used for the adaptive construction of the ansatz. In particular, an operator pool including long-range two-body gates accelerates the convergence of both algorithms in the nonintegrable regime. We also study the scaling of the number of variational parameters with system size, finding that an exponentially large number of parameters may be necessary to approximate individual highly excited states. Nevertheless, we argue that these methods lay a foundation for the use of quantum algorithms to study finite-energy-density properties of many-body systems.
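The cost function named in this abstract, the energy variance ⟨H²⟩ − ⟨H⟩², vanishes exactly on eigenstates and is positive otherwise, which is why minimizing it can target arbitrary (not only ground) eigenstates. The NumPy sketch below checks this on a two-site transverse-field Ising Hamiltonian. The two-site model, the field value `g = 0.5` and the helper names are assumptions chosen for illustration; no adaptive ansatz construction is attempted here.

```python
import numpy as np

X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.diag([1.0, -1.0])
I2 = np.eye(2)


def ising_two_sites(g):
    """H = -Z1 Z2 - g (X1 + X2), built with Kronecker products."""
    return -np.kron(Z, Z) - g * (np.kron(X, I2) + np.kron(I2, X))


def energy_variance(h, state):
    """Return <H^2> - <H>^2 for a (not necessarily normalized) state."""
    state = state / np.linalg.norm(state)
    mean = np.vdot(state, h @ state).real
    mean_sq = np.vdot(state, h @ (h @ state)).real
    return mean_sq - mean**2


h = ising_two_sites(0.5)
energies, vectors = np.linalg.eigh(h)

excited = vectors[:, 2]                     # an excited eigenstate
mixed = vectors[:, 1] + vectors[:, 2]       # superposition of two eigenstates

var_excited = energy_variance(h, excited)
var_mixed = energy_variance(h, mixed)
print(var_excited, var_mixed)
```

The variance is numerically zero for the excited eigenstate and strictly positive for the superposition, so a variational optimizer driven by this quantity is pulled toward some eigenstate rather than specifically toward the lowest one.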

preprint 2022 arXiv

Electron-phonon coupling strength from ab initio frozen-phonon approach

We propose a fast method for high-throughput screening of potential superconducting materials. The method is based on calculating metallic screening of zone-center phonon modes, which provides an accurate estimate for the electron-phonon coupling strength. This method is complementary to the recently proposed Rigid Muffin Tin (RMT) method, which amounts to integrating the electron-phonon coupling over the entire Brillouin zone (as opposed to the zone center), but in a relatively inferior approximation. We illustrate the use of this method by applying it to MgB$_\text{2}$, where the high-temperature superconductivity is known to be driven largely by the zone-center modes, and compare it to a sister compound AlB$_\text{2}$. We further illustrate the usage of this descriptor by screening a large number of binary hydrides, for which accurate first-principles calculations of electron-phonon coupling have been recently published. Together with the RMT descriptor, this method opens a way to perform initial high-throughput screening in search of conventional superconductors via machine learning or data mining.

preprint 2022 arXiv

Fermi Level Depinning in Two-Dimensional Materials Using a Fluorinated Bilayer Graphene Barrier

Strong Fermi level pinning (FLP) - often attributed to metal-induced gap states at the interfacial contacts - severely reduces the tunability of the Schottky barrier height of the junction and limits applications of the 2D materials in electronics and optoelectronics. Here, we show that fluorinated bilayer graphene (FBLG) can be used as a barrier to effectively prevent FLP at metal/2D materials interfaces. FBLG can be produced via short exposure (1-3 min) to SF6 plasma that fluorinates only the top layer of a bilayer graphene with covalent C-F bonding, while the bottom layer remains intrinsic, resulting in a band gap opening of about 75 meV. Inserting FBLG between the metallic contacts and a layer of MoS2 reduces the Schottky barrier height dramatically for the low-work function metals (313 and 260 meV for Ti and Cr, respectively) while it increases for the high-work function one (160 meV for Pd), corresponding to an improved pinning factor. Our results provide a straightforward method to generate atomically thin dielectrics with applications not only for depinning the Fermi level at metal/transition metal dichalcogenide (TMD) interfaces but also for solving many other problems in electronics and optoelectronics.

preprint 2022 arXiv

High-throughput screening of strong electron-phonon couplings in ternary metal diborides

We perform a high-throughput screening on phonon-mediated superconductivity in ternary metal diboride structure with alkali, alkaline earth, and transition metals. We find 17 ground states and 78 low-energy metastable phases. From fast calculations of zone-center electron-phonon coupling, 43 compounds are revealed to show electron-phonon coupling strength higher than that of MgB2. An anti-correlation between energetic stability and electron-phonon coupling strength is identified. We suggest two phases, i.e., Li3ZrB8 and Ca3YB8, to be synthesized, which show reasonable energetic stability and superconducting critical temperature.

preprint2022arXiv

Label-enhanced Prototypical Network with Contrastive Learning for Multi-label Few-shot Aspect Category Detection

Multi-label aspect category detection allows a given review sentence to contain multiple aspect categories, which has been shown to be more practical in sentiment analysis and is attracting increasing attention. As annotating large amounts of data is time-consuming and labor-intensive, data scarcity occurs frequently in real-world scenarios, which motivates multi-label few-shot aspect category detection. However, research on this problem is still in its infancy and few methods are available. In this paper, we propose a novel label-enhanced prototypical network (LPN) for multi-label few-shot aspect category detection. The highlights of LPN can be summarized as follows. First, it leverages label descriptions as auxiliary knowledge to learn more discriminative prototypes, which can retain aspect-relevant information while eliminating the harmful effect caused by irrelevant aspects. Second, it integrates contrastive learning, which pulls sentences with the same aspect label together in the embedding space while simultaneously pushing apart sentences with different aspect labels. In addition, it introduces an adaptive multi-label inference module to predict the aspect count in the sentence, which is simple yet effective. Extensive experimental results on three datasets demonstrate that our proposed model LPN can consistently achieve state-of-the-art performance.

preprint2022arXiv

Probing Visual-Audio Representation for Video Highlight Detection via Hard-Pairs Guided Contrastive Learning

Video highlight detection is a crucial yet challenging problem that aims to identify the interesting moments in untrimmed videos. The key to this task lies in effective video representations that jointly pursue two goals, i.e., cross-modal representation learning and fine-grained feature discrimination. In this paper, these two challenges are tackled by not only enriching intra-modality and cross-modality relations for representation modeling but also shaping the features in a discriminative manner. Our proposed method mainly leverages intra-modality encoding and cross-modality co-occurrence encoding for full representation modeling. Specifically, intra-modality encoding augments the modality-wise features and dampens irrelevant modalities via within-modality relation learning in both audio and visual signals. Meanwhile, cross-modality co-occurrence encoding focuses on the co-occurrence inter-modality relations and selectively captures effective information across modalities. The multi-modal representation is further enhanced by the global information abstracted from the local context. In addition, we enlarge the discriminative power of the feature embedding with a hard-pairs guided contrastive learning (HPCL) scheme. A hard-pairs sampling strategy is further employed to mine the hard samples for improving feature discrimination in HPCL. Extensive experiments conducted on two benchmarks demonstrate the effectiveness and superiority of our proposed methods compared to other state-of-the-art methods.

preprint2022arXiv

Pyramid Region-based Slot Attention Network for Temporal Action Proposal Generation

It has been found that temporal action proposal generation, which aims to discover the temporal action instances within the range of the start and end frames in untrimmed videos, can largely benefit from proper temporal and semantic context exploitation. The latest efforts were dedicated to considering the temporal context and similarity-based semantic contexts through self-attention modules. However, they still suffer from cluttered background information and limited contextual feature learning. In this paper, we propose a novel Pyramid Region-based Slot Attention (PRSlot) module to address these issues. Instead of using the similarity computation, our PRSlot module directly learns the local relations in an encoder-decoder manner and generates a representation of a local region, enhanced by attention over the input features, called a slot. Specifically, upon the input snippet-level features, the PRSlot module takes the target snippet as the query and its surrounding region as the key, and then generates slot representations for each query-key slot by aggregating the local snippet context with a parallel pyramid strategy. Based on PRSlot modules, we present a novel Pyramid Region-based Slot Attention Network termed PRSA-Net to learn a unified visual representation with rich temporal and semantic context for better proposal generation. Extensive experiments are conducted on two widely adopted benchmarks, THUMOS14 and ActivityNet-1.3. Our PRSA-Net outperforms other state-of-the-art methods. In particular, we improve the AR@100 from the previous best 50.67% to 56.12% for proposal generation and raise the mAP under 0.5 tIoU from 51.9% to 58.7% for action detection on THUMOS14. Code is available at https://github.com/handhand123/PRSA-Net

preprint2022arXiv

Structure and motifs of iron oxides from 1 to 3 TPa

Iron oxides are fundamental components of planet-forming materials. Understanding the Fe-O system's behavior and properties under high pressure can help us identify many new phases and states possible in exoplanetary interiors, especially terrestrial ones. Using the adaptive genetic algorithm (AGA), we investigate the structure of iron oxides for a wide range of stoichiometries ($0.25\leq x_O \leq 0.8$) at 1, 2, and 3 TPa. Five unreported ground-state structures with Fe$_2$O, FeO, Fe$_3$O$_5$, FeO$_2$, and FeO$_4$ compositions are identified. The calculated density of states (DOS) suggests that, except for FeO$_4$, all phases are metallic, but their carrier densities decrease with increasing pressure and oxygen content. The cluster alignment analysis of stable and metastable phases shows that several motifs may co-exist in a structure of iron oxides with low O content. In contrast, most iron oxides with high O content adopt a simple BCC motif at TPa pressures. Our results provide a crystal structure database of iron oxides for modeling and understanding the interiors of exoplanets.

preprint2022arXiv

Study of the rare decay $J/ψ\to 2γ+hadrons$ at the BESIII

The two-photon radiative decay process $J/ψ\to 2γ+hadrons$ is studied, and the main contributing processes $J/ψ\to 2γ+ g g g$ and $J/ψ\to 2γ+ q \bar{q}$ are calculated. Under the specific conditions at BESIII, this rare decay process and the main background process $e^{+} e^{-} \to γγ+ hadrons (q \bar{q})$ are investigated. The results show that the ratio of signal to background can reach 1.24 with the optimized selection criteria at BESIII. In addition, a few distributions of the signal and background are presented. All the results show that the signal is large enough to be measured in the experiment.

preprint2022arXiv

The Second Place Solution for The 4th Large-scale Video Object Segmentation Challenge--Track 3: Referring Video Object Segmentation

The referring video object segmentation (RVOS) task aims to segment, in all video frames, the object instances in a given video that are referred to by a language expression. Due to the requirement of understanding cross-modal semantics within individual instances, this task is more challenging than traditional semi-supervised video object segmentation, where the ground truth object masks in the first frame are given. With the great achievements of the Transformer in object detection and object segmentation, RVOS has made remarkable progress, with ReferFormer achieving state-of-the-art performance. In this work, building on the strong baseline framework ReferFormer, we propose several tricks to boost performance further, including cyclical learning rates, a semi-supervised approach, and test-time augmentation inference. The improved ReferFormer ranks 2nd in the CVPR2022 Referring Youtube-VOS Challenge.

preprint2022arXiv

Two-step nucleation of the Earth's inner core

It has long been assumed that the Earth's solid inner core started to grow when molten iron cooled to its melting point. However, the nucleation mechanism, which is a necessary step of crystallization, has not been well understood. Recent studies found that it requires an unrealistic degree of undercooling to nucleate the stable hexagonal close-packed (hcp) phase of iron, which can never be reached under the actual conditions of the Earth's core. This contradiction leads to the inner core nucleation paradox [1]. Here, using a persistent-embryo method and molecular dynamics simulations, we demonstrate that the metastable body-centered cubic (bcc) phase of iron has a much higher nucleation rate than the hcp phase under inner-core conditions. Thus, bcc nucleation is likely to be the first step of inner core formation instead of direct nucleation of the hcp phase. This mechanism reduces the required undercooling of iron nucleation, which provides a key factor in solving the inner-core nucleation paradox. The two-step nucleation scenario of the inner core also opens a new avenue for understanding the structure and anisotropy of the present inner core.

preprint2021arXiv

Low-rank matrix recovery via regularized nuclear norm minimization

In this paper, we theoretically investigate the low-rank matrix recovery problem in the context of the unconstrained regularized nuclear norm minimization (RNNM) framework. Our theoretical findings show that the RNNM method is able to provide a robust recovery of any matrix $X$ (not necessarily exactly low-rank) from a few noisy measurements $\textbf{b}=\mathcal{A}(X)+\textbf{n}$ with a bounded constraint $\|\textbf{n}\|_{2}\leq ε$, provided that the $tk$-order restricted isometry constant (RIC) of $\mathcal{A}$ satisfies a certain constraint related to $t>0$. Specifically, the obtained recovery condition in the case of $t>4/3$ is found to be the same as the sharp condition established previously by Cai and Zhang (2014) to guarantee the exact recovery of any rank-$k$ matrix via the constrained nuclear norm minimization method. More importantly, to the best of our knowledge, we are the first to establish the $tk$-order RIC based coefficient estimate of the robust null space property in the case of $0<t\leq1$.

preprint2020arXiv

An analysis of noise folding for low-rank matrix recovery

Previous work on low-rank matrix recovery has concentrated on scenarios in which the matrix is noise-free and the measurements are corrupted by noise. However, in practical applications, the matrix itself is usually perturbed by random noise prior to measurement. This paper concisely investigates this scenario and shows that, for most measurement schemes utilized in compressed sensing, the two models are equivalent, with the central difference that the noise associated with (\ref{eq.3}) is larger by a factor of $mn/M$, where $m$ and $n$ are the dimensions of the matrix and $M$ is the number of measurements. Additionally, this paper discusses the reconstruction of low-rank matrices in this setting, presents sufficient conditions based on the associated null space property to guarantee robust recovery, and obtains the number of measurements required. Furthermore, we explore the non-Gaussian noise scenario and give the corresponding result. The simulation experiments show the effect of noise variance on recovery performance and demonstrate the verifiability of the proposed model.

preprint2020arXiv

An Optimal Condition of Robust Low-rank Matrices Recovery

In this paper we investigate the reconstruction conditions of nuclear norm minimization for low-rank matrix recovery. We obtain sufficient conditions $δ_{tr}<t/(4-t)$ with $0<t<4/3$ to guarantee the robust reconstruction $(z\neq0)$ or exact reconstruction $(z=0)$ of all rank-$r$ matrices $X\in\mathbb{R}^{m\times n}$ from $b=\mathcal{A}(X)+z$ via nuclear norm minimization. Furthermore, we not only show that when $t=1$, the upper bound $δ_r<1/3$ is the same as the result of Cai and Zhang, but also demonstrate that the obtained upper bounds on the recovery error are better. Moreover, we prove that the restricted isometry property condition is sharp. Finally, numerical experiments are conducted to show that the nuclear norm minimization method is stable and robust for the recovery of low-rank matrices.

preprint2020arXiv

Dissipativity and positive off-diagonal property of operators on ordered Banach spaces

In this paper, we provide a sublinear function $p$ on ordered Banach spaces, which depends on the order structure of the space. With respect to this $p$, we study the relation between the $p$-contractivity of positive semigroups and the $p$-dissipativity of their generators. The positive off-diagonal property of generators is also studied in ordered vector spaces.

preprint2020arXiv

Efficient step-merged quantum imaginary time evolution algorithm for quantum chemistry

We develop a resource-efficient step-merged quantum imaginary time evolution approach (smQITE) to solve for the ground state of a Hamiltonian on quantum computers. This heuristic method features a fixed shallow quantum circuit depth along the state evolution path. We use this algorithm to determine binding energy curves of a set of molecules, including H$_2$, H$_4$, H$_6$, LiH, HF, H$_2$O and BeH$_2$, and find highly accurate results. The required quantum resources of smQITE calculations can be further reduced by adopting the circuit form of the variational quantum eigensolver (VQE) technique, such as the unitary coupled cluster ansatz. We demonstrate that smQITE achieves computational accuracy similar to that of VQE with the same fixed-circuit ansatz, without requiring a generally complicated high-dimensional non-convex optimization. Finally, smQITE calculations are carried out on Rigetti quantum processing units (QPUs), demonstrating that the approach is readily applicable on current noisy intermediate-scale quantum (NISQ) devices.

preprint2020arXiv

Evidence for a new extended solid of nitrogen

A new extended solid of nitrogen, referred to as post-layered-polymeric nitrogen (PLP-N), was observed by further heating layered-polymeric nitrogen (LP-N) to above 2300 K at 161 GPa. The new phase is found to be very transparent and exhibits ultra-large d-spacings ranging from 2.8 to 4.9 Å at 172 GPa, suggesting a possible large-unit-cell 2D chain-like or 0D cluster-type structure with a wide bandgap. However, the observed X-ray diffraction pattern and Raman scattering data cannot be matched to any predicted structure in the published literature. This finding further complicates the phase diagram of nitrogen and also highlights the path dependence of the high-pressure dissociative transition in nitrogen. In addition, the formation boundary between cg-N and LP-N has been determined.

preprint2020arXiv

Ground state properties of one-dimensional and two-dimensional Hubbard model from Gutzwiller conjugate gradient minimization theory

We introduce Gutzwiller conjugate gradient minimization (GCGM) theory, an ab initio quantum many-body theory for computing the ground-state properties of infinite systems. GCGM uses the Gutzwiller wave function but does not use the commonly adopted Gutzwiller approximation (GA), which is a major source of inaccuracy. Instead, the theory uses an approximation that is based on the occupation probability of the on-site configurations, rather than approximations that decouple the site-site correlations as used in the GA. We test the theory in the one-dimensional and two-dimensional Hubbard models at various electron densities and find that GCGM reproduces energies and double occupancies in reasonable agreement with benchmark data at a very small computational cost.

preprint2020arXiv

Gutzwiller Hybrid Quantum-Classical Computing Approach for Correlated Materials

Rapid progress in noisy intermediate-scale quantum (NISQ) computing technology has led to the development of novel resource-efficient hybrid quantum-classical algorithms, such as the variational quantum eigensolver (VQE), that can address open challenges in quantum chemistry, physics and material science. Proof-of-principle quantum chemistry simulations for small molecules have been demonstrated on NISQ devices. While several approaches have been theoretically proposed for correlated materials, NISQ simulations of interacting periodic models on current quantum devices have not yet been demonstrated. Here, we develop a hybrid quantum-classical simulation framework for correlated electron systems based on the Gutzwiller variational embedding approach. We implement this framework on Rigetti quantum processing units (QPUs) and apply it to the periodic Anderson model, which describes a correlated heavy electron band hybridizing with non-interacting conduction electrons. Our simulation results quantitatively reproduce the known ground state quantum phase diagram including metallic, Kondo and Mott insulating phases. This is the first fully self-consistent hybrid quantum-classical simulation of an infinite correlated lattice model executed on QPUs, demonstrating that the Gutzwiller hybrid quantum-classical embedding framework is a powerful approach to simulate correlated materials on NISQ hardware. This benchmark study also puts forth a concrete pathway towards practical quantum advantage on NISQ devices.

preprint2020arXiv

Joint COCO and Mapillary Workshop at ICCV 2019 Keypoint Detection Challenge Track Technical Report: Distribution-Aware Coordinate Representation for Human Pose Estimation

In this paper, we focus on the coordinate representation in human pose estimation. While being the standard choice, heatmap based representation has not been systematically investigated. We found that the process of coordinate decoding (i.e. transforming the predicted heatmaps to the coordinates) is surprisingly significant for human pose estimation performance, which nevertheless was not recognised before. In light of the discovered importance, we further probe the design limitations of the standard coordinate decoding method and propose a principled distribution-aware decoding method. Meanwhile, we improve the standard coordinate encoding process (i.e. transforming ground-truth coordinates to heatmaps) by generating accurate heatmap distributions for unbiased model training. Taking them together, we formulate a novel Distribution-Aware coordinate Representation for Keypoint (DARK) method. Serving as a model-agnostic plug-in, DARK significantly improves the performance of a variety of state-of-the-art human pose estimation models. Extensive experiments show that DARK yields the best results on COCO keypoint detection challenge, validating the usefulness and effectiveness of our novel coordinate representation idea. The project page containing more details is at https://ilovepose.github.io/coco

preprint2020arXiv

Shallow-circuit variational quantum eigensolver based on symmetry-inspired Hilbert space partitioning for quantum chemical calculations

Development of resource-friendly quantum algorithms remains highly desirable for noisy intermediate-scale quantum computing. Based on the variational quantum eigensolver (VQE) with unitary coupled cluster ansatz, we demonstrate that partitioning of the Hilbert space made possible by the point group symmetry of the molecular systems greatly reduces the number of variational operators by confining the variational search within a subspace. In addition, we found that instead of including all subterms for each excitation operator, a single-term representation suffices to reach the required accuracy for the various molecules tested, resulting in an additional shortening of the quantum circuit. With these strategies, VQE calculations on a noiseless quantum simulator achieve energies within a few meV of those obtained with the full UCCSD ansatz for $\mathrm{H}_4$ square, $\mathrm{H}_4$ chain and $\mathrm{H}_6$ hexagon molecules, while the number of controlled-NOT (CNOT) gates, a measure of the quantum-circuit depth, is reduced by a factor as large as 35. Furthermore, we introduced an efficient "score" parameter to rank the excitation operators, so that the operators causing larger energy reduction can be applied first. Using $\mathrm{H}_4$ square and $\mathrm{H}_4$ chain as examples, we demonstrated on noisy quantum simulators that the first few variational operators can bring the energy within chemical accuracy, while additional operators do not improve the energy since the accumulated noise outweighs the gain from the expansion of the variational ansatz.

preprint2020arXiv

TADOC: Text Analytics Directly on Compression

This article provides a comprehensive description of Text Analytics Directly on Compression (TADOC), which enables direct document analytics on compressed textual data. The article explains the concept of TADOC and the challenges to its effective realizations. Additionally, a series of guidelines and technical solutions that effectively address those challenges, including the adoption of a hierarchical compression method and a set of novel algorithms and data structure designs, are presented. Experiments on six data analytics tasks of various complexities show that TADOC can save 90.8% storage space and 87.9% memory usage, while halving data processing times.

preprint2020arXiv

TexSmart: A Text Understanding System for Fine-Grained NER and Enhanced Semantic Analysis

This technical report introduces TexSmart, a text understanding system that supports fine-grained named entity recognition (NER) and enhanced semantic analysis functionalities. Compared to most previous publicly available text understanding systems and tools, TexSmart has some unique features. First, the NER function of TexSmart supports over 1,000 entity types, while most other public tools typically support several to (at most) dozens of entity types. Second, TexSmart introduces new semantic analysis functions, such as semantic expansion and deep semantic representation, that are absent in most previous systems. Third, a spectrum of algorithms (from very fast algorithms to those that are relatively slow but more accurate) is implemented for each function in TexSmart, to fulfill the requirements of different academic and industrial applications. The adoption of unsupervised or weakly-supervised algorithms is especially emphasized, with the goal of easily updating our models to include fresh data with less human annotation effort. The main contents of this report include the major functions of TexSmart, algorithms for achieving these functions, how to use the TexSmart toolkit and Web APIs, and evaluation results of some key algorithms.

preprint2020arXiv

The perturbation analysis of nonconvex low-rank matrix robust recovery

In this paper, we bring forward a completely perturbed nonconvex Schatten $p$-minimization to address a model of completely perturbed low-rank matrix recovery. Based on the restricted isometry property, the paper generalizes the investigation to a complete perturbation model that considers not only noise but also perturbation, and gives the restricted isometry property condition that guarantees the recovery of low-rank matrices together with the corresponding reconstruction error bound. In particular, the analysis of the result reveals that when $p$ decreases to $0$ and $a>1$ for the complete perturbation and low-rank matrix, the condition is the optimal sufficient condition $δ_{2r}<1$ (Recht et al., 2010). Numerical experiments show that the nonconvex Schatten $p$-minimization method outperforms the convex nuclear norm minimization approach in the completely perturbed scenario.

preprint2020arXiv

Unexpectedly strong diamagnetism and superparamagnetism of aromatic peptides due to self-assembling and cations

There is a considerable amount of work that shows the biomagnetism of organic components without ferromagnetic components at the molecular level, but it remains a great challenge to bridge the giant gap between experimental and theoretical results on biomagnetism. Here, we show that the diamagnetism of an aromatic peptide, AYFFF, is greatly enhanced, by about 11 times, through self-assembling, reaching two orders of magnitude higher than the mass susceptibility of pure water. Moreover, the AYFFF self-assemblies further mixed with ZnCl2 solution of sufficiently high concentration display superparamagnetism, with the mass susceptibility reaching more than two orders of magnitude higher than the absolute value of pure water, which may approach the mass susceptibility of ferromagnetism. The aromatic rings in the peptide molecules and the cations are the keys to such strong diamagnetism and superparamagnetism of aromatic peptides.

preprint2019arXiv

The dynamical and thermodynamical origin of dissipative chaos

Chaos usually refers to the sensitivity to initial conditions, in which nonlinearity plays a crucial role. Beyond such a mathematical description, the underlying physical origin of chaos is still not clearly understood. Here we study dissipative chaos from the perspective of nonequilibrium dynamics. This was not fully investigated in traditional chaos theory, despite Lorenz's original discovery of chaos from the nonequilibrium atmosphere. We found that the nonequilibriumness, as the degree of detailed balance breaking, can be quantified by the appearance of the steady state probability flux in the state space. We uncovered that the dynamical origin of the onset and offset of dissipative chaos, such as the Lorenz attractor, is the sudden appearance and disappearance of such nonequilibrium flux. We also uncovered that the dissipation associated with the flux, quantified by the entropy production rate, gives the thermodynamical origin of dissipative chaos. The sharp changes in the degree of nonequilibriumness, as measured by the flux and the entropy production rate, also provide alternative quantitative indicators for the onset and offset of dissipative chaos.

preprint2018arXiv

Spatially-correlated Site Occupancy in the Nonstoichiometric Meta-stable ε-Al60Sm11 Phase during Devitrification of Al-10.2 at.% Sm Glasses

A metastable ε-Al60Sm11 phase appears during the initial devitrification of as-quenched Al-10.2 at.% Sm glasses. The ε phase is nonstoichiometric in nature since Al occupation is observed on the 16f Sm lattice sites. Scanning transmission electron microscopy (STEM) images reveal profound spatial correlation of Sm content on these sites, which cannot be explained by the "average crystal" description from Rietveld analysis of diffraction data. Thermodynamically favorable configurations, established by Monte Carlo (MC) simulations based on a cluster-expansion model, also give correlation functions that differ qualitatively from experimental observations. On the other hand, molecular dynamics simulations of the growth of ε-Al60Sm11 in undercooled liquid show that when the diffusion range of Sm is limited to ~4 Å, the correlation function of the as-grown crystal structure agrees well with that of the STEM images. Our results show that kinetic effects, especially the limited diffusivity of Sm atoms, play the fundamental role in determining the nonstoichiometric site occupancies of the ε-Al60Sm11 phase during the crystallization process.

preprint2017arXiv

A self-contained algorithm for determination of solid-liquid equilibria in an alloy system

We describe a self-contained procedure to evaluate the free energy of the liquid and solid phases of an alloy system. The free energy of a single-element solid phase is calculated with thermodynamic integration using the Einstein crystal as the reference system. Then, the free energy difference between the solid and liquid phases is calculated by Gibbs-Duhem integration. The central part of our method is the construction of a reversible alchemical path connecting a pure liquid and a liquid alloy to calculate the mixing enthalpy and entropy. We have applied the method to calculate the free energy of the solid and liquid phases in the Al-Sm system. The driving force for fcc-Al nucleation in Al-Sm liquid and the melting curves for fcc-Al and Al3Sm are also calculated.