Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
25 works
0 followers
20 topics
4 close collaborators

Published work

25 published items

preprint · 2026 · arXiv

MindWatcher: Toward Smarter Multimodal Tool-Integrated Reasoning

Traditional workflow-based agents exhibit limited intelligence when addressing real-world problems requiring tool invocation. Tool-integrated reasoning (TIR) agents capable of autonomous reasoning and tool invocation are rapidly emerging as a powerful approach for complex decision-making tasks involving multi-step interactions with external environments. In this work, we introduce MindWatcher, a TIR agent integrating interleaved thinking and multimodal chain-of-thought (CoT) reasoning. MindWatcher can autonomously decide whether and how to invoke diverse tools and coordinate their use, without relying on human prompts or workflows. The interleaved thinking paradigm enables the model to switch between thinking and tool calling at any intermediate stage, while its multimodal CoT capability allows manipulation of images during reasoning to yield more precise search results. We implement automated data auditing and evaluation pipelines, complemented by manually curated high-quality datasets for training, and we construct a benchmark, called MindWatcher-Evaluate Bench (MWE-Bench), to evaluate its performance. MindWatcher is equipped with a comprehensive suite of auxiliary reasoning tools, enabling it to address broad-domain multimodal problems. A large-scale, high-quality local image retrieval database, covering eight categories including cars, animals, and plants, endows the model with robust object recognition despite its small size. Finally, we design a more efficient training infrastructure for MindWatcher, enhancing training speed and hardware utilization. Experiments not only demonstrate that MindWatcher matches or exceeds the performance of larger or more recent models through superior tool invocation, but also uncover critical insights for agent training, such as the genetic inheritance phenomenon in agentic RL.

preprint · 2026 · arXiv

Position: How can Graphs Help Large Language Models?

With the rapid advancement of large language models (LLMs), classic graph learning tasks have greatly benefited from LLMs, including improved encoding of textual features, more efficient construction of graphs from text, and enhanced reasoning over knowledge graphs. In this paper, we ask a complementary question: How can graphs help LLMs? We address this question from three perspectives: 1) graphs provide an up-to-date knowledge source that helps reduce LLM hallucinations, 2) graph-based prompting techniques, such as Chain-of-Thought (CoT), Tree-of-Thought (ToT), and Graph-of-Thought (GoT), enhance LLM reasoning capabilities, and 3) integrating graphs into LLMs improves their understanding of structured data, expanding their applicability to domains such as e-commerce, code, and relational databases (RDBs). We further outline some future directions, including designing sparse LLM architectures based on graphs and brain-inspired memory systems.

preprint · 2026 · arXiv

SubTokenTest: A Practical Benchmark for Real-World Sub-token Understanding

Recent advancements in large language models (LLMs) have significantly enhanced their reasoning capabilities. However, they continue to struggle with basic character-level tasks, such as counting letters in words, a problem rooted in their tokenization process. While existing benchmarks have highlighted this weakness through basic character operations, such failures are often dismissed as lacking practical relevance. Yet many real-world applications, such as navigating text-based maps or interpreting structured tables, rely heavily on precise sub-token understanding. We therefore introduce SubTokenTest, a comprehensive benchmark that assesses sub-token understanding through practical, utility-driven tasks. Our benchmark includes ten tasks across four domains and isolates tokenization-related failures by decoupling performance from complex reasoning. We provide a comprehensive evaluation of nine advanced LLMs. Additionally, we investigate the impact of test-time scaling on sub-token reasoning and explore how character-level information is encoded within the hidden states.

preprint · 2022 · arXiv

CSSAM: Code Search via Attention Matching of Code Semantics and Structures

Despite continuous efforts to improve both the effectiveness and efficiency of code search, two issues remain unsolved. First, programming languages have inherently strong structural linkages, and mining features from code as plain text omits the structural information it contains. Second, although there is a potential semantic relationship between code and query, it is challenging to align code and text across sequences so that vectors are spatially consistent during similarity matching. To tackle both issues, this paper proposes a code search model named CSSAM (Code Semantics and Structures Attention Matching). By introducing semantic and structural matching mechanisms, CSSAM effectively extracts and fuses multidimensional code features. Specifically, a cross and residual layer is developed to facilitate high-dimensional spatial alignment of code and query at the token level. By leveraging the residual interaction, a matching module is designed to preserve more code semantics and descriptive features, which enhances the adhesion between the code and its corresponding query text. In addition, to improve the model's comprehension of the code's inherent structure, a code representation structure named CSRG (Code Semantic Representation Graph) is proposed for jointly representing abstract syntax tree nodes and the data flow of the code. In experiments on two publicly available datasets containing 540k and 330k code segments, CSSAM significantly outperforms the baselines, achieving the highest SR@1/5/10, MRR, and NDCG@50 on both datasets. Moreover, an ablation study is conducted to quantitatively measure the impact of each key component of CSSAM on the efficiency and effectiveness of code search, which offers insights for improving advanced code search solutions.

preprint · 2022 · arXiv

Equilibrium Fluctuations in Mean-field Disordered Models

Mean-field models of glasses that present a random first order transition exhibit highly non-trivial fluctuations. Building on previous studies that focused on the critical scaling regime, we here obtain a fully quantitative framework for all equilibrium conditions. By means of the replica method we evaluate Gaussian fluctuations of the overlaps around the thermodynamic limit, decomposing them into thermal fluctuations inside each state and heterogeneous fluctuations between different states. We first test and compare our analytical results with numerical simulation results for the p-spin spherical model and the random orthogonal model, and then analyze the random Lorentz gas. In all cases, a strong quantitative agreement is obtained. Our analysis thus provides a robust scheme for identifying the key finite-size (or finite-dimensional) corrections to the mean-field treatment of these paradigmatic glass models.

preprint · 2022 · arXiv

Local dynamical heterogeneity in glass formers

We study the local dynamical fluctuations in glass-forming models of particles embedded in $d$-dimensional space, in the mean-field limit of $d\to\infty$. Our analytical calculation reveals that single-particle observables, such as squared particle displacements, display divergent fluctuations around the dynamical (or mode-coupling) transition, due to the emergence of nontrivial correlations between displacements along different directions. This effect notably gives rise to a divergent non-Gaussian parameter, $\alpha_2$. The $d\to\infty$ local dynamics therefore becomes quite rich upon approaching the glass transition. The finite-$d$ remnant of this phenomenon further provides a long sought-after, first-principle explanation for the growth of $\alpha_2$ around the glass transition that is not based on multi-particle correlations.

preprint · 2021 · arXiv

Cloud Cover and Aurora Contamination at Dome A in 2017 from KLCAM

Dome A in Antarctica has many characteristics that make it an excellent site for astronomical observations, from the optical to the terahertz. Quantitative site testing is still needed to confirm the site's properties. In this paper, we present a statistical analysis of cloud cover and aurora contamination from the Kunlun Cloud and Aurora Monitor (KLCAM). KLCAM is an automatic, unattended all-sky camera aiming for long-term monitoring of the usable observing time and optical sky background at Dome A. It was installed at Dome A in January 2017, worked through the austral winter, and collected over 47,000 images over 490 days. A semi-quantitative visual data analysis of cloud cover and auroral contamination was carried out by five individuals. The analysis shows that the night sky was free of cloud for 83 per cent of the time, which ranks Dome A highly in a comparison with other observatory sites. Although aurorae were detected somewhere on an image for nearly 45 per cent of the time, the strongest auroral emission lines can be filtered out with customized filters.

preprint · 2021 · arXiv

Dynamically Emerging Topological Phase Transitions in Nonlinear Interacting Soliton Lattices

We demonstrate dynamical topological phase transitions in evolving Su-Schrieffer-Heeger (SSH) lattices made of interacting soliton arrays, which are entirely driven by nonlinearity and thereby exemplify emergent nonlinear topological phenomena. The phase transitions occur from a topologically trivial to a nontrivial phase, in periodic succession with crossovers from the topologically nontrivial to the trivial regime. The signature of the phase transition is a gap-closing and re-opening point, where two extended states are pulled from the bands into the gap to become localized topological edge states. Crossovers occur via decoupling of the edge states from the bulk of the lattice.

preprint · 2021 · arXiv

Network Clustering for Multi-task Learning

The Multi-Task Learning (MTL) technique has been widely studied by researchers worldwide. The majority of current MTL studies adopt the hard parameter sharing structure, where hard layers tend to learn general representations over all tasks and specific layers tend to learn specific representations for each task. Since the specific layers directly follow the hard layers, the MTL model needs to estimate this direct change (from general to specific) as well. To alleviate this problem, we introduce a novel cluster layer, which groups tasks into clusters during training. In a cluster layer, the tasks in the same cluster are further required to share the same network. In this way, the cluster layer produces a general representation for tasks in the same cluster, while producing relatively specific representations for different clusters. The cluster layers serve as transitions between the hard layers and the specific layers. The MTL model thus moves gradually from general representations to specific representations. We evaluate our model on MTL document classification, and the results demonstrate that the cluster layer is quite efficient in MTL.

preprint · 2021 · arXiv

On the nature of valence charge and spin excitations via multi-orbital Hubbard models for infinite-layer nickelates

Building upon the recent progress on the intriguing underlying physics for the newly discovered infinite-layer nickelates, in this article we review an examination of valence charge and spin excitations via multi-orbital Hubbard models as a way to determine the fundamental building blocks for Hamiltonians that can describe the low energy properties of infinite-layer nickelates. We summarize key results from density-functional approaches, and apply them to the study of x-ray absorption to determine the valence ground states of infinite-layer nickelates in their parent form, and show that a fundamental $d^9$ configuration as in the cuprates is incompatible with a self-doped ground state having holes in both $d_{x^2-y^2}$ and a rare-earth-derived axial orbital. When doped, we determine that the rare-earth-derived orbitals empty and additional holes form low spin $(S=0)$ $d^8$ Ni states, which can be well-described as a doped single-band Hubbard model. Using exact diagonalization for a 2-orbital model involving Ni and rare earth orbitals, we find clear magnons at 1/2 filling that persist when doped, albeit with larger damping, and with a dependence on the precise orbital energy separation between the Ni- and rare-earth-derived orbitals. Taken together, a full two-band model for infinite-layer nickelates can well describe the valence charge and spin excitations observed experimentally.

preprint · 2021 · arXiv

The dimensional evolution of structure and dynamics in hard sphere liquids

The formulation of the mean-field, infinite-dimensional solution of hard sphere glasses is a significant milestone for theoretical physics. How relevant this description might be for understanding low-dimensional glass-forming liquids, however, remains unclear. These liquids indeed exhibit a complex interplay between structure and dynamics, and the importance of this interplay might only slowly diminish as dimension $d$ increases. A careful numerical assessment of the matter has long been hindered by the exponential increase of computational costs with $d$. By revisiting a once common simulation technique involving the use of periodic boundary conditions modeled on $D_d$ lattices, we here partly sidestep this difficulty, thus allowing the study of hard sphere liquids up to $d=13$. Parallel efforts by Mangeat and Zamponi [Phys. Rev. E 93, 012609 (2016)] have expanded the mean-field description of glasses to finite $d$ by leveraging standard liquid-state theory, and thus help bridge the gap from the other direction. The relatively smooth evolution of both structure and dynamics across the $d$ gap allows us to relate the two approaches, and to identify some of the missing features that a finite-$d$ theory of glasses might hope to include to achieve near quantitative agreement.

preprint · 2020 · arXiv

Application of Deep Q-Network in Portfolio Management

Machine learning algorithms and neural networks are widely applied in many different areas, such as stock market prediction, face recognition and population analysis. This paper introduces a strategy based on the classic deep reinforcement learning algorithm, Deep Q-Network (DQN), for portfolio management in the stock market. DQN is a type of deep neural network that is optimized by Q-learning. To adapt the DQN to financial markets, we first discretize the action space, which is defined as the weight of the portfolio in different assets, so that portfolio management becomes a problem that Deep Q-Network can solve. Next, we combine a Convolutional Neural Network and a dueling Q-net to enhance the recognition ability of the algorithm. Experimentally, we chose five low-correlation American stocks to test the model. The results demonstrate that the DQN-based strategy outperforms ten other traditional strategies. The profit of the DQN algorithm is 30% higher than that of the other strategies. Moreover, the Sharpe ratio associated with Max Drawdown demonstrates that the risk of the policy produced by DQN is the lowest.

preprint · 2020 · arXiv

Automation of the AST3 optical sky survey from Dome A, Antarctica

The 0.5 m Antarctic Survey Telescopes (AST3) were designed for time-domain optical/infrared astronomy. They are located at Dome A, Antarctica, where they can take advantage of the continuous dark time during winter. Since the site is unattended in winter, everything for the operation, from observing to data reduction, had to be fully automated. Here, we present a brief overview of the AST3 project and some of its unique characteristics due to its location in Antarctica. We summarise the various components of the survey, including the customized hardware and software, that make complete automation possible.

preprint · 2020 · arXiv

Deep Hierarchical Classification for Category Prediction in E-commerce System

In e-commerce systems, category prediction is the task of automatically predicting the categories of given texts. Unlike traditional classification, where there are no relations between classes, category prediction is regarded as a standard hierarchical classification problem, since categories are usually organized as a hierarchical tree. In this paper, we address hierarchical category prediction. We propose a Deep Hierarchical Classification framework, which incorporates multi-scale hierarchical information in neural networks and introduces a representation sharing strategy according to the category tree. We also define a novel combined loss function to penalize hierarchical prediction losses. The evaluation shows that the proposed approach outperforms existing approaches in accuracy.

preprint · 2020 · arXiv

FashionBERT: Text and Image Matching with Adaptive Loss for Cross-modal Retrieval

In this paper, we address text and image matching in cross-modal retrieval for the fashion industry. Unlike matching in the general domain, fashion matching requires much more attention to the fine-grained information in fashion images and texts. Pioneering approaches detect regions of interest (RoIs) in images and use the RoI embeddings as image representations. In general, RoIs tend to represent the "object-level" information in fashion images, while fashion texts tend to describe more detailed information, e.g. styles and attributes. RoIs are thus not fine-grained enough for fashion text and image matching. To this end, we propose FashionBERT, which leverages patches as image features. With the pre-trained BERT model as the backbone network, FashionBERT learns high-level representations of texts and images. Meanwhile, we propose an adaptive loss to trade off multitask learning in the FashionBERT modeling. Two tasks (i.e., text and image matching and cross-modal retrieval) are incorporated to evaluate FashionBERT. On the public dataset, experiments demonstrate that FashionBERT achieves significant performance improvements over the baseline and state-of-the-art approaches. In practice, FashionBERT is applied in a concrete cross-modal retrieval application. We provide a detailed analysis of matching performance and inference efficiency.

preprint · 2020 · arXiv

Modeling Dynamic Heterogeneous Network for Link Prediction using Hierarchical Attention with Temporal RNN

Network embedding aims to learn low-dimensional representations of nodes while capturing the structural information of networks. It has achieved great success on many network analysis tasks, such as link prediction and node classification. Most existing network embedding algorithms focus on how to learn static homogeneous networks effectively. However, networks in the real world are more complex, e.g., networks may consist of several types of nodes and edges (called heterogeneous information) and may vary over time in terms of dynamic nodes and edges (called evolutionary patterns). Limited work has been done on network embedding of dynamic heterogeneous networks, as it is challenging to learn both evolutionary and heterogeneous information simultaneously. In this paper, we propose a novel dynamic heterogeneous network embedding method, termed DyHATR, which uses hierarchical attention to learn heterogeneous information and incorporates recurrent neural networks with temporal attention to capture evolutionary patterns. We benchmark our method on four real-world datasets for the task of link prediction. Experimental results show that DyHATR significantly outperforms several state-of-the-art baselines.

preprint · 2020 · arXiv

Moduli of Curves of Genus One with Twisted Fields

We construct a smooth Artin stack parameterizing the stable weighted curves of genus one with twisted fields and prove that it is isomorphic to the blowup stack of the moduli of genus one weighted curves studied by Hu and Li. This leads to a blowup-free construction of Vakil-Zinger's desingularization of the moduli of genus one stable maps to projective spaces. This construction provides the cornerstone of the theory of stacks with twisted fields, which is thoroughly studied in arXiv:2005.03384 and leads to a blowup-free resolution of the stable map moduli of genus two.

preprint · 2020 · arXiv

Night-time measurements of astronomical seeing at Dome A in Antarctica

Seeing, the angular size of stellar images blurred by atmospheric turbulence, is a critical parameter used to assess the quality of astronomical sites. Median values at the best mid-latitude sites are generally in the range of 0.6 to 0.8 arcsec. Sites on the Antarctic plateau are characterized by comparatively weak turbulence in the free atmosphere above a strong but thin boundary layer. The median seeing at Dome C is estimated to be 0.23 to 0.36 arcsec above a boundary layer that has a typical height of 30 m. At Dome A and F, the only previous seeing measurements were made during daytime. Here we report the first direct measurements of night-time seeing at Dome A, using a Differential Image Motion Monitor. Located at a height of just 8 m, it recorded seeing as low as 0.13 arcsec, and provided seeing statistics that are comparable to those for a 20 m height at Dome C. It indicates that the boundary layer was below 8 m 31% of the time. At such times the median seeing was 0.31 arcsec, consistent with free-atmosphere seeing. The seeing and boundary layer thickness are found to be strongly correlated with the near-surface temperature gradient. The correlation confirms a median thickness of approximately 14 m for the boundary layer at Dome A, as found from a sonic radar. The thinner boundary layer makes it less challenging to locate a telescope above it, thereby giving greater access to the free atmosphere.

preprint · 2020 · arXiv

Time-aware Graph Embedding: A temporal smoothness and task-oriented approach

Knowledge graph embedding, which aims to learn low-dimensional representations of entities and relationships, has attracted considerable research effort recently. However, most knowledge graph embedding methods focus on the structural relationships in fixed triples while ignoring temporal information. Existing time-aware graph embedding methods focus only on factual plausibility, while ignoring temporal smoothness, which models the interactions between a fact and its contexts and thus can capture fine-grained temporal relationships. This limits the performance of embedding-related applications. To solve this problem, this paper presents a Robustly Time-aware Graph Embedding (RTGE) method that incorporates temporal smoothness. The paper makes two major contributions. First, RTGE integrates a measure of temporal smoothness into the learning process of the time-aware graph embedding. Via the proposed additional smoothing factor, RTGE can preserve both the structural information and the evolutionary patterns of a given graph. Second, RTGE provides a general task-oriented negative sampling strategy associated with temporally-aware information, which further improves the adaptive ability of the proposed algorithm and plays an essential role in obtaining superior performance in various tasks. Extensive experiments conducted on multiple benchmark tasks show that RTGE can increase performance in entity/relationship/temporal scoping prediction tasks.

preprint · 2020 · arXiv

Vehicle Re-Identification Based on Complementary Features

In this work, we present our solution to the vehicle re-identification (vehicle Re-ID) track in AI City Challenge 2020 (AIC2020). The purpose of vehicle Re-ID is to retrieve the same vehicle appearing across multiple cameras, and it could make a great contribution to the Intelligent Traffic System (ITS) and smart cities. Due to the vehicle's orientation, lighting and inter-class similarity, it is difficult to achieve a robust and discriminative feature representation. For the vehicle Re-ID track in AIC2020, our method is to fuse features extracted from different networks in order to take advantage of these networks and obtain complementary features. For each single model, several methods, such as multi-loss, filter grafting and semi-supervised learning, are used to increase the representation ability as much as possible. Top performance in City-Scale Multi-Camera Vehicle Re-Identification demonstrated the advantage of our methods, and we achieved 5th place in the vehicle Re-ID track of AIC2020. The code is available at https://github.com/gggcy/AIC2020_ReID.

preprint · 2019 · arXiv

Realization of robust boundary modes and non-contractible loop states in photonic Kagome lattices

Corbino-geometry has well-known applications in physics, as in the design of graphene heterostructures for detecting fractional quantum Hall states or superconducting waveguides for illustrating circuit quantum electrodynamics. Here, we propose and demonstrate a photonic Kagome lattice in the Corbino-geometry that leads to direct observation of non-contractible loop states protected by real-space topology. Such states represent the "missing" flat-band eigenmodes, manifested as one-dimensional loops winding around a torus, or lines infinitely extending to the entire flat-band lattice. In finite (truncated) Kagome lattices, however, line states cannot be preserved, as they are no longer eigenmodes, in sharp contrast to the case of Lieb lattices. Using a continuous-wave laser writing technique, we experimentally establish finite Kagome lattices with desired cutting edges, as well as in the Corbino-geometry to eliminate edge effects. We thereby observe, for the first time to our knowledge, the robust boundary modes exhibiting self-healing properties, and the localized modes along the toroidal direction as a direct manifestation of the non-contractible loop states.

preprint · 2018 · arXiv

Clustering and assembly dynamics of a one-dimensional microphase former

Both ordered and disordered microphases ubiquitously form in suspensions of particles that interact through competing short-range attraction and long-range repulsion (SALR). While ordered microphases are more appealing materials targets, understanding the rich structural and dynamical properties of their disordered counterparts is essential to controlling their mesoscale assembly. Here, we study the disordered regime of a one-dimensional (1D) SALR model, whose simplicity enables detailed analysis by transfer matrices and Monte Carlo simulations. We first characterize the signature of the clustering process on macroscopic observables, and then assess the equilibration dynamics of various simulation algorithms. We notably find that cluster moves markedly accelerate the mixing time, but that event chains are of limited help in the clustering regime. These insights will guide further study of three-dimensional microphase formers.

preprint · 2018 · arXiv

Correlation lengths in quasi-one-dimensional systems via transfer matrices

Using transfer matrices up to next-nearest-neighbour (NNN) interactions, we examine the structural correlations of quasi-one-dimensional systems of hard disks confined by two parallel lines and hard spheres confined in cylinders. Simulations have shown that the non-monotonic and non-smooth growth of the correlation length in these systems accompanies structural crossovers (Fu et al., Soft Matter, 2017, 13, 3296). Here, we identify the theoretical basis for this behaviour. In particular, we associate kinks in the growth of correlation lengths with eigenvalue crossing and splitting. Understanding the origin of such structural crossovers answers questions raised by earlier studies, and thus bridges the gap between theory and simulations for these reference models.