Source author record

Siyuan Chen

Siyuan Chen appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Machine Learning; Artificial Intelligence; Computation and Language; Computer Vision; astro-ph.CO; astro-ph.GA; astro-ph.HE; gr-qc; Information Retrieval; physics.data-an; Social and Information Networks; astro-ph.IM; Computational Engineering, Finance, and Science; cond-mat.mtrl-sci; eess.IV; Graphics; hep-ph; Methodology; physics.ins-det

Catalog footprint

What is connected

19 works

19 topics

4 close collaborators


preprint · 2026 · arXiv

Aligning by Misaligning: Boundary-aware Curriculum Learning for Multimodal Alignment

Most multimodal models treat every negative pair alike, ignoring the ambiguous negatives that differ from the positive by only a small detail. We propose Boundary-Aware Curriculum with Local Attention (BACL), a lightweight add-on that turns these borderline cases into a curriculum signal. A Boundary-aware Negative Sampler gradually raises difficulty, while a Contrastive Local Attention loss highlights where the mismatch occurs. The two modules are fully differentiable and work with any off-the-shelf dual encoder. Theory predicts a fast O(1/n) error rate; practice shows up to +32% R@1 over CLIP and new SOTA on four large-scale benchmarks, all without extra labels.

preprint · 2026 · arXiv

FutureX-Pro: Extending Future Prediction to High-Value Vertical Domains

Building upon FutureX, which established a live benchmark for general-purpose future prediction, this report introduces FutureX-Pro, including FutureX-Finance, FutureX-Retail, FutureX-PublicHealth, FutureX-NaturalDisaster, and FutureX-Search. These together form a specialized framework extending agentic future prediction to high-value vertical domains. While generalist agents demonstrate proficiency in open-domain search, their reliability in capital-intensive and safety-critical sectors remains under-explored. FutureX-Pro targets four economically and socially pivotal verticals: Finance, Retail, Public Health, and Natural Disaster. We benchmark agentic Large Language Models (LLMs) on entry-level yet foundational prediction tasks -- ranging from forecasting market indicators and supply chain demands to tracking epidemic trends and natural disasters. By adapting the contamination-free, live-evaluation pipeline of FutureX, we assess whether current State-of-the-Art (SOTA) agentic LLMs possess the domain grounding necessary for industrial deployment. Our findings reveal the performance gap between generalist reasoning and the precision required for high-value vertical applications.

preprint · 2026 · arXiv

Higher Satisfaction, Lower Cost: A Technical Report on How LLMs Revolutionize Meituan's Intelligent Interaction Systems

Enhancing customer experience is essential for business success, particularly as service demands grow in scale and complexity. Generative artificial intelligence and Large Language Models (LLMs) have empowered intelligent interaction systems to deliver efficient, personalized, and 24/7 support. In practice, intelligent interaction systems encounter several challenges: (1) Constructing high-quality data for cold-start training is difficult, hindering self-evolution and raising labor costs. (2) Multi-turn dialogue performance remains suboptimal due to inadequate intent understanding, rule compliance, and solution extraction. (3) Frequent evolution of business rules affects system operability and transferability, constraining low-cost expansion and adaptability. (4) Reliance on a single LLM is insufficient in complex scenarios, where the absence of multi-agent frameworks and effective collaboration undermines process completeness and service quality. (5) The open-domain nature of multi-turn dialogues, lacking unified golden answers, hampers quantitative evaluation and continuous optimization. To address these challenges, we introduce WOWService, an intelligent interaction system tailored for industrial applications. With the integration of LLMs and multi-agent architectures, WOWService enables autonomous task management and collaborative problem-solving. Specifically, WOWService focuses on core modules including data construction, general capability enhancement, business scenario adaptation, multi-agent coordination, and automated evaluation. Currently, WOWService is deployed on the Meituan App, achieving significant gains in key metrics, e.g., User Satisfaction Metric 1 (USM 1) -27.53% and User Satisfaction Metric 2 (USM 2) +25.51%, demonstrating its effectiveness in capturing user needs and advancing personalized service.

preprint · 2026 · arXiv

Seeing through the Conflict: Transparent Knowledge Conflict Handling in Retrieval-Augmented Generation

Large language models (LLMs) equipped with retrieval--the Retrieval-Augmented Generation (RAG) paradigm--should combine their parametric knowledge with external evidence, yet in practice they often hallucinate, over-trust noisy snippets, or ignore vital context. We introduce TCR (Transparent Conflict Resolution), a plug-and-play framework that makes this decision process observable and controllable. TCR (i) disentangles semantic match and factual consistency via dual contrastive encoders, (ii) estimates self-answerability to gauge confidence in internal memory, and (iii) feeds the three scalar signals to the generator through a lightweight soft-prompt with SNR-based weighting. Across seven benchmarks TCR improves conflict detection (+5-18 F1), raises knowledge-gap recovery by +21.4 pp and cuts misleading-context overrides by -29.3 pp, while adding only 0.3% parameters. The signals align with human judgements and expose temporal decision patterns.

preprint · 2026 · arXiv

Synergy over Discrepancy: A Partition-Based Approach to Multi-Domain LLM Fine-Tuning

Large language models (LLMs) demonstrate impressive generalization abilities, yet adapting them effectively across multiple heterogeneous domains remains challenging due to inter-domain interference. To overcome this challenge, we propose a partition-based multi-stage fine-tuning framework designed to exploit inter-domain synergies while minimizing negative transfer. Our approach strategically partitions domains into subsets (stages) by balancing domain discrepancy, synergy, and model capacity constraints. We theoretically analyze the proposed framework and derive novel generalization bounds that justify our partitioning strategy. Extensive empirical evaluations on various language understanding tasks show that our method consistently outperforms state-of-the-art baselines.

preprint · 2026 · arXiv

WorldParticle: Unified Simulation of Lagrangian Particle Dynamics via Transformer

A unified simulator that can model diverse physical phenomena without solver-specific redesign is a long-standing goal across simulation science. We present a learning-based particle simulator built on a single transformer architecture to model cloth, elastic solids, Newtonian and non-Newtonian fluids, granular materials, and molecular dynamics. Our model follows a prediction-correction design on a shared Lagrangian particle representation. An explicit predictor first advances particles under the known external forces, producing an intermediate state that captures externally driven motion but not inter-particle interactions. A learned corrector then predicts the residual position and velocity updates through three stages: a particle tokenizer that encodes local particle-particle, particle-boundary, and topology-guided interactions; a super-token encoder that hierarchically merges particle tokens into a compact set of super tokens via alternating self-attention and token merging; and a super-token decoder that lifts these super tokens back to particle resolution through cross-attention to predict per-particle position and velocity corrections. Progressive token merging reduces the attention cost at successive encoder layers by halving the token count at each level, and the decoder communicates through the compact super-token set rather than full particle-to-particle attention. Across the six dynamics categories, the same architecture generalizes to unseen materials, boundary configurations, initial conditions, and external forces. We further demonstrate downstream interactive control, inverse design, and learning from real-world manipulation data, reducing the need for per-phenomenon solver engineering.
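The cost saving from progressive token merging can be illustrated with a little arithmetic. The sketch below is not the WorldParticle implementation. It assumes dense self-attention whose cost scales with the square of the token count, and the function names, the particle count of 4096 and the four merge levels are hypothetical values chosen for illustration.

```python
def merge_schedule(num_tokens: int, levels: int) -> list[int]:
    """Token count at each encoder level when every merge halves the count."""
    counts = [num_tokens]
    for _ in range(levels):
        # Halve the token count, never dropping below a single token.
        counts.append(max(1, counts[-1] // 2))
    return counts


def attention_pairs(counts: list[int]) -> list[int]:
    """Pairwise interactions per level, assuming dense (quadratic) self-attention."""
    return [n * n for n in counts]


# Hypothetical example: 4096 particle tokens merged over 4 levels.
schedule = merge_schedule(4096, 4)
pairs = attention_pairs(schedule)
for level, (n, p) in enumerate(zip(schedule, pairs)):
    print(f"level {level}: {n} tokens, {p} attention pairs")
```

Under the quadratic-cost assumption, each halving of the token count cuts the pairwise attention work at that level by a factor of four.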

preprint · 2023 · arXiv

DR-WLC: Dimensionality Reduction cognition for object detection and pose estimation by Watching, Learning and Checking

Object detection and pose estimation are difficult tasks in robotics and autonomous driving. Existing object detection and pose estimation methods mostly train on data of the same dimensionality as the task. For example, 2D object detection usually requires a large amount of costly 2D annotation data. Using high-dimensional information to supervise lower-dimensional tasks is a feasible way to reduce dataset size. In this work, we propose DR-WLC, a dimensionality reduction cognitive model that performs object detection and pose estimation at the same time. The model requires only 3D models of objects and unlabeled environment images (with or without objects) for training. In addition, a bounding box generation strategy is proposed to build the relationship between the 3D model and the 2D object detection task. Experiments show that our method can accomplish both tasks without any manual annotations and is easy to deploy in practical applications. Source code is at https://github.com/IN2-ViAUn/DR-WLC.

preprint · 2022 · arXiv

Benchmarks for Industrial Inspection Based on Structured Light

Robustness and accuracy are two critical metrics for industrial inspection. In this paper, we propose benchmarks that can evaluate the performance of structured light methods. Our evaluation metric was derived from a large number of factory inspection tasks. It consists of four detailed criteria: flatness, length, height and sphericity. With this metric, we can quickly judge whether a structured light method or device can be applied to a specified inspection task. In the final experimental section, a structured light device built for TypeC pin needle inspection is evaluated with our metrics.

preprint · 2022 · arXiv

Graph Neural Networks with Dynamic and Static Representations for Social Recommendation

Recommender systems based on graph neural networks receive increasing research interest due to their excellent ability to learn a variety of side information including social networks. However, previous works usually focus on modeling users, and not much attention is paid to items. Moreover, possible changes in the attraction of items over time, which resemble the dynamic interest of users, are rarely considered, and neither are the correlations among items. To overcome these limitations, this paper proposes graph neural networks with dynamic and static representations for social recommendation (GNN-DSR), which considers both dynamic and static representations of users and items and incorporates their relational influence. GNN-DSR models the short-term dynamic and long-term static interactional representations of the user's interest and the item's attraction, respectively. Furthermore, the attention mechanism is used to aggregate the social influence of users on the target user and the correlative items' influence on a given item. The final latent factors of user and item are combined to make a prediction. Experiments on three real-world recommender system datasets validate the effectiveness of GNN-DSR.

preprint · 2022 · arXiv

Psychiatric Scale Guided Risky Post Screening for Early Detection of Depression

Depression is a prominent health challenge to the world, and early risk detection (ERD) of depression from online posts can be a promising technique for combating the threat. Early depression detection faces the challenge of efficiently tackling streaming data, balancing the tradeoff between timeliness, accuracy and explainability. To tackle these challenges, we propose a psychiatric scale guided risky post screening method that can capture risky posts related to the dimensions defined in clinical depression scales, and provide an interpretable diagnostic basis. A Hierarchical Attentional Network equipped with BERT (HAN-BERT) is proposed to further advance explainable predictions. For ERD, we propose an online algorithm based on an evolving queue of risky posts that can significantly reduce the number of model inferences to boost efficiency. Experiments show that our method outperforms the competitive feature-based and neural models under conventional depression detection settings, and achieves simultaneous improvement in both efficacy and efficiency for ERD.

preprint · 2022 · arXiv

Searching For Gravitational Waves From Cosmological Phase Transitions With The NANOGrav 12.5-year dataset

We search for a first-order phase transition gravitational wave signal in 45 pulsars from the NANOGrav 12.5-year dataset. We find that the data can be modeled in terms of a strong first-order phase transition taking place at temperatures below the electroweak scale. However, we do not observe any strong preference for a phase-transition interpretation of the signal over the standard astrophysical interpretation in terms of supermassive black hole mergers; but we expect to gain additional discriminating power with future datasets, improving the signal-to-noise ratio and extending the sensitivity window to lower frequencies. An interesting open question is how well gravitational wave observatories could separate such signals.

preprint · 2022 · arXiv

Symptom Identification for Interpretable Detection of Multiple Mental Disorders

Mental disease detection (MDD) from social media has suffered from poor generalizability and interpretability, due to lack of symptom modeling. This paper introduces PsySym, the first annotated symptom identification corpus of multiple psychiatric disorders, to facilitate further research progress. PsySym is annotated according to a knowledge graph of the 38 symptom classes related to 7 mental diseases compiled from established clinical manuals and scales, and a novel annotation framework for diversity and quality. Experiments show that symptom-assisted MDD enabled by PsySym can outperform strong pure-text baselines. We also exhibit the convincing MDD explanations provided by symptom predictions with case studies, and point to their further potential applications.

preprint · 2021 · arXiv

Ab initio electronic density in solids by many-body plane-wave auxiliary-field quantum Monte Carlo calculations

We present accurate many-body results of the electronic densities in several solid materials, including Si, NaCl, and Cu. These results are obtained using the ab initio auxiliary-field quantum Monte Carlo (AFQMC) method working in a plane-wave basis with norm-conserving, multiple-projector pseudopotentials. AFQMC has been shown to be an excellent many-body total energy method. Computation of observables and correlation functions other than the ground-state energy requires back-propagation, whose adaption and implementation in the plane-wave basis AFQMC framework are discussed in the present paper. This development allows us to compute correlation functions, electronic densities and interatomic forces, paving the way for geometry optimizations and calculations of thermodynamic properties in solids. Finite supercell size effects are considerably more subtle in the many-body framework than in independent-electron calculations. We analyze the convergence of the electronic density, and obtain best estimates for the thermodynamic limit. The densities from several typical density functionals are benchmarked against our near-exact results. The electronic densities we have obtained can also be used to help construct improved density functionals.

preprint · 2021 · arXiv

Applying clock comparison methods to pulsar timing observations

Frequency metrology outperforms any other branch of metrology in accuracy (parts in $10^{-16}$) and small fluctuations ($<10^{-17}$). In turn, among celestial bodies, the rotation speed of millisecond pulsars (MSP) is by far the most stable ($<10^{-18}$). Therefore, the precise measurement of the time of arrival (TOA) of pulsar signals is expected to disclose information about cosmological phenomena, and to enlarge our astrophysical knowledge. Related to this topic, Pulsar Timing Array (PTA) projects have been developed and operated over the last decades. The TOAs from a pulsar can be affected by local emission and environmental effects, in the direction of the propagation through the interstellar medium or universally by gravitational waves from supermassive black hole binaries. These effects (signals) can manifest as a low-frequency fluctuation over time, phenomenologically similar to a red noise, while the remaining pulsar intrinsic and instrumental background (noise) is white. This article focuses on the frequency metrology of pulsars. From our standpoint, the pulsar is an accurate clock, to be measured simultaneously with several telescopes in order to reject the uncorrelated white noise. We apply the modern statistical methods of time-and-frequency metrology to simulated pulsar data, and we show the detection limit of the correlated red noise signal between telescopes.

preprint · 2021 · arXiv

Neural Relational Inference with Efficient Message Passing Mechanisms

Many complex processes can be viewed as dynamical systems of interacting agents. In many cases, only the state sequences of individual agents are observed, while the interacting relations and the dynamical rules are unknown. The neural relational inference (NRI) model adopts graph neural networks that pass messages over a latent graph to jointly learn the relations and the dynamics based on the observed data. However, NRI infers the relations independently and suffers from error accumulation in multi-step prediction in the dynamics learning procedure. Besides, relation reconstruction without prior knowledge becomes more difficult in more complex systems. This paper introduces efficient message passing mechanisms to the graph neural networks with structural prior knowledge to address these problems. A relation interaction mechanism is proposed to capture the coexistence of all relations, and a spatio-temporal message passing mechanism is proposed to use historical information to alleviate error accumulation. Additionally, the structural prior knowledge, symmetry as a special case, is introduced for better relation prediction in more complex systems. The experimental results on simulated physics systems show that the proposed method outperforms existing state-of-the-art methods.

preprint · 2021 · arXiv

Response and Uncertainty of the Parabolic Variance PVAR to Non-Integer Exponents of the Power Law

Oscillator fluctuations are described as the phase or frequency noise spectrum, or in terms of a wavelet variance as a function of the measurement time. The spectrum is generally approximated by the `power law,' i.e., a Laurent polynomial with integer exponents of the frequency. This article extends the domain of application of PVAR, a wavelet variance which uses the linear regression on phase data to estimate the frequency, and called `parabolic' because such regression is equivalent to a parabolic-shaped weight function applied to frequency fluctuations. In turn, PVAR is relevant in that it improves on the widely-used Modified Allan variance (MVAR) enabling the detection of the same noise processes at the same confidence level in a shorter measurement time. More specifically, we provide (i) the analytical expression of the response of the PVAR to the frequency-noise spectrum in the general case of non-integer exponents of the frequency, and (ii) a useful approximate expression of the statistical uncertainty.

preprint · 2021 · arXiv

The NANOGrav 12.5-year Data Set: Search For An Isotropic Stochastic Gravitational-Wave Background

We search for an isotropic stochastic gravitational-wave background (GWB) in the $12.5$-year pulsar timing data set collected by the North American Nanohertz Observatory for Gravitational Waves. Our analysis finds strong evidence of a stochastic process, modeled as a power-law, with common amplitude and spectral slope across pulsars. The Bayesian posterior of the amplitude for an $f^{-2/3}$ power-law spectrum, expressed as the characteristic GW strain, has median $1.92 \times 10^{-15}$ and $5\%$--$95\%$ quantiles of $1.37$--$2.67 \times 10^{-15}$ at a reference frequency of $f_\mathrm{yr} = 1 ~\mathrm{yr}^{-1}$. The Bayes factor in favor of the common-spectrum process versus independent red-noise processes in each pulsar exceeds $10,000$. However, we find no statistically significant evidence that this process has quadrupolar spatial correlations, which we would consider necessary to claim a GWB detection consistent with general relativity. We find that the process has neither monopolar nor dipolar correlations, which may arise from, for example, reference clock or solar system ephemeris systematics, respectively. The amplitude posterior has significant support above previously reported upper limits; we explain this in terms of the Bayesian priors assumed for intrinsic pulsar red noise. We examine potential implications for the supermassive black hole binary population under the hypothesis that the signal is indeed astrophysical in nature.
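As a rough illustration of the quoted spectrum, the characteristic strain of a power-law background is conventionally written as $h_c(f) = A\,(f/f_\mathrm{yr})^{\alpha}$, with $\alpha = -2/3$ for the spectrum named in the abstract. The sketch below evaluates that expression with the reported median amplitude. The function name, the Julian-year conversion and the sample frequency are assumptions made for illustration and are not taken from the paper's analysis code.

```python
import math

# Assumption: one Julian year (365.25 days) defines the reference frequency f_yr.
SECONDS_PER_YEAR = 365.25 * 24 * 3600
F_YR_HZ = 1.0 / SECONDS_PER_YEAR


def characteristic_strain(f_hz: float,
                          amplitude: float = 1.92e-15,
                          alpha: float = -2.0 / 3.0) -> float:
    """Power-law characteristic strain h_c(f) = A * (f / f_yr) ** alpha."""
    return amplitude * (f_hz / F_YR_HZ) ** alpha


# At the reference frequency the strain equals the amplitude itself.
h_ref = characteristic_strain(F_YR_HZ)

# Hypothetical sample point: a frequency eight times lower than f_yr.
h_low = characteristic_strain(F_YR_HZ / 8)
print(f"h_c(f_yr)   = {h_ref:.3e}")
print(f"h_c(f_yr/8) = {h_low:.3e}")
```

Because the exponent is negative, the strain grows toward lower frequencies: at one eighth of the reference frequency it is $8^{2/3} = 4$ times the reference amplitude.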

preprint · 2020 · arXiv

A Novel Framework with Information Fusion and Neighborhood Enhancement for User Identity Linkage

User identity linkage across social networks is an essential problem for cross-network data mining. Since network structure, profile and content information describe different aspects of users, it is critical to learn effective user representations that integrate heterogeneous information. This paper proposes a novel framework with INformation FUsion and Neighborhood Enhancement (INFUNE) for user identity linkage. The information fusion component adopts a group of encoders and decoders to fuse heterogeneous information and generate discriminative node embeddings for preliminary matching. Then, these embeddings are fed to the neighborhood enhancement component, a novel graph neural network, to produce adaptive neighborhood embeddings that reflect the overlapping degree of neighborhoods of varying candidate user pairs. The importance of node embeddings and neighborhood embeddings is weighted for final prediction. The proposed method is evaluated on real-world social network data. The experimental results show that INFUNE significantly outperforms existing state-of-the-art methods.

preprint · 2020 · arXiv

Massive black hole binary systems and the NANOGrav 12.5 year results

The North American Nanohertz Observatory for Gravitational Waves (NANOGrav) has recently reported evidence for the presence of a common stochastic signal across their array of pulsars. The origin of this signal is still unclear. One of the possibilities is that it is due to a stochastic gravitational wave background (SGWB) in the $\sim 1-10\,{\rm nHz}$ frequency region. Taking the NANOGrav observational result at face value, we show that this signal would be fully consistent with a SGWB produced by an unresolved population of in-spiralling massive black hole binaries (MBHBs) predicted by current theoretical models. Considering an astrophysically agnostic model we find that the MBHB merger rate is loosely constrained to the range $10^{-11} - 2$ $\mathrm{Mpc}^{-3}\,\mathrm{Gyr}^{-1}$. Including additional constraints from galaxy pairing fractions and MBH-bulge scaling relations, we find that the MBHB merger rate is $10^{-5} - 5\times10^{-4}$ $\mathrm{Mpc}^{-3}\,\mathrm{Gyr}^{-1}$, the MBHB merger time-scale is $\le 3\,\mathrm{Gyr}$ and the norm of the $M_\mathrm{BH}-M_\mathrm{bulge}$ relation $\ge 1.2\times 10^{8}\,M_\odot$ (all intervals quoted at 90\% confidence). Regardless of the astrophysical details of MBHB assembly, this result would imply that a sufficiently large population of massive black holes pair up, form binaries and merge within a Hubble time.
