Researcher profile

Jiaxin Mao

Jiaxin Mao contributes to research discovery and scholarly infrastructure.

Trust snapshot

Quick read

Trust 21 (Emerging)
Verification L1
Unclaimed author
8 works
0 followers
7 topics
4 close collaborators

Published work

8 published items

preprint · 2026 · arXiv

Beyond Monolithic Architectures: A Multi-Agent Search and Knowledge Optimization Framework for Agentic Search

Agentic search has emerged as a promising paradigm for complex information seeking by enabling Large Language Models (LLMs) to interleave reasoning with tool use. However, prevailing systems rely on monolithic agents that suffer from structural bottlenecks, including unconstrained reasoning outputs that inflate trajectories, sparse outcome-level rewards that complicate credit assignment, and stochastic search noise that destabilizes learning. To address these challenges, we propose M-ASK (Multi-Agent Search and Knowledge), a framework that explicitly decouples agentic search into two complementary roles: Search Behavior Agents, which plan and execute search actions, and Knowledge Management Agents, which aggregate, filter, and maintain a compact internal context. This decomposition allows each agent to focus on a well-defined subtask and reduces interference between search and context construction. Furthermore, to enable stable coordination, M-ASK employs turn-level rewards to provide granular supervision for both search decisions and knowledge updates. Experiments on multi-hop QA benchmarks demonstrate that M-ASK outperforms strong baselines, achieving not only superior answer accuracy but also significantly more stable training dynamics. The source code for M-ASK is available at https://github.com/chenyiqun/M-ASK.

preprint · 2023 · arXiv

RANS Simulations of Turbulent Round Jets in the Presence of Density Difference and Comparison with High-Resolution Experimental Data

In this paper, the novel experimental data reported by Qin et al. [1] are used to assess the predictive capability of the Realizable k-epsilon (RKE) model and the Reynolds stress transport (RST) model for buoyant jets and to understand the reasons for discrepancies. In particular, we compare simulation results for a turbulent buoyant jet flow in the self-similar region with high-resolution experimental data obtained for a jet injected from a 2 mm nozzle into a $300 \times 300 \times 300\ \mathrm{mm}^3$ tank, with a nominal Reynolds number of 10,000. Results show that the streamwise velocity profiles predicted by the RST model agree well with the experimental data, while the RKE model predicts a larger spreading rate. For turbulent statistics, the turbulent kinetic energy shows a discrepancy in the center region, while the shear stress is well predicted by both models. Comparison of the turbulent kinetic energy production term with experimental data revealed reasons for the discrepancy and also showed that the gradient of the streamwise velocity in the crosswise direction contributes the most to the turbulent kinetic energy production. Investigation of the model coefficients of the turbulent dissipation equation for the RKE model revealed that $C_{\varepsilon 2}$ is critical to model accuracy.

preprint · 2022 · arXiv

Disentangled Modeling of Domain and Relevance for Adaptable Dense Retrieval

Recent advances in Dense Retrieval (DR) techniques have significantly improved the effectiveness of first-stage retrieval. Trained with large-scale supervised data, DR models can encode queries and documents into a low-dimensional dense space and conduct effective semantic matching. However, previous studies have shown that the effectiveness of DR models drops by a large margin when the trained models are applied to a target domain that differs from the domain of the labeled data. One possible reason is that the DR model has never seen the target corpus and thus might be incapable of mitigating the difference between the training and target domains. In practice, unfortunately, training a DR model for each target domain to avoid domain shift is often difficult, as it requires additional time, storage, and domain-specific data labeling, which are not always available. To address this problem, we propose a novel DR framework named Disentangled Dense Retrieval (DDR) to support effective and flexible domain adaptation for DR models. DDR consists of a Relevance Estimation Module (REM) for modeling domain-invariant matching patterns and several Domain Adaption Modules (DAMs) for modeling domain-specific features of multiple target corpora. By disentangling the REM and DAMs, DDR enables a flexible training paradigm in which the REM is trained with supervision once and the DAMs are trained with unsupervised data. Comprehensive experiments in different domains and languages show that DDR significantly improves ranking performance compared to strong DR baselines and substantially outperforms traditional retrieval methods in most scenarios.

preprint · 2022 · arXiv

Evaluating Interpolation and Extrapolation Performance of Neural Retrieval Models

A retrieval model should not only interpolate the training data but also extrapolate well to the queries that are different from the training data. While neural retrieval models have demonstrated impressive performance on ad-hoc search benchmarks, we still know little about how they perform in terms of interpolation and extrapolation. In this paper, we demonstrate the importance of separately evaluating the two capabilities of neural retrieval models. Firstly, we examine existing ad-hoc search benchmarks from the two perspectives. We investigate the distribution of training and test data and find a considerable overlap in query entities, query intent, and relevance labels. This finding implies that the evaluation on these test sets is biased toward interpolation and cannot accurately reflect the extrapolation capacity. Secondly, we propose a novel evaluation protocol to separately evaluate the interpolation and extrapolation performance on existing benchmark datasets. It resamples the training and test data based on query similarity and utilizes the resampled dataset for training and evaluation. Finally, we leverage the proposed evaluation protocol to comprehensively revisit a number of widely-adopted neural retrieval models. Results show models perform differently when moving from interpolation to extrapolation. For example, representation-based retrieval models perform almost as well as interaction-based retrieval models in terms of interpolation but not extrapolation. Therefore, it is necessary to separately evaluate both interpolation and extrapolation performance and the proposed resampling method serves as a simple yet effective evaluation tool for future IR studies.
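As a rough illustration of evaluating the two capabilities separately, the sketch below splits test queries into an interpolation set (similar to some training query) and an extrapolation set (dissimilar to all of them). It is a simplified stand-in, not the paper's resampling protocol: the token-level Jaccard similarity, the 0.5 threshold, and the names `jaccard` and `split_by_similarity` are all assumptions made for this example.

```python
from typing import List, Tuple


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity between two queries (an assumed stand-in metric)."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def split_by_similarity(
    train_queries: List[str],
    test_queries: List[str],
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the paper
) -> Tuple[List[str], List[str]]:
    """Partition test queries by their best similarity to any training query."""
    interpolation, extrapolation = [], []
    for query in test_queries:
        best = max((jaccard(query, t) for t in train_queries), default=0.0)
        if best >= threshold:
            interpolation.append(query)   # resembles the training data
        else:
            extrapolation.append(query)   # unlike anything seen in training
    return interpolation, extrapolation


train = ["cheap flights to paris", "python list sort"]
test = ["cheap flights to rome", "symptoms of flu"]
interp, extrap = split_by_similarity(train, test)
print("interpolation:", interp)
print("extrapolation:", extrap)
```

Reporting a retrieval metric on each subset separately would then show how far a model's score depends on overlap between training and test queries.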

preprint · 2022 · arXiv

Interacting with Non-Cooperative User: A New Paradigm for Proactive Dialogue Policy

A proactive dialogue system is able to lead the conversation to a goal topic and has promising potential in bargaining, persuasion, and negotiation. The current corpus-based learning approach limits its practical application in real-world scenarios. To this end, we advance the study of proactive dialogue policy to a more natural and challenging setting, i.e., interacting dynamically with users. Further, we call attention to non-cooperative user behavior: the user talks about off-path topics when he/she is not satisfied with the previous topics introduced by the agent. We argue that the targets of reaching the goal topic quickly and maintaining high user satisfaction do not always converge, because the topics close to the goal and the topics the user prefers may not be the same. To address this issue, we propose a new solution named I-Pro that can learn a Proactive policy in the Interactive setting. Specifically, we learn the trade-off via a learned goal weight, which consists of four factors (dialogue turn, goal completion difficulty, user satisfaction estimation, and cooperative degree). The experimental results demonstrate that I-Pro significantly outperforms baselines in terms of effectiveness and interpretability.

preprint · 2022 · arXiv

KuaiRec: A Fully-observed Dataset and Insights for Evaluating Recommender Systems

The progress of recommender systems is hampered mainly by evaluation as it requires real-time interactions between humans and systems, which is too laborious and expensive. This issue is usually approached by utilizing the interaction history to conduct offline evaluation. However, existing datasets of user-item interactions are partially observed, leaving it unclear how and to what extent the missing interactions will influence the evaluation. To answer this question, we collect a fully-observed dataset from Kuaishou's online environment, where almost all 1,411 users have been exposed to all 3,327 items. To the best of our knowledge, this is the first real-world fully-observed data with millions of user-item interactions. With this unique dataset, we conduct a preliminary analysis of how the two factors - data density and exposure bias - affect the evaluation results of multi-round conversational recommendation. Our main discoveries are that the performance ranking of different methods varies with the two factors, and this effect can only be alleviated in certain cases by estimating missing interactions for user simulation. This demonstrates the necessity of the fully-observed dataset. We release the dataset and the pipeline implementation for evaluation at https://kuairec.com

preprint · 2020 · arXiv

Neural Logic Reasoning

Recent years have witnessed the success of deep neural networks in many research areas. The fundamental idea behind the design of most neural networks is to learn similarity patterns from data for prediction and inference, which lacks the ability of cognitive reasoning. However, the concrete ability of reasoning is critical to many theoretical and practical problems. On the other hand, traditional symbolic reasoning methods do well in making logical inference, but they are mostly hard rule-based reasoning, which limits their generalization ability to different tasks since different tasks may require different rules. Both reasoning and generalization ability are important for prediction tasks such as recommender systems, where reasoning provides a strong connection between user history and target items for accurate prediction, and generalization helps the model to draw a robust user portrait over noisy inputs. In this paper, we propose the Logic-Integrated Neural Network (LINN) to integrate the power of deep learning and logic reasoning. LINN is a dynamic neural architecture that builds the computational graph according to input logical expressions. It learns basic logical operations such as AND, OR, and NOT as neural modules, and conducts propositional logical reasoning through the network for inference. Experiments on a theoretical task show that LINN achieves significant performance on solving logical equations and variables. Furthermore, we test our approach on the practical task of recommendation by formulating the task as a logical inference problem. Experiments show that LINN significantly outperforms state-of-the-art recommendation models in Top-K recommendation, which verifies the potential of LINN in practice.

preprint · 2020 · arXiv

RepBERT: Contextualized Text Embeddings for First-Stage Retrieval

Although exact term matching between queries and documents is the dominant method for first-stage retrieval, we propose a different approach, called RepBERT, which represents documents and queries with fixed-length contextualized embeddings. The inner products of query and document embeddings are regarded as relevance scores. On the MS MARCO Passage Ranking task, RepBERT achieves state-of-the-art results among all initial retrieval techniques. Its efficiency is comparable to that of bag-of-words methods.
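The scoring step described in this abstract, using the inner product of a query embedding and a document embedding as the relevance score, can be sketched in a few lines. This is a minimal illustration only: the vectors below are hand-written toy values rather than outputs of a BERT encoder, and the helper names `inner_product` and `rank_documents` are hypothetical, not part of any RepBERT release.

```python
from typing import List, Sequence, Tuple


def inner_product(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product of two fixed-length embeddings."""
    if len(u) != len(v):
        raise ValueError("embeddings must have the same length")
    return sum(a * b for a, b in zip(u, v))


def rank_documents(
    query_emb: Sequence[float],
    doc_embs: Sequence[Sequence[float]],
) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs sorted by descending score."""
    scored = [(i, inner_product(query_emb, d)) for i, d in enumerate(doc_embs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy embeddings (assumed values, for illustration only).
query = [1.0, 0.0, 2.0]
docs = [
    [0.5, 1.0, 0.0],
    [1.0, 0.0, 1.0],
    [0.0, 2.0, 0.5],
]
ranking = rank_documents(query, docs)
print(ranking)
```

Because the score depends only on two precomputed vectors, document embeddings can be computed offline and searched with a maximum-inner-product index at query time.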