Researcher profile

Peitian Zhang

Peitian Zhang contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 15 - UnverifiedVerification L1Unclaimed author
3works
0followers
2topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

3 published item(s)

preprint2026arXiv

A Multi-Task Embedder For Retrieval Augmented LLMs

LLMs confront inherent limitations in terms of its knowledge, memory, and action. The retrieval augmentation stands as a vital mechanism to address these limitations, which brings in useful information from external sources to augment the LLM. However, existing retrieval methods encounter two pressing issues. On one hand, the general retrievers are not properly optimized for retrieval augmentation hence exhibit limited effectiveness; on the other hand, the task-specific retrievers excel in the targeted retrieval augmentation scenario, while lack the versatility to handle diverse scenarios. In this work, we propose \textbf{LLM-Embedder} for the unified support of diverse retrieval augmentation scenarios. Our method presents three technical contributions. Firstly, we introduce a new \textit{reward formulation}, namely {rank-aware reward}. It exploits the ranking position of the desired output among $N$ sampled outputs from the LLM, which leads to fine-grained and robust computation of reward from the LLM's feedback. Secondly, we design a novel \textit{distillation objective}, called graded distillation. It incorporates both the absolute value and the relative order of the reward for more sufficient utilization of the LLM's feedback. Thirdly, we systematically optimize the \textit{multi-task learning}, which effectively unifies the multiple retrieval functionalities into one model. In our experiment, LLM-Embedder notably improves the LLM's performances in various downstream tasks, and outperforms both general and task-specific retrievers with a substantial advantage.

preprint2022arXiv

GateFormer: Speeding Up News Feed Recommendation with Input Gated Transformers

News feed recommendation is an important web service. In recent years, pre-trained language models (PLMs) have been intensively applied to improve the recommendation quality. However, the utilization of these deep models is limited in many aspects, such as lack of explainability and being incompatible with the existing inverted index systems. Above all, the PLMs based recommenders are inefficient, as the encoding of user-side information will take huge computation costs. Although the computation can be accelerated with efficient transformers or distilled PLMs, it is still not enough to make timely recommendations for the active users, who are associated with super long news browsing histories. In this work, we tackle the efficient news recommendation problem from a distinctive perspective. Instead of relying on the entire input (i.e., the collection of news articles a user ever browsed), we argue that the user's interest can be fully captured merely with those representative keywords. Motivated by this, we propose GateFormer, where the input data is gated before feeding into transformers. The gating module is made personalized, lightweight and end-to-end learnable, such that it may perform accurate and efficient filtering of informative user input. GateFormer achieves highly impressive performances in experiments, where it notably outperforms the existing acceleration approaches in both accuracy and efficiency. We also surprisingly find that even with over 10-fold compression of the original input, GateFormer is still able to maintain on-par performances with the SOTA methods.

preprint2022arXiv

Ultron: An Ultimate Retriever on Corpus with a Model-based Indexer

Document retrieval has been extensively studied within the index-retrieve framework for decades, which has withstood the test of time. Unfortunately, such a pipelined framework limits the optimization of the final retrieval quality, because indexing and retrieving are separated stages that can not be jointly optimized in an end-to-end manner. In order to unify these two stages, we explore a model-based indexer for document retrieval. Concretely, we propose Ultron, which encodes the knowledge of all documents into the model and aims to directly retrieve relevant documents end-to-end. For the model-based indexer, how to represent docids and how to train the model are two main issues to be explored. Existing solutions suffer from semantically deficient docids and limited supervised data. To tackle these two problems, first, we devise two types of docids that are richer in semantics and easier for model inference. In addition, we propose a three-stage training workflow to capture more knowledge contained in the corpus and associations between queries and docids. Experiments on two public datasets demonstrate the superiority of Ultron over advanced baselines for document retrieval.