Researcher profile

Jianhao Yan

Jianhao Yan contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 17 - Unverified
Verification L1
Unclaimed author
4 works
0 followers
2 topics
4 close collaborators

Published work

4 published items

Preprint, 2025, arXiv

A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond

Recent Large Reasoning Models (LRMs), such as DeepSeek-R1 and OpenAI o1, have demonstrated strong performance gains by scaling up the length of Chain-of-Thought (CoT) reasoning during inference. However, a growing concern lies in their tendency to produce excessively long reasoning traces, which are often filled with redundant content (e.g., repeated definitions), over-analysis of simple problems, and superficial exploration of multiple reasoning paths for harder tasks. This inefficiency introduces significant challenges for training, inference, and real-world deployment (e.g., in agent-based systems), where token economy is critical. In this survey, we provide a comprehensive overview of recent efforts aimed at improving reasoning efficiency in LRMs, with a particular focus on the unique challenges that arise in this new paradigm. We identify common patterns of inefficiency, examine methods proposed across the LRM lifecycle, i.e., from pretraining to inference, and discuss promising future directions for research. To support ongoing development, we also maintain a real-time GitHub repository tracking recent progress in the field. We hope this survey serves as a foundation for further exploration and inspires innovation in this rapidly evolving area.

Preprint, 2022, arXiv

Probing Causes of Hallucinations in Neural Machine Translations

Hallucination, a kind of pathological translation that troubles Neural Machine Translation (NMT), has recently drawn much attention. In simple terms, hallucinated translations are fluent sentences that are barely related to the source inputs. Arguably, how hallucination occurs remains an open problem. In this paper, we propose to use probing methods to investigate the causes of hallucinations from the perspective of model architecture, aiming to avoid such problems in future architecture designs. By conducting experiments over various NMT datasets, we find that hallucination is often accompanied by a deficient encoder, especially deficient embeddings, and by vulnerable cross-attention, while, interestingly, cross-attention mitigates some errors caused by the encoder.

Preprint, 2020, arXiv

Dual Past and Future for Neural Machine Translation

Though remarkable successes have been achieved by Neural Machine Translation (NMT) in recent years, it still suffers from the inadequate-translation problem. Previous studies show that explicitly modeling the Past and Future contents of the source sentence is beneficial for translation performance. However, it is not clear whether the commonly used heuristic objective is good enough to guide the Past and Future. In this paper, we present a novel dual framework that leverages both source-to-target and target-to-source NMT models to provide a more direct and accurate supervision signal for the Past and Future modules. Experimental results demonstrate that our proposed method significantly improves the adequacy of NMT predictions and surpasses previous methods in two well-studied translation tasks.

Preprint, 2020, arXiv

Learning to Encode Evolutionary Knowledge for Automatic Commenting Long Novels

Static knowledge graphs have been incorporated extensively into sequence-to-sequence frameworks for text generation. While they represent structured context effectively, static knowledge graphs fail to represent knowledge evolution, which is required for modeling dynamic events. In this paper, an automatic commenting task is proposed for long novels, which involves understanding contexts of tens of thousands of words or more. To model the dynamic storyline, especially the transitions of the characters and their relations, an Evolutionary Knowledge Graph (EKG) is proposed and learned within a multi-task framework. Given a specific passage to comment on, sequential modeling is used to incorporate historical and future embeddings for context representation. Further, a graph-to-sequence model is designed to utilize the EKG for comment generation. Extensive experimental results show that our EKG-based method is superior to several strong baselines on both automatic and human evaluations.