Source author record

Pengfei Liu

Pengfei Liu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Catalog footprint

What is connected

37 works

22 topics

4 close collaborators

preprint, 2026, arXiv

AcademiClaw: When Students Set Challenges for AI Agents

Benchmarks within the OpenClaw ecosystem have thus far evaluated exclusively assistant-level tasks, leaving the academic-level capabilities of OpenClaw largely unexamined. We introduce AcademiClaw, a bilingual benchmark of 80 complex, long-horizon tasks sourced directly from university students' real academic workflows -- homework, research projects, competitions, and personal projects -- that they found current AI agents unable to solve effectively. Curated from 230 student-submitted candidates through rigorous expert review, the final task set spans 25+ professional domains, ranging from olympiad-level mathematics and linguistics problems to GPU-intensive reinforcement learning and full-stack system debugging, with 16 tasks requiring CUDA GPU execution. Each task executes in an isolated Docker sandbox and is scored on task completion by multi-dimensional rubrics combining six complementary techniques, with an independent five-category safety audit providing additional behavioral analysis. Experiments on six frontier models show that even the best achieves only a 55% pass rate. Further analysis uncovers sharp capability boundaries across task domains, divergent behavioral strategies among models, and a disconnect between token consumption and output quality, providing fine-grained diagnostic signals beyond what aggregate metrics reveal. We hope that AcademiClaw and its open-sourced data and code can serve as a useful resource for the OpenClaw community, driving progress toward agents that are more capable and versatile across the full breadth of real-world academic demands. All data and code are available at https://github.com/GAIR-NLP/AcademiClaw.

preprint, 2026, arXiv

Feedback World Model Enables Precise Guidance of Diffusion Policy

World models aim to improve robotic decision making by predicting the consequences of actions. However, in practice, their predictions often become unreliable once the robot encounters states outside the training distribution, limiting their effectiveness at deployment. We observe that execution itself provides a natural but underutilized signal: after each action, the robot directly observes the true next state, revealing the mismatch between predicted and actual outcomes. Building on this insight, we propose feedback world model, a new paradigm that closes the loop between prediction and observation at inference time. Instead of treating the world model as a static open-loop predictor, our method maintains a lightweight feedback state that is updated online to iteratively correct future predictions, compensating for model errors using real-time observations without additional training data or parameter updates. We show that this process can be interpreted as a latent-space observer and admits convergence guarantees under mild conditions. We further introduce action-aware guidance to better translate corrected predictions into control by emphasizing action-controllable components while suppressing irrelevant variations. Experiments on LIBERO-Plus, Robomimic, and real-world manipulation tasks demonstrate that our method substantially improves both prediction accuracy and policy performance under distribution shift. In particular, it reduces world model prediction error by up to 76.4% and improves out-of-distribution (OOD) success rate by 30%. These results show that incorporating real-time feedback at inference time provides a simple yet powerful alternative to static world modeling.

preprint, 2026, arXiv

SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture

Recent large vision-language models (VLMs) remain fundamentally constrained by a persistent dichotomy: understanding and generation are treated as distinct problems, leading to fragmented architectures, cascaded pipelines, and misaligned representation spaces. We argue that this divide is not merely an engineering artifact, but a structural limitation that hinders the emergence of native multimodal intelligence. Hence, we introduce SenseNova-U1, a native unified multimodal paradigm built upon NEO-unify, in which understanding and generation evolve as synergistic views of a single underlying process. We launch two native unified variants, SenseNova-U1-8B-MoT and SenseNova-U1-A3B-MoT, built on dense (8B) and mixture-of-experts (30B-A3B) understanding baselines, respectively. Designed from first principles, they rival top-tier understanding-only VLMs across text understanding, vision-language perception, knowledge reasoning, agentic decision-making, and spatial intelligence. Meanwhile, they deliver strong semantic consistency and visual fidelity, excelling in conventional or knowledge-intensive any-to-image (X2I) synthesis, complex text-rich infographic generation, and interleaved vision-language generation, with or without think patterns. Beyond performance, we show detailed model design, data preprocessing, pre-/post-training, and inference strategies to support community research. Last but not least, preliminary evidence demonstrates that our models extend beyond perception and generation, performing strongly in vision-language-action (VLA) and world model (WM) scenarios. This points toward a broader roadmap where models do not translate between modalities, but think and act across them in a native manner. Multimodal AI is no longer about connecting separate systems, but about building a unified one and trusting the necessary capabilities to emerge from within.

preprint, 2026, arXiv

SimCT: Recovering Lost Supervision for Cross-Tokenizer On-Policy Distillation

On-policy distillation (OPD) is a standard tool for transferring teacher behavior to a smaller student, but it implicitly assumes that teacher and student predictions are comparable token by token, an assumption that fails whenever the two models tokenize the same text differently. Under heterogeneous tokenizers, exact shared-token matching silently discards a large fraction of the teacher signal at precisely the positions where vocabularies disagree. We propose Simple Cross-Tokenizer OPD (SimCT), which restores this signal by enlarging the supervision space: alongside shared tokens, SimCT compares teacher and student over short multi-token continuations that both tokenizers can realize, leaving the OPD loss form itself unchanged. We show that these units are the finest jointly tokenizable supervision interface, and that coarser alternatives remove teacher-student distinctions that are useful for on-policy learning. Across three heterogeneous teacher-student pairs on mathematical reasoning and code-generation benchmarks, SimCT shows consistent gains over shared-vocabulary OPD and representative cross-tokenizer baselines, with ablations confirming that the improvements come from recovering supervision discarded by exact shared-token matching. Code is available at https://github.com/sunjie279/SimCT-.

preprint, 2024, arXiv

InFoBench: Evaluating Instruction Following Ability in Large Language Models

This paper introduces the Decomposed Requirements Following Ratio (DRFR), a new metric for evaluating Large Language Models' (LLMs) ability to follow instructions. Addressing a gap in current methodologies, DRFR breaks down complex instructions into simpler criteria, facilitating a detailed analysis of LLMs' compliance with various aspects of tasks. Alongside this metric, we present InFoBench, a benchmark comprising 500 diverse instructions and 2,250 decomposed questions across multiple constraint categories. Our experiments compare DRFR with traditional scoring methods and explore annotation sources, including human experts, crowd-sourced workers, and GPT-4. The findings demonstrate DRFR's higher reliability and the effectiveness of using GPT-4 as a cost-efficient annotator. The evaluation of several advanced LLMs using this framework reveals their strengths and areas needing improvement, particularly in complex instruction-following. This study contributes a novel metric and benchmark, offering insights for future LLM development and evaluation.
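The ratio described above can be sketched in a few lines. This is only an illustration, assuming that DRFR is the share of decomposed yes/no criteria judged satisfied, pooled over all instructions, as the metric's name suggests. The function name `drfr` and the example labels are hypothetical and do not come from the InFoBench code.

```python
from typing import Dict, List


def drfr(judgments: Dict[str, List[bool]]) -> float:
    """Return the fraction of decomposed requirements judged satisfied.

    `judgments` maps an instruction id to one boolean per decomposed
    criterion (True means the response met that criterion).
    Assumption: criteria are pooled across instructions, not averaged
    per instruction.
    """
    total = sum(len(flags) for flags in judgments.values())
    if total == 0:
        raise ValueError("no decomposed requirements to score")
    satisfied = sum(sum(flags) for flags in judgments.values())
    return satisfied / total


# Hypothetical annotations: two instructions, five criteria in total.
example = {
    "instr-1": [True, True, False],
    "instr-2": [True, False],
}
score = drfr(example)  # three of the five criteria are satisfied
```

Scoring per criterion rather than per instruction is what lets a partially correct response earn partial credit.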

preprint, 2022, arXiv

Are All the Datasets in Benchmark Necessary? A Pilot Study of Dataset Evaluation for Text Classification

In this paper, we ask the research question of whether all the datasets in the benchmark are necessary. We approach this by first characterizing the distinguishability of datasets when comparing different systems. Experiments on 9 datasets and 36 systems show that several existing benchmark datasets contribute little to discriminating top-scoring systems, while less-used datasets exhibit impressive discriminative power. We further, taking the text classification task as a case study, investigate the possibility of predicting dataset discrimination based on its properties (e.g., average sentence length). Our preliminary experiments promisingly show that given a sufficient number of training experimental records, a meaningful predictor can be learned to estimate dataset discrimination over unseen datasets. We released all datasets with features explored in this work on DataLab: https://datalab.nlpedia.ai.

preprint, 2022, arXiv

Artificial Neural Networks for Finger Vein Recognition: A Survey

Finger vein recognition is an emerging biometric recognition technology. Unlike other biometric features on the body surface, the venous vascular tissue of the fingers is buried deep inside the skin. Due to this advantage, finger vein recognition is highly stable and private. Finger veins are almost impossible to steal and are difficult for external conditions to interfere with. Unlike finger vein recognition methods based on traditional machine learning, artificial neural network techniques, especially deep learning, do not rely on feature engineering and have superior performance. To summarize the development of finger vein recognition based on artificial neural networks, this paper collects 149 related papers. First, we introduce the background of finger vein recognition and the motivation of this survey. Then, the development history of artificial neural networks and the representative networks on finger vein recognition tasks are introduced. The public datasets that are widely used in finger vein recognition are then described. After that, we summarize the related finger vein recognition tasks based on classical neural networks and deep neural networks, respectively. Finally, the challenges and potential development directions in finger vein recognition are discussed. To the best of our knowledge, this paper is the first comprehensive survey focusing on finger vein recognition based on artificial neural networks.

preprint, 2022, arXiv

BRIO: Bringing Order to Abstractive Summarization

Abstractive summarization models are commonly trained using maximum likelihood estimation, which assumes a deterministic (one-point) target distribution in which an ideal model will assign all the probability mass to the reference summary. This assumption may lead to performance degradation during inference, where the model needs to compare several system-generated (candidate) summaries that have deviated from the reference summary. To address this problem, we propose a novel training paradigm which assumes a non-deterministic distribution so that different candidate summaries are assigned probability mass according to their quality. Our method achieves a new state-of-the-art result on the CNN/DailyMail (47.78 ROUGE-1) and XSum (49.07 ROUGE-1) datasets. Further analysis also shows that our model can estimate probabilities of candidate summaries that are more correlated with their level of quality.

preprint, 2022, arXiv

DataLab: A Platform for Data Analysis and Intervention

Despite data's crucial role in machine learning, most existing tools and research tend to focus on systems on top of existing data rather than how to interpret and manipulate data. In this paper, we propose DataLab, a unified data-oriented platform that not only allows users to interactively analyze the characteristics of data, but also provides a standardized interface for different data processing operations. Additionally, in view of the ongoing proliferation of datasets, DataLab has features for dataset recommendation and global vision analysis that help researchers form a better view of the data ecosystem. So far, DataLab covers 1,715 datasets and 3,583 transformed versions of them (e.g., hyponym replacement), where 728 datasets support various analyses (e.g., with respect to gender bias) with the help of 140M samples annotated by 318 feature functions. DataLab is under active development and will be supported going forward. We have released a web platform, web API, Python SDK, PyPI published package and online documentation, which, hopefully, can meet the diverse needs of researchers.

preprint, 2022, arXiv

Group Gated Fusion on Attention-based Bidirectional Alignment for Multimodal Emotion Recognition

Emotion recognition is a challenging and actively studied research area that plays a critical role in emotion-aware human-computer interaction systems. In a multimodal setting, temporal alignment between different modalities has not yet been well investigated. This paper presents a new model named Gated Bidirectional Alignment Network (GBAN), which consists of an attention-based bidirectional alignment network over LSTM hidden states to explicitly capture the alignment relationship between speech and text, and a novel group gated fusion (GGF) layer to integrate the representations of different modalities. We empirically show that the attention-aligned representations significantly outperform the last hidden states of the LSTM, and the proposed GBAN model outperforms existing state-of-the-art multimodal approaches on the IEMOCAP dataset.

preprint, 2022, arXiv

I^2R-Net: Intra- and Inter-Human Relation Network for Multi-Person Pose Estimation

In this paper, we present the Intra- and Inter-Human Relation Networks (I^2R-Net) for Multi-Person Pose Estimation. It involves two basic modules. First, the Intra-Human Relation Module operates on a single person and aims to capture Intra-Human dependencies. Second, the Inter-Human Relation Module considers the relation between multiple instances and focuses on capturing Inter-Human interactions. The Inter-Human Relation Module can be made very lightweight by reducing the resolution of the feature map, yet it learns useful relation information that significantly boosts the performance of the Intra-Human Relation Module. Even without bells and whistles, our method can compete with or outperform current competition winners. We conduct extensive experiments on COCO, CrowdPose, and OCHuman datasets. The results demonstrate that the proposed model surpasses all the state-of-the-art methods. Concretely, the proposed method achieves 77.4% AP on the CrowdPose dataset and 67.8% AP on the OCHuman dataset, outperforming existing methods by a large margin. Additionally, the ablation study and visualization analysis also prove the effectiveness of our model.

preprint, 2022, arXiv

KGxBoard: Explainable and Interactive Leaderboard for Evaluation of Knowledge Graph Completion Models

Knowledge Graphs (KGs) store information in the form of (head, predicate, tail)-triples. To augment KGs with new knowledge, researchers proposed models for KG Completion (KGC) tasks such as link prediction; i.e., answering (h; p; ?) or (?; p; t) queries. Such models are usually evaluated with averaged metrics on a held-out test set. While useful for tracking progress, averaged single-score metrics cannot reveal what exactly a model has learned -- or failed to learn. To address this issue, we propose KGxBoard: an interactive framework for performing fine-grained evaluation on meaningful subsets of the data, each of which tests individual and interpretable capabilities of a KGC model. In our experiments, we highlight the findings that we discovered with the use of KGxBoard, which would have been impossible to detect with standard averaged single-score metrics.
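The contrast between one averaged score and per-subset scores can be illustrated with a small sketch. It assumes each test query has already been assigned a bucket label and the rank of the correct answer, and it reports mean reciprocal rank per bucket. The function `bucketed_mrr` and the bucket names are hypothetical and are not part of KGxBoard's interface.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def bucketed_mrr(records: Iterable[Tuple[str, int]]) -> Dict[str, float]:
    """Compute mean reciprocal rank separately for each bucket.

    Each record is (bucket_label, rank), where rank >= 1 is the position
    of the correct entity in the model's ranked predictions.
    """
    sums: Dict[str, float] = defaultdict(float)
    counts: Dict[str, int] = defaultdict(int)
    for bucket, rank in records:
        if rank < 1:
            raise ValueError("rank must be a positive integer")
        sums[bucket] += 1.0 / rank
        counts[bucket] += 1
    return {bucket: sums[bucket] / counts[bucket] for bucket in sums}


# Hypothetical link-prediction results grouped by relation type.
records = [
    ("symmetric", 1),
    ("symmetric", 2),
    ("rare-entity", 10),
    ("rare-entity", 5),
]
per_bucket = bucketed_mrr(records)
overall = sum(1.0 / rank for _, rank in records) / len(records)
```

Here the overall average hides that the `rare-entity` bucket scores far lower than the `symmetric` bucket, which is the kind of gap a single-score metric cannot show.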

preprint, 2022, arXiv

reStructured Pre-training

In this work, we try to decipher the internal connection of NLP technology development in the past decades, searching for essence, which rewards us with a (potential) new learning paradigm for NLP tasks, dubbed reStructured Pre-training (RST). In such a paradigm, the role of data will be re-emphasized, and model pre-training and fine-tuning of downstream tasks are viewed as a process of data storing and accessing. Based on that, we operationalize the simple principle that a good storage mechanism should not only have the ability to cache a large amount of data but also consider the ease of access. We achieve this by pre-training models over restructured data that consist of a variety of valuable information instead of raw data after overcoming several engineering challenges. Experimentally, RST models not only surpass strong competitors (e.g., T0) on 52/55 popular datasets from a variety of NLP tasks, but also achieve superior performance in the National College Entrance Examination - English (Gaokao-English), the most authoritative examination in China. Specifically, the proposed system Qin scores 40 points higher than the average student score and 15 points higher than GPT3 with 1/16 of the parameters. In particular, Qin gets a high score of 138.5 (the full mark is 150) in the 2018 English exam (national paper III). We have released the Gaokao Benchmark with an online submission platform. In addition, we test our model on the 2022 College Entrance Examination English that took place a few days ago (2022.06.08), and it gets a total score of 134 (vs. GPT3's 108).

preprint, 2022, arXiv

Star-Transformer

Although Transformer has achieved great successes on many NLP tasks, its heavy structure with fully-connected attention connections leads to dependencies on large training data. In this paper, we present Star-Transformer, a lightweight alternative by careful sparsification. To reduce model complexity, we replace the fully-connected structure with a star-shaped topology, in which every two non-adjacent nodes are connected through a shared relay node. Thus, complexity is reduced from quadratic to linear, while preserving capacity to capture both local composition and long-range dependency. The experiments on four tasks (22 datasets) show that Star-Transformer achieved significant improvements against the standard Transformer for the modestly sized datasets.
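The quadratic-to-linear claim can be checked by counting connections. The sketch below assumes the star-shaped topology links each token to its immediate neighbour in a chain and to a single relay node. The chain (rather than ring) layout and the helper names are assumptions made for illustration, not details taken from the paper.

```python
from typing import Set, Tuple

Edge = Tuple[int, int]


def fully_connected_edges(n: int) -> Set[Edge]:
    """All unordered pairs among n token nodes: n * (n - 1) / 2 edges."""
    return {(i, j) for i in range(n) for j in range(i + 1, n)}


def star_edges(n: int) -> Set[Edge]:
    """Adjacent-token links plus one link from every token to a relay node.

    The relay gets index n (a hypothetical convention). Non-adjacent
    tokens are connected only by a two-hop path through the relay.
    """
    relay = n
    local = {(i, i + 1) for i in range(n - 1)}   # chain of neighbours
    radial = {(i, relay) for i in range(n)}      # token-to-relay links
    return local | radial


n_tokens = 100
dense_count = len(fully_connected_edges(n_tokens))  # grows as n^2
sparse_count = len(star_edges(n_tokens))            # grows as 2n - 1
```

For 100 tokens the dense layout has 4950 connections and the star layout has 199, while any two tokens remain at most two hops apart via the relay.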

preprint, 2022, arXiv

The MSXF TTS System for ICASSP 2022 ADD Challenge

This paper presents our MSXF TTS system for Task 3.1 of the Audio Deep Synthesis Detection (ADD) Challenge 2022. We use an end-to-end text-to-speech system and add a constraint loss to the system during the training stage. The end-to-end TTS system is VITS, and the pre-trained self-supervised model is wav2vec 2.0. We also explore the influence of speech speed and volume on spoofing. Faster speech means less silence in the audio, which makes it easier to fool the detector. We also find that the smaller the volume, the better the spoofing ability, though we normalize volume for submission. Our team is identified as C2, and we took fourth place in the challenge.

preprint, 2021, arXiv

Can We Automate Scientific Reviewing?

The rapid development of science and technology has been accompanied by an exponential growth in peer-reviewed scientific publications. At the same time, the review of each paper is a laborious process that must be carried out by subject matter experts. Thus, providing high-quality reviews of this growing number of papers is a significant challenge. In this work, we ask the question "can we automate scientific reviewing?", discussing the possibility of using state-of-the-art natural language processing (NLP) models to generate first-pass peer reviews for scientific papers. Arguably the most difficult part of this is defining what a "good" review is in the first place, so we first discuss possible evaluation measures for such reviews. We then collect a dataset of papers in the machine learning domain, annotate them with different aspects of content covered in each review, and train targeted summarization models that take in papers to generate reviews. Comprehensive experimental results show that system-generated reviews tend to touch upon more aspects of the paper than human-written reviews, but the generated text can suffer from lower constructiveness for all aspects except the explanation of the core ideas of the papers, which are largely factually correct. We finally summarize eight challenges in the pursuit of a good review generation system together with potential solutions, which, hopefully, will inspire more future research on this subject. We make all code and the dataset publicly available at https://github.com/neulab/ReviewAdvisor, as well as a ReviewAdvisor system: http://review.nlpedia.ai/.

preprint, 2021, arXiv

Towards More Fine-grained and Reliable NLP Performance Prediction

Performance prediction, the task of estimating a system's performance without performing experiments, allows us to reduce the experimental burden caused by the combinatorial explosion of different datasets, languages, tasks, and models. In this paper, we make two contributions to improving performance prediction for NLP tasks. First, we examine performance predictors not only for holistic measures of accuracy like F1 or BLEU but also fine-grained performance measures such as accuracy over individual classes of examples. Second, we propose methods to understand the reliability of a performance prediction model from two angles: confidence intervals and calibration. We perform an analysis of four types of NLP tasks, and both demonstrate the feasibility of fine-grained performance prediction and the necessity to perform reliability analysis for performance prediction methods in the future. We make our code publicly available: https://github.com/neulab/Reliable-NLPPP

preprint, 2020, arXiv

Distinct Topological Surface States on the Two Terminations of MnBi$_4$Te$_7$

The recently discovered intrinsic magnetic topological insulator MnBi2Te4 has met with unusual success in hosting emergent phenomena such as the quantum anomalous Hall effect and the axion insulator states. However, the surface-bulk correspondence of the Mn-Bi-Te family, composed of the superlattice-like MnBi2Te4/(Bi2Te3)n (n = 0, 1, 2, 3 ...) layered structure, remains intriguing but elusive. Here, by using scanning tunneling microscopy (STM) and angle-resolved photoemission spectroscopy (ARPES) techniques, we unambiguously assign the two distinct surface states of MnBi4Te7 (n = 1) to the quintuple-layer (QL) Bi2Te3 termination and the septuple-layer (SL) MnBi2Te4 termination, respectively. A comparison of the experimental observations with theoretical calculations reveals the diverging topological behaviors, especially the hybridization effect between magnetic and nonmagnetic layers, on the two terminations: a gap on the QL termination originating from the topological surface states of the QL hybridizing with the bands of the SL beneath it, and a gapless Dirac-cone band structure on the SL termination with time-reversal symmetry. The quasi-particle interference patterns further confirm the topological nature of the surface states for both terminations, continuing far above the Fermi energy. The QL termination carries a spin-helical Dirac state with hexagonal warping, while at the SL termination, a strongly canted helical state from the surface lies between a pair of Rashba-split states from its neighboring layer. Our work elucidates an unprecedented hybridization effect between the building blocks of the topological surface states, and also reveals the termination-dependent time-reversal symmetry breaking in a magnetic topological insulator, rendering an ideal platform to realize the half-integer quantum Hall effect and relevant quantum phenomena.

preprint, 2020, arXiv

Eliciting Information from Sensitive Survey Questions

This paper considers how to elicit information from sensitive survey questions. First we thoroughly evaluate list experiments (LE), a leading method in the experimental literature on sensitive questions. Our empirical results demonstrate that the assumptions required to identify sensitive information in LE are violated for the majority of surveys. Next we propose a novel survey method, called Multiple Response Technique (MRT), for eliciting information from sensitive questions. We require all of the respondents to answer three questions related to the sensitive information. This technique recovers sensitive information at a disaggregated level while still allowing arbitrary misreporting in survey responses. An application of the MRT provides novel empirical evidence on sexual orientation and Lesbian, Gay, Bisexual, and Transgender (LGBT)-related sentiment.

preprint, 2020, arXiv

Extractive Summarization as Text Matching

This paper creates a paradigm shift with regard to the way we build neural extractive summarization systems. Instead of following the commonly used framework of extracting sentences individually and modeling the relationship between sentences, we formulate the extractive summarization task as a semantic text matching problem, in which a source document and candidate summaries (extracted from the original text) will be matched in a semantic space. Notably, this paradigm shift to a semantic matching framework is well-grounded in our comprehensive analysis of the inherent gap between sentence-level and summary-level extractors based on the property of the dataset. Besides, even instantiating the framework with a simple form of a matching model, we have driven the state-of-the-art extractive result on CNN/DailyMail to a new level (44.41 in ROUGE-1). Experiments on the other five datasets also show the effectiveness of the matching framework. We believe the power of this matching-based summarization framework has not been fully exploited. To encourage more instantiations in the future, we have released our code, processed dataset, as well as generated summaries at https://github.com/maszhongming/MatchSum.

preprint, 2020, arXiv

Heterogeneous Graph Neural Networks for Extractive Document Summarization

As a crucial step in extractive document summarization, learning cross-sentence relations has been explored by a plethora of approaches. An intuitive way is to put them in the graph-based neural network, which has a more complex structure for capturing inter-sentence relationships. In this paper, we present a heterogeneous graph-based neural network for extractive summarization (HeterSumGraph), which contains semantic nodes of different granularity levels apart from sentences. These additional nodes act as the intermediary between sentences and enrich the cross-sentence relations. Besides, our graph structure is flexible in natural extension from a single-document setting to multi-document via introducing document nodes. To our knowledge, we are the first to introduce different types of nodes into graph-based neural networks for extractive document summarization and perform a comprehensive qualitative analysis to investigate their benefits. The code will be released on GitHub.

preprint, 2020, arXiv

Rethinking Generalization of Neural Models: A Named Entity Recognition Case Study

While neural network-based models have achieved impressive performance on a large body of NLP tasks, the generalization behavior of different models remains poorly understood: Does this excellent performance imply a perfect generalization model, or are there still some limitations? In this paper, we take the NER task as a testbed to analyze the generalization behavior of existing models from different perspectives and characterize the differences of their generalization abilities through the lens of our proposed measures, which guides us to better design models and training methods. Experiments with in-depth analyses diagnose the bottleneck of existing neural NER models in terms of breakdown performance analysis, annotation errors, dataset bias, and category relationships, which suggest directions for improvement. We have released the datasets: (ReCoNLL, PLONER) for the future research at our project page: http://pfliu.com/InterpretNER/. As a by-product of this paper, we have open-sourced a project that involves a comprehensive summary of recent NER papers and classifies them into different research topics: https://github.com/pfliu-nlp/Named-Entity-Recognition-NER-Papers.

preprint, 2020, arXiv

Robust Covariance Estimation for High-dimensional Compositional Data with Application to Microbial Communities Analysis

Microbial communities analysis is drawing growing attention due to the rapid development of high-throughput sequencing techniques. The observed data has the following typical characteristics: it is high-dimensional, compositional (lying in a simplex), and may even be leptokurtic and highly skewed due to the existence of overly abundant taxa, which makes conventional correlation analysis infeasible for studying the co-occurrence and co-exclusion relationships between microbial taxa. In this article, we address the challenges of covariance estimation for this kind of data. Assuming the basis covariance matrix lies in a well-recognized class of sparse covariance matrices, we adopt a proxy matrix known in the literature as the centered log-ratio covariance matrix, which is approximately indistinguishable from the real basis covariance matrix as the dimensionality tends to infinity. We construct a Median-of-Means (MOM) estimator for the centered log-ratio covariance matrix and propose a thresholding procedure that is adaptive to the variability of individual entries. By imposing a much weaker finite fourth moment condition compared with the sub-Gaussianity condition in the literature, we derive the optimal rate of convergence under the spectral norm. In addition, we also provide a theoretical guarantee on support recovery. The adaptive thresholding procedure of the MOM estimator is easy to implement and gains robustness when outliers or heavy-tailedness exist. Thorough simulation studies are conducted to show the advantages of the proposed procedure over some state-of-the-art methods. Finally, we apply the proposed method to analyze a human gut microbiome dataset. The R script for implementing the method is available at https://github.com/heyongstat/RCEC.
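Two generic building blocks named in this abstract, the centered log-ratio transform and the median-of-means idea, can be sketched as follows. This is a textbook-style illustration for a scalar mean, not the paper's covariance estimator or its adaptive thresholding step. The function names and sample values are hypothetical, and the authors' own implementation is the R script linked above.

```python
import math
import statistics
from typing import List, Sequence


def clr(composition: Sequence[float]) -> List[float]:
    """Centered log-ratio: log of each part minus the mean of the logs.

    All parts must be strictly positive. The result sums to zero.
    """
    logs = [math.log(part) for part in composition]
    center = sum(logs) / len(logs)
    return [value - center for value in logs]


def median_of_means(values: Sequence[float], n_blocks: int) -> float:
    """Split values into equal consecutive blocks, average each block,
    and return the median of the block averages.

    Values left over after the last full block are ignored in this sketch.
    """
    if not 1 <= n_blocks <= len(values):
        raise ValueError("n_blocks must be between 1 and len(values)")
    size = len(values) // n_blocks
    block_means = [
        sum(values[i * size:(i + 1) * size]) / size for i in range(n_blocks)
    ]
    return statistics.median(block_means)


# Hypothetical sample with one extreme outlier.
sample = [1.0, 1.0, 1.0, 1.0, 1.0, 1000.0]
plain_mean = sum(sample) / len(sample)
robust_mean = median_of_means(sample, n_blocks=3)
transformed = clr([0.2, 0.3, 0.5])
```

The outlier drags the plain mean to 167.5, while the median of the three block means stays at 1.0, which is the robustness property that heavy-tailed abundance data calls for.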

preprint, 2020, arXiv

RTM3D: Real-time Monocular 3D Detection from Object Keypoints for Autonomous Driving

In this work, we propose an efficient and accurate single-shot monocular 3D detection framework. Most successful 3D detectors take the projection constraint from the 3D bounding box to the 2D box as an important component. The four edges of a 2D box provide only four constraints, and performance deteriorates dramatically with small errors in the 2D detector. Different from these approaches, our method predicts the nine perspective keypoints of a 3D bounding box in image space, and then utilizes the geometric relationship of 3D and 2D perspectives to recover the dimension, location, and orientation in 3D space. In this method, the properties of the object can be predicted stably even when the estimation of keypoints is very noisy, which enables us to obtain fast detection speed with a small architecture. Training our method only uses the 3D properties of the object without the need for external networks or supervision data. Our method is the first real-time system for monocular image 3D detection while achieving state-of-the-art performance on the KITTI benchmark. Code will be released at https://github.com/Banconxuan/RTM3D.

preprint2019arXiv

A van der Waals antiferromagnetic topological insulator with weak interlayer magnetic coupling

Magnetic topological insulators (TIs) provide an important material platform to explore quantum phenomena such as the quantized anomalous Hall (QAH) effect and Majorana modes. Their successful material realization is thus essential for our fundamental understanding and for potential technical revolutions. By realizing a bulk van der Waals material MnBi4Te7 with alternating septuple [MnBi2Te4] and quintuple [Bi2Te3] layers, we show that it is ferromagnetic in plane but antiferromagnetic along the c axis, with an out-of-plane saturation field of ~0.22 T at 2 K. Our angle-resolved photoemission spectroscopy measurements and first-principles calculations further demonstrate that MnBi4Te7 is a Z2 antiferromagnetic TI with two types of surface states, associated with the [MnBi2Te4] and [Bi2Te3] terminations, respectively. Additionally, its superlattice nature may make various heterostructures of [MnBi2Te4] and [Bi2Te3] layers possible by exfoliation. Therefore, the low saturation field and the superlattice nature of MnBi4Te7 make it an ideal system to investigate rich emergent phenomena.

preprint2016arXiv

Deep Multi-Task Learning with Shared Memory

Neural network based models have achieved impressive results on various specific tasks. However, in previous works, most models are learned separately based on single-task supervised objectives, which often suffer from insufficient training data. In this paper, we propose two deep architectures which can be trained jointly on multiple related tasks. More specifically, we augment a neural model with an external memory, which is shared by several tasks. Experiments on two groups of text classification tasks show that our proposed architectures can improve the performance of a task with the help of other related tasks.

preprint2016arXiv

Modelling Interaction of Sentence Pair with coupled-LSTMs

Recently, there has been rising interest in modelling the interactions of two sentences with deep neural networks. However, most of the existing methods encode two sequences with separate encoders, in which a sentence is encoded with little or no information from the other sentence. In this paper, we propose a deep architecture to model the strong interaction of a sentence pair with two coupled-LSTMs. Specifically, we introduce two coupled ways to model the interdependences of two LSTMs, coupling the local contextualized interactions of two sentences. We then aggregate these interactions and use dynamic pooling to select the most informative features. Experiments on two very large datasets demonstrate the efficacy of our proposed architecture and its superiority to state-of-the-art methods.

preprint2016arXiv

Recurrent Neural Network for Text Classification with Multi-Task Learning

Neural network based methods have made great progress on a variety of natural language processing tasks. However, in most previous works, the models are learned based on single-task supervised objectives, which often suffer from insufficient training data. In this paper, we use the multi-task learning framework to jointly learn across multiple related tasks. Based on recurrent neural networks, we propose three different mechanisms of sharing information to model text with task-specific and shared layers. The entire network is trained jointly on all these tasks. Experiments on four benchmark text classification tasks show that our proposed models can improve the performance of a task with the help of other related tasks.

preprint2016arXiv

Syntax-based Attention Model for Natural Language Inference

Introducing an attention mechanism into neural networks is a powerful concept, and has achieved impressive results in many natural language processing tasks. However, most of the existing models impose an attentional distribution on a flat topology, namely the entire input representation sequence. Clearly, any well-formed sentence has an accompanying syntactic tree structure, which is a much richer topology. Applying attention to such a topology not only exploits the underlying syntax, but also makes attention more interpretable. In this paper, we explore this direction in the context of natural language inference. The results demonstrate its efficacy. We also perform extensive qualitative analysis, deriving insights and intuitions of why and how our model works.

preprint2015arXiv

Local-set-based Graph Signal Reconstruction

Signal processing on graphs is attracting more and more attention. For a graph signal in the low-frequency subspace, the missing data associated with unsampled vertices can be reconstructed from the sampled data by exploiting the smoothness of the graph signal. In this paper, the concept of a local set is introduced and two local-set-based iterative methods are proposed to reconstruct a bandlimited graph signal from sampled data. In each iteration, one of the proposed methods reweights the sampled residuals for different vertices, while the other propagates the sampled residuals in their respective local sets. These algorithms are built on frame theory and the concept of local sets, based on which several frames and contraction operators are proposed. We then prove that the reconstruction methods converge to the original signal under certain conditions and demonstrate that the new methods lead to significantly faster convergence compared with the baseline method. Furthermore, the correspondence between graph signal sampling and time-domain irregular sampling is analyzed comprehensively, which may be helpful to future works on graph signals. Computer simulations are conducted. The experimental results demonstrate the effectiveness of the reconstruction methods in various sampling geometries, with imprecise prior knowledge of the cutoff frequency, and in noisy scenarios.

preprint2015arXiv

Optimal Local Multi-scale Basis Functions for Linear Elliptic Equations with Rough Coefficient

This paper addresses a multi-scale finite element method for second order linear elliptic equations with arbitrarily rough coefficients. We propose a local oversampling method to construct basis functions that have an optimal local approximation property. Our methodology is based on the compactness of the solution operator restricted to local regions of the spatial domain, and does not depend on any scale-separation or periodicity assumption on the coefficient. We focus on a special type of basis functions that are harmonic on each element and have an optimal approximation property. We first reduce our problem to approximating the trace of the solution space on each edge of the underlying mesh, and then achieve this goal through the singular value decomposition of an oversampling operator. Rigorous error estimates can be obtained through thresholding in constructing the basis functions. Numerical results for several problems with multiple spatial scales and high contrast inclusions are presented to demonstrate the compactness of the local solution space and the capacity of our method in identifying and exploiting this compact structure to achieve computational savings.

preprint2015arXiv

Self-similar Singularity of a 1D Model for the 3D Axisymmetric Euler Equations

We investigate the self-similar singularity of a 1D model for the 3D axisymmetric Euler equations, which is motivated by a particular singularity formation scenario observed in numerical computation. We prove the existence of a discrete family of self-similar profiles for this model and analyze their far-field properties. The self-similar profiles we find agree with direct simulation of the model and seem to have some stability.

preprint2014arXiv

A Heterogeneous Stochastic FEM Framework for Elliptic PDEs

We introduce a new concept of sparsity for the stochastic elliptic operator $-\operatorname{div}\left(a(x,\omega)\nabla(\cdot)\right)$, which reflects the compactness of its inverse operator in the stochastic direction and allows for spatially heterogeneous stochastic structure. This new concept of sparsity motivates a heterogeneous stochastic finite element method (HSFEM) framework for linear elliptic equations, which discretizes the equations using the heterogeneous coupling of spatial basis with local stochastic basis to exploit the local stochastic structure of the solution space. We also provide a sampling method to construct the local stochastic basis for this framework using randomized range-finding techniques. The resulting HSFEM involves two stages and suits the multi-query setting: in the offline stage, the local stochastic structure of the solution space is identified; in the online stage, the equation can be efficiently solved for multiple forcing functions. An online error estimation and correction procedure through Monte Carlo sampling is given. Numerical results for several problems with high dimensional stochastic input are presented to demonstrate the efficiency of the HSFEM in the online stage.

preprint2013arXiv

A superradiant laser based on two-photon Raman transition of caesium atoms

We propose a superradiant laser based on the two-photon Raman transition of caesium-133 atoms, which collectively emit photons on an ultra-narrow transition into the mode of a low-Q resonator, known as the optical bad-cavity regime. The spin-spin correlation which characterizes the collective effect is demonstrated. We theoretically predict that the optical radiation has an extremely narrow linewidth in the $98(1)\times10^{-2}$ mHz range, smaller than that of the transition itself due to collective effects, and that a power level of $7(1)\times10^{-10}$ W is possible, which can provide a possible new way to realize an optical clock with a millihertz linewidth.

preprint2013arXiv

Edge Balance Ratio: Power Law from Vertices to Edges in Directed Complex Network

Power law distributions are common in real-world networks, including online social networks. Many studies on complex networks focus on the characteristics of vertices, which are always shown to follow the power law. However, little research has been done on edges in directed networks. In this paper, the edge balance ratio is first proposed to measure the balance property of edges in directed networks. Based on the edge balance ratio, the balance profile and positivity are put forward to describe the balance level of the whole network. Then the distribution of the edge balance ratio is theoretically analyzed. In a directed network whose vertex in-degree follows the power law with scaling exponent $\gamma$, it is proved that the edge balance ratio follows a piecewise power law, with the scaling exponent of each section depending linearly on $\gamma$. The theoretical analysis is verified by numerical simulations. Moreover, the theoretical analysis is confirmed by statistics of real-world online social networks, including a Twitter network with 35 million users and a Sina Weibo network with 110 million users.

preprint2012arXiv

Follow Whom? Chinese Users Have Different Choice

Sina Weibo, which was launched in 2009, is the most popular Chinese micro-blogging service. It has been reported that Sina Weibo had more than 400 million registered users by the end of the third quarter of 2012. Sina Weibo and Twitter have a lot in common. However, in terms of following preference, Sina Weibo users, most of whom are Chinese, behave differently from those of Twitter. This work is based on a data set of Sina Weibo which contains 80.8 million users' profiles and 7.2 billion relations, and a large data set of Twitter. First, some basic features of Sina Weibo and Twitter are analyzed, such as degree and activeness distribution, the correlation between degree and activeness, and the degree of separation. Then the following preference is investigated by studying the assortative mixing, friend similarities, following distribution, edge balance ratio, and ranking correlation, where the edge balance ratio is newly proposed to measure the balance property of graphs. It is found that Sina Weibo has a lower reciprocity rate and more positive balanced relations, and is more disassortative. Coinciding with Asian traditional culture, the following preference of Sina Weibo users is more concentrated and hierarchical: they are more likely to follow people at higher or the same social levels and less likely to follow people lower than themselves. In contrast, the same kind of following preference is weaker in Twitter. Twitter users are open, as they follow people from all levels, which accords with its global characteristic and the prevalence of western civilization. The message forwarding behavior is studied by displaying the propagation levels, delays, and critical users. The following preference derives not only from usage habits but also from underlying reasons such as personalities and social moralities, which are worthy of future research.

preprint2012arXiv

Thermal modelling for endocardiac radiofrequency ablation: comparison of hyperbolic bioheat equation and Pennes bioheat equation with finite element method

The objectives of this study are to model the endocardiac radiofrequency (RF) ablation procedure and to employ the Hyperbolic Bioheat Equation (HBE), which takes the thermal wave behaviour into account, comparing the results with those obtained using the common Pennes Bioheat Equation (BE) method. A complex model is created to cover the particular endocardiac physical and geometric environment. The Finite Element Method (FEM) is adopted to study the model with both the BE and HBE methods. Different convection coefficients and voltages are applied to simulate different conditions. Lesion size, maximum temperature and temperature at a specified position are selected as criteria to evaluate the simulated results. The study found that during ablation, the lesion size difference ratio can reach 20% in some periods. The difference is obvious and cannot be neglected.

Institution

Affiliation not imported yet

This author record came from a source that does not expose affiliation metadata. Once the author claims the profile or we enrich the record from another provider, this section will link to the concrete institution.


Source provenance

Where this author record came from

arXiv (confidence 95%)

external id: arxiv:2605.02661:author:78:pengfei-liu

Imported May 20, 2026; synced May 20, 2026

arXiv (confidence 95%)

external id: arxiv:2605.07711:author:7:pengfei-liu

Imported May 20, 2026; synced May 20, 2026

arXiv (confidence 95%)

external id: arxiv:2605.12500:author:35:pengfei-liu

Imported May 20, 2026; synced May 20, 2026

arXiv (confidence 95%)

external id: arxiv:2605.15705:author:6:pengfei-liu

Imported May 20, 2026; synced May 20, 2026

7 works

Xipeng Qiu

Researcher


7 works

Xuanjing Huang

Researcher


5 works

Graham Neubig

Researcher


4 works

Jinlan Fu

Researcher


37 published items

AcademiClaw: When Students Set Challenges for AI Agents

Feedback World Model Enables Precise Guidance of Diffusion Policy

SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture

SimCT: Recovering Lost Supervision for Cross-Tokenizer On-Policy Distillation

InFoBench: Evaluating Instruction Following Ability in Large Language Models

Are All the Datasets in Benchmark Necessary? A Pilot Study of Dataset Evaluation for Text Classification

Artificial Neural Networks for Finger Vein Recognition: A Survey

BRIO: Bringing Order to Abstractive Summarization

DataLab: A Platform for Data Analysis and Intervention

Group Gated Fusion on Attention-based Bidirectional Alignment for Multimodal Emotion Recognition

I^2R-Net: Intra- and Inter-Human Relation Network for Multi-Person Pose Estimation

KGxBoard: Explainable and Interactive Leaderboard for Evaluation of Knowledge Graph Completion Models

reStructured Pre-training

Star-Transformer

The MSXF TTS System for ICASSP 2022 ADD Challenge

Can We Automate Scientific Reviewing?

Towards More Fine-grained and Reliable NLP Performance Prediction

Distinct Topological Surface States on the Two Terminations of MnBi$_4$Te$_7$

Eliciting Information from Sensitive Survey Questions

Extractive Summarization as Text Matching

Heterogeneous Graph Neural Networks for Extractive Document Summarization

Rethinking Generalization of Neural Models: A Named Entity Recognition Case Study

Robust Covariance Estimation for High-dimensional Compositional Data with Application to Microbial Communities Analysis

RTM3D: Real-time Monocular 3D Detection from Object Keypoints for Autonomous Driving

A van der Waals antiferromagnetic topological insulator with weak interlayer magnetic coupling

Deep Multi-Task Learning with Shared Memory

Modelling Interaction of Sentence Pair with coupled-LSTMs

Recurrent Neural Network for Text Classification with Multi-Task Learning

Syntax-based Attention Model for Natural Language Inference

Local-set-based Graph Signal Reconstruction

Optimal Local Multi-scale Basis Functions for Linear Elliptic Equations with Rough Coefficient

Self-similar Singularity of a 1D Model for the 3D Axisymmetric Euler Equations

A Heterogeneous Stochastic FEM Framework for Elliptic PDEs

A superradiant laser based on two-photon Raman transition of caesium atoms

Edge Balance Ratio: Power Law from Vertices to Edges in Directed Complex Network

Follow Whom? Chinese Users Have Different Choice

Thermal modelling for endocardiac radiofrequency ablation: comparison of hyperbolic bioheat equation and Pennes bioheat equation with finite element method