Researcher profile

Cheng-Lin Liu

Cheng-Lin Liu contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
7works
0followers
4topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

7 published item(s)

preprint2026arXiv

MR-Align: Meta-Reasoning Informed Factuality Alignment for Large Reasoning Models

Large reasoning models (LRMs) show strong capabilities in complex reasoning, yet their marginal gains on evidence-dependent factual questions are limited. We find this limitation is partially attributable to a reasoning-answer hit gap, where the model identifies the correct facts during reasoning but fails to incorporate them into the final response, thereby reducing factual fidelity. To address this issue, we propose MR-ALIGN, a Meta-Reasoning informed alignment framework that enhances factuality without relying on external verifiers. MR-ALIGN quantifies state transition probabilities along the model's thinking process and constructs a transition-aware implicit reward that reinforces beneficial reasoning patterns while suppressing defective ones at the atomic thinking segments. This re-weighting reshapes token-level signals into probability-aware segment scores, encouraging coherent reasoning trajectories that are more conducive to factual correctness. Empirical evaluations across four factual QA datasets and one long-form factuality benchmark show that MR-ALIGN consistently improves accuracy and truthfulness while reducing misleading reasoning. These results highlight that aligning the reasoning process itself, rather than merely the outputs, is pivotal for advancing factuality in LRMs.

preprint2022arXiv

Document Dewarping with Control Points

Document images are now widely captured by handheld devices such as mobile phones. The OCR performance on these images are largely affected due to geometric distortion of the document paper, diverse camera positions and complex backgrounds. In this paper, we propose a simple yet effective approach to rectify distorted document image by estimating control points and reference points. After that, we use interpolation method between control points and reference points to convert sparse mappings to backward mapping, and remap the original distorted document image to the rectified image. Furthermore, control points are controllable to facilitate interaction or subsequent adjustment. We can flexibly select post-processing methods and the number of vertices according to different application scenarios. Experiments show that our approach can rectify document images with various distortion types, and yield state-of-the-art performance on real-world dataset. This paper also provides a training dataset based on control points for document dewarping. Both the code and the dataset are released at https://github.com/gwxie/Document-Dewarping-with-Control-Points.

preprint2022arXiv

Emergence of Machine Language: Towards Symbolic Intelligence with Neural Networks

Representation is a core issue in artificial intelligence. Humans use discrete language to communicate and learn from each other, while machines use continuous features (like vector, matrix, or tensor in deep neural networks) to represent cognitive patterns. Discrete symbols are low-dimensional, decoupled, and have strong reasoning ability, while continuous features are high-dimensional, coupled, and have incredible abstracting capabilities. In recent years, deep learning has developed the idea of continuous representation to the extreme, using millions of parameters to achieve high accuracies. Although this is reasonable from the statistical perspective, it has other major problems like lacking interpretability, poor generalization, and is easy to be attacked. Since both paradigms have strengths and weaknesses, a better choice is to seek reconciliation. In this paper, we make an initial attempt towards this direction. Specifically, we propose to combine symbolism and connectionism principles by using neural networks to derive a discrete representation. This process is highly similar to human language, which is a natural combination of discrete symbols and neural systems, where the brain processes continuous signals and represents intelligence via discrete language. To mimic this functionality, we denote our approach as machine language. By designing an interactive environment and task, we demonstrated that machines could generate a spontaneous, flexible, and semantic language through cooperation. Moreover, through experiments we show that discrete language representation has several advantages compared with continuous feature representation, from the aspects of interpretability, generalization, and robustness.

preprint2022arXiv

Plane Geometry Diagram Parsing

Geometry diagram parsing plays a key role in geometry problem solving, wherein the primitive extraction and relation parsing remain challenging due to the complex layout and between-primitive relationship. In this paper, we propose a powerful diagram parser based on deep learning and graph reasoning. Specifically, a modified instance segmentation method is proposed to extract geometric primitives, and the graph neural network (GNN) is leveraged to realize relation parsing and primitive classification incorporating geometric features and prior knowledge. All the modules are integrated into an end-to-end model called PGDPNet to perform all the sub-tasks simultaneously. In addition, we build a new large-scale geometry diagram dataset named PGDP5K with primitive level annotations. Experiments on PGDP5K and an existing dataset IMP-Geometry3K show that our model outperforms state-of-the-art methods in four sub-tasks remarkably. Our code, dataset and appendix material are available at https://github.com/mingliangzhang2018/PGDP.

preprint2022arXiv

Towards Open-Set Text Recognition via Label-to-Prototype Learning

Scene text recognition is a popular topic and extensively used in the industry. Although many methods have achieved satisfactory performance for the close-set text recognition challenges, these methods lose feasibility in open-set scenarios, where collecting data or retraining models for novel characters could yield a high cost. For example, annotating samples for foreign languages can be expensive, whereas retraining the model each time when a novel character is discovered from historical documents costs both time and resources. In this paper, we introduce and formulate a new open-set text recognition task which demands the capability to spot and recognize novel characters without retraining. A label-to-prototype learning framework is also proposed as a baseline for the proposed task. Specifically, the framework introduces a generalizable label-to-prototype mapping function to build prototypes (class centers) for both seen and unseen classes. An open-set predictor is then utilized to recognize or reject samples according to the prototypes. The implementation of rejection capability over out-of-set characters allows automatic spotting of unknown characters in the incoming data stream. Extensive experiments show that our method achieves promising performance on a variety of zero-shot, close-set, and open-set text recognition datasets

preprint2022arXiv

Unsupervised Structure-Texture Separation Network for Oracle Character Recognition

Oracle bone script is the earliest-known Chinese writing system of the Shang dynasty and is precious to archeology and philology. However, real-world scanned oracle data are rare and few experts are available for annotation which make the automatic recognition of scanned oracle characters become a challenging task. Therefore, we aim to explore unsupervised domain adaptation to transfer knowledge from handprinted oracle data, which are easy to acquire, to scanned domain. We propose a structure-texture separation network (STSN), which is an end-to-end learning framework for joint disentanglement, transformation, adaptation and recognition. First, STSN disentangles features into structure (glyph) and texture (noise) components by generative models, and then aligns handprinted and scanned data in structure feature space such that the negative influence caused by serious noises can be avoided when adapting. Second, transformation is achieved via swapping the learned textures across domains and a classifier for final classification is trained to predict the labels of the transformed scanned characters. This not only guarantees the absolute separation, but also enhances the discriminative ability of the learned features. Extensive experiments on Oracle-241 dataset show that STSN outperforms other adaptation methods and successfully improves recognition performance on scanned data even when they are contaminated by long burial and careless excavation.

preprint2020arXiv

Towards Robust Pattern Recognition: A Review

The accuracies for many pattern recognition tasks have increased rapidly year by year, achieving or even outperforming human performance. From the perspective of accuracy, pattern recognition seems to be a nearly-solved problem. However, once launched in real applications, the high-accuracy pattern recognition systems may become unstable and unreliable, due to the lack of robustness in open and changing environments. In this paper, we present a comprehensive review of research towards robust pattern recognition from the perspective of breaking three basic and implicit assumptions: closed-world assumption, independent and identically distributed assumption, and clean and big data assumption, which form the foundation of most pattern recognition models. Actually, our brain is robust at learning concepts continually and incrementally, in complex, open and changing environments, with different contexts, modalities and tasks, by showing only a few examples, under weak or noisy supervision. These are the major differences between human intelligence and machine intelligence, which are closely related to the above three assumptions. After witnessing the significant progress in accuracy improvement nowadays, this review paper will enable us to analyze the shortcomings and limitations of current methods and identify future research directions for robust pattern recognition.