Researcher profile

Yingmin Liu

Yingmin Liu contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
6 works
0 followers
4 topics
4 close collaborators

Published work

6 published items

Preprint, 2026, arXiv

Agentic Retrieval-Augmented Generation for Financial Document Question Answering

Financial document question answering (QA) demands complex multi-step numerical reasoning over heterogeneous evidence--structured tables, textual narratives, and footnotes--scattered across corporate filings. Existing retrieval-augmented generation (RAG) approaches adopt a single-pass retrieve-then-generate paradigm that struggles with the compositional reasoning chains prevalent in financial analysis. We propose FinAgent-RAG, an agentic RAG framework that orchestrates iterative retrieval-reasoning loops with self-verification, specifically engineered for the precision requirements of financial numerical reasoning. The framework integrates three domain-specific innovations: (1) a Contrastive Financial Retriever trained with hard negative mining to distinguish semantically similar but numerically distinct financial passages, (2) a Program-of-Thought reasoning module that generates executable Python code for precise arithmetic rather than relying on error-prone LLM-based mental computation, and (3) an Adaptive Strategy Router that dynamically allocates computational resources based on question complexity, reducing API costs by 41.3% on FinQA while preserving accuracy. Extensive experiments on three benchmark datasets--FinQA, ConvFinQA, and TAT-QA--demonstrate that FinAgent-RAG achieves 76.81%, 78.46%, and 74.96% execution accuracy respectively, outperforming the strongest baseline by 5.62--9.32 percentage points. Ablation studies, cross-backbone evaluation with four LLMs, and deployment cost analysis confirm the framework's robustness and practical viability for financial institutions.
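The Program-of-Thought idea in this abstract, where the model writes a short program and the arithmetic is done by an interpreter rather than by the model, can be illustrated with a minimal sketch. This is not the FinAgent-RAG implementation: the function name `run_generated_program`, the convention that the program binds its result to `answer`, and the revenue figures are all assumptions made for illustration.

```python
def run_generated_program(program: str) -> float:
    """Execute a model-written program and return the value it binds to `answer`."""
    # Emptying __builtins__ only limits accidental misuse; it is NOT a security sandbox.
    namespace = {"__builtins__": {}}
    exec(program, namespace)
    return namespace["answer"]


# Stand-in for text a language model might emit; the figures are invented.
GENERATED_PROGRAM = """
revenue_prior = 100.0
revenue_current = 120.0
answer = (revenue_current - revenue_prior) / revenue_prior * 100
"""

percent_change = run_generated_program(GENERATED_PROGRAM)
```

Here `percent_change` is a 20 percent increase, computed by the interpreter. A real deployment would presumably run generated code in an isolated process with time and memory limits, since `exec` on untrusted text is unsafe.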

Preprint, 2026, arXiv

ConflictRAG: Detecting and Resolving Knowledge Conflicts in Retrieval Augmented Generation

Retrieval-Augmented Generation (RAG) systems implicitly assume mutual consistency among retrieved documents -- an assumption that frequently fails in practice. We present ConflictRAG, a conflict-aware RAG framework that detects, classifies, and resolves knowledge conflicts prior to answer generation. The framework introduces three contributions: (1) a two-stage conflict detection module combining a lightweight embedding-based MLP classifier with selective LLM refinement, reducing API costs by 62% while maintaining 90.8% detection accuracy; (2) an Entropy-TOPSIS framework for data-driven source credibility assessment, improving selection accuracy by 7.1% over manual heuristics; and (3) a Conflict-Aware RAG Score (CARS) for diagnostic evaluation of conflict-handling capabilities. Experiments on three benchmarks against six baselines demonstrate 88.7% conflict-detection F1 and consistent 5.3--6.1% correctness gains over the strongest conflict-aware baseline, with the pipeline transferring effectively across backbone LLMs.
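The abstract does not spell out how its Entropy-TOPSIS step is configured, so the following is only a sketch of the textbook combination: entropy weighting gives more weight to criteria that vary more across sources, and TOPSIS then ranks each source by its closeness to an ideal source. The three criteria columns and their values are hypothetical, all criteria are assumed to be positive "higher is better" scores, and each column is assumed to contain at least two distinct values.

```python
import math


def entropy_weights(matrix):
    """Entropy weight method: columns with more spread across rows get larger weights."""
    n, m = len(matrix), len(matrix[0])
    divergences = []
    for j in range(m):
        column = [row[j] for row in matrix]
        total = sum(column)
        proportions = [v / total for v in column]
        # Normalised Shannon entropy in [0, 1]; 0 * log(0) is treated as 0.
        entropy = -sum(p * math.log(p) for p in proportions if p > 0) / math.log(n)
        divergences.append(1.0 - entropy)
    total_divergence = sum(divergences)
    return [d / total_divergence for d in divergences]


def topsis_scores(matrix, weights):
    """TOPSIS closeness: 1.0 means the row is the ideal point, 0.0 the anti-ideal."""
    m = len(weights)
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(m)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(m)] for row in matrix]
    best = [max(r[j] for r in weighted) for j in range(m)]
    worst = [min(r[j] for r in weighted) for j in range(m)]
    scores = []
    for r in weighted:
        d_best, d_worst = math.dist(r, best), math.dist(r, worst)
        scores.append(d_worst / (d_best + d_worst))
    return scores


# Hypothetical sources scored on three made-up criteria (one row per source).
SOURCES = [[0.9, 0.8, 0.9], [0.5, 0.4, 0.7], [0.2, 0.3, 0.1]]
weights = entropy_weights(SOURCES)
scores = topsis_scores(SOURCES, weights)
```

In this toy input the first source is best on every criterion and the third is worst on every criterion, so they receive the highest and lowest closeness scores respectively.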

Preprint, 2026, arXiv

CyberCorrect: A Cybernetic Framework for Closed-Loop Self-Correction in Large Language Models

Large language model (LLM) self-correction -- the ability to detect and fix errors in generated outputs -- remains largely ad hoc, relying on generic prompts such as "please reconsider your answer" without systematic error analysis or convergence guarantees. We propose CyberCorrect, a framework that formalizes LLM self-correction as a closed-loop control system grounded in cybernetic theory. The framework models the LLM generator as the plant and introduces a tri-modal Error Detector (combining self-consistency, verbalized confidence, and logic-chain verification) as the sensor. A type-directed Correction Controller generates targeted repair instructions based on diagnosed error categories, while a Convergence Judge determines iteration termination using stability criteria adapted from control theory. We further introduce three control-theoretic evaluation metrics -- convergence rate, overshoot rate, and oscillation rate -- that capture correction dynamics beyond final accuracy. Experiments on our constructed CyberCorrect-Bench (440 reasoning tasks with annotated error types and correction paths) show that CyberCorrect achieves 79.8% final accuracy, improving upon the best existing self-correction method by 6.2 percentage points, while reducing overshoot (erroneous over-correction) by 41% through its convergence control mechanism.

Preprint, 2022, arXiv

Technical Report (v1.0)--Pseudo-random Cartesian Sampling for Dynamic MRI

For an effective application of compressed sensing (CS), which exploits the underlying compressibility of an image, one of the requirements is that the undersampling artifact be incoherent (noise-like) in the sparsifying transform domain. For cardiovascular MRI (CMR), several pseudo-random sampling methods have been proposed that yield a high level of incoherence. In this technical report, we present a collection of five pseudo-random Cartesian sampling methods that can be applied to 2D cine and flow, 3D volumetric cine, and 4D flow imaging. Four out of the five presented methods yield fast computation for on-the-fly generation of the sampling mask, without the need to create and store pre-computed look-up tables. In addition, the sampling distribution is parameterized, providing control over the sampling density. For each sampling method in the report, (i) we briefly describe the methodology, (ii) list default values of the pertinent parameters, and (iii) provide a publicly available MATLAB implementation.
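To make the notion of a parameterized, on-the-fly sampling mask concrete, here is a generic variable-density sketch in Python. It is not one of the five methods in the report (those are distributed as MATLAB code); the function name, the `decay` density parameter, and the weighting formula are assumptions chosen only to show how a seeded generator can produce a reproducible pseudo-random mask without a look-up table.

```python
import random


def variable_density_mask(n_lines, n_samples, decay=2.0, seed=0):
    """Pick n_samples phase-encode lines, favouring the centre of k-space."""
    if not 0 <= n_samples <= n_lines:
        raise ValueError("n_samples must be between 0 and n_lines")
    rng = random.Random(seed)  # seeded, so the mask is reproducible
    center = (n_lines - 1) / 2
    # Hypothetical density: weight falls off with distance from the centre line.
    weights = [1.0 / (1.0 + abs(k - center)) ** decay for k in range(n_lines)]
    remaining = list(range(n_lines))
    mask = [0] * n_lines
    for _ in range(n_samples):
        # Weighted draw without replacement: remove each line once it is chosen.
        current = [weights[k] for k in remaining]
        pick = rng.choices(remaining, weights=current, k=1)[0]
        remaining.remove(pick)
        mask[pick] = 1
    return mask


# One mask per frame of a dynamic series, each from a different seed.
frame_masks = [variable_density_mask(64, 16, decay=2.0, seed=frame) for frame in range(4)]
```

Raising `decay` concentrates samples near the centre line, while `decay=0` gives uniform random sampling; varying the seed per frame changes the pattern over time.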

Preprint, 2020, arXiv

Automatic Extraction and Sign Determination of Respiratory Signal in Real-time Cardiac Magnetic Resonance imaging

In real-time (RT) cardiac cine imaging, a stack of 2D slices is collected sequentially under free-breathing conditions. A complete heartbeat from each slice is then used for cardiac function quantification. The inter-slice respiratory mismatch can compromise accurate quantification of cardiac function. Methods based on principal components analysis (PCA) have been proposed to extract the respiratory signal from RT cardiac cine, but these methods cannot resolve the inter-slice sign ambiguity of the respiratory signal. In this work, we propose a fully automatic sign correction procedure based on the similarity of neighboring slices and correlation to the center-of-mass curve. The proposed method is evaluated in eleven volunteers, with ten slices per volunteer. The motion in a manually selected region-of-interest (ROI) is used as a reference. The results show that the extracted respiratory signal has a high, positive correlation with the reference in all cases. The qualitative assessment of images also shows that the proposed approach can accurately identify heartbeats, one from each slice, belonging to the same respiratory phase. This approach can improve cardiac function quantification for RT cine without manual intervention.
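The sign ambiguity arises because a principal component is defined only up to a factor of -1, so a PCA-derived respiratory trace may come out inverted. The sketch below covers only one ingredient mentioned in the abstract, correlation with a reference curve such as the center-of-mass curve, and omits the neighboring-slice similarity step. The function names and the synthetic signals are assumptions, not the authors' code.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def fix_sign(signal, reference):
    """Flip the signal if it is negatively correlated with the reference curve."""
    if pearson(signal, reference) < 0:
        return [-v for v in signal]
    return list(signal)


# Synthetic example: the extracted trace is an inverted, rescaled copy of the reference.
reference = [math.sin(0.3 * t) for t in range(50)]
extracted = [-2.0 * r for r in reference]
corrected = fix_sign(extracted, reference)
```

After correction the trace moves in the same direction as the reference, which is what allows heartbeats from different slices to be matched to the same respiratory phase.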

Preprint, 2020, arXiv

OCMR (v1.0)--Open-Access Multi-Coil k-Space Dataset for Cardiovascular Magnetic Resonance Imaging

Cardiovascular MRI (CMR) is a non-invasive imaging modality that provides excellent soft-tissue contrast without the use of ionizing radiation. Physiological motions and limited speed of MRI data acquisition necessitate development of accelerated methods, which typically rely on undersampling. Recovering diagnostic quality CMR images from highly undersampled data has been an active area of research. Recently, several data acquisition and processing methods have been proposed to accelerate CMR. The availability of data to objectively evaluate and compare different reconstruction methods could expedite innovation and promote clinical translation of these methods. In this work, we introduce an open-access dataset, called OCMR, that provides multi-coil k-space data from 53 fully sampled and 212 prospectively undersampled cardiac cine series.