Researcher profile

Hongfang Liu

Hongfang Liu contributes to research discovery and scholarly infrastructure.

Researcher, affiliation not imported, open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
12 works
0 followers
12 topics
4 close collaborators

Published work

12 published items

preprint, 2026, arXiv

Three-dimensional quantum anomalous Hall effect in Weyl semimetals

The quantum anomalous Hall effect (QAHE) is a quantum phenomenon in which a two-dimensional system exhibits a quantized Hall resistance $h/e^2$ in the absence of a magnetic field, where $h$ is the Planck constant and $e$ is the electron charge. In this work, we extend this novel phase to three dimensions and thus propose a three-dimensional QAHE exhibiting richer and more versatile transport behaviors. We first confirm this three-dimensional QAHE through the quantized Chern number, then establish its bulk-boundary correspondence, and finally reaffirm it via the distinctive transport properties. Remarkably, we find that the three-dimensional QAHE hosts two chiral surface states along one spatial direction and a pair of chiral hinge states along another direction, and the location of the hinge states depends sensitively on the Fermi energy. These two types of boundary states are further connected through perpendicular chiral surface states, whose chirality is also Fermi-energy dependent. Consequently, depending on the transport direction, its Hall resistance can quantize to $0$, $h/e^2$, or $\pm h/e^2$ when the Fermi energy is tuned across the charge neutral point. This three-dimensional QAHE not only fills the gap in the Hall effect family but also holds significant potential for device applications such as in-memory computing.
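To give a sense of scale for the plateau value $h/e^2$ mentioned in the abstract, the following minimal sketch evaluates it numerically from the exact SI values of $h$ and $e$. The function names and the `sign` parameter are illustrative assumptions, not taken from the paper, and the sketch does not model which transport direction yields which plateau.

```python
# Exact SI defining constants (2019 redefinition of the SI).
PLANCK_H = 6.62607015e-34            # Planck constant, J s
ELEMENTARY_CHARGE = 1.602176634e-19  # elementary charge, C


def hall_resistance_quantum() -> float:
    """Return h / e^2 in ohms."""
    return PLANCK_H / ELEMENTARY_CHARGE ** 2


def plateau_ohms(sign: int) -> float:
    """Return the plateau sign * h / e^2 in ohms.

    `sign` is a hypothetical label for the three plateau families
    named in the abstract: 0, +1, or -1.
    """
    if sign not in (-1, 0, 1):
        raise ValueError("sign must be -1, 0, or 1")
    return sign * hall_resistance_quantum()


if __name__ == "__main__":
    print(f"h/e^2 = {hall_resistance_quantum():.3f} ohm")
```

The value comes out at roughly 25.8 kilo-ohms, which is why quantized Hall plateaus are convenient resistance references.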

preprint, 2026, arXiv

Toward Global Large Language Models in Medicine

Despite continuous advances in medical technology, the global distribution of health care resources remains uneven. The development of large language models (LLMs) has transformed the landscape of medicine and holds promise for improving health care quality and expanding access to medical information globally. However, existing LLMs are primarily trained on high-resource languages, limiting their applicability in global medical scenarios. To address this gap, we constructed GlobMed, a large multilingual medical dataset, containing over 500,000 entries spanning 12 languages, including four low-resource languages. Building on this, we established GlobMed-Bench, which systematically assesses 56 state-of-the-art proprietary and open-weight LLMs across multiple multilingual medical tasks, revealing significant performance disparities across languages, particularly for low-resource languages. Additionally, we introduced GlobMed-LLMs, a suite of multilingual medical LLMs trained on GlobMed, with parameters ranging from 1.7B to 8B. GlobMed-LLMs achieved an average performance improvement of over 40% relative to baseline models, with a more than threefold increase in performance on low-resource languages. Together, these resources provide an important foundation for advancing the equitable development and application of LLMs globally, enabling broader language communities to benefit from technological advances.

preprint, 2025, arXiv

Clinical Document Metadata Extraction: A Scoping Review

Clinical document metadata, such as document type, structure, author role, medical specialty, and encounter setting, is essential for accurate interpretation of information captured in clinical documents. However, vast documentation heterogeneity and drift over time challenge harmonization of document metadata. Automated extraction methods have emerged to coalesce metadata from disparate practices into target schema. This scoping review aims to catalog research on clinical document metadata extraction, identify methodological trends and applications, and highlight gaps. We followed the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews) guidelines to identify articles that perform clinical document metadata extraction. We initially found and screened 266 articles published between January 2011 and August 2025, then comprehensively reviewed 67 we deemed relevant to our study. Among the articles included, 45 were methodological, 17 used document metadata as features in a downstream application, and 5 analyzed document metadata composition. We observe myriad purposes for methodological study and application types. Available labelled public data remains sparse except for structural section datasets. Methods for extracting document metadata have progressed from largely rule-based and traditional machine learning with ample feature engineering to transformer-based architectures with minimal feature engineering. The emergence of large language models has enabled broader exploration of generalizability across tasks and datasets, allowing the possibility of advanced clinical text processing systems. We anticipate that research will continue to expand into richer document metadata representations and integrate further into clinical applications and workflows.

preprint, 2022, arXiv

An Open Natural Language Processing Development Framework for EHR-based Clinical Research: A case demonstration using the National COVID Cohort Collaborative (N3C)

Despite the latest advances in clinical natural language processing (NLP), the clinical and translational research community shows some resistance to adopting NLP models because of limited transparency, interpretability, and usability. In this study, we proposed an open natural language processing development framework. We evaluated it through the implementation of NLP algorithms for the National COVID Cohort Collaborative (N3C). Motivated by interest in information extraction from COVID-19 related clinical notes, our work includes 1) an open data annotation process using COVID-19 signs and symptoms as the use case, 2) a community-driven ruleset composing platform, and 3) a synthetic text data generation workflow to generate texts for information extraction tasks without involving human subjects. The corpora were derived from texts from three different institutions (Mayo Clinic, University of Kentucky, University of Minnesota). The gold standard annotations were tested with a single institution's (Mayo) ruleset. This resulted in F-scores of 0.876, 0.706, and 0.694 for the Mayo, Minnesota, and Kentucky test datasets, respectively. The study, as a consortium effort of the N3C NLP subgroup, demonstrates the feasibility of creating a federated NLP algorithm development and benchmarking platform to enhance multi-institution clinical NLP study and adoption. Although we use COVID-19 as a use case in this effort, our framework is general enough to be applied to other domains of interest in clinical NLP.

preprint, 2022, arXiv

Bulk-Bulk Correspondence in Disordered Non-Hermitian Systems

The consistency between eigenvalues calculated under open and periodic boundary conditions, termed the bulk-bulk correspondence ($\mathcal{BBC}$), can be destroyed in systems with the non-Hermitian skin effect (NHSE). In spite of the great success of the generalized Brillouin zone (GBZ) theory in clean non-Hermitian systems, the applicability of GBZ theory is questionable when the translational symmetry is broken. Thus, it is of great value to rebuild the $\mathcal{BBC}$ for disordered samples, which extends the application of GBZ theory in non-Hermitian systems. Here, we propose a scheme for reconstructing the $\mathcal{BBC}$, which can be regarded as the solution of an optimization problem. By solving this optimization problem analytically, we reconstruct the $\mathcal{BBC}$ and obtain the modified GBZ theory in several prototypical disordered non-Hermitian models. The modified GBZ theory gives a precise description of the NHSE and predicts the intriguing disorder-enhanced and disorder-irrelevant NHSEs.

preprint, 2022, arXiv

CancerBERT: a BERT model for Extracting Breast Cancer Phenotypes from Electronic Health Records

Accurate extraction of breast cancer patients' phenotypes is important for clinical decision support and clinical research. Current models do not take full advantage of cancer domain-specific corpora, and whether pre-training a Bidirectional Encoder Representations from Transformers (BERT) model on a cancer-specific corpus could improve the performance of extracting breast cancer phenotypes from text data remains to be explored. The objective of this study is to develop and evaluate the CancerBERT model for extracting breast cancer phenotypes from clinical texts in electronic health records. The data used in the study included 21,291 breast cancer patients diagnosed from 2010 to 2020; patients' clinical notes and pathology reports were collected from the University of Minnesota Clinical Data Repository (UMN). Results: About 3 million clinical notes and pathology reports in electronic health records for 21,291 breast cancer patients were collected to train the CancerBERT model. 200 pathology reports and 50 clinical notes of breast cancer patients, containing 9,685 sentences and 221,356 tokens, were manually annotated by two annotators. 20% of the annotated data was used as a test set. Our CancerBERT model achieved the best performance with macro F1 scores equal to 0.876 (95% CI, 0.896-0.902) for exact match and 0.904 (95% CI, 0.896-0.902) for the lenient match. The NER models we developed would facilitate automated information extraction from clinical texts to further help clinical decision support. Conclusions and Relevance: In this study, we focused on breast cancer-related concept extraction from EHR data and obtained a comprehensive annotated dataset that contains 7 types of breast cancer-related concepts. The CancerBERT model with customized vocabulary could significantly improve the performance for extracting breast cancer phenotypes from clinical texts.
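The abstract reports F1 under both exact and lenient matching without defining them here. The sketch below illustrates one common convention for entity spans, assumed for illustration only: an exact match requires identical boundaries and type, while a lenient match requires the same type and any character overlap. The span format `(start, end, label)` and the labels `TYPE_A` and `TYPE_B` are hypothetical and are not the paper's concept types or its evaluation code.

```python
from typing import List, Tuple

Span = Tuple[int, int, str]  # (start offset, end offset exclusive, label)


def spans_match(gold: Span, pred: Span, lenient: bool) -> bool:
    """Return True if the predicted span counts as a hit for the gold span."""
    if gold[2] != pred[2]:
        return False  # entity types must agree in both modes
    if lenient:
        # Assumed lenient rule: any character overlap is a hit.
        return gold[0] < pred[1] and pred[0] < gold[1]
    # Exact rule: boundaries must be identical.
    return gold[0] == pred[0] and gold[1] == pred[1]


def f1_score(gold: List[Span], pred: List[Span], lenient: bool = False) -> float:
    """Compute span-level F1 under the chosen matching rule."""
    hit_pred = sum(any(spans_match(g, p, lenient) for g in gold) for p in pred)
    hit_gold = sum(any(spans_match(g, p, lenient) for p in pred) for g in gold)
    precision = hit_pred / len(pred) if pred else 0.0
    recall = hit_gold / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    gold = [(0, 5, "TYPE_A"), (10, 20, "TYPE_B")]
    pred = [(0, 5, "TYPE_A"), (12, 20, "TYPE_B")]  # second span is off by two
    print(f1_score(gold, pred), f1_score(gold, pred, lenient=True))
```

A macro F1, as reported in the abstract, would average such per-type scores over the concept types. That step is omitted here.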

preprint, 2022, arXiv

The NLP Sandbox: an efficient model-to-data system to enable federated and unbiased evaluation of clinical NLP models

Objective: The evaluation of natural language processing (NLP) models for clinical text de-identification relies on the availability of clinical notes, which is often restricted due to privacy concerns. The NLP Sandbox is an approach for alleviating the lack of data and evaluation frameworks for NLP models by adopting a federated, model-to-data approach. This enables unbiased federated model evaluation without the need for sharing sensitive data from multiple institutions. Materials and Methods: We leveraged the Synapse collaborative framework, containerization software, and OpenAPI generator to build the NLP Sandbox (nlpsandbox.io). We evaluated two state-of-the-art NLP de-identification focused annotation models, Philter and NeuroNER, using data from three institutions. We further validated model performance using data from an external validation site. Results: We demonstrated the usefulness of the NLP Sandbox through de-identification clinical model evaluation. The external developer was able to incorporate their model into the NLP Sandbox template and provide user experience feedback. Discussion: We demonstrated the feasibility of using the NLP Sandbox to conduct a multi-site evaluation of clinical text de-identification models without the sharing of data. Standardized model and data schemas enable smooth model transfer and implementation. To generalize the NLP Sandbox, work is required on the part of data owners and model developers to develop suitable and standardized schemas and to adapt their data or model to fit the schemas. Conclusions: The NLP Sandbox lowers the barrier to utilizing clinical data for NLP model evaluation and facilitates federated, multi-site, unbiased evaluation of NLP models.

preprint, 2021, arXiv

Comparisons of Graph Neural Networks on Cancer Classification Leveraging a Joint of Phenotypic and Genetic Features

Cancer is responsible for millions of deaths worldwide every year. Although significant progress has been achieved in cancer medicine, many issues remain to be addressed for improving cancer therapy. Appropriate cancer patient stratification is the prerequisite for selecting an appropriate treatment plan, as cancer patients are of known heterogeneous genetic make-ups and phenotypic differences. In this study, built upon deep phenotypic characterizations extractable from Mayo Clinic electronic health records (EHRs) and genetic test reports for a collection of cancer patients, we evaluated various graph neural networks (GNNs) leveraging a joint of phenotypic and genetic features for cancer type classification. Models were applied and fine-tuned on the Mayo Clinic cancer disease dataset. The assessment was done through the reported accuracy, precision, recall, and F1 values as well as through F1 scores based on the disease class. Per our evaluation results, GNNs on average outperformed the baseline models with mean statistics always being higher than those of the baseline models (0.849 vs 0.772 for accuracy, 0.858 vs 0.794 for precision, 0.843 vs 0.759 for recall, and 0.843 vs 0.855 for F1 score). Among GNNs, ChebNet, GraphSAGE, and TAGCN showed the best performance, while GAT showed the worst. We applied and compared eight GNN models including AGNN, ChebNet, GAT, GCN, GIN, GraphSAGE, SGC, and TAGCN on the Mayo Clinic cancer disease dataset and assessed their performance as well as compared them with each other and with more conventional machine learning models such as decision tree, gradient boosting, multi-layer perceptron, naive Bayes, and random forest, which we used as the baselines.

preprint, 2020, arXiv

Clinical Concept Extraction: a Methodology Review

Background: Concept extraction, a subdomain of natural language processing (NLP) with a focus on extracting concepts of interest, has been adopted to computationally extract clinical information from text for a wide range of applications ranging from clinical decision support to care quality improvement. Objectives: In this literature review, we provide a methodology review of clinical concept extraction, aiming to catalog development processes, available methods and tools, and specific considerations when developing clinical concept extraction applications. Methods: Based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines, a literature search was conducted for retrieving EHR-based information extraction articles written in English and published from January 2009 through June 2019 from Ovid MEDLINE In-Process & Other Non-Indexed Citations, Ovid MEDLINE, Ovid EMBASE, Scopus, Web of Science, and the ACM Digital Library. Results: A total of 6,686 publications were retrieved. After title and abstract screening, 228 publications were selected. The methods used for developing clinical concept extraction applications were discussed in this review.

preprint, 2020, arXiv

Digital pathology-based study of cell- and tissue-level morphologic features in serous borderline ovarian tumor and high-grade serous ovarian cancer

Serous borderline ovarian tumor (SBOT) and high-grade serous ovarian cancer (HGSOC) are two distinct subtypes of epithelial ovarian tumors, with markedly different biologic background, behavior, prognosis, and treatment. However, the histologic diagnosis of serous ovarian tumors can be subjectively variable and labor-intensive, as multiple tumor slides/blocks need to be thoroughly examined to search for these features. In this study, we aimed to evaluate the technical feasibility of using digital pathology approaches to facilitate objective and scalable diagnostic screening for SBOT and HGSOC. Based on Groovy scripts and QuPath, a novel informatics system was developed to facilitate interactive annotation and imaging data exchange for machine learning purposes. Through this system, cellular boundaries were detected and an expanded set of cellular features was extracted to represent cell- and tissue-level characteristics. According to our evaluation, cell-level classification was achieved for both tumor and stroma cells with greater than 90% accuracy. Upon further re-examination, 44.2% of the misclassified cells were due to over-/under-segmentation or low quality of imaging areas. For a total of 6,485 imaging patches with sufficient tumor and stroma cells (at least ten of each), we achieved 91-95% accuracy in differentiating HGSOC v. SBOT. When all the patches were considered for a WSI to make a consensus prediction, 97% accuracy was achieved for classifying all patients, indicating that cellular features digitally extracted from pathological images can be used for cell classification and SBOT v. HGSOC differentiation. Introducing digital pathology into ovarian cancer research could be beneficial for discovering potential clinical implications.
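The abstract describes combining patch-level predictions into one consensus prediction per whole-slide image (WSI) but does not state the aggregation rule. The sketch below assumes a simple majority vote, which is one plausible choice and not necessarily the authors' method. The function name and input format are hypothetical.

```python
from collections import Counter
from typing import Sequence


def consensus_label(patch_predictions: Sequence[str]) -> str:
    """Return the most frequent patch-level label for one slide.

    Assumption: plain majority vote. On a tie, the label that
    appeared first in the input wins (Counter preserves first-seen
    order in Python 3.7+).
    """
    if not patch_predictions:
        raise ValueError("at least one patch prediction is required")
    counts = Counter(patch_predictions)
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical patch-level outputs for a single slide.
    patches = ["HGSOC", "SBOT", "HGSOC", "HGSOC", "SBOT"]
    print(consensus_label(patches))
```

Voting over many patches can absorb individual patch errors, which is consistent with the slide-level accuracy in the abstract exceeding the patch-level range.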

preprint, 2020, arXiv

How Good is Artificial Intelligence at Automatically Answering Consumer Questions Related to Alzheimer's Disease?

Alzheimer's Disease (AD) is the most common type of dementia, comprising 60-80% of cases. There were an estimated 5.8 million Americans living with Alzheimer's dementia in 2019, and this number will almost double every 20 years. The total lifetime cost of care for someone with dementia is estimated to be $350,174 in 2018, 70% of which is associated with family-provided care. Most family caregivers face emotional, financial and physical difficulties. As a medium to relieve this burden, online communities in social media websites such as Twitter, Reddit, and Yahoo! Answers provide potential venues for caregivers to search relevant questions and answers, or post questions and seek answers from other members. However, there are often a limited number of relevant questions and responses to search from, and posted questions are rarely answered immediately. Due to recent advancement in Artificial Intelligence (AI), particularly Natural Language Processing (NLP), we propose to utilize AI to automatically generate answers to AD-related consumer questions posted by caregivers and evaluate how good AI is at answering those questions. To the best of our knowledge, this is the first study in the literature applying and evaluating AI models designed to automatically answer consumer questions related to AD.

preprint, 2020, arXiv

Real-World Data Analysis of Implantable Cardioverter Defibrillator (ICD) in Patients with Hypertrophic Cardiomyopathy (HCM)

Background: One of the common causes of sudden cardiac death (SCD) in young people is hypertrophic cardiomyopathy (HCM), and the primary prevention of SCD is with an implantable cardioverter defibrillator (ICD). Because of concerns about the incidence of appropriate ICD therapy and the complications associated with ICD implantation and discharge, patients with implanted ICDs are closely monitored and interrogation reports are generated from clinical consultations. Methods: In this study, we compared the performance of structured device data and unstructured interrogation reports for extracting information on ICD therapy and heart rhythm. We sampled 687 reports with a gold standard generated through manual chart review. A rule-based natural language processing (NLP) system was developed using 480 reports, and the information in the corresponding device data was aggregated for the task. We compared the performance of the NLP system with information aggregated from structured device data using the remaining 207 reports. Results: The rule-based NLP system achieved F-measures of 0.92 and 0.98 for ICD therapy and heart rhythm, respectively, while the performance of aggregating device data was significantly lower, with F-measures of 0.78 and 0.45, respectively. Limitations of using only structured device data include no differentiation of real events from management events, data availability, and disparate perspectives of vendor and data granularity, while using interrogation reports requires overcoming non-representative keywords/patterns and contextual errors. Conclusions: Extracting phenotyping information from data generated in the real world requires the incorporation of medical knowledge. It is essential to analyze, compare, and harmonize multiple data sources for real-world evidence generation.