Researcher profile

Chunhua Weng

Chunhua Weng contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 15 - UnverifiedVerification L1Unclaimed author
3works
0followers
5topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

3 published item(s)

preprint2026arXiv

CPGPrompt: Translating Clinical Guidelines into LLM-Executable Decision Support

Clinical practice guidelines (CPGs) provide evidence-based recommendations for patient care; however, integrating them into Artificial Intelligence (AI) remains challenging. Previous approaches, such as rule-based systems, face significant limitations, including poor interpretability, inconsistent adherence to guidelines, and narrow domain applicability. To address this, we develop and validate CPGPrompt, an auto-prompting system that converts narrative clinical guidelines into large language models (LLMs). Our framework translates CPGs into structured decision trees and utilizes an LLM to dynamically navigate them for patient case evaluation. Synthetic vignettes were generated across three domains (headache, lower back pain, and prostate cancer) and distributed into four categories to test different decision scenarios. System performance was assessed on both binary specialty-referral decisions and fine-grained pathway-classification tasks. The binary specialty referral classification achieved consistently strong performance across all domains (F1: 0.85-1.00), with high recall (1.00 $\pm$ 0.00). In contrast, multi-class pathway assignment showed reduced performance, with domain-specific variations: headache (F1: 0.47), lower back pain (F1: 0.72), and prostate cancer (F1: 0.77). Domain-specific performance differences reflected the structure of each guideline. The headache guideline highlighted challenges with negation handling. The lower back pain guideline required temporal reasoning. In contrast, prostate cancer pathways benefited from quantifiable laboratory tests, resulting in more reliable decision-making.

preprint2026arXiv

Scalable Scientific Interest Profiling Using Large Language Models

Research profiles highlight scientists' research focus, enabling talent discovery and collaborations, but are often outdated. Automated, scalable methods are urgently needed to keep profiles current. We design and evaluate two Large Language Models (LLMs)-based methods to generate scientific interest profiles--one summarizing PubMed abstracts and the other using Medical Subject Headings (MeSH) terms--comparing them with researchers' self-summarized interests. We collected titles, MeSH terms, and abstracts of PubMed publications for 595 faculty at Columbia University Irving Medical Center, obtaining human-written profiles for 167. GPT-4o-mini was prompted to summarize each researcher's interests. Manual and automated evaluations characterized similarities between machine-generated and self-written profiles. The similarity study showed low ROUGE-L, BLEU, and METEOR scores, reflecting little terminological overlap. BERTScore analysis revealed moderate semantic similarity (F1: 0.542 for MeSH-based, 0.555 for abstract-based), despite low lexical overlap. In validation, paraphrased summaries achieved a higher F1 of 0.851. Comparing original and manually paraphrased summaries indicated limitations of such metrics. Kullback-Leibler (KL) Divergence of TF-IDF values (8.56 for MeSH-based, 8.58 for abstract-based) suggests machine summaries employ different keywords than human-written ones. Manual reviews showed 77.78% rated MeSH-based profiling "good" or "excellent," with readability rated favorably in 93.44% of cases, though granularity and accuracy varied. Panel reviews favored 67.86% of MeSH-derived profiles over abstract-derived ones. LLMs promise to automate scientific interest profiling at scale. MeSH-derived profiles have better readability than abstract-derived ones. Machine-generated summaries differ from human-written ones in concept choice, with the latter initiating more novel ideas.

preprint2026arXiv

Toward Global Large Language Models in Medicine

Despite continuous advances in medical technology, the global distribution of health care resources remains uneven. The development of large language models (LLMs) has transformed the landscape of medicine and holds promise for improving health care quality and expanding access to medical information globally. However, existing LLMs are primarily trained on high-resource languages, limiting their applicability in global medical scenarios. To address this gap, we constructed GlobMed, a large multilingual medical dataset, containing over 500,000 entries spanning 12 languages, including four low-resource languages. Building on this, we established GlobMed-Bench, which systematically assesses 56 state-of-the-art proprietary and open-weight LLMs across multiple multilingual medical tasks, revealing significant performance disparities across languages, particularly for low-resource languages. Additionally, we introduced GlobMed-LLMs, a suite of multilingual medical LLMs trained on GlobMed, with parameters ranging from 1.7B to 8B. GlobMed-LLMs achieved an average performance improvement of over 40% relative to baseline models, with a more than threefold increase in performance on low-resource languages. Together, these resources provide an important foundation for advancing the equitable development and application of LLMs globally, enabling broader language communities to benefit from technological advances.