Researcher profile

Dingwen Li

Dingwen Li contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 13 - UnverifiedVerification L1Unclaimed author
2works
0followers
3topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

2 published item(s)

preprint2026arXiv

Prompt-Induced Score Variance in Zero-Shot Binary Vision-Language Safety Classification

Single-prompt first-token probabilities from zero-shot vision-language model (VLM) safety classifiers are treated as decision scores, but we show they are unreliable under semantically equivalent prompt reformulation: even when the binary label is constrained to a fixed output position, equivalent prompts can induce materially different unsafe probabilities for the same sample. Across multimodal safety benchmarks and multiple VLM families, cross-prompt variance is strongly associated with prompt-level disagreement and higher error, making it a useful fragility diagnostic. A training-free mean ensemble improves NLL on all 14 dataset-model evaluation pairs and ECE on 12/14 relative to a train-selected single-prompt baseline, and wins more head-to-head NLL comparisons than labeled temperature scaling, Platt scaling, and isotonic regression applied to the same prompt. Ranking gains are consistent against the train-selected baseline on both AUROC and AUPRC, and against the full 15-prompt distribution remain consistent on AUPRC while softening on AUROC. Labeled calibration on top of the mean provides further gains when labels are available, identifying prompt averaging as a strong label-free first stage rather than a replacement for calibration. We frame this as a reliability stress test for zero-shot VLM first-token safety scores and recommend prompt-family evaluation with mean aggregation as a standard label-free reliability baseline.

preprint2022arXiv

Predicting Intraoperative Hypoxemia with Hybrid Inference Sequence Autoencoder Networks

We present an end-to-end model using streaming physiological time series to predict near-term risk for hypoxemia, a rare, but life-threatening condition known to cause serious patient harm during surgery. Inspired by the fact that a hypoxemia event is defined based on a future sequence of low SpO2 (i.e., blood oxygen saturation) instances, we propose the hybrid inference network (hiNet) that makes hybrid inference on both future low SpO2 instances and hypoxemia outcomes. hiNet integrates 1) a joint sequence autoencoder that simultaneously optimizes a discriminative decoder for label prediction, and 2) two auxiliary decoders trained for data reconstruction and forecast, which seamlessly learn contextual latent representations that capture the transition from present states to future states. All decoders share a memory-based encoder that helps capture the global dynamics of patient measurement. For a large surgical cohort of 72,081 surgeries at a major academic medical center, our model outperforms strong baselines including the model used by the state-of-the-art hypoxemia prediction system. With its capability to make real-time predictions of near-term hypoxemic at clinically acceptable alarm rates, hiNet shows promise in improving clinical decision making and easing burden of perioperative care.