Researcher profile

Juexiao Zhou

Juexiao Zhou contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 13 - UnverifiedVerification L1Unclaimed author
2works
0followers
2topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

2 published item(s)

preprint2026arXiv

Honesty-Aware Multi-Agent Framework for High-Fidelity Synthetic Data Generation in Digital Psychiatric Intake Doctor-Patient Interactions

Data scarcity and unreliable self-reporting -- such as concealment or exaggeration -- pose fundamental challenges to psychiatric intake and assessment. We propose a multi-agent synthesis framework that explicitly models patient deception to generate high-fidelity, publicly releasable synthetic psychiatric intake records. Starting from DAIC-WOZ interviews, we construct enriched patient profiles and simulate a four-role workflow: a \emph{Patient} completes self-rated scales and participates in a semi-structured interview under a topic-dependent honesty state; an \emph{Assessor} selects instruments based on demographics and chief complaints; an \emph{Evaluator} conducts the interview grounded in rater-administered scales, tracks suspicion, and completes ratings; and a \emph{Diagnostician} integrates all evidence into a diagnostic summary. Each case links the patient profile, self-rated and rater-administered responses, interview transcript, diagnostic summary, and honesty state. We validate the framework through four complementary evaluations: diagnostic consistency and severity grading, chain-of-thought ablations, human evaluation of clinical realism and dishonesty modeling, and LLM-based comparative evaluation. The resulting corpus spans multiple disorders and severity levels, enabling controlled study of dishonesty-aware psychiatric assessment and the training and evaluation of adaptive dialogue agents.

preprint2026arXiv

Towards Trustworthy Dermatology MLLMs: A Benchmark and Multimodal Evaluator for Diagnostic Narratives

Multimodal large language models (LLMs) are increasingly used to generate dermatology diagnostic narratives directly from images. However, reliable evaluation remains the primary bottleneck for responsible clinical deployment. We introduce a novel evaluation framework that combines DermBench, a meticulously curated benchmark, with DermEval, a robust automatic evaluator, to enable clinically meaningful, reproducible, and scalable assessment. We build DermBench, which pairs 4,000 real-world dermatology images with expert-certified diagnostic narratives and uses an LLM-based judge to score candidate narratives across clinically grounded dimensions, enabling consistent and comprehensive evaluation of multimodal models. For individual case assessment, we train DermEval, a reference-free multimodal evaluator. Given an image and a generated narrative, DermEval produces a structured critique along with an overall score and per-dimension ratings. This capability enables fine-grained, per-case analysis, which is critical for identifying model limitations and biases. Experiments on a diverse dataset of 4,500 cases demonstrate that DermBench and DermEval achieve close alignment with expert ratings, with mean deviations of 0.251 and 0.117 (out of 5), respectively, providing reliable measurement of diagnostic ability and trustworthiness across different multimodal LLMs.