Researcher profile

Yuhan Wang

Yuhan Wang contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 (Emerging) · Verification L1 · Unclaimed author
13 works
0 followers
8 topics
4 close collaborators

Published work

13 published items

preprint · 2026 · arXiv

Any2Any 3D Diffusion Models with Knowledge Transfer: A Radiotherapy Planning Study

Voxel-wise dose prediction is a critical yet challenging task in practical radiotherapy (RT) planning, as bespoke models trained from scratch often struggle to generalize across diverse clinical settings. Meanwhile, generative models trained on billion-scale datasets from vision domains have achieved impressive performance. Herein, we propose DiffKT3D, a unified Any2Any 3D diffusion framework that leverages prior knowledge from pretrained video diffusion models for efficient and clinically meaningful dose prediction. To enable flexible conditioning across multiple clinical modalities (CT, anatomical structures, body, beam settings, etc.), we introduce an Any2Any conditional paradigm utilizing modality-specific embeddings without cross-attention overhead. Further, we design a novel reinforcement learning (RL) post-training mechanism guided by a clinically-informed Scorecard explicitly tailored to institutional treatment preferences. Compared with the winner of the GDP-HMM challenge, DiffKT3D sets a new state-of-the-art in dose prediction by reducing voxel-level MAE from 2.07 to 1.93. In addition, DiffKT3D achieves superior image quality and preference match. These results demonstrate that transferring diffusion priors via modality-aware conditioning and clinically aligned RL post-training can provide a robust and generalizable solution for RT planning across various clinical scenarios.

preprint · 2026 · arXiv

ClinSeekAgent: Automating Multimodal Evidence Seeking for Agentic Clinical Reasoning

Large language models (LLMs) and agentic systems have shown promise for clinical decision support, but existing works largely assume that evidence has already been curated and handed to the model. Real-world clinical workflows instead require agents to actively seek, iteratively plan, and synthesize multimodal evidence from heterogeneous sources. In this paper, we introduce ClinSeekAgent, an automated agentic framework for dynamic multimodal evidence seeking that shifts the paradigm from passive evidence consumption to active evidence acquisition. Given only a clinical query and access to raw data sources, ClinSeekAgent gathers evidence by querying medical knowledge bases, navigating raw EHRs, and invoking medical imaging tools; refines its hypotheses as new information emerges; and integrates the collected evidence into grounded clinical decisions. ClinSeekAgent serves both as an inference-time agent for frontier LLMs and as a training-time pipeline for distilling high-quality agent trajectories into compact open-source models. To validate its inference-time effectiveness, we construct ClinSeek-Bench, which pairs Curated Input reasoning from fixed pre-selected evidence with Automated Evidence-Seeking over raw clinical data. On text-only EHR tasks, ClinSeekAgent improves Claude Opus 4.6 from 60.0 to 63.2 overall F1 and MiniMax M2.5 from 43.1 to 47.3, with positive risk-prediction gains in 7 out of 9 evaluated host models. On multimodal tasks, ClinSeekAgent improves Claude Opus 4.6 from 47.5 to 62.6 (+15.1); all evaluated models improve across the three CXR-related task groups. We further validate ClinSeekAgent as a training pipeline by distilling agentic evidence-seeking trajectories into ClinSeek-35B-A3B, which achieves 34.0 average F1 on existing AgentEHR-Bench, improving over its Qwen3.5-35B-A3B baseline by +11.9 points and approaching Claude Opus 4.6.

preprint · 2026 · arXiv

Divergence Meets Consensus: A Multi-Source Negative Sampling Framework for Sequential Recommendation

Negative sampling is significant for training sequential recommendation models under implicit feedback. The predominant strategy, self-guided hard negative sampling, selects negatives based on the model's current state but suffers from three limitations: (1) the coupling between sampling and model updates triggers a vicious cycle that drives the model into local optima; (2) relying on current model parameters narrows sampling to a small region of the item space, reducing diversity and harming generalization; (3) identifying a hard negative requires scoring the entire candidate pool, causing substantial computational overhead with minimal information gain. To address these challenges, we propose MDCNS (Multi-source Divergence-Consensus for Negative Sampling), a novel "Teacher-Peer-Self" framework inspired by Vygotsky's Zone of Proximal Development (ZPD) theory. The proposed method comprises three components, including multi-source scoring, divergence re-ranking, and consensus distillation. Firstly, multi-source scoring incorporates peer and ensemble teacher models to inject external negative signals and break the self-reinforcement loop. Then, divergence re-ranking exploits prediction discrepancy between self and peer models to enhance sampling diversity. Finally, consensus distillation aligns the self model with the teacher via KL divergence, simultaneously improving computational cost utilization. Extensive experiments on six real-world datasets and five backbone models show that MDCNS consistently outperforms state-of-the-art negative sampling methods, demonstrating strong effectiveness and generalization.

preprint · 2026 · arXiv

Graph-Based Financial Fraud Detection with Calibrated Risk Scoring and Structural Regularization

Financial transaction fraud prevention faces challenges such as complex relationship structures, concealed behavioral patterns, and dynamically changing data distributions. Discrimination models relying solely on independent sample features are insufficient to fully characterize the risks of group collaboration and chain transfers within transaction networks. This paper proposes a graph neural network representation learning and risk discrimination framework for financial transaction fraud prevention. It integrates transaction records and identity information into node attributes and constructs a transaction graph based on shared attributes and interaction consistency to explicitly model inter-transaction relationships. In model design, a multi-layer message passing mechanism is employed to aggregate neighborhood information, learn node embedding representations containing structural context semantics, and output transaction-level fraud probability and risk scores through a lightweight risk discrimination head. A weighted supervision objective is introduced to mitigate training bias caused by class imbalance, and structural consistency regularization constraints are combined to suppress the impact of noisy edges on representation drift, thereby improving the stability and usability of risk characterization. Experiments are conducted on a publicly available financial transaction dataset, comparing several methods from the same line of work and evaluating them under a unified protocol. The results show that the proposed method outperforms other methods in risk ranking and probability calibration quality, validating the effectiveness of combining graph structure modeling with representation learning in financial transaction fraud prevention.
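The abstract mentions a weighted supervision objective for class imbalance without giving its form. A common choice is binary cross-entropy with the positive (fraud) term scaled by a weight; the sketch below assumes that form. The function names and the inverse-frequency weight are illustrative assumptions, not the paper's definition.

```python
import math


def inverse_frequency_weight(labels):
    """Assumed heuristic: weight positives by the negative-to-positive ratio."""
    pos = sum(labels)
    neg = len(labels) - pos
    return neg / pos


def weighted_bce(probs, labels, pos_weight):
    """Mean binary cross-entropy with the label-1 term scaled by pos_weight."""
    eps = 1e-12  # clamp probabilities to avoid log(0)
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)
        total += -(pos_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


# Hypothetical batch: one fraudulent transaction among four.
labels = [0, 0, 0, 1]
probs = [0.5, 0.5, 0.5, 0.5]
w = inverse_frequency_weight(labels)
loss = weighted_bce(probs, labels, w)
```

With this weighting, the single fraud example contributes as much to the loss as the three legitimate ones combined, which is the usual motivation for such a term.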

preprint · 2026 · arXiv

S2Aligner: Pair-Efficient and Transferable Pre-Training for Sparse Text-Attributed Graphs

Pre-training on text-attributed graphs (TAGs) is central to building transferable graph foundation models, where LLM-as-Aligner methods align graph and text representations through the semantic knowledge of large language models. However, these methods usually assume that node texts provide sufficient and reliable supervision, an assumption often violated in real-world sparse TAGs. When textual anchors are missing, noisy, or uneven across domains, graph structures must be aligned with weak semantic evidence, leading to unreliable structure-semantics correspondence and sparsity-induced transfer bias. This paper presents S2Aligner, a sparsity-aware and structure-enhanced LLM-as-Aligner framework for graph-text pre-training on sparse TAGs. The key idea is to decouple semantic alignment from structural modeling, allowing topology-aware signals to enhance alignment without contaminating the shared semantic space. Specifically, S2Aligner decomposes graph-text representations into semantic and structural components, uses structure-oriented reconstruction with consistency control to inject reliable topology cues into text representations, and suppresses inconsistent structural signals under textual sparsity. Moreover, S2Aligner introduces sparsity-aware cross-domain risk balancing, which calibrates domain risks through a global-domain density ratio and downweights unreliable sparse samples via graph reliability estimation. Theoretical analysis shows that this objective reduces cross-domain generalization gaps by controlling domain risk discrepancy. Extensive experiments across diverse graph domains, sparsity levels, and downstream tasks demonstrate that S2Aligner consistently outperforms existing baselines.

preprint · 2026 · arXiv

Unsupervised Graph Modeling for Anomaly Detection in Accounting Subject Relationships

This paper addresses the problem of anomaly detection in accounting subject association structures, proposing a structured modeling and unsupervised discriminant framework based on graph neural networks. This framework is used to mine stable correspondences between subjects and identify structural deviations from general ledger details and voucher entries. The method first abstracts accounting subjects as graph nodes, and the co-occurrence and debit/credit correspondence of subjects in the same business record are abstracted as weighted edges. The edge weights are characterized by statistical measures such as co-occurrence frequency or amount aggregation, thus forming a period-level accounting subject association graph. In the representation learning stage, a message passing mechanism is used to fuse the node's own attributes and neighborhood context to obtain node embeddings containing structural information. In the anomaly detection stage, the rationality of subject pair connections is estimated through a relation reconstruction decoder, and edge-level anomaly scores are defined based on the degree of deviation in reconstruction probabilities. These scores are then aggregated to obtain node-level risk ranking and local anomaly localization. This framework can simultaneously capture local substructure anomalies and cross-community anomaly connections without relying on anomaly labeling, outputting traceable subject pair risk clues. Comparative experiments demonstrate more stable comprehensive discriminant capabilities and higher top-ranking accuracy.
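The scoring step described in the abstract (edge-level scores from reconstruction probabilities, aggregated into a node-level ranking) can be sketched as follows. This is a minimal illustration under assumptions: the score `1 - p`, the mean aggregation, and the account names are hypothetical choices, since the abstract does not specify them.

```python
from collections import defaultdict


def edge_anomaly_scores(recon_probs):
    """Map each observed edge to 1 - p, where p is the decoder's
    reconstruction probability for that edge (assumed scoring rule)."""
    return {edge: 1.0 - p for edge, p in recon_probs.items()}


def node_risk_ranking(edge_scores):
    """Aggregate edge scores per node by mean (assumed) and sort
    nodes from highest to lowest risk."""
    incident = defaultdict(list)
    for (u, v), score in edge_scores.items():
        incident[u].append(score)
        incident[v].append(score)
    node_scores = {n: sum(s) / len(s) for n, s in incident.items()}
    return sorted(node_scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical reconstruction probabilities for three subject pairs.
recon_probs = {
    ("cash", "revenue"): 0.9,
    ("cash", "payables"): 0.8,
    ("inventory", "revenue"): 0.1,
}
scores = edge_anomaly_scores(recon_probs)
ranking = node_risk_ranking(scores)
```

Here the poorly reconstructed inventory-revenue pair gets the highest edge score, so both of its endpoints move up the node ranking, and the edge itself remains available as a traceable risk clue.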

preprint · 2022 · arXiv

Assembly development for the Simons Observatory focal plane readout module

The Simons Observatory (SO) is a suite of instruments sensitive to temperature and polarization of the cosmic microwave background (CMB) to be located at Cerro Toco in the Atacama Desert in Chile. Five telescopes, one large aperture telescope and four small aperture telescopes, will host roughly 70,000 highly multiplexed transition edge sensor (TES) detectors operated at 100 mK. Each SO focal plane module (UFM) couples 1,764 TESes to microwave resonators in a microwave multiplexing (uMux) readout circuit. Before detector integration, the 100 mK uMux components are packaged into multiplexing modules (UMMs), which are independently validated to ensure they meet SO performance specifications. Here we present the assembly developments of these UMM readout packages for mid frequency (90/150 GHz) and ultra high frequency (220/280 GHz) UFMs.

preprint · 2022 · arXiv

Development and performance of Universal Readout Harnesses for the Simons Observatory

The Simons Observatory (SO) is a ground-based cosmic microwave background (CMB) survey experiment that consists of three 0.5 m small-aperture telescopes and one 6 m large-aperture telescope, sited at an elevation of 5200 m in the Atacama Desert in Chile. SO will utilize more than 60,000 transition edge sensors (TES) to observe CMB temperature and polarization in six frequency bands from 27-280 GHz. Common to both the small and large aperture telescope receivers (LATR) is the 300K-4K Universal Readout Harness (URH), which supports up to 600 DC bias lines and 24 radio frequency (RF) channels consisting of input and output coaxial cables, input attenuators and custom high dynamic range 40K low-noise amplifiers (LNAs) on the output readout coaxial cable. Each RF channel can read out up to 1000 TES detectors. In this paper, we will present the design and characterization of the six URHs constructed for the initial phase of SO deployment.

preprint · 2022 · arXiv

Simons Observatory Focal-Plane Module: Detector Re-biasing With Bias-step Measurements

The Simons Observatory is a ground-based cosmic microwave background survey experiment that consists of three 0.5 m small-aperture telescopes and one 6 m large-aperture telescope, sited at an elevation of 5200 m in the Atacama Desert in Chile. SO will deploy 60,000 transition-edge sensor (TES) bolometers in 49 separate focal-plane modules across a suite of four telescopes covering 30/40 GHz low frequency (LF), 90/150 GHz mid frequency (MF), and 220/280 GHz ultra-high frequency (UHF). Each MF and UHF focal-plane module packages 1720 optical detectors spread across 12 detector bias lines that provide voltage biasing to the detectors. During observation, detectors are subject to varying atmospheric emission and hence need to be re-biased accordingly. The re-biasing process includes quickly measuring detector properties such as the TES resistance and responsivity. Based on the result, the detectors within one bias line are then biased with a suitable voltage. Here we describe a technique for re-biasing detectors in the modules using the results of bias-step measurements.

preprint · 2022 · arXiv

The Simons Observatory 220 and 280 GHz Focal-Plane Module: Design and Initial Characterization

The Simons Observatory (SO) will detect and map the temperature and polarization of the millimeter-wavelength sky from Cerro Toco, Chile across a range of angular scales, providing rich data sets for cosmological and astrophysical analysis. The SO focal planes will be tiled with compact hexagonal packages, called Universal Focal-plane Modules (UFMs), in which the transition-edge sensor (TES) detectors are coupled to 100 mK microwave-multiplexing electronics. Three different types of dichroic TES detector arrays with bands centered at 30/40, 90/150, and 220/280 GHz will be implemented across the 49 planned UFMs. The 90/150 GHz and 220/280 GHz arrays each contain 1,764 TESes, which are read out with two 910x multiplexer circuits. The modules contain a series of densely routed silicon chips, which are packaged together in a controlled electromagnetic environment with robust heat-sinking to 100 mK. Following an overview of the module design, we report on early results from the first 220/280 GHz UFM, including detector yield, as well as readout and detector noise levels.

preprint · 2022 · arXiv

The Simons Observatory: Development and Validation of the Large Aperture Telescope Receiver

The Simons Observatory (SO) is a ground-based cosmic microwave background (CMB) survey experiment that consists of three 0.5 m small-aperture telescopes (SATs) and one 6 m large-aperture telescope (LAT), sited at an elevation of 5200 m in the Atacama Desert in Chile. In order to meet the sensitivity requirements set for next-generation CMB telescopes, the LAT will deploy 30,000 transition edge sensor (TES) detectors at 100 mK across 7 optics tubes (OT), all within the Large Aperture Telescope Receiver (LATR). Additionally, the LATR has the capability to expand to 62,000 TES across 13 OTs. The LAT will be capable of making arcminute-resolution observations of the CMB, with detector bands centered at 30, 40, 90, 150, 230, and 280 GHz. We have rigorously tested the LATR systems prior to deployment in order to fully characterize the instrument and show that it can achieve the desired sensitivity levels. We show that the LATR meets cryogenic and mechanical requirements, and maintains acceptably low baseline readout noise.

preprint · 2020 · arXiv

CurricularFace: Adaptive Curriculum Learning Loss for Deep Face Recognition

As an emerging topic in face recognition, designing margin-based loss functions can increase the feature margin between different classes for enhanced discriminability. More recently, mining-based strategies have been adopted to emphasize misclassified samples, achieving promising results. However, throughout the training process, prior methods either do not explicitly emphasize each sample according to its importance, which leaves hard samples under-exploited, or explicitly emphasize semi-hard and hard samples even at the early training stage, which may lead to convergence issues. In this work, we propose a novel Adaptive Curriculum Learning loss (CurricularFace) that embeds the idea of curriculum learning into the loss function to achieve a novel training strategy for deep face recognition, which mainly addresses easy samples in the early training stage and hard ones in the later stage. Specifically, our CurricularFace adaptively adjusts the relative importance of easy and hard samples during different training stages. In each stage, different samples are assigned different importance according to their difficulty. Extensive experimental results on popular benchmarks demonstrate the superiority of our CurricularFace over the state-of-the-art competitors.

preprint · 2020 · arXiv

Intelligent Home 3D: Automatic 3D-House Design from Linguistic Descriptions Only

Home design is a complex task that normally requires architects to apply their professional skills and tools. It would be fascinating if one could produce a house plan intuitively, for example via natural language, without much knowledge of home design or experience with complex design tools. In this paper, we formulate this as a language-conditioned visual content generation problem that is further divided into a floor plan generation task and an interior texture (such as floor and wall) synthesis task. The only control signal of the generation process is the linguistic expression given by users that describes the house details. To this end, we propose a House Plan Generative Model (HPGM) that first translates the language input to a structural graph representation and then predicts the layout of rooms with a Graph Conditioned Layout Prediction Network (GC LPN) and generates the interior texture with a Language Conditioned Texture GAN (LCT-GAN). With some post-processing, the final product of this task is a 3D house model. To train and evaluate our model, we build the first Text-to-3D House Model dataset.