Source author record

Wen Yan

Wen Yan appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher | Unclaimed source record

Computer Vision; cond-mat.soft; eess.IV; physics.comp-ph; physics.flu-dyn; Biological Physics; Computational Engineering, Finance, and Science; cond-mat.mtrl-sci; Machine Learning; physics.chem-ph

Catalog footprint

What is connected

9 works

10 topics

4 close collaborators


preprint, 2026, arXiv

Bridging Quantum Mechanics to Organic Liquid Properties via a Universal Force Field

Molecular dynamics (MD) simulations are essential tools for unraveling atomistic insights into the structure and dynamics of condensed-phase systems. However, the universal and accurate prediction of macroscopic properties from ab initio calculations remains a significant challenge, often hindered by the trade-off between computational cost and simulation accuracy. Here, we present ByteFF-Pol, a graph neural network (GNN)-parameterized polarizable force field, trained exclusively on high-level quantum mechanics (QM) data. Leveraging physically motivated force field forms and training strategies, ByteFF-Pol exhibits exceptional performance in predicting thermodynamic and transport properties for a wide range of small-molecule liquids and electrolytes, outperforming state-of-the-art (SOTA) classical and machine learning force fields. The zero-shot prediction capability of ByteFF-Pol bridges the gap between microscopic QM calculations and macroscopic liquid properties, enabling the exploration of previously intractable chemical spaces. This advancement holds transformative potential for applications such as electrolyte design and custom-tailored solvents, representing a pivotal step toward data-driven materials discovery.

preprint, 2026, arXiv

CARD: Coarse-to-fine Autoregressive Modeling with Radix-based Decomposition for Transferable Free Energy Estimation

Estimating free energy differences quantifies thermodynamic preferences in molecular interactions, which is central to chemistry and drug discovery. Despite fruitful progress, existing methods still face key limitations: classical computational approaches remain prohibitively expensive due to their reliance on extensive molecular dynamics simulations, while deep learning-based methods are constrained by either less-expressive generative models or input dimensions tied to a specific system, resulting in negligible generalization. To address these challenges, we propose CARD, a generative framework that employs a novel radix-based decomposition to bijectively convert 3D coordinates into mixed discrete-continuous sequences, enabling coarse-to-fine autoregressive modeling with enhanced expressiveness. Notably, the model corresponds to a distribution with zero free energy, serving as a proposal for absolute free energy computation of arbitrary systems without relying on alchemical pathways. Experiments across diverse tasks demonstrate that CARD matches the accuracy of classical computational methods on unseen systems with diverse topologies, while achieving an approximately 40-fold speedup in inference.

preprint, 2022, arXiv

Cross-Modality Image Registration using a Training-Time Privileged Third Modality

In this work, we consider the task of pairwise cross-modality image registration, which may benefit from exploiting additional images available only at training time from an additional modality that is different to those being registered. As an example, we focus on aligning intra-subject multiparametric Magnetic Resonance (mpMR) images, between T2-weighted (T2w) scans and diffusion-weighted scans with high b-value (DWI$_{high-b}$). For the application of localising tumours in mpMR images, diffusion scans with zero b-value (DWI$_{b=0}$) are considered easier to register to T2w due to the availability of corresponding features. We propose a learning from privileged modality algorithm, using a training-only imaging modality DWI$_{b=0}$, to support the challenging multi-modality registration problems. We present experimental results based on 369 sets of 3D multiparametric MRI images from 356 prostate cancer patients and report, with statistical significance, a lowered median target registration error of 4.34 mm, when registering the holdout DWI$_{high-b}$ and T2w image pairs, compared with that of 7.96 mm before registration. Results also show that the proposed learning-based registration networks enabled efficient registration with comparable or better accuracy, compared with a classical iterative algorithm and other tested learning-based methods with/without the additional modality. These compared algorithms also failed to produce any significantly improved alignment between DWI$_{high-b}$ and T2w in this challenging application.

preprint, 2022, arXiv

Few-shot image segmentation for cross-institution male pelvic organs using registration-assisted prototypical learning

The ability to adapt medical image segmentation networks for a novel class such as an unseen anatomical or pathological structure, when only a few labelled examples of this class are available from local healthcare providers, is sought-after. This potentially addresses two widely recognised limitations in deploying modern deep learning models to clinical practice: expertise-and-labour-intensive labelling and cross-institution generalisation. This work presents the first 3D few-shot interclass segmentation network for medical images, using a labelled multi-institution dataset from prostate cancer patients with eight regions of interest. We propose an image alignment module registering the predicted segmentation of both query and support data, in a standard prototypical learning algorithm, to a reference atlas space. The built-in registration mechanism can effectively utilise the prior knowledge of consistent anatomy between subjects, regardless of whether they are from the same institution or not. Experimental results demonstrated that the proposed registration-assisted prototypical learning significantly improved segmentation accuracy (p-values<0.01) on query data from a holdout institution, with varying availability of support data from multiple institutions. We also report the additional benefits of the proposed 3D networks with 75% fewer parameters and an arguably simpler implementation, compared with existing 2D few-shot approaches that segment 2D slices of volumetric medical images.

preprint, 2022, arXiv

Hydrodynamic instabilities and collective dynamics in activity-balanced pusher-puller mixtures

Microorganisms living in microfluidic environments often form multi-species swarms, where they can leverage collective motions to achieve enhanced transport and spreading. Nevertheless, there is a general lack of physical understanding of the origins of the multiscale unstable dynamics observed within these systems. Here, we build a computational model to study binary suspensions of rear- and front-actuated microswimmers, or respectively the so-called "pusher" and "puller" particles, that have different populations and swimming speeds. We perform direct particle simulations to reveal that collective system dynamics are possible even in the scenario of an "activity-balanced" mixture, which produces near-zero mean extra stress. We first construct a continuum kinetic model to describe the initial transient period when the system is near uniform isotropy and then perform linear stability analysis to reveal the system's finite-wavelength hydrodynamic instabilities, in contrast with the long-wavelength instabilities of pure pusher/puller suspensions. Then, we carry out slender-body discrete particle simulations to resolve both the short-time instabilities and the long-time dynamics, which feature non-trivial density fluctuations and spatially correlated motions, distinct from those of single-species suspensions.

preprint, 2022, arXiv

The impact of using voxel-level segmentation metrics on evaluating multifocal prostate cancer localisation

Dice similarity coefficient (DSC) and Hausdorff distance (HD) are widely used for evaluating medical image segmentation. They have also been criticised, when reported alone, for their unclear or even misleading clinical interpretation. DSCs may also differ substantially from HDs, due to boundary smoothness or multiple regions of interest (ROIs) within a subject. More importantly, either metric can also have a nonlinear, non-monotonic relationship with outcomes based on Type 1 and 2 errors, designed for specific clinical decisions that use the resulting segmentation. Cases causing disagreement between these metrics are not difficult to postulate. This work first proposes a new asymmetric detection metric, adapting those used in object detection, for planning prostate cancer procedures. The lesion-level metric is then compared with the voxel-level DSC and HD, with a 3D UNet used for segmenting lesions from multiparametric MR (mpMR) images. Based on experimental results we report pairwise agreement and correlation 1) between DSC and HD, and 2) between voxel-level DSC and recall-controlled precision at lesion-level, with Cohen's kappa in [0.49, 0.61] and Pearson's r in [0.66, 0.76] (p-values<0.001) at varying cut-offs. However, the differences in false-positives and false-negatives, between the actual errors and the perceived counterparts if DSC is used, can be as high as 152 and 154, respectively, out of the 357 test set lesions. We therefore carefully conclude that, despite the significant correlations, voxel-level metrics such as DSC can misrepresent lesion-level detection accuracy for evaluating localisation of multifocal prostate cancer and should be interpreted with caution.
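To make the voxel-level versus lesion-level distinction in this abstract concrete, here is a minimal Python sketch. It is an illustration only: the `dice` function follows the standard DSC definition, while `lesion_recall` uses a simple any-overlap criterion that is an assumption of this sketch and is not the asymmetric detection metric proposed in the paper. The toy 1D masks are hypothetical.

```python
import numpy as np

def dice(pred, truth):
    """Standard Dice similarity coefficient between two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return 2.0 * np.logical_and(pred, truth).sum() / denom

def lesion_recall(pred, lesions):
    """Fraction of ground-truth lesions touched by the prediction.

    Assumption: a lesion counts as detected if any voxel overlaps.
    """
    pred = np.asarray(pred, dtype=bool)
    hits = [bool(np.logical_and(pred, lesion).any()) for lesion in lesions]
    return sum(hits) / len(hits)

# Hypothetical 1D "image" with one large lesion (8 voxels) and one small lesion (1 voxel).
large = np.zeros(16, dtype=bool)
large[0:8] = True
small = np.zeros(16, dtype=bool)
small[12] = True
lesions = [large, small]
truth = large | small

# Prediction A segments the large lesion perfectly but misses the small one.
pred_a = large.copy()
# Prediction B covers only half of the large lesion but also finds the small one.
pred_b = np.zeros(16, dtype=bool)
pred_b[0:4] = True
pred_b[12] = True

dsc_a, recall_a = dice(pred_a, truth), lesion_recall(pred_a, lesions)
dsc_b, recall_b = dice(pred_b, truth), lesion_recall(pred_b, lesions)
```

In this toy case prediction A has the higher DSC but detects only one of the two lesions, whereas prediction B has the lower DSC and touches both, which is the kind of disagreement the abstract describes.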

preprint, 2022, arXiv

Towards the cellular-scale simulation of motor-driven cytoskeletal assemblies

The cytoskeleton -- a collection of polymeric filaments, molecular motors, and crosslinkers -- is a foundational example of active matter, and in the cell assembles into organelles that guide basic biological functions. Simulation of cytoskeletal assemblies is an important tool for modeling cellular processes and understanding their surprising material properties. Here we present aLENS, a novel computational framework to surmount the limits of conventional simulation methods. We model molecular motors with crosslinking kinetics that adhere to a thermodynamic energy landscape, and integrate the system dynamics while efficiently and stably enforcing hard-body repulsion between filaments -- molecular potentials are entirely avoided in imposing steric constraints. Utilizing parallel computing, we simulate different mixtures of tens to hundreds of thousands of cytoskeletal filaments and crosslinking motors, recapitulating self-emergent phenomena such as bundle formation and buckling, and elucidating how motor type, thermal fluctuations, internal stresses, and confinement determine the evolution of active matter aggregates.

preprint, 2020, arXiv

A scalable computational platform for particulate Stokes suspensions

We describe a computational framework for simulating suspensions of rigid particles in Newtonian Stokes flow. One central building block is a collision-resolution algorithm that overcomes the numerical constraints arising from particle collisions. This algorithm extends the well-known complementarity method for non-smooth multi-body dynamics to resolve collisions in dense rigid body suspensions. This approach formulates the collision resolution problem as a linear complementarity problem with geometric 'non-overlapping' constraints imposed at each timestep. It is then reformulated as a constrained quadratic programming problem and the Barzilai-Borwein projected gradient descent method is applied for its solution. This framework is designed to be applicable to any convex particle shape, e.g., spheres and spherocylinders, and to any Stokes mobility solver, including the Rotne-Prager-Yamakawa approximation, Stokesian Dynamics, and PDE solvers (e.g., boundary integral and immersed boundary methods). In particular, this method imposes Newton's Third Law and records the entire contact network. Further, we describe a fast, parallel, and spectrally accurate boundary integral method tailored for spherical particles, capable of resolving lubrication effects. We show weak and strong parallel scalings up to $8\times 10^4$ particles with approximately $4\times 10^7$ degrees of freedom on $1792$ cores. We demonstrate the versatility of this framework with several examples, including sedimentation of particle clusters, and active matter systems composed of ensembles of particles driven to rotate.
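The reformulation mentioned in this abstract (a linear complementarity problem recast as a non-negatively constrained quadratic program and solved by Barzilai-Borwein projected gradient descent) can be sketched generically as follows. This is not the authors' implementation: the matrix `A` and vector `b` are arbitrary stand-ins chosen for illustration, `A` is assumed symmetric positive definite, and the function name, step rule (the BB1 step), and stopping test are assumptions of this sketch.

```python
import numpy as np

def bb_projected_gradient(A, b, iters=500, tol=1e-10):
    """Minimise 0.5*x^T A x + b^T x subject to x >= 0.

    For symmetric positive definite A, the minimiser solves the LCP:
    x >= 0, A x + b >= 0, x . (A x + b) = 0.
    """
    x = np.zeros_like(b, dtype=float)
    g = A @ x + b                          # gradient of the quadratic objective
    step = 1.0 / np.linalg.norm(A, 2)      # conservative first step: 1 / spectral norm
    for _ in range(iters):
        x_new = np.maximum(x - step * g, 0.0)   # gradient step, then project onto x >= 0
        g_new = A @ x_new + b
        s, y = x_new - x, g_new - g
        x, g = x_new, g_new
        # LCP residual: min(x, A x + b) vanishes componentwise at the solution
        if np.linalg.norm(np.minimum(x, g)) < tol:
            break
        sy = s @ y
        if sy > 0:
            step = (s @ s) / sy            # Barzilai-Borwein (BB1) step length
    return x

# Hypothetical 2x2 example (not taken from the paper).
A = np.array([[2.0, 1.0], [1.0, 2.0]])
b = np.array([-1.0, 1.0])
x = bb_projected_gradient(A, b)
w = A @ x + b   # complementary variable
```

For this small example the complementarity conditions can be checked by hand: the second constraint is inactive, so `x[1]` is zero and `x[0]` solves `2*x[0] - 1 = 0`. In a collision-resolution setting, `x` would play the role of contact force magnitudes and `w` the resulting separations, but that mapping is described here only loosely.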

preprint, 2015, arXiv

Electron counting and a large family of two-dimensional semiconductors

Compared with conventional semiconductors, the choice of two-dimensional semiconductor (2DSC) materials is very limited. Based on proper electron counting, we propose a large family of 2DSCs, all adopting the same structure and consisting of only main group elements. Using advanced density functional calculations, we demonstrate the attainability of these materials, and show that they cover a large range of lattice constants, band gaps and band edge states, and are therefore good candidate materials for heterojunctions. This family of two-dimensional materials may pave the way toward fabrication of 2DSC devices at the same thriving level as 3D semiconductors.
