Source author record

Yuchen Xu

Yuchen Xu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Topics: Computer Vision; cond-mat.mtrl-sci; eess.IV; Applications; cond-mat.mes-hall; Distributed, Parallel, and Cluster Computing; physics.comp-ph

Catalog footprint: 6 works, 7 topics, 4 close collaborators


Preprint, 2026, arXiv

Fine-grained MoE Load Balancing with Linear Programming

Mixture-of-Experts (MoE) has emerged as a promising approach to scale up deep learning models due to its significant reduction in computational resources. However, the dynamic nature of MoE leads to load imbalance among experts, severely impacting training efficiency. While previous research has attempted to address the load balancing challenge, existing solutions either compromise model accuracy or introduce additional system overhead. As a result, they fail to achieve fine-grained load balancing, which is crucial to optimizing training efficiency. We propose a novel parallelization strategy to achieve fine-grained load balancing in MoE systems. Our system, MicroMoE, is capable of achieving optimal load balancing in every micro-batch through efficient token scheduling across GPUs. Our experimental results demonstrate that MicroMoE improves the end-to-end training throughput by up to 47.6% compared with the state-of-the-art system, and almost consistently achieves optimal load balance among GPUs.
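As a rough illustration of the load imbalance this abstract describes, the sketch below computes per-GPU token loads and the ratio of the busiest GPU to the mean load. The function names, the static expert-to-GPU placement, and the token counts are all hypothetical; this is only a way to measure imbalance, not the paper's linear-programming scheduler.

```python
def gpu_loads(tokens_per_expert, expert_to_gpu, num_gpus):
    """Sum the tokens routed to each GPU under a fixed expert placement.

    tokens_per_expert[i] is the token count routed to expert i (hypothetical data);
    expert_to_gpu[i] is the GPU index hosting expert i (assumed static placement).
    """
    loads = [0] * num_gpus
    for expert, count in enumerate(tokens_per_expert):
        loads[expert_to_gpu[expert]] += count
    return loads


def imbalance_ratio(loads):
    """Ratio of the most loaded GPU to the mean load; 1.0 means perfectly balanced."""
    mean = sum(loads) / len(loads)
    return max(loads) / mean if mean > 0 else 1.0


# Hypothetical micro-batch: four experts placed on two GPUs.
loads = gpu_loads([900, 100, 300, 300], [0, 0, 1, 1], num_gpus=2)
ratio = imbalance_ratio(loads)
```

A ratio above 1.0 means the busiest GPU holds more than its fair share of tokens, so the other GPUs wait for it in that micro-batch.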

Preprint, 2023, arXiv

Feature detection and hypothesis testing for extremely noisy nanoparticle images using topological data analysis

We propose a flexible algorithm for feature detection and hypothesis testing in images with ultra-low signal-to-noise ratio using cubical persistent homology. Our main application is in the identification of atomic columns and other features in transmission electron microscopy (TEM). Cubical persistent homology is used to identify local minima and their size in subregions in the frames of nanoparticle videos, which are hypothesized to correspond to relevant atomic features. We compare the performance of our algorithm to other employed methods for the detection of columns and their intensity. Additionally, Monte Carlo goodness-of-fit testing using real-valued summaries of persistence diagrams derived from smoothed images (generated from pixels residing in the vacuum region of an image) is developed and employed to identify whether or not the proposed atomic features generated by our algorithm are due to noise. Using these summaries derived from the generated persistence diagrams, one can produce univariate time series for the nanoparticle videos, thus providing a means for assessing fluxional behavior. A guarantee on the false discovery rate for multiple Monte Carlo testing of identical hypotheses is also established.
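The Monte Carlo goodness-of-fit step mentioned in this abstract can be illustrated with the standard Monte Carlo p-value. This is a minimal sketch that assumes a real-valued summary has already been computed for the observed region and for a set of simulated noise-only images, and that larger values count as more extreme. The function name and inputs are hypothetical, and the paper's actual summary statistic and testing procedure are not reproduced here.

```python
def monte_carlo_p_value(observed, simulated):
    """Standard Monte Carlo p-value with the +1 correction.

    observed: summary value for the candidate feature (assumed precomputed).
    simulated: summary values from noise-only simulations (assumed precomputed).
    Larger values are treated as more extreme, which is an assumption.
    """
    if not simulated:
        raise ValueError("at least one simulated value is required")
    # Count simulations at least as extreme as the observation.
    exceed = sum(1 for s in simulated if s >= observed)
    return (exceed + 1) / (len(simulated) + 1)


# Hypothetical values: one of four noise simulations reaches the observed summary.
p = monte_carlo_p_value(3.0, [1.0, 2.0, 3.5, 0.5])
```

The +1 in the numerator and denominator counts the observation itself among the draws, so the p-value can never be exactly zero.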

Preprint, 2020, arXiv

Outlier Guided Optimization of Abdominal Segmentation

Abdominal multi-organ segmentation of computed tomography (CT) images has been the subject of extensive research interest. It presents a substantial challenge in medical image processing, as the shape and distribution of abdominal organs can vary greatly among the population and within an individual over time. While continuous integration of novel datasets into the training set provides potential for better segmentation performance, collection of data at scale is not only costly, but also impractical in some contexts. Moreover, it remains unclear what marginal value additional data have to offer. Herein, we propose a single-pass active learning method through human quality assurance (QA). We built on a pre-trained 3D U-Net model for abdominal multi-organ segmentation and augmented the dataset either with outlier data (e.g., exemplars for which the baseline algorithm failed) or inliers (e.g., exemplars for which the baseline algorithm worked). The new models were trained using the augmented datasets with 5-fold cross-validation (for outlier data) and withheld outlier samples (for inlier data). Manual labeling of outliers increased Dice scores with outliers by 0.130, compared to an increase of 0.067 with inliers (p<0.001, two-tailed paired t-test). By adding 5 to 37 inliers or outliers to training, we find that the marginal value of adding outliers is higher than that of adding inliers. In summary, improvement on single-organ performance was obtained without diminishing multi-organ performance or significantly increasing training time. Hence, identification and correction of baseline failure cases present an effective and efficient method of selecting training data to improve algorithm performance.
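The Dice scores reported in this abstract measure overlap between a predicted and a reference segmentation. A minimal sketch on flattened binary masks follows; the function name and the convention of returning 1.0 when both masks are empty are assumptions, not details taken from the paper.

```python
def dice_score(pred, truth):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two flattened binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * overlap / total


# Hypothetical four-voxel masks sharing one foreground voxel.
score = dice_score([1, 1, 0, 0], [1, 0, 1, 0])
```

On this scale a change of 0.130 is an absolute difference in overlap, with 1.0 meaning the two masks coincide exactly.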

Preprint, 2020, arXiv

Validation and Optimization of Multi-Organ Segmentation on Clinical Imaging Archives

Segmentation of abdominal computed tomography (CT) provides spatial context, morphological properties, and a framework for tissue-specific radiomics to guide quantitative radiological assessment. A 2015 MICCAI challenge spurred substantial innovation in multi-organ abdominal CT segmentation with both traditional and deep learning methods. Recent innovations in deep methods have driven performance toward levels for which clinical translation is appealing. However, continued cross-validation on open datasets presents the risk of indirect knowledge contamination and could result in circular reasoning. Moreover, 'real world' segmentations can be challenging due to the wide variability of abdomen physiology within patients. Herein, we perform two data retrievals to capture clinically acquired deidentified abdominal CT cohorts with respect to a recently published variation on 3D U-Net (baseline algorithm). First, we retrieved 2004 deidentified studies on 476 patients with diagnosis codes involving spleen abnormalities (cohort A). Second, we retrieved 4313 deidentified studies on 1754 patients without diagnosis codes involving spleen abnormalities (cohort B). We perform prospective evaluation of the existing algorithm on both cohorts, yielding failure rates of 13% and 8%, respectively. Then, we identified 51 subjects in cohort A with segmentation failures and manually corrected the liver and gallbladder labels. We re-trained the model with the manual labels added, improving performance to failure rates of 9% and 6% for cohorts A and B, respectively. In summary, the performance of the baseline on the prospective cohorts was similar to that on previously published datasets. Moreover, adding data from the first cohort substantively improved performance when evaluated on the second withheld validation cohort.

Preprint, 2016, arXiv

Low lattice thermal conductivity of stanene

A fundamental understanding of phonon transport in stanene is crucial to predict the thermal performance in potential stanene-based devices. By combining first-principles calculations and the phonon Boltzmann transport equation, we obtain the lattice thermal conductivity of stanene. A much lower thermal conductivity (11.6 W/mK) is observed in stanene, which indicates higher thermoelectric efficiency than other 2D materials. The contributions of acoustic and optical phonons to the lattice thermal conductivity are evaluated. Detailed analysis of phase space for three-phonon processes shows that phonon scattering channels LA+LA/TA/ZA$\leftrightarrow$TA/ZA are restricted, leading to the dominant contributions of high-group-velocity LA phonons to the thermal conductivity. The size dependence of thermal conductivity is also investigated to inform the design of thermoelectric nanostructures.

Preprint, 2015, arXiv

Thermal conductivity of monolayer MoS2, MoSe2, and WS2: Interplay of mass effect, interatomic bonding and anharmonicity

Phonons are essential for understanding the thermal properties in monolayer transition metal dichalcogenides, which limit their thermal performance for potential applications. We investigate the lattice dynamics and thermodynamic properties of MoS2, MoSe2, and WS2 by first-principles calculations. The obtained phonon frequencies and thermal conductivities agree well with the measurements. Our results show that the thermal conductivity of MoS2 is the highest among the three materials due to its much lower average atomic mass. We also discuss the competition between mass effect, interatomic bonding and anharmonic vibrations in determining the thermal conductivity of WS2. Strong covalent W-S bonding and low anharmonicity in WS2 are found to be crucial in understanding its much higher thermal conductivity compared to MoSe2.