Source author record

Juhee Lee

Juhee Lee appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Topics: Applications, Genomics, cond-mat.quant-gas, cond-mat.stat-mech, Methodology, Populations and Evolution

Catalog footprint

What is connected

5 works

6 topics

4 close collaborators


Preprint, 2020, arXiv

A Bayesian Feature Allocation Model for Identification of Cell Subpopulations Using Cytometry Data

A Bayesian feature allocation model (FAM) is presented for identifying cell subpopulations based on multiple samples of cell surface or intracellular marker expression level data obtained by cytometry by time of flight (CyTOF). Cell subpopulations are characterized by differences in expression patterns of markers, and individual cells are clustered into the subpopulations based on the patterns of their observed expression levels. A finite Indian buffet process is used to model subpopulations as latent features, and a model-based method based on these latent feature subpopulations is used to construct cell clusters within each sample. Non-ignorable missing data due to technical artifacts in mass cytometry instruments are accounted for by defining a static missing data mechanism. In contrast to conventional cell clustering methods based on observed marker expression levels that are applied separately to different samples, the FAM-based method can be applied simultaneously to multiple samples, and can identify important cell subpopulations likely to be missed by conventional clustering. The proposed FAM-based method is applied to jointly analyze three datasets, generated by CyTOF, to study natural killer (NK) cells. Because the subpopulations identified by the FAM may define novel NK cell subsets, this statistical analysis may provide useful information about the biology of NK cells and their potential role in cancer immunotherapy, which may lead, in turn, to the development of improved cellular therapies. Simulation studies of the proposed method's behavior under two cases of known subpopulations are also presented, followed by analysis of the CyTOF NK cell surface marker data.

Preprint, 2016, arXiv

Comment on "Influence of induced interactions on superfluid properties of quasi-two-dimensional dilute Fermi gases with spin-orbit coupling"

In a 2013 article, Caldas et al. [Phys. Rev. A 88, 023615 (2013)] derived analytical expressions for the induced interaction within the scheme of Gorkov and Melik-Barkhudarov in quasi-two-dimensional Fermi gases with Rashba spin-orbit coupling (SOC). They claimed that the induced interaction is exactly the same as in the case without SOC when the SOC is weak, and that in the region of strong SOC it starts from a reduced value and then recovers the zero-SOC value in the limit of large SOC. We point out that their calculations contain critical errors and inconsistencies that significantly affect the basis of these claims.

Preprint, 2015, arXiv

A Bayesian feature allocation model for tumor heterogeneity

We develop a feature allocation model for inference on genetic tumor variation using next-generation sequencing data. Specifically, we record single nucleotide variants (SNVs) based on short reads mapped to the human reference genome and characterize tumor heterogeneity by latent haplotypes defined as a scaffold of SNVs on the same homologous genome. For multiple samples from a single tumor, assuming that each sample is composed of some sample-specific proportions of these haplotypes, we then fit the observed variant allele fractions of SNVs for each sample and estimate the proportions of haplotypes. Variation in haplotype proportions across samples is evidence of tumor heterogeneity, since it implies varying composition of cell subpopulations. Taking a Bayesian perspective, we proceed with a prior probability model for all relevant unknown quantities, including, in particular, a prior probability model on the binary indicators that characterize the latent haplotypes. Such prior models are known as feature allocation models. Specifically, we define a simplified version of the Indian buffet process, one of the most traditional feature allocation models. The proposed model allows overlapping clustering of SNVs in defining latent haplotypes, which reflects the evolutionary process of subclonal expansion in tumor samples.
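The haplotype idea in this abstract can be illustrated with a small sketch. It assumes the standard finite beta-Bernoulli construction that approximates the Indian buffet process, and it assumes the expected variant allele fraction of an SNV in a sample is simply the weighted sum of the haplotype indicators. The function names, the value of `alpha`, and the weight matrix `W` are hypothetical choices made for illustration, and the paper's actual likelihood may include further terms (for example a background component) that are not modeled here.

```python
import random

def sample_finite_ibp(num_snvs, num_haplotypes, alpha, rng):
    """Draw a binary SNV-by-haplotype matrix from a finite beta-Bernoulli prior."""
    # One inclusion probability per latent haplotype (column).
    pis = [rng.betavariate(alpha / num_haplotypes, 1.0)
           for _ in range(num_haplotypes)]
    # Each SNV carries the variant on haplotype c with probability pis[c].
    return [[1 if rng.random() < pi else 0 for pi in pis]
            for _ in range(num_snvs)]

def expected_vaf(z, weights):
    """Expected variant allele fraction per SNV and sample.

    Assumed form: p[s][t] = sum_c weights[t][c] * z[s][c].
    """
    return [[sum(w_c * z_sc for w_c, z_sc in zip(w_t, z_s))
             for w_t in weights]
            for z_s in z]

rng = random.Random(0)
Z = sample_finite_ibp(num_snvs=6, num_haplotypes=3, alpha=2.0, rng=rng)
# Hypothetical haplotype proportions for two samples (each row sums to 1).
W = [[0.5, 0.3, 0.2],
     [0.1, 0.1, 0.8]]
P = expected_vaf(Z, W)
```

Because the two rows of `W` differ, the same matrix `Z` gives different expected fractions in the two samples, which is the signal of heterogeneity the abstract describes.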

Preprint, 2015, arXiv

First-order phase transition and tricritical scaling behavior of the Blume-Capel model: a Wang-Landau sampling approach

We investigate the tricritical scaling behavior of the two-dimensional spin-$1$ Blume-Capel model using the Wang-Landau method measuring the joint density of states for lattice sizes up to $48\times 48$ sites. The first-order transition curve is systematically determined employing the method of field mixing in conjunction with finite-size scaling, showing a significant deviation from the previous data points. Deep in the first-order area of the phase diagram, we also find that the specific heat exhibits a double-peak structure of the Schottky-like anomaly appearing with the transition peak. At the tricritical point, we characterize the tricritical exponents through finite-size scaling analysis including the phenomenological finite-size scaling with thermodynamic variables. Our estimation of the tricritical eigenvalue exponents, $y_t = 1.804(5)$, $y_g = 0.80(1)$, and $y_h = 1.925(3)$, provides the first Wang-Landau verification of the conjectured exact values, demonstrating the effectiveness of the density-of-states-based approach in finite-size scaling study of multicritical phenomena.
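As a rough illustration of the sampling idea, the sketch below runs a simplified Wang-Landau-style random walk for the spin-1 Blume-Capel model on a small periodic lattice. It accumulates an estimate of the logarithm of a joint density of states over the pair (E_J, E_D) = (-sum over bonds of s_i s_j, sum over sites of s_i^2). This choice of joint variables is an assumption, as are the lattice size, the stage lengths, and the fixed halving schedule for the modification factor, which stands in for the usual histogram flatness check. It is not the authors' implementation.

```python
import math
import random

def neighbor_sum(spins, L, i, j):
    """Sum of the four nearest-neighbor spins with periodic boundaries."""
    return (spins[(i + 1) % L][j] + spins[(i - 1) % L][j]
            + spins[i][(j + 1) % L] + spins[i][(j - 1) % L])

def joint_energies(spins, L):
    """Return (E_J, E_D) = (-sum_<ij> s_i s_j, sum_i s_i^2)."""
    e_j = 0
    e_d = 0
    for i in range(L):
        for j in range(L):
            s = spins[i][j]
            # Count each bond once: right and down neighbors only.
            e_j -= s * (spins[(i + 1) % L][j] + spins[i][(j + 1) % L])
            e_d += s * s
    return e_j, e_d

def wang_landau(L=4, sweeps_per_stage=200, stages=8, seed=0):
    """Estimate ln g(E_J, E_D) with a fixed schedule for ln f (a simplification)."""
    rng = random.Random(seed)
    spins = [[0] * L for _ in range(L)]
    state = joint_energies(spins, L)
    ln_g = {state: 0.0}
    ln_f = 1.0
    for _ in range(stages):
        for _ in range(sweeps_per_stage * L * L):
            i, j = rng.randrange(L), rng.randrange(L)
            old = spins[i][j]
            new = rng.choice([s for s in (-1, 0, 1) if s != old])
            nb = neighbor_sum(spins, L, i, j)
            trial = (state[0] - (new - old) * nb,
                     state[1] + new * new - old * old)
            # Accept with probability min(1, g(current) / g(trial));
            # macrostates not yet visited are treated as ln g = 0.
            delta = ln_g.get(state, 0.0) - ln_g.get(trial, 0.0)
            if delta >= 0 or rng.random() < math.exp(delta):
                spins[i][j] = new
                state = trial
            ln_g[state] = ln_g.get(state, 0.0) + ln_f
        ln_f /= 2.0  # fixed halving instead of a flatness criterion
    return ln_g, spins, state
```

With the Hamiltonian written as H = J E_J + Δ E_D, one joint estimate of this kind can in principle be reweighted to any coupling J and crystal field Δ. Exploiting that across the phase diagram is what makes a joint density of states attractive.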

Preprint, 2014, arXiv

Bayesian Inference for Tumor Subclones Accounting for Sequencing and Structural Variants

Tumor samples are heterogeneous. They consist of different subclones that are characterized by differences in DNA nucleotide sequences and copy numbers on multiple loci. Heterogeneity can be measured through the identification of the subclonal copy number and sequence at a selected set of loci. Understanding that the accurate identification of variant allele fractions greatly depends on a precise determination of copy numbers, we develop a Bayesian feature allocation model for jointly calling subclonal copy numbers and the corresponding allele sequences for the same loci. The proposed method utilizes three random matrices, L, Z and w to represent subclonal copy numbers (L), numbers of subclonal variant alleles (Z) and cellular fractions of subclones in samples (w), respectively. The unknown number of subclones implies a random number of columns for these matrices. We use next-generation sequencing data to estimate the subclonal structures through inference on these three matrices. Using simulation studies and a real data analysis, we demonstrate how posterior inference on the subclonal structure is enhanced with the joint modeling of both structure and sequencing variants on subclonal genomes. Software is available at http://compgenome.org/BayClone2.
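To make the roles of the three matrices concrete, the following sketch computes an expected variant allele fraction from L, Z and w. It assumes that the fraction is the ratio of the sample-weighted number of variant alleles to the sample-weighted copy number. This is one natural reading of the abstract, not necessarily the exact likelihood used in the BayClone2 software, and the function name and example values are hypothetical.

```python
def expected_vaf_with_copy_number(copy_numbers, variant_alleles, fractions):
    """Expected variant allele fraction per locus and sample.

    copy_numbers[s][c]    : subclonal copy number at locus s in subclone c (L)
    variant_alleles[s][c] : number of variant alleles at locus s in subclone c (Z)
    fractions[t][c]       : cellular fraction of subclone c in sample t (w)

    Assumed form: p[s][t] = sum_c w[t][c] * Z[s][c] / sum_c w[t][c] * L[s][c].
    """
    result = []
    for l_s, z_s in zip(copy_numbers, variant_alleles):
        row = []
        for w_t in fractions:
            total_copies = sum(w * l for w, l in zip(w_t, l_s))
            total_variant = sum(w * z for w, z in zip(w_t, z_s))
            # A locus with zero copies everywhere contributes no reads.
            row.append(total_variant / total_copies if total_copies > 0 else 0.0)
        result.append(row)
    return result

# One locus, two subclones: subclone 1 has 2 copies and no variant allele,
# subclone 2 has 3 copies, one of which carries the variant.
L_matrix = [[2, 3]]
Z_matrix = [[0, 1]]
w_matrix = [[0.5, 0.5]]
p = expected_vaf_with_copy_number(L_matrix, Z_matrix, w_matrix)
```

In this toy case the weighted copy number is 2.5 and the weighted variant count is 0.5, so the expected fraction is 0.2. This shows how a copy number gain in one subclone lowers the fraction below the 0.25 that a copy-neutral calculation would give.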