Source author record

Hyunghoon Cho

Hyunghoon Cho appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Cryptography and Security Genomics Machine Learning Molecular Networks Social and Information Networks

Catalog footprint

What is connected

3works

5topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2020arXiv

Contact Tracing Mobile Apps for COVID-19: Privacy Considerations and Related Trade-offs

Contact tracing is an essential tool for public health officials and local communities to fight the spread of novel diseases, such as for the COVID-19 pandemic. The Singaporean government just released a mobile phone app, TraceTogether, that is designed to assist health officials in tracking down exposures after an infected individual is identified. However, there are important privacy implications of the existence of such tracking apps. Here, we analyze some of those implications and discuss ways of ameliorating the privacy concerns without decreasing usefulness to public health. We hope in writing this document to ensure that privacy is a central feature of conversations surrounding mobile contact tracing apps and to encourage community efforts to develop alternative effective solutions with stronger privacy protection for the users. Importantly, though we discuss potential modifications, this document is not meant as a formal research paper, but instead is a response to some of the privacy characteristics of direct contact tracing apps like TraceTogether and an early-stage Request for Comments to the community. Date written: 2020-03-24 Minor correction: 2020-03-30

preprint2015arXiv

Diffusion Component Analysis: Unraveling Functional Topology in Biological Networks

Complex biological systems have been successfully modeled by biochemical and genetic interaction networks, typically gathered from high-throughput (HTP) data. These networks can be used to infer functional relationships between genes or proteins. Using the intuition that the topological role of a gene in a network relates to its biological function, local or diffusion based "guilt-by-association" and graph-theoretic methods have had success in inferring gene functions. Here we seek to improve function prediction by integrating diffusion-based methods with a novel dimensionality reduction technique to overcome the incomplete and noisy nature of network data. In this paper, we introduce diffusion component analysis (DCA), a framework that plugs in a diffusion model and learns a low-dimensional vector representation of each node to encode the topological properties of a network. As a proof of concept, we demonstrate DCA's substantial improvement over state-of-the-art diffusion-based approaches in predicting protein function from molecular interaction networks. Moreover, our DCA framework can integrate multiple networks from heterogeneous sources, consisting of genomic information, biochemical experiments and other resources, to even further improve function prediction. Yet another layer of performance gain is achieved by integrating the DCA framework with support vector machines that take our node vector representations as features. Overall, our DCA framework provides a novel representation of nodes in a network that can be used as a plug-in architecture to other machine learning algorithms to decipher topological properties of and obtain novel insights into interactomes.

preprint2014arXiv

High-resolution transcriptome analysis with long-read RNA sequencing

RNA sequencing (RNA-seq) enables characterization and quantification of individual transcriptomes as well as detection of patterns of allelic expression and alternative splicing. Current RNA-seq protocols depend on high-throughput short-read sequencing of cDNA. However, as ongoing advances are rapidly yielding increasing read lengths, a technical hurdle remains in identifying the degree to which differences in read length influence various transcriptome analyses. In this study, we generated two paired-end RNA-seq datasets of differing read lengths (2x75 bp and 2x262 bp) for lymphoblastoid cell line GM12878 and compared the effect of read length on transcriptome analyses, including read-mapping performance, gene and transcript quantification, and detection of allele-specific expression (ASE) and allele-specific alternative splicing (ASAS) patterns. Our results indicate that, while the current long-read protocol is considerably more expensive than short-read sequencing, there are important benefits that can only be achieved with longer read length, including lower mapping bias and reduced ambiguity in assigning reads to genomic elements, such as mRNA transcript. We show that these benefits ultimately lead to improved detection of cis-acting regulatory and splicing variation effects within individuals.