Researcher profile

Sandra E. Safo

Sandra E. Safo contributes to research discovery and scholarly infrastructure.

Researcher. Affiliation not imported. Open to collaborate.

Trust snapshot

Quick read

Trust 15 - Unverified; Verification L1; Unclaimed author
3 works
0 followers
3 topics
4 close collaborators

Published work

3 published items

Preprint, 2021, arXiv

sJIVE: Supervised Joint and Individual Variation Explained

Analyzing multi-source data, which are multiple views of data on the same subjects, has become increasingly common in molecular biomedical research. Recent methods have sought to uncover underlying structure and relationships within and/or between the data sources, and other methods have sought to build a predictive model for an outcome using all sources. However, existing methods that do both are presently limited because they either (1) consider only data structure shared by all datasets while ignoring structures unique to each source, or (2) extract underlying structures first without considering the outcome. We propose a method called supervised joint and individual variation explained (sJIVE) that can simultaneously (1) identify shared (joint) and source-specific (individual) underlying structure and (2) build a linear prediction model for an outcome using these structures. These two components are weighted to compromise between explaining variation in the multi-source data and in the outcome. Simulations show that sJIVE outperforms existing methods when large amounts of noise are present in the multi-source data. An application to data from the COPDGene study reveals gene expression and proteomic patterns that are predictive of lung function. Functions to perform sJIVE are included in the R.JIVE package, available online at http://github.com/lockEF/r.jive .
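
The weighting described in the abstract can be illustrated with a minimal sketch. It assumes squared-error terms for both components and a single weight in [0, 1]; the names `weighted_loss` and `eta` are hypothetical and are not taken from the R.JIVE package, and the fitted values are passed in rather than estimated. The exact objective used by sJIVE is given in the paper and may differ.

```python
def weighted_loss(x_blocks, x_hat_blocks, y, y_hat, eta):
    """Trade off data reconstruction against outcome prediction.

    x_blocks / x_hat_blocks: lists of matrices (lists of rows), one per
    data source, holding observed and fitted values (hypothetical inputs).
    y / y_hat: observed and predicted outcome vectors.
    eta: assumed weight in [0, 1]; 1 ignores the outcome, 0 ignores the data.
    """
    if not 0.0 <= eta <= 1.0:
        raise ValueError("eta must lie in [0, 1]")

    # Squared reconstruction error summed over every source, row, and entry.
    recon = sum(
        (x - xh) ** 2
        for block, block_hat in zip(x_blocks, x_hat_blocks)
        for row, row_hat in zip(block, block_hat)
        for x, xh in zip(row, row_hat)
    )

    # Squared prediction error for the outcome.
    pred = sum((a - b) ** 2 for a, b in zip(y, y_hat))

    return eta * recon + (1.0 - eta) * pred


# One source with a single 1x2 matrix, and one outcome value.
loss = weighted_loss([[[1.0, 2.0]]], [[[0.0, 2.0]]], [1.0], [3.0], eta=0.5)
```

In this sketch, moving `eta` toward 1 favors explaining the multi-source data, and moving it toward 0 favors predicting the outcome.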

Preprint, 2020, arXiv

Bayesian Integrative Analysis and Prediction with Application to Atherosclerosis Cardiovascular Disease

Cardiovascular diseases (CVD), including atherosclerosis CVD (ASCVD), are multifactorial diseases that present a major economic and social burden worldwide. Tremendous efforts have been made to understand traditional risk factors for ASCVD, but these risk factors account for only about half of all cases of ASCVD. There remains a critical need to identify nontraditional risk factors (e.g., genetic variants, genes) contributing to ASCVD. Further, incorporating functional knowledge in prediction models has the potential to reveal pathways associated with disease risk. We propose Bayesian hierarchical factor analysis models that associate multiple omics data, predict a clinical outcome, allow for prior functional information, and can accommodate clinical covariates. The models, motivated by available data and the need for other risk factors of ASCVD, are used for the integrative analysis of clinical, demographic, and multi-omics data to identify genetic variants, genes, and gene pathways potentially contributing to 10-year ASCVD risk in healthy adults. Our findings revealed several genetic variants, genes, and gene pathways that were highly associated with ASCVD risk. Interestingly, some of these have been implicated in CVD risk. The others could be explored for their potential roles in CVD. Our findings underscore the merit in joint association and prediction models.

Preprint, 2020, arXiv

Sparse Linear Discriminant Analysis for Multi-view Structured Data

Classification methods that leverage the strengths of data from multiple sources (multi-view data) simultaneously have enormous potential to yield more powerful findings than two-step methods: association followed by classification. We propose two methods, sparse integrative discriminant analysis (SIDA) and SIDA with incorporation of network information (SIDANet), for joint association and classification studies. The methods consider the overall association between multi-view data and the separation within each view in choosing discriminant vectors that are associated and optimally separate subjects into different classes. SIDANet is among the first methods to incorporate prior structural information in joint association and classification studies. It uses the normalized Laplacian of a graph to smooth coefficients of predictor variables, thus encouraging selection of predictors that are connected and behave similarly. We demonstrate the effectiveness of our methods on a set of synthetic and real datasets. Our findings underscore the benefit of joint association and classification methods if the goal is to correlate multi-view data and to perform classification.
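
To make the graph-smoothing idea concrete, here is a minimal sketch of the normalized Laplacian, L = I - D^(-1/2) A D^(-1/2), for an undirected graph given as an adjacency matrix. The quadratic form b'Lb shown here is one common way to measure how smoothly coefficients vary over a graph; it is an assumption for illustration, and the exact penalty used by SIDANet may differ. The function names are hypothetical and do not come from the authors' software.

```python
import math


def normalized_laplacian(adj):
    """Return L = I - D^(-1/2) A D^(-1/2) for a symmetric adjacency matrix.

    Isolated nodes (degree 0) get an all-zero row and column.
    """
    n = len(adj)
    deg = [sum(row) for row in adj]
    lap = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                lap[i][j] = 1.0 if deg[i] > 0 else 0.0
            elif adj[i][j] != 0 and deg[i] > 0 and deg[j] > 0:
                lap[i][j] = -adj[i][j] / math.sqrt(deg[i] * deg[j])
    return lap


def smoothness_penalty(beta, lap):
    """Quadratic form beta' L beta (illustrative smoothness measure)."""
    n = len(beta)
    return sum(beta[i] * lap[i][j] * beta[j] for i in range(n) for j in range(n))


# Path graph on three predictors: 0 - 1 - 2.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
lap = normalized_laplacian(adj)

# Coefficients proportional to sqrt(degree) incur (numerically) no penalty.
smooth = smoothness_penalty([1.0, math.sqrt(2.0), 1.0], lap)

# Coefficients that flip sign across the path are penalized.
rough = smoothness_penalty([1.0, 0.0, -1.0], lap)
```

Under this measure, connected predictors whose degree-scaled coefficients are similar contribute little, which matches the abstract's description of favoring predictors that are connected and behave similarly.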