Researcher profile

Shawn Gu

Shawn Gu contributes to research discovery and scholarly infrastructure.

Researcher; affiliation not imported; open to collaborate

Trust snapshot

Quick read

Trust 17 (Unverified); Verification L1; Unclaimed author
4 works
0 followers
2 topics
4 close collaborators

Published work

4 published items

Preprint, 2021, arXiv

Analysis of Moral Judgement on Reddit

Moral outrage has become synonymous with social media in recent years. However, the preponderance of academic analysis on social media websites has focused on hate speech and misinformation. This paper focuses on analyzing moral judgements rendered on social media by capturing the moral judgements that are passed in the subreddit /r/AmITheAsshole on Reddit. Using the labels associated with each judgement we train a classifier that can take a comment and determine whether it judges the user who made the original post to have positive or negative moral valence. Then, we use this classifier to investigate an assortment of website traits surrounding moral judgements in ten other subreddits, including where negative moral users like to post and their posting patterns. Our findings also indicate that posts that are judged in a positive manner will score higher.

Preprint, 2020, arXiv

Data-driven biological network alignment that uses topological, sequence, and functional information

Many proteins remain functionally unannotated. Sequence alignment (SA) uncovers missing annotations by transferring functional knowledge between species' sequence-conserved regions. Because SA is imperfect, network alignment (NA) complements SA by transferring functional knowledge between conserved biological network regions, rather than just sequence regions, of different species. Existing NA assumes that it is topological similarity (isomorphic-like matching) between network regions that corresponds to the regions' functional relatedness. However, we recently found that functionally unrelated proteins are almost as topologically similar as functionally related proteins. So, we redefined NA as a data-driven framework, TARA, which learns from network and protein functional data what kind of topological relatedness (rather than similarity) between proteins corresponds to the proteins' functional relatedness. TARA used topological information (within each network) but not sequence information (between proteins across networks). Yet, its alignments yielded higher protein functional prediction accuracy than alignments of existing NA methods, even those that used both topological and sequence information. Here, we propose TARA++, which is also data-driven, like TARA and unlike other existing methods, but which uses across-network sequence information on top of within-network topological information, unlike TARA. To deal with the within-and-across-network analysis, we adapt social network embedding to the problem of biological NA. TARA++ outperforms existing methods in protein functional prediction accuracy.

Preprint, 2020, arXiv

Data-driven network alignment

Biological network alignment (NA) aims to find a node mapping between species' molecular networks that uncovers similar network regions, thus allowing for transfer of functional knowledge between the aligned nodes. However, current NA methods do not end up aligning functionally related nodes. A likely reason is that they assume it is topologically similar nodes that are functionally related. However, we show that this assumption does not hold well. So, a paradigm shift is needed in how the NA problem is approached. We redefine NA as a data-driven framework, TARA (daTA-dRiven network Alignment), which attempts to learn the relationship between topological relatedness and functional relatedness without assuming that topological relatedness corresponds to topological similarity, as traditional NA methods do. TARA trains a classifier to predict whether two nodes from different networks are functionally related based on their network topological patterns. We find that TARA is able to make accurate predictions. TARA then takes each pair of nodes that are predicted as related to be part of an alignment. Like traditional NA methods, TARA uses this alignment for the across-species transfer of functional knowledge. Clearly, TARA as currently implemented uses topological but not protein sequence information for this task. We find that TARA outperforms existing state-of-the-art NA methods that also use topological information, WAVE and SANA, and even outperforms or complements a state-of-the-art NA method that uses both topological and sequence information, PrimAlign. Hence, adding sequence information to TARA, which is our future work, is likely to further improve its performance.
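
The last two steps described in this abstract (predicted-related node pairs form the alignment, and the alignment is then used to transfer functional knowledge across species) can be illustrated with a minimal sketch. This is not the authors' implementation: the classifier is assumed to have already produced a score per node pair, and the function names, the 0.5 threshold, the node identifiers and the annotation labels are all hypothetical.

```python
from collections import defaultdict

def build_alignment(scored_pairs, threshold=0.5):
    """Keep every cross-network node pair whose score meets the threshold.

    scored_pairs: iterable of (node_a, node_b, score) tuples, where the score
    stands in for a classifier's "functionally related" prediction.
    The threshold value is an illustrative assumption.
    """
    return {(u, v) for u, v, score in scored_pairs if score >= threshold}

def transfer_annotations(alignment, annotations_a):
    """Copy known annotations of network-A nodes to their aligned network-B nodes."""
    predicted = defaultdict(set)
    for u, v in alignment:
        predicted[v] |= annotations_a.get(u, set())
    return dict(predicted)

# Hypothetical toy data: three scored pairs and two annotated nodes.
scored_pairs = [("y1", "h1", 0.9), ("y2", "h1", 0.7), ("y3", "h2", 0.2)]
annotations_a = {"y1": {"term_A"}, "y2": {"term_B"}}

alignment = build_alignment(scored_pairs)
predicted = transfer_annotations(alignment, annotations_a)
```

Note that a node may appear in several aligned pairs here, so the alignment in this sketch is many-to-many rather than a one-to-one mapping.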

Preprint, 2020, arXiv

Pairwise versus multiple network alignment

Biological network alignment (NA) aims to identify similar regions between molecular networks of different species. NA can be local or global. Following the recent trend in the NA field, we focus on global NA, which can be pairwise (PNA) or multiple (MNA). PNA produces aligned node pairs between two networks. MNA produces aligned node clusters between more than two networks. Recently, the focus has shifted from PNA to MNA, because MNA captures conserved regions between more networks than PNA (and MNA is thus considered to be more insightful), though at higher computational complexity. The issue is that, due to the different outputs of PNA and MNA, a PNA method is only compared to other PNA methods, and an MNA method is only compared to other MNA methods. Comparison of PNA against MNA must be done to evaluate whether MNA's higher complexity is justified by its higher accuracy. We introduce a framework that allows for this. We evaluate eight prominent PNA and MNA methods, on synthetic and real-world biological networks, using topological and functional alignment quality measures. We compare PNA against MNA in both a pairwise (native to PNA) and multiple (native to MNA) manner. PNA is expected to perform better under the pairwise evaluation framework. Indeed, this is what we find. MNA is expected to perform better under the multiple evaluation framework. Shockingly, we find that this does not always hold; PNA is often better than MNA in this framework, depending on the choice of evaluation test.
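
To make the difference in outputs concrete, the sketch below shows one plausible way to view MNA node clusters in a pairwise manner, by projecting them onto aligned node pairs between two chosen networks. The abstract does not specify how the framework does this, so the projection rule, the function name and the toy network and node identifiers are all assumptions made for illustration.

```python
def clusters_to_pairs(clusters, net_a, net_b):
    """Project multi-network node clusters onto node pairs between two networks.

    clusters: iterable of clusters, each a list of (network_id, node_id) tuples.
    Returns every (node_from_net_a, node_from_net_b) pair that shares a cluster.
    """
    pairs = set()
    for cluster in clusters:
        nodes_a = [node for net, node in cluster if net == net_a]
        nodes_b = [node for net, node in cluster if net == net_b]
        for u in nodes_a:
            for v in nodes_b:
                pairs.add((u, v))
    return pairs

# Hypothetical MNA output over three networks.
clusters = [
    [("yeast", "y1"), ("fly", "f1"), ("worm", "w1")],
    [("yeast", "y2"), ("fly", "f2"), ("fly", "f3")],
    [("fly", "f4"), ("worm", "w2")],
]

pairs = clusters_to_pairs(clusters, "yeast", "fly")
```

Clusters that contain no node from one of the two chosen networks contribute no pairs, which is why the third cluster is ignored for the yeast and fly projection.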