Researcher profile

Sema K. Sgaier

Sema K. Sgaier contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 13 - Unverified
Verification L1
Unclaimed author
2 works
0 followers
4 topics
4 close collaborators

Published work

2 published items

Preprint, 2020, arXiv

Causal datasheet: An approximate guide to practically assess Bayesian networks in the real world

In solving real-world problems like changing healthcare-seeking behaviors, designing interventions to improve downstream outcomes requires an understanding of the causal links within the system. Causal Bayesian Networks (BNs) have been proposed as one such powerful method. In real-world applications, however, confidence in the results of BNs is often moderate at best. This is due in part to the inability to validate against some ground truth, as the true DAG is not available. This is especially problematic if the learned DAG conflicts with pre-existing domain doctrine. At the policy level, one must justify insights generated by such analysis, preferably accompanying them with uncertainty estimation. Here we propose a causal extension to the datasheet concept proposed by Gebru et al. (2018) to include approximate BN performance expectations for any given dataset. To generate the results for a prototype Causal Datasheet, we constructed over 30,000 synthetic datasets with properties mirroring characteristics of real data. We then recorded the results given by state-of-the-art structure learning algorithms. These results were used to populate the Causal Datasheet, and recommendations were automatically generated depending on expected performance. As a proof of concept, we used our Causal Datasheet Generation Tool (CDG-T) to assign performance expectations to a maternal health survey we conducted in Uttar Pradesh, India.

Preprint, 2020, arXiv

The Pace and Pulse of the Fight against Coronavirus across the US, A Google Trends Approach

The coronavirus pandemic is impacting our lives at unprecedented speed and scale, including how we eat and work, what we worry about, how much we move, and our ability to earn. Google Trends can be used as a proxy for what people are thinking, needing, and planning. We use it to provide both insights into, and potential indicators of, important changes in information-seeking patterns during pandemics like COVID-19. Key questions we address are: (1) What is the relationship between the coronavirus outbreak and internet searches related to healthcare seeking, government support programs, media sources of different ideologies, planning around social activities, travel, and food, and new coronavirus-specific behaviors and concerns?; (2) How does the popularity of search terms differ across states and regions, and can we explain these differences?; (3) Can we find distinct, tangible search patterns across states suggestive of policy gaps to inform pandemic response?; (4) Does Google Trends data correlate with and potentially precede real-life events? We suggest strategic shifts for policy makers to improve the precision and effectiveness of non-pharmaceutical interventions (NPIs) and recommend the development of a real-time dashboard as a decision-making tool. Methods used include trend analysis of US search data; geographic analyses of the differences in search popularity across US states from March 1 to April 15, 2020; and Principal Component Analysis (PCA) to extract search patterns across states.
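
As a rough illustration of the PCA step named in the abstract above, the sketch below extracts the leading principal component from a small states-by-search-terms matrix. It is not the authors' pipeline: the state labels, search terms, and interest values are invented for illustration, and the helper first_principal_component is a hypothetical name. It uses plain power iteration on the covariance matrix so that it runs with the standard library alone.

```python
import math

# Hypothetical search-interest values (rows: states, columns: search terms).
# All labels and numbers are made up for illustration only.
states = ["State A", "State B", "State C", "State D", "State E"]
terms = ["unemployment", "telehealth", "flights"]
search_interest = [
    [80.0, 40.0, 10.0],
    [60.0, 55.0, 20.0],
    [30.0, 70.0, 45.0],
    [20.0, 85.0, 60.0],
    [50.0, 50.0, 30.0],
]

def first_principal_component(rows, iterations=200):
    """Return (component, scores, explained_share) for the leading PC of rows."""
    n, d = len(rows), len(rows[0])
    # Center each column so the covariance is computed around the mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration; the asymmetric start makes it unlikely that the
    # starting vector is orthogonal to the leading eigenvector.
    v = [1.0 / (j + 1) for j in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("data has no variance")
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue (variance along the component).
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                     for i in range(d))
    total_variance = sum(cov[i][i] for i in range(d))
    # Project each centered row onto the component to get per-state scores.
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores, eigenvalue / total_variance

component, scores, share = first_principal_component(search_interest)
for term, loading in zip(terms, component):
    print(f"{term}: loading {loading:+.3f}")
for state, score in zip(states, scores):
    print(f"{state}: score {score:+.2f}")
print(f"share of variance explained: {share:.3f}")
```

The sign of the component is arbitrary, so only the relative ordering and spacing of the state scores is meaningful. In a real analysis one would more likely use an established library routine and examine several components rather than only the first.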