Researcher profile

Stephen Todd

Stephen Todd contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 13 - Unverified | Verification L1 | Unclaimed author
2 works
0 followers
3 topics
4 close collaborators

Published work

2 published items

Preprint, 2020, arXiv

Predicting feature imputability in the absence of ground truth

Data imputation is the most popular method of dealing with missing values, but in most real life applications, large missing data can occur and it is difficult or impossible to evaluate whether data has been imputed accurately (lack of ground truth). This paper addresses these issues by proposing an effective and simple principal component based method for determining whether individual data features can be accurately imputed - feature imputability. In particular, we establish a strong linear relationship between principal component loadings and feature imputability, even in the presence of extreme missingness and lack of ground truth. This work will have important implications in practical data imputation strategies.

Preprint, 2013, arXiv

Hyper-Graph Based Database Partitioning for Transactional Workloads

A common approach to scaling transactional databases in practice is horizontal partitioning, which increases system scalability, high availability and self-manageability. Usually it is very challenging to choose or design an optimal partitioning scheme for a given workload and database. In this technical report, we propose a fine-grained hyper-graph based database partitioning system for transactional workloads. The partitioning system takes a database, a workload, a node cluster and partitioning constraints as input and outputs a lookup-table encoding the final database partitioning decision. The database partitioning problem is modeled as a multi-constraints hyper-graph partitioning problem. By deriving a min-cut of the hyper-graph, our system can minimize the total number of distributed transactions in the workload, balance the sizes and workload accesses of the partitions and satisfy all the partition constraints imposed. Our system is highly interactive as it allows users to impose partition constraints, watch visualized partitioning effects, and provide feedback based on human expertise and indirect domain knowledge for generating better partitioning schemes.
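
The objective named in this abstract can be illustrated with a minimal sketch. It is not the report's system: it assumes each transaction is a hyperedge over the tuple ids it touches, and that the lookup table maps tuple ids to partition numbers. The names `count_distributed`, `partition_sizes`, and the toy workload are hypothetical. The sketch only scores a given lookup table; it does not compute a min-cut.

```python
from collections import defaultdict

def count_distributed(transactions, lookup):
    """Count transactions (hyperedges) whose tuples span more than one partition."""
    return sum(1 for txn in transactions if len({lookup[t] for t in txn}) > 1)

def partition_sizes(lookup):
    """Number of tuples assigned to each partition, used to check balance."""
    sizes = defaultdict(int)
    for part in lookup.values():
        sizes[part] += 1
    return dict(sizes)

# Hypothetical workload: each transaction is the set of tuple ids it accesses.
transactions = [{"a", "b"}, {"b", "c"}, {"c", "d"}, {"a", "b", "c"}]

# Two candidate lookup tables (tuple id -> partition), both balanced 2/2.
lookup_1 = {"a": 0, "b": 0, "c": 1, "d": 1}
lookup_2 = {"a": 0, "b": 1, "c": 0, "d": 1}

cut_1 = count_distributed(transactions, lookup_1)
cut_2 = count_distributed(transactions, lookup_2)
print(cut_1, cut_2, partition_sizes(lookup_1))
```

Both candidate tables are equally balanced, but the first cuts fewer hyperedges, so under this objective it would be the preferred partitioning.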