Researcher profile

A. C. Thomas

A. C. Thomas contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 19 (Unverified) · Verification L1 · Unclaimed author
5 works
0 followers
4 topics
4 close collaborators

Published work

5 published items

preprint · 2013 · arXiv

Competing Process Hazard Function Models for Player Ratings in Ice Hockey

Evaluating the overall ability of players in the National Hockey League (NHL) is a difficult task. Existing methods such as the famous "plus/minus" statistic have many shortcomings. Standard linear regression methods work well when player substitutions are relatively uncommon and scoring events are relatively common, such as in basketball, but as neither of these conditions exists for hockey, we use an approach that embraces the unique characteristics of the sport. We model the scoring rate for each team as its own semi-Markov process, with hazard functions for each process that depend on the players on the ice. This method yields offensive and defensive player ability ratings which take into account quality of teammates and opponents, the game situation, and other desired factors, that themselves have a meaningful interpretation in terms of game outcomes. Additionally, since the number of parameters in this model can be quite large, we make use of two different shrinkage methods depending on the question of interest: full Bayesian hierarchical models that partially pool parameters according to player position, and penalized maximum likelihood estimation to select a smaller number of parameters that stand out as being substantially different from average. We apply the model to all five-on-five (full-strength) situations for games in five NHL seasons.
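
The competing-process idea in this abstract can be illustrated with a small simulation. This is a minimal sketch, not the authors' model: it assumes constant (exponential) hazards rather than a full semi-Markov process, and an assumed log-linear link in which a team's scoring rate rises with its own skaters' offensive ratings and falls with the opposing skaters' defensive ratings. The function names and all numeric values are hypothetical.

```python
import math
import random


def scoring_rate(base_log_rate, offence, defence):
    """Assumed log-linear hazard: goals per minute for one team.

    offence: offensive ratings of that team's players on the ice.
    defence: defensive ratings of the opposing players on the ice.
    """
    return math.exp(base_log_rate + sum(offence) - sum(defence))


def next_goal(rate_home, rate_away, rng):
    """Draw the time to the next goal and the scoring team.

    Each team has its own constant-hazard clock; whichever fires first wins.
    """
    t_home = rng.expovariate(rate_home)
    t_away = rng.expovariate(rate_away)
    return (t_home, "home") if t_home < t_away else (t_away, "away")


# Hypothetical ratings: the home line-up is slightly stronger.
rng = random.Random(42)
rate_home = scoring_rate(math.log(0.04), offence=[0.2, 0.1], defence=[0.05])
rate_away = scoring_rate(math.log(0.04), offence=[0.0], defence=[0.3])
minutes, scorer = next_goal(rate_home, rate_away, rng)
```

Under constant hazards, the home team scores first with probability rate_home / (rate_home + rate_away), which gives a quick sanity check on the simulation.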

preprint · 2013 · arXiv

The Social Contagion Hypothesis: Comment on "Social Contagion Theory: Examining Dynamic Social Networks and Human Behavior"

I reflect on the statistical methods of the Christakis-Fowler studies on network-based contagion of traits by checking the sensitivity of these kinds of results to various alternate specifications and generative mechanisms. Despite the honest efforts of all involved, I remain pessimistic about establishing whether binary health outcomes or product adoptions are contagious if the evidence comes from simultaneously observed data.

preprint · 2013 · arXiv

Trouble With The Curve: Improving MLB Pitch Classification

The PITCHf/x database has allowed the statistical analysis of Major League Baseball (MLB) to flourish since its introduction in late 2006. Using PITCHf/x, pitches have been classified by hand, requiring considerable effort, or using neural network clustering and classification, which is often difficult to interpret. To address these issues, we use model-based clustering with a multivariate Gaussian mixture model and an appropriate adjustment factor as an alternative to current methods. Furthermore, we describe a new pitch classification algorithm based on our clustering approach to address the problems of pitch misclassification. We illustrate our methods for various pitchers from the PITCHf/x database that covers a wide variety of pitch types.
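
Model-based clustering with a Gaussian mixture can be sketched in a few lines. The example below is a simplified illustration, not the paper's algorithm: it fits a one-dimensional mixture by plain EM on synthetic pitch speeds, whereas the abstract describes a multivariate mixture with an adjustment factor that is not reproduced here. The function names, the two pitch types, and the speed values are all assumptions made for the example.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_gmm_1d(xs, k, iters=50):
    """Fit a k-component 1-D Gaussian mixture by EM.

    Returns (weights, means, variances), with means initialised at evenly
    spaced quantiles so components come out in increasing order here.
    """
    n = len(xs)
    ordered = sorted(xs)
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    centre = sum(xs) / n
    variances = [sum((x - centre) ** 2 for x in xs) / n] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each observation.
        resp = []
        for x in xs:
            dens = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var, 1e-6)
    return weights, means, variances


def classify(x, weights, means, variances):
    """Assign x to the component with the highest weighted density."""
    scores = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
    return max(range(len(scores)), key=scores.__getitem__)


# Synthetic speeds in mph (hypothetical): a fastball cluster and a curveball cluster.
rng = random.Random(1)
speeds = [rng.gauss(93, 1.5) for _ in range(300)] + [rng.gauss(78, 2.0) for _ in range(300)]
weights, means, variances = fit_gmm_1d(speeds, k=2)
label = classify(91.0, weights, means, variances)
```

Because each fitted component has an explicit mean and variance, the resulting clusters are easier to inspect than the output of a neural network classifier, which is the interpretability point the abstract raises.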

preprint · 2012 · arXiv

A Practical Implementation of the Bernoulli Factory

The Bernoulli Factory is an algorithm that takes as input a series of i.i.d. Bernoulli random variables with an unknown but fixed success probability $p$, and outputs a corresponding series of Bernoulli random variables with success probability $f(p)$, where the function $f$ is known and defined on the interval $[0,1]$. While several practical uses of the method have been proposed in Monte Carlo applications, these require an implementation framework that is flexible, general and efficient. We present such a framework for functions that are either strictly linear, concave, or convex on the unit interval using a series of envelope functions defined through a cascade, and show that this method not only greatly reduces the number of input bits needed in practice compared to other currently proposed solutions for more specific problems, and is easy to specify for simple forms, but can easily be coupled to asymptotically efficient methods to allow for theoretically strong results.
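
Two textbook cases may help make the definition concrete. The sketch below is not the envelope-and-cascade framework the abstract describes; it only shows what it means to turn flips of a coin with unknown bias $p$ into flips with bias $f(p)$, for $f(p) = p^2$ and for the constant $f(p) = 1/2$ (von Neumann's trick, which requires $0 < p < 1$). All function names are hypothetical.

```python
import random
from typing import Callable

Coin = Callable[[], int]


def make_coin(p: float, rng: random.Random) -> Coin:
    """Hypothetical helper: a coin whose bias p is hidden from the factory."""
    return lambda: 1 if rng.random() < p else 0


def factory_square(coin: Coin) -> int:
    """Bernoulli(p**2): succeed only if two independent flips both succeed."""
    first, second = coin(), coin()
    return first & second


def factory_half(coin: Coin) -> int:
    """Bernoulli(1/2) via von Neumann: flip pairs until they differ, keep the first."""
    while True:
        first, second = coin(), coin()
        if first != second:
            return first


def estimate(factory: Callable[[Coin], int], coin: Coin, n: int) -> float:
    """Empirical success frequency of a factory over n outputs."""
    return sum(factory(coin) for _ in range(n)) / n


rng = random.Random(0)
coin = make_coin(0.3, rng)  # the factories never see the value 0.3
freq_square = estimate(factory_square, coin, 20000)
freq_half = estimate(factory_half, coin, 20000)
```

Neither factory reads $p$ directly; each only consumes flips. The number of input flips consumed per output is the cost that the abstract's framework aims to reduce for more general $f$.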

preprint · 2012 · arXiv

Contrasting Multiple Social Network Autocorrelations for Binary Outcomes, With Applications To Technology Adoption

The rise of socially targeted marketing suggests that decisions made by consumers can be predicted not only from their personal tastes and characteristics, but also from the decisions of people who are close to them in their networks. One obstacle to consider is that there may be several different measures for "closeness" that are appropriate, either through different types of friendships, or different functions of distance on one kind of friendship, where only a subset of these networks may actually be relevant. Another is that these decisions are often binary and more difficult to model with conventional approaches, both conceptually and computationally. To address these issues, we present a hierarchical model for individual binary outcomes that uses and extends the machinery of the auto-probit method for binary data. We demonstrate the behavior of the parameters estimated by the multiple network-regime auto-probit model (m-NAP) under various sensitivity conditions, such as the impact of the prior distribution and the nature of the structure of the network, and demonstrate on several examples of correlated binary data in networks of interest to Information Systems, including the adoption of Caller Ring-Back Tones, whose use is governed by direct connection but explained by additional network topologies.