Researcher profile

Galit Shmueli

Galit Shmueli contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 17 - UnverifiedVerification L1Unclaimed author
4works
0followers
3topics
3close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

4 published item(s)

preprint2020arXiv

A Tree-based Semi-Varying Coefficient Model for the COM-Poisson Distribution

We propose a tree-based semi-varying coefficient model for the Conway-Maxwell- Poisson (CMP or COM-Poisson) distribution which is a two-parameter generalization of the Poisson distribution and is flexible enough to capture both under-dispersion and over-dispersion in count data. The advantage of tree-based methods is their scalability to high-dimensional data. We develop CMPMOB, an estimation procedure for a semi-varying coefficient model, using model-based recursive partitioning (MOB). The proposed framework is broader than the existing MOB framework as it allows node-invariant effects to be included in the model. To simplify the computational burden of the exhaustive search employed in the original MOB algorithm, a new split point estimation procedure is proposed by borrowing tools from change point estimation methodology. The proposed method uses only the estimated score functions without fitting models for each split point and, therefore, is computationally simpler. Since the tree-based methods only provide a piece-wise constant approximation to the underlying smooth function, we propose the CMPBoost semi-varying coefficient model which uses the gradient boosting procedure for estimation. The usefulness of the proposed methods are illustrated using simulation studies and a real example from a bike sharing system in Washington, DC.

preprint2020arXiv

Beyond Our Behavior: The GDPR and Humanistic Personalization

Personalization should take the human person seriously. This requires a deeper understanding of how recommender systems can shape both our self-understanding and identity. We unpack key European humanistic and philosophical ideas underlying the General Data Protection Regulation (GDPR) and propose a new paradigm of humanistic personalization. Humanistic personalization responds to the IEEE's call for Ethically Aligned Design (EAD) and is based on fundamental human capacities and values. Humanistic personalization focuses on narrative accuracy: the subjective fit between a person's self-narrative and both the input (personal data) and output of a recommender system. In doing so, we re-frame the distinction between implicit and explicit data collection as one of nonconscious ("organismic") behavior and conscious ("reflective") action. This distinction raises important ethical and interpretive issues related to agency, self-understanding, and political participation. Finally, we discuss how an emphasis on narrative accuracy can reduce opportunities for epistemic injustice done to data subjects.

preprint2020arXiv

Efficient Estimation of COM-Poisson Regression and Generalized Additive Model

The Conway-Maxwell-Poisson (CMP) or COM-Poison regression is a popular model for count data due to its ability to capture both under dispersion and over dispersion. However, CMP regression is limited when dealing with complex nonlinear relationships. With today's wide availability of count data, especially due to the growing collection of data on human and social behavior, there is need for count data models that can capture complex nonlinear relationships. One useful approach is additive models; but, there has been no additive model implementation for the CMP distribution. To fill this void, we first propose a flexible estimation framework for CMP regression based on iterative reweighed least squares (IRLS) and then extend this model to allow for additive components using a penalized splines approach. Because the CMP distribution belongs to the exponential family, convergence of IRLS is guaranteed under some regularity conditions. Further, it is also known that IRLS provides smaller standard errors compared to gradient-based methods. We illustrate the usefulness of this approach through extensive simulation studies and using real data from a bike sharing system in Washington, DC.

preprint2020arXiv

Selected Topics in Statistical Computing

The field of computational statistics refers to statistical methods or tools that are computationally intensive. Due to the recent advances in computing power some of these methods have become prominent and central to modern data analysis. In this article we focus on several of the main methods including density estimation, kernel smoothing, smoothing splines, and additive models. While the field of computational statistics includes many more methods, this article serves as a brief introduction to selected popular topics.