Researcher profile

Dipak K. Dey

Dipak K. Dey contributes to research discovery and scholarly infrastructure.

Researcher, affiliation not imported, open to collaborate

Trust snapshot

Quick read

Trust 19 (Unverified), Verification L1, unclaimed author
5 works
0 followers
5 topics
4 close collaborators

Published work

5 published item(s)

preprint, 2021, arXiv

Fast Bayesian inference of Block Nearest Neighbor Gaussian process for large data

This paper develops a spatial block-Nearest Neighbor Gaussian process (block-NNGP) for large location-referenced spatial data. The key idea is to divide the spatial domain into several blocks that are dependent under some constraints. The cross-blocks capture the large-scale spatial dependence, while each block captures the small-scale spatial dependence. The resulting block-NNGP enjoys Markov properties that are reflected in its sparse precision matrix. It is embedded as a prior within the class of latent Gaussian models, so Bayesian inference can be carried out using the integrated nested Laplace approximation (INLA). The performance of the block-NNGP is illustrated on simulated examples and on massive real data with on the order of $10^4$ locations.

preprint, 2020, arXiv

Flexible Modeling of Hurdle Conway-Maxwell-Poisson Distributions with Application to Mining Injuries

While hurdle Poisson regression is a popular class of models for count data with excess zeros, the link function in the binary component may be unsuitable for highly imbalanced cases, and ordinary Poisson regression is unable to handle the presence of dispersion. In this paper, we introduce the Conway-Maxwell-Poisson (CMP) distribution and integrate flexible skewed Weibull link functions as a better alternative. We take a fully Bayesian approach to draw inference from the underlying models to better explain skewness and quantify dispersion, with the Deviance Information Criterion (DIC) used for model selection. For empirical investigation, we analyze mining injury data for the period 2013-2016 from the U.S. Mine Safety and Health Administration (MSHA). The risk factors describing the proportions of employee hours spent in each type of mining work are compositional data; probabilistic principal components analysis (PPCA) is used to deal with such covariates. The hurdle CMP regression is additionally adjusted for exposure, measured by total employee working hours, to make inference on the rate of mining injuries; we test its competitiveness against other models. This can be used as a predictive model in the mining workplace to identify features that increase the risk of injuries so that prevention can be implemented.
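To make the hurdle CMP idea concrete, here is a minimal Python sketch of the probability mass function. It assumes the standard CMP form, with weights proportional to lam**y / (y!)**nu, and the usual hurdle construction (a point mass at zero plus a zero-truncated count distribution). The function names, the `p_zero` parameter, and the series truncation at `max_terms` are illustrative assumptions and are not taken from the paper; covariates, the skewed Weibull link, and the exposure adjustment are left out.

```python
import math


def cmp_log_norm(lam, nu, max_terms=200):
    """Log of the CMP normalizing constant, via a truncated series (assumption:
    max_terms is large enough for the series to have converged)."""
    logw = [y * math.log(lam) - nu * math.lgamma(y + 1) for y in range(max_terms)]
    m = max(logw)
    # log-sum-exp for numerical stability
    return m + math.log(sum(math.exp(v - m) for v in logw))


def cmp_pmf(y, lam, nu, max_terms=200):
    """P(Y = y) for a Conway-Maxwell-Poisson variable with rate lam and dispersion nu."""
    log_p = y * math.log(lam) - nu * math.lgamma(y + 1) - cmp_log_norm(lam, nu, max_terms)
    return math.exp(log_p)


def hurdle_cmp_pmf(y, p_zero, lam, nu, max_terms=200):
    """Hurdle model: zero occurs with probability p_zero; positive counts follow
    a zero-truncated CMP distribution, scaled by 1 - p_zero."""
    if y == 0:
        return p_zero
    f0 = cmp_pmf(0, lam, nu, max_terms)
    return (1.0 - p_zero) * cmp_pmf(y, lam, nu, max_terms) / (1.0 - f0)


if __name__ == "__main__":
    # nu = 1 recovers the Poisson distribution; nu > 1 gives underdispersion,
    # nu < 1 gives overdispersion.
    print(cmp_pmf(3, lam=2.0, nu=1.0))
    print(hurdle_cmp_pmf(0, p_zero=0.9, lam=2.0, nu=0.5))
    print(hurdle_cmp_pmf(2, p_zero=0.9, lam=2.0, nu=0.5))
```

With nu = 1 the CMP weights reduce to the Poisson case, which gives a quick sanity check. For small nu and large lam the series converges slowly, so `max_terms` would need to be increased.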

preprint, 2020, arXiv

Heckman selection-t model: parameter estimation via the EM-algorithm

The Heckman selection model is perhaps the most popular econometric model in the analysis of data with sample selection. Analyses of this model are based on the normality assumption for the error terms; however, in some applications, the distribution of the error term departs significantly from normality, for instance, in the presence of heavy tails and/or atypical observations. In this paper, we explore the Heckman selection-t model, where the random errors follow a bivariate Student's-t distribution. We develop an analytically tractable and efficient EM-type algorithm for iteratively computing maximum likelihood estimates of the parameters, with standard errors as a by-product. The algorithm has closed-form expressions at the E-step, which rely on formulas for the mean and variance of the truncated Student's-t distributions. Simulation studies show the vulnerability of the Heckman selection-normal model, as well as the robustness of the Heckman selection-t model. Two real examples are analyzed, illustrating the usefulness of the proposed methods. The proposed algorithms and methods are implemented in the new R package HeckmanEM.

preprint, 2020, arXiv

Power-law distributions in objective priors

The use of objective priors in Bayesian applications has become common practice for analyzing data without subjective information. These prior distributions are usually obtained from formal rules, and the data provide the dominant information in the posterior distribution. However, these priors are typically improper and may lead to an improper posterior. Here, we show, for a general family of distributions, that the obtained objective priors for the parameters either follow a power-law distribution or have an asymptotic power-law behavior. As a result, we observe that the exponents of the model are between 0.5 and 1. Understanding these behaviors allows us to easily verify whether such priors lead to proper or improper posteriors directly from the exponent of the power law. The general family considered in our study includes essential models such as the Exponential, Gamma, Weibull, Nakagami-m, Half-Normal, Rayleigh, Erlang, and Maxwell-Boltzmann distributions, to list a few. In summary, we show that comprehending the mechanisms describing the shapes of the priors provides essential information that can be used in situations where additional complexity is present.

preprint, 2020, arXiv

Skewed link regression models for imbalanced binary response with applications to life insurance

For a portfolio of life insurance policies observed for a stated period of time, e.g., one year, mortality is typically a rare event. When we examine the outcome of dying or not from such portfolios, we have an imbalanced binary response. The popular logistic and probit regression models can be inappropriate for an imbalanced binary response because model estimates may be biased, and, if not addressed properly, this can lead to seriously adverse predictions. In this paper, we propose the use of skewed link regression models (Generalized Extreme Value, Weibull, and Frechet link models) as superior models for handling an imbalanced binary response. We adopt a fully Bayesian approach for the generalized linear models (GLMs) under the proposed link functions to help better explain the high skewness. To calibrate our proposed Bayesian models, we use a real dataset of death claims experience drawn from a life insurance company's portfolio. Bayesian estimates of the parameters were obtained using the Metropolis-Hastings algorithm, and the Deviance Information Criterion (DIC) was used for Bayesian model selection and comparison. For our mortality dataset, we find that these skewed link models are superior to the widely used binary models with standard link functions. We evaluate the predictive power of the different underlying models by measuring and comparing aggregated death counts and death benefits.
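As an illustration of what a skewed link looks like, the following Python sketch maps a linear predictor `eta` to a probability through a Generalized Extreme Value (GEV) link. It assumes one common parameterization, p = 1 - G(-eta; xi), where G is the standard GEV distribution function with shape `xi`; the paper's exact parameterization may differ, and the function names are hypothetical. At xi = 0 this form reduces to the complementary log-log link.

```python
import math


def gev_cdf(x, xi):
    """Standard GEV distribution function with shape xi (location 0, scale 1)."""
    if xi == 0.0:
        return math.exp(-math.exp(-x))  # Gumbel limit
    t = 1.0 + xi * x
    if t <= 0.0:
        # x is outside the support: below the lower bound (xi > 0)
        # or above the upper bound (xi < 0)
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


def gev_link_prob(eta, xi):
    """Response probability under the assumed GEV link: p = 1 - G(-eta; xi)."""
    return 1.0 - gev_cdf(-eta, xi)


def logistic_prob(eta):
    """Symmetric logistic link, shown for comparison."""
    return 1.0 / (1.0 + math.exp(-eta))


if __name__ == "__main__":
    # Unlike the logistic curve, the GEV curve is not symmetric about p = 0.5,
    # and the shape xi controls how quickly it approaches 0 and 1.
    for eta in (-3.0, -1.0, 0.0, 1.0):
        print(eta, logistic_prob(eta), gev_link_prob(eta, 0.3), gev_link_prob(eta, -0.3))
```

The shape parameter lets the curve approach 0 and 1 at different rates, which is the flexibility a symmetric logit or probit link lacks when events are rare.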