Source author record

Srijan Sengupta

Srijan Sengupta appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Methodology, math.CA, Applications, Computation, math.GM, physics.soc-ph, Social and Information Networks

Catalog footprint

What is connected

7 works

7 topics

4 close collaborators

Preprint, 2023, arXiv

Word Embeddings as Statistical Estimators

Word embeddings are a fundamental tool in natural language processing. Currently, word embedding methods are evaluated on the basis of empirical performance on benchmark data sets, and there is a lack of rigorous understanding of their theoretical properties. This paper studies word embeddings from a statistical theoretical perspective, which is essential for formal inference and uncertainty quantification. We propose a copula-based statistical model for text data and show that under this model, the now-classical Word2Vec method can be interpreted as a statistical estimation method for estimating the theoretical pointwise mutual information (PMI). Next, by building on the work of Levy and Goldberg (2014), we develop a missing value-based estimator as a statistically tractable and interpretable alternative to the Word2Vec approach. The estimation error of this estimator is comparable to Word2Vec and improves upon the truncation-based method proposed by Levy and Goldberg (2014). The proposed estimator also performs comparably to Word2Vec in a benchmark sentiment analysis task on the IMDb Movie Reviews data set.
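As a rough illustration of the quantity named in the abstract above, the following sketch computes empirical pointwise mutual information from co-occurrence counts, PMI(w, c) = log(P(w, c) / (P(w) P(c))). It is not the copula-based model or the missing value-based estimator from the paper. The function names `cooccurrence_pmi` and `positive_pmi`, the window size, and the toy corpus are all assumptions made for illustration.

```python
import math
from collections import Counter


def cooccurrence_pmi(sentences, window=2):
    """Empirical PMI for every (word, context) pair observed within `window` tokens."""
    pair_counts = Counter()
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    pair_counts[(word, tokens[j])] += 1

    total = sum(pair_counts.values())
    word_counts = Counter()
    context_counts = Counter()
    for (word, context), n in pair_counts.items():
        word_counts[word] += n
        context_counts[context] += n

    # PMI = log( P(w, c) / (P(w) * P(c)) ), with probabilities as relative counts.
    return {
        (word, context): math.log(n * total / (word_counts[word] * context_counts[context]))
        for (word, context), n in pair_counts.items()
    }


def positive_pmi(pmi):
    """Truncate negative PMI values at zero (an assumed, simple truncation rule)."""
    return {pair: max(value, 0.0) for pair, value in pmi.items()}


corpus = [["a", "b"], ["a", "b"], ["c", "d"]]  # hypothetical toy corpus
pmi = cooccurrence_pmi(corpus, window=1)
```

Pairs that never co-occur are simply absent from the returned dictionary, since their empirical PMI would be log(0). How such unobserved pairs are handled (set to zero by truncation, or treated as missing) is the design choice the abstract contrasts.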

Preprint, 2021, arXiv

The Value of Summary Statistics for Anomaly Detection in Temporally-Evolving Networks: A Performance Evaluation Study

Network data has emerged as an active research area in statistics. Much of the focus of ongoing research has been on static networks that represent a single snapshot or aggregated historical data unchanging over time. However, most networks result from temporally-evolving systems that exhibit intrinsic dynamic behavior. Monitoring such temporally-varying networks to detect anomalous changes has applications in both social and physical sciences. In this work, we perform an evaluation study of summary statistics for anomaly detection in temporally-evolving networks by incorporating principles from statistical process monitoring. In contrast to previous studies, we deliberately incorporate temporal auto-correlation in our study. Other considerations in our comprehensive assessment include types and duration of anomaly, model type, and sparsity in temporally-evolving networks. We conclude that summary statistics can be valuable tools for network monitoring and often perform better than more complicated statistics.
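To make the idea of monitoring a summary statistic concrete, here is a minimal sketch that tracks edge density across network snapshots and flags values outside control limits estimated from a baseline window. The choice of statistic, the 3-sigma rule, the helper names, and the simulated snapshots are assumptions for illustration and are not taken from the study. The sketch also treats snapshots as independent, whereas the abstract stresses temporal auto-correlation.

```python
import numpy as np


def edge_density(adj):
    """Fraction of possible undirected edges that are present."""
    n = adj.shape[0]
    return adj[np.triu_indices(n, k=1)].sum() / (n * (n - 1) / 2)


def shewhart_flags(stats, baseline_len, k=3.0):
    """Flag values more than k baseline standard deviations from the baseline mean."""
    stats = np.asarray(stats, dtype=float)
    baseline = stats[:baseline_len]
    center = baseline.mean()
    spread = baseline.std(ddof=1)
    return np.abs(stats - center) > k * spread


def random_snapshot(rng, n, p):
    """Symmetric 0/1 adjacency matrix with independent edges of probability p."""
    upper = np.triu(rng.random((n, n)) < p, k=1)
    return (upper | upper.T).astype(int)


rng = np.random.default_rng(0)
# Thirty baseline snapshots, then one denser (anomalous) snapshot.
snapshots = [random_snapshot(rng, 50, 0.10) for _ in range(30)]
snapshots.append(random_snapshot(rng, 50, 0.30))

densities = [edge_density(a) for a in snapshots]
flags = shewhart_flags(densities, baseline_len=30)
```

Here `flags[-1]` marks the final, denser snapshot as anomalous, because its density lies far outside the baseline band.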

Preprint, 2020, arXiv

Scalable Estimation of Epidemic Thresholds via Node Sampling

Infectious or contagious diseases can be transmitted from one person to another through social contact networks. In today's interconnected global society, such contagion processes can cause global public health hazards, as exemplified by the ongoing Covid-19 pandemic. It is therefore of great practical relevance to investigate the network transmission of contagious diseases from the perspective of statistical inference. An important and widely studied boundary condition for contagion processes over networks is the so-called epidemic threshold. The epidemic threshold plays a key role in determining whether a pathogen introduced into a social contact network will cause an epidemic or die out. In this paper, we investigate epidemic thresholds from the perspective of statistical network inference. We identify two major challenges that are caused by the high computational and sampling complexity of the epidemic threshold. We develop two statistically accurate and computationally efficient approximation techniques to address these issues under the Chung-Lu modeling framework. The second approximation, which is based on random walk sampling, further enjoys the advantage of requiring data on a vanishingly small fraction of nodes. We establish theoretical guarantees for both methods and demonstrate their empirical superiority.
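The abstract above does not define the threshold, so the sketch below assumes one common definition: the reciprocal of the largest eigenvalue of the adjacency matrix. It also shows a cheaper degree-based approximation that is often used for Chung-Lu-type graphs, where the largest eigenvalue is approximated by the ratio of the sum of squared degrees to the sum of degrees. Both function names and both formulas are assumptions of this sketch, not a statement of the paper's estimators.

```python
import numpy as np


def threshold_spectral(adj):
    """Assumed definition: 1 / (largest eigenvalue of a symmetric adjacency matrix)."""
    adj = np.asarray(adj, dtype=float)
    return 1.0 / np.linalg.eigvalsh(adj).max()


def threshold_degree_approx(degrees):
    """Assumed approximation: sum(d) / sum(d^2), which needs degrees only."""
    d = np.asarray(degrees, dtype=float)
    return d.sum() / (d ** 2).sum()


# Complete graph on 5 nodes: largest eigenvalue is 4 and every degree is 4.
adj = np.ones((5, 5)) - np.eye(5)
exact = threshold_spectral(adj)
approx = threshold_degree_approx(adj.sum(axis=1))
```

The eigenvalue computation needs the full adjacency matrix, while the degree-based version needs only a list of degrees, which is what makes degree-driven approximations attractive for large networks.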

Preprint, 2015, arXiv

A subsampled double bootstrap for massive data

The bootstrap is a popular and powerful method for assessing the precision of estimators and inferential methods. However, for massive datasets, which are increasingly prevalent, the bootstrap becomes prohibitively costly in computation, and its feasibility is questionable even with modern parallel computing platforms. Recently, Kleiner, Talwalkar, Sarkar, and Jordan (2014) proposed a method called BLB (Bag of Little Bootstraps) for massive data, which is more computationally scalable with little sacrifice of statistical accuracy. Building on BLB and the idea of the fast double bootstrap, we propose a new resampling method, the subsampled double bootstrap, for both independent data and time series data. We establish consistency of the subsampled double bootstrap under mild conditions for both independent and dependent cases. Methodologically, the subsampled double bootstrap is superior to BLB in running time, sample coverage, and ease of automatic implementation, with fewer tuning parameters for a given time budget. Its advantage relative to BLB and the bootstrap is also demonstrated in numerical simulations and a data illustration.
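The following is a loose sketch of the idea described in the abstract above for independent data: repeatedly draw a small random subset, draw a single full-size resample from that subset (represented by multinomial counts so only the subset is ever stored), and record the difference between the resample estimate and the subset estimate. The details here are assumptions of this sketch and are not taken from the paper: the subset size, the number of rounds, the use of the spread of the differences as a standard error, and the names `sdb_standard_error` and `weighted_mean`.

```python
import numpy as np


def weighted_mean(values, weights):
    """Mean of `values` where each value is repeated according to `weights`."""
    return np.average(values, weights=weights)


def sdb_standard_error(data, estimator, subset_size, n_rounds, rng):
    """Standard error from one full-size weighted resample per small random subset."""
    data = np.asarray(data, dtype=float)
    n = data.shape[0]
    roots = np.empty(n_rounds)
    for r in range(n_rounds):
        # Small subset drawn without replacement from the full data.
        subset = rng.choice(data, size=subset_size, replace=False)
        # One resample of size n from the subset, stored as counts per subset point.
        counts = rng.multinomial(n, np.full(subset_size, 1.0 / subset_size))
        resample_estimate = estimator(subset, counts)
        subset_estimate = estimator(subset, np.ones(subset_size))
        roots[r] = resample_estimate - subset_estimate
    return roots.std(ddof=1)
```

For example, with 2,000 standard normal observations, a subset size of 95 and 400 rounds, the returned value should be close to the textbook standard error of the mean, 1/sqrt(2000). Each round touches only `subset_size` distinct data points, which is where the computational saving over a full bootstrap comes from.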

Preprint, 2015, arXiv

Analytic Solution of Linear Fractional Differential Equation with Jumarie Derivative in Term of Mittag-Leffler Function

There is no unified method for solving fractional differential equations. The derivative used in this paper, for the several differential equations studied, is of the Jumarie formulation. We develop an algorithm to solve linear fractional differential equations composed with the Jumarie fractional derivative in terms of the Mittag-Leffler function, and show its conjugation with ordinary calculus. In these fractional differential equations, the one-parameter Mittag-Leffler function plays a role similar to that of the exponential function in ordinary differential equations.
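The one-parameter Mittag-Leffler function mentioned above has the standard series E_alpha(z) = sum over k >= 0 of z^k / Gamma(alpha k + 1), which reduces to exp(z) when alpha = 1. The sketch below evaluates a truncated version of that series for moderate |z|. The helper `fractional_exponential`, which returns y0 * E_alpha(a t^alpha) as the analogue of y0 * exp(a t), is an assumed illustration of the "role of the exponential" remark and is not copied from the paper.

```python
import math


def mittag_leffler(alpha, z, terms=200):
    """Truncated series for the one-parameter Mittag-Leffler function (real z, moderate |z|)."""
    if z == 0:
        return 1.0
    total = 0.0
    for k in range(terms):
        # Work in logs so that Gamma(alpha * k + 1) cannot overflow.
        log_magnitude = k * math.log(abs(z)) - math.lgamma(alpha * k + 1)
        sign = -1.0 if (z < 0 and k % 2 == 1) else 1.0
        total += sign * math.exp(log_magnitude)
    return total


def fractional_exponential(t, y0, a, alpha):
    """Assumed analogue of y0 * exp(a * t): y0 * E_alpha(a * t**alpha), for t >= 0."""
    return y0 * mittag_leffler(alpha, a * t ** alpha)
```

With `alpha = 1` the function agrees with the ordinary exponential, and with `alpha = 2` and `z = 1` it equals cosh(1), which gives two quick sanity checks on the series.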

Preprint, 2015, arXiv

Analytical solution with tanh-method and fractional sub-equation method for non-linear partial differential equations and corresponding fractional differential equation composed with Jumarie fractional derivative

The solution of non-linear differential equations, non-linear partial differential equations and non-linear fractional differential equations is an area of current research in applied science. Here the tanh-method and the fractional sub-equation method are used to solve three non-linear differential equations and the corresponding fractional differential equations. The fractional differential equations here are composed with the Jumarie fractional derivative. Both sets of solutions are obtained in analytical traveling-wave form. We have not come across solutions of these equations reported anywhere earlier.

Preprint, 2015, arXiv

Characterization of non-differentiable points in a function by Fractional derivative of Jumarrie type

There are many functions which are continuous everywhere but not differentiable at some points, as in physical systems such as ECG and EEG plots, crack patterns, and several other phenomena. Using classical calculus, those functions cannot be characterized, especially at the non-differentiable points. To characterize those functions, the concept of the fractional derivative is used. From the analysis it is established that although those functions are unreachable at the non-differentiable points in the classical sense, they can be characterized using the fractional derivative. In this paper we demonstrate the use of the modified Riemann-Liouville derivative of Jumarie to calculate the fractional derivatives at the non-differentiable points of a function, which may be one step towards characterizing, distinguishing and comparing several non-differentiable points in a system or across systems. We are extending this method to differentiate various ECG graphs by quantification of non-differentiable points, which is a useful method in differential diagnosis. Each step of calculating these fractional derivatives is elaborated.
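For reference, the modified Riemann-Liouville derivative attributed to Jumarie is commonly written, for order 0 < alpha < 1, as shown below. The notation (lower limit 0, integration variable xi) is an assumption of this note and is not taken from the record.

```latex
D^{\alpha} f(x) \;=\; \frac{1}{\Gamma(1-\alpha)} \, \frac{d}{dx}
\int_{0}^{x} (x-\xi)^{-\alpha} \, \bigl[ f(\xi) - f(0) \bigr] \, d\xi ,
\qquad 0 < \alpha < 1 .
```

Subtracting f(0) inside the integral is what makes the derivative of a constant vanish under this definition, in contrast to the classical Riemann-Liouville derivative.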
