Researcher profile

Samiran Ghosh

Samiran Ghosh contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 19 - Unverified
Verification L1
Unclaimed author
5 works
0 followers
4 topics
4 close collaborators

Published work

5 published items

preprint, 2020, arXiv

Shock waves in a rotating non-Maxwellian magnetized dusty plasma

A theoretical model is presented to study the characteristics of dust acoustic shocks in a viscous, magnetized, and rotating dusty plasma at both fast and slow time scales. By employing the reductive perturbation technique, the nonlinear Zakharov-Kuznetsov (ZK) equation is derived for both cases: when the dust is inactive and when it is dynamic (fast and slow time scales). Both electrons and ions are considered to follow a kappa/Cairns distribution. In both cases, whether the dust is in the background or active, viscosity plays a key role in dissipation for the propagation of the acoustic shock. The magnetic field and rotation are responsible for the dispersive term. Superthermality, along with the viscous nature of the plasma, is found to significantly affect the formation of the shock wave. The present investigation may help in understanding rotating plasmas in experiments now being carried out.

preprint, 2019, arXiv

Robust Variable Selection Criteria for the Penalized Regression

We propose a robust variable selection procedure using a divergence-based M-estimator combined with a penalty function. It produces robust estimates of the regression parameters and simultaneously selects the important explanatory variables. An efficient algorithm based on the quadratic approximation of the estimating equation is constructed. The asymptotic distribution and the influence function of the regression coefficients are derived. The widely used model selection procedures based on Mallows's $C_p$ statistic and the Akaike information criterion (AIC) often perform very poorly in the presence of heavy-tailed errors or outliers. To address this, we introduce robust versions of these information criteria based on our proposed method. The simulation studies show that the robust variable selection technique outperforms the classical likelihood-based techniques in the presence of outliers. The performance of the proposed method is also explored through real data analysis.

preprint, 2012, arXiv

Outlier detection from ETL Execution trace

Extract, Transform, Load (ETL) is an integral part of Data Warehousing (DW) implementation. The commercial tools used for this purpose capture a lot of execution trace in the form of various log files with a plethora of information. However, there has hardly been any initiative in which proactive analyses have been done on the ETL logs to improve their efficiency. In this paper we use an outlier detection technique to find the processes that vary most from the group in terms of execution trace. Because our experiment was carried out on actual production processes, we treat any outlier as signal rather than noise. To identify the input parameters for the outlier detection algorithm, we conducted a survey among the developer community with a varied mix of experience and expertise. We use simple text parsing to extract the features shortlisted from the survey from the logs. Subsequently, we applied a clustering-based outlier detection technique to the logs. By this process we reduced our domain of detailed analysis from 500 logs to 44 logs (8 percent). Of the 5 outlier clusters, 2 are of genuine concern, while the other 3 stand out because of the huge number of rows involved.

preprint, 2012, arXiv

Outlier Detection Techniques for SQL and ETL Tuning

The RDBMS is at the heart of both OLTP and OLAP applications. For both types of applications, thousands of queries expressed in SQL are executed on a daily basis. All the commercial DBMS engines capture various attributes of these executed queries in system tables. These queries need to conform to best practices and need to be tuned to ensure optimal performance. While checklists, and often tools, are used to enforce this, a black-box technique applied to the queries for profiling and outlier detection is not employed to gain a summary-level understanding. This is the motivation of the paper, as such an analysis not only points out inefficiencies built into the system but also has the potential to reveal evolving best practices and inappropriate usage. This can reduce latency in information flow and support optimal utilization of hardware and software capacity. In this paper we start by formulating the problem. We explore four outlier detection techniques. We apply these techniques to a rich corpus of production queries and analyze the results. We also explore the benefit of an ensemble approach. We conclude with future courses of action. We used the same philosophy to optimize extract, transform, load (ETL) jobs in one of our previous works, and we give a brief introduction to it in section four.

preprint, 2011, arXiv

An imputation-based approach for parameter estimation in the presence of ambiguous censoring with application in industrial supply chain

This paper describes a novel approach based on "proportional imputation" when identical units produced in a batch have random but independent installation and failure times. The problem is motivated by a real-life industrial production-delivery supply chain where identical units are shipped after production to a third-party warehouse and then sold at a future date for possible installation. Due to practical limitations, at any given time point, the exact installation and failure times are known only for those units that have failed within that time frame after installation. Hence, in-house reliability engineers are presented with very limited, as well as partial, data to estimate different model parameters related to the installation and failure distributions. In reality, other units in the batch are generally not utilized due to the lack of a proper statistical methodology, leading to gross misspecification. In this paper we introduce a likelihood-based, parametric, and computationally efficient solution to overcome this problem.