Researcher profile

Srikumar Venugopal

Srikumar Venugopal contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 17 - Unverified | Verification L1 | Unclaimed author
4 works
0 followers
5 topics
4 close collaborators

Published work

4 published items

preprint | 2013 | arXiv

"The tail wags the dog": A study of anomaly detection in commercial application performance

The IT industry needs systems management models that leverage available application information to detect quality of service, scalability and health of service. Ideally this technique would be common to varying application types with different n-tier architectures under normal production conditions of varying load, user session traffic, transaction type, transaction mix, and hosting environment. This paper shows that a whole-of-service measurement paradigm, utilizing a black-box M/M/1 queuing model and auto-regression curve fitting of the associated CDF, is an accurate model for characterizing system performance signatures. This modeling method is also used to detect application slowdown events. The technique was shown to work for a diverse range of workloads, from 76 Tx/5 min to 19,025 Tx/5 min. The method did not rely on customizations specific to the n-tier architecture of the systems being analyzed, and so the performance anomaly detection technique was shown to be platform and configuration agnostic.
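As a rough illustration of the M/M/1 model mentioned in this abstract: in a stable M/M/1 queue served first-come-first-served, the response time is exponentially distributed with rate (service rate minus arrival rate), so its CDF is F(t) = 1 - exp(-(mu - lambda) t). The sketch below fits that single rate to baseline response times and compares a later window against the fitted tail. The function names, the 95th-percentile check and the `tolerance` factor are assumptions made for illustration; the paper's auto-regression curve fitting is not reproduced here.

```python
import math


def mm1_response_cdf(t, arrival_rate, service_rate):
    """CDF of the response (sojourn) time in a stable M/M/1 queue."""
    if arrival_rate >= service_rate:
        raise ValueError("unstable queue: arrival_rate must be below service_rate")
    if t < 0:
        return 0.0
    return 1.0 - math.exp(-(service_rate - arrival_rate) * t)


def fit_decay_rate(response_times):
    """Maximum-likelihood estimate of (mu - lambda) for exponential response times."""
    return len(response_times) / sum(response_times)


def percentile_from_rate(rate, p):
    """Response time below which a fraction p of requests fall under the fitted model."""
    return -math.log(1.0 - p) / rate


def is_slowdown(baseline_times, window_times, p=0.95, tolerance=1.5):
    """Hypothetical rule: flag a window whose empirical p-quantile exceeds
    the baseline model's p-quantile by more than `tolerance` times."""
    expected = percentile_from_rate(fit_decay_rate(baseline_times), p)
    ordered = sorted(window_times)
    index = min(len(ordered) - 1, int(p * len(ordered)))
    return ordered[index] > tolerance * expected
```

Because the fit uses only observed response times, a check of this kind needs no knowledge of the application's tiers, which is in the spirit of the black-box approach the abstract describes.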

preprint | 2013 | arXiv

Big Data and Cross-Document Coreference Resolution: Current State and Future Opportunities

Information Extraction (IE) is the task of automatically extracting structured information from unstructured/semi-structured machine-readable documents. Among various IE tasks, extracting actionable intelligence from an ever-increasing amount of data depends critically upon Cross-Document Coreference Resolution (CDCR) - the task of identifying entity mentions across multiple documents that refer to the same underlying entity. Recently, document datasets of the order of peta-/tera-bytes have raised many challenges for performing effective CDCR, such as scaling to large numbers of mentions and limited representational power. The problem of analysing such datasets is called "big data". The aim of this paper is to provide readers with an understanding of the central concepts, subtasks, and the current state-of-the-art in the CDCR process. We provide an assessment of existing tools/techniques for CDCR subtasks and highlight big data challenges in each of them to help readers identify important and outstanding issues for further investigation. Finally, we provide concluding remarks and discuss possible directions for future work.

preprint | 2013 | arXiv

Scalable Protein Sequence Similarity Search using Locality-Sensitive Hashing and MapReduce

Metagenomics is the study of environments through genetic sampling of their microbiota. Metagenomic studies produce large datasets that are estimated to grow at a faster rate than the available computational capacity. A key step in the study of metagenome data is sequence similarity searching, which is computationally intensive over large datasets. Tools such as BLAST require large dedicated computing infrastructure to perform such analysis, and that infrastructure may not be available to every researcher. In this paper, we propose a novel approach called ScalLoPS that performs searching on protein sequence datasets using LSH (Locality-Sensitive Hashing) implemented on the MapReduce distributed framework. ScalLoPS is designed to scale across computing resources sourced from cloud computing providers. We present the design and implementation of ScalLoPS, followed by an evaluation with datasets derived from both traditional and metagenomic studies. Our experiments show that this method approximates the quality of BLAST results while improving the scalability of protein sequence search.
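The general idea of pairing LSH with a map/reduce split can be sketched in a few lines. The example below uses MinHash over k-mers with banding, run in a single process: the map step emits one (band, signature slice) key per sequence, and the reduce step groups sequence IDs that share a key into candidate pairs. This is an assumed, generic LSH scheme chosen for illustration; the abstract does not specify the hash family ScalLoPS uses, and all function names and parameters here are hypothetical.

```python
import hashlib
from collections import defaultdict


def kmers(seq, k=3):
    """Set of overlapping k-mers (shingles) of a protein sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def _hash(token, seed):
    """Deterministic 64-bit hash of a k-mer, varied by an integer seed."""
    salt = seed.to_bytes(8, "little")
    digest = hashlib.blake2b(token.encode(), digest_size=8, salt=salt).digest()
    return int.from_bytes(digest, "little")


def minhash_signature(seq, num_hashes=16, k=3):
    """MinHash signature: the minimum hash of the k-mer set under each seed."""
    shingles = kmers(seq, k)
    if not shingles:
        raise ValueError("sequence is shorter than k")
    return [min(_hash(s, seed) for s in shingles) for seed in range(num_hashes)]


def map_phase(seq_id, seq, bands=8, rows=2):
    """Map step: emit ((band index, band slice), seq_id) for every band."""
    sig = minhash_signature(seq, bands * rows)
    for b in range(bands):
        yield (b, tuple(sig[b * rows:(b + 1) * rows])), seq_id


def reduce_phase(pairs):
    """Reduce step: sequences sharing any band key become candidate pairs."""
    buckets = defaultdict(list)
    for key, seq_id in pairs:
        buckets[key].append(seq_id)
    candidates = set()
    for ids in buckets.values():
        for i in range(len(ids)):
            for j in range(i + 1, len(ids)):
                candidates.add(tuple(sorted((ids[i], ids[j]))))
    return candidates


def find_candidates(sequences, bands=8, rows=2):
    """Run map then reduce in-process over a dict of {seq_id: sequence}."""
    pairs = [kv for sid, seq in sequences.items()
             for kv in map_phase(sid, seq, bands, rows)]
    return reduce_phase(pairs)
```

In a real MapReduce deployment the framework's shuffle would do the grouping by key across machines; only candidate pairs would then need a more expensive alignment or similarity check.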

preprint | 2011 | arXiv

Cost of Virtual Machine Live Migration in Clouds: A Performance Evaluation

Virtualization has become commonplace in modern data centers, often referred to as "computing clouds". The capability of virtual machine live migration brings benefits such as improved performance, manageability and fault tolerance, while allowing workload movement with a short service downtime. However, service levels of applications are likely to be negatively affected during a live migration. For this reason, a better understanding of its effects on system performance is desirable. In this paper, we evaluate the effects of live migration of virtual machines on the performance of applications running inside Xen VMs. Results show that, in most cases, migration overhead is acceptable but cannot be disregarded, especially in systems where availability and responsiveness are governed by strict Service Level Agreements. Despite that, there is a high potential for live migration applicability in data centers serving modern Internet applications. Our results are based on a workload covering the domain of multi-tier Web 2.0 applications.