Source author record

K. Gopinath

K. Gopinath appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Performance; Distributed, Parallel, and Cluster Computing; Operating Systems; Programming Languages; Software Engineering

Catalog footprint

What is connected

5 works

5 topics

4 close collaborators

preprint · 2015 · arXiv

Are Markov Models Effective for Storage Reliability Modelling?

Continuous Time Markov Chains (CTMC) have been used extensively to model the reliability of storage systems. While the exponentially distributed sojourn time of Markov models is widely known to be unrealistic (and it is necessary to consider Weibull-type models for components such as disks), recent work has also highlighted some additional infirmities of the CTMC model, such as its inability to handle repair times. Due to the memoryless property of these models, any failure or repair of one component resets the "clock" to zero, with any partial repair or aging in some other subsystem forgotten. It has therefore been argued that simulation is the only accurate technique available for modelling the reliability of a storage system with multiple components. We show how both of the above problematic aspects can be handled when we consider a careful set of approximations in a detailed model of the system. A detailed model has many states, and the transitions between them, together with the current state, capture the "memory" of the various components. We model a non-exponential distribution using a sum of exponential distributions, along with a CTMC solver in a probabilistic model checking tool that has support for reducing large state spaces. Furthermore, it is possible to get results close to those obtained through simulation, and at much lower cost.
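As a rough illustration of how extra states can carry "memory", the sketch below chains several exponential stages (an Erlang distribution) and shows that the resulting failure rate grows with age, which a single exponential cannot do. The abstract does not describe the actual fitting procedure; the Erlang form, the stage count and the mean lifetime used here are all assumptions chosen only for illustration.

```python
import math


def erlang_survival(t, k, rate):
    """P(T > t) for a sum of k exponential stages, each with the given rate."""
    x = rate * t
    return math.exp(-x) * sum(x ** n / math.factorial(n) for n in range(k))


def erlang_hazard(t, k, rate):
    """Instantaneous failure rate f(t) / S(t) of the k-stage chain."""
    density = rate ** k * t ** (k - 1) * math.exp(-rate * t) / math.factorial(k - 1)
    return density / erlang_survival(t, k, rate)


MEAN_LIFE = 10.0  # hypothetical mean lifetime, arbitrary time units
STAGES = 3        # hypothetical number of exponential stages
stage_rate = STAGES / MEAN_LIFE  # keeps the overall mean at MEAN_LIFE

# Hazard at a young, middle and old age: it rises with age towards stage_rate.
hazards = [erlang_hazard(t, STAGES, stage_rate) for t in (1.0, 5.0, 20.0)]
print(hazards)
```

With a single stage the hazard is constant, which is the memoryless case the abstract criticises; with more stages, the index of the current stage acts as a coarse record of the component's age.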

preprint · 2015 · arXiv

Scalable Reliability Modelling of RAID Storage Subsystems

Reliability modelling of RAID storage systems, with their various components such as RAID controllers, enclosures, expanders, interconnects and disks, is important from a storage system designer's point of view. A model that can express all the failure characteristics of the whole RAID storage system can be used to evaluate design choices, perform cost-reliability trade-offs and conduct sensitivity analyses. However, including such details quickly makes the computational models of reliability infeasible. We present a CTMC reliability model for RAID storage systems that scales to much larger systems than heretofore reported, and we try to model all the components as accurately as possible. We use several state-space reduction techniques at the user level, such as aggregating all in-series components and hierarchical decomposition, to reduce the size of our model. To automate the computation of reliability, we use the PRISM model checker as a CTMC solver where appropriate. Our modelling techniques using PRISM are more practical (in both time and effort) than previously reported Monte Carlo simulation techniques. Our model for RAID storage systems (which includes, for example, disks, expanders and enclosures) uses Weibull distributions for disks and, where appropriate, correlated failure modes for disks, while we use exponential distributions with independent failure modes for all other components. To use the CTMC solver, we approximate the Weibull distribution for a disk using a sum of exponentials, and we confirm that this model gives results that are in reasonably good agreement with those from the sequential Monte Carlo simulation methods for RAID disk subsystems reported earlier in the literature. Using a combination of scalable techniques, we are able to model and compute reliability for fairly large configurations with up to 600 disks using this model.
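The idea of aggregating in-series components can be sketched for the simplest case: independent components with exponential lifetimes, where the series block fails as soon as any one of them fails, so the individual rates add up to a single rate. The failure rates below are hypothetical placeholders, not values from the paper, and the paper's PRISM models are not reproduced here.

```python
import math


def aggregate_series(rates):
    """Collapse independent exponential components in series into one rate."""
    return sum(rates)


def series_survival(rates, t):
    """Probability that every in-series component is still working at time t."""
    return math.prod(math.exp(-r * t) for r in rates)


# Hypothetical failure rates per hour for a controller, an enclosure and an expander.
rates = [1e-6, 5e-7, 2e-6]
combined = aggregate_series(rates)
mttf_hours = 1.0 / combined  # mean time to failure of the aggregated block

print(combined, mttf_hours)
```

Replacing three components by one aggregated component removes their separate up/down states from the chain, which is where the state-space saving comes from.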

preprint · 2013 · arXiv

Distributed Wear levelling of Flash Memories

For large-scale distributed storage systems, flash memories are an excellent choice because they consume less power, take less floor space for a target throughput and provide faster access to data. In a traditional distributed filesystem, even distribution is required to ensure load balancing, balanced space utilisation and failure tolerance. In the presence of flash memories, we should in addition ensure that writes are evenly distributed across the different flash storage nodes, so that the nodes wear evenly and unpredictable failures of storage nodes are avoided. This requires that we distribute updates and perform garbage collection across the flash storage nodes. We motivate the distributed wear-levelling problem by considering the replica placement algorithm for HDFS. Viewing wear levelling across flash storage nodes as a distributed co-ordination problem, we present an alternate design that reduces the message communication cost across participating nodes. We demonstrate the effectiveness of our design through simulation.
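To make the goal of even wear concrete, here is a minimal centralised baseline that always sends a write to the least-written nodes. It is not the design proposed in the paper, which targets distributed co-ordination with low message cost; the function name, node names and replica count are hypothetical.

```python
import heapq


def place_writes(node_ids, num_writes, replicas=1):
    """Send each write to the `replicas` nodes with the fewest writes so far."""
    if replicas > len(node_ids):
        raise ValueError("replicas cannot exceed the number of nodes")
    wear = {node: 0 for node in node_ids}
    heap = [(0, node) for node in node_ids]
    heapq.heapify(heap)
    for _ in range(num_writes):
        # Pop the least-worn nodes first so replicas land on distinct nodes.
        chosen = [heapq.heappop(heap) for _ in range(replicas)]
        for count, node in chosen:
            wear[node] = count + 1
            heapq.heappush(heap, (count + 1, node))
    return wear


wear = place_writes(["n0", "n1", "n2", "n3"], num_writes=10, replicas=2)
print(wear)
```

A global heap like this needs every write to consult shared state, which is exactly the kind of co-ordination cost a distributed design would try to reduce.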

preprint · 2013 · arXiv

LFTL: A multi-threaded FTL for a Parallel IO Flash Card under Linux

New PCI-e flash cards and SSDs supporting over 100,000 IOPS are now available, with several use cases in the design of high-performance storage systems. By using an array of flash chips arranged in multiple banks, large capacities are achieved. Such multi-banked architectures allow parallel read, write and erase operations. In a raw PCI-e flash card, such parallelism is directly available to the software layer. In addition, the devices have restrictions; for example, pages within a block can only be written sequentially. The devices also have larger minimum write sizes (greater than 4KB). Current flash translation layers (FTLs) in Linux are not well suited to such devices because of the high device speeds, the architectural restrictions and other factors such as high lock contention. We present an FTL for Linux that takes the hardware restrictions into account and exploits the parallelism to achieve high speeds. We also consider leveraging the parallelism for garbage collection by scheduling garbage collection activities on idle banks. We propose and evaluate an adaptive method to vary the amount of garbage collection according to the current I/O load on the device.
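The two garbage collection ideas mentioned above, collecting on idle banks and scaling the amount of collection with I/O load, can be sketched as follows. This is a toy policy written for illustration only: the function names, the linear scaling rule and the thresholds are assumptions, not the method evaluated in the paper.

```python
def gc_blocks_to_reclaim(io_load, free_fraction, max_blocks=8, low_water=0.1):
    """Scale garbage collection down as I/O load rises, unless space is critical.

    io_load is a hypothetical utilisation figure in [0, 1]; free_fraction is
    the fraction of blocks still free.
    """
    if not 0.0 <= io_load <= 1.0:
        raise ValueError("io_load must be between 0 and 1")
    if free_fraction < low_water:
        return max_blocks  # space is nearly exhausted, so collect regardless of load
    return round(max_blocks * (1.0 - io_load))


def pick_idle_banks(bank_queue_depths):
    """Return the indices of banks with no pending I/O, as candidates for GC."""
    return [i for i, depth in enumerate(bank_queue_depths) if depth == 0]


print(gc_blocks_to_reclaim(io_load=0.25, free_fraction=0.5))
print(pick_idle_banks([3, 0, 1, 0]))
```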

preprint · 2008 · arXiv

Structure and Interpretation of Computer Programs

Call graphs depict the static caller-callee relation between "functions" in a program. With most source/target languages supporting functions as the primitive unit of composition, call graphs naturally form the fundamental control flow representation available to understand and develop software. They are also the substrate on which various interprocedural analyses are performed, and are an integral part of program comprehension and testing. Given their universality and usefulness, it is imperative to ask whether call graphs exhibit any intrinsic graph-theoretic features across versions, program domains and source languages. This work is an attempt to answer these questions: we present and investigate a set of meaningful graph measures that help us understand call graphs better; we establish how these measures correlate, if at all, across different languages and program domains; and we assess the overall, language-independent software quality by suitably interpreting these measures.
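The abstract does not list which graph measures are studied. As one plausible example, the sketch below computes in-degree and out-degree counts on a small call graph; the graph itself and the helper names are hypothetical.

```python
from collections import Counter

# Hypothetical call graph: each function maps to the functions it calls.
call_graph = {
    "main": ["parse", "run"],
    "parse": ["read", "log"],
    "run": ["step", "log"],
    "step": ["log"],
    "read": [],
    "log": [],
}


def out_degrees(graph):
    """Number of distinct callees per function."""
    return {caller: len(set(callees)) for caller, callees in graph.items()}


def in_degrees(graph):
    """Number of distinct callers per function."""
    counts = Counter({node: 0 for node in graph})
    for callees in graph.values():
        for callee in set(callees):
            counts[callee] += 1
    return dict(counts)


def degree_histogram(degrees):
    """Map each degree value to the number of functions that have it."""
    return dict(Counter(degrees.values()))


print(in_degrees(call_graph))
print(degree_histogram(in_degrees(call_graph)))
```

Comparing such histograms across programs written in different languages is one way to ask whether a measure is language independent.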