Source author record

Hengfeng Wei

Hengfeng Wei appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Distributed, Parallel, and Cluster Computing; Databases; Software Engineering

Catalog footprint

What is connected

7 works

3 topics

4 close collaborators


preprint · 2022 · arXiv

MET: Model Checking-Driven Explorative Testing of CRDT Designs and Implementations

Internet-scale distributed systems often replicate data at multiple geographic locations to provide low latency and high availability. The Conflict-free Replicated Data Type (CRDT) is a framework that provides a principled approach to maintaining eventual consistency among data replicas. CRDTs have been notoriously difficult to design and implement correctly. Subtle deep bugs lie in the complex and tedious handling of all possible cases of conflicting data updates. We argue that the CRDT design should be formally specified and model-checked to uncover deep bugs. The implementation further needs to be systematically tested. On the one hand, the testing needs to inherit the exhaustive nature of model checking and ensure test coverage. On the other hand, the testing is expected to find coding errors which cannot be detected by design-level verification. To address these challenges, we propose the Model Checking-driven Explorative Testing (MET) framework. At the design level, MET uses TLA+ to specify and model check CRDT designs. At the implementation level, MET conducts model checking-driven explorative testing, in the sense that the test cases are automatically generated from the model checking traces. The system execution is controlled to proceed deterministically, following the model checking trace. The explorative testing systematically controls and permutes all nondeterministic message reorderings. We apply MET in our practical development of CRDTs. Bugs are found in both the designs and the implementations of CRDTs. As for bugs which can be found by traditional testing techniques, MET greatly reduces the cost of fixing the bugs. Moreover, MET can find subtle deep bugs which cannot be found by existing techniques at a reasonable cost. We further discuss how MET provides us with sufficient confidence in the correctness of our CRDT designs and implementations.
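The idea of permuting all message reorderings can be illustrated with a toy example. The sketch below is not MET itself and involves no TLA+; it assumes a hypothetical state-based grow-only counter, delivers a fixed set of messages to one replica in every possible order, and collects the distinct final states. The function names (`join`, `overwrite`, `explore`) and the sample messages are illustrative assumptions.

```python
from itertools import permutations


def join(a, b):
    """G-Counter merge: element-wise maximum (commutative, associative, idempotent)."""
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in a.keys() | b.keys()}


def overwrite(a, b):
    """Deliberately buggy merge: the incoming state overwrites local entries."""
    return {**a, **b}


def explore(merge, initial, messages):
    """Deliver `messages` in every possible order; return the set of distinct final states."""
    finals = set()
    for order in permutations(messages):
        state = dict(initial)
        for msg in order:
            state = merge(state, msg)
        # Store a hashable, order-independent snapshot of the final state.
        finals.add(tuple(sorted(state.items())))
    return finals


# Hypothetical replica state and in-flight messages, chosen only for illustration.
initial = {"c": 1}
messages = [{"a": 1}, {"a": 2, "b": 1}, {"b": 3}]

converged = explore(join, initial, messages)
diverged = explore(overwrite, initial, messages)
print(len(converged), len(diverged))
```

A correct merge yields exactly one final state regardless of delivery order, while the buggy merge yields several, which is the kind of order-dependent defect that exhaustive reordering is meant to expose.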

preprint · 2022 · arXiv

Remove-Win: a Design Framework for Conflict-free Replicated Data Types

Distributed storage systems employ replication to improve performance and reliability. To provide low latency data access, replicas are often required to accept updates without coordination with each other, and the updates are then propagated asynchronously. This brings the critical challenge of conflict resolution among concurrent updates. Conflict-free Replicated Data Type (CRDT) is a principled approach to addressing this challenge. However, existing CRDT designs are tricky and hard to generalize to other data types. A design framework is greatly needed to guide the systematic design of new CRDTs. To address this challenge, we propose RWF -- the Remove-Win design Framework for CRDTs. RWF leverages the simple but powerful remove-win strategy to resolve conflicting updates, and provides a generic design for a variety of data container types. Two exemplar implementations following RWF are given over the Redis data type store, which demonstrate the effectiveness of RWF. Performance measurements of our implementations further show the efficiency of CRDT designs following RWF.
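As a rough illustration of the remove-win strategy, the sketch below evaluates set membership from a log of operations tagged with vector clocks. It follows the commonly used specification-level reading of remove-wins (an element is present only if every remove of it is causally followed by some add), not the RWF design or its Redis implementations; the operation encoding and the names `happens_before` and `contains` are assumptions made for this example.

```python
def happens_before(u, v):
    """True if vector clock `u` is strictly smaller than vector clock `v`."""
    keys = u.keys() | v.keys()
    no_greater = all(u.get(k, 0) <= v.get(k, 0) for k in keys)
    strictly_less = any(u.get(k, 0) < v.get(k, 0) for k in keys)
    return no_greater and strictly_less


def contains(ops, elem):
    """Remove-win membership: every remove of `elem` must be overridden by a causally later add."""
    adds = [vc for kind, e, vc in ops if kind == "add" and e == elem]
    removes = [vc for kind, e, vc in ops if kind == "remove" and e == elem]
    return bool(adds) and all(
        any(happens_before(r, a) for a in adds) for r in removes
    )


# Replicas A and B update concurrently: neither clock precedes the other.
concurrent = [
    ("add", "x", {"A": 1}),
    ("remove", "x", {"B": 1}),
]

# Replica A re-adds "x" after having seen both earlier operations.
readded = concurrent + [("add", "x", {"A": 2, "B": 1})]

print(contains(concurrent, "x"), contains(readded, "x"))
```

With a concurrent add and remove, the remove wins and the element is absent; once an add causally follows the remove, the element is present again.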

preprint · 2022 · arXiv

Verifying Transactional Consistency of MongoDB

MongoDB is a popular general-purpose, document-oriented, distributed NoSQL database. It supports transactions in three different deployments: single-document transactions utilizing the WiredTiger storage engine in a standalone node, multi-document transactions in a replica set which consists of a primary node and several secondary nodes, and distributed transactions in a sharded cluster which is a group of multiple replica sets, among which data is sharded. A natural and fundamental question about MongoDB transactions is: What transactional consistency guarantee do MongoDB transactions in each deployment provide? However, there is neither concise pseudocode of MongoDB transactions in each deployment nor a formal specification of the consistency guarantees that MongoDB claims to provide. In this work, we formally specify and verify the transactional consistency protocols of MongoDB. Specifically, we provide concise pseudocode for the transactional consistency protocols in each MongoDB deployment, namely WIREDTIGER, REPLICASET, and SHARDEDCLUSTER, based on the official documents and source code. We then prove that WIREDTIGER, REPLICASET, and SHARDEDCLUSTER satisfy different variants of snapshot isolation, namely Strong-SI, Realtime-SI, and Session-SI, respectively. We also propose and evaluate efficient white-box checking algorithms for MongoDB transaction protocols against their consistency guarantees, effectively circumventing the NP-hard obstacle in theory.

preprint · 2022 · arXiv

ViSearch: Weak Consistency Measurement for Replicated Data Types

Large-scale replicated data type stores often resort to eventual consistency to guarantee low latency and high availability. It is widely accepted that programming over eventually consistent data stores is challenging, since arbitrary divergence among replicas is allowed. Moreover, pragmatic protocols actually achieve consistency guarantees stronger than eventual consistency, which can and should be exploited to facilitate reasoning about and programming over replicated data types. To address these challenges, we propose the ViSearch framework for precise measurement of eventual consistency semantics. ViSearch employs the visibility-arbitration specification methodology in concurrent programming, which extends the linearizability-based specification methodology with a dynamic visibility relation among operations, in addition to the standard dynamic happens-before and linearization relations. The consistency measurement using ViSearch is NP-hard in general. To enable practical and efficient consistency measurement in replicated data type stores, the ViSearch framework refactors the existing brute-force checking algorithm to a generic algorithm skeleton, which further enables efficient pruning of the search space and effective parallelization. We employ the ViSearch framework for consistency measurement in two replicated data type stores, Riak and CRDT-Redis. The experimental evaluation shows the usefulness and cost-effectiveness of consistency measurement based on the ViSearch framework in realistic scenarios.

preprint · 2020 · arXiv

Fine-grained Analysis on Fast Implementations of Distributed Multi-writer Atomic Registers

Distributed multi-writer atomic registers are at the heart of a large number of distributed algorithms. While enjoying the benefits of atomicity, researchers further explore fast implementations of atomic registers which are optimal in terms of data access latency. Though it has been proved that multi-writer atomic register implementations are impossible when both read and write are required to be fast, it is still open whether implementations are impossible when only write or read is required to be fast. This work proves the impossibility of fast write implementations based on a series of chain arguments among indistinguishable executions. We also show the necessary and sufficient condition for fast read implementations by extending the results in the single-writer case. This work concludes a series of studies on fast implementations of distributed atomic registers.

preprint · 2015 · arXiv

Almost Strong Consistency: "Good Enough" in Distributed Storage Systems

A consistency/latency tradeoff arises as soon as a distributed storage system replicates data. For low latency, modern storage systems often settle for weak consistency conditions, which provide little or, even worse, no guarantee for data consistency. In this paper we propose the notion of almost strong consistency as a better-balanced option for the consistency/latency tradeoff. It provides both deterministically bounded staleness of data versions for each read and probabilistic quantification on the rate of "reading stale values", while achieving low latency. In the context of distributed storage systems, we investigate almost strong consistency in terms of 2-atomicity. Our 2AM (2-Atomicity Maintenance) algorithm completes both reads and writes in one communication round-trip, and guarantees that each read obtains a value within the latest 2 versions. To quantify the rate of "reading stale values", we decompose the so-called "old-new inversion" phenomenon into concurrency patterns and read-write patterns, and propose a stochastic queueing model and a "timed balls-into-bins model" to analyze them, respectively. The theoretical analysis not only demonstrates that "old-new inversions" rarely occur as expected, but also reveals that the read-write pattern dominates in guaranteeing such rare data inconsistencies. These are further confirmed by the experimental results, showing that 2-atomicity is "good enough" in distributed storage systems by achieving low latency, bounded staleness, and rare data inconsistencies.

preprint · 2013 · arXiv

Verifying PRAM Consistency over Read/Write Traces of Data Replicas

Data replication technologies enable efficient and highly-available data access, and are thus gaining increasing interest in both academia and industry. However, data replication introduces the problem of data consistency. Modern commercial data replication systems often provide weak consistency for high availability under certain failure scenarios. An important weak consistency model is Pipelined-RAM (PRAM) consistency. It allows different processes to hold different views of data. To determine whether a data replication system indeed provides PRAM consistency, we study the problem of Verifying PRAM Consistency over read/write traces (or VPC, for short). We first identify four variants of VPC according to a) whether there are Multiple shared variables (or one Single variable), and b) whether write operations can assign Duplicate values (or only Unique values) for each shared variable; the four variants are labeled VPC-SU, VPC-MU, VPC-SD, and VPC-MD. Second, we present a simple VPC-MU algorithm, called RW-CLOSURE. It constructs an operation graph $\mathcal{G}$ by iteratively adding edges according to three rules. Its time complexity is $O(n^5)$, where $n$ is the number of operations in the trace. Third, we present an improved VPC-MU algorithm, called READ-CENTRIC, with time complexity $O(n^4)$. Basically it attempts to construct the operation graph $\mathcal{G}$ in an incremental and efficient way. Its correctness is based on that of RW-CLOSURE. Finally, we prove that VPC-SD (and so is VPC-MD) is $\sf{NP}$-complete by reducing the strongly $\sf{NP}$-complete problem 3-PARTITION to it.
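To make the VPC problem concrete, the following sketch checks PRAM consistency by brute force on very small traces. It is neither RW-CLOSURE nor READ-CENTRIC: it simply enumerates, for each process, every ordering of that process's operations together with all writes, and accepts if one ordering respects program order and makes each read return the latest preceding write. The trace encoding, the `None` initial value, and the function names are assumptions for illustration, and the running time is exponential.

```python
from itertools import permutations


def respects_program_order(seq):
    """Operations of the same process must appear in their original order."""
    last_index = {}
    for proc, idx, _kind, _var, _val in seq:
        if idx < last_index.get(proc, -1):
            return False
        last_index[proc] = idx
    return True


def legal(seq):
    """Each read returns the value of the latest preceding write (None if there is none)."""
    current = {}
    for _proc, _idx, kind, var, val in seq:
        if kind == "w":
            current[var] = val
        elif current.get(var) != val:
            return False
    return True


def is_pram(trace):
    """Brute-force PRAM check; `trace` maps a process name to its list of (kind, var, value)."""
    ops = [
        (proc, idx, kind, var, val)
        for proc, history in trace.items()
        for idx, (kind, var, val) in enumerate(history)
    ]
    for proc in trace:
        # The view of `proc`: all of its own operations plus every write in the trace.
        view = [op for op in ops if op[0] == proc or op[2] == "w"]
        if not any(
            respects_program_order(seq) and legal(seq)
            for seq in permutations(view)
        ):
            return False
    return True


# Two readers observe two independent writes in opposite orders: allowed under PRAM.
different_views = {
    "p1": [("w", "x", 1)],
    "p2": [("w", "x", 2)],
    "p3": [("r", "x", 1), ("r", "x", 2)],
    "p4": [("r", "x", 2), ("r", "x", 1)],
}

# A reader observes one process's writes against their program order: not allowed.
reordered_writes = {
    "p1": [("w", "x", 1), ("w", "x", 2)],
    "p2": [("r", "x", 2), ("r", "x", 1)],
}

print(is_pram(different_views), is_pram(reordered_writes))
```

The first trace shows processes holding different views of the data, which PRAM permits; the second fails because no ordering can place the reads consistently with the single writer's program order.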
