Researcher profile

Pierre Sutra

Pierre Sutra contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
6works
0followers
1topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

6 published item(s)

preprint2026arXiv

Revisiting Lower Bounds for Two-Step Consensus

A seminal result by Lamport shows that at least $\max\{2e+f+1,2f+1\}$ processes are required to implement partially synchronous consensus that tolerates $f$ process failures and can furthermore decide in two message delays under $e$ failures. This lower bound is matched by the classical Fast Paxos protocol. However, more recent practical protocols, such as Egalitarian Paxos, provide two-step decisions with fewer processes, seemingly contradicting the lower bound. We show that this discrepancy arises because the classical bound requires two-step decisions under a wide range of scenarios, not all of which are relevant in practice. We propose a more pragmatic condition for which we establish tight bounds on the number of processes required. Interestingly, these bounds depend on whether consensus is implemented as an atomic object or a decision task. For consensus as an object, $\max\{2e+f-1,2f+1\}$ processes are necessary and sufficient for two-step decisions, while for a task the tight bound is $\max\{2e+f, 2f+1\}$.

preprint2022arXiv

The Weakest Failure Detector for Genuine Atomic Multicast (Extended Version)

Atomic broadcast is a group communication primitive to order messages across a set of distributed processes. Atomic multicast is its natural generalization where each message $m$ is addressed to $dst(m)$, a subset of the processes called its destination group. A solution to atomic multicast is genuine when a process takes steps only if a message is addressed to it. Genuine solutions are the ones used in practice because they have better performance. Let $G$ be all the destination groups and $F$ be the cyclic families in it, that is the subsets of $G$ whose intersection graph is hamiltonian. This paper establishes that the weakest failure detector to solve genuine atomic multicast is $μ=(\wedge_{g,h \in G}~Σ_{g \cap h}) \wedge (\wedge_{g \in G}~Ω_g) \wedge γ$, where (i) $Σ_P$ and $Ω_P$ are the quorum and leader failure detectors restricted to the processes in $P$, and (ii) $γ$ is a new failure detector that informs the processes in a cyclic family $f \in F$ when $f$ is faulty. We also study two classical variations of atomic multicast. The first variation requires that message delivery follows the real-time order. In this case, $μ$ must be strengthened with $1^{g \cap h}$, the indicator failure detector that informs each process in $g \cup h$ when $g \cap h$ is faulty. The second variation requires a message to be delivered when the destination group runs in isolation. We prove that its weakest failure detector is at least $μ\wedge (\wedge_{g, h \in G}~Ω_{g \cap h})$. This value is attained when $F=\varnothing$.

preprint2020arXiv

Leaderless State-Machine Replication: Specification, Properties, Limits (Extended Version)

Modern Internet services commonly replicate critical data across several geographical locations using state-machine replication (SMR). Due to their reliance on a leader replica, classical SMR protocols offer limited scalability and availability in this setting. To solve this problem, recent protocols follow instead a leaderless approach, in which each replica is able to make progress using a quorum of its peers. In this paper, we study this new emerging class of SMR protocols and states some of their limits. We first propose a framework that captures the essence of leaderless state-machine replication (Leaderless SMR). Then, we introduce a set of desirable properties for these protocols: (R)eliability, (O)ptimal (L)atency and (L)oad Balancing. We show that protocols matching all of the ROLL properties are subject to a trade-off between performance and reliability. We also establish a lower bound on the message delay to execute a command in protocols optimal for the ROLL properties. This lower bound explains the persistent chaining effect observed in experimental results.

preprint2020arXiv

State-Machine Replication for Planet-Scale Systems (Extended Version)

Online applications now routinely replicate their data at multiple sites around the world. In this paper we present Atlas, the first state-machine replication protocol tailored for such planet-scale systems. Atlas does not rely on a distinguished leader, so clients enjoy the same quality of service independently of their geographical locations. Furthermore, client-perceived latency improves as we add sites closer to clients. To achieve this, Atlas minimizes the size of its quorums using an observation that concurrent data center failures are rare. It also processes a high percentage of accesses in a single round trip, even when these conflict. We experimentally demonstrate that Atlas consistently outperforms state-of-the-art protocols in planet-scale scenarios. In particular, Atlas is up to two times faster than Flexible Paxos with identical failure assumptions, and more than doubles the performance of Egalitarian Paxos in the YCSB benchmark.

preprint2014arXiv

A Practical Distributed Universal Construction with Unknown Participants

Modern distributed systems employ atomic read-modify-write primitives to coordinate concurrent operations. Such primitives are typically built on top of a central server, or rely on an agreement protocol. Both approaches provide a universal construction, that is, a general mechanism to construct atomic and responsive objects. These two techniques are however known to be inherently costly. As a consequence, they may result in bottlenecks in applications using them for coordination. In this paper, we investigate another direction to implement a universal construction. Our idea is to delegate the implementation of the universal construction to the clients, and solely implement a distributed shared atomic memory at the servers side. The construction we propose is obstruction-free. It can be implemented in a purely asynchronous manner, and it does not assume the knowledge of the participants. It is built on top of grafarius and racing objects, two novel shared abstractions that we introduce in detail. To assess the benefits of our approach, we present a prototype implementation on top of the Cassandra data store, and compare it empirically to the Zookeeper coordination service.

preprint2013arXiv

Non-Monotonic Snapshot Isolation

Many distributed applications require transactions. However, transactional protocols that require strong synchronization are costly in large scale environments. Two properties help with scalability of a transactional system: genuine partial replication (GPR), which leverages the intrinsic parallelism of a workload, and snapshot isolation (SI), which decreases the need for synchronization. We show that, under standard assumptions (data store accesses are not known in advance, and transactions may access arbitrary objects in the data store), it is impossible to have both SI and GPR. To circumvent this impossibility, we propose a weaker consistency criterion, called Non-monotonic Snapshot Isolation (NMSI). NMSI retains the most important properties of SI, i.e., read-only transactions always commit, and two write-conflicting updates do not both commit. We present a GPR protocol that ensures NMSI, and has lower message cost (i.e., it contacts fewer replicas and/or commits faster) than previous approaches.