Researcher profile

Peter Pietzuch

Peter Pietzuch contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
7works
0followers
6topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

7 published item(s)

preprint2022arXiv

CAP-VMs: Capability-Based Isolation and Sharing for Microservices

Cloud stacks must isolate application components, while permitting efficient data sharing between components deployed on the same physical host. Traditionally, the MMU enforces isolation and permits sharing at page granularity. MMU approaches, however, lead to cloud stacks with large TCBs in kernel space, and page granularity requires inefficient OS interfaces for data sharing. Forthcoming CPUs with hardware support for memory capabilities offer new opportunities to implement isolation and sharing at a finer granularity. We describe cVMs, a new VM-like abstraction that uses memory capabilities to isolate application components while supporting efficient data sharing, all without mandating application code to be capability-aware. cVMs share a single virtual address space safely, each having only capabilities to access its own memory. A cVM may include a library OS, thus minimizing its dependency on the cloud environment. cVMs efficiently exchange data through two capability-based primitives assisted by a small trusted monitor: (i) an asynchronous read-write interface to buffers shared between cVMs; and (ii) a call interface to transfer control between cVMs. Using these two primitives, we build more expressive mechanisms for efficient cross-cVM communication. Our prototype implementation using CHERI RISC-V capabilities shows that cVMs isolate services (Redis and Python) with low overhead while improving data sharing.

preprint2022arXiv

CTR: Checkpoint, Transfer, and Restore for Secure Enclaves

Hardware-based Trusted Execution Environments (TEEs) are becoming increasingly prevalent in cloud computing, forming the basis for confidential computing. However, the security goals of TEEs sometimes conflict with existing cloud functionality, such as VM or process migration, because TEE memory cannot be read by the hypervisor, OS, or other software on the platform. Whilst some newer TEE architectures support migration of entire protected VMs, there is currently no practical solution for migrating individual processes containing in-process TEEs. The inability to migrate such processes leads to operational inefficiencies or even data loss if the host platform must be urgently restarted. We present CTR, a software-only design to retrofit migration functionality into existing TEE architectures, whilst maintaining their expected security guarantees. Our design allows TEEs to be interrupted and migrated at arbitrary points in their execution, thus maintaining compatibility with existing VM and process migration techniques. By cooperatively involving the TEE in the migration process, our design also allows application developers to specify stateful migration-related policies, such as limiting the number of times a particular TEE may be migrated. Our prototype implementation for Intel SGX demonstrates that migration latency increases linearly with the size of the TEE memory and is dominated by TEE system operations.

preprint2022arXiv

Dropbear: Machine Learning Marketplaces made Trustworthy with Byzantine Model Agreement

Marketplaces for machine learning (ML) models are emerging as a way for organizations to monetize models. They allow model owners to retain control over hosted models by using cloud resources to execute ML inference requests for a fee, preserving model confidentiality. Clients that rely on hosted models require trustworthy inference results, even when models are managed by third parties. While the resilience and robustness of inference results can be improved by combining multiple independent models, such support is unavailable in today's marketplaces. We describe Dropbear, the first ML model marketplace that provides clients with strong integrity guarantees by combining results from multiple models in a trustworthy fashion. Dropbear replicates inference computation across a model group, which consists of multiple cloud-based GPU nodes belonging to different model owners. Clients receive inference certificates that prove agreement using a Byzantine consensus protocol, even under model heterogeneity and concurrent model updates. To improve performance, Dropbear batches inference and consensus operations separately: it first performs the inference computation across a model group, before ordering requests and model updates. Despite its strong integrity guarantees, Dropbear's performance matches that of state-of-the-art ML inference systems: deployed across 3 cloud sites, it handles 800 requests/s with ImageNet models.

preprint2022arXiv

IA-CCF: Individual Accountability for Permissioned Ledgers

Permissioned ledger systems allow a consortium of members that do not trust one another to execute transactions safely on a set of replicas. Such systems typically use Byzantine fault tolerance (BFT) protocols to distribute trust, which only ensures safety when fewer than 1/3 of the replicas misbehave. Providing guarantees beyond this threshold is a challenge: current systems assume that the ledger is corrupt and fail to identify misbehaving replicas or hold the members that operate them accountable -- instead all members share the blame. We describe IA-CCF, a new permissioned ledger system that provides individual accountability. It can assign blame to the individual members that operate misbehaving replicas regardless of the number of misbehaving replicas or members. IA-CCF achieves this by signing and logging BFT protocol messages in the ledger, and by using Merkle trees to provide clients with succinct, universally-verifiable receipts as evidence of successful transaction execution. Anyone can audit the ledger against a set of receipts to discover inconsistencies and identify replicas that signed contradictory statements. IA-CCF also supports changes to consortium membership and replicas by tracking signing keys using a sub-ledger of governance transactions. IA-CCF provides strong disincentives to misbehavior with low overhead: it executes 47,000 tx/s while providing clients with receipts in two network round trips.

preprint2020arXiv

Faasm: Lightweight Isolation for Efficient Stateful Serverless Computing

Serverless computing is an excellent fit for big data processing because it can scale quickly and cheaply to thousands of parallel functions. Existing serverless platforms isolate functions in ephemeral, stateless containers, preventing them from directly sharing memory. This forces users to duplicate and serialise data repeatedly, adding unnecessary performance and resource costs. We believe that a new lightweight isolation approach is needed, which supports sharing memory directly between functions and reduces resource overheads. We introduce Faaslets, a new isolation abstraction for high-performance serverless computing. Faaslets isolate the memory of executed functions using software-fault isolation (SFI), as provided by WebAssembly, while allowing memory regions to be shared between functions in the same address space. Faaslets can thus avoid expensive data movement when functions are co-located on the same machine. Our runtime for Faaslets, Faasm, isolates other resources, e.g. CPU and network, using standard Linux cgroups, and provides a low-level POSIX host interface for networking, file system access and dynamic loading. To reduce initialisation times, Faasm restores Faaslets from already-initialised snapshots. We compare Faasm to a standard container-based platform and show that, when training a machine learning model, it achieves a 2x speed-up with 10x less memory; for serving machine learning inference, Faasm doubles the throughput and reduces tail latency by 90%.

preprint2020arXiv

SGX-LKL: Securing the Host OS Interface for Trusted Execution

Hardware support for trusted execution in modern CPUs enables tenants to shield their data processing workloads in otherwise untrusted cloud environments. Runtime systems for the trusted execution must rely on an interface to the untrusted host OS to use external resources such as storage, network, and other functions. Attackers may exploit this interface to leak data or corrupt the computation. We describe SGX-LKL, a system for running Linux binaries inside of Intel SGX enclaves that only exposes a minimal, protected and oblivious host interface: the interface is (i) minimal because SGX-LKL uses a complete library OS inside the enclave, including file system and network stacks, which requires a host interface with only 7 calls; (ii) protected because SGX-LKL transparently encrypts and integrity-protects all data passed via low-level I/O operations; and (iii) oblivious because SGX-LKL performs host operations independently of the application workload. For oblivious disk I/O, SGX-LKL uses an encrypted ext4 file system with shuffled disk blocks. We show that SGX-LKL protects TensorFlow training with a 21% overhead.

preprint2020arXiv

The EuroSys 2020 Online Conference: Experience and lessons learned

The 15th European Conference on Computer Systems (EuroSys'20) was organized as a virtual (online) conference on April 27-30, 2020. The main EuroSys'20 track took place April 28-30, 2020, preceded by five workshops (EdgeSys'20, EuroDW'20, EuroSec'20, PaPoC'20, SPMA'20) on April 27, 2020. The decision to hold a virtual (online) conference was taken in early April 2020, after consultations with the EuroSys community and internal discussions about potential options, eventually allowing about three weeks for the organization. This paper describes the choices we made to organize EuroSys'20 as a virtual (online) conference, the challenges we addressed, and the lessons learned.