Researcher profile

David Bermbach

David Bermbach contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
17 works
0 followers
7 topics
4 close collaborators

Published work

17 published items

preprint | 2026 | arXiv

FaaSMoE: A Serverless Framework for Multi-Tenant Mixture-of-Experts Serving

Mixture-of-Experts (MoE) models offer high capacity with efficient inference cost by activating a small subset of expert models per input. However, deploying MoE models requires all experts to reside in memory, creating a gap between the resource used by activated experts and the provisioned resources. This underutilization is further pronounced in multi-tenant scenarios. In this paper, we propose FaaSMoE, a multi-tenant MoE serving architecture built on Function-as-a-Service (FaaS) platforms. FaaSMoE decouples the control and execution planes of MoE by deploying experts as stateless FaaS functions, enabling on-demand and scale-to-zero expert invocation across tenants. FaaSMoE further supports configurable expert granularity within functions, trading off per-expert elasticity for reduced invocation overhead. We implement a prototype with an open-source edge-oriented FaaS platform and evaluate it using Qwen1.5-moe-2.7B under multi-tenant workloads. Compared to a full-model baseline, FaaSMoE uses less than one third of the resources, demonstrating a practical and resource-efficient path towards scalable MoE serving in a multi-tenant environment.

preprint | 2026 | arXiv

FLAM: Evaluating Model Performance with Aggregatable Measures in Federated Learning

Performance evaluation is essential for assessing the quality of machine learning (ML) models and guiding deployment decisions. In federated learning (FL), assessing the performance is challenging because data are distributed across participants. Consequently, the coordinator must rely on locally computed evaluation metrics and aggregate them to assess the global model. A key challenge is that common aggregation strategies, such as weighted averaging based on the local samples per participant, do not always produce the same results as centralized evaluation. Existing definitions of performance evaluation are largely tailored to accuracy and do not generalize to other metrics, leading to inconsistencies between participant-based and centralized evaluation. However, such discrepancies are inconsistent with the FL objective and lead to a wrong calculation of the metric. To address this issue, we examine the underlying reasons for these discrepancies and propose FLAM, a performance evaluation method based on aggregatable measures that yields the same results as centralized evaluation without the need for a global test dataset.
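
The discrepancy described in this abstract can be illustrated with a small sketch. The participant counts below are made up for illustration, and summing confusion-matrix counts is only one assumed reading of "aggregatable measures"; it is not taken from the paper and should not be read as FLAM's actual method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Confusion-matrix counts computed locally by one participant."""
    tp: int
    fp: int
    fn: int
    tn: int

    @property
    def n(self) -> int:
        return self.tp + self.fp + self.fn + self.tn


def precision(c: Counts) -> float:
    denom = c.tp + c.fp
    return c.tp / denom if denom else 0.0


def weighted_average_precision(parts: list[Counts]) -> float:
    """Average the local precision values, weighted by local sample count."""
    total = sum(c.n for c in parts)
    return sum(c.n * precision(c) for c in parts) / total


def aggregated_precision(parts: list[Counts]) -> float:
    """Sum the raw counts first, then compute precision once."""
    merged = Counts(
        tp=sum(c.tp for c in parts),
        fp=sum(c.fp for c in parts),
        fn=sum(c.fn for c in parts),
        tn=sum(c.tn for c in parts),
    )
    return precision(merged)


# Hypothetical participants (illustrative numbers only).
participants = [
    Counts(tp=8, fp=2, fn=0, tn=90),   # 100 samples, local precision 0.8
    Counts(tp=1, fp=9, fn=0, tn=10),   # 20 samples, local precision 0.1
]

if __name__ == "__main__":
    print(weighted_average_precision(participants))
    print(aggregated_precision(participants))
```

With these made-up counts, the sample-weighted average of local precision is about 0.68, while precision computed from the summed counts is 0.45, which is the value an evaluation on the pooled data would give.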

preprint | 2026 | arXiv

Konflux: Optimized Function Fusion for Serverless Applications

Function-as-a-Service (FaaS) has become a central paradigm in serverless cloud computing, yet optimizing FaaS deployments remains challenging. Using function fusion, multiple functions can be combined into a single deployment unit, which can be used to reduce cost and latency of complex serverless applications comprising multiple functions. Even in small-scale applications, the number of possible fusion configurations is vast, making brute-force benchmarking in production both cost- and time-prohibitive. In this paper, we present a system that can analyze every possible fusion setup of complex applications. By emulating the FaaS platform, our system enables local experimentation, eliminating the need to reconfigure the live platform and significantly reducing associated cost and time. We evaluate all fusion configurations across a number of example FaaS applications and resource limits. Our results reveal that, when analyzing cost and latency trade-offs, only a limited set of fusion configurations represent optimal solutions, which are strongly influenced by the specific pricing model in use.
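
To give a feel for why the configuration space grows so quickly, here is a minimal sketch. It assumes that a fusion configuration is any grouping of the application's functions into deployment units (a set partition), which may be broader than what Konflux actually considers, and it uses a plain Pareto filter over hypothetical (cost, latency) measurements. All names and numbers are illustrative and not taken from the paper.

```python
from typing import Iterator


def partitions(items: list[str]) -> Iterator[list[list[str]]]:
    """Yield every way to group `items` into non-empty deployment units."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for p in partitions(rest):
        # Fuse `first` into each existing group in turn ...
        for i in range(len(p)):
            yield p[:i] + [[first] + p[i]] + p[i + 1:]
        # ... or deploy it as its own unit.
        yield [[first]] + p


def pareto_front(points: dict[str, tuple[float, float]]) -> list[str]:
    """Return names of configurations not dominated in (cost, latency)."""
    front = []
    for name, (cost, latency) in points.items():
        dominated = any(
            c <= cost and l <= latency and (c < cost or l < latency)
            for other, (c, l) in points.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return sorted(front)


if __name__ == "__main__":
    # Four functions already allow 15 groupings under this assumption.
    print(len(list(partitions(["a", "b", "c", "d"]))))
    # Hypothetical measurements: configuration -> (cost, latency in ms).
    measured = {"A": (1.0, 300.0), "B": (2.0, 200.0),
                "C": (2.5, 250.0), "D": (3.0, 100.0)}
    print(pareto_front(measured))
```

Under the set-partition assumption the count follows the Bell numbers, so it grows much faster than the number of functions. The Pareto filter keeps only configurations for which no other configuration is at least as good on both axes and strictly better on one.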

preprint | 2022 | arXiv

Celestial: Virtual Software System Testbeds for the LEO Edge

As private space companies such as SpaceX and Telesat are building large LEO satellite constellations to provide global broadband Internet access, researchers have proposed to embed compute services within satellite constellations to provide computing services on the LEO edge. While the LEO edge is merely theoretical at the moment, providers are expected to rapidly develop their satellite technologies to keep the upper hand in the new space race. In this paper, we answer the question of how researchers can explore the possibilities of LEO edge computing and evaluate arbitrary software systems in an accurate runtime environment and with cost-efficient scalability. To that end, we present Celestial, a virtual testbed for the LEO edge based on microVMs. Celestial can efficiently emulate individual satellites and their movement as well as ground station servers with realistic network conditions and in an application-agnostic manner, which we show empirically. Additionally, we explore opportunities and implications of deploying a real-time remote sensing application on LEO edge infrastructure in a case study on Celestial.

preprint | 2022 | arXiv

Fusionize: Improving Serverless Application Performance through Feedback-Driven Function Fusion

Serverless computing increases developer productivity by removing operational concerns such as managing hardware or software runtimes. Developers, however, still need to partition their application into functions, which can be error-prone and adds complexity: Using a small function size where only the smallest logical unit of an application is inside a function maximizes flexibility and reusability. Yet, having small functions leads to invocation overheads, additional cold starts, and may increase cost due to busy waiting. In this paper we present Fusionize, a framework that removes these concerns from developers by automatically fusing the application code into a multi-function orchestration with varying function size. Developers only need to write the application code following a lightweight programming model and do not need to worry how the application is turned into functions. Our framework automatically fuses different parts of the application into functions and manages their interactions. Leveraging monitoring data, the framework optimizes the distribution of application parts to functions to optimize deployment goals such as end-to-end latency and cost. Using two example applications, we show that Fusionize can automatically and iteratively improve the deployment artifacts of the application.

preprint | 2022 | arXiv

Network Emulation in Large-Scale Virtual Edge Testbeds: A Note of Caution and the Way Forward

The growing research and industry interest in the Internet of Things and the edge computing paradigm has increased the need for cost-efficient virtual testbeds for large-scale distributed applications. Researchers, students, and practitioners need to test and evaluate the interplay of hundreds or thousands of real software components and services connected with a realistic edge network without access to physical infrastructure. While advances in virtualization technologies have enabled parts of this, network emulation as a crucial part in the development of edge testbeds is lagging behind: As we show in this paper, NetEm, the current state-of-the-art network emulation tooling included in the Linux kernel, imposes prohibitive scalability limits. We quantify these limits, investigate possible causes, and present a way forward for network emulation in large-scale virtual edge testbeds based on eBPFs.

preprint | 2022 | arXiv

QoS-Aware Resource Placement for LEO Satellite Edge Computing

With the advent of large LEO satellite communication networks to provide global broadband Internet access, interest in providing edge computing resources within LEO networks has emerged. The LEO Edge promises low-latency, high-bandwidth access to compute and storage resources for a global base of clients and IoT devices regardless of their geographical location. Current proposals assume compute resources or service replicas at every LEO satellite, which requires high upfront investments and can lead to over-provisioning. To implement and use the LEO Edge efficiently, methods for server and service placement are required that help select an optimal subset of satellites as server or service replica locations. In this paper, we show how the existing research on resource placement on a 2D torus can be applied to this problem by leveraging the unique topology of LEO satellite networks. Further, we extend the existing discrete resource placement methods to allow placement with QoS constraints. In simulation of proposed LEO satellite communication networks, we show how QoS depends on orbital parameters and that our proposed method can take these effects into account where the existing approach cannot.
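
As a rough illustration of the 2D torus view mentioned in this abstract, the sketch below assumes that satellites are addressed as (plane, slot) pairs and that each satellite links to its four grid neighbours, with both dimensions wrapping around. Using the worst-case hop count as a stand-in for a QoS bound is also an assumption made here; the paper's actual placement method and QoS model are not reproduced.

```python
def ring_distance(a: int, b: int, size: int) -> int:
    """Shortest distance between two positions on a ring of `size` nodes."""
    d = abs(a - b) % size
    return min(d, size - d)


def torus_hops(src: tuple[int, int], dst: tuple[int, int],
               planes: int, sats_per_plane: int) -> int:
    """Hop count between two (plane, slot) positions on a 2D torus."""
    return (ring_distance(src[0], dst[0], planes)
            + ring_distance(src[1], dst[1], sats_per_plane))


def worst_case_hops(servers: list[tuple[int, int]],
                    planes: int, sats_per_plane: int) -> int:
    """Largest hop count from any satellite to its nearest server."""
    return max(
        min(torus_hops((p, s), srv, planes, sats_per_plane) for srv in servers)
        for p in range(planes)
        for s in range(sats_per_plane)
    )


if __name__ == "__main__":
    # Hypothetical 4 x 4 constellation: one server vs. two servers.
    print(worst_case_hops([(0, 0)], 4, 4))
    print(worst_case_hops([(0, 0), (2, 2)], 4, 4))
```

A placement could then be accepted or rejected by comparing `worst_case_hops` against a hop budget, which is the kind of constraint a QoS-aware method would need to check.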

preprint | 2022 | arXiv

Streaming vs. Functions: A Cost Perspective on Cloud Event Processing

In cloud event processing, data generated at the edge is processed in real-time by cloud resources. Both distributed stream processing (DSP) and Function-as-a-Service (FaaS) have been proposed to implement such event processing applications. FaaS emphasizes fast development and easy operation, while DSP emphasizes efficient handling of large data volumes. Despite their architectural differences, both can be used to model and implement loosely-coupled job graphs. In this paper, we consider the selection of FaaS and DSP from a cost perspective. We implement stateless and stateful workflows from the Theodolite benchmarking suite using cloud FaaS and DSP. In an extensive evaluation, we show how application type, cloud service provider, and runtime environment can influence the cost of application deployments and derive decision guidelines for cloud engineers.

preprint | 2022 | arXiv

Towards Distributed Coordination for Fog Platforms

Distributed fog and edge applications communicate over unreliable networks and are subject to high communication delays. This makes using existing distributed coordination technologies from cloud applications infeasible, as they are built on the assumption of a highly reliable, low-latency datacenter network to achieve strict consistency with low overheads. To help implement configuration and state management for fog platforms and applications, we propose a novel decentralized approach that lets systems specify coordination strategies and membership for different sets of coordination data.

preprint | 2021 | arXiv

Edge (of the Earth) Replication: Optimizing Content Delivery in Large LEO Satellite Communication Networks

Large low earth orbit (LEO) satellite networks such as SpaceX's Starlink constellation promise to deliver low-latency, high-bandwidth Internet access with global coverage. As an alternative to terrestrial fiber as a global Internet backbone, they could potentially serve billions of Internet-connected devices. Currently, operators of CDNs exploit the hierarchical topology of the Internet to place points-of-presence near users, yet this approach is no longer possible when the topology changes to a single, wide-area, converged access and backhaul network. In this paper, we explore the opportunities of points-of-presence for CDNs within the satellite network itself, as it could provide better access latency for users while reducing operational costs for the satellite Internet service providers. We propose four strategies for selecting points-of-presence in satellite constellations that we evaluate through extensive simulation. In one case, we find that replicating web content within satellites can reduce bandwidth use in the constellation by 93% over an approach without replication in the network, while storing just 0.01% of all content in individual satellites.

preprint | 2021 | arXiv

Towards Grassroots Peering at the Edge

Fog Computing allows applications to address their latency and privacy requirements while coping with bandwidth limitations of Internet service providers (ISPs). Existing research on fog systems has so far mostly taken a very high-level view on the actual fog infrastructure. In this position paper, we identify and discuss the problem of having multiple ISPs in edge-to-edge communication. As a possible solution we propose that edge operators create direct edge-to-edge links in a grassroots fashion and discuss different implementation options. Based on this, we highlight some important open research challenges that result from this.

preprint | 2020 | arXiv

Benchmarking Web API Quality -- Revisited

Modern applications increasingly interact with web APIs: reusable components that are deployed and operated outside the application and accessed over the network. Their existence, arguably, spurs application innovations, making it easy to integrate data or functionalities. While previous work has analyzed the ecosystem of web APIs and their design, little is known about web API quality at runtime. This gap is critical, as qualities including availability, latency, or provider security preferences can severely impact applications and user experience. In this paper, we revisit a 3-month, geo-distributed benchmark of popular web APIs, originally performed in 2015. We repeat this benchmark in 2018 and compare results from these two benchmarks regarding availability and latency. We furthermore introduce new results from assessing provider security preferences, collected both in 2015 and 2018, and results from our attempts to reach out to API providers with the results from our 2015 experiments. Our extensive experiments show that web API qualities vary (1) based on the geo-distribution of clients, (2) during our individual experiments, and (3) between the two experiments. Our findings provide evidence to foster the discussion around web API quality, and can act as a basis for the creation of tools and approaches to mitigate quality issues.

preprint | 2020 | arXiv

FBase: A Replication Service for Data-Intensive Fog Applications

The combination of edge and cloud in the fog computing paradigm enables a new breed of data-intensive applications. These applications, however, have to face a number of fog-specific challenges which developers have to repetitively address for every single application. In this paper, we propose a replication service specifically tailored to the needs of data-intensive fog applications that aims to ease or eliminate challenges caused by the highly distributed and heterogeneous environment fog applications operate in. Furthermore, we present our prototypical proof-of-concept implementation FBase that we have made available as open source.

preprint | 2020 | arXiv

Fog Computing as Privacy Enabler

Despite broad discussions on privacy challenges arising from fog computing, the authors argue that privacy and security requirements might actually drive the adoption of fog computing. They present four patterns of fog computing fostering data privacy and the security of business secrets, complementing existing cryptographic approaches. Their practical application is illuminated on the basis of three case studies.

preprint | 2020 | arXiv

GeoBroker: Leveraging Geo-Contexts for IoT Data Distribution

In the Internet of Things, the relevance of data often depends on the geographic context of data producers and consumers. Today's data distribution services, however, mostly focus on data content and not on geo-context, which could help to reduce the dissemination of excess data in many IoT scenarios. In this paper, we propose to use the geo-context information associated with devices to control data distribution. We define what geo-context dimensions exist and compare our definition with concepts from related work. Furthermore, we designed GeoBroker, a data distribution service that uses the location of things, as well as geofences for messages and subscriptions, to control data distribution. This way, we enable new IoT application scenarios while also increasing overall system efficiency for scenarios where geo-contexts matter by delivering only relevant messages. We evaluate our approach based on a proof-of-concept prototype and several experiments.
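
The idea of filtering by geo-context can be sketched as follows. This is not GeoBroker's API: the class and function names are hypothetical, geofences are simplified to circles, topics are matched by exact equality, and the rule that both geofence checks must pass is an assumed reading of the abstract.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_KM = 6371.0


@dataclass(frozen=True)
class Location:
    lat: float
    lon: float


def haversine_km(a: Location, b: Location) -> float:
    """Great-circle distance between two locations in kilometres."""
    phi1, phi2 = math.radians(a.lat), math.radians(b.lat)
    dphi = phi2 - phi1
    dlmb = math.radians(b.lon - a.lon)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


@dataclass(frozen=True)
class CircleFence:
    """Simplified circular geofence (an assumption for this sketch)."""
    center: Location
    radius_km: float

    def contains(self, loc: Location) -> bool:
        return haversine_km(self.center, loc) <= self.radius_km


def should_deliver(msg_topic: str, sub_topic: str,
                   publisher_loc: Location, message_fence: CircleFence,
                   subscriber_loc: Location, subscription_fence: CircleFence) -> bool:
    """Deliver only if content and both geo-context checks match."""
    return (
        msg_topic == sub_topic
        and message_fence.contains(subscriber_loc)       # subscriber is where the message is relevant
        and subscription_fence.contains(publisher_loc)   # publisher is where the subscriber is interested
    )


if __name__ == "__main__":
    berlin = Location(52.52, 13.405)
    potsdam = Location(52.39, 13.06)
    munich = Location(48.137, 11.575)
    near_berlin = CircleFence(berlin, 50.0)
    print(should_deliver("parking", "parking", berlin, near_berlin,
                         potsdam, CircleFence(potsdam, 50.0)))
    print(should_deliver("parking", "parking", berlin, near_berlin,
                         munich, CircleFence(munich, 50.0)))
```

In this sketch a subscriber in Potsdam receives a message published in Berlin, while a subscriber in Munich does not, even though both subscribed to the same topic.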

preprint | 2020 | arXiv

SimRa: Using Crowdsourcing to Identify Near Miss Hotspots in Bicycle Traffic

An increased modal share of bicycle traffic is a key mechanism to reduce emissions and solve traffic-related problems. However, a lack of (perceived) safety keeps people from using their bikes more frequently. To improve safety in bicycle traffic, city planners need an overview of accidents, near miss incidents, and bike routes. Such information, however, is currently not available. In this paper, we describe SimRa, a platform for collecting data on bicycle routes and near miss incidents using smartphone-based crowdsourcing. We also describe how we identify dangerous near miss hotspots based on the collected data and propose a scoring model.

preprint | 2020 | arXiv

Towards Auction-Based Function Placement in Serverless Fog Platforms

The Function-as-a-Service (FaaS) paradigm has a lot of potential as a computing model for fog environments comprising both cloud and edge nodes. When the request rate exceeds capacity limits at the edge, some functions need to be offloaded from the edge towards the cloud. In this position paper, we propose an auction-based approach in which application developers bid on resources. This allows fog nodes to make a local decision about which functions to offload while maximizing revenue. For a first evaluation of our approach, we use simulation.
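
A minimal sketch of the local offloading decision might look like this. The abstract does not specify the auction mechanism, so the greedy rule below (keep the highest bids, offload the rest) is an assumption, as is the simplification that every function needs exactly one capacity slot. Function names and bid values are hypothetical.

```python
def select_functions(bids: dict[str, float],
                     capacity: int) -> tuple[list[str], list[str]]:
    """Split functions into those kept at the edge node and those offloaded.

    bids: function name -> price the developer bids for edge execution.
    capacity: number of functions the node can host (one slot each, assumed).
    """
    # Highest bid first; ties are broken by name so the result is deterministic.
    ranked = sorted(bids, key=lambda name: (-bids[name], name))
    return ranked[:capacity], ranked[capacity:]


def revenue(bids: dict[str, float], kept: list[str]) -> float:
    """Revenue of the node if every kept function pays its own bid."""
    return sum(bids[name] for name in kept)


if __name__ == "__main__":
    # Hypothetical bids for illustration only.
    example_bids = {"resize": 0.5, "ocr": 0.9, "log": 0.1}
    kept, offloaded = select_functions(example_bids, capacity=2)
    print(kept, offloaded, revenue(example_bids, kept))
```

With equal resource demand per function, keeping the highest bids maximizes the node's revenue under this pay-your-bid assumption; with heterogeneous demands the decision would become a knapsack-style problem instead.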