Researcher profile

Simon Kassing

Simon Kassing contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 13 - Unverified
Verification L1
Unclaimed author
2 works
0 followers
2 topics
4 close collaborators

Published work

2 published items

Preprint · 2022 · arXiv

New primitives for bounded degradation in network service

Certain new ascendant data center workloads can absorb some degradation in network service, not needing fully reliable data transport and/or their fair share of network bandwidth. This opens up opportunities for superior network and infrastructure multiplexing by having this flexible traffic cede capacity under congestion to regular traffic with stricter needs. We posit there is opportunity in network service primitives which permit degradation within certain bounds, such that flexible traffic still receives an acceptable level of service, while regular traffic benefits from the flexible traffic's weaker requirements. We propose two primitives, namely guaranteed partial delivery and bounded deprioritization. We design a budgeting algorithm to provide guarantees to flexible flows relative to their fair share, which is measured via probing. The requirement of budgeting and probing limits the algorithm's applicability to large flexible flows. We evaluate our algorithm with large flexible flows and for three workloads of regular flows: small size, large size, and a distribution of sizes. Across the workloads, our algorithm achieves less speed-up of regular flows than fixed prioritization, especially for the small flows workload (1.25x vs. 6.82x at the 99th percentile). Our algorithm provides better guarantees in the workload with large regular flows (with 14.5% vs. 32.5% of flexible flows being slowed down beyond their guarantee). However, it provides not much better or even slightly worse guarantees for the other two workloads. The ability to enforce guarantees is influenced by flow fair share interdependence, measurement inaccuracies and dependency on convergence. We observe that priority changes to probe or to deprioritize cause queue shifts which deteriorate guarantees and limit possible speed-up, especially of small flows. We find that mechanisms to both prioritize traffic and track guarantees should be as non-disruptive as possible.

Preprint · 2022 · arXiv

Resource Allocation in Serverless Query Processing

Data lakes hold a growing amount of cold data that is infrequently accessed, yet requires interactive response times. Serverless functions are seen as a way to address this use case since they offer an appealing alternative to maintaining (and paying for) a fixed infrastructure. Recent research has analyzed the potential of serverless for data processing. In this paper, we expand on such work by looking into the question of serverless resource allocation to data processing tasks (number and size of the functions). We formulate a general model to roughly estimate completion time and financial cost, which we apply to augment an existing serverless data processing system with an advisory tool that automatically identifies configurations striking a good balance, which we define as being close to the "knee" of their Pareto frontier. The model takes into account key aspects of serverless: start-up, computation, network transfers, and overhead as a function of the input sizes and intermediate result exchanges. Using (micro)benchmarks and parts of TPC-H, we show that this advisor is capable of pinpointing configurations desirable to the user. Moreover, we identify and discuss several aspects of data processing on serverless affecting efficiency. By using an automated tool to configure the resources, the barrier to using serverless for data processing is lowered, and the narrow window where it is cost effective can be expanded by using a better allocation instead of having to over-provision the design.
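
To make the "knee of the Pareto frontier" idea in the abstract above concrete, here is a minimal sketch in Python. It is not the paper's advisor or its cost model: the configuration labels, times and costs are made-up illustrative values, and the knee rule used here (the frontier point farthest below the straight line joining the two normalized extremes) is one common heuristic chosen as an assumption, since the abstract does not state how the knee is computed.

```python
from typing import List, Tuple

# (label, estimated completion time, estimated cost); all values are hypothetical.
Config = Tuple[str, float, float]

CONFIGS: List[Config] = [
    ("4 x 512MB", 40.0, 1.0),
    ("8 x 1GB", 12.0, 1.5),
    ("8 x 2GB", 14.0, 3.0),   # dominated: slower and pricier than "8 x 1GB"
    ("16 x 1GB", 10.0, 4.0),
    ("32 x 2GB", 9.0, 9.0),
]


def pareto_front(configs: List[Config]) -> List[Config]:
    """Keep configurations that no other one beats on both time and cost.

    The result is ordered from fastest (most expensive) to slowest (cheapest).
    """
    front: List[Config] = []
    best_cost = float("inf")
    for cfg in sorted(configs, key=lambda c: (c[1], c[2])):
        if cfg[2] < best_cost:  # strictly cheaper than every faster option
            front.append(cfg)
            best_cost = cfg[2]
    return front


def knee(front: List[Config]) -> Config:
    """Pick the frontier point farthest below the line joining the extremes.

    Both axes are normalized to [0, 1] so that time and cost are comparable.
    This heuristic is an assumption made for illustration.
    """
    if len(front) < 3:
        return front[0]
    t_min, c_max = front[0][1], front[0][2]
    t_max, c_min = front[-1][1], front[-1][2]

    def depth(cfg: Config) -> float:
        x = (cfg[1] - t_min) / (t_max - t_min)
        y = (cfg[2] - c_min) / (c_max - c_min)
        # The extremes sit at (0, 1) and (1, 0), on the line x + y = 1.
        return 1.0 - x - y

    return max(front, key=depth)


if __name__ == "__main__":
    front = pareto_front(CONFIGS)
    print("Pareto front:", [c[0] for c in front])
    print("Knee:", knee(front)[0])
```

With these sample values, the dominated configuration is dropped and the knee lands on the option that is nearly as fast as the most expensive one at a fraction of its cost. In a real advisor the time and cost inputs would come from a model of start-up, computation, network transfers and overhead, as the abstract describes.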