Source author record

Roy Friedman

Roy Friedman appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Distributed, Parallel, and Cluster Computing; Operating Systems; Data Structures and Algorithms; Networking and Internet Architecture

Catalog footprint

What is connected

9 works

4 topics

4 close collaborators


Preprint · 2022 · arXiv

Limited Associativity Caching in the Data Plane

In-network caching promises to improve the performance of networked and edge applications by shortening the paths data need to travel. It does so by storing so-called hot items in the network switches en route between the clients that access the data and the storage servers that maintain it. Since the data flows through those switches in any case, it is natural to cache hot items there. Most software-managed caches treat the cache as a fully associative region. Alas, a fully associative design seems to be at odds with programmable switches' goal of handling packets in a short, bounded amount of time, as well as with their restricted programming model. In this work, we present PKache, a generic limited-associativity cache implementation in P4, the domain-specific language of programmable switches, and demonstrate its utility by realizing multiple popular cache management schemes.
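To make the idea of limited associativity concrete, here is a minimal Python sketch. It is not PKache and not P4; the class name, the LRU policy inside each set, and the hash-based set selection are all assumptions chosen for illustration. Each key maps to exactly one small set, so a lookup or replacement touches at most `ways` entries, which is what bounds the per-access work.

```python
from collections import OrderedDict


class SetAssociativeCache:
    """Hypothetical k-way set-associative cache with LRU inside each set."""

    def __init__(self, num_sets, ways):
        self.num_sets = num_sets
        self.ways = ways
        # One small ordered map per set; oldest entry first.
        self.sets = [OrderedDict() for _ in range(num_sets)]

    def _set_for(self, key):
        # Assumption: the set index is simply hash(key) modulo the set count.
        return self.sets[hash(key) % self.num_sets]

    def get(self, key):
        s = self._set_for(key)
        if key not in s:
            return None
        s.move_to_end(key)  # mark as most recently used within its set
        return s[key]

    def put(self, key, value):
        s = self._set_for(key)
        if key in s:
            s.move_to_end(key)
        elif len(s) >= self.ways:
            s.popitem(last=False)  # evict the least recently used entry of this set only
        s[key] = value
```

With 2 sets of 2 ways, inserting keys 0, 2 and 4 (which all land in set 0) evicts key 0 even though set 1 still has room. That loss of global choice is the price paid for bounded work per access.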

Preprint · 2022 · arXiv

Multilevel Bidirectional Cache Filter

Modern caches are often required to handle a massive amount of data that exceeds the available memory; thus, hybrid caches, specifically DRAM/SSD combinations, are becoming more and more prevalent. In such environments, in addition to the classical hit-ratio target, saving writes to the second-level cache is a dominant factor in avoiding write amplification and wear-out, two notorious phenomena of SSDs. This paper presents BiDiFilter, a novel multilevel caching scheme that controls demotions and promotions between cache levels using a frequency sketch filter. Further, it splits the higher cache level into two areas to keep the most recent and the most frequent items close to the user. We conduct an extensive evaluation over real-world traces, comparing to previous multilevel policies. We show that our mechanism yields a 10x saving of writes in almost all cases and often improves latencies by up to 20%.

Preprint · 2022 · arXiv

SQUAD: Combining Sketching and Sampling Is Better than Either for Per-item Quantile Estimation

Stream monitoring is fundamental in many data stream applications, such as financial data trackers, security, anomaly detection, and load balancing. In that respect, quantiles are of particular interest, as they often capture the user's utility. For example, if a video connection has high tail latency, the perceived quality will suffer, even if the average and median latencies are low. In this work, we consider the problem of approximating the per-item quantiles. Elements in our stream are (ID, latency) tuples, and we wish to track the latency quantiles for each ID. Existing quantile sketches are designed for a single number stream (e.g., containing just the latency). While one could allocate a separate sketch instance for each ID, this may require an infeasible amount of memory. Instead, we consider tracking the quantiles for the heavy hitters (most frequent items), which are often considered particularly important, without knowing them beforehand. We first present a simple sampling algorithm that serves as a benchmark. Then, we design an algorithm that augments a quantile sketch within each entry of a heavy hitter algorithm, resulting in similar space complexity but with a deterministic error guarantee. Finally, we present SQUAD, a method that combines sampling and sketching while improving the asymptotic space complexity. Intuitively, SQUAD uses a background sampling process to capture the behaviour of the latencies of an item before it is allocated a sketch, thereby allowing us to use fewer samples and sketches. Our solutions are rigorously analyzed, and we demonstrate the superiority of our approach using extensive simulations.
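The abstract does not spell out its sampling benchmark, so the following is only one plausible reading: a uniform reservoir sample over the (ID, latency) tuples, from which the quantile of a given ID is read off the sampled latencies for that ID. The class name, the `capacity` and `seed` parameters, and the quantile indexing rule are assumptions made for this sketch; it is not SQUAD.

```python
import random


class SampledQuantiles:
    """Hypothetical sampling baseline: reservoir-sample (id, latency) tuples."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.samples = []
        self.seen = 0
        self.rng = random.Random(seed)  # seeded for reproducible runs

    def update(self, item_id, latency):
        self.seen += 1
        if len(self.samples) < self.capacity:
            self.samples.append((item_id, latency))
        else:
            # Classic reservoir step: keep the new tuple with probability capacity / seen.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.samples[j] = (item_id, latency)

    def quantile(self, item_id, q):
        # Estimate the q-quantile of item_id from its sampled latencies only.
        vals = sorted(lat for i, lat in self.samples if i == item_id)
        if not vals:
            return None
        return vals[min(len(vals) - 1, int(q * len(vals)))]
```

Frequent IDs naturally receive more of the reservoir, which is why a sampling approach of this kind favors heavy hitters without knowing them beforehand.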

Preprint · 2016 · arXiv

COARA: Code Offloading on Android with AspectJ

Smartphones suffer from limited computational capabilities and battery life. A method to mitigate these problems is code offloading: executing application code on a remote server. We introduce COARA, a middleware platform for code offloading on Android that uses aspect-oriented programming (AOP) with AspectJ. AOP allows COARA to intercept code for offloading without a customized compiler or modification of the operating system. COARA requires minimal changes to application source code, and does not require the application developer to be aware of AOP. Since state transfer to the server is often a bottleneck that hinders performance, COARA uses AOP to intercept the transmission of large objects from the client and replaces them with object proxies. The server can begin execution of the offloaded application code, regardless of whether all required objects have been transferred to the server. We run COARA with Android applications from the Google Play store on a Nexus 4 running unmodified Android 4.3 to demonstrate that our platform improves performance and reduces energy consumption. Our approach yields speedups of 24x and 6x over WiFi and 3G respectively.

Preprint · 2016 · arXiv

Efficient Summing over Sliding Windows

This paper considers the problem of maintaining statistical aggregates over the last W elements of a data stream. First, the problem of counting the number of 1's in the last W bits of a binary stream is considered. A lower bound of Ω(1/ε + log W) memory bits for Wε-additive approximations is derived. This is followed by an algorithm whose memory consumption is O(1/ε + log W) bits, indicating that the algorithm is optimal and that the bound is tight. Next, the more general problem of maintaining a sum of the last W integers, each in the range of {0,1,...,R}, is addressed. The paper shows that approximating the sum within an additive error of RWε can also be done using Θ(1/ε + log W) bits for ε=Ω(1/W). For ε=o(1/W), we present a succinct algorithm which uses B(1 + o(1)) bits, where B=Θ(W log(1/(Wε))) is the derived lower bound. We show that all lower bounds generalize to randomized algorithms as well. All algorithms process new elements and answer queries in O(1) worst-case time.
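The additive-error setting can be illustrated with a simple block-based sketch. This is not the paper's algorithm and does not reach its Θ(1/ε + log W) bound; it is a textbook-style simplification, and the class name and the `block` parameter (playing the role of roughly Wε) are assumptions. The window is covered by per-block sums, so the answer may include up to `block - 1` elements that have already slid out of the window, giving an overestimate of at most (block - 1)·R.

```python
from collections import deque


class BlockWindowSum:
    """Hypothetical block-based approximate sum over the last `window` elements."""

    def __init__(self, window, block):
        if window % block != 0:
            raise ValueError("block must divide window")
        self.block = block
        self.max_blocks = window // block
        self.blocks = deque()  # sums of completed blocks, oldest first
        self.total = 0         # running sum of the completed blocks kept
        self.cur = 0           # sum of the current, partially filled block
        self.n_cur = 0         # number of elements in the current block

    def add(self, x):
        self.cur += x
        self.n_cur += 1
        if self.n_cur == self.block:
            self.blocks.append(self.cur)
            self.total += self.cur
            if len(self.blocks) > self.max_blocks:
                self.total -= self.blocks.popleft()  # drop the oldest block
            self.cur = 0
            self.n_cur = 0

    def query(self):
        # Covers the window plus at most (block - 1) stale elements.
        return self.total + self.cur
```

For example, with `window=4` and `block=2`, after five 1's the query returns 5 while the exact window sum is 4, within the allowed error of (block - 1)·R = 1. Both `add` and `query` run in constant time here as well, but the memory is one counter per block rather than the optimal bit count derived in the paper.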

Preprint · 2016 · arXiv

Hardening Cassandra Against Byzantine Failures

Cassandra is one of the most widely used distributed data stores these days. Cassandra supports flexible consistency guarantees over a wide-column data access model and provides almost linear scale-out performance. This enables application developers to tailor the performance and availability of Cassandra to their exact application's needs and required semantics. Yet, Cassandra is designed to withstand benign failures, and cannot cope with most forms of Byzantine attacks. In this work, we present an analysis of Cassandra's vulnerabilities and propose protocols for hardening Cassandra against Byzantine failures. We examine several alternative design choices and compare them both qualitatively and empirically using the Yahoo! Cloud Serving Benchmark (YCSB). We include incremental performance analysis for our algorithmic and cryptographic adjustments, supporting our design choices.

Preprint · 2016 · arXiv

ICE Buckets: Improved Counter Estimation for Network Measurement

Measurement capabilities are essential for a variety of network applications, such as load balancing, routing, fairness and intrusion detection. These capabilities require large counter arrays in order to monitor the traffic of all network flows. While commodity SRAM memories are capable of operating at line speed, they are too small to accommodate large counter arrays. Previous works suggested estimators, which trade precision for reduced space. However, in order to accurately estimate the largest counter, these methods compromise the accuracy of the smaller counters. In this work, we present a closed form representation of the optimal estimation function. We then introduce Independent Counter Estimation Buckets (ICE-Buckets), a novel algorithm that improves estimation accuracy for all counters. This is achieved by separating the flows into buckets and configuring the optimal estimation function according to each bucket's counter scale. We prove a tighter upper bound on the relative error and demonstrate an accuracy improvement of up to 57 times on real Internet packet traces.
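The per-bucket idea can be sketched as follows. This is a heavy simplification and an assumption throughout: it uses plain linear sampling (increment with probability 1/scale, estimate as counter times scale) rather than the paper's optimal estimation function, and the class name, the doubling rule, and the randomized halving are invented for illustration. What it does show is that only the bucket containing a large flow coarsens its scale, while other buckets keep counting exactly.

```python
import random


class ScaledBuckets:
    """Hypothetical small counters with an independent estimation scale per bucket."""

    def __init__(self, num_counters, bucket_size, bits, seed=0):
        if bits < 2:
            raise ValueError("this sketch assumes counters of at least 2 bits")
        self.bucket_size = bucket_size
        self.max_value = (1 << bits) - 1
        self.counters = [0] * num_counters
        num_buckets = (num_counters + bucket_size - 1) // bucket_size
        self.scale = [1] * num_buckets  # scale 1 means exact counting
        self.rng = random.Random(seed)

    def _upscale(self, bucket):
        # Double the bucket's scale and halve its counters; odd values are
        # rounded up or down at random so the expected estimate is unchanged.
        start = bucket * self.bucket_size
        end = min(start + self.bucket_size, len(self.counters))
        for i in range(start, end):
            half, rem = divmod(self.counters[i], 2)
            if rem and self.rng.random() < 0.5:
                half += 1
            self.counters[i] = half
        self.scale[bucket] *= 2

    def increment(self, flow):
        bucket = flow // self.bucket_size
        if self.counters[flow] == self.max_value:
            self._upscale(bucket)  # make room before the counter overflows
        if self.rng.random() < 1.0 / self.scale[bucket]:
            self.counters[flow] += 1

    def estimate(self, flow):
        return self.counters[flow] * self.scale[flow // self.bucket_size]
```

With 2-bit counters and buckets of two flows, a fourth packet on flow 0 doubles the scale of bucket 0 only; a flow in bucket 1 that has seen two packets is still reported as exactly 2.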

Preprint · 2015 · arXiv

Fisheye Consistency: Keeping Data in Synch in a Georeplicated World

Over the last thirty years, numerous consistency conditions for replicated data have been proposed and implemented. Popular examples of such conditions include linearizability (or atomicity), sequential consistency, causal consistency, and eventual consistency. These consistency conditions are usually defined independently from the computing entities (nodes) that manipulate the replicated data; i.e., they do not take into account how computing entities might be linked to one another, or geographically distributed. To address this gap, as a first contribution, this paper introduces the notion of proximity graph between computing nodes. If two nodes are connected in this graph, their operations must satisfy a strong consistency condition, while the operations invoked by other nodes are allowed to satisfy a weaker condition. The second contribution is the use of such a graph to provide a generic approach to the hybridization of data consistency conditions into the same system. We illustrate this approach on sequential consistency and causal consistency, and present a model in which all data operations are causally consistent, while operations by neighboring processes in the proximity graph are sequentially consistent. The third contribution of the paper is the design and the proof of a distributed algorithm based on this proximity graph, which combines sequential consistency and causal consistency (the resulting condition is called fisheye consistency). In doing so, the paper not only extends the domain of consistency conditions, but also provides a generic provably correct solution of direct relevance to modern georeplicated systems.
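The role of the proximity graph can be shown with a small lookup structure. This is only an illustration of the rule stated in the abstract (neighbors get sequential consistency, everyone else causal consistency); it is not the paper's distributed algorithm. The class name, method names, and node names are hypothetical, and treating a node's own operations as the strong case is an assumption of this sketch.

```python
class ProximityGraph:
    """Hypothetical undirected proximity graph between computing nodes."""

    def __init__(self):
        self.adj = {}

    def connect(self, a, b):
        # Proximity is symmetric, so record the edge in both directions.
        self.adj.setdefault(a, set()).add(b)
        self.adj.setdefault(b, set()).add(a)

    def required_consistency(self, a, b):
        # Assumption: a node's own operations fall under the strong condition.
        if a == b or b in self.adj.get(a, set()):
            return "sequential"
        return "causal"


graph = ProximityGraph()
graph.connect("paris", "rennes")
print(graph.required_consistency("paris", "rennes"))  # neighbors in the graph
print(graph.required_consistency("paris", "tokyo"))   # not neighbors
```

The point of the design is that the strong, more expensive condition is paid only along edges of the graph, which in a georeplicated deployment would typically link nearby sites.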

Preprint · 2015 · arXiv

TinyLFU: A Highly Efficient Cache Admission Policy

This paper proposes to use a frequency-based cache admission policy in order to boost the effectiveness of caches subject to skewed access distributions. Given a newly accessed item and an eviction candidate from the cache, our scheme decides, based on the recent access history, whether it is worth admitting the new item into the cache at the expense of the eviction candidate. Realizing this concept is enabled through a novel approximate LFU structure called TinyLFU, which maintains an approximate representation of the access frequency of a large sample of recently accessed items. TinyLFU is very compact and lightweight as it builds upon Bloom filter theory. We study the properties of TinyLFU through simulations of both synthetic workloads and multiple real traces from several sources. These simulations demonstrate the performance boost obtained by enhancing various replacement policies with the TinyLFU admission policy. Also, a new combined replacement and admission policy scheme nicknamed W-TinyLFU is presented. W-TinyLFU is demonstrated to obtain equal or better hit ratios than other state-of-the-art replacement policies on these traces. It is the only scheme to obtain such good results on all traces.
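The admission decision described above can be sketched in a few lines. The abstract only says that TinyLFU builds on Bloom filter theory, so the concrete structure here is an assumption: a count-min style array of counters with periodic halving to favor recent history. The class name, the `width`, `depth` and `sample_size` parameters, and the strict comparison in `admit` are all choices made for this sketch.

```python
import hashlib


class FrequencySketch:
    """Hypothetical approximate frequency counter with periodic aging."""

    def __init__(self, width=1024, depth=4, sample_size=10_000):
        self.width = width
        self.depth = depth
        self.sample_size = sample_size
        self.rows = [[0] * width for _ in range(depth)]
        self.additions = 0

    def _index(self, key, row):
        # One independent-looking hash per row, derived by salting with the row number.
        digest = hashlib.blake2b(
            repr(key).encode(), digest_size=8, salt=bytes([row])
        ).digest()
        return int.from_bytes(digest, "big") % self.width

    def record(self, key):
        for r in range(self.depth):
            self.rows[r][self._index(key, r)] += 1
        self.additions += 1
        if self.additions >= self.sample_size:
            self._age()

    def estimate(self, key):
        # The minimum over rows limits the inflation caused by hash collisions.
        return min(self.rows[r][self._index(key, r)] for r in range(self.depth))

    def _age(self):
        # Halve every counter so that old accesses gradually lose weight.
        for row in self.rows:
            for i in range(len(row)):
                row[i] //= 2
        self.additions //= 2


def admit(sketch, candidate, victim):
    """Admit the new item only if it looks more frequent than the eviction candidate."""
    return sketch.estimate(candidate) > sketch.estimate(victim)
```

In use, every access is passed to `record`, and on a miss with a full cache the replacement policy proposes a victim while `admit` decides whether the newcomer may take its place. The replacement policy itself is untouched, which is what lets such a filter be layered on top of various policies.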
