Source author record

Daniel E. Lucani

Daniel E. Lucani appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Information Theory; math.IT; Networking and Internet Architecture; Cryptography and Security; Performance; Distributed, Parallel, and Cluster Computing; Multimedia

Catalog footprint

What is connected

15 works

7 topics

4 close collaborators


preprint, 2022, arXiv

Bifrost: Secure, Scalable and Efficient File Sharing System Using Dual Deduplication

We consider the problem of sharing sensitive or valuable files across users while partially relying on a common, untrusted third party, e.g., a Cloud Storage Provider (CSP). Although users can rely on a secure peer-to-peer (P2P) channel for file sharing, this introduces potential delay in the data transfer and requires the sender to remain active and connected while the transfer process occurs. Instead of using the P2P channel for the entire file, users can upload information about the file on a common CSP and share only the essential information that enables the receiver to download and recover the original file. This paper introduces Bifrost, an innovative file sharing system inspired by recent results on dual deduplication. Bifrost achieves the desired functionality and simultaneously guarantees that (1) the CSP can efficiently compress outsourced data; (2) the secure P2P channel is used only to transmit short, but crucial information; (3) users can check for data integrity, i.e., detect if the CSP alters the outsourced data; and (4) only the sender (data owner) and the intended receiver can access the file after sharing, i.e., neither the cloud nor any malicious adversary can infer useful information about the shared file. We analyze compression and bandwidth performance using a proof-of-concept implementation. Our experiments show that secure file sharing can be achieved by sending only 650 bits on the P2P channel, irrespective of file size, while the CSP that aids the sharing can enjoy a compression rate of 86.9%.

preprint, 2022, arXiv

Bonsai: A Generalized Look at Dual Deduplication

Cloud Service Providers (CSPs) offer a vast amount of storage space at competitive prices to cope with the growing demand for digital data storage. Dual deduplication is a recent framework designed to improve data compression on the CSP while keeping clients' data private from the CSP. To achieve this, clients perform lightweight information-theoretic transformations on their data prior to upload. We investigate the effectiveness of dual deduplication, and propose an improvement to the existing state-of-the-art method. We name our proposal Bonsai as it aims at reducing storage fingerprint and improving scalability. In detail, Bonsai achieves (1) a significant reduction in client storage, (2) a reduction in total required storage (client + CSP), and (3) a reduction in the deduplication time on the CSP. Our experiments show that Bonsai achieves compression rates of 68% on the cloud and 5% on the client, while allowing the cloud to identify deduplications in a time-efficient manner. We also show that combining our method with universal compressors in the cloud, e.g., Brotli, can yield better overall compression on the data compared to only applying the universal compressor or plain Bonsai. Finally, we show that Bonsai and its variants provide sufficient privacy against an honest-but-curious CSP that knows the distribution of the clients' original data.

preprint, 2021, arXiv

ZipLine: In-Network Compression at Line Speed

Network appliances continue to offer novel opportunities to offload processing from computing nodes directly into the data plane. One popular concern of network operators and their customers is to move data increasingly faster. A common technique to increase data throughput is to compress it before its transmission. However, this requires compression of the data -- a time and energy demanding pre-processing phase -- and decompression upon reception -- a similarly resource consuming operation. Moreover, if multiple nodes transfer similar data chunks across the network hop (e.g., a given pair of switches), each node effectively wastes resources by executing similar steps. This paper proposes ZipLine, an approach to design and implement (de)compression at line speed leveraging the Tofino hardware platform which is programmable using the P4_16 language. We report on lessons learned while building the system and show throughput, latency and compression measurements on synthetic and real-world traces, showcasing the benefits and trade-offs of our design.

preprint, 2020, arXiv

Hermes: Enabling Energy-efficient IoT Networks with Generalized Deduplication

With the advent of the Internet of Things (IoT), the ever-growing number of connected devices observed in recent years and foreseen for the next decade suggests that more and more data will have to be transmitted over a network before being processed and stored in data centers. Generalized deduplication (GD) is a novel technique that effectively reduces the data storage cost by identifying similar data chunks, and that can gradually reduce the pressure on the network infrastructure by limiting the data that needs to be transmitted. This paper presents Hermes, an application-level protocol for the data plane that can operate over generalized deduplication, as well as over classic deduplication. Hermes significantly reduces the data transmission traffic while effectively decreasing the energy footprint, a relevant matter to consider in the context of IoT deployments. We fully implemented Hermes and evaluated its performance using consumer-grade IoT devices (e.g., Raspberry Pi 4B models). Our results highlight several trade-offs that must be taken into account when considering real-world workloads.

preprint, 2020, arXiv

Memory-aware Online Compression of CAN Bus Data for Future Vehicular Systems

Vehicles generate a large amount of data from their internal sensors. This data is not only useful for a vehicle's proper operation, but also gives car manufacturers the ability to optimize the performance of individual vehicles, and gives companies with fleets of vehicles (e.g., trucks, taxis, tractors) the ability to optimize their operations to reduce fuel costs and plan repairs. This paper proposes algorithms to compress CAN bus data, specifically data packaged as MDF4 files. In particular, we propose lightweight, online and configurable compression algorithms that allow limited devices to choose the amount of RAM and Flash allocated to them. We show that our proposals can outperform LZW for the same RAM footprint, and can even deliver performance comparable to or better than DEFLATE under the same RAM limitations.

preprint, 2020, arXiv

Yggdrasil: Privacy-aware Dual Deduplication in Multi Client Settings

This paper proposes Yggdrasil, a protocol for privacy-aware dual data deduplication in multi-client settings. Yggdrasil is designed to reduce the cloud storage space while safeguarding the privacy of the client's outsourced data. Yggdrasil combines three innovative tools to achieve this goal. First, generalized deduplication, an emerging technique to reduce data footprint. Second, non-deterministic transformations that are described compactly and improve the degree of data compression in the Cloud (across users). Third, data preprocessing in the clients in the form of lightweight, privacy-driven transformations prior to upload. This guarantees that an honest-but-curious Cloud service trying to retrieve the client's actual data will face a high degree of uncertainty as to what the original data is. We provide a mathematical analysis of the measure of uncertainty as well as the compression potential of our protocol. Our experiments with an HDFS log data set show that 49% overall compression can be achieved, with clients storing only 12% for privacy and the Cloud storing the rest. This is achieved while ensuring that each fragment uploaded to the Cloud would have 10^296 possible original strings from the client. Higher uncertainty is possible, with some reduction of compression potential.

preprint, 2019, arXiv

Generalized Deduplication: Bounds, Convergence, and Asymptotic Properties

We study a generalization of deduplication, which enables lossless deduplication of highly similar data and show that standard deduplication with fixed chunk length is a special case. We provide bounds on the expected length of coded sequences for generalized deduplication and show that the coding has asymptotic near-entropy cost under the proposed source model. More importantly, we show that generalized deduplication allows for multiple orders of magnitude faster convergence than standard deduplication. This means that generalized deduplication can provide compression benefits much earlier than standard deduplication, which is key in practical systems. Numerical examples demonstrate our results, showing that our lower bounds are achievable, and illustrating the potential gain of using the generalization over standard deduplication. In fact, we show that even for a simple case of generalized deduplication, the gain in convergence speed is linear with the size of the data chunks.
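As a rough illustration of the idea described in this abstract, the sketch below splits each chunk into a base and a deviation, stores each distinct base once, and keeps a small deviation per chunk so the original can be recovered losslessly. The particular split used here (the high bits of every byte form the base, the low `DEVIATION_BITS` bits form the deviation) and all names are assumptions made for this example; the abstract does not specify the transformation the paper uses.

```python
from typing import Dict, List, Tuple

# Assumption for this sketch: the low 2 bits of every byte are the deviation.
DEVIATION_BITS = 2


def split_chunk(chunk: bytes) -> Tuple[bytes, bytes]:
    """Split a chunk into a base (high bits) and a deviation (low bits)."""
    mask = (1 << DEVIATION_BITS) - 1
    base = bytes(b & ~mask & 0xFF for b in chunk)
    deviation = bytes(b & mask for b in chunk)
    return base, deviation


class GeneralizedDedupStore:
    """Stores each distinct base once; every chunk keeps only (base id, deviation)."""

    def __init__(self) -> None:
        self.base_ids: Dict[bytes, int] = {}         # base -> identifier
        self.bases: List[bytes] = []                  # identifier -> base
        self.records: List[Tuple[int, bytes]] = []    # one entry per stored chunk

    def put(self, chunk: bytes) -> int:
        """Store a chunk and return its index."""
        base, deviation = split_chunk(chunk)
        if base not in self.base_ids:
            self.base_ids[base] = len(self.bases)
            self.bases.append(base)
        # A real system would pack the deviation into DEVIATION_BITS per byte;
        # it is kept as whole bytes here for readability.
        self.records.append((self.base_ids[base], deviation))
        return len(self.records) - 1

    def get(self, index: int) -> bytes:
        """Losslessly rebuild the original chunk from its base and deviation."""
        base_id, deviation = self.records[index]
        return bytes(b | d for b, d in zip(self.bases[base_id], deviation))
```

With `DEVIATION_BITS = 0` every chunk is its own base, which corresponds to standard deduplication with fixed chunk length, the special case mentioned in the abstract.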

preprint, 2015, arXiv

Analysis and Optimization of Sparse Random Linear Network Coding for Reliable Multicast Services

Point-to-multipoint communications are expected to play a pivotal role in next-generation networks. This paper refers to a cellular system transmitting layered multicast services to a multicast group of users. Reliability of communications is ensured via different Random Linear Network Coding (RLNC) techniques. We deal with a fundamental problem: the computational complexity of the RLNC decoder. The higher the number of decoding operations is, the more the user's computational overhead grows and, consequently, the faster the battery of mobile devices drains. By referring to several sparse RLNC techniques, and without any assumption on the implementation of the RLNC decoder in use, we provide an efficient way to characterize the performance of users targeted by ultra-reliable layered multicast services. The proposed modeling makes it possible to efficiently derive the average number of coded packet transmissions needed to recover one or more service layers. We design a convex resource allocation framework that minimizes the complexity of the RLNC decoder by jointly optimizing the transmission parameters and the sparsity of the code. The designed optimization framework also ensures service guarantees to predetermined fractions of users. The performance of the proposed optimization framework is then investigated in an LTE-A eMBMS network multicasting H.264/SVC video services.

preprint, 2015, arXiv

Fulcrum Network Codes: A Code for Fluid Allocation of Complexity

This paper proposes Fulcrum network codes, a network coding framework that achieves three seemingly conflicting objectives: (i) to reduce the coding coefficient overhead to almost n bits per packet in a generation of n packets; (ii) to operate the network using only GF(2) operations at intermediate nodes if necessary, dramatically reducing complexity in the network; (iii) to deliver an end-to-end performance that is close to that of a high-field network coding system for high-end receivers while simultaneously catering to low-end receivers that decode in GF(2). As a consequence of (ii) and (iii), Fulcrum codes have a unique trait missing so far in the network coding literature: they provide the network with the flexibility to spread computational complexity over different devices depending on their current load, network conditions, or even energy targets in a decentralized way. At the core of our framework lies the idea of precoding at the sources using an expansion field GF(2^h) to increase the number of dimensions seen by the network using a linear mapping. Fulcrum codes can use any high-field linear code for precoding, e.g., Reed-Solomon, with the structure of the precode determining some of the key features of the resulting code. For example, a systematic structure provides the ability to manage heterogeneous receivers while using the same data stream. Our analysis shows that the number of additional dimensions created during precoding controls the trade-off between delay, overhead, and complexity. Our implementation and measurements show that Fulcrum achieves similar decoding probability as high-field Random Linear Network Coding (RLNC) approaches but with encoders/decoders that are an order of magnitude faster.

preprint, 2012, arXiv

Systematic Network Coding with the Aid of a Full-Duplex Relay

A characterization of systematic network coding over multi-hop wireless networks is key towards understanding the trade-off between complexity and delay performance of networks that preserve the systematic structure. This paper studies the case of a relay channel, where the source's objective is to deliver a given number of data packets to a receiver with the aid of a relay. The source broadcasts to both the receiver and the relay using one frequency, while the relay uses another frequency for transmissions to the receiver, allowing for a full-duplex operation of the relay. We analyze the decoding complexity and delay performance of two types of relays: one that preserves the systematic structure of the code from the source; another that does not. A systematic relay forwards uncoded packets upon reception, but transmits coded packets to the receiver after receiving the first coded packet from the source. On the other hand, a non-systematic relay always transmits linear combinations of previously received packets. We compare the performance of these two alternatives by analytically characterizing the expected transmission completion time as well as the number of uncoded packets forwarded by the relay. Our numerical results show that, for a poor channel between the source and the receiver, preserving the systematic structure at the relay (i) allows a significant increase in the number of uncoded packets received by the receiver, thus reducing the decoding complexity, and (ii) preserves close to optimal delay performance.

preprint, 2012, arXiv

Whether and Where to Code in the Wireless Relay Channel

The throughput benefits of random linear network codes have been studied extensively for wirelined and wireless erasure networks. It is often assumed that all nodes within a network perform coding operations. In energy-constrained systems, however, coding subgraphs should be chosen to control the number of coding nodes while maintaining throughput. In this paper, we explore the strategic use of network coding in the wireless packet erasure relay channel according to both throughput and energy metrics. In the relay channel, a single source communicates to a single sink through the aid of a half-duplex relay. The fluid flow model is used to describe the case where both the source and the relay are coding, and Markov chain models are proposed to describe packet evolution if only the source or only the relay is coding. In addition to transmission energy, we take into account coding and reception energies. We show that coding at the relay alone while operating in a rateless fashion is neither throughput nor energy efficient. Given a set of system parameters, our analysis determines the optimal amount of time the relay should participate in the transmission, and where coding should be performed.

preprint, 2011, arXiv

Energy-Delay Considerations in Coded Packet Flows

We consider a line of terminals which is connected by packet erasure channels and where random linear network coding is carried out at each node prior to transmission. In particular, we address an online approach in which each terminal has local information to be conveyed to the base station at the end of the line and provide a queueing theoretic analysis of this scenario. First, a genie-aided scenario is considered and the average delay and average transmission energy depending on the link erasure probabilities and the Poisson arrival rates at each node are analyzed. We then assume that no node can send and receive at the same time. The transmitting nodes in the network send coded data packets before stopping to wait for the receiving nodes to acknowledge the number of degrees of freedom, if any, that are required to decode the information correctly. We analyze this problem for an infinite queue size at the terminals and show that there is an optimal number of coded data packets at each node, in terms of average completion time or transmission energy, to be sent before stopping to listen.
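To make the notion of "degrees of freedom" in this abstract concrete, here is a minimal sketch of random linear network coding over a single lossless hop. It assumes GF(2) (coded packets are XORs of source packets); the abstract does not state the field, and the function and class names are hypothetical. The decoder's rank is the number of degrees of freedom received so far, and `n - rank` is what a receiver would still need to request.

```python
import random
from typing import Dict, List, Tuple


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR, i.e. addition over GF(2)."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(packets: List[bytes], rng: random.Random) -> Tuple[int, bytes]:
    """Return (coefficient bitmask, payload) for one random coded packet."""
    n = len(packets)
    coeffs = rng.randrange(1, 1 << n)  # random nonzero GF(2) coefficient vector
    payload = bytes(len(packets[0]))
    for i, packet in enumerate(packets):
        if (coeffs >> i) & 1:
            payload = xor_bytes(payload, packet)
    return coeffs, payload


class Decoder:
    """Online Gaussian elimination over GF(2), kept in reduced row echelon form."""

    def __init__(self, n: int) -> None:
        self.n = n
        self.rows: Dict[int, Tuple[int, bytes]] = {}  # pivot bit -> (coeffs, payload)

    def rank(self) -> int:
        """Degrees of freedom received so far."""
        return len(self.rows)

    def receive(self, coeffs: int, payload: bytes) -> bool:
        """Process one coded packet; return True if it was innovative."""
        for pivot, (c, p) in self.rows.items():
            if (coeffs >> pivot) & 1:
                coeffs ^= c
                payload = xor_bytes(payload, p)
        if coeffs == 0:
            return False  # linearly dependent on what we already hold
        pivot = coeffs.bit_length() - 1
        # Clear the new pivot bit from the rows stored earlier.
        for k, (c, p) in list(self.rows.items()):
            if (c >> pivot) & 1:
                self.rows[k] = (c ^ coeffs, xor_bytes(p, payload))
        self.rows[pivot] = (coeffs, payload)
        return True

    def decode(self) -> List[bytes]:
        """Return the source packets; only valid once rank() == n."""
        return [self.rows[i][1] for i in range(self.n)]


def send_until_decoded(packets: List[bytes], seed: int) -> Tuple[List[bytes], int]:
    """Send coded packets until the decoder has full rank; return (decoded, sent)."""
    rng = random.Random(seed)
    decoder = Decoder(len(packets))
    sent = 0
    while decoder.rank() < len(packets):
        decoder.receive(*encode(packets, rng))
        sent += 1
    return decoder.decode(), sent
```

The number of transmissions returned by `send_until_decoded` is at least the number of source packets and can exceed it when some coded packets turn out to be linearly dependent. Erasures, queueing and the stop-and-listen schedule analyzed in the paper are not modeled here.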

preprint, 2011, arXiv

On the Order Optimality of Large-scale Underwater Networks

Capacity scaling laws are analyzed in an underwater acoustic network with $n$ regularly located nodes on a square, in which both bandwidth and received signal power can be limited significantly. A narrow-band model is assumed where the carrier frequency is allowed to scale as a function of $n$. In the network, we characterize an attenuation parameter that depends on the frequency scaling as well as the transmission distance. Cut-set upper bounds on the throughput scaling are then derived in both extended and dense networks having unit node density and unit area, respectively. We first show that in extended networks, the upper bound is inversely proportional to the attenuation parameter, thus resulting in a highly power-limited network. Interestingly, it is seen that the upper bound for extended networks is intrinsically related to the attenuation parameter but not the spreading factor. On the other hand, in dense networks, we show that there exists either a bandwidth or power limitation, or both, according to the path-loss attenuation regimes, thus yielding an upper bound that has three fundamentally different operating regimes. Furthermore, we describe an achievable scheme based on the simple nearest-neighbor multi-hop (MH) transmission. We show that in extended networks, the MH scheme is order-optimal for all the operating regimes. An achievability result is also presented in dense networks, where the operating regimes that guarantee the order optimality are identified. It thus turns out that frequency scaling is instrumental towards achieving the order optimality in the regimes. Finally, these scaling results are extended to a random network realization. As a result, vital information for fundamental limits of a variety of underwater network scenarios is provided by showing capacity scaling laws.

preprint, 2010, arXiv

On Capacity Scaling of Underwater Networks: An Information-Theoretic Perspective

Capacity scaling laws are analyzed in an underwater acoustic network with $n$ regularly located nodes on a square. A narrow-band model is assumed where the carrier frequency is allowed to scale as a function of $n$. In the network, we characterize an attenuation parameter that depends on the frequency scaling as well as the transmission distance. A cut-set upper bound on the throughput scaling is then derived in extended networks. Our result indicates that the upper bound is inversely proportional to the attenuation parameter, thus resulting in a highly power-limited network. Interestingly, it is seen that unlike the case of wireless radio networks, our upper bound is intrinsically related to the attenuation parameter but not the spreading factor. Furthermore, we describe an achievable scheme based on the simple nearest neighbor multi-hop (MH) transmission. It is shown under extended networks that the MH scheme is order-optimal as the attenuation parameter scales exponentially with $\sqrt{n}$ (or faster). Finally, these scaling results are extended to a random network realization.

preprint, 2008, arXiv

On the Relationship between Transmission Power and Capacity of an Underwater Acoustic Communication Channel

The underwater acoustic channel is characterized by a path loss that depends not only on the transmission distance, but also on the signal frequency. As a consequence, transmission bandwidth depends on the transmission distance, a feature that distinguishes an underwater acoustic system from a terrestrial radio system. The exact relationship between power, transmission band, distance and capacity for the Gaussian noise scenario is a complicated one. This work provides a closed-form approximate model for 1) power consumption, 2) band-edge frequency and 3) bandwidth as functions of distance and capacity required for a data link. This approximate model is obtained by numerical evaluation of analytical results which takes into account physical models of acoustic propagation loss and ambient noise. The closed-form approximations may become useful tools in the design and analysis of underwater acoustic networks.
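The distance and frequency dependence of the path loss mentioned in this abstract can be sketched numerically. The example below assumes a commonly used model: a spreading term plus an absorption term given by Thorp's empirical formula (frequency in kHz, absorption in dB/km). The abstract does not say which expressions the paper uses, so the formula choice, the 1 m reference distance, the default spreading factor of 1.5 and the function names are all assumptions of this sketch.

```python
import math


def thorp_absorption_db_per_km(f_khz: float) -> float:
    """Thorp's empirical absorption coefficient in dB/km, f in kHz (assumed model)."""
    f2 = f_khz ** 2
    return 0.11 * f2 / (1 + f2) + 44 * f2 / (4100 + f2) + 2.75e-4 * f2 + 0.003


def path_loss_db(distance_km: float, f_khz: float, spreading: float = 1.5) -> float:
    """Path loss in dB: spreading loss (relative to 1 m, an assumption) plus absorption.

    Corresponds to 10 log A(l, f) = k * 10 log l + l * 10 log a(f),
    with spreading factor k and absorption coefficient a(f).
    """
    distance_m = distance_km * 1000.0
    spreading_loss = spreading * 10.0 * math.log10(distance_m)
    absorption_loss = distance_km * thorp_absorption_db_per_km(f_khz)
    return spreading_loss + absorption_loss


if __name__ == "__main__":
    # Loss grows with both distance and frequency, which is why the usable
    # band shrinks as the link gets longer.
    for distance in (1.0, 10.0, 100.0):
        for freq in (1.0, 10.0, 50.0):
            print(f"{distance:6.1f} km, {freq:5.1f} kHz: {path_loss_db(distance, freq):7.1f} dB")
```

Because the absorption term scales linearly with distance in dB while its coefficient grows with frequency, high frequencies are penalized more on long links. This is the mechanism behind the distance-dependent bandwidth described in the abstract.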
