Source author record

Aaron Johnson

Aaron Johnson appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Cryptography and Security; Information Theory; Machine Learning; math.IT; Methodology; Networking and Internet Architecture

Catalog footprint

What is connected

6 works

6 topics

4 close collaborators

preprint · 2022 · arXiv

Accountable Private Set Cardinality for Distributed Measurement

We introduce cryptographic protocols for securely and efficiently computing the cardinality of set union and set intersection. Our private set-cardinality protocols (PSC) are designed for the setting in which a large set of parties in a distributed system makes observations, and a small set of parties with more resources and higher reliability aggregates the observations. PSC allows for secure and useful statistics gathering in privacy-preserving distributed systems. For example, it allows operators of anonymity networks such as Tor to securely answer the questions: "How many unique users are using the network?" and "How many hidden services are being accessed?". We prove the correctness and security of PSC in the Universal Composability framework against an active adversary that compromises all but one of the aggregating parties. Although successful output cannot be guaranteed in this setting, PSC either succeeds or terminates with an abort, and we furthermore make the adversary accountable for causing an abort by blaming at least one malicious party. We also show that PSC prevents adaptive corruption of the data parties from revealing past observations, which prevents them from being victims of targeted compromise, and we ensure safe measurements by making outputs differentially private. We present a proof-of-concept implementation of PSC and use it to demonstrate that PSC operates with low computational overhead and reasonable bandwidth. It can count tens of thousands of unique observations from tens to hundreds of data-collecting parties while completing within hours. PSC is thus suitable for daily measurements in a distributed system.

preprint · 2022 · arXiv

Differentially Private Maximal Information Coefficients

The Maximal Information Coefficient (MIC) is a powerful statistic to identify dependencies between variables. However, it may be applied to sensitive data, and publishing it could leak private information. As a solution, we present algorithms to approximate MIC in a way that provides differential privacy. We show that the natural application of the classic Laplace mechanism yields insufficient accuracy. We therefore introduce the MICr statistic, which is a new MIC approximation that is more compatible with differential privacy. We prove MICr is a consistent estimator for MIC, and we provide two differentially private versions of it. We perform experiments on a variety of real and synthetic datasets. The results show that the private MICr statistics significantly outperform direct application of the Laplace mechanism. Moreover, experiments on real-world datasets show accuracy that is usable when the sample size is at least moderately large.
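As background for the baseline this abstract compares against, here is a minimal sketch of the classic Laplace mechanism applied to a generic bounded statistic. It is not the MICr algorithm from the paper. The function names, the sensitivity value, and the privacy budget `epsilon` are illustrative assumptions, since the abstract does not state them.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def laplace_mechanism(true_value: float, sensitivity: float,
                      epsilon: float, rng: random.Random) -> float:
    """Release true_value with Laplace noise of scale sensitivity / epsilon.

    `sensitivity` is the largest change in the statistic caused by changing
    one record. The value passed in below is a hypothetical placeholder,
    not the sensitivity of MIC.
    """
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    return true_value + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical statistic in [0, 1] with an assumed sensitivity of 1.0.
    print(laplace_mechanism(0.5, sensitivity=1.0, epsilon=2.0, rng=rng))
```

The noise scale grows with the sensitivity and shrinks with `epsilon`, so a statistic confined to [0, 1] can be swamped by noise when its sensitivity is not small. That is consistent with the abstract's remark that direct application of the Laplace mechanism gives insufficient accuracy.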

preprint · 2020 · arXiv

FlashFlow: A Secure Speed Test for Tor

The Tor network uses a measurement system to estimate its relays' forwarding capacity and to balance traffic among them. This system has been shown to be vulnerable to adversarial manipulation. Moreover, its accuracy and effectiveness in benign circumstances have never been fully quantified. We first obtain such a quantification by analyzing Tor metrics data and performing experiments on the live network. Our results show that Tor currently underestimates its true capacity by about 50% and improperly balances its traffic by 15-25%. Then, to solve the problems with security and accuracy, we present FlashFlow, a system to measure the capacity of Tor relays. Our analysis shows that FlashFlow limits a malicious relay to obtaining a capacity estimate at most 1.33 times its true capacity. Through realistic Internet experiments, we find that FlashFlow measures relay capacity with at least 89% accuracy 95% of the time. Through simulation, we find that FlashFlow can measure the entire Tor network in less than 5 hours using 3 measurers with 1 Gbit/s of bandwidth each. Finally, simulations using FlashFlow for load balancing show that, compared to TorFlow, network weight error decreases by 86%, while the median of 50 KiB, 1 MiB, and 5 MiB transfer times decreases by 15%, 29%, and 37%, respectively. Moreover, FlashFlow yields more consistent client performance: the median rate of transfer timeouts decreases by 100%, while the standard deviation of 50 KiB, 1 MiB, and 5 MiB transfer times decreases by 55%, 61%, and 41%, respectively. We also find that the performance improvements increase relative to TorFlow as the total client-traffic load increases, demonstrating that FlashFlow is better suited to supporting network growth.

preprint · 2015 · arXiv

Defending Tor from Network Adversaries: A Case Study of Network Path Prediction

The Tor anonymity network has been shown vulnerable to traffic analysis attacks by autonomous systems and Internet exchanges, which can observe different overlay hops belonging to the same circuit. We aim to determine whether network path prediction techniques provide an accurate picture of the threat from such adversaries, and whether they can be used to avoid this threat. We perform a measurement study by running traceroutes from Tor relays to destinations around the Internet. We use the data to evaluate the accuracy of the autonomous systems and Internet exchanges that are predicted to appear on the path using state-of-the-art path inference techniques; we also consider the impact that prediction errors have on Tor security, and whether it is possible to produce a useful overestimate that does not miss important threats. Finally, we evaluate the possibility of using these predictions to actively avoid AS and IX adversaries and the challenges this creates for the design of Tor.

preprint · 2014 · arXiv

Representing Network Trust and Using It to Improve Anonymous Communication

Motivated by the effectiveness of correlation attacks against Tor, the censorship arms race, and observations of malicious relays in Tor, we propose that Tor users capture their trust in network elements using probability distributions over the sets of elements observed by network adversaries. We present a modular system that allows users to efficiently and conveniently create such distributions and use them to improve their security. The major components of this system are (i) an ontology of network-element types that represents the main threats to and vulnerabilities of anonymous communication over Tor, (ii) a formal language that allows users to naturally express trust beliefs about network elements, and (iii) a conversion procedure that takes the ontology, public information about the network, and user beliefs written in the trust language and produces a Bayesian Belief Network that represents the probability distribution in a way that is concise and easily sampleable. We also present preliminary experimental results that show the distribution produced by our system can improve security when employed by users; further improvement is seen when the system is employed by both users and services.

preprint · 2011 · arXiv

Probabilistic Analysis of Onion Routing in a Black-box Model

We perform a probabilistic analysis of onion routing. The analysis is presented in a black-box model of anonymous communication in the Universally Composable framework that abstracts the essential properties of onion routing in the presence of an active adversary that controls a portion of the network and knows all a priori distributions on user choices of destination. Our results quantify how much the adversary can gain in identifying users by exploiting knowledge of their probabilistic behavior. In particular, we show that, in the limit as the network gets large, a user u's anonymity is worst either when the other users always choose the destination u is least likely to visit or when the other users always choose the destination u chooses. This worst-case anonymity with an adversary that controls a fraction b of the routers is shown to be comparable to the best-case anonymity against an adversary that controls a fraction $\sqrt{b}$.
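To give a feel for the closing comparison, the sketch below computes the square root of b for a few compromised fractions b. It only evaluates that fraction. The abstract's claim is asymptotic and says "comparable", not equal, so the numbers are illustrative. The function name and the sample values of b are assumptions made for this example.

```python
import math


def comparable_best_case_fraction(b: float) -> float:
    """Return sqrt(b), the adversary fraction named in the abstract's comparison.

    b is the fraction of routers the worst-case adversary controls.
    """
    if not 0.0 <= b <= 1.0:
        raise ValueError("b must be a fraction between 0 and 1")
    return math.sqrt(b)


if __name__ == "__main__":
    # Hypothetical compromised fractions, chosen only for illustration.
    for b in (0.01, 0.04, 0.25):
        root = comparable_best_case_fraction(b)
        print(f"b = {b:.2f} -> sqrt(b) = {root:.2f}")
```

Because the square root of b exceeds b for any fraction strictly between 0 and 1, the comparison says that a small worst-case adversary resembles a much larger best-case one.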