Researcher profile

Jeremiah Blocki

Jeremiah Blocki contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
10works
0followers
9topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

10 published item(s)

preprint2022arXiv

Cost-Asymmetric Memory Hard Password Hashing

In the past decade, billions of user passwords have been exposed to the dangerous threat of offline password cracking attacks. An offline attacker who has stolen the cryptographic hash of a user's password can check as many password guesses as s/he likes limited only by the resources that s/he is willing to invest to crack the password. Pepper and key-stretching are two techniques that have been proposed to deter an offline attacker by increasing guessing costs. Pepper ensures that the cost of rejecting an incorrect password guess is higher than the (expected) cost of verifying a correct password guess. This is useful because most of the offline attacker's guesses will be incorrect. Unfortunately, as we observe the traditional peppering defense seems to be incompatible with modern memory hard key-stretching algorithms such as Argon2 or Scrypt. We introduce an alternative to pepper which we call Cost-Asymmetric Memory Hard Password Authentication which benefits from the same cost-asymmetry as the classical peppering defense i.e., the cost of rejecting an incorrect password guess is larger than the expected cost to authenticate a correct password guess. When configured properly we prove that our mechanism can only reduce the percentage of user passwords that are cracked by a rational offline attacker whose goal is to maximize (expected) profit i.e., the total value of cracked passwords minus the total guessing costs. We evaluate the effectiveness of our mechanism on empirical password datasets against a rational offline attacker. Our empirical analysis shows that our mechanism can reduce significantly the percentage of user passwords that are cracked by a rational attacker by up to 10%.

preprint2022arXiv

Privately Estimating Graph Parameters in Sublinear time

We initiate a systematic study of algorithms that are both differentially private and run in sublinear time for several problems in which the goal is to estimate natural graph parameters. Our main result is a differentially-private $(1+ρ)$-approximation algorithm for the problem of computing the average degree of a graph, for every $ρ>0$. The running time of the algorithm is roughly the same as its non-private version proposed by Goldreich and Ron (Sublinear Algorithms, 2005). We also obtain the first differentially-private sublinear-time approximation algorithms for the maximum matching size and the minimum vertex cover size of a graph. An overarching technique we employ is the notion of coupled global sensitivity of randomized algorithms. Related variants of this notion of sensitivity have been used in the literature in ad-hoc ways. Here we formalize the notion and develop it as a unifying framework for privacy analysis of randomized approximation algorithms.

preprint2021arXiv

DAHash: Distribution Aware Tuning of Password Hashing Costs

An attacker who breaks into an authentication server and steals all of the cryptographic password hashes is able to mount an offline-brute force attack against each user's password. Offline brute-force attacks against passwords are increasingly commonplace and the danger is amplified by the well documented human tendency to select low-entropy password and/or reuse these passwords across multiple accounts. Moderately hard password hashing functions are often deployed to help protect passwords against offline attacks by increasing the attacker's guessing cost. However, there is a limit to how "hard" one can make the password hash function as authentication servers are resource constrained and must avoid introducing substantial authentication delay. Observing that there is a wide gap in the strength of passwords selected by different users we introduce DAHash (Distribution Aware Password Hashing) a novel mechanism which reduces the number of passwords that an attacker will crack. Our key insight ishat a resource-constrained authentication server can dynamically tune the hardness parameters of a password hash function based on the (estimated) strength of the user's password. We introduce a Stackelberg game to model the interaction between a defender (authentication server) and an offline attacker. Our model allows the defender to optimize the parameters of DAHash e.g., specify how much effort is spent to hash weak/moderate/high strength passwords. We use several large scale password frequency datasets to empirically evaluate the effectiveness of our differentiated cost password hashing mechanism. We find that the defender who uses our mechanism can reduce the fraction of passwords that would be cracked by a rational offline attacker by around 15%.

preprint2020arXiv

An Economic Model for Quantum Key-Recovery Attacks against Ideal Ciphers

It has been established that quantum algorithms can solve several key cryptographic problems more efficiently than classical computers. As progress continues in the field of quantum computing it is important to understand the risks they pose to deployed cryptographic systems. Here we focus on one of these risks - quantum key-recovery attacks against ideal ciphers. Specifically, we seek to model the risk posed by an economically motivated quantum attacker who will choose to run a quantum key-recovery attack against an ideal cipher if the cost to recover the secret key is less than the value of the information at the time when the key-recovery attack is complete. In our analysis we introduce the concept of a quantum cipher circuit year to measure the cost of a quantum attack. This concept can be used to model the inherent tradeoff between the total time to run a quantum key recovery attack and the total work required to run said attack. Our model incorporates the time value of the encrypted information to predict whether any time/work tradeoff results in a key-recovery attack with positive utility for the attacker. We make these predictions under various projections of advances in quantum computing. We use these predictions to make recommendations for the future use and deployment of symmetric key ciphers to secure information against these quantum key-recovery attacks. We argue that, even with optimistic predictions for advances in quantum computing, 128 bit keys (as used in common cipher implementations like AES-128) provide adequate security against quantum attacks in almost all use cases.

preprint2020arXiv

Bicycle Attacks Considered Harmful: Quantifying the Damage of Widespread Password Length Leakage

We examine the issue of password length leakage via encrypted traffic i.e., bicycle attacks. We aim to quantify both the prevalence of password length leakage bugs as well as the potential harm to users. In an observational study, we find that {\em most} of the Alexa top 100 rates sites are vulnerable to bicycle attacks meaning that an eavesdropping attacker can infer the exact length of a password based on the length the encrypted packet containing the password. We discuss several ways in which an eavesdropping attacker could link this password length with a particular user account e.g., a targeted campaign against a smaller group of users or via DNS hijacking for larger scale campaigns. We next use a decision-theoretic model to quantify the extent to which password length leakage might help an attacker to crack user passwords. In our analysis, we consider three different levels of password attackers: hacker, criminal and nation-state. In all cases, we find that such an attacker who knows the length of each user password gains a significant advantage over one without knowing the password length. As part of this analysis, we also release a new differentially private password frequency dataset from the 2016 LinkedIn breach using a differentially private algorithm of Blocki et al. (NDSS 2016) to protect user accounts. The LinkedIn frequency corpus is based on over 170 million passwords making it the largest frequency corpus publicly available to password researchers. While the defense against bicycle attacks is straightforward (i.e., ensure that passwords are always padded before encryption), we discuss several practical challenges organizations may face when attempting to patch this vulnerability. We advocate for a new W3C standard on how password fields are handled which would effectively eliminate most instances of password length leakage.

preprint2020arXiv

DALock: Distribution Aware Password Throttling

Large-scale online password guessing attacks are wide-spread and continuously qualified as one of the top cyber-security risks. The common method for mitigating the risk of online cracking is to lock out the user after a fixed number ($K$) of consecutive incorrect login attempts. Selecting the value of $K$ induces a classic security-usability trade-off. When $K$ is too large a hacker can (quickly) break into a significant fraction of user accounts, but when $K$ is too low we will start to annoy honest users by locking them out after a few mistakes. Motivated by the observation that honest user mistakes typically look quite different than the password guesses of an online attacker, we introduce DALock a {\em distribution aware} password lockout mechanism to reduce user annoyance while minimizing user risk. As the name suggests, DALock is designed to be aware of the frequency and popularity of the password used for login attacks while standard throttling mechanisms (e.g., $K$-strikes) are oblivious to the password distribution. In particular, DALock maintains an extra "hit count" in addition to "strike count" for each user which is based on (estimates of) the cumulative probability of {\em all} login attempts for that particular account. We empirically evaluate DALock with an extensive battery of simulations using real world password datasets. In comparison with the traditional $K$-strikes mechanism we find that DALock offers a superior security/usability trade-off. For example, in one of our simulations we are able to reduce the success rate of an attacker to $0.05\%$ (compared to $1\%$ for the $10$-strikes mechanism) whilst simultaneously reducing the unwanted lockout rate for accounts that are not under attack to just $0.08\%$ (compared to $4\%$ for the $3$-strikes mechanism).

preprint2020arXiv

On Locally Decodable Codes in Resource Bounded Channels

Constructions of locally decodable codes (LDCs) have one of two undesirable properties: low rate or high locality (polynomial in the length of the message). In settings where the encoder/decoder have already exchanged cryptographic keys and the channel is a probabilistic polynomial time (PPT) algorithm, it is possible to circumvent these barriers and design LDCs with constant rate and small locality. However, the assumption that the encoder/decoder have exchanged cryptographic keys is often prohibitive. We thus consider the problem of designing explicit and efficient LDCs in settings where the channel is slightly more constrained than the encoder/decoder with respect to some resource e.g., space or (sequential) time. Given an explicit function $f$ that the channel cannot compute, we show how the encoder can transmit a random secret key to the local decoder using $f(\cdot)$ and a random oracle $H(\cdot)$. This allows bootstrap from the private key LDC construction of Ostrovsky, Pandey and Sahai (ICALP, 2007), thereby answering an open question posed by Guruswami and Smith (FOCS 2010) of whether such bootstrapping techniques may apply to LDCs in weaker channel models than just PPT algorithms. Specifically, in the random oracle model we show how to construct explicit constant rate LDCs with locality of polylog in the security parameter against various resource constrained channels.

preprint2020arXiv

On the Economics of Offline Password Cracking

We develop an economic model of an offline password cracker which allows us to make quantitative predictions about the fraction of accounts that a rational password attacker would crack in the event of an authentication server breach. We apply our economic model to analyze recent massive password breaches at Yahoo!, Dropbox, LastPass and AshleyMadison. All four organizations were using key-stretching to protect user passwords. In fact, LastPass' use of PBKDF2-SHA256 with $10^5$ hash iterations exceeds 2017 NIST minimum recommendation by an order of magnitude. Nevertheless, our analysis paints a bleak picture: the adopted key-stretching levels provide insufficient protection for user passwords. In particular, we present strong evidence that most user passwords follow a Zipf's law distribution, and characterize the behavior of a rational attacker when user passwords are selected from a Zipf's law distribution. We show that there is a finite threshold which depends on the Zipf's law parameters that characterizes the behavior of a rational attacker -- if the value of a cracked password (normalized by the cost of computing the password hash function) exceeds this threshold then the adversary's optimal strategy is always to continue attacking until each user password has been cracked. In all cases (Yahoo!, Dropbox, LastPass and AshleyMadison) we find that the value of a cracked password almost certainly exceeds this threshold meaning that a rational attacker would crack all passwords that are selected from the Zipf's law distribution (i.e., most user passwords). This prediction holds even if we incorporate an aggressive model of diminishing returns for the attacker (e.g., the total value of $500$ million cracked passwords is less than $100$ times the total value of $5$ million passwords). See paper for full abstract.

preprint2020arXiv

Spaced Repetition and Mnemonics Enable Recall of Multiple Strong Passwords

We report on a user study that provides evidence that spaced repetition and a specific mnemonic technique enable users to successfully recall multiple strong passwords over time. Remote research participants were asked to memorize 4 Person-Action-Object (PAO) stories where they chose a famous person from a drop-down list and were given machine-generated random action-object pairs. Users were also shown a photo of a scene and asked to imagine the PAO story taking place in the scene (e.g., Bill Gates---swallowing---bike on a beach). Subsequently, they were asked to recall the action-object pairs when prompted with the associated scene-person pairs following a spaced repetition schedule over a period of 127+ days. While we evaluated several spaced repetition schedules, the best results were obtained when users initially returned after 12 hours and then in $1.5\times$ increasing intervals: 77% of the participants successfully recalled all 4 stories in 10 tests over a period of 158 days. Much of the forgetting happened in the first test period (12 hours): 89% of participants who remembered their stories during the first test period successfully remembered them in every subsequent round. These findings, coupled with recent results on naturally rehearsing password schemes, suggest that 4 PAO stories could be used to create usable and strong passwords for 14 sensitive accounts following this spaced repetition schedule, possibly with a few extra upfront rehearsals. In addition, we find that there is an interference effect across multiple PAO stories: the recall rate of 100% (resp. 90%) for participants who were asked to memorize 1 PAO story (resp. 2 PAO stories) is significantly better than the recall rate for participants who were asked to memorize 4 PAO stories. These findings yield concrete advice for improving constructions of password management schemes and future user studies.

preprint2010arXiv

Resolving the Complexity of Some Data Privacy Problems

We formally study two methods for data sanitation that have been used extensively in the database community: k-anonymity and l-diversity. We settle several open problems concerning the difficulty of applying these methods optimally, proving both positive and negative results: 1. 2-anonymity is in P. 2. The problem of partitioning the edges of a triangle-free graph into 4-stars (degree-three vertices) is NP-hard. This yields an alternative proof that 3-anonymity is NP-hard even when the database attributes are all binary. 3. 3-anonymity with only 27 attributes per record is MAX SNP-hard. 4. For databases with n rows, k-anonymity is in O(4^n poly(n)) time for all k > 1. 5. For databases with n rows and l <= log_{2c+2} log n attributes over an alphabet of cardinality c = O(1), k-anonymity is in P. Assuming c, l = O(1), k-anonymity is in O(n). 6. 3-diversity with binary attributes is NP-hard, with one sensitive attribute. 7. 2-diversity with binary attributes is NP-hard, with three sensitive attributes.