Source author record

Jeremiah Blocki

Jeremiah Blocki appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Cryptography and Security; Computer Science and Game Theory; Computational Complexity; Data Structures and Algorithms; Artificial Intelligence; cs.CY; Databases; Human-Computer Interaction; Information Theory; math.CO; math.IT; physics.soc-ph; Social and Information Networks

Catalog footprint

What is connected

22 works

13 topics

4 close collaborators

preprint, 2022, arXiv

Cost-Asymmetric Memory Hard Password Hashing

In the past decade, billions of user passwords have been exposed to the dangerous threat of offline password cracking attacks. An offline attacker who has stolen the cryptographic hash of a user's password can check as many password guesses as s/he likes limited only by the resources that s/he is willing to invest to crack the password. Pepper and key-stretching are two techniques that have been proposed to deter an offline attacker by increasing guessing costs. Pepper ensures that the cost of rejecting an incorrect password guess is higher than the (expected) cost of verifying a correct password guess. This is useful because most of the offline attacker's guesses will be incorrect. Unfortunately, as we observe, the traditional peppering defense seems to be incompatible with modern memory hard key-stretching algorithms such as Argon2 or Scrypt. We introduce an alternative to pepper which we call Cost-Asymmetric Memory Hard Password Authentication, which benefits from the same cost-asymmetry as the classical peppering defense, i.e., the cost of rejecting an incorrect password guess is larger than the expected cost to authenticate a correct password guess. When configured properly we prove that our mechanism can only reduce the percentage of user passwords that are cracked by a rational offline attacker whose goal is to maximize (expected) profit, i.e., the total value of cracked passwords minus the total guessing costs. We evaluate the effectiveness of our mechanism on empirical password datasets against a rational offline attacker. Our empirical analysis shows that our mechanism can significantly reduce the percentage of user passwords that are cracked by a rational attacker, by up to 10%.
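The cost asymmetry of classical peppering described in this abstract can be illustrated with a minimal sketch. This is not the paper's memory-hard mechanism: SHA-256 stands in for a slow password hash, and the pepper space size and all function names are assumptions made for illustration.

```python
import hashlib
import secrets

PEPPER_SPACE = 16  # assumed number of possible pepper values (illustrative)


def _hash(password: str, salt: bytes, pepper: int) -> bytes:
    # SHA-256 stands in for a deliberately slow password hash.
    return hashlib.sha256(salt + password.encode() + bytes([pepper])).digest()


def enroll(password: str) -> tuple[bytes, bytes]:
    """Return (salt, hash); the random pepper is discarded, never stored."""
    salt = secrets.token_bytes(16)
    pepper = secrets.randbelow(PEPPER_SPACE)
    return salt, _hash(password, salt, pepper)


def verify(password: str, salt: bytes, stored: bytes) -> tuple[bool, int]:
    """Try peppers in order; return (accepted, number of hash evaluations)."""
    for pepper in range(PEPPER_SPACE):
        if secrets.compare_digest(_hash(password, salt, pepper), stored):
            return True, pepper + 1
    return False, PEPPER_SPACE
```

Under these assumptions, a correct password is accepted after (PEPPER_SPACE + 1) / 2 hash evaluations in expectation, while an incorrect guess is rejected only after all PEPPER_SPACE evaluations.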

preprint, 2022, arXiv

Privately Estimating Graph Parameters in Sublinear time

We initiate a systematic study of algorithms that are both differentially private and run in sublinear time for several problems in which the goal is to estimate natural graph parameters. Our main result is a differentially-private $(1+ρ)$-approximation algorithm for the problem of computing the average degree of a graph, for every $ρ>0$. The running time of the algorithm is roughly the same as its non-private version proposed by Goldreich and Ron (Sublinear Algorithms, 2005). We also obtain the first differentially-private sublinear-time approximation algorithms for the maximum matching size and the minimum vertex cover size of a graph. An overarching technique we employ is the notion of coupled global sensitivity of randomized algorithms. Related variants of this notion of sensitivity have been used in the literature in ad-hoc ways. Here we formalize the notion and develop it as a unifying framework for privacy analysis of randomized approximation algorithms.

preprint, 2021, arXiv

DAHash: Distribution Aware Tuning of Password Hashing Costs

An attacker who breaks into an authentication server and steals all of the cryptographic password hashes is able to mount an offline brute-force attack against each user's password. Offline brute-force attacks against passwords are increasingly commonplace and the danger is amplified by the well documented human tendency to select low-entropy passwords and/or reuse these passwords across multiple accounts. Moderately hard password hashing functions are often deployed to help protect passwords against offline attacks by increasing the attacker's guessing cost. However, there is a limit to how "hard" one can make the password hash function as authentication servers are resource constrained and must avoid introducing substantial authentication delay. Observing that there is a wide gap in the strength of passwords selected by different users, we introduce DAHash (Distribution Aware Password Hashing), a novel mechanism which reduces the number of passwords that an attacker will crack. Our key insight is that a resource-constrained authentication server can dynamically tune the hardness parameters of a password hash function based on the (estimated) strength of the user's password. We introduce a Stackelberg game to model the interaction between a defender (authentication server) and an offline attacker. Our model allows the defender to optimize the parameters of DAHash, e.g., specify how much effort is spent to hash weak/moderate/high strength passwords. We use several large scale password frequency datasets to empirically evaluate the effectiveness of our differentiated cost password hashing mechanism. We find that the defender who uses our mechanism can reduce the fraction of passwords that would be cracked by a rational offline attacker by around 15%.

preprint, 2020, arXiv

An Economic Model for Quantum Key-Recovery Attacks against Ideal Ciphers

It has been established that quantum algorithms can solve several key cryptographic problems more efficiently than classical computers. As progress continues in the field of quantum computing it is important to understand the risks they pose to deployed cryptographic systems. Here we focus on one of these risks - quantum key-recovery attacks against ideal ciphers. Specifically, we seek to model the risk posed by an economically motivated quantum attacker who will choose to run a quantum key-recovery attack against an ideal cipher if the cost to recover the secret key is less than the value of the information at the time when the key-recovery attack is complete. In our analysis we introduce the concept of a quantum cipher circuit year to measure the cost of a quantum attack. This concept can be used to model the inherent tradeoff between the total time to run a quantum key recovery attack and the total work required to run said attack. Our model incorporates the time value of the encrypted information to predict whether any time/work tradeoff results in a key-recovery attack with positive utility for the attacker. We make these predictions under various projections of advances in quantum computing. We use these predictions to make recommendations for the future use and deployment of symmetric key ciphers to secure information against these quantum key-recovery attacks. We argue that, even with optimistic predictions for advances in quantum computing, 128 bit keys (as used in common cipher implementations like AES-128) provide adequate security against quantum attacks in almost all use cases.

preprint, 2020, arXiv

Bicycle Attacks Considered Harmful: Quantifying the Damage of Widespread Password Length Leakage

We examine the issue of password length leakage via encrypted traffic, i.e., bicycle attacks. We aim to quantify both the prevalence of password length leakage bugs as well as the potential harm to users. In an observational study, we find that most of the Alexa top 100 rated sites are vulnerable to bicycle attacks, meaning that an eavesdropping attacker can infer the exact length of a password based on the length of the encrypted packet containing the password. We discuss several ways in which an eavesdropping attacker could link this password length with a particular user account, e.g., a targeted campaign against a smaller group of users or via DNS hijacking for larger scale campaigns. We next use a decision-theoretic model to quantify the extent to which password length leakage might help an attacker to crack user passwords. In our analysis, we consider three different levels of password attackers: hacker, criminal and nation-state. In all cases, we find that such an attacker who knows the length of each user password gains a significant advantage over one without knowing the password length. As part of this analysis, we also release a new differentially private password frequency dataset from the 2016 LinkedIn breach using a differentially private algorithm of Blocki et al. (NDSS 2016) to protect user accounts. The LinkedIn frequency corpus is based on over 170 million passwords, making it the largest frequency corpus publicly available to password researchers. While the defense against bicycle attacks is straightforward (i.e., ensure that passwords are always padded before encryption), we discuss several practical challenges organizations may face when attempting to patch this vulnerability. We advocate for a new W3C standard on how password fields are handled which would effectively eliminate most instances of password length leakage.
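The padding defense mentioned in this abstract can be sketched as follows. The sketch assumes a fixed 64-byte field and a one-byte length prefix. Both are illustrative choices, as are the function names, and encryption of the padded field is out of scope.

```python
PAD_TO = 64  # assumed fixed field size in bytes (illustrative)


def pad_password(password: str) -> bytes:
    """Encode the password into a fixed-size field: 1 length byte + data + zeros."""
    data = password.encode("utf-8")
    if len(data) > PAD_TO - 1:
        raise ValueError("password too long for the fixed-size field")
    return bytes([len(data)]) + data + b"\x00" * (PAD_TO - 1 - len(data))


def unpad_password(field: bytes) -> str:
    """Recover the password from a field produced by pad_password."""
    if len(field) != PAD_TO:
        raise ValueError("unexpected field size")
    length = field[0]
    return field[1 : 1 + length].decode("utf-8")
```

Because every padded field has the same size, the length of the resulting ciphertext no longer depends on the length of the password.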

preprint, 2020, arXiv

DALock: Distribution Aware Password Throttling

Large-scale online password guessing attacks are widespread and consistently rank among the top cyber-security risks. The common method for mitigating the risk of online cracking is to lock out the user after a fixed number ($K$) of consecutive incorrect login attempts. Selecting the value of $K$ induces a classic security-usability trade-off. When $K$ is too large a hacker can (quickly) break into a significant fraction of user accounts, but when $K$ is too low we will start to annoy honest users by locking them out after a few mistakes. Motivated by the observation that honest user mistakes typically look quite different than the password guesses of an online attacker, we introduce DALock, a distribution aware password lockout mechanism to reduce user annoyance while minimizing user risk. As the name suggests, DALock is designed to be aware of the frequency and popularity of the password used for login attacks while standard throttling mechanisms (e.g., $K$-strikes) are oblivious to the password distribution. In particular, DALock maintains an extra "hit count" in addition to "strike count" for each user which is based on (estimates of) the cumulative probability of all login attempts for that particular account. We empirically evaluate DALock with an extensive battery of simulations using real world password datasets. In comparison with the traditional $K$-strikes mechanism we find that DALock offers a superior security/usability trade-off. For example, in one of our simulations we are able to reduce the success rate of an attacker to $0.05\%$ (compared to $1\%$ for the $10$-strikes mechanism) whilst simultaneously reducing the unwanted lockout rate for accounts that are not under attack to just $0.08\%$ (compared to $4\%$ for the $3$-strikes mechanism).
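The two-counter idea in this abstract can be sketched roughly as follows. This is a simplified illustration, not the paper's implementation: the popularity table, the default estimate, both thresholds, and the choice to count only failed attempts are all assumptions.

```python
from dataclasses import dataclass

# Hypothetical popularity estimates: fraction of users choosing each password.
POPULARITY = {"123456": 0.01, "password": 0.005, "qwerty": 0.003}
DEFAULT_POPULARITY = 1e-7  # assumed estimate for passwords not in the table
STRIKE_LIMIT = 10          # assumed strike threshold (the K in K-strikes)
HIT_LIMIT = 0.012          # assumed threshold on cumulative guess probability


@dataclass
class AccountState:
    strikes: int = 0   # number of failed attempts
    hits: float = 0.0  # cumulative estimated probability of failed guesses

    @property
    def locked(self) -> bool:
        return self.strikes >= STRIKE_LIMIT or self.hits >= HIT_LIMIT


def record_failed_login(state: AccountState, guess: str) -> None:
    """Update both counters after an incorrect login attempt."""
    state.strikes += 1
    state.hits += POPULARITY.get(guess, DEFAULT_POPULARITY)
```

With these placeholder numbers, an attacker who tries two very popular passwords trips the hit threshold, while a user who mistypes a rare password several times does not.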

preprint, 2020, arXiv

On Locally Decodable Codes in Resource Bounded Channels

Constructions of locally decodable codes (LDCs) have one of two undesirable properties: low rate or high locality (polynomial in the length of the message). In settings where the encoder/decoder have already exchanged cryptographic keys and the channel is a probabilistic polynomial time (PPT) algorithm, it is possible to circumvent these barriers and design LDCs with constant rate and small locality. However, the assumption that the encoder/decoder have exchanged cryptographic keys is often prohibitive. We thus consider the problem of designing explicit and efficient LDCs in settings where the channel is slightly more constrained than the encoder/decoder with respect to some resource, e.g., space or (sequential) time. Given an explicit function $f$ that the channel cannot compute, we show how the encoder can transmit a random secret key to the local decoder using $f(\cdot)$ and a random oracle $H(\cdot)$. This allows us to bootstrap from the private key LDC construction of Ostrovsky, Pandey and Sahai (ICALP, 2007), thereby answering an open question posed by Guruswami and Smith (FOCS 2010) of whether such bootstrapping techniques may apply to LDCs in weaker channel models than just PPT algorithms. Specifically, in the random oracle model we show how to construct explicit constant rate LDCs with locality of polylog in the security parameter against various resource constrained channels.

preprint, 2020, arXiv

On the Economics of Offline Password Cracking

We develop an economic model of an offline password cracker which allows us to make quantitative predictions about the fraction of accounts that a rational password attacker would crack in the event of an authentication server breach. We apply our economic model to analyze recent massive password breaches at Yahoo!, Dropbox, LastPass and AshleyMadison. All four organizations were using key-stretching to protect user passwords. In fact, LastPass' use of PBKDF2-SHA256 with $10^5$ hash iterations exceeds the 2017 NIST minimum recommendation by an order of magnitude. Nevertheless, our analysis paints a bleak picture: the adopted key-stretching levels provide insufficient protection for user passwords. In particular, we present strong evidence that most user passwords follow a Zipf's law distribution, and characterize the behavior of a rational attacker when user passwords are selected from a Zipf's law distribution. We show that there is a finite threshold which depends on the Zipf's law parameters that characterizes the behavior of a rational attacker: if the value of a cracked password (normalized by the cost of computing the password hash function) exceeds this threshold then the adversary's optimal strategy is always to continue attacking until each user password has been cracked. In all cases (Yahoo!, Dropbox, LastPass and AshleyMadison) we find that the value of a cracked password almost certainly exceeds this threshold, meaning that a rational attacker would crack all passwords that are selected from the Zipf's law distribution (i.e., most user passwords). This prediction holds even if we incorporate an aggressive model of diminishing returns for the attacker (e.g., the total value of $500$ million cracked passwords is less than $100$ times the total value of $5$ million passwords). See paper for full abstract.
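The behavior of a profit-maximizing attacker against a Zipf-distributed password population can be explored with a small sketch. This is a simplified model written for illustration, not the paper's exact formulation: the attacker guesses passwords in decreasing order of probability, pays a fixed cost per hash evaluation, and stops where expected profit peaks. The function names and parameters are assumptions.

```python
def zipf_distribution(n: int, s: float) -> list[float]:
    """Normalized Zipf probabilities p_i proportional to 1 / i**s for i = 1..n."""
    weights = [1.0 / (i ** s) for i in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def optimal_guess_count(probs: list[float], value: float, cost: float) -> tuple[int, float]:
    """Return (number of guesses, fraction cracked) maximizing expected profit.

    Guess i is only attempted if the first i-1 guesses failed, so its expected
    cost is cost * (1 - cracked_so_far) and its expected reward is value * p_i.
    """
    best_guesses, best_utility, best_cracked = 0, 0.0, 0.0
    utility, cracked = 0.0, 0.0
    for i, p in enumerate(probs, start=1):
        utility += value * p - cost * (1.0 - cracked)
        cracked += p
        if utility > best_utility:
            best_guesses, best_utility, best_cracked = i, utility, cracked
    return best_guesses, best_cracked
```

In this toy model, a sufficiently large ratio of password value to guessing cost makes it optimal to exhaust the whole distribution, which mirrors the threshold behavior the abstract describes.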

preprint, 2020, arXiv

Spaced Repetition and Mnemonics Enable Recall of Multiple Strong Passwords

We report on a user study that provides evidence that spaced repetition and a specific mnemonic technique enable users to successfully recall multiple strong passwords over time. Remote research participants were asked to memorize 4 Person-Action-Object (PAO) stories where they chose a famous person from a drop-down list and were given machine-generated random action-object pairs. Users were also shown a photo of a scene and asked to imagine the PAO story taking place in the scene (e.g., Bill Gates---swallowing---bike on a beach). Subsequently, they were asked to recall the action-object pairs when prompted with the associated scene-person pairs following a spaced repetition schedule over a period of 127+ days. While we evaluated several spaced repetition schedules, the best results were obtained when users initially returned after 12 hours and then in $1.5\times$ increasing intervals: 77% of the participants successfully recalled all 4 stories in 10 tests over a period of 158 days. Much of the forgetting happened in the first test period (12 hours): 89% of participants who remembered their stories during the first test period successfully remembered them in every subsequent round. These findings, coupled with recent results on naturally rehearsing password schemes, suggest that 4 PAO stories could be used to create usable and strong passwords for 14 sensitive accounts following this spaced repetition schedule, possibly with a few extra upfront rehearsals. In addition, we find that there is an interference effect across multiple PAO stories: the recall rate of 100% (resp. 90%) for participants who were asked to memorize 1 PAO story (resp. 2 PAO stories) is significantly better than the recall rate for participants who were asked to memorize 4 PAO stories. These findings yield concrete advice for improving constructions of password management schemes and future user studies.

preprint, 2016, arXiv

CASH: A Cost Asymmetric Secure Hash Algorithm for Optimal Password Protection

An adversary who has obtained the cryptographic hash of a user's password can mount an offline attack to crack the password by comparing this hash value with the cryptographic hashes of likely password guesses. This offline attacker is limited only by the resources he is willing to invest to crack the password. Key-stretching tools can help mitigate the threat of offline attacks by making each password guess more expensive for the adversary to verify. However, key-stretching increases authentication costs for a legitimate authentication server. We introduce a novel Stackelberg game model which captures the essential elements of this interaction between a defender and an offline attacker. We then introduce Cost Asymmetric Secure Hash (CASH), a randomized key-stretching mechanism that minimizes the fraction of passwords that would be cracked by a rational offline attacker without increasing amortized authentication costs for the legitimate authentication server. CASH is motivated by the observation that the legitimate authentication server will typically run the authentication procedure to verify a correct password, while an offline adversary will typically use incorrect password guesses. By using randomization we can ensure that the amortized cost of running CASH to verify a correct password guess is significantly smaller than the cost of rejecting an incorrect password. Using our Stackelberg game framework we can quantify the quality of the underlying CASH running time distribution in terms of the fraction of passwords that a rational offline adversary would crack. We provide an efficient algorithm to compute high quality CASH distributions for the defender. Finally, we analyze CASH using empirical data from two large scale password frequency datasets. Our analysis shows that CASH can significantly reduce (up to $50\%$) the fraction of passwords cracked by a rational offline adversary.

preprint, 2016, arXiv

Client-CASH: Protecting Master Passwords against Offline Attacks

Offline attacks on passwords are increasingly commonplace and dangerous. An offline adversary is limited only by the amount of computational resources he or she is willing to invest to crack a user's password. The danger is compounded by the existence of authentication servers that fail to adopt proper password storage practices like key-stretching. Password managers can help mitigate these risks by adopting key stretching procedures like hash iteration or memory hard functions to derive site specific passwords from the user's master password on the client-side. While key stretching can reduce the offline adversary's success rate, these procedures also increase computational costs for a legitimate user. Motivated by the observation that most of the password guesses of the offline adversary will be incorrect, we propose a client side cost asymmetric secure hashing scheme (Client-CASH). Client-CASH randomizes the runtime of the client-side key stretching procedure in a way that the expected computational cost of our key derivation function is greater when run with an incorrect master password. We make several contributions. First, we show how to introduce randomness into client-side key stretching algorithms through the use of halting predicates which are selected randomly at the time of account creation. Second, we formalize the problem of finding the optimal running time distribution subject to certain cost constraints for the client and certain security constraints on the halting predicates. Finally, we demonstrate that Client-CASH can reduce the adversary's success rate by up to $21\%$. These results demonstrate the promise of the Client-CASH mechanism.

preprint, 2016, arXiv

Towards Human Computable Passwords

An interesting challenge for the cryptography community is to design authentication protocols that are so simple that a human can execute them without relying on a fully trusted computer. We propose several candidate authentication protocols for a setting in which the human user can only receive assistance from a semi-trusted computer --- a computer that stores information and performs computations correctly but does not provide confidentiality. Our schemes use a semi-trusted computer to store and display public challenges $C_i\in[n]^k$. The human user memorizes a random secret mapping $σ:[n]\rightarrow\mathbb{Z}_d$ and authenticates by computing responses $f(σ(C_i))$ to a sequence of public challenges where $f:\mathbb{Z}_d^k\rightarrow\mathbb{Z}_d$ is a function that is easy for the human to evaluate. We prove that any statistical adversary needs to sample $m=\tilde{Ω}(n^{s(f)})$ challenge-response pairs to recover $σ$, for a security parameter $s(f)$ that depends on two key properties of $f$. To obtain our results, we apply the general hypercontractivity theorem to lower bound the statistical dimension of the distribution over challenge-response pairs induced by $f$ and $σ$. Our lower bounds apply to arbitrary functions $f$ (not just to functions that are easy for a human to evaluate), and generalize recent results of Feldman et al. As an application, we propose a family of human computable password functions $f_{k_1,k_2}$ in which the user needs to perform $2k_1+2k_2+1$ primitive operations (e.g., adding two digits or remembering $σ(i)$), and we show that $s(f) = \min\{k_1+1, (k_2+1)/2\}$. For these schemes, we prove that forging passwords is equivalent to recovering the secret mapping. Thus, our human computable password schemes can maintain strong security guarantees even after an adversary has observed the user log in to many different accounts.
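As a worked instance of the formulas in this abstract, take $k_1 = k_2 = 2$. This parameter choice is made here purely for illustration and is not stated in the abstract.

```latex
\[
  2k_1 + 2k_2 + 1 = 2\cdot 2 + 2\cdot 2 + 1 = 9 \ \text{primitive operations},
  \qquad
  s(f) = \min\{k_1 + 1,\ (k_2 + 1)/2\} = \min\{3,\ 1.5\} = 1.5,
\]
\[
  m = \tilde{\Omega}\!\left(n^{s(f)}\right) = \tilde{\Omega}\!\left(n^{1.5}\right).
\]
```

Under this choice the user performs nine primitive operations per response, and the stated bound says a statistical adversary needs on the order of $n^{1.5}$ challenge-response pairs (up to logarithmic factors) to recover $σ$.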

preprint, 2015, arXiv

Audit Games with Multiple Defender Resources

Modern organizations (e.g., hospitals, social networks, government agencies) rely heavily on audit to detect and punish insiders who inappropriately access and disclose confidential information. Recent work on audit games models the strategic interaction between an auditor with a single audit resource and auditees as a Stackelberg game, augmenting associated well-studied security games with a configurable punishment parameter. We significantly generalize this audit game model to account for multiple audit resources where each resource is restricted to audit a subset of all potential violations, thus enabling application to practical auditing scenarios. We provide an FPTAS that computes an approximately optimal solution to the resulting non-convex optimization problem. The main technical novelty is in the design and correctness proof of an optimization transformation that enables the construction of this FPTAS. In addition, we experimentally demonstrate that this transformation significantly speeds up computation of solutions for a class of audit games and security games.

preprint, 2014, arXiv

Set Families with Low Pairwise Intersection

A $\left(n,\ell,γ\right)$-sharing set family of size $m$ is a family of sets $S_1,\ldots,S_m\subseteq [n]$ s.t. each set has size $\ell$ and each pair of sets shares at most $γ$ elements. We let $m\left(n,\ell,γ\right)$ denote the maximum size of any such set family and we consider the following question: How large can $m\left(n,\ell,γ\right)$ be? $\left(n,\ell,γ\right)$-sharing set families have a rich set of applications including the construction of pseudorandom number generators and usable and secure password management schemes. We analyze the explicit construction of Blocki et al using recent bounds on the value of the $t$'th Ramanujan prime. We show that this explicit construction produces a $\left(4\ell^2\ln 4\ell,\ell,γ\right)$-sharing set family of size $\left(2 \ell \ln 2\ell\right)^{γ+1}$ for any $\ell\geq γ$. We also show that the construction of Blocki et al can be used to obtain a weak $\left(n,\ell,γ\right)$-sharing set family of size $m$ for any $m >0$. These results are competitive with the inexplicit construction of Raz et al for weak $\left(n,\ell,γ\right)$-sharing families. We show that our explicit construction of weak $\left(n,\ell,γ\right)$-sharing set families can be used to obtain a parallelizable pseudorandom number generator with a low memory footprint by using the pseudorandom number generator of Nisan and Wigderson. We also prove that $m\left(n,n/c_1,c_2n\right)$ must be a constant whenever $c_2 \leq \frac{2}{c_1^3+c_1^2}$. We show that this bound is nearly tight as $m\left(n,n/c_1,c_2n\right)$ grows exponentially fast whenever $c_2 > c_1^{-2}$.
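The definition at the start of this abstract can be checked mechanically. The following is a minimal sketch of such a checker. The function name and the convention $[n] = \{1, \ldots, n\}$ are assumptions, and it only verifies a given family rather than constructing one.

```python
from itertools import combinations


def is_sharing_family(sets: list[set[int]], n: int, ell: int, gamma: int) -> bool:
    """Check the (n, ell, gamma)-sharing property for a family of sets."""
    universe = set(range(1, n + 1))  # [n] = {1, ..., n}
    # Every set must be a subset of [n] with exactly ell elements.
    if any(len(s) != ell or not s <= universe for s in sets):
        return False
    # Every pair of sets may share at most gamma elements.
    return all(len(a & b) <= gamma for a, b in combinations(sets, 2))
```

For example, the family {1,2,3}, {3,4,5}, {5,6,1} over [6] is (6, 3, 1)-sharing, since each pair of sets overlaps in exactly one element.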

preprint, 2013, arXiv

Adaptive Regret Minimization in Bounded-Memory Games

Online learning algorithms that minimize regret provide strong guarantees in situations that involve repeatedly making decisions in an uncertain environment, e.g. a driver deciding what route to drive to work every day. While regret minimization has been extensively studied in repeated games, we study regret minimization for a richer class of games called bounded memory games. In each round of a two-player bounded memory-m game, both players simultaneously play an action, observe an outcome and receive a reward. The reward may depend on the last m outcomes as well as the actions of the players in the current round. The standard notion of regret for repeated games is no longer suitable because actions and rewards can depend on the history of play. To account for this generality, we introduce the notion of k-adaptive regret, which compares the reward obtained by playing actions prescribed by the algorithm against a hypothetical k-adaptive adversary with the reward obtained by the best expert in hindsight against the same adversary. Roughly, a hypothetical k-adaptive adversary adapts her strategy to the defender's actions exactly as the real adversary would within each window of k rounds. Our definition is parametrized by a set of experts, which can include both fixed and adaptive defender strategies. We investigate the inherent complexity of and design algorithms for adaptive regret minimization in bounded memory games of perfect and imperfect information. We prove a hardness result showing that, with imperfect information, any k-adaptive regret minimizing algorithm (with fixed strategies as experts) must be inefficient unless NP=RP even when playing against an oblivious adversary. In contrast, for bounded memory games of perfect and imperfect information we present approximate 0-adaptive regret minimization algorithms against an oblivious adversary running in time n^{O(1)}.

preprint, 2013, arXiv

Audit Games

Effective enforcement of laws and policies requires expending resources to prevent and detect offenders, as well as appropriate punishment schemes to deter violators. In particular, enforcement of privacy laws and policies in modern organizations that hold large volumes of personal information (e.g., hospitals, banks, and Web services providers) relies heavily on internal audit mechanisms. We study economic considerations in the design of these mechanisms, focusing in particular on effective resource allocation and appropriate punishment schemes. We present an audit game model that is a natural generalization of a standard security game model for resource allocation with an additional punishment parameter. Computing the Stackelberg equilibrium for this game is challenging because it involves solving an optimization problem with non-convex quadratic constraints. We present an additive FPTAS that efficiently computes a solution that is arbitrarily close to the optimal solution.

preprint, 2013, arXiv

Differentially Private Data Analysis of Social Networks via Restricted Sensitivity

We introduce the notion of restricted sensitivity as an alternative to global and smooth sensitivity to improve accuracy in differentially private data analysis. The definition of restricted sensitivity is similar to that of global sensitivity except that instead of quantifying over all possible datasets, we take advantage of any beliefs about the dataset that a querier may have, to quantify over a restricted class of datasets. Specifically, given a query f and a hypothesis H about the structure of a dataset D, we show generically how to transform f into a new query f_H whose global sensitivity (over all datasets including those that do not satisfy H) matches the restricted sensitivity of the query f. Moreover, if the belief of the querier is correct (i.e., D is in H) then f_H(D) = f(D). If the belief is incorrect, then f_H(D) may be inaccurate. We demonstrate the usefulness of this notion by considering the task of answering queries regarding social-networks, which we model as a combination of a graph and a labeling of its vertices. In particular, while our generic procedure is computationally inefficient, for the specific definition of H as graphs of bounded degree, we exhibit efficient ways of constructing f_H using different projection-based techniques. We then analyze two important query classes: subgraph counting queries (e.g., number of triangles) and local profile queries (e.g., number of people who know a spy and a computer-scientist who know each other). We demonstrate that the restricted sensitivity of such queries can be significantly lower than their smooth sensitivity. Thus, using restricted sensitivity we can maintain privacy whether or not D is in H, while providing more accurate results in the event that H holds true.

preprint, 2013, arXiv

GOTCHA Password Hackers!

We introduce GOTCHAs (Generating panOptic Turing Tests to Tell Computers and Humans Apart) as a way of preventing automated offline dictionary attacks against user selected passwords. A GOTCHA is a randomized puzzle generation protocol, which involves interaction between a computer and a human. Informally, a GOTCHA should satisfy two key properties: (1) The puzzles are easy for the human to solve. (2) The puzzles are hard for a computer to solve even if it has the random bits used by the computer to generate the final puzzle --- unlike a CAPTCHA. Our main theorem demonstrates that GOTCHAs can be used to mitigate the threat of offline dictionary attacks against passwords by ensuring that a password cracker must receive constant feedback from a human being while mounting an attack. Finally, we provide a candidate construction of GOTCHAs based on Inkblot images. Our construction relies on the usability assumption that users can recognize the phrases that they originally used to describe each Inkblot image --- a much weaker usability assumption than previous password systems based on Inkblots which required users to recall their phrase exactly. We conduct a user study to evaluate the usability of our GOTCHA construction. We also generate a GOTCHA challenge where we encourage artificial intelligence and security researchers to try to crack several passwords protected with our scheme.

preprint, 2013, arXiv

Naturally Rehearsing Passwords

We introduce quantitative usability and security models to guide the design of password management schemes --- systematic strategies to help users create and remember multiple passwords. In the same way that security proofs in cryptography are based on complexity-theoretic assumptions (e.g., hardness of factoring and discrete logarithm), we quantify usability by introducing usability assumptions. In particular, password management relies on assumptions about human memory, e.g., that a user who follows a particular rehearsal schedule will successfully maintain the corresponding memory. These assumptions are informed by research in cognitive science and validated through empirical studies. Given rehearsal requirements and a user's visitation schedule for each account, we use the total number of extra rehearsals that the user would have to do to remember all of his passwords as a measure of the usability of the password scheme. Our usability model leads us to a key observation: password reuse benefits users not only by reducing the number of passwords that the user has to memorize, but more importantly by increasing the natural rehearsal rate for each password. We also present a security model which accounts for the complexity of password management with multiple accounts and associated threats, including online, offline, and plaintext password leak attacks. Observing that current password management schemes are either insecure or unusable, we present Shared Cues--- a new scheme in which the underlying secret is strategically shared across accounts to ensure that most rehearsal requirements are satisfied naturally while simultaneously providing strong security. The construction uses the Chinese Remainder Theorem to achieve these competing goals.

preprint2013arXiv

Optimizing Password Composition Policies

A password composition policy restricts the space of allowable passwords to eliminate weak passwords that are vulnerable to statistical guessing attacks. Usability studies have demonstrated that existing password composition policies can sometimes result in weaker password distributions; hence a more principled approach is needed. We introduce the first theoretical model for optimizing password composition policies. We study the computational and sample complexity of this problem under different assumptions on the structure of policies and on users' preferences over passwords. Our main positive result is an algorithm that, with high probability, constructs almost optimal policies (which are specified as a union of subsets of allowed passwords), and requires only a small number of samples of users' preferred passwords. We complement our theoretical results with simulations using a real-world dataset of 32 million passwords.

preprint2012arXiv

The Johnson-Lindenstrauss Transform Itself Preserves Differential Privacy

This paper proves that an "old dog", namely the classical Johnson-Lindenstrauss transform, "performs new tricks": it gives a novel way of preserving differential privacy. We show that if we take two databases, $D$ and $D'$, such that (i) $D'-D$ is a rank-1 matrix of bounded norm and (ii) all singular values of $D$ and $D'$ are sufficiently large, then multiplying either $D$ or $D'$ with a vector of iid normal Gaussians yields two statistically close distributions in the sense of differential privacy. Furthermore, a small, deterministic and \emph{public} alteration of the input is enough to assert that all singular values of $D$ are large. We apply the Johnson-Lindenstrauss transform to the task of approximating cut-queries: the number of edges crossing an $(S,\bar S)$-cut in a graph. We show that the JL transform allows us to \emph{publish a sanitized graph} that preserves edge differential privacy (where two graphs are neighbors if they differ on a single edge) while adding only $O(|S|/\epsilon)$ random noise to any given query (w.h.p.). Comparing the additive noise of our algorithm to existing algorithms for answering cut-queries in a differentially private manner, we outperform all others on small cuts ($|S| = o(n)$). We also apply our technique to the task of estimating the variance of a given matrix in any given direction. The JL transform allows us to \emph{publish a sanitized covariance matrix} that preserves differential privacy w.r.t. bounded changes (each row in the matrix can change by at most a norm-1 vector) while adding random noise of magnitude independent of the size of the matrix (w.h.p.). In contrast, existing algorithms introduce an error which depends on the matrix dimensions.
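For concreteness, the cut query mentioned in this abstract, and the notion of neighboring graphs used for edge differential privacy, can be sketched as follows. This shows only the exact, non-private query on a small hypothetical graph; it does not implement the Johnson-Lindenstrauss mechanism from the paper. The function name and the example edges are illustrative.

```python
def cut_size(edges, S):
    """Number of undirected edges with exactly one endpoint in the vertex set S."""
    S = set(S)
    return sum((u in S) != (v in S) for u, v in edges)

# Hypothetical 4-vertex graph: a 4-cycle plus one chord.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]

# A neighboring graph differs in a single edge (here the chord is removed).
neighbor = [e for e in edges if e != (0, 2)]

S = {0, 1}
print(cut_size(edges, S), cut_size(neighbor, S))
```

Because neighboring graphs differ in one edge, the exact answer to any cut query changes by at most one between them; that is the change a private mechanism must mask.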

preprint2010arXiv

Resolving the Complexity of Some Data Privacy Problems

We formally study two methods for data sanitation that have been used extensively in the database community: k-anonymity and l-diversity. We settle several open problems concerning the difficulty of applying these methods optimally, proving both positive and negative results:
1. 2-anonymity is in P.
2. The problem of partitioning the edges of a triangle-free graph into 4-stars (degree-three vertices) is NP-hard. This yields an alternative proof that 3-anonymity is NP-hard even when the database attributes are all binary.
3. 3-anonymity with only 27 attributes per record is MAX SNP-hard.
4. For databases with n rows, k-anonymity is in O(4^n poly(n)) time for all k > 1.
5. For databases with n rows and l <= log_{2c+2} log n attributes over an alphabet of cardinality c = O(1), k-anonymity is in P. Assuming c, l = O(1), k-anonymity is in O(n).
6. 3-diversity with binary attributes is NP-hard, with one sensitive attribute.
7. 2-diversity with binary attributes is NP-hard, with three sensitive attributes.
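As a point of reference for the results above, the standard definition of k-anonymity can be checked with a short sketch: a released table is k-anonymous if every row is identical to at least k-1 other rows. The sketch assumes a suppression model in which hidden cells are replaced by a `*` symbol and the cost is the number of suppressed cells; that model, the function names and the example table are assumptions for illustration, not details confirmed by the abstract.

```python
from collections import Counter

def is_k_anonymous(rows, k):
    """True if every row is identical to at least k-1 other rows."""
    counts = Counter(tuple(r) for r in rows)
    return all(c >= k for c in counts.values())

def suppression_cost(released, star="*"):
    """Number of cells replaced by the suppression symbol (assumed cost measure)."""
    return sum(cell == star for row in released for cell in row)

# Hypothetical binary database and one possible 2-anonymous release of it.
original = [(0, 0, 1), (0, 1, 1), (1, 0, 0), (1, 0, 1)]
released = [(0, "*", 1), (0, "*", 1), (1, 0, "*"), (1, 0, "*")]

print(is_k_anonymous(original, 2), is_k_anonymous(released, 2))
print(suppression_cost(released))
```

Checking a given release is easy; the hardness results listed above concern finding a release of minimum cost.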


22 published items

Cost-Asymmetric Memory Hard Password Hashing

Privately Estimating Graph Parameters in Sublinear time

DAHash: Distribution Aware Tuning of Password Hashing Costs

An Economic Model for Quantum Key-Recovery Attacks against Ideal Ciphers

Bicycle Attacks Considered Harmful: Quantifying the Damage of Widespread Password Length Leakage

DALock: Distribution Aware Password Throttling

On Locally Decodable Codes in Resource Bounded Channels

On the Economics of Offline Password Cracking

Spaced Repetition and Mnemonics Enable Recall of Multiple Strong Passwords

CASH: A Cost Asymmetric Secure Hash Algorithm for Optimal Password Protection

Client-CASH: Protecting Master Passwords against Offline Attacks

Towards Human Computable Passwords

Audit Games with Multiple Defender Resources

Set Families with Low Pairwise Intersection

Adaptive Regret Minimization in Bounded-Memory Games

Audit Games

Differentially Private Data Analysis of Social Networks via Restricted Sensitivity

GOTCHA Password Hackers!

Naturally Rehearsing Passwords

Optimizing Password Composition Policies

The Johnson-Lindenstrauss Transform Itself Preserves Differential Privacy

Resolving the Complexity of Some Data Privacy Problems