Researcher profile

Dan S. Wallach

Dan S. Wallach contributes to research discovery and scholarly infrastructure.

Published work

11 published items

preprint, 2013, arXiv

A Case of Collusion: A Study of the Interface Between Ad Libraries and their Apps

A growing concern with advertisement libraries on Android is their ability to exfiltrate personal information from their host applications. While previous work has looked at the libraries' abilities to measure private information on their own, advertising libraries also include APIs through which a host application can deliberately leak private information about the user. This study considers a corpus of 114,000 apps. We reconstruct the APIs for 103 ad libraries used in the corpus, and study how the privacy leaking APIs from the top 20 ad libraries are used by the applications. Notably, we have found that app popularity correlates with privacy leakage; the marginal increase in advertising revenue, multiplied over a larger user base, seems to incentivize these app vendors to violate their users' privacy.

preprint, 2013, arXiv

Automated generation of web server fingerprints

In this paper, we demonstrate that it is possible to automatically generate fingerprints for various web server types using multifactor Bayesian inference on randomly selected servers on the Internet, without building an a priori catalog of server features or behaviors. This makes it possible to conclusively study web server distribution without relying on reported (and variable) version strings. We gather data by sending a collection of specialized requests to 110,000 live web servers. Using only the server response codes, we then train an algorithm to successfully predict server types independently of the server version string. In the process, we note several distinguishing features of current web infrastructure.
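As a rough illustration of predicting a server type from response codes alone, the sketch below uses a plain naive Bayes classifier with add-one smoothing. This is a simplified stand-in, not the multifactor Bayesian inference the abstract describes. The function names, the probe layout (one status code per probe), and the `num_code_values` smoothing constant are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def train(samples):
    """Count observations from (server_type, status_codes) pairs.

    status_codes is a tuple with one HTTP status code per probe request
    (hypothetical layout, chosen for this sketch).
    """
    type_counts = Counter()
    code_counts = defaultdict(Counter)  # (server_type, probe index) -> code counts
    for server_type, codes in samples:
        type_counts[server_type] += 1
        for probe, code in enumerate(codes):
            code_counts[(server_type, probe)][code] += 1
    return type_counts, code_counts


def predict(model, codes, num_code_values=60):
    """Return the server type with the highest naive Bayes log-score."""
    type_counts, code_counts = model
    total = sum(type_counts.values())
    best_type, best_score = None, -math.inf
    for server_type, n in type_counts.items():
        score = math.log(n / total)  # log prior
        for probe, code in enumerate(codes):
            seen = code_counts[(server_type, probe)][code]
            # Add-one smoothing so unseen codes do not zero out the score.
            score += math.log((seen + 1) / (n + num_code_values))
        if score > best_score:
            best_type, best_score = server_type, score
    return best_type


# Toy training data; the labels and codes are made up for illustration.
samples = [
    ("apache", (200, 400, 501)),
    ("apache", (200, 400, 501)),
    ("nginx", (200, 400, 405)),
    ("nginx", (200, 400, 405)),
]
model = train(samples)
print(predict(model, (200, 400, 405)))
```

With so little toy data the prior and the smoothing constant dominate, so this only shows the shape of the computation, not realistic accuracy.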

preprint, 2013, arXiv

Longitudinal Analysis of Android Ad Library Permissions

This paper investigates changes over time in the behavior of Android ad libraries. Taking a sample of 100,000 apps, we extract and classify the ad libraries. By considering the release dates of the applications that use a specific ad library version, we estimate the release date for the library, and thus build a chronological map of the permissions used by various ad libraries over time. We find that the use of most permissions has increased over the last several years, and that more libraries are able to use permissions that pose particular risks to user privacy and security.
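One simple way to picture the dating step is to take, for each library version, the earliest release date among the apps that embed it, then sort the versions chronologically. The abstract does not say which estimator the authors used, so the "earliest app" rule, the function names, and the sample library and permission data below are all assumptions for illustration.

```python
from datetime import date


def estimate_library_release_dates(apps):
    """Map (library, version) to the earliest release date of any app using it.

    apps is an iterable of (app_release_date, library, version) tuples.
    Using the minimum date is an assumed estimator for this sketch.
    """
    earliest = {}
    for released, library, version in apps:
        key = (library, version)
        if key not in earliest or released < earliest[key]:
            earliest[key] = released
    return earliest


def permission_timeline(release_dates, permissions):
    """Return (date, library, version, permissions) rows in chronological order."""
    rows = [
        (released, library, version, sorted(permissions.get((library, version), ())))
        for (library, version), released in release_dates.items()
    ]
    return sorted(rows)


# Hypothetical data: library name, versions, and permissions are made up.
apps = [
    (date(2011, 5, 1), "adlib", "1.0"),
    (date(2011, 3, 1), "adlib", "1.0"),
    (date(2012, 8, 15), "adlib", "2.0"),
]
permissions = {
    ("adlib", "1.0"): {"INTERNET"},
    ("adlib", "2.0"): {"INTERNET", "ACCESS_FINE_LOCATION"},
}
for row in permission_timeline(estimate_library_release_dates(apps), permissions):
    print(row)
```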

preprint, 2013, arXiv

The Velocity of Censorship: High-Fidelity Detection of Microblog Post Deletions

Weibo and other popular Chinese microblogging sites are well known for exercising internal censorship, to comply with Chinese government requirements. This research seeks to quantify the mechanisms of this censorship: how fast and how comprehensively posts are deleted. Our analysis considered 2.38 million posts gathered over roughly two months in 2012, with our attention focused on repeatedly visiting "sensitive" users. This gives us a view of censorship events within minutes of their occurrence, albeit at a cost of our data no longer representing a random sample of the general Weibo population. We also have a larger sample of 470 million posts from Weibo's public timeline, taken over a longer time period, that is more representative of a random sample. We found that deletions happen most heavily in the first hour after a post has been submitted. Focusing on original posts, not reposts/retweets, we observed that nearly 30% of the total deletion events occur within 5-30 minutes. Nearly 90% of the deletions happen within the first 24 hours. Leveraging our data, we also considered a variety of hypotheses about the mechanisms used by Weibo for censorship, such as the extent to which Weibo's censors use retrospective keyword-based censorship, and how repost/retweet popularity interacts with censorship. We also used natural language processing techniques to analyze which topics were more likely to be censored.

preprint, 2012, arXiv

STAR-Vote: A Secure, Transparent, Auditable, and Reliable Voting System

In her 2011 EVT/WOTE keynote, Travis County, Texas County Clerk Dana DeBeauvoir described the qualities she wanted in her ideal election system to replace their existing DREs. In response, in April of 2012, the authors, working with DeBeauvoir and her staff, jointly architected STAR-Vote, a voting system with a DRE-style human interface and a "belt and suspenders" approach to verifiability. It provides both a paper trail and end-to-end cryptography using COTS hardware. It is designed to support both ballot-level risk-limiting audits and auditing by individual voters and observers. The human interface and process flow are based on modern usability research. This paper describes the STAR-Vote architecture, which could well be the next-generation voting system for Travis County and perhaps elsewhere.

preprint, 2012, arXiv

Tracking and Quantifying Censorship on a Chinese Microblogging Site

We present measurements and analysis of censorship on Weibo, a popular microblogging site in China. Since we were limited in the rate at which we could download posts, we identified users likely to participate in sensitive topics and recursively followed their social contacts. We also leveraged new natural language processing techniques to pick out trending topics despite the use of neologisms, named entities, and informal language usage in Chinese social media. We found that Weibo dynamically adapts to the changing interests of its users through multiple layers of filtering. The filtering includes both retroactively searching posts by keyword or repost links to delete them, and rejecting posts as they are posted. The trend of sensitive topics is short-lived, suggesting that the censorship is effective in stopping the "viral" spread of sensitive issues. We also give evidence that sensitive topics in Weibo only scarcely propagate beyond a core of sensitive posters.

preprint, 2011, arXiv

#h00t: Censorship Resistant Microblogging

Microblogging services such as Twitter are an increasingly important way to communicate, both for individuals and for groups through the use of hashtags that denote topics of conversation. However, groups can be easily blocked from communicating through blocking of posts with the given hashtags. We propose #h00t, a system for censorship resistant microblogging. #h00t presents an interface that is much like Twitter, except that hashtags are replaced with very short hashes (e.g., 24 bits) of the group identifier. Naturally, with such short hashes, hashtags from different groups may collide and #h00t users will actually seek to create collisions. By encrypting all posts with keys derived from the group identifiers, #h00t client software can filter out other groups' posts while making such filtering difficult for the adversary. In essence, by leveraging collisions, groups can tunnel their posts in other groups' posts. A censor could not block a given group without also blocking the other groups with colliding hashtags. We evaluate the feasibility of #h00t through traces collected from Twitter, showing that a single modern computer has enough computational throughput to encrypt every tweet sent through Twitter in real time. We also use these traces to analyze the bandwidth and anonymity tradeoffs that would come with different variations on how group identifiers are encoded and hashtags are selected to purposefully collide with one another.
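The tagging and filtering idea can be sketched with the Python standard library. This is only an illustration under stated assumptions: SHA-256 truncated to 24 bits for the short tag, PBKDF2 with a fixed made-up salt for key derivation, and an HMAC standing in for the encryption step (the abstract says posts are encrypted; this sketch leaves the body in the clear and shows only how a keyed check lets a client discard colliding posts from other groups). None of the names or parameters come from the paper.

```python
import hashlib
import hmac

TAG_BITS = 24  # short hash length mentioned in the abstract


def short_tag(group_id: str) -> str:
    """Derive a short, collision-prone hashtag from the group identifier."""
    digest = hashlib.sha256(group_id.encode("utf-8")).digest()
    return "#" + digest[: TAG_BITS // 8].hex()


def group_key(group_id: str) -> bytes:
    """Derive a per-group key (hypothetical salt and iteration count)."""
    return hashlib.pbkdf2_hmac(
        "sha256", group_id.encode("utf-8"), b"h00t-sketch", 10_000
    )


def make_post(group_id: str, text: str) -> dict:
    """Build a post carrying the short tag and a keyed MAC over the body."""
    body = text.encode("utf-8")
    mac = hmac.new(group_key(group_id), body, hashlib.sha256).digest()
    return {"tag": short_tag(group_id), "body": body, "mac": mac}


def belongs_to(group_id: str, post: dict) -> bool:
    """Client-side filter: tag must match and the MAC must verify under our key."""
    if post["tag"] != short_tag(group_id):
        return False
    expected = hmac.new(group_key(group_id), post["body"], hashlib.sha256).digest()
    return hmac.compare_digest(expected, post["mac"])


mine = make_post("example-group", "meet at noon")
# Simulate a post from another group whose short tag happens to collide.
colliding = dict(make_post("another-group", "unrelated"), tag=mine["tag"])
print(belongs_to("example-group", mine))
print(belongs_to("example-group", colliding))
```

An observer without the group identifier sees only the shared short tag, which is why blocking one group by tag also blocks every group that collides with it.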

preprint, 2011, arXiv

An Analysis of Chinese Search Engine Filtering

The imposition of government mandates upon Internet search engine operation is a growing area of interest for both computer science and public policy. Users of these search engines often observe evidence of censorship, but the government policies that impose this censorship are not generally public. To better understand these policies, we conducted a set of experiments on major search engines employed by Internet users in China, issuing queries against a variety of different words: some neutral, some with names of important people, some political, and some pornographic. We conducted these queries, in Chinese, against Baidu, Google (including google.cn, before it was terminated), Yahoo!, and Bing. We found remarkably aggressive filtering of pornographic terms, in some cases causing non-pornographic terms which use common characters to also be filtered. We also found that names of prominent activists and organizers, as well as top political and military leaders, were filtered in whole or in part. In some cases, we found search terms which we believe to be "blacklisted". In these cases, the only results that appeared, for any of them, came from a short "whitelist" of sites owned or controlled directly by the Chinese government. By repeating observations over a long period, we also found that the keyword blocking policies of the Great Firewall of China vary over time. While our results don't offer any fundamental insight into how to defeat or work around Chinese internet censorship, they are still helpful to understand the structure of how censorship duties are shared between the Great Firewall and Chinese search engines.

preprint, 2011, arXiv

Attacks on Local Searching Tools

The Google Desktop Search is an indexing tool, currently in beta testing, designed to allow users fast, intuitive searching for local files. The principal interface is provided through a local web server which supports an interface similar to Google.com's normal web page. Indexing of local files occurs when the system is idle, and the indexer understands a number of common file types. An optional feature is that Google Desktop can integrate a short summary of local search results with Google.com web searches. This summary includes 30-40 character snippets of local files. We have uncovered a vulnerability that would release private local data to an unauthorized remote entity. Using two different attacks, we expose the small snippets of private local data to a remote third party.

preprint, 2011, arXiv

Building Better Incentives for Robustness in BitTorrent

BitTorrent is a widely-deployed, peer-to-peer file transfer protocol engineered with a "tit for tat" mechanism that encourages cooperation. Unfortunately, there is little incentive for nodes to altruistically provide service to their peers after they finish downloading a file, and what altruism there is can be exploited by aggressive clients like BitTyrant. This altruism, called seeding, is always beneficial and sometimes essential to BitTorrent's real-world performance. We propose a new long-term incentives mechanism in BitTorrent to encourage peers to seed and we evaluate its effectiveness via simulation. We show that when nodes running our algorithm reward one another for good behavior in previous swarms, they experience as much as a 50% improvement in download times over unrewarded nodes. Even when aggressive clients, such as BitTyrant, participate in the swarm, our rewarded nodes still outperform them, although by smaller margins.

preprint, 2011, arXiv

The BitTorrent Anonymity Marketplace

The very nature of operations in peer-to-peer systems such as BitTorrent exposes information about participants to their peers. Nodes desiring anonymity, therefore, often choose to route their peer-to-peer traffic through anonymity relays, such as Tor. Unfortunately, these relays have little incentive for contribution and struggle to scale with the high loads that P2P traffic foists upon them. We propose a novel modification for BitTorrent that we call the BitTorrent Anonymity Marketplace. Peers in our system trade in k swarms, obscuring the actual intent of the participants. But because peers can cross-trade torrents, the k-1 cover traffic can actually serve a useful purpose. This creates a system wherein a neighbor cannot determine if a node actually wants a given torrent, or if it is only using it as leverage to get the one it really wants. In this paper, we present our design, explore its operation in simulation, and analyze its effectiveness. We demonstrate that the upload and download characteristics of cover traffic and desired torrents are statistically difficult to distinguish.