Source author record

Dan S. Wallach

Dan S. Wallach appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Cryptography and Security Networking and Internet Architecture Social and Information Networks cs.CY Information Retrieval Operating Systems Computer Science and Game Theory physics.soc-ph

Catalog footprint

What is connected

17works

8topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2015arXiv

An Empirical Study of Mobile Ad Targeting

Advertising, long the financial mainstay of the web ecosystem, has become nearly ubiquitous in the world of mobile apps. While ad targeting on the web is fairly well understood, mobile ad targeting is much less studied. In this paper, we use empirical methods to collect a database of over 225,000 ads on 32 simulated devices hosting one of three distinct user profiles. We then analyze how the ads are targeted by correlating ads to potential targeting profiles using Bayes' rule and Pearson's chi squared test. This enables us to measure the prevalence of different forms of targeting. We find that nearly all ads show the effects of application- and time-based targeting, while we are able to identify location-based targeting in 43% of the ads and user-based targeting in 39%.

preprint2014arXiv

Aggregate Characterization of User Behavior in Twitter and Analysis of the Retweet Graph

Most previous analysis of Twitter user behavior is focused on individual information cascades and the social followers graph. We instead study aggregate user behavior and the retweet graph with a focus on quantitative descriptions. We find that the lifetime tweet distribution is a type-II discrete Weibull stemming from a power law hazard function, the tweet rate distribution, although asymptotically power law, exhibits a lognormal cutoff over finite sample intervals, and the inter-tweet interval distribution is power law with exponential cutoff. The retweet graph is small-world and scale-free, like the social graph, but is less disassortative and has much stronger clustering. These differences are consistent with it better capturing the real-world social relationships of and trust between users. Beyond just understanding and modeling human communication patterns and social networks, applications for alternative, decentralized microblogging systems-both predicting real-word performance and detecting spam-are discussed.

preprint2014arXiv

Glider: A GPU Library Driver for Improved System Security

Legacy device drivers implement both device resource management and isolation. This results in a large code base with a wide high-level interface making the driver vulnerable to security attacks. This is particularly problematic for increasingly popular accelerators like GPUs that have large, complex drivers. We solve this problem with library drivers, a new driver architecture. A library driver implements resource management as an untrusted library in the application process address space, and implements isolation as a kernel module that is smaller and has a narrower lower-level interface (i.e., closer to hardware) than a legacy driver. We articulate a set of device and platform hardware properties that are required to retrofit a legacy driver into a library driver. To demonstrate the feasibility and superiority of library drivers, we present Glider, a library driver implementation for two GPUs of popular brands, Radeon and Intel. Glider reduces the TCB size and attack surface by about 35% and 84% respectively for a Radeon HD 6450 GPU and by about 38% and 90% respectively for an Intel Ivy Bridge GPU. Moreover, it incurs no performance cost. Indeed, Glider outperforms a legacy driver for applications requiring intensive interactions with the device driver, such as applications using the OpenGL immediate mode API.

preprint2014arXiv

Performance Analysis of Location Profile Routing

We propose using the predictability of human motion to eliminate the overhead of distributed location services in human-carried MANETs, dubbing the technique location profile routing. This method outperforms the Geographic Hashing Location Service when nodes change locations 2x more frequently than they initiate connections (e.g., start new TCP streams), as in applications like text- and instant-messaging. Prior characterizations of human mobility are used to show that location profile routing achieves a 93% delivery ratio with a 1.75x first-packet latency increase relative to an oracle location service.

preprint2014arXiv

The Mason Test: A Defense Against Sybil Attacks in Wireless Networks Without Trusted Authorities

Wireless networks are vulnerable to Sybil attacks, in which a malicious node poses as many identities in order to gain disproportionate influence. Many defenses based on spatial variability of wireless channels exist, but depend either on detailed, multi-tap channel estimation - something not exposed on commodity 802.11 devices - or valid RSSI observations from multiple trusted sources, e.g., corporate access points - something not directly available in ad hoc and delay-tolerant networks with potentially malicious neighbors. We extend these techniques to be practical for wireless ad hoc networks of commodity 802.11 devices. Specifically, we propose two efficient methods for separating the valid RSSI observations of behaving nodes from those falsified by malicious participants. Further, we note that prior signalprint methods are easily defeated by mobile attackers and develop an appropriate challenge-response defense. Finally, we present the Mason test, the first implementation of these techniques for ad hoc and delay-tolerant networks of commodity 802.11 devices. We illustrate its performance in several real-world scenarios.

preprint2013arXiv

A Case of Collusion: A Study of the Interface Between Ad Libraries and their Apps

A growing concern with advertisement libraries on Android is their ability to exfiltrate personal information from their host applications. While previous work has looked at the libraries' abilities to measure private information on their own, advertising libraries also include APIs through which a host application can deliberately leak private information about the user. This study considers a corpus of 114,000 apps. We reconstruct the APIs for 103 ad libraries used in the corpus, and study how the privacy leaking APIs from the top 20 ad libraries are used by the applications. Notably, we have found that app popularity correlates with privacy leakage; the marginal increase in advertising revenue, multiplied over a larger user base, seems to incentivize these app vendors to violate their users' privacy.

preprint2013arXiv

Automated generation of web server fingerprints

In this paper, we demonstrate that it is possible to automatically generate fingerprints for various web server types using multifactor Bayesian inference on randomly selected servers on the Internet, without building an a priori catalog of server features or behaviors. This makes it possible to conclusively study web server distribution without relying on reported (and variable) version strings. We gather data by sending a collection of specialized requests to 110,000 live web servers. Using only the server response codes, we then train an algorithm to successfully predict server types independently of the server version string. In the process, we note several distinguishing features of current web infrastructure.

preprint2013arXiv

Longitudinal Analysis of Android Ad Library Permissions

This paper investigates changes over time in the behavior of Android ad libraries. Taking a sample of 100,000 apps, we extract and classify the ad libraries. By considering the release dates of the applications that use a specific ad library version, we estimate the release date for the library, and thus build a chronological map of the permissions used by various ad libraries over time. We find that the use of most permissions has increased over the last several years, and that more libraries are able to use permissions that pose particular risks to user privacy and security.

preprint2013arXiv

The Velocity of Censorship: High-Fidelity Detection of Microblog Post Deletions

Weibo and other popular Chinese microblogging sites are well known for exercising internal censorship, to comply with Chinese government requirements. This research seeks to quantify the mechanisms of this censorship: how fast and how comprehensively posts are deleted.Our analysis considered 2.38 million posts gathered over roughly two months in 2012, with our attention focused on repeatedly visiting "sensitive" users. This gives us a view of censorship events within minutes of their occurrence, albeit at a cost of our data no longer representing a random sample of the general Weibo population. We also have a larger 470 million post sampling from Weibo's public timeline, taken over a longer time period, that is more representative of a random sample. We found that deletions happen most heavily in the first hour after a post has been submitted. Focusing on original posts, not reposts/retweets, we observed that nearly 30% of the total deletion events occur within 5- 30 minutes. Nearly 90% of the deletions happen within the first 24 hours. Leveraging our data, we also considered a variety of hypotheses about the mechanisms used by Weibo for censorship, such as the extent to which Weibo's censors use retrospective keyword-based censorship, and how repost/retweet popularity interacts with censorship. We also used natural language processing techniques to analyze which topics were more likely to be censored.

preprint2012arXiv

AdSplit: Separating smartphone advertising from applications

A wide variety of smartphone applications today rely on third-party advertising services, which provide libraries that are linked into the hosting application. This situation is undesirable for both the application author and the advertiser. Advertising libraries require additional permissions, resulting in additional permission requests to users. Likewise, a malicious application could simulate the behavior of the advertising library, forging the user's interaction and effectively stealing money from the advertiser. This paper describes AdSplit, where we extended Android to allow an application and its advertising to run as separate processes, under separate user-ids, eliminating the need for applications to request permissions on behalf of their advertising libraries. We also leverage mechanisms from Quire to allow the remote server to validate the authenticity of client-side behavior. In this paper, we quantify the degree of permission bloat caused by advertising, with a study of thousands of downloaded apps. AdSplit automatically recompiles apps to extract their ad services, and we measure minimal runtime overhead. We also observe that most ad libraries just embed an HTML widget within and describe how AdSplit can be designed with this in mind to avoid any need for ads to have native code.

preprint2012arXiv

STAR-Vote: A Secure, Transparent, Auditable, and Reliable Voting System

In her 2011 EVT/WOTE keynote, Travis County, Texas County Clerk Dana DeBeauvoir described the qualities she wanted in her ideal election system to replace their existing DREs. In response, in April of 2012, the authors, working with DeBeauvoir and her staff, jointly architected STAR-Vote, a voting system with a DRE-style human interface and a "belt and suspenders" approach to verifiability. It provides both a paper trail and end-to-end cryptography using COTS hardware. It is designed to support both ballot-level risk-limiting audits, and auditing by individual voters and observers. The human interface and process flow is based on modern usability research. This paper describes the STAR-Vote architecture, which could well be the next-generation voting system for Travis County and perhaps elsewhere.

preprint2012arXiv

Tracking and Quantifying Censorship on a Chinese Microblogging Site

We present measurements and analysis of censorship on Weibo, a popular microblogging site in China. Since we were limited in the rate at which we could download posts, we identified users likely to participate in sensitive topics and recursively followed their social contacts. We also leveraged new natural language processing techniques to pick out trending topics despite the use of neologisms, named entities, and informal language usage in Chinese social media. We found that Weibo dynamically adapts to the changing interests of its users through multiple layers of filtering. The filtering includes both retroactively searching posts by keyword or repost links to delete them, and rejecting posts as they are posted. The trend of sensitive topics is short-lived, suggesting that the censorship is effective in stopping the "viral" spread of sensitive issues. We also give evidence that sensitive topics in Weibo only scarcely propagate beyond a core of sensitive posters.

preprint2011arXiv

#h00t: Censorship Resistant Microblogging

Microblogging services such as Twitter are an increasingly important way to communicate, both for individuals and for groups through the use of hashtags that denote topics of conversation. However, groups can be easily blocked from communicating through blocking of posts with the given hashtags. We propose #h00t, a system for censorship resistant microblogging. #h00t presents an interface that is much like Twitter, except that hashtags are replaced with very short hashes (e.g., 24 bits) of the group identifier. Naturally, with such short hashes, hashtags from different groups may collide and #h00t users will actually seek to create collisions. By encrypting all posts with keys derived from the group identifiers, #h00t client software can filter out other groups' posts while making such filtering difficult for the adversary. In essence, by leveraging collisions, groups can tunnel their posts in other groups' posts. A censor could not block a given group without also blocking the other groups with colliding hashtags. We evaluate the feasibility of #h00t through traces collected from Twitter, showing that a single modern computer has enough computational throughput to encrypt every tweet sent through Twitter in real time. We also use these traces to analyze the bandwidth and anonymity tradeoffs that would come with different variations on how group identifiers are encoded and hashtags are selected to purposefully collide with one another.

preprint2011arXiv

An Analysis of Chinese Search Engine Filtering

The imposition of government mandates upon Internet search engine operation is a growing area of interest for both computer science and public policy. Users of these search engines often observe evidence of censorship, but the government policies that impose this censorship are not generally public. To better understand these policies, we conducted a set of experiments on major search engines employed by Internet users in China, issuing queries against a variety of different words: some neutral, some with names of important people, some political, and some pornographic. We conducted these queries, in Chinese, against Baidu, Google (including google.cn, before it was terminated), Yahoo!, and Bing. We found remarkably aggressive filtering of pornographic terms, in some cases causing non-pornographic terms which use common characters to also be filtered. We also found that names of prominent activists and organizers as well as top political and military leaders, were also filtered in whole or in part. In some cases, we found search terms which we believe to be "blacklisted". In these cases, the only results that appeared, for any of them, came from a short "whitelist" of sites owned or controlled directly by the Chinese government. By repeating observations over a long observation period, we also found that the keyword blocking policies of the Great Firewall of China vary over time. While our results don't offer any fundamental insight into how to defeat or work around Chinese internet censorship, they are still helpful to understand the structure of how censorship duties are shared between the Great Firewall and Chinese search engines.

preprint2011arXiv

Attacks on Local Searching Tools

The Google Desktop Search is an indexing tool, currently in beta testing, designed to allow users fast, intuitive, searching for local files. The principle interface is provided through a local web server which supports an interface similar to Google.com's normal web page. Indexing of local files occurs when the system is idle, and understands a number of common file types. A optional feature is that Google Desktop can integrate a short summary of a local search results with Google.com web searches. This summary includes 30-40 character snippets of local files. We have uncovered a vulnerability that would release private local data to an unauthorized remote entity. Using two different attacks, we expose the small snippets of private local data to a remote third party.

preprint2011arXiv

Building Better Incentives for Robustness in BitTorrent

BitTorrent is a widely-deployed, peer-to-peer file transfer protocol engineered with a "tit for tat" mechanism that encourages cooperation. Unfortunately, there is little incentive for nodes to altruistically provide service to their peers after they finish downloading a file, and what altruism there is can be exploited by aggressive clients like Bit- Tyrant. This altruism, called seeding, is always beneficial and sometimes essential to BitTorrent's real-world performance. We propose a new long-term incentives mechanism in BitTorrent to encourage peers to seed and we evaluate its effectiveness via simulation. We show that when nodes running our algorithm reward one another for good behavior in previous swarms, they experience as much as a 50% improvement in download times over unrewarded nodes. Even when aggressive clients, such as BitTyrant, participate in the swarm, our rewarded nodes still outperform them, although by smaller margins.

preprint2011arXiv

The BitTorrent Anonymity Marketplace

The very nature of operations in peer-to-peer systems such as BitTorrent exposes information about participants to their peers. Nodes desiring anonymity, therefore, often chose to route their peer-to-peer traffic through anonymity relays, such as Tor. Unfortunately, these relays have little incentive for contribution and struggle to scale with the high loads that P2P traffic foists upon them. We propose a novel modification for BitTorrent that we call the BitTorrent Anonymity Marketplace. Peers in our system trade in k swarms obscuring the actual intent of the participants. But because peers can cross-trade torrents, the k-1 cover traffic can actually serve a useful purpose. This creates a system wherein a neighbor cannot determine if a node actually wants a given torrent, or if it is only using it as leverage to get the one it really wants. In this paper, we present our design, explore its operation in simulation, and analyze its effectiveness. We demonstrate that the upload and download characteristics of cover traffic and desired torrents are statistically difficult to distinguish.

Dan S. Wallach

What is connected

Connect this record

See the researcher in context

Building this map preview

17 published item(s)

An Empirical Study of Mobile Ad Targeting

Aggregate Characterization of User Behavior in Twitter and Analysis of the Retweet Graph

Glider: A GPU Library Driver for Improved System Security

Performance Analysis of Location Profile Routing

The Mason Test: A Defense Against Sybil Attacks in Wireless Networks Without Trusted Authorities

A Case of Collusion: A Study of the Interface Between Ad Libraries and their Apps

Automated generation of web server fingerprints

Longitudinal Analysis of Android Ad Library Permissions

The Velocity of Censorship: High-Fidelity Detection of Microblog Post Deletions

AdSplit: Separating smartphone advertising from applications

STAR-Vote: A Secure, Transparent, Auditable, and Reliable Voting System

Tracking and Quantifying Censorship on a Chinese Microblogging Site

#h00t: Censorship Resistant Microblogging

An Analysis of Chinese Search Engine Filtering

Attacks on Local Searching Tools

Building Better Incentives for Robustness in BitTorrent

The BitTorrent Anonymity Marketplace