Source author record

Umar Syed

Umar Syed appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher: unclaimed source record

Machine Learning; Artificial Intelligence; Databases; Computation; Computer Science and Game Theory; Cryptography and Security; cs.CY; Data Structures and Algorithms

Catalog footprint

What is connected

6 works

8 topics

4 close collaborators

preprint, 2022, arXiv

Statistical anonymity: Quantifying reidentification risks without reidentifying users

Data anonymization is an approach to privacy-preserving data release aimed at preventing participant reidentification, and it is an important alternative to differential privacy in applications that cannot tolerate noisy data. Existing algorithms for enforcing $k$-anonymity in the released data assume that the curator performing the anonymization has complete access to the original data. Reasons for limiting this access range from undesirability to complete infeasibility. This paper explores ideas -- objectives, metrics, protocols, and extensions -- for reducing the trust that must be placed in the curator, while still maintaining a statistical notion of $k$-anonymity. We suggest trust (amount of information provided to the curator) and privacy (anonymity of the participants) as the primary objectives of such a framework. We describe a class of protocols aimed at achieving these goals, proposing new metrics of privacy in the process, and proving related bounds. We conclude by discussing a natural extension of this work that completely removes the need for a central curator.
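
For readers unfamiliar with the term, the classical (non-statistical) notion of $k$-anonymity mentioned in this abstract can be sketched as follows. This is a minimal illustration of the standard definition only, not the paper's protocols or its statistical variant; the function name, the toy table, and the choice of quasi-identifier columns are all hypothetical.

```python
from collections import Counter
from typing import Hashable, Iterable, Sequence


def k_anonymity_level(
    rows: Iterable[Sequence[Hashable]],
    quasi_identifier_columns: Sequence[int],
) -> int:
    """Return the largest k for which the table is k-anonymous.

    A table is k-anonymous when every combination of quasi-identifier
    values that occurs in it is shared by at least k rows, so the answer
    is the size of the smallest such group (0 for an empty table).
    """
    group_sizes = Counter(
        tuple(row[i] for i in quasi_identifier_columns) for row in rows
    )
    return min(group_sizes.values(), default=0)


# Hypothetical toy table: (age bracket, ZIP prefix, diagnosis).
records = [
    ("30-39", "021**", "flu"),
    ("30-39", "021**", "asthma"),
    ("40-49", "100**", "flu"),
    ("40-49", "100**", "diabetes"),
    ("40-49", "100**", "flu"),
]

# Treat the first two columns as quasi-identifiers (an assumption).
k = k_anonymity_level(records, quasi_identifier_columns=[0, 1])
print(k)
```

Note that this check needs the full table, which is exactly the complete-access assumption the abstract says existing algorithms rely on.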

preprint, 2015, arXiv

An $\tilde{O}(\frac{1}{\sqrt{T}})$-error online algorithm for retrieving heavily perturbated statistical databases in the low-dimensional querying mode

We give the first $\tilde{O}(\frac{1}{\sqrt{T}})$-error online algorithm for reconstructing noisy statistical databases, where $T$ is the number of (online) sample queries received. The algorithm, which requires only $O(\log T)$ memory, aims to learn a hidden database-vector $w^{*} \in \mathbb{R}^{D}$ in order to accurately answer a stream of queries regarding the hidden database, which arrive in an online fashion from some unknown distribution $\mathcal{D}$. We assume the distribution $\mathcal{D}$ is defined on the neighborhood of a low-dimensional manifold. The presented algorithm runs in $O(dD)$-time per query, where $d$ is the dimensionality of the query-space. Contrary to the classical setting, there is no separate training set that is used by the algorithm to learn the database --- the stream on which the algorithm will be evaluated must also be used to learn the database-vector. The algorithm only has access to a binary oracle $\mathcal{O}$ that answers whether a particular linear function of the database-vector plus random noise is larger than a threshold, which is specified by the algorithm. We note that we allow for a significant $O(D)$ amount of noise to be added while other works focused on the low noise $o(\sqrt{D})$-setting. For a stream of $T$ queries our algorithm achieves an average error $\tilde{O}(\frac{1}{\sqrt{T}})$ by filtering out random noise, adapting threshold values given to the oracle based on its previous answers and, as a consequence, recovering with high precision a projection of a database-vector $w^{*}$ onto the manifold defining the query-space.

preprint, 2013, arXiv

Learning Prices for Repeated Auctions with Strategic Buyers

Inspired by real-time ad exchanges for online display advertising, we consider the problem of inferring a buyer's value distribution for a good when the buyer is repeatedly interacting with a seller through a posted-price mechanism. We model the buyer as a strategic agent, whose goal is to maximize her long-term surplus, and we are interested in mechanisms that maximize the seller's long-term revenue. We define the natural notion of strategic regret --- the lost revenue as measured against a truthful (non-strategic) buyer. We present seller algorithms that are no-(strategic)-regret when the buyer discounts her future surplus --- i.e. the buyer prefers showing advertisements to users sooner rather than later. We also give a lower bound on strategic regret that increases as the buyer's discounting weakens and shows, in particular, that any seller algorithm will suffer linear strategic regret if there is no discounting.
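
One simple reading of the strategic-regret definition in this abstract, for a buyer with a single fixed value, can be sketched as below. Everything here is an illustrative assumption rather than the paper's setup: the fixed value (the abstract speaks of a value distribution), the price-decay seller rule, the threshold buyers, and all names. The benchmark is the revenue `value * horizon` that a seller who knew the value could collect from a truthful buyer.

```python
from typing import Callable

# A buyer policy maps the posted price to an accept/reject decision.
BuyerPolicy = Callable[[float], bool]


def threshold_buyer(threshold: float) -> BuyerPolicy:
    """Hypothetical buyer that accepts any price at or below `threshold`."""
    return lambda price: price <= threshold


def seller_revenue(buyer: BuyerPolicy, horizon: int, decay: float = 0.5) -> float:
    """Hypothetical seller rule: start at price 1.0, multiply the price
    by `decay` after each rejection, and keep it after each acceptance."""
    price, revenue = 1.0, 0.0
    for _ in range(horizon):
        if buyer(price):
            revenue += price
        else:
            price *= decay
    return revenue


def regret_against(
    buyer: BuyerPolicy, value: float, horizon: int, decay: float = 0.5
) -> float:
    """Revenue lost relative to charging a truthful buyer `value` every round."""
    return value * horizon - seller_revenue(buyer, horizon, decay)


value, horizon = 0.5, 100
truthful = threshold_buyer(value)        # accepts iff price <= true value
shading = threshold_buyer(0.5 * value)   # strategically understates its value

truthful_regret = regret_against(truthful, value, horizon)
strategic_regret = regret_against(shading, value, horizon)
print(truthful_regret, strategic_regret)
```

In this toy run the shading buyer sacrifices one extra early round to lock in a lower price for the rest of the horizon, which is the kind of patient behavior that a buyer without discounting can afford.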

preprint, 2012, arXiv

Graphical Models for Bandit Problems

We introduce a rich class of graphical models for multi-armed bandit problems that permit both the state or context space and the action space to be very large, yet succinctly specify the payoffs for any context-action pair. Our main result is an algorithm for such models whose regret is bounded by the number of parameters and whose running time depends only on the treewidth of the graph substructure induced by the action space.

preprint, 2012, arXiv

Imitation Learning with a Value-Based Prior

The goal of imitation learning is for an apprentice to learn how to behave in a stochastic environment by observing a mentor demonstrating the correct behavior. Accurate prior knowledge about the correct behavior can reduce the need for demonstrations from the mentor. We present a novel approach to encoding prior knowledge about the correct behavior, where we assume that this prior knowledge takes the form of a Markov Decision Process (MDP) that is used by the apprentice as a rough and imperfect model of the mentor's behavior. Specifically, taking a Bayesian approach, we treat the value of a policy in this modeling MDP as the log prior probability of the policy. In other words, we assume a priori that the mentor's behavior is likely to be a high value policy in the modeling MDP, though quite possibly different from the optimal policy. We describe an efficient algorithm that, given a modeling MDP and a set of demonstrations by a mentor, provably converges to a stationary point of the log posterior of the mentor's policy, where the posterior is computed with respect to the "value based" prior. We also present empirical evidence that this prior does in fact speed learning of the mentor's policy, and is an improvement in our experiments over similar previous methods.
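
The "value based" prior described in this abstract can be written out as a short worked formula. The notation is an assumption introduced here for illustration: $M$ is the modeling MDP, $V_M(\pi)$ is the value of policy $\pi$ in $M$, and $\mathcal{D}$ is the set of demonstrated state-action pairs, assumed to be drawn from the mentor's policy. Any scaling of the value term that the paper may use is omitted.

$$
\log p(\pi) = V_M(\pi) + \text{const},
\qquad
\log p(\pi \mid \mathcal{D}) = V_M(\pi) + \sum_{(s,a) \in \mathcal{D}} \log \pi(a \mid s) + \text{const}.
$$

Under these assumptions, the log posterior trades off a high value in the modeling MDP against agreement with the demonstrations, and it is this kind of objective whose stationary point the abstract's algorithm is said to reach.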

preprint, 2010, arXiv

Adapting to the Shifting Intent of Search Queries

Search engines today present results that are often oblivious to abrupt shifts in intent. For example, the query 'independence day' usually refers to a US holiday, but the intent of this query abruptly changed during the release of a major film by that name. While no studies exactly quantify the magnitude of intent-shifting traffic, studies suggest that news events, seasonal topics, pop culture, etc. account for 50% of all search queries. This paper shows that the signals a search engine receives can be used both to determine that a shift in intent has happened and to find a result that is now more relevant. We present a meta-algorithm that marries a classifier with a bandit algorithm to achieve regret that depends logarithmically on the number of query impressions, under certain assumptions. We provide strong evidence that this regret is close to the best achievable. Finally, via a series of experiments, we demonstrate that our algorithm outperforms prior approaches, particularly as the amount of intent-shifting traffic increases.