Source author record

Amir Houmansadr

Amir Houmansadr appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Cryptography and Security Machine Learning Artificial Intelligence Computer Vision Networking and Internet Architecture physics.soc-ph Social and Information Networks

Catalog footprint

What is connected

19works

7topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Identifying Models Behind Text-to-Image Leaderboards

Text-to-image (T2I) models are increasingly popular, producing a large share of AI-generated images online. To compare model quality, voting-based leaderboards have become the standard, relying on anonymized model outputs for fairness. In this work, we show that such anonymity can be easily broken. We find that generations from each T2I model form distinctive clusters in the image embedding space, enabling accurate deanonymization without prompt control or training data. Using 22 models and 280 prompts (150K images), our centroid-based method achieves high accuracy and reveals systematic model-specific signatures. We further introduce a prompt-level distinguishability metric and conduct large-scale analyses showing how certain prompts can lead to near-perfect distinguishability. Our findings expose fundamental security flaws in T2I leaderboards and motivate stronger anonymization defenses.

preprint2026arXiv

Network-Level Prompt and Trait Leakage in Local Research Agents

We show that Web and Research Agents (WRAs) -- language-model-based systems that investigate complex topics on the Internet -- are vulnerable to inference attacks by passive network observers. Deployment of WRAs \emph{locally} by organizations and individuals for privacy, legal, or financial purposes exposes them to DNS resolvers, malicious ISPs, VPNs, web proxies, and corporate or government firewalls. However, unlike sporadic and scarce web browsing by humans, WRAs visit $70{-}140$ domains per each request with a distinct timing pattern creating unique privacy risks. Specifically, we demonstrate a novel prompt and user trait leakage attack against WRAs that only leverages their network-level metadata (i.e., visited IP addresses and their timings). We start by building a new dataset of WRA traces based on real user search queries and queries generated by synthetic personas. We define a behavioral metric (called OBELS) to comprehensively assess similarity between original and inferred prompts, showing that our attack recovers over 73\% of the functional and domain knowledge of user prompts. Extending to a multi-session setting, we recover up to 19 of 32 latent traits with high accuracy. Our attack remains effective under partial observability and noisy conditions. Finally, we discuss mitigation strategies that constrain domain diversity or obfuscate traces, showing negligible utility impact while reducing attack effectiveness by an average of 29\%.

preprint2026arXiv

ULTra: Unveiling Latent Token Interpretability in Transformer-Based Understanding and Segmentation

Transformers have revolutionized Computer Vision (CV) through self-attention mechanisms. However, their complexity makes latent token representations difficult to interpret. We introduce ULTra, a framework for interpreting Transformer embeddings and uncovering meaningful semantic patterns within them. ULTra enables unsupervised semantic segmentation using pre-trained models without requiring fine-tuning. Additionally, we propose a self-supervised training approach that refines segmentation performance by learning an external transformation matrix without modifying the underlying model. Our method achieves state-of-the-art performance in unsupervised semantic segmentation, outperforming existing segmentation methods. Furthermore, we validate ULTra for model interpretation on both synthetic and real-world scenarios, including Object Selection and interpretable text summarization using LLMs, demonstrating its broad applicability in explaining the semantic structure of latent token representations.

preprint2022arXiv

E2FL: Equal and Equitable Federated Learning

Federated Learning (FL) enables data owners to train a shared global model without sharing their private data. Unfortunately, FL is susceptible to an intrinsic fairness issue: due to heterogeneity in clients' data distributions, the final trained model can give disproportionate advantages across the participating clients. In this work, we present Equal and Equitable Federated Learning (E2FL) to produce fair federated learning models by preserving two main fairness properties, equity and equality, concurrently. We validate the efficiency and fairness of E2FL in different real-world FL applications, and show that E2FL outperforms existing baselines in terms of the resulting efficiency, fairness of different groups, and fairness among all individual clients.

preprint2022arXiv

FRL: Federated Rank Learning

Federated learning (FL) allows mutually untrusted clients to collaboratively train a common machine learning model without sharing their private/proprietary training data among each other. FL is unfortunately susceptible to poisoning by malicious clients who aim to hamper the accuracy of the commonly trained model through sending malicious model updates during FL's training process. We argue that the key factor to the success of poisoning attacks against existing FL systems is the large space of model updates available to the clients, allowing malicious clients to search for the most poisonous model updates, e.g., by solving an optimization problem. To address this, we propose Federated Rank Learning (FRL). FRL reduces the space of client updates from model parameter updates (a continuous space of float numbers) in standard FL to the space of parameter rankings (a discrete space of integer values). To be able to train the global model using parameter ranks (instead of parameter weights), FRL leverage ideas from recent supermasks training mechanisms. Specifically, FRL clients rank the parameters of a randomly initialized neural network (provided by the server) based on their local training data. The FRL server uses a voting mechanism to aggregate the parameter rankings submitted by clients in each training epoch to generate the global ranking of the next training epoch. Intuitively, our voting-based aggregation mechanism prevents poisoning clients from making significant adversarial modifications to the global model, as each client will have a single vote! We demonstrate the robustness of FRL to poisoning through analytical proofs and experimentation. We also show FRL's high communication efficiency. Our experiments demonstrate the superiority of FRL in real-world FL settings.

preprint2021arXiv

Robust Adversarial Attacks Against DNN-Based Wireless Communication Systems

Deep Neural Networks (DNNs) have become prevalent in wireless communication systems due to their promising performance. However, similar to other DNN-based applications, they are vulnerable to adversarial examples. In this work, we propose an input-agnostic, undetectable, and robust adversarial attack against DNN-based wireless communication systems in both white-box and black-box scenarios. We design tailored Universal Adversarial Perturbations (UAPs) to perform the attack. We also use a Generative Adversarial Network (GAN) to enforce an undetectability constraint for our attack. Furthermore, we investigate the robustness of our attack against countermeasures. We show that in the presence of defense mechanisms deployed by the communicating parties, our attack performs significantly better compared to existing attacks against DNN-based wireless systems. In particular, the results demonstrate that even when employing well-considered defenses, DNN-based wireless communications are vulnerable to adversarial attacks.

preprint2020arXiv

Asymptotic Privacy Loss due to Time Series Matching of Dependent Users

The Internet of Things (IoT) promises to improve user utility by tuning applications to user behavior, but revealing the characteristics of a user's behavior presents a significant privacy risk. Our previous work has established the challenging requirements for anonymization to protect users' privacy in a Bayesian setting in which we assume a powerful adversary who has perfect knowledge of the prior distribution for each user's behavior. However, even sophisticated adversaries do not often have such perfect knowledge; hence, in this paper, we turn our attention to an adversary who must learn user behavior from past data traces of limited length. We also assume there exists dependency between data traces of different users, and the data points of each user are drawn from a normal distribution. Results on the lengths of training sequences and data sequences that result in a loss of user privacy are presented.

preprint2020arXiv

Blind Adversarial Network Perturbations

Deep Neural Networks (DNNs) are commonly used for various traffic analysis problems, such as website fingerprinting and flow correlation, as they outperform traditional (e.g., statistical) techniques by large margins. However, deep neural networks are known to be vulnerable to adversarial examples: adversarial inputs to the model that get labeled incorrectly by the model due to small adversarial perturbations. In this paper, for the first time, we show that an adversary can defeat DNN-based traffic analysis techniques by applying \emph{adversarial perturbations} on the patterns of \emph{live} network traffic.

preprint2020arXiv

Comprehensive Privacy Analysis of Deep Learning: Passive and Active White-box Inference Attacks against Centralized and Federated Learning

Deep neural networks are susceptible to various inference attacks as they remember information about their training data. We design white-box inference attacks to perform a comprehensive privacy analysis of deep learning models. We measure the privacy leakage through parameters of fully trained models as well as the parameter updates of models during training. We design inference algorithms for both centralized and federated learning, with respect to passive and active inference attackers, and assuming different adversary prior knowledge. We evaluate our novel white-box membership inference attacks against deep learning algorithms to trace their training data records. We show that a straightforward extension of the known black-box attacks to the white-box setting (through analyzing the outputs of activation functions) is ineffective. We therefore design new algorithms tailored to the white-box setting by exploiting the privacy vulnerabilities of the stochastic gradient descent algorithm, which is the algorithm used to train deep neural networks. We investigate the reasons why deep learning models may leak information about their training data. We then show that even well-generalized models are significantly susceptible to white-box membership inference attacks, by analyzing state-of-the-art pre-trained and publicly available models for the CIFAR dataset. We also show how adversarial participants, in the federated learning setting, can successfully run active membership inference attacks against other participants, even when the global model achieves high prediction accuracies.

preprint2020arXiv

Improving Deep Learning with Differential Privacy using Gradient Encoding and Denoising

Deep learning models leak significant amounts of information about their training datasets. Previous work has investigated training models with differential privacy (DP) guarantees through adding DP noise to the gradients. However, such solutions (specifically, DPSGD), result in large degradations in the accuracy of the trained models. In this paper, we aim at training deep learning models with DP guarantees while preserving model accuracy much better than previous works. Our key technique is to encode gradients to map them to a smaller vector space, therefore enabling us to obtain DP guarantees for different noise distributions. This allows us to investigate and choose noise distributions that best preserve model accuracy for a target privacy budget. We also take advantage of the post-processing property of differential privacy by introducing the idea of denoising, which further improves the utility of the trained models without degrading their DP guarantees. We show that our mechanism outperforms the state-of-the-art DPSGD; for instance, for the same model accuracy of $96.1\%$ on MNIST, our technique results in a privacy bound of $ε=3.2$ compared to $ε=6$ of DPSGD, which is a significant improvement.

preprint2020arXiv

Membership Privacy for Machine Learning Models Through Knowledge Transfer

Large capacity machine learning (ML) models are prone to membership inference attacks (MIAs), which aim to infer whether the target sample is a member of the target model's training dataset. The serious privacy concerns due to the membership inference have motivated multiple defenses against MIAs, e.g., differential privacy and adversarial regularization. Unfortunately, these defenses produce ML models with unacceptably low classification performances. Our work proposes a new defense, called distillation for membership privacy (DMP), against MIAs that preserves the utility of the resulting models significantly better than prior defenses. DMP leverages knowledge distillation to train ML models with membership privacy. We provide a novel criterion to tune the data used for knowledge transfer in order to amplify the membership privacy of DMP. Our extensive evaluation shows that DMP provides significantly better tradeoffs between membership privacy and classification accuracies compared to state-of-the-art MIA defenses. For instance, DMP achieves ~100% accuracy improvement over adversarial regularization for DenseNet trained on CIFAR100, for similar membership privacy (measured using MIA risk): when the MIA risk is 53.7%, adversarially regularized DenseNet is 33.6% accurate, while DMP-trained DenseNet is 65.3% accurate.

preprint2020arXiv

Practical Traffic Analysis Attacks on Secure Messaging Applications

Instant Messaging (IM) applications like Telegram, Signal, and WhatsApp have become extremely popular in recent years. Unfortunately, such IM services have been targets of continuous governmental surveillance and censorship, as these services are home to public and private communication channels on socially and politically sensitive topics. To protect their clients, popular IM services deploy state-of-the-art encryption mechanisms. In this paper, we show that despite the use of advanced encryption, popular IM applications leak sensitive information about their clients to adversaries who merely monitor their encrypted IM traffic, with no need for leveraging any software vulnerabilities of IM applications. Specifically, we devise traffic analysis attacks that enable an adversary to identify administrators as well as members of target IM channels (e.g., forums) with high accuracies. We believe that our study demonstrates a significant, real-world threat to the users of such services given the increasing attempts by oppressive governments at cracking down controversial IM channels. We demonstrate the practicality of our traffic analysis attacks through extensive experiments on real-world IM communications. We show that standard countermeasure techniques such as adding cover traffic can degrade the effectiveness of the attacks we introduce in this paper. We hope that our study will encourage IM providers to integrate effective traffic obfuscation countermeasures into their software. In the meantime, we have designed and deployed an open-source, publicly available countermeasure system, called IMProxy, that can be used by IM clients with no need for any support from IM providers. We have demonstrated the effectiveness of IMProxy through experiments.

preprint2012arXiv

BotMosaic: Collaborative Network Watermark for Botnet Detection

Recent research has made great strides in the field of detecting botnets. However, botnets of all kinds continue to plague the Internet, as many ISPs and organizations do not deploy these techniques. We aim to mitigate this state by creating a very low-cost method of detecting infected bot host. Our approach is to leverage the botnet detection work carried out by some organizations to easily locate collaborating bots elsewhere. We created BotMosaic as a countermeasure to IRC-based botnets. BotMosaic relies on captured bot instances controlled by a watermarker, who inserts a particular pattern into their network traffic. This pattern can then be detected at a very low cost by client organizations and the watermark can be tuned to provide acceptable false-positive rates. A novel feature of the watermark is that it is inserted collaboratively into the flows of multiple captured bots at once, in order to ensure the signal is strong enough to be detected. BotMosaic can also be used to detect stepping stones and to help trace back to the botmaster. It is content agnostic and can operate on encrypted traffic. We evaluate BotMosaic using simulations and a testbed deployment.

preprint2012arXiv

CensorSpoofer: Asymmetric Communication with IP Spoofing for Censorship-Resistant Web Browsing

A key challenge in censorship-resistant web browsing is being able to direct legitimate users to redirection proxies while preventing censors, posing as insiders, from discovering their addresses and blocking them. We propose a new framework for censorship-resistant web browsing called {\it CensorSpoofer} that addresses this challenge by exploiting the asymmetric nature of web browsing traffic and making use of IP spoofing. CensorSpoofer de-couples the upstream and downstream channels, using a low-bandwidth indirect channel for delivering outbound requests (URLs) and a high-bandwidth direct channel for downloading web content. The upstream channel hides the request contents using steganographic encoding within email or instant messages, whereas the downstream channel uses IP address spoofing so that the real address of the proxies is not revealed either to legitimate users or censors. We built a proof-of-concept prototype that uses encrypted VoIP for this downstream channel and demonstrated the feasibility of using the CensorSpoofer framework in a realistic environment.

preprint2012arXiv

IP over Voice-over-IP for censorship circumvention

Open communication over the Internet poses a serious threat to countries with repressive regimes, leading them to develop and deploy network-based censorship mechanisms within their networks. Existing censorship circumvention systems face different difficulties in providing unobservable communication with their clients; this limits their availability and poses threats to their users. To provide the required unobservability, several recent circumvention systems suggest modifying Internet routers running outside the censored region to intercept and redirect packets to censored destinations. However, these approaches require modifications to ISP networks, and hence requires cooperation from ISP operators and/or network equipment vendors, presenting a substantial deployment challenge. In this report we propose a deployable and unobservable censorship-resistant infrastructure, called FreeWave. FreeWave works by modulating a client's Internet connections into acoustic signals that are carried over VoIP connections. Such VoIP connections are targeted to a server, FreeWave server, that extracts the tunneled traffic of clients and proxies them to the uncensored Internet. The use of actual VoIP connections, as opposed to traffic morphing, allows FreeWave to relay its VoIP connections through oblivious VoIP nodes, hence keeping itself unblockable from censors that perform IP address blocking. Also, the use of end-to-end encryption prevents censors from identifying FreeWave's VoIP connections using packet content filtering technologies, like deep-packet inspection. We prototype the designed FreeWave system over the popular VoIP system of Skype. We show that FreeWave is able to reliably achieve communication bandwidths that are sufficient for web browsing, even when clients are far distanced from the FreeWave server.

preprint2012arXiv

Multi-Flow Attacks Against Network Flow Watermarks: Analysis and Countermeasures

In this paper, we analyze several recent schemes for watermarking network flows that are based on splitting the flow into timing intervals. We show that this approach creates time-dependent correlations that enable an attack that combines multiple watermarked flows. Such an attack can easily be mounted in nearly all applications of network flow watermarking, both in anonymous communication and stepping stone detection. The attack can be used to detect the presence of a watermark, recover the secret parameters, and remove the watermark from a flow. The attack can be effective even if different flows are marked with different values of a watermark. We analyze the efficacy of our attack using a probabilistic model and a Markov-Modulated Poisson Process (MMPP) model of interactive traffic. We also implement our attack and test it using both synthetic and real-world traces, showing that our attack is effective with as few as 10 watermarked flows. Finally, we propose possible countermeasures to defeat the multi-flow attack.

preprint2012arXiv

Non-blind watermarking of network flows

Linking network flows is an important problem in intrusion detection as well as anonymity. Passive traffic analysis can link flows but requires long periods of observation to reduce errors. Active traffic analysis, also known as flow watermarking, allows for better precision and is more scalable. Previous flow watermarks introduce significant delays to the traffic flow as a side effect of using a blind detection scheme; this enables attacks that detect and remove the watermark, while at the same time slowing down legitimate traffic. We propose the first non-blind approach for flow watermarking, called RAINBOW, that improves watermark invisibility by inserting delays hundreds of times smaller than previous blind watermarks, hence reduces the watermark interference on network flows. We derive and analyze the optimum detectors for RAINBOW as well as the passive traffic analysis under different traffic models by using hypothesis testing. Comparing the detection performance of RAINBOW and the passive approach we observe that both RAINBOW and passive traffic analysis perform similarly good in the case of uncorrelated traffic, however, the RAINBOW detector drastically outperforms the optimum passive detector in the case of correlated network flows. This justifies the use of non-blind watermarks over passive traffic analysis even though both approaches have similar scalability constraints. We confirm our analysis by simulating the detectors and testing them against large traces of real network flows.

preprint2012arXiv

SWEET: Serving the Web by Exploiting Email Tunnels

Open communication over the Internet poses a serious threat to countries with repressive regimes, leading them to develop and deploy censorship mechanisms within their networks. Unfortunately, existing censorship circumvention systems do not provide high availability guarantees to their users, as censors can identify, hence disrupt, the traffic belonging to these systems using today's advanced censorship technologies. In this paper we propose SWEET, a highly available censorship-resistant infrastructure. SWEET works by encapsulating a censored user's traffic to a proxy server inside email messages that are carried over by public email service providers, like Gmail and Yahoo Mail. As the operation of SWEET is not bound to specific email providers we argue that a censor will need to block all email communications in order to disrupt SWEET, which is infeasible as email constitutes an important part of today's Internet. Through experiments with a prototype of our system we find that SWEET's performance is sufficient for web traffic. In particular, regular websites are downloaded within couple of seconds.

preprint2011arXiv

Stegobot: construction of an unobservable communication network leveraging social behavior

We propose the construction of an unobservable communications network using social networks. The communication endpoints are vertices on a social network. Probabilistically unobservable communication channels are built by leveraging image steganography and the social image sharing behavior of users. All communication takes place along the edges of a social network overlay connecting friends. We show that such a network can provide decent bandwidth even with a far from optimal routing mechanism such as restricted flooding. We show that such a network is indeed usable by constructing a botnet on top of it, called Stegobot. It is designed to spread via social malware attacks and steal information from its victims. Unlike conventional botnets, Stegobot traffic does not introduce new communication endpoints between bots. We analyzed a real-world dataset of image sharing between members of an online social network. Analysis of Stegobot's network throughput indicates that stealthy as it is, it is also functionally powerful -- capable of channeling fair quantities of sensitive data from its victims to the botmaster at tens of megabytes every month.

Amir Houmansadr

What is connected

Connect this record

See the researcher in context

Building this map preview

19 published item(s)

Identifying Models Behind Text-to-Image Leaderboards

Network-Level Prompt and Trait Leakage in Local Research Agents

ULTra: Unveiling Latent Token Interpretability in Transformer-Based Understanding and Segmentation

E2FL: Equal and Equitable Federated Learning

FRL: Federated Rank Learning

Robust Adversarial Attacks Against DNN-Based Wireless Communication Systems

Asymptotic Privacy Loss due to Time Series Matching of Dependent Users

Blind Adversarial Network Perturbations

Comprehensive Privacy Analysis of Deep Learning: Passive and Active White-box Inference Attacks against Centralized and Federated Learning

Improving Deep Learning with Differential Privacy using Gradient Encoding and Denoising

Membership Privacy for Machine Learning Models Through Knowledge Transfer

Practical Traffic Analysis Attacks on Secure Messaging Applications

BotMosaic: Collaborative Network Watermark for Botnet Detection

CensorSpoofer: Asymmetric Communication with IP Spoofing for Censorship-Resistant Web Browsing

IP over Voice-over-IP for censorship circumvention

Multi-Flow Attacks Against Network Flow Watermarks: Analysis and Countermeasures

Non-blind watermarking of network flows

SWEET: Serving the Web by Exploiting Email Tunnels

Stegobot: construction of an unobservable communication network leveraging social behavior