Source author record

Lalitha Sankar

Lalitha Sankar appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Information Theory, math.IT, Machine Learning, Systems and Control, Cryptography and Security, eess.SY, Computer Science and Game Theory, math.OC, Applications, Artificial Intelligence, Human-Computer Interaction, math.ST, Methodology, Social and Information Networks, Statistics Theory

Catalog footprint

What is connected

42 works

15 topics

4 close collaborators

preprint · 2023 · arXiv

Parameter Optimization with Conscious Allocation (POCA)

The performance of modern machine learning algorithms depends upon the selection of a set of hyperparameters. Common examples of hyperparameters are learning rate and the number of layers in a dense neural network. Auto-ML is a branch of optimization that has produced important contributions in this area. Within Auto-ML, hyperband-based approaches, which eliminate poorly-performing configurations after evaluating them at low budgets, are among the most effective. However, the performance of these algorithms strongly depends on how effectively they allocate the computational budget to various hyperparameter configurations. We present the new Parameter Optimization with Conscious Allocation (POCA), a hyperband-based algorithm that adaptively allocates the inputted budget to the hyperparameter configurations it generates following a Bayesian sampling scheme. We compare POCA to its nearest competitor at optimizing the hyperparameters of an artificial toy function and a deep neural network and find that POCA finds strong configurations faster in both settings.
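As a rough illustration of the elimination step that hyperband-based methods share, the sketch below runs plain successive halving on a made-up objective. It is not POCA: the sampler here is uniform random rather than Bayesian, the budget schedule is fixed rather than adaptive, and `toy_loss`, `successive_halving`, and all parameter values are hypothetical names chosen only for this example.

```python
import random

def toy_loss(lr, budget):
    """Hypothetical objective: loss shrinks with budget and is lowest near lr = 0.1."""
    return (lr - 0.1) ** 2 + 1.0 / budget

def successive_halving(n_configs=16, min_budget=1, eta=2, seed=0):
    """Score configs at a low budget, keep the best 1/eta, multiply the budget by eta, repeat."""
    rng = random.Random(seed)
    configs = [rng.uniform(0.0, 1.0) for _ in range(n_configs)]
    budget = min_budget
    while len(configs) > 1:
        # Rank the surviving configurations at the current budget.
        scored = sorted(configs, key=lambda lr: toy_loss(lr, budget))
        # Eliminate the worst performers, then give survivors a larger budget.
        configs = scored[: max(1, len(configs) // eta)]
        budget *= eta
    return configs[0], budget

best_lr, final_budget = successive_halving()
```

In this toy setting the ranking does not change with budget, so early elimination never discards the eventual winner. Real training curves can cross, which is why the way budget is allocated across configurations matters.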

preprint · 2022 · arXiv

$α$-GAN: Convergence and Estimation Guarantees

We prove a two-way correspondence between the min-max optimization of general CPE loss function GANs and the minimization of associated $f$-divergences. We then focus on $α$-GAN, defined via the $α$-loss, which interpolates several GANs (Hellinger, vanilla, Total Variation) and corresponds to the minimization of the Arimoto divergence. We show that the Arimoto divergences induced by $α$-GAN equivalently converge, for all $α\in \mathbb{R}_{>0}\cup\{\infty\}$. However, under restricted learning models and finite samples, we provide estimation bounds which indicate diverse GAN behavior as a function of $α$. Finally, we present empirical results on a toy dataset that highlight the practical utility of tuning the $α$ hyperparameter.
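For readers unfamiliar with the $α$-loss, the sketch below evaluates it for a single prediction, assuming the form commonly used in the earlier $α$-loss literature: $\frac{α}{α-1}\left(1 - p^{(α-1)/α}\right)$, with log-loss at $α=1$ and $1-p$ at $α=\infty$. The abstract itself does not state the formula, so treat this as an assumption. The function name `alpha_loss` is hypothetical, and the sketch covers only the loss, not the GAN value function built from it.

```python
import math

def alpha_loss(p_true, alpha):
    """alpha-loss incurred when probability p_true is assigned to the correct label."""
    if not 0.0 < p_true <= 1.0:
        raise ValueError("p_true must lie in (0, 1]")
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if alpha == 1:
        return -math.log(p_true)       # limiting case: log-loss
    if math.isinf(alpha):
        return 1.0 - p_true            # limiting case: soft 0-1 loss
    return alpha / (alpha - 1.0) * (1.0 - p_true ** ((alpha - 1.0) / alpha))

# Compare how strongly different alpha values penalize a mediocre prediction.
losses = {a: alpha_loss(0.5, a) for a in (0.5, 1, 2, math.inf)}
```

Smaller values of $α$ penalize low-confidence correct predictions more heavily, which is one way to see why tuning $α$ changes training behavior.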

preprint · 2022 · arXiv

A Complex-LASSO Approach for Localizing Forced Oscillations in Power Systems

We study the problem of localizing multiple sources of forced oscillations (FOs) and estimating their characteristics, such as frequency, phase, and amplitude, using noisy PMU measurements. For each source location, we model the input oscillation as a sum of unknown sinusoidal terms. This allows us to obtain a linear relationship between measurements and the inputs at the unknown sinusoids' frequencies in the frequency domain. We determine these frequencies by thresholding the empirical spectrum of the noisy measurements. Assuming sparsity in the number of FO locations and the number of sinusoids at each location, we cast the location recovery problem as an $\ell_1$-regularized least squares problem in the complex domain -- i.e., complex-LASSO (least absolute shrinkage and selection operator). We numerically solve this optimization problem using the complex-valued coordinate descent method, and show its efficiency on the IEEE 68-bus, 16-machine and WECC 179-bus, 29-machine systems.
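A minimal sketch of cyclic coordinate descent for an $\ell_1$-regularized least squares problem over complex variables follows. It assumes the objective $\frac{1}{2}\|y - Ax\|_2^2 + λ\|x\|_1$ and uses the usual complex soft-thresholding step, which shrinks the modulus and keeps the phase. The paper's exact formulation, scaling, and stopping rule are not given in the abstract, so the function names, the fixed iteration count, and the example data are all assumptions.

```python
def soft_threshold(z, t):
    """Complex soft-thresholding: shrink the modulus of z by t and keep its phase."""
    mag = abs(z)
    return 0j if mag <= t else z * (1.0 - t / mag)

def complex_lasso_cd(A, y, lam, n_iter=200):
    """Minimize 0.5*||y - A x||_2^2 + lam*||x||_1 over complex x by cyclic coordinate descent."""
    m, n = len(A), len(A[0])
    x = [0j] * n
    residual = list(y)  # residual = y - A x, starting from x = 0
    col_norm2 = [sum(abs(A[i][j]) ** 2 for i in range(m)) for j in range(n)]
    for _ in range(n_iter):
        for j in range(n):
            if col_norm2[j] == 0.0:
                continue
            # Correlation of column j with the residual that leaves out coordinate j.
            rho = sum(A[i][j].conjugate() * residual[i] for i in range(m)) + col_norm2[j] * x[j]
            new_xj = soft_threshold(rho, lam) / col_norm2[j]
            delta = new_xj - x[j]
            if delta != 0:
                for i in range(m):
                    residual[i] -= A[i][j] * delta  # keep the residual in sync with x
            x[j] = new_xj
    return x

# Hypothetical 3x2 example: the second column barely correlates with y.
A_demo = [[1 + 0j, 0j], [0j, 1 + 0j], [1j, 0j]]
y_demo = [2 + 1j, 0.1j, -1 + 2j]
x_demo = complex_lasso_cd(A_demo, y_demo, lam=0.5)
```

Entries of `x_demo` that come back exactly zero correspond to columns the penalty has switched off, which is how sparsity translates into a location estimate.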

preprint · 2022 · arXiv

A Machine Learning Framework for Event Identification via Modal Analysis of PMU Data

Power systems are prone to a variety of events (e.g. line trips and generation loss) and real-time identification of such events is crucial in terms of situational awareness, reliability, and security. Using measurements from multiple synchrophasors, i.e., phasor measurement units (PMUs), we propose to identify events by extracting features based on modal dynamics. We combine such traditional physics-based feature extraction methods with machine learning to distinguish different event types. Including all measurement channels at each PMU allows exploiting diverse features but also requires learning classification models over a high-dimensional space. To address this issue, various feature selection methods are implemented to choose the best subset of features. Using the obtained subset of features, we investigate the performance of two well-known classification models, namely, logistic regression (LR) and support vector machines (SVM) to identify generation loss and line trip events in two datasets. The first dataset is obtained from simulated generation loss and line trip events in the Texas 2000-bus synthetic grid. The second is a proprietary dataset with labeled events obtained from a large utility in the USA involving measurements from nearly 500 PMUs. Our results indicate that the proposed framework is promising for identifying the two types of events.

preprint · 2022 · arXiv

A Variational Formula for Infinity-Rényi Divergence with Applications to Information Leakage

We present a variational characterization for the Rényi divergence of order infinity. Our characterization is related to guessing: the objective functional is a ratio of maximal expected values of a gain function applied to the probability of correctly guessing an unknown random variable. An important aspect of our variational characterization is that it remains agnostic to the particular gain function considered, as long as it satisfies some regularity conditions. Also, we define two variants of a tunable measure of information leakage, the maximal $α$-leakage, and obtain closed-form expressions for these information measures by leveraging our variational characterization.
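For orientation, the quantity being characterized has a simple direct form on finite alphabets: $D_\infty(P\|Q) = \log \max_x P(x)/Q(x)$. The sketch below computes that standard definition only. It does not implement the variational formula or the maximal $α$-leakage variants from the paper, and the function name is hypothetical.

```python
import math

def renyi_divergence_infinity(p, q):
    """D_inf(P||Q) = log max_x P(x)/Q(x) for finite distributions given as equal-length lists."""
    ratios = []
    for px, qx in zip(p, q):
        if px > 0 and qx == 0:
            return math.inf        # P puts mass where Q has none
        if px > 0:
            ratios.append(px / qx)
    return math.log(max(ratios))

d_example = renyi_divergence_infinity([0.5, 0.5], [0.25, 0.75])
```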

preprint · 2022 · arXiv

Being Properly Improper

Properness for supervised losses stipulates that the loss function shapes the learning algorithm towards the true posterior of the data generating distribution. Unfortunately, data in modern machine learning can be corrupted or twisted in many ways. Hence, optimizing a proper loss function on twisted data could perilously lead the learning algorithm towards the twisted posterior, rather than to the desired clean posterior. Many papers cope with specific twists (e.g., label/feature/adversarial noise), but there is a growing need for a unified and actionable understanding atop properness. Our chief theoretical contribution is a generalization of the properness framework with a notion called twist-properness, which delineates loss functions with the ability to "untwist" the twisted posterior into the clean posterior. Notably, we show that a nontrivial extension of a loss function called $α$-loss, which was first introduced in information theory, is twist-proper. We study the twist-proper $α$-loss under a novel boosting algorithm, called PILBoost, and provide formal and experimental results for this algorithm. Our overarching practical conclusion is that the twist-proper $α$-loss outperforms the proper $\log$-loss on several variants of twisted data.

preprint · 2022 · arXiv

Cactus Mechanisms: Optimal Differential Privacy Mechanisms in the Large-Composition Regime

Most differential privacy mechanisms are applied (i.e., composed) numerous times on sensitive data. We study the design of optimal differential privacy mechanisms in the limit of a large number of compositions. As a consequence of the law of large numbers, in this regime the best privacy mechanism is the one that minimizes the Kullback-Leibler divergence between the conditional output distributions of the mechanism given two different inputs. We formulate an optimization problem to minimize this divergence subject to a cost constraint on the noise. We first prove that additive mechanisms are optimal. Since the optimization problem is infinite dimensional, it cannot be solved directly; nevertheless, we quantize the problem to derive near-optimal additive mechanisms that we call "cactus mechanisms" due to their shape. We show that our quantization approach can be arbitrarily close to an optimal mechanism. Surprisingly, for quadratic cost, the Gaussian mechanism is strictly sub-optimal compared to this cactus mechanism. Finally, we provide numerical results which indicate that the cactus mechanism outperforms the Gaussian mechanism for a finite number of compositions.

preprint · 2022 · arXiv

Generating Fair Universal Representations using Adversarial Models

We present a data-driven framework for learning fair universal representations (FUR) that guarantee statistical fairness for any learning task that may not be known a priori. Our framework leverages recent advances in adversarial learning to allow a data holder to learn representations in which a set of sensitive attributes are decoupled from the rest of the dataset. We formulate this as a constrained minimax game between an encoder and an adversary where the constraint ensures a measure of usefulness (utility) of the representation. The resulting problem is that of censoring, i.e., finding a representation that is least informative about the sensitive attributes given a utility constraint. For appropriately chosen adversarial loss functions, our censoring framework precisely clarifies the optimal adversarial strategy against strong information-theoretic adversaries; it also achieves the fairness measure of demographic parity for the resulting constrained representations. We evaluate the performance of our proposed framework on both synthetic and publicly available datasets. For these datasets, we use two tradeoff measures: censoring vs. representation fidelity and fairness vs. utility for downstream tasks, to amply demonstrate that multiple sensitive features can be effectively censored even as the resulting fair representations ensure accuracy for multiple downstream tasks.

preprint · 2022 · arXiv

Localization and Estimation of Unknown Forced Inputs: A Group LASSO Approach

We model and study the problem of localizing a set of sparse forcing inputs for linear dynamical systems from noisy measurements when the initial state is unknown. This problem is of particular relevance to detecting forced oscillations in electric power networks. We express measurements as an additive model comprising the initial state and inputs grouped over time, both expanded in terms of the basis functions (i.e., impulse response coefficients). Using this model, with probabilistic guarantees, we recover the locations and simultaneously estimate the initial state and forcing inputs using a variant of the group LASSO (least absolute shrinkage and selection operator) method. Specifically, we provide a tight upper bound on: (i) the probability that the group LASSO estimator wrongly identifies the source locations, and (ii) the $\ell_2$-norm of the estimation error. Our bounds explicitly depend upon the length of the measurement horizon, the noise statistics, the number of inputs and sensors, and the singular values of impulse response matrices. Our theoretical analysis is one of the first to provide a complete treatment for the group LASSO estimator for linear dynamical systems under input-to-output delay assumptions. Finally, we validate our results on synthetic models and the IEEE 68-bus, 16-machine system.

preprint · 2022 · arXiv

Lower Bounds for the MMSE via Neural Network Estimation and Their Applications to Privacy

The minimum mean-square error (MMSE) achievable by optimal estimation of a random variable $Y\in\mathbb{R}$ given another random variable $X\in\mathbb{R}^{d}$ is of much interest in a variety of statistical settings. In the context of estimation-theoretic privacy, the MMSE has been proposed as an information leakage measure that captures the ability of an adversary in estimating $Y$ upon observing $X$. In this paper we establish provable lower bounds for the MMSE based on a two-layer neural network estimator of the MMSE and the Barron constant of an appropriate function of the conditional expectation of $Y$ given $X$. Furthermore, we derive a general upper bound for the Barron constant that, when $X\in\mathbb{R}$ is post-processed by the additive Gaussian mechanism and $Y$ is binary, produces order optimal estimates in the large noise regime. In order to obtain numerical lower bounds for the MMSE in some concrete applications, we introduce an efficient optimization process that approximates the value of the proposed neural network estimator. Overall, we provide an effective machinery to obtain provable lower bounds for the MMSE.

preprint · 2022 · arXiv

Parameter Estimation in Ill-conditioned Low-inertia Power Systems

This paper examines model parameter estimation in dynamic power systems whose governing electro-mechanical equations are ill-conditioned or singular. This ill-conditioning arises because converter-interfaced generators contribute zero or small inertia. Consequently, the overall system inertia decreases, resulting in low-inertia power systems. We show that standard least-squares or subspace estimators based on the state-space model fail to exist for these models. We overcome this challenge by considering a least-squares estimator directly on the coupled swing-equation model rather than on its transformed first-order state-space form. We specifically focus on estimating inertia (mechanical and virtual) and damping constants, although our method is general enough for estimating other parameters. Our theoretical analysis highlights the role of network topology on the parameter estimates of an individual generator. For generators with greater connectivity, estimation of the associated parameters is more susceptible to variations in other generator states. Furthermore, we numerically show that estimating the parameters by ignoring their ill-conditioning aspects yields highly unreliable results.

preprint · 2022 · arXiv

PMU Tracker: A Visualization Platform for Epicentric Event Propagation Analysis in the Power Grid

The electrical power grid is a critical infrastructure, with disruptions in transmission having severe repercussions on daily activities, across multiple sectors. To identify, prevent, and mitigate such events, power grids are being refurbished as 'smart' systems that include the widespread deployment of GPS-enabled phasor measurement units (PMUs). PMUs provide fast, precise, and time-synchronized measurements of voltage and current, enabling real-time wide-area monitoring and control. However, the potential benefits of PMUs, for analyzing grid events like abnormal power oscillations and load fluctuations, are hindered by the fact that these sensors produce large, concurrent volumes of noisy data. In this paper, we describe working with power grid engineers to investigate how this problem can be addressed from a visual analytics perspective. As a result, we have developed PMU Tracker, an event localization tool that supports power grid operators in visually analyzing and identifying power grid events and tracking their propagation through the power grid's network. As a part of the PMU Tracker interface, we develop a novel visualization technique which we term an epicentric cluster dendrogram, which allows operators to analyze the effects of an event as it propagates outwards from a source location. We robustly validate PMU Tracker with: (1) a usage scenario demonstrating how PMU Tracker can be used to analyze anomalous grid events, and (2) case studies with power grid operators using a real-world interconnection dataset. Our results indicate that PMU Tracker effectively supports the analysis of power grid events; we also demonstrate and discuss how PMU Tracker's visual analytics approach can be generalized to other domains composed of time-varying networks with epicentric event characteristics.

preprint · 2022 · arXiv

The Saddle-Point Accountant for Differential Privacy

We introduce a new differential privacy (DP) accountant called the saddle-point accountant (SPA). SPA approximates privacy guarantees for the composition of DP mechanisms in an accurate and fast manner. Our approach is inspired by the saddle-point method -- a ubiquitous numerical technique in statistics. We prove rigorous performance guarantees by deriving upper and lower bounds for the approximation error offered by SPA. The crux of SPA is a combination of large-deviation methods with central limit theorems, which we derive via exponentially tilting the privacy loss random variables corresponding to the DP mechanisms. One key advantage of SPA is that it runs in constant time for the $n$-fold composition of a privacy mechanism. Numerical experiments demonstrate that SPA achieves comparable accuracy to state-of-the-art accounting methods with a faster runtime.

preprint · 2021 · arXiv

Three Variants of Differential Privacy: Lossless Conversion and Applications

We consider three different variants of differential privacy (DP), namely approximate DP, Rényi DP (RDP), and hypothesis test DP. In the first part, we develop a machinery for optimally relating approximate DP to RDP based on the joint range of two $f$-divergences that underlie the approximate DP and RDP. In particular, this enables us to derive the optimal approximate DP parameters of a mechanism that satisfies a given level of RDP. As an application, we apply our result to the moments accountant framework for characterizing privacy guarantees of noisy stochastic gradient descent (SGD). When compared to the state-of-the-art, our bounds may lead to about 100 more stochastic gradient descent iterations for training deep learning models for the same privacy budget. In the second part, we establish a relationship between RDP and hypothesis test DP which allows us to translate the RDP constraint into a tradeoff between type I and type II error probabilities of a certain binary hypothesis test. We then demonstrate that for noisy SGD our result leads to tighter privacy guarantees compared to the recently proposed $f$-DP framework for some range of parameters.

preprint · 2020 · arXiv

$N-1$ Reliability Makes It Difficult for False Data Injection Attacks to Cause Physical Consequences

This paper demonstrates that false data injection (FDI) attacks are extremely limited in their ability to cause physical consequences on $N-1$ reliable power systems operating with real-time contingency analysis (RTCA) and security constrained economic dispatch (SCED). Prior work has shown that FDI attacks can be designed via an attacker-defender bi-level linear program (ADBLP) to cause physical overflows after re-dispatch using DCOPF. In this paper, it is shown that attacks designed using DCOPF fail to cause overflows on $N-1$ reliable systems because the system response modeled is inaccurate. An ADBLP that accurately models the system response is proposed to find the worst-case physical consequences, thereby modeling a strong attacker with system level knowledge. Simulation results on the synthetic Texas system with 2000 buses show that even with the new enhanced attacks, for systems operated conservatively due to $N-1$ constraints, the designed attacks only lead to post-contingency overflows. Moreover, the attacker must control a large portion of measurements and physically create a contingency in the system to cause consequences. Therefore, it is conceivable but requires an extremely sophisticated attacker to cause physical consequences on $N-1$ reliable power systems operated with RTCA and SCED.

preprint · 2020 · arXiv

A Better Bound Gives a Hundred Rounds: Enhanced Privacy Guarantees via $f$-Divergences

We derive the optimal differential privacy (DP) parameters of a mechanism that satisfies a given level of Rényi differential privacy (RDP). Our result is based on the joint range of two $f$-divergences that underlie the approximate and the Rényi variations of differential privacy. We apply our result to the moments accountant framework for characterizing privacy guarantees of stochastic gradient descent. When compared to the state-of-the-art, our bounds may lead to about 100 more stochastic gradient descent iterations for training deep learning models for the same privacy budget.
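To make the RDP-to-DP conversion concrete, the sketch below applies the classical conversion rule, under which $(α, ε_R)$-RDP implies $(ε_R + \log(1/δ)/(α-1), δ)$-DP, and minimizes over a grid of orders $α$. This is the baseline kind of bound that results like the one above tighten, not the optimal conversion derived in the paper. It also assumes a plain sensitivity-1 Gaussian mechanism composed $n$ times, without the subsampling used in noisy SGD. The function names and parameter values are hypothetical.

```python
import math

def rdp_gaussian(alpha, sigma, n_compositions=1):
    """RDP of order alpha for the n-fold composition of a sensitivity-1 Gaussian mechanism."""
    return n_compositions * alpha / (2.0 * sigma ** 2)

def rdp_to_dp(rdp_fn, delta, alphas):
    """Classical conversion: take the smallest eps_R(alpha) + log(1/delta)/(alpha - 1) over the grid."""
    return min(rdp_fn(a) + math.log(1.0 / delta) / (a - 1.0) for a in alphas)

sigma, n, delta = 4.0, 100, 1e-5
alphas = [1.0 + k / 100.0 for k in range(1, 10000)]   # orders strictly greater than 1
eps = rdp_to_dp(lambda a: rdp_gaussian(a, sigma, n), delta, alphas)

# For this mechanism the minimization over alpha has a closed form, which the grid should approach.
c = n / (2.0 * sigma ** 2)
eps_closed_form = c + 2.0 * math.sqrt(c * math.log(1.0 / delta))
```

Any conversion that returns a smaller $ε$ for the same RDP curve leaves room for more iterations under a fixed privacy budget, which is the sense in which a better bound buys additional rounds.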

preprint · 2020 · arXiv

Detecting Load Redistribution Attacks via Support Vector Models

A machine learning-based detection framework is proposed to detect a class of cyber-attacks that redistribute loads by modifying measurements. The detection framework consists of a multi-output support vector regression (SVR) load predictor that predicts loads by exploiting both spatial and temporal correlations, and a subsequent support vector machine (SVM) attack detector to determine the existence of load redistribution (LR) attacks utilizing loads predicted by the SVR predictor. Historical load data for training the SVR are obtained from the publicly available PJM zonal loads and are mapped to the IEEE 30-bus system. The SVM is trained using normal data and randomly created LR attacks, and is tested against both random and intelligently designed LR attacks. The results show that the proposed detection framework can effectively detect LR attacks. Moreover, attack mitigation can be achieved by using the SVR predicted loads to re-dispatch generations.

preprint · 2020 · arXiv

Detection and Localization of Load Redistribution Attacks on Large Scale Systems

A nearest neighbor-based detection scheme against load redistribution attacks is presented. The detector is designed to scale from small to very large systems while guaranteeing consistent detection performance. Extensive testing is performed on a realistic, large scale system to evaluate the performance of the proposed detector against a wide range of attacks, from simple random noise attacks to sophisticated load redistribution attacks. The detection capability is analyzed against different attack parameters to evaluate its sensitivity. Finally, a statistical test that leverages the proposed detection algorithm is introduced to identify which loads are likely to have been maliciously modified, thus localizing the attack subgraph. This test is based on ascribing to each load a risk measure (probability of being attacked) and then computing the best posterior likelihood that minimizes log-loss.

preprint · 2020 · arXiv

On the Robustness of Information-Theoretic Privacy Measures and Mechanisms

Consider a data publishing setting for a dataset composed of both private and non-private features. The publisher uses an empirical distribution, estimated from $n$ i.i.d. samples, to design a privacy mechanism which is applied to fresh samples afterward. In this paper, we study the discrepancy between the privacy-utility guarantees for the empirical distribution, used to design the privacy mechanism, and those for the true distribution, experienced by the privacy mechanism in practice. We first show that, for any privacy mechanism, these discrepancies vanish at speed $O(1/\sqrt{n})$ with high probability. These bounds follow from our main technical results regarding the Lipschitz continuity of the considered information leakage measures. Then we prove that the optimal privacy mechanisms for the empirical distribution approach the corresponding mechanisms for the true distribution as the sample size $n$ increases, thereby establishing the statistical consistency of the optimal privacy mechanisms. Finally, we introduce and study uniform privacy mechanisms which, by construction, provide privacy to all the distributions within a neighborhood of the estimated distribution and, thereby, guarantee privacy for the true distribution with high probability.

preprint · 2016 · arXiv

Evaluating Power System Vulnerability to False Data Injection Attacks via Scalable Optimization

Physical consequences to power systems of false data injection cyber-attacks are considered. Prior work has shown that the worst-case consequences of such an attack can be determined using a bi-level optimization problem, wherein an attack is chosen to maximize the physical power flow on a target line subsequent to re-dispatch. This problem can be solved as a mixed-integer linear program, but it is difficult to scale to large systems due to numerical challenges. Three new computationally efficient algorithms to solve this problem are presented. These algorithms provide lower and upper bounds on the system vulnerability measured as the maximum power flow subsequent to an attack. Using these techniques, vulnerability assessments are conducted for the IEEE 118-bus system and the Polish system with 2383 buses.

preprint · 2016 · arXiv

Hypothesis Testing in the High Privacy Limit

Binary hypothesis testing under the Neyman-Pearson formalism is a statistical inference framework for distinguishing data generated by two different source distributions. Privacy restrictions may require the curator of the data or the data respondents themselves to share data with the test only after applying a randomizing privacy mechanism. Using mutual information as the privacy metric and the relative entropy between the two distributions of the output (post-randomization) source classes as the utility metric (motivated by the Chernoff-Stein Lemma), this work focuses on finding an optimal mechanism that maximizes the chosen utility function while ensuring that the mutual information based leakage for both source distributions is bounded. Focusing on the high privacy regime, a Euclidean information-theoretic (E-IT) approximation to the tradeoff problem is presented. It is shown that the solution to the E-IT approximation is independent of the alphabet size and clarifies that a mutual information based privacy metric preserves the privacy of the source symbols in inverse proportion to their likelihood.

preprint · 2016 · arXiv

Market Segmentation for Privacy Differentiated "Free" Services

The emerging marketplace for online free services in which service providers earn revenue from using consumer data in direct and indirect ways has led to significant privacy concerns. This leads to the following question: can the online marketplace sustain multiple service providers (SPs) that offer privacy-differentiated free services? This paper studies the problem of market segmentation for the free online services market by augmenting the classical Hotelling model for market segmentation analysis to include the fact that for the free services market, a consumer values service not in monetized terms but by its quality of service (QoS) and that the differentiator of services is not product price but the privacy risk advertised by an SP. Building upon the Hotelling model, this paper presents a parametrized model for SP profit and consumer valuation of service for both the two- and multi-SP problems to show that: (i) when consumers place a high value on privacy, it leads to a lower use of private data by SPs (i.e., their advertised privacy risk reduces), and thus, SPs compete on the QoS; (ii) SPs that are capable of differentiating on services that do not directly target consumers gain larger market share; and (iii) a higher valuation of privacy by consumers forces SPs with smaller untargeted revenue to offer lower privacy risk to attract more consumers. The work also illustrates the market segmentation problem for more than two SPs and highlights the instability of such markets.

preprint · 2016 · arXiv

Privacy-guaranteed Two-Agent Interactions Using Information-Theoretic Mechanisms

This paper introduces a multi-round interaction problem with privacy constraints between two agents that observe correlated data. The agents alternately share data with one another for a total of K rounds such that each agent initiates sharing over K/2 rounds. The interactions are modeled as a collection of K random mechanisms (mappings), one for each round. The goal is to jointly design the K private mechanisms to determine the set of all achievable distortion-leakage pairs at each agent. Arguing that a mutual information-based leakage metric can be appropriate for streaming data settings, this paper: (i) determines the set of all achievable distortion-leakage tuples; (ii) shows that the K mechanisms allow for precisely composing the total privacy budget over K rounds without loss; and (iii) develops conditions under which interaction reduces the net leakage at both agents and illustrates it for a specific class of sources. The paper then focuses on log-loss distortion to better understand the effect on leakage of using a commonly used utility metric in learning theory. The resulting interaction problem leads to a non-convex sum-leakage-distortion optimization problem that can be viewed as an interactive version of the information bottleneck problem. A new merge-and-search algorithm that extends the classical agglomerative information bottleneck algorithm to the interactive setting is introduced to determine a provable locally optimal solution. Finally, the benefit of interaction under log-loss is illustrated for specific source classes and the optimality of one-shot is proved for Gaussian sources under both mean-square and log-loss distortion constraints.

preprint · 2015 · arXiv

Designing Incentive Schemes For Privacy-Sensitive Users

Businesses (retailers) often wish to offer personalized advertisements (coupons) to individuals (consumers), but run the risk of strong reactions from consumers who want a customized shopping experience but feel their privacy has been violated. Existing models for privacy such as differential privacy or information theory try to quantify privacy risk but do not capture the subjective experience and heterogeneous expression of privacy-sensitivity. We propose a Markov decision process (MDP) model to capture (i) different consumer privacy sensitivities via a time-varying state; (ii) different coupon types (action set) for the retailer; and (iii) the action-and-state-dependent cost for perceived privacy violations. For the simple case with two states ("Normal" and "Alerted"), two coupons (targeted and untargeted), and consumer behavior statistics known to the retailer, we show that a stationary threshold-based policy is the optimal coupon-offering strategy for a retailer that wishes to minimize its expected discounted cost. The threshold is a function of all model parameters; the retailer offers a targeted coupon if their belief that the consumer is in the "Alerted" state is below the threshold. We extend this two-state model to consumers with multiple privacy-sensitivity states as well as coupon-dependent state transition probabilities. Furthermore, we study the case with imperfect (noisy) cost feedback from consumers and uncertain initial belief state.

preprint · 2015 · arXiv

Implication of Unobservable State-and-topology Cyber-physical Attacks

This paper studies the physical consequences of a class of unobservable state-and-topology cyber-physical attacks in which both state and topology data for a sub-network of the network are changed by an attacker to mask a physical attack. The problem is formulated as a two-stage optimization problem which aims to cause overload in a line of the network with limited attack resources. It is shown that unobservable state-and-topology cyber-physical attacks as studied in this paper can make the system operation more vulnerable to line outages and failures.

preprint · 2015 · arXiv

Vulnerability Analysis and Consequences of False Data Injection Attack on Power System State Estimation

An unobservable false data injection (FDI) attack on AC state estimation (SE) is introduced and its consequences on the physical system are studied. With a focus on understanding the physical consequences of FDI attacks, a bi-level optimization problem is introduced whose objective is to maximize the physical line flows subsequent to an FDI attack on DC SE. The maximization is subject to constraints on both attacker resources (size of attack) and attack detection (limiting load shifts) as well as those required by DC optimal power flow (OPF) following SE. The resulting attacks are tested on a more realistic non-linear system model using AC state estimation and ACOPF, and it is shown that, with an appropriately chosen sub-network, the attacker can overload transmission lines with moderate shifts of load.

preprint 2014 arXiv

Asymptotics and Non-asymptotics for Universal Fixed-to-Variable Source Coding

Universal fixed-to-variable lossless source coding for memoryless sources is studied in the finite blocklength and higher-order asymptotics regimes. Optimal third-order coding rates are derived for general fixed-to-variable codes and for prefix codes. It is shown that the non-prefix Type Size code, in which codeword lengths are chosen in ascending order of type class size, achieves the optimal third-order rate and outperforms classical Two-Stage codes. Converse results are proved making use of a result on the distribution of the empirical entropy and Laplace's approximation. Finally, the fixed-to-variable coding problem without a prefix constraint is shown to be essentially the same as the universal guessing problem.
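
The ordering idea behind the Type Size code can be sketched for very small blocklengths as follows. This is a brute-force illustration under stated assumptions, not the paper's construction: the function names are hypothetical, ties between sequences of equal type class size are broken lexicographically (an arbitrary choice), and the i-th sequence in the ranking is mapped to the i-th binary string in order of increasing length, starting with the empty string.

```python
from collections import Counter
from itertools import product
from math import factorial


def type_class_size(seq: str) -> int:
    """Number of sequences sharing the empirical distribution (type) of seq."""
    size = factorial(len(seq))
    for count in Counter(seq).values():
        size //= factorial(count)  # exact: this is a multinomial coefficient
    return size


def type_size_code_lengths(alphabet: str, n: int) -> dict:
    """Codeword length of every length-n sequence under a non-prefix code.

    Sequences are ranked by ascending type class size; rank r (0-based)
    receives the r-th binary string, whose length is floor(log2(r + 1)).
    """
    sequences = ["".join(s) for s in product(alphabet, repeat=n)]
    ranked = sorted(sequences, key=lambda s: (type_class_size(s), s))
    return {s: (rank + 1).bit_length() - 1 for rank, s in enumerate(ranked)}


lengths = type_size_code_lengths("01", 3)
```

For binary sequences of length 3, the two constant sequences have type classes of size 1 and so receive the shortest codewords, while the six mixed sequences (type class size 3) receive longer ones.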

preprint 2014 arXiv

Enabling Data Exchange in Interactive State Estimation under Privacy Constraints

Data collecting agents in large networks, such as the electric power system, need to share information (measurements) for estimating the system state in a distributed manner. However, privacy concerns may limit or prevent this exchange, leading to a tradeoff between state estimation fidelity and privacy (referred to as competitive privacy). This paper builds upon a recent information-theoretic result (using mutual information to measure privacy and mean-squared error to measure fidelity) that quantifies the region of achievable distortion-leakage tuples in a two-agent network. The objective of this paper is to study centralized and decentralized mechanisms that can enable and sustain non-trivial data exchanges among the agents. A centralized mechanism determines the data sharing policies that optimize a network-wide objective function combining the fidelities and leakages at both agents. Using common-goal games and best-response analysis, the optimal policies allow for distributed implementation. In contrast, in the decentralized setting, repeated discounted games are shown to naturally enable data exchange without any central control or economic incentives. The effect of repetition is modeled by a time-averaged payoff function at each agent, which combines its fidelity and leakage at each interaction stage. For both approaches, it is shown that non-trivial data exchange can be sustained for specific fidelity ranges even when privacy is a limiting factor.

preprint 2013 arXiv

Utility-Privacy Tradeoff in Databases: An Information-theoretic Approach

Ensuring the usefulness of electronic data sources while providing necessary privacy guarantees is an important unsolved problem. This problem drives the need for an analytical framework that can quantify the safety of personally identifiable information (privacy) while still providing a quantifiable benefit (utility) to multiple legitimate information consumers. This paper presents an information-theoretic framework that promises an analytical model guaranteeing tight bounds on how much utility is possible for a given level of privacy and vice versa. Specific contributions include: i) stochastic data models for both categorical and numerical data; ii) utility-privacy tradeoff regions and the encoding (sanitization) schemes achieving them for both classes and their practical relevance; and iii) modeling of prior knowledge at the user and/or data source and optimal encoding schemes for both cases.

preprint 2012 arXiv

Distributed Estimation in Multi-Agent Networks

A problem of distributed state estimation at multiple agents that are physically connected and have competitive interests is mapped to a distributed source coding problem with additional privacy constraints. The agents interact to estimate their own states to a desired fidelity from their (sensor) measurements, which are functions of both the local state and the states at the other agents. For a Gaussian state and measurement model, it is shown that the sum-rate achieved by a distributed protocol in which the agents broadcast to one another is a lower bound on that of a centralized protocol in which the agents broadcast as if to a virtual CEO, converging only in the limit of a large number of agents. The sufficiency of encoding using local measurements is also proved for both protocols.

preprint 2011 arXiv

Competitive Privacy in the Smart Grid: An Information-theoretic Approach

Advances in sensing and communication capabilities as well as power industry deregulation are driving the need for distributed state estimation in the smart grid at the level of the regional transmission organizations (RTOs). This leads to a new competitive privacy problem amongst the RTOs since there is a tension between sharing data to ensure network reliability (utility/benefit to all RTOs) and withholding data for profitability and privacy reasons. The resulting tradeoff between utility, quantified via the fidelity of the state estimate at each RTO, and privacy, quantified via the leakage of the state of one RTO at other RTOs, is captured precisely using a lossy source coding problem formulation for a two-RTO network. For this two-RTO model, it is shown that the set of all feasible utility-privacy pairs can be achieved via a single round of communication when each RTO communicates taking into account the correlation between the measured data at both RTOs. The lossy source coding problem and solution developed here are also of independent interest.

preprint 2011 arXiv

Discriminatory Lossy Source Coding: Side Information Privacy

A lossy source coding problem is studied in which a source encoder communicates with two decoders, one with and one without correlated side information, with an additional constraint on the privacy of the side information at the uninformed decoder. Two cases of this problem arise depending on the availability of the side information at the encoder. The set of all feasible rate-distortion-equivocation tuples is characterized for both cases. The difference between the informed and uninformed cases and the advantages of encoder side information for enhancing privacy are highlighted for a binary symmetric source with erasure side information and Hamming distortion.

preprint 2011 arXiv

Ergodic Fading Interference Channels: Sum-Capacity and Separability

The sum-capacity for specific sub-classes of ergodic fading Gaussian two-user interference channels (IFCs) is developed under the assumption of perfect channel state information at all transmitters and receivers. For the sub-classes of uniformly strong (every fading state is strong) and ergodic very strong two-sided IFCs (a mix of strong and weak fading states satisfying specific fading averaged conditions) the optimality of completely decoding the interference, i.e., converting the IFC to a compound multiple access channel (C-MAC), is proved. It is also shown that this capacity-achieving scheme requires encoding and decoding jointly across all fading states. As an achievable scheme and also as a topic of independent interest, the capacity region and the corresponding optimal power policies for an ergodic fading C-MAC are developed. For the sub-class of uniformly weak IFCs (every fading state is weak), genie-aided outer bounds are developed. The bounds are shown to be achieved by treating interference as noise and by separable coding for one-sided fading IFCs. Finally, for the sub-class of one-sided hybrid IFCs (a mix of weak and strong states that do not satisfy ergodic very strong conditions), an achievable scheme involving rate splitting and joint coding across all fading states is developed and is shown to perform at least as well as a separable coding scheme.

preprint 2011 arXiv

Multi-User Privacy: The Gray-Wyner System and Generalized Common Information

The problem of preserving privacy when a multivariate source is required to be revealed partially to multiple users is modeled as a Gray-Wyner source coding problem with K correlated sources at the encoder and K decoders in which the kth decoder, k = 1, 2, ..., K, losslessly reconstructs the kth source via a common link and a private link. The privacy requirement of keeping each decoder oblivious of all sources other than the one intended for it is introduced via an equivocation constraint at each decoder such that the total equivocation summed over all decoders is E. The set of achievable rates-equivocation tuples is completely characterized. Using this characterization, two different definitions of common information are presented and are shown to be equivalent.

preprint 2011 arXiv

Smart Meter Privacy: A Utility-Privacy Framework

End-user privacy in smart meter measurements is a well-known challenge in the smart grid. The solutions offered thus far have been tied to specific technologies such as batteries or assumptions on data usage. Existing solutions have also not quantified the loss of benefit (utility) that results from any such privacy-preserving approach. Using tools from information theory, a new framework is presented that abstracts both the privacy and the utility requirements of smart meter data. This leads to a novel, tractable privacy-utility tradeoff problem with minimal assumptions. Specifically, for a stationary Gaussian Markov model of the electricity load, it is shown that the optimal utility-and-privacy preserving solution requires filtering out frequency components that are low in power, and this approach appears to encompass most of the proposed privacy approaches.
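
The idea of suppressing low-power frequency components can be illustrated with the classical reverse water-filling allocation from Gaussian rate-distortion theory. This is a sketch under assumptions: the abstract does not spell out the exact procedure, so reverse water-filling is used here as a stand-in, and the function name, the component powers, and the water level are all hypothetical.

```python
import math


def reverse_water_filling(component_powers, water_level):
    """Return (distortion, rate_in_bits) for each independent Gaussian component.

    Components whose power does not exceed the water level are dropped
    entirely (distortion equals their power, rate 0). The remaining
    components are described with distortion equal to the water level.
    """
    if water_level <= 0:
        raise ValueError("water_level must be positive")
    allocation = []
    for power in component_powers:
        if power <= water_level:
            allocation.append((power, 0.0))  # low-power component is filtered out
        else:
            rate = 0.5 * math.log2(power / water_level)
            allocation.append((water_level, rate))
    return allocation


# Hypothetical spectral powers of three frequency components.
allocation = reverse_water_filling([4.0, 1.0, 0.25], water_level=1.0)
```

In this example only the strongest component (power 4.0) receives a non-zero rate; the two components at or below the water level are not described at all.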

preprint 2010 arXiv

A General Coding Scheme for Two-User Fading Interference Channels

A Han-Kobayashi based achievable scheme is presented for ergodic fading two-user Gaussian interference channels (IFCs) with perfect channel state information at all nodes and Gaussian codebooks with no time-sharing. Using max-min optimization techniques, it is shown that jointly coding across all states performs at least as well as separable coding for the sub-classes of uniformly weak (every sub-channel is weak) and hybrid (mix of strong and weak sub-channels that do not achieve the interference-free sum-capacity) IFCs. For the uniformly weak IFCs, sufficient conditions are obtained for which the sum-rate is maximized when interference is ignored at both receivers.

preprint 2010 arXiv

An Information-theoretic Approach to Privacy

Ensuring the usefulness of electronic data sources while providing necessary privacy guarantees is an important unsolved problem. This problem drives the need for an overarching analytical framework that can quantify the safety of personally identifiable information (privacy) while still providing a quantifiable benefit (utility) to multiple legitimate information consumers. State of the art approaches have predominantly focused on privacy. This paper presents the first information-theoretic approach that promises an analytical model guaranteeing tight bounds on how much utility is possible for a given level of privacy and vice versa.

preprint 2010 arXiv

Utility and Privacy of Data Sources: Can Shannon Help Conceal and Reveal Information?

The problem of private information "leakage" (inadvertently or by malicious design) from the myriad large centralized searchable data repositories drives the need for an analytical framework that quantifies unequivocally how safe private data can be (privacy) while still providing useful benefit (utility) to multiple legitimate information consumers. Rate distortion theory is shown to be a natural choice to develop such a framework which includes the following: modeling of data sources, developing application independent utility and privacy metrics, quantifying utility-privacy tradeoffs irrespective of the type of data sources or the methods of providing privacy, developing a side-information model for dealing with questions of external knowledge, and studying a successive disclosure problem for multiple query data sources.

preprint 2008 arXiv

Coalitions in Cooperative Wireless Networks

Cooperation between rational users in wireless networks is studied using coalitional game theory. Using the rate achieved by a user as its utility, it is shown that the stable coalition structure, i.e., the set of coalitions from which users have no incentive to defect, depends on the manner in which the rate gains are apportioned among the cooperating users. Specifically, the stability of the grand coalition (GC), i.e., the coalition of all users, is studied. Transmitter and receiver cooperation in an interference channel (IC) are studied as illustrative cooperative models to determine the stable coalitions for both flexible (transferable) and fixed (non-transferable) apportioning schemes. It is shown that the stable sum-rate optimal coalition when only receivers cooperate by jointly decoding (transferable) is the GC. The stability of the GC depends on the detector when receivers cooperate using linear multiuser detectors (non-transferable). Transmitter cooperation is studied assuming that all receivers cooperate perfectly and that users outside a coalition act as jammers. The stability of the GC is studied for both the case of perfectly cooperating transmitters (transferable) and under a partial decode-and-forward strategy (non-transferable). In both cases, the stability is shown to depend on the channel gains and the transmitter jamming strengths.

preprint 2008 arXiv

Relay vs. User Cooperation in Time-Duplexed Multiaccess Networks

The performance of user-cooperation in a multi-access network is compared to that of using a wireless relay. Using the total transmit and processing power consumed at all nodes as a cost metric, the outage probabilities achieved by dynamic decode-and-forward (DDF) and amplify-and-forward (AF) are compared for the two networks. A geometry-inclusive high signal-to-noise ratio (SNR) outage analysis in conjunction with area-averaged numerical simulations shows that user and relay cooperation achieve a maximum diversity of K and 2 respectively for a K-user multiaccess network under both DDF and AF. However, when accounting for energy costs of processing and communication, relay cooperation can be more energy efficient than user cooperation, i.e., relay cooperation achieves coding (SNR) gains, particularly in the low SNR regime, that override the diversity advantage of user cooperation.

preprint 2008 arXiv

Sum-Capacity of Ergodic Fading Interference and Compound Multiaccess Channels

The problem of resource allocation is studied for two-sender two-receiver fading Gaussian interference channels (IFCs) and compound multiaccess channels (C-MACs). The senders in an IFC communicate with their own receiver (unicast) while those in a C-MAC communicate with both receivers (multicast). The instantaneous fading state between every transmit-receive pair in this network is assumed to be known at all transmitters and receivers. Under an average power constraint at each source, the sum-capacity of the C-MAC and the power policy that achieves this capacity are developed. The conditions defining the classes of strong and very strong ergodic IFCs are presented and the multicast sum-capacity is shown to be tight for both classes.

preprint 2007 arXiv

Opportunistic Communications in an Orthogonal Multiaccess Relay Channel

The problem of resource allocation is studied for a two-user fading orthogonal multiaccess relay channel (MARC) where both users (sources) communicate with a destination in the presence of a relay. A half-duplex relay is considered that transmits on a channel orthogonal to that used by the sources. The instantaneous fading state between every transmit-receive pair in this network is assumed to be known at both the transmitter and receiver. Under an average power constraint at each source and the relay, the sum-rate for the achievable strategy of decode-and-forward (DF) is maximized over all power allocations (policies) at the sources and relay. It is shown that the sum-rate maximizing policy exploits the multiuser fading diversity to reveal the optimality of opportunistic channel use by each user. A geometric interpretation of the optimal power policy is also presented.

42 published items

Parameter Optimization with Conscious Allocation (POCA)

$α$-GAN: Convergence and Estimation Guarantees

A Complex-LASSO Approach for Localizing Forced Oscillations in Power Systems

A Machine Learning Framework for Event Identification via Modal Analysis of PMU Data

A Variational Formula for Infinity-Rényi Divergence with Applications to Information Leakage

Being Properly Improper

Cactus Mechanisms: Optimal Differential Privacy Mechanisms in the Large-Composition Regime

Generating Fair Universal Representations using Adversarial Models

Localization and Estimation of Unknown Forced Inputs: A Group LASSO Approach

Lower Bounds for the MMSE via Neural Network Estimation and Their Applications to Privacy

Parameter Estimation in Ill-conditioned Low-inertia Power Systems

PMU Tracker: A Visualization Platform for Epicentric Event Propagation Analysis in the Power Grid

The Saddle-Point Accountant for Differential Privacy

Three Variants of Differential Privacy: Lossless Conversion and Applications

$N-1$ Reliability Makes It Difficult for False Data Injection Attacks to Cause Physical Consequences

A Better Bound Gives a Hundred Rounds: Enhanced Privacy Guarantees via $f$-Divergences

Detecting Load Redistribution Attacks via Support Vector Models

Detection and Localization of Load Redistribution Attacks on Large Scale Systems

On the Robustness of Information-Theoretic Privacy Measures and Mechanisms

Evaluating Power System Vulnerability to False Data Injection Attacks via Scalable Optimization

Hypothesis Testing in the High Privacy Limit

Market Segmentation for Privacy Differentiated "Free" Services

Privacy-guaranteed Two-Agent Interactions Using Information-Theoretic Mechanisms

Designing Incentive Schemes For Privacy-Sensitive Users

Implication of Unobservable State-and-topology Cyber-physical Attacks

Vulnerability Analysis and Consequences of False Data Injection Attack on Power System State Estimation

Asymptotics and Non-asymptotics for Universal Fixed-to-Variable Source Coding

Enabling Data Exchange in Interactive State Estimation under Privacy Constraints

Utility-Privacy Tradeoff in Databases: An Information-theoretic Approach

Distributed Estimation in Multi-Agent Networks

Competitive Privacy in the Smart Grid: An Information-theoretic Approach

Discriminatory Lossy Source Coding: Side Information Privacy

Ergodic Fading Interference Channels: Sum-Capacity and Separability

Multi-User Privacy: The Gray-Wyner System and Generalized Common Information

Smart Meter Privacy: A Utility-Privacy Framework

A General Coding Scheme for Two-User Fading Interference Channels

An Information-theoretic Approach to Privacy

Utility and Privacy of Data Sources: Can Shannon Help Conceal and Reveal Information?

Coalitions in Cooperative Wireless Networks

Relay vs. User Cooperation in Time-Duplexed Multiaccess Networks

Sum-Capacity of Ergodic Fading Interference and Compound Multiaccess Channels

Opportunistic Communications in an Orthogonal Multiaccess Relay Channel