Source author record

Ketan Rajawat

Ketan Rajawat appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

math.OC Machine Learning Networking and Internet Architecture eess.SP Information Theory math.IT Computer Vision Distributed, Parallel, and Cluster Computing Cryptography and Security eess.SY Multiagent Systems Multimedia Robotics Systems and Control

Catalog footprint

What is connected

18works

14topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

FedNew: A Communication-Efficient and Privacy-Preserving Newton-Type Method for Federated Learning

Newton-type methods are popular in federated learning due to their fast convergence. Still, they suffer from two main issues, namely: low communication efficiency and low privacy due to the requirement of sending Hessian information from clients to parameter server (PS). In this work, we introduced a novel framework called FedNew in which there is no need to transmit Hessian information from clients to PS, hence resolving the bottleneck to improve communication efficiency. In addition, FedNew hides the gradient information and results in a privacy-preserving approach compared to the existing state-of-the-art. The core novel idea in FedNew is to introduce a two level framework, and alternate between updating the inverse Hessian-gradient product using only one alternating direction method of multipliers (ADMM) step and then performing the global model update using Newton's method. Though only one ADMM pass is used to approximate the inverse Hessian-gradient product at each iteration, we develop a novel theoretical approach to show the converging behavior of FedNew for convex problems. Additionally, a significant reduction in communication overhead is achieved by utilizing stochastic quantization. Numerical results using real datasets show the superiority of FedNew compared to existing methods in terms of communication costs.

preprint2022arXiv

Sharpened Quasi-Newton Methods: Faster Superlinear Rate and Larger Local Convergence Neighborhood

Non-asymptotic analysis of quasi-Newton methods have gained traction recently. In particular, several works have established a non-asymptotic superlinear rate of $\mathcal{O}((1/\sqrt{t})^t)$ for the (classic) BFGS method by exploiting the fact that its error of Newton direction approximation approaches zero. Moreover, a greedy variant of BFGS was recently proposed which accelerates its convergence by directly approximating the Hessian, instead of the Newton direction, and achieves a fast local quadratic convergence rate. Alas, the local quadratic convergence of Greedy-BFGS requires way more updates compared to the number of iterations that BFGS requires for a local superlinear rate. This is due to the fact that in Greedy-BFGS the Hessian is directly approximated and the Newton direction approximation may not be as accurate as the one for BFGS. In this paper, we close this gap and present a novel BFGS method that has the best of both worlds in that it leverages the approximation ideas of both BFGS and Greedy-BFGS to properly approximate the Newton direction and the Hessian matrix simultaneously. Our theoretical results show that our method out-performs both BFGS and Greedy-BFGS in terms of convergence rate, while it reaches its quadratic convergence rate with fewer steps compared to Greedy-BFGS. Numerical experiments on various datasets also confirm our theoretical findings.

preprint2022arXiv

Sparse Representations of Positive Functions via First and Second-Order Pseudo-Mirror Descent

We consider expected risk minimization problems when the range of the estimator is required to be nonnegative, motivated by the settings of maximum likelihood estimation (MLE) and trajectory optimization. To facilitate nonlinear interpolation, we hypothesize that the search space is a Reproducing Kernel Hilbert Space (RKHS). We develop first and second-order variants of stochastic mirror descent employing (i) \emph{pseudo-gradients} and (ii) complexity-reducing projections. Compressive projection in the first-order scheme is executed via kernel orthogonal matching pursuit (KOMP), which overcomes the fact that the vanilla RKHS parameterization grows unbounded with the iteration index in the stochastic setting. Moreover, pseudo-gradients are needed when gradient estimates for cost are only computable up to some numerical error, which arise in, e.g., integral approximations. Under constant step-size and compression budget, we establish tradeoffs between the radius of convergence of the expected sub-optimality and the projection budget parameter, as well as non-asymptotic bounds on the model complexity. To refine the solution's precision, we develop a second-order extension which employs recursively averaged pseudo-gradient outer-products to approximate the Hessian inverse, whose convergence in mean is established under an additional eigenvalue decay condition on the Hessian of the optimal RKHS element, which is unique to this work. Experiments demonstrate favorable performance on inhomogeneous Poisson Process intensity estimation in practice.

preprint2022arXiv

Stochastic Compositional Gradient Descent under Compositional Constraints

This work studies constrained stochastic optimization problems where the objective and constraint functions are convex and expressed as compositions of stochastic functions. The problem arises in the context of fair classification, fair regression, and the design of queuing systems. Of particular interest is the large-scale setting where an oracle provides the stochastic gradients of the constituent functions, and the goal is to solve the problem with a minimal number of calls to the oracle. Owing to the compositional form, the stochastic gradients provided by the oracle do not yield unbiased estimates of the objective or constraint gradients. Instead, we construct approximate gradients by tracking the inner function evaluations, resulting in a quasi-gradient saddle point algorithm. We prove that the proposed algorithm is guaranteed to find the optimal and feasible solution almost surely. We further establish that the proposed algorithm requires $\mathcal{O}(1/ε^4)$ data samples in order to obtain an $ε$-approximate optimal point while also ensuring zero constraint violation. The result matches the sample complexity of the stochastic compositional gradient descent method for unconstrained problems and improves upon the best-known sample complexity results for the constrained settings. The efficacy of the proposed algorithm is tested on both fair classification and fair regression problems. The numerical results show that the proposed algorithm outperforms the state-of-the-art algorithms in terms of the convergence rate.

preprint2022arXiv

Variational Bayesian Filtering with Subspace Information for Extreme Spatio-Temporal Matrix Completion

Missing data is a common problem in real-world sensor data collection. The performance of various approaches to impute data degrade rapidly in the extreme scenarios of low data sampling and noisy sampling, a case present in many real-world problems in the field of traffic sensing and environment monitoring, etc. However, jointly exploiting the spatiotemporal and periodic structure, which is generally not captured by classical matrix completion approaches, can improve the imputation performance of sensor data in such real-world conditions. We present a Bayesian approach towards spatiotemporal matrix completion wherein we estimate the underlying temporarily varying subspace using a Variational Bayesian technique. We jointly couple the low-rank matrix completion with the state space autoregressive framework along with a penalty function on the slowly varying subspace to model the temporal and periodic evolution in the data. A major advantage of our method is that a critical parameter like the rank of the model is automatically tuned using the automatic relevance determination (ARD) approach, unlike most matrix/tensor completion techniques. We also propose a robust version of the above formulation, which improves the performance of imputation in the presence of outliers. We evaluate the proposed Variational Bayesian Filtering with Subspace Information (VBFSI) method to impute matrices in real-world traffic and air pollution data. Simulation results demonstrate that the proposed method outperforms the recent state-of-the-art methods and provides a sufficiently accurate imputation for different sampling rates. In particular, we demonstrate that fusing the subspace evolution over days can improve the imputation performance with even 15% of the data sampling.

preprint2022arXiv

Zeroth and First Order Stochastic Frank-Wolfe Algorithms for Constrained Optimization

This paper considers stochastic convex optimization problems with two sets of constraints: (a) deterministic constraints on the domain of the optimization variable, which are difficult to project onto; and (b) deterministic or stochastic constraints that admit efficient projection. Problems of this form arise frequently in the context of semidefinite programming as well as when various NP-hard problems are solved approximately via semidefinite relaxation. Since projection onto the first set of constraints is difficult, it becomes necessary to explore projection-free algorithms, such as the stochastic Frank-Wolfe (FW) algorithm. On the other hand, the second set of constraints cannot be handled in the same way, and must be incorporated as an indicator function within the objective function, thereby complicating the application of FW methods. Similar problems have been studied before; however, they suffer from slow convergence rates. This work, equipped with momentum based gradient tracking technique, guarantees fast convergence rates on par with the best-known rates for problems without the second set of constraints. Zeroth-order variants of the proposed algorithms are also developed and again improve upon the state-of-the-art rate results. We further propose the novel trimmed FW variants that enjoy the same convergence rates as their classical counterparts, but are empirically shown to require significantly fewer calls to the linear minimization oracle speeding up the overall algorithm. The efficacy of the proposed algorithms is tested on relevant applications of sparse matrix estimation, clustering via semidefinite relaxation, and uniform sparsest cut problem.

preprint2021arXiv

Distributed Optimisation With Communication Delays

This paper discusses distributed optimization over a directed graph. We begin with some well known algorithms which achieve consensus among agents including FROST [1], which possesses the quickest convergence to the optimum. It is a well known fact FROST has a linear convergence. However FROST works only over fixed topology of underlying network. Moreover the updates proposed therein require perfectly synchronized communication among nodes. Hence communication delays among nodes, which are inevitable in a realistic scenario, preclude the possibility of implementing FROST in real time. In this paper we introduce a co-operative control strategy which makes convergence to optimum robust to communication delays.

preprint2021arXiv

Optimizing QoS for Erasure-Coded Wireless Data Centers

Cloud computing facilitates the access of applications and data from any location by a distributed storage system. Erasure codes offer better data replication technique with reduced storage costs for more reliability. This paper considers the erasure-coded data center with multiple servers in a wireless network where each is equipped with a base-station. The cause of latency in the file retrieval process is mainly due to queuing delays at each server. This work puts forth a stochastic optimization framework for obtaining the optimal scheduling policy that maximizes users' quality of service (QoS) while adhering to the latency requirements. We further show that the problem has non-linear functions of expectations in objective and constraints and is impossible to solve with traditional SGD like algorithms. We propose a new algorithm that addresses compositional structure in the problem. Further, we show that the proposed algorithm achieves a faster convergence rate than the best-known results. Finally, we test the efficacy of the proposed method in a simulated environment.

preprint2020arXiv

A Generalized Framework for Autonomous Calibration of Wheeled Mobile Robots

Robotic calibration allows for the fusion of data from multiple sensors such as odometers, cameras, etc., by providing appropriate transformational relationships between the corresponding reference frames. For wheeled robots equipped with exteroceptive sensors, calibration entails learning the motion model of the sensor or the robot in terms of the odometric data, and must generally be performed prior to performing tasks such as simultaneous localization and mapping (SLAM). Within this context, the current trend is to carry out simultaneous calibration of odometry and sensor without the use of any additional hardware. Building upon the existing simultaneous calibration algorithms, we put forth a generalized calibration framework that can not only handle robots operating in 2D with arbitrary or unknown motion models but also handle outliers in an automated manner. We first propose an algorithm based on the alternating minimization framework applicable to two-wheel differential drive. Subsequently, for arbitrary but known drive configurations we put forth an iteratively re-weighted least squares methodology leveraging an intelligent weighing scheme. Different from the existing works, these proposed algorithms require no manual intervention and seamlessly handle outliers that arise due to both systematic and non-systematic errors. Finally, we put forward a novel Gaussian Process-based non-parametric approach for calibrating wheeled robots with arbitrary or unknown drive configurations. Detailed experiments are performed to demonstrate the accuracy, usefulness, and flexibility of the proposed algorithms.

preprint2020arXiv

Distributed Stochastic Non-Convex Optimization: Momentum-Based Variance Reduction

In this work, we propose a distributed algorithm for stochastic non-convex optimization. We consider a worker-server architecture where a set of $K$ worker nodes (WNs) in collaboration with a server node (SN) jointly aim to minimize a global, potentially non-convex objective function. The objective function is assumed to be the sum of local objective functions available at each WN, with each node having access to only the stochastic samples of its local objective function. In contrast to the existing approaches, we employ a momentum based "single loop" distributed algorithm which eliminates the need of computing large batch size gradients to achieve variance reduction. We propose two algorithms one with "adaptive" and the other with "non-adaptive" learning rates. We show that the proposed algorithms achieve the optimal computational complexity while attaining linear speedup with the number of WNs. Specifically, the algorithms reach an $ε$-stationary point $x_a$ with $\mathbb{E}\| \nabla f(x_a) \| \leq \tilde{O}(K^{-1/3}T^{-1/2} + K^{-1/3}T^{-1/3})$ in $T$ iterations, thereby requiring $\tilde{O}(K^{-1} ε^{-3})$ gradient computations at each WN. Moreover, our approach does not assume identical data distributions across WNs making the approach general enough for federated learning applications.

preprint2020arXiv

Dynamic Cache Management In Content Delivery Networks

The importance of content delivery networks (CDN) continues to rise with the exponential increase in the generation and consumption of electronic media. In order to ensure a high quality of experience, CDNs often deploy cache servers that are capable of storing some of the popular files close to the user. Such edge caching solutions not only increase the content availability, but also result in higher download rates and lower latency at the user. We consider the problem of content placement from an optimization perspective. Different from the classical eviction-based algorithms, the present work formulates the content placement problem from an optimization perspective and puts forth an online algorithm for the same. In contrast to the existing optimization-based solutions, the proposed algorithm is incremental and incurs very low computation cost, while yielding storage allocations that are provably near-optimal. The proposed algorithm can handle time varying content popularity, thereby obviating the need for periodically estimating demand distribution. Using synthetic and real IPTV data, we show that the proposed policies outperform all the state of art caching techniques in terms of various metrics.

preprint2020arXiv

Online Trajectory Optimization Using Inexact Gradient Feedback for Time-Varying Environments

This paper considers the problem of online trajectory design under time-varying environments. We formulate the general trajectory optimization problem within the framework of time-varying constrained convex optimization and proposed a novel version of the online gradient ascent algorithm for such problems. Moreover, the gradient feedback is noisy and allows us to use the proposed algorithm for a range of practical applications where it is difficult to acquire the true gradient. In contrast to the most available literature, we present the offline sublinear regret of the proposed algorithm up to the path length variations of the optimal offline solution, the cumulative gradient, and the error in the gradient variations. Furthermore, we establish a lower bound on the offline dynamic regret, which defines the optimality of any trajectory. To show the efficacy of the proposed algorithm, we consider two practical problems of interest. First, we consider a device to device (D2D) communications setting, and the goal is to design a user trajectory while maximizing its connectivity to the internet. The second problem is associated with the online planning of energy-efficient trajectories for unmanned surface vehicles (USV) under strong disturbances in ocean environments with both static and dynamic goal locations. The detailed simulation results demonstrate the significance of the proposed algorithm on synthetic and real data sets. Video on the real-world datasets can be found at {https://www.youtube.com/watch?v=FcRqqWtpf\_0}

preprint2020arXiv

Optimally Compressed Nonparametric Online Learning

Batch training of machine learning models based on neural networks is now well established, whereas to date streaming methods are largely based on linear models. To go beyond linear in the online setting, nonparametric methods are of interest due to their universality and ability to stably incorporate new information via convexity or Bayes' Rule. Unfortunately, when used online, nonparametric methods suffer a "curse of dimensionality" which precludes their use: their complexity scales at least with the time index. We survey online compression tools which bring their memory under control and attain approximate convergence. The asymptotic bias depends on a compression parameter that trades off memory and accuracy. Further, the applications to robotics, communications, economics, and power are discussed, as well as extensions to multi-agent systems.

preprint2016arXiv

Asynchronous Optimization Over Heterogeneous Networks via Consensus ADMM

This paper considers the distributed optimization of a sum of locally observable, non-convex functions. The optimization is performed over a multi-agent networked system, and each local function depends only on a subset of the variables. An asynchronous and distributed alternating directions method of multipliers (ADMM) method that allows the nodes to defer or skip the computation and transmission of updates is proposed in the paper. The proposed algorithm utilizes different approximations in the update step, resulting in proximal and majorized ADMM variants. Both variants are shown to converge to a local minimum, under certain regularity conditions. The proposed asynchronous algorithms are also applied to the problem of cooperative localization in wireless ad hoc networks, where it is shown to outperform the other state-of-the-art localization algorithms.

preprint2016arXiv

Stochastic Multidimensional Scaling

Multidimensional scaling (MDS) is a popular dimensionality reduction techniques that has been widely used for network visualization and cooperative localization. However, the traditional stress minimization formulation of MDS necessitates the use of batch optimization algorithms that are not scalable to large-sized problems. This paper considers an alternative stochastic stress minimization framework that is amenable to incremental and distributed solutions. A novel linear-complexity stochastic optimization algorithm is proposed that is provably convergent and simple to implement. The applicability of the proposed algorithm to localization and visualization tasks is also expounded. Extensive tests on synthetic and real datasets demonstrate the efficacy of the proposed algorithm.

preprint2012arXiv

Dynamic Network Cartography

Communication networks have evolved from specialized, research and tactical transmission systems to large-scale and highly complex interconnections of intelligent devices, increasingly becoming more commercial, consumer-oriented, and heterogeneous. Propelled by emergent social networking services and high-definition streaming platforms, network traffic has grown explosively thanks to the advances in processing speed and storage capacity of state-of-the-art communication technologies. As "netizens" demand a seamless networking experience that entails not only higher speeds, but also resilience and robustness to failures and malicious cyber-attacks, ample opportunities for signal processing (SP) research arise. The vision is for ubiquitous smart network devices to enable data-driven statistical learning algorithms for distributed, robust, and online network operation and management, adaptable to the dynamically-evolving network landscape with minimal need for human intervention. The present paper aims at delineating the analytical background and the relevance of SP tools to dynamic network monitoring, introducing the SP readership to the concept of dynamic network cartography -- a framework to construct maps of the dynamic network state in an efficient and scalable manner tailored to large-scale heterogeneous networks.

preprint2012arXiv

Dynamic Network Delay Cartography

Path delays in IP networks are important metrics, required by network operators for assessment, planning, and fault diagnosis. Monitoring delays of all source-destination pairs in a large network is however challenging and wasteful of resources. The present paper advocates a spatio-temporal Kalman filtering approach to construct network-wide delay maps using measurements on only a few paths. The proposed network cartography framework allows efficient tracking and prediction of delays by relying on both topological as well as historical data. Optimal paths for delay measurement are selected in an online fashion by leveraging the notion of submodularity. The resulting predictor is optimal in the class of linear predictors, and outperforms competing alternatives on real-world datasets.

preprint2010arXiv

Cross-Layer Designs in Coded Wireless Fading Networks with Multicast

A cross-layer design along with an optimal resource allocation framework is formulated for wireless fading networks, where the nodes are allowed to perform network coding. The aim is to jointly optimize end-to-end transport layer rates, network code design variables, broadcast link flows, link capacities, average power consumption, and short-term power allocation policies. As in the routing paradigm where nodes simply forward packets, the cross-layer optimization problem with network coding is non-convex in general. It is proved however, that with network coding, dual decomposition for multicast is optimal so long as the fading at each wireless link is a continuous random variable. This lends itself to provably convergent subgradient algorithms, which not only admit a layered-architecture interpretation but also optimally integrate network coding in the protocol stack. The dual algorithm is also paired with a scheme that yields near-optimal network design variables, namely multicast end-to-end rates, network code design quantities, flows over the broadcast links, link capacities, and average power consumption. Finally, an asynchronous subgradient method is developed, whereby the dual updates at the physical layer can be affordably performed with a certain delay with respect to the resource allocation tasks in upper layers. This attractive feature is motivated by the complexity of the physical layer subproblem, and is an adaptation of the subgradient method suitable for network control.

Ketan Rajawat

What is connected

Connect this record

See the researcher in context

Building this map preview

18 published item(s)

FedNew: A Communication-Efficient and Privacy-Preserving Newton-Type Method for Federated Learning

Sharpened Quasi-Newton Methods: Faster Superlinear Rate and Larger Local Convergence Neighborhood

Sparse Representations of Positive Functions via First and Second-Order Pseudo-Mirror Descent

Stochastic Compositional Gradient Descent under Compositional Constraints

Variational Bayesian Filtering with Subspace Information for Extreme Spatio-Temporal Matrix Completion

Zeroth and First Order Stochastic Frank-Wolfe Algorithms for Constrained Optimization

Distributed Optimisation With Communication Delays

Optimizing QoS for Erasure-Coded Wireless Data Centers

A Generalized Framework for Autonomous Calibration of Wheeled Mobile Robots

Distributed Stochastic Non-Convex Optimization: Momentum-Based Variance Reduction

Dynamic Cache Management In Content Delivery Networks

Online Trajectory Optimization Using Inexact Gradient Feedback for Time-Varying Environments

Optimally Compressed Nonparametric Online Learning

Asynchronous Optimization Over Heterogeneous Networks via Consensus ADMM

Stochastic Multidimensional Scaling

Dynamic Network Cartography

Dynamic Network Delay Cartography

Cross-Layer Designs in Coded Wireless Fading Networks with Multicast