Researcher profile

Usman A. Khan

Usman A. Khan contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
8works
0followers
8topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

8 published item(s)

preprint2026arXiv

Optimal and Scalable MAPF via Multi-Marginal Optimal Transport and Schrödinger Bridges

We consider anonymous multi-agent path finding (MAPF) where a set of robots is tasked to travel to a set of targets on a finite, connected graph. We show that MAPF can be cast as a special class of multi-marginal optimal transport (MMOT) problems with an underlying Markovian structure, under which the exponentially large MMOT collapses to a linear program (LP) polynomial in size. Focusing on the anonymous setting, we establish conditions under which the corresponding LP is feasible, totally unimodular, and consequently, yields min-cost, integral $(\{0,1\})$ transports that do not overlap in both space and time. To adapt the approach to large-scale problems, we cast the MAPF-MMOT in a probabilistic framework via Schrödinger bridges. Under standard assumptions, we show that the Schrödinger bridge formulation reduces to an entropic regularization of the corresponding MMOT that admits an iterative Sinkhorn-type solution. The Schrödinger bridge, being a probabilistic framework, provides a shadow (fractional) transport that we use as a template to solve a reduced LP and demonstrate that it results in near-optimal, integral transports at a significant reduction in complexity. Extensive experiments highlight the optimality and scalability of the proposed approaches.

preprint2022arXiv

Distributed Constraint-Coupled Optimization over Lossy Networks

This paper considers distributed resource allocation and sum-preserving constrained optimization over lossy networks, where the links are unreliable and subject to packet drops. We define the conditions to ensure convergence under packet drops and link removal by focusing on two main properties of our allocation algorithm: (i) The weight-stochastic condition in typical consensus schemes is reduced to balanced weights, with no need for readjusting the weights to satisfy stochasticity. (ii) The algorithm does not require all-time connectivity but instead uniform connectivity over some non-overlapping finite time intervals. First, we prove that our algorithm provides primal-feasible allocation at every iteration step and converges under the conditions (i)-(ii) and some other mild conditions on the nonlinear iterative dynamics. These nonlinearities address possible practical constraints in real applications due to, for example, saturation or quantization among others. Then, using (i)-(ii) and the notion of bond-percolation theory, we relate the packet drop rate and the network percolation threshold to the (finite) number of iterations ensuring uniform connectivity and, thus, convergence towards the optimum value.

preprint2022arXiv

Distributed saddle point problems for strongly concave-convex functions

In this paper, we propose GT-GDA, a distributed optimization method to solve saddle point problems of the form: $\min_{\mathbf{x}} \max_{\mathbf{y}} \{F(\mathbf{x},\mathbf{y}) :=G(\mathbf{x}) + \langle \mathbf{y}, \overline{P} \mathbf{x} \rangle - H(\mathbf{y})\}$, where the functions $G(\cdot)$, $H(\cdot)$, and the the coupling matrix $\overline{P}$ are distributed over a strongly connected network of nodes. GT-GDA is a first-order method that uses gradient tracking to eliminate the dissimilarity caused by heterogeneous data distribution among the nodes. In the most general form, GT-GDA includes a consensus over the local coupling matrices to achieve the optimal (unique) saddle point, however, at the expense of increased communication. To avoid this, we propose a more efficient variant GT-GDA-Lite that does not incur the additional communication and analyze its convergence in various scenarios. We show that GT-GDA converges linearly to the unique saddle point solution when $G(\cdot)$ is smooth and convex, $H(\cdot)$ is smooth and strongly convex, and the global coupling matrix $\overline{P}$ has full column rank. We further characterize the regime under which GT-GDA exhibits a network topology-independent convergence behavior. We next show the linear convergence of GT-GDA to an error around the unique saddle point, which goes to zero when the coupling cost ${\langle \mathbf y, \overline{P} \mathbf x \rangle}$ is common to all nodes, or when $G(\cdot)$ and $H(\cdot)$ are quadratic. Numerical experiments illustrate the convergence properties and importance of GT-GDA and GT-GDA-Lite for several applications.

preprint2022arXiv

Variance reduced stochastic optimization over directed graphs with row and column stochastic weights

This paper proposes AB-SAGA, a first-order distributed stochastic optimization method to minimize a finite-sum of smooth and strongly convex functions distributed over an arbitrary directed graph. AB-SAGA removes the uncertainty caused by the stochastic gradients using a node-level variance reduction and subsequently employs network-level gradient tracking to address the data dissimilarity across the nodes. Unlike existing methods that use the nonlinear push-sum correction to cancel the imbalance caused by the directed communication, the consensus updates in AB-SAGA are linear and uses both row and column stochastic weights. We show that for a constant step-size, AB-SAGA converges linearly to the global optimal. We quantify the directed nature of the underlying graph using an explicit directivity constant and characterize the regimes in which AB-SAGA achieves a linear speed-up over its centralized counterpart. Numerical experiments illustrate the convergence of AB-SAGA for strongly convex and nonconvex problems.

preprint2020arXiv

A general framework for decentralized optimization with first-order methods

Decentralized optimization to minimize a finite sum of functions over a network of nodes has been a significant focus within control and signal processing research due to its natural relevance to optimal control and signal estimation problems. More recently, the emergence of sophisticated computing and large-scale data science needs have led to a resurgence of activity in this area. In this article, we discuss decentralized first-order gradient methods, which have found tremendous success in control, signal processing, and machine learning problems, where such methods, due to their simplicity, serve as the first method of choice for many complex inference and training tasks. In particular, we provide a general framework of decentralized first-order methods that is applicable to undirected and directed communication networks alike, and show that much of the existing work on optimization and consensus can be related explicitly to this framework. We further extend the discussion to decentralized stochastic first-order methods that rely on stochastic gradients at each node and describe how local variance reduction schemes, previously shown to have promise in the centralized settings, are able to improve the performance of decentralized methods when combined with what is known as gradient tracking. We motivate and demonstrate the effectiveness of the corresponding methods in the context of machine learning and signal processing problems that arise in decentralized environments.

preprint2020arXiv

Gradient tracking and variance reduction for decentralized optimization and machine learning

Decentralized methods to solve finite-sum minimization problems are important in many signal processing and machine learning tasks where the data is distributed over a network of nodes and raw data sharing is not permitted due to privacy and/or resource constraints. In this article, we review decentralized stochastic first-order methods and provide a unified algorithmic framework that combines variance-reduction with gradient tracking to achieve both robust performance and fast convergence. We provide explicit theoretical guarantees of the corresponding methods when the objective functions are smooth and strongly-convex, and show their applicability to non-convex problems via numerical experiments. Throughout the article, we provide intuitive illustrations of the main technical ideas by casting appropriate tradeoffs and comparisons among the methods of interest and by highlighting applications to decentralized training of machine learning models.

preprint2020arXiv

S-ADDOPT: Decentralized stochastic first-order optimization over directed graphs

In this report, we study decentralized stochastic optimization to minimize a sum of smooth and strongly convex cost functions when the functions are distributed over a directed network of nodes. In contrast to the existing work, we use gradient tracking to improve certain aspects of the resulting algorithm. In particular, we propose the~\textbf{\texttt{S-ADDOPT}} algorithm that assumes a stochastic first-order oracle at each node and show that for a constant step-size~$α$, each node converges linearly inside an error ball around the optimal solution, the size of which is controlled by~$α$. For decaying step-sizes~$\mathcal{O}(1/k)$, we show that~\textbf{\texttt{S-ADDOPT}} reaches the exact solution sublinearly at~$\mathcal{O}(1/k)$ and its convergence is asymptotically network-independent. Thus the asymptotic behavior of~\textbf{\texttt{S-ADDOPT}} is comparable to the centralized stochastic gradient descent. Numerical experiments over both strongly convex and non-convex problems illustrate the convergence behavior and the performance comparison of the proposed algorithm.

preprint2019arXiv

Cyber-Social Systems: Modeling, Inference, and Optimal Design

This paper models the cyber-social system as a cyber-network of agents monitoring states of individuals in a social network. The state of each individual is represented by a social node and the interactions among individuals are represented by a social link. In the cyber-network each node represents an agent and the links represent information sharing among agents. Agents make an observation of social states and perform distributed inference. In this direction, the contribution of this work is threefold: (i) A novel distributed inference protocol is proposed that makes no assumption on the rank of the underlying social system. This is significant as most protocols in the literature only work on full-rank systems. (ii) A novel agent classification is developed, where it is shown that connectivity requirement on the cyber-network differs for each type. This is particularly important in finding the minimal number of observations and minimal connectivity of the cyber-network as the next contribution. (iii) The cost-optimal design of cyber-network constraint with distributed observability is addressed. This problem is subdivided into sensing cost optimization and networking cost optimization where both are claimed to be NP-hard. We solve both problems for certain types of social networks and find polynomial-order solutions.