Researcher profile

Qifa Yan

Qifa Yan contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
6works
0followers
2topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

6 published item(s)

preprint2023arXiv

A Fundamental Tradeoff Among Storage, Computation, and Communication for Distributed Computing over Star Network

Coded distributed computing can alleviate the communication load by leveraging the redundant storage and computation resources with coding techniques in distributed computing. In this paper, we study a MapReduce-type distributed computing framework over star topological network, where all the workers exchange information through a common access point. The optimal tradeoff among the normalized number of the stored files (storage load), computed intermediate values (computation load) and transmitted bits in the uplink and downlink (communication loads) are characterized. A coded computing scheme is proposed to achieve the Pareto-optimal tradeoff surface, in which the access point only needs to perform simple chain coding between the signals it receives, and information-theretical bound matching the surface is also provided.

preprint2021arXiv

Capacity-Achieving Private Information Retrieval Schemes from Uncoded Storage Constrained Servers with Low Sub-packetization

This paper investigates reducing sub-packetization of capacity-achieving schemes for uncoded Storage Constrained Private Information Retrieval (SC-PIR) systems. In the SC-PIR system, a user aims to retrieve one out of $K$ files from $N$ servers while revealing nothing about its identity to any individual server, in which the $K$ files are stored at the $N$ servers in an uncoded form and each server can store up to $μK$ equivalent files, where $μ$ is the normalized storage capacity of each server. We first prove that there exists a capacity-achieving SC-PIR scheme for a given storage design if and only if all the packets are stored exactly at $M\triangleq μN$ servers for $μ$ such that $M=μN\in\{2,3,\ldots,N\}$. Then, the optimal sub-packetization for capacity-achieving linear SC-PIR schemes is characterized as the solution to an optimization problem, which is typically hard to solve because of involving indicator functions. Moreover, a new notion of array called Storage Design Array (SDA) is introduced for the SC-PIR system. With any given SDA, an associated capacity-achieving SC-PIR scheme is constructed. Next, the SC-PIR schemes that have equal-size packets are investigated. Furthermore, the optimal equal-size sub-packetization among all capacity-achieving linear SC-PIR schemes characterized by Woolsey et al. is proved to be $\frac{N(M-1)}{\gcd(N,M)}$. Finally, by allowing unequal size of packets, a greedy SDA construction is proposed, where the sub-packetization of the associated SC-PIR scheme is upper bounded by $\frac{N(M-1)}{\gcd(N,M)}$. Among all capacity-achieving linear SC-PIR schemes, the sub-packetization is optimal when $\min\{M,N-M\}|N$ or $M=N$, and within a multiplicative gap $\frac{\min\{M,N-M\}}{\gcd(N,M)}$ of the optimal one otherwise. In particular, for the case $N=d\cdot M\pm1$ where $d\geq 2$, another SDA is constructed to obtain lower sub-packetization.

preprint2020arXiv

A Fundamental Storage-Communication Tradeoff for Distributed Computing with Straggling Nodes

Placement delivery arrays for distributed computing (Comp-PDAs) have recently been proposed as a framework to construct universal computing schemes for MapReduce-like systems. In this work, we extend this concept to systems with straggling nodes, i.e., to systems where a subset of the nodes cannot accomplish the assigned map computations in due time. Unlike most previous works that focused on computing linear functions, our results are universal and apply for arbitrary map and reduce functions. Our contributions are as follows. Firstly, we show how to construct a universal coded computing scheme for MapReduce-like systems with straggling nodes from any given Comp-PDA. We also characterize the storage and communication loads of the resulting scheme in terms of the Comp-PDA parameters. Then, we prove an information-theoretic converse bound on the storage-communication (SC) tradeoff achieved by universal computing schemes with straggling nodes. We show that the information-theoretic bound matches the performance achieved by the coded computing schemes with straggling nodes corresponding to the Maddah-Ali and Niesen (MAN) PDAs, i.e., to the Comp-PDAs describing Maddah-Ali and Niesen's coded caching scheme. Interestingly, the same Comp-PDAs (the MAN-PDAs) are optimal for any number of straggling nodes, which implies that the map phase of optimal coded computing schemes does not need to be adapted to the number of stragglers in the system. We finally prove that while the points that lie exactly on the fundamental SC tradeoff cannot be achieved with Comp-PDAs that require smaller number of files than the MAN-PDAs, this is possible for some of the points that lie close to the SC tradeoff. For these latter points, the decrease in the requested number of files can be exponential in the number of nodes of the system.

preprint2020arXiv

A Storage-Computation-Communication Tradeoff for Distributed Computing

This paper investigates distributed computing systems where computations are split into "Map" and "Reduce" functions. A new coded scheme, called distributed computing and coded communication (D3C), is proposed, and its communication load is analyzed as a function of the available storage space and the number of intermediate values (IVA) to be computed. D3C achieves the smallest possible communication load for a given storage space, while a smaller number of IVAs need to be computed compared to Li et al.'s coded distributed computing (CDC) scheme. More generally, our scheme can flexibly trade between storage space and the number of IVAs to be computed. Communication load is then analyzed for any given tradeoff.

preprint2020arXiv

Placement Delivery Array Design for Combination Networks with Edge Caching

A major practical limitation of the Maddah-Ali-Niesen coded caching techniques is their high subpacketization level. For the simple network with a single server and multiple users, Yan \emph{et al.} proposed an alternative scheme with the so-called placement delivery arrays (PDA). Such a scheme requires slightly higher transmission rates but significantly reduces the subpacketization level. In this paper, we extend the PDA framework and propose three low-subpacketization schemes for combination networks, i.e., networks with a single server, multiple relays, and multiple cache-aided users that are connected to subsets of relays. One of the schemes achieves the cutset lower bound on the link rate when the cache memories are sufficiently large. Our other two schemes apply only to \emph{resolvable} combination networks. For these networks and for a wide range of cache sizes, the new schemes perform closely to the coded caching schemes that directly apply Maddah-Ali-Niesen scheme while having significantly reduced subpacketization levels.

preprint2020arXiv

Storage, Computation, and Communication: A Fundamental Tradeoff in Distributed Computing

We consider a MapReduce-like distributed computing system. We derive a lower bound on the communication cost for any given storage and computation costs. This lower bound matches the achievable bound we proposed recently. As a result, we completely characterize the optimal tradeoff between the storage, the computation, and the communication. Our result generalizes the previous one by Li et al. to also account for the number of computed intermediate values.