Source author record

Binghui Peng

Binghui Peng appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning Data Structures and Algorithms Artificial Intelligence Cryptography and Security math.OC Social and Information Networks

Catalog footprint

What is connected

5works

6topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

Adaptive Greedy versus Non-adaptive Greedy for Influence Maximization

We consider the *adaptive influence maximization problem*: given a network and a budget $k$, iteratively select $k$ seeds in the network to maximize the expected number of adopters. In the *full-adoption feedback model*, after selecting each seed, the seed-picker observes all the resulting adoptions. In the *myopic feedback model*, the seed-picker only observes whether each neighbor of the chosen seed adopts. Motivated by the extreme success of greedy-based algorithms/heuristics for influence maximization, we propose the concept of *greedy adaptivity gap*, which compares the performance of the adaptive greedy algorithm to its non-adaptive counterpart. Our first result shows that, for submodular influence maximization, the adaptive greedy algorithm can perform up to a $(1-1/e)$-fraction worse than the non-adaptive greedy algorithm, and that this ratio is tight. More specifically, on one side we provide examples where the performance of the adaptive greedy algorithm is only a $(1-1/e)$ fraction of the performance of the non-adaptive greedy algorithm in four settings: for both feedback models and both the *independent cascade model* and the *linear threshold model*. On the other side, we prove that in any submodular cascade, the adaptive greedy algorithm always outputs a $(1-1/e)$-approximation to the expected number of adoptions in the optimal non-adaptive seed choice. Our second result shows that, for the general submodular diffusion model with full-adoption feedback, the adaptive greedy algorithm can outperform the non-adaptive greedy algorithm by an unbounded factor. Finally, we propose a risk-free variant of the adaptive greedy algorithm that always performs no worse than the non-adaptive greedy algorithm.

preprint2022arXiv

Continual learning: a feature extraction formalization, an efficient algorithm, and fundamental obstructions

Continual learning is an emerging paradigm in machine learning, wherein a model is exposed in an online fashion to data from multiple different distributions (i.e. environments), and is expected to adapt to the distribution change. Precisely, the goal is to perform well in the new environment, while simultaneously retaining the performance on the previous environments (i.e. avoid "catastrophic forgetting") -- without increasing the size of the model. While this setup has enjoyed a lot of attention in the applied community, there hasn't be theoretical work that even formalizes the desired guarantees. In this paper, we propose a framework for continual learning through the framework of feature extraction -- namely, one in which features, as well as a classifier, are being trained with each environment. When the features are linear, we design an efficient gradient-based algorithm $\mathsf{DPGD}$, that is guaranteed to perform well on the current environment, as well as avoid catastrophic forgetting. In the general case, when the features are non-linear, we show such an algorithm cannot exist, whether efficient or not.

preprint2022arXiv

Memory Bounds for Continual Learning

Continual learning, or lifelong learning, is a formidable current challenge to machine learning. It requires the learner to solve a sequence of $k$ different learning tasks, one after the other, while retaining its aptitude for earlier tasks; the continual learner should scale better than the obvious solution of developing and maintaining a separate learner for each of the $k$ tasks. We embark on a complexity-theoretic study of continual learning in the PAC framework. We make novel uses of communication complexity to establish that any continual learner, even an improper one, needs memory that grows linearly with $k$, strongly suggesting that the problem is intractable. When logarithmically many passes over the learning tasks are allowed, we provide an algorithm based on multiplicative weights update whose memory requirement scales well; we also establish that improper learning is necessary for such performance. We conjecture that these results may lead to new promising approaches to continual learning.

preprint2022arXiv

On the Complexity of Dynamic Submodular Maximization

We study dynamic algorithms for the problem of maximizing a monotone submodular function over a stream of $n$ insertions and deletions. We show that any algorithm that maintains a $(0.5+ε)$-approximate solution under a cardinality constraint, for any constant $ε>0$, must have an amortized query complexity that is $\mathit{polynomial}$ in $n$. Moreover, a linear amortized query complexity is needed in order to maintain a $0.584$-approximate solution. This is in sharp contrast with recent dynamic algorithms of [LMNF+20, Mon20] that achieve $(0.5-ε)$-approximation with a $\mathsf{poly}\log(n)$ amortized query complexity. On the positive side, when the stream is insertion-only, we present efficient algorithms for the problem under a cardinality constraint and under a matroid constraint with approximation guarantee $1-1/e-ε$ and amortized query complexities $\smash{O(\log (k/ε)/ε^2)}$ and $\smash{k^{\tilde{O}(1/ε^2)}\log n}$, respectively, where $k$ denotes the cardinality parameter or the rank of the matroid.

preprint2022arXiv

Shuffle Private Stochastic Convex Optimization

In shuffle privacy, each user sends a collection of randomized messages to a trusted shuffler, the shuffler randomly permutes these messages, and the resulting shuffled collection of messages must satisfy differential privacy. Prior work in this model has largely focused on protocols that use a single round of communication to compute algorithmic primitives like means, histograms, and counts. We present interactive shuffle protocols for stochastic convex optimization. Our protocols rely on a new noninteractive protocol for summing vectors of bounded $\ell_2$ norm. By combining this sum subroutine with mini-batch stochastic gradient descent, accelerated gradient descent, and Nesterov's smoothing method, we obtain loss guarantees for a variety of convex loss functions that significantly improve on those of the local model and sometimes match those of the central model.

Binghui Peng

What is connected

Connect this record

See the researcher in context

Building this map preview

5 published item(s)

Adaptive Greedy versus Non-adaptive Greedy for Influence Maximization

Continual learning: a feature extraction formalization, an efficient algorithm, and fundamental obstructions

Memory Bounds for Continual Learning

On the Complexity of Dynamic Submodular Maximization

Shuffle Private Stochastic Convex Optimization