Researcher profile

Kai Fong Ernest Chong

Kai Fong Ernest Chong contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 17 - Unverified | Verification L1 | Unclaimed author
4 works
0 followers
9 topics
4 close collaborators



Published work

4 published item(s)

Preprint, 2022, arXiv

FedCorr: Multi-Stage Federated Learning for Label Noise Correction

Federated learning (FL) is a privacy-preserving distributed learning paradigm that enables clients to jointly train a global model. In real-world FL implementations, client data could have label noise, and different clients could have vastly different label noise levels. Although there exist methods in centralized learning for tackling label noise, such methods do not perform well on heterogeneous label noise in FL settings, due to the typically smaller sizes of client datasets and data privacy requirements in FL. In this paper, we propose $\texttt{FedCorr}$, a general multi-stage framework to tackle heterogeneous label noise in FL, without making any assumptions on the noise models of local clients, while still maintaining client data privacy. In particular, (1) $\texttt{FedCorr}$ dynamically identifies noisy clients by exploiting the dimensionalities of the model prediction subspaces independently measured on all clients, and then identifies incorrect labels on noisy clients based on per-sample losses. To deal with data heterogeneity and to increase training stability, we propose an adaptive local proximal regularization term that is based on estimated local noise levels. (2) We further finetune the global model on identified clean clients and correct the noisy labels for the remaining noisy clients after finetuning. (3) Finally, we apply the usual training on all clients to make full use of all local data. Experiments conducted on CIFAR-10/100 with federated synthetic label noise, and on a real-world noisy dataset, Clothing1M, demonstrate that $\texttt{FedCorr}$ is robust to label noise and substantially outperforms the state-of-the-art methods at multiple noise levels.
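
Two ingredients named in this abstract, flagging labels by per-sample loss and a proximal term tied to an estimated noise level, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names, the fixed loss threshold, and the exact form of the penalty (`beta * noise_level * squared distance`) are hypothetical and are not taken from the FedCorr paper.

```python
import math

def cross_entropy(probs, label):
    """Per-sample loss: negative log-probability the model assigns to `label`."""
    return -math.log(max(probs[label], 1e-12))

def flag_suspect_labels(pred_probs, labels, threshold):
    """Return indices of samples whose loss exceeds `threshold`.

    The fixed threshold is a stand-in (assumption); the abstract only says
    that incorrect labels are identified from per-sample losses.
    """
    return [
        i
        for i, (probs, label) in enumerate(zip(pred_probs, labels))
        if cross_entropy(probs, label) > threshold
    ]

def proximal_penalty(local_w, global_w, noise_level, beta=1.0):
    """Penalty on drift from the global weights, scaled by the noise estimate.

    Assumed form: beta * noise_level * ||local_w - global_w||^2.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(local_w, global_w))
    return beta * noise_level * sq_dist

# Toy client: three samples, two classes.
pred_probs = [[0.9, 0.1], [0.2, 0.8], [0.95, 0.05]]
labels = [0, 0, 1]
suspects = flag_suspect_labels(pred_probs, labels, threshold=1.0)
penalty = proximal_penalty([1.0, 2.0], [0.0, 0.0], noise_level=0.5)
```

In this toy case the second and third samples carry labels the model considers unlikely, so they are the ones flagged; a client with an estimated noise level of zero would incur no penalty under the assumed form.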

Preprint, 2020, arXiv

A closer look at the approximation capabilities of neural networks

The universal approximation theorem, in one of its most general versions, says that if we consider only continuous activation functions $\sigma$, then a standard feedforward neural network with one hidden layer is able to approximate any continuous multivariate function $f$ to any given approximation threshold $\varepsilon$, if and only if $\sigma$ is non-polynomial. In this paper, we give a direct algebraic proof of the theorem. Furthermore we shall explicitly quantify the number of hidden units required for approximation. Specifically, if $X\subseteq \mathbb{R}^n$ is compact, then a neural network with $n$ input units, $m$ output units, and a single hidden layer with $\binom{n+d}{d}$ hidden units (independent of $m$ and $\varepsilon$), can uniformly approximate any polynomial function $f:X \to \mathbb{R}^m$ whose total degree is at most $d$ for each of its $m$ coordinate functions. In the general case that $f$ is any continuous function, we show there exists some $N\in \mathcal{O}(\varepsilon^{-n})$ (independent of $m$), such that $N$ hidden units would suffice to approximate $f$. We also show that this uniform approximation property (UAP) still holds even under seemingly strong conditions imposed on the weights. We highlight several consequences: (i) For any $\delta > 0$, the UAP still holds if we restrict all non-bias weights $w$ in the last layer to satisfy $|w| < \delta$. (ii) There exists some $\lambda > 0$ (depending only on $f$ and $\sigma$), such that the UAP still holds if we restrict all non-bias weights $w$ in the first layer to satisfy $|w| > \lambda$. (iii) If the non-bias weights in the first layer are \emph{fixed} and randomly chosen from a suitable range, then the UAP holds with probability $1$.
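
The hidden-unit count $\binom{n+d}{d}$ quoted in this abstract is easy to tabulate. The short sketch below only evaluates that binomial coefficient; the function name is hypothetical. As a sanity check, $\binom{n+d}{d}$ is also the number of monomials in $n$ variables of total degree at most $d$.

```python
from math import comb

def hidden_units_for_polynomial(n: int, d: int) -> int:
    """Evaluate binom(n + d, d) for n input units and total degree at most d."""
    if n < 1 or d < 0:
        raise ValueError("need n >= 1 and d >= 0")
    return comb(n + d, d)

# A few (n, d) pairs and the corresponding hidden-layer widths.
widths = {(n, d): hidden_units_for_polynomial(n, d)
          for n, d in [(1, 3), (2, 3), (10, 2)]}
for (n, d), width in widths.items():
    print(f"n={n}, d={d}: {width} hidden units")
```

The count depends only on the input dimension and the degree, matching the abstract's statement that it is independent of $m$ and $\varepsilon$.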

Preprint, 2010, arXiv

Fountain Codes with Varying Probability Distributions

Fountain codes are rateless erasure-correcting codes, i.e., an essentially infinite stream of encoded packets can be generated from a finite set of data packets. Several fountain codes have been proposed recently to minimize overhead, many of which involve modifications of the Luby transform (LT) code. These fountain codes, like the LT code, have the implicit assumption that the probability distribution is fixed throughout the encoding process. In this paper, we will use the theory of posets to show that this assumption is unnecessary, and by dropping it, we can reduce overhead by as much as 64% relative to LT codes. We also present the fundamental theory of probability distribution designs for fountain codes with non-constant probability distributions that minimize overhead.
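
To make the "fixed versus varying distribution" distinction concrete, here is a minimal sketch of an LT-style encoder in which the degree distribution is supplied per encoded packet by a `schedule` function. The ideal soliton distribution is the standard textbook choice for LT codes; the `varying_schedule` below is purely hypothetical and is not the poset-based design from the paper. Source packets are modeled as small integers so that XOR is a single operator.

```python
import random

def ideal_soliton(k):
    """Ideal soliton weights for degrees 1..k: 1/k, then 1/(d(d-1))."""
    return [1.0 / k] + [1.0 / (d * (d - 1)) for d in range(2, k + 1)]

def encode_packet(data, weights, rng):
    """Draw a degree from `weights`, then XOR that many distinct source packets."""
    k = len(data)
    degree = rng.choices(range(1, k + 1), weights=weights)[0]
    indices = sorted(rng.sample(range(k), degree))
    value = 0
    for i in indices:
        value ^= data[i]
    return indices, value

def encode_stream(data, schedule, count, seed=0):
    """Emit `count` packets; `schedule(t, k)` gives the weights for packet t."""
    rng = random.Random(seed)
    return [encode_packet(data, schedule(t, len(data)), rng) for t in range(count)]

def fixed_schedule(t, k):
    """Classic LT behavior: the same distribution for every packet."""
    return ideal_soliton(k)

def varying_schedule(t, k):
    """Hypothetical example only: switch to uniform degrees after k packets."""
    return ideal_soliton(k) if t < k else [1.0 / k] * k

data = [3, 5, 6, 9]
fixed_packets = encode_stream(data, fixed_schedule, count=8, seed=1)
varying_packets = encode_stream(data, varying_schedule, count=8, seed=1)
```

Each emitted packet is a pair of (source indices, XOR of those sources), which is what a decoder would need to start peeling. Passing the distribution as a function of the packet index `t` is one simple way to express a non-constant distribution; how to choose it well is the subject of the paper and is not attempted here.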