Source author record

Pong C. Yuen

Pong C. Yuen appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher

Unclaimed source record

Computer Vision, Artificial Intelligence, Graphics, Machine Learning, math.OC

Catalog footprint

What is connected

6 works

5 topics

4 close collaborators

preprint, 2022, arXiv

Federated Generalized Face Presentation Attack Detection

Face presentation attack detection (fPAD) plays a critical role in the modern face recognition pipeline. An fPAD model with good generalization can be obtained when it is trained with face images from different input distributions and different types of spoof attacks. In reality, training data (both real face images and spoof images) are not directly shared between data owners due to legal and privacy issues. In this paper, with the motivation of circumventing this challenge, we propose a Federated Face Presentation Attack Detection (FedPAD) framework that simultaneously takes advantage of rich fPAD information available at different data owners while preserving data privacy. In the proposed framework, each data center locally trains its own fPAD model. A server learns a global fPAD model by iteratively aggregating model updates from all data centers without accessing the private data in any of them. To equip the aggregated fPAD model in the server with better generalization ability to unseen attacks from users, following the basic idea of FedPAD, we further propose a Federated Generalized Face Presentation Attack Detection (FedGPAD) framework. A federated domain disentanglement strategy is introduced in FedGPAD, which treats each data center as one domain and decomposes the fPAD model into domain-invariant and domain-specific parts in each data center. The two parts disentangle the domain-invariant and domain-specific features from images in each local data center, respectively. A server learns a global fPAD model by aggregating only the domain-invariant parts of the fPAD models from the data centers, and thus a more generalized fPAD model can be aggregated in the server. We introduce the experimental setting to evaluate the proposed FedPAD and FedGPAD frameworks and carry out extensive experiments to provide various insights about federated learning for fPAD.
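The aggregation step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the model layout (a dictionary with hypothetical `invariant` and `specific` parameter lists), the function names, and the unweighted average are all assumptions made for illustration. The only idea taken from the abstract is that the server combines the domain-invariant parts and never touches the domain-specific parts or the private data.

```python
from typing import Dict, List

# Hypothetical model layout: {"invariant": [...], "specific": [...]}
Model = Dict[str, List[float]]


def aggregate_invariant(client_models: List[Model]) -> List[float]:
    """Average only the domain-invariant parameters across data centers."""
    if not client_models:
        raise ValueError("need at least one client model")
    size = len(client_models[0]["invariant"])
    total = [0.0] * size
    for model in client_models:
        for i, weight in enumerate(model["invariant"]):
            total[i] += weight
    # Unweighted mean (an assumption; a real system might weight by data size).
    return [t / len(client_models) for t in total]


def broadcast(client_models: List[Model], global_invariant: List[float]) -> None:
    """Overwrite each client's invariant part; specific parts stay local."""
    for model in client_models:
        model["invariant"] = list(global_invariant)


clients = [
    {"invariant": [1.0, 2.0], "specific": [10.0]},
    {"invariant": [3.0, 4.0], "specific": [20.0]},
]
global_part = aggregate_invariant(clients)
broadcast(clients, global_part)
print(global_part)  # [2.0, 3.0]
```

In this toy round, each data center keeps its own `specific` parameters, while the shared `invariant` parameters converge to a common value on the server.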

preprint, 2022, arXiv

Open-set Adversarial Defense with Clean-Adversarial Mutual Learning

Open-set recognition and adversarial defense study two key aspects of deep learning that are vital for real-world deployment. The objective of open-set recognition is to identify samples from open-set classes during testing, while adversarial defense aims to robustify the network against images perturbed by imperceptible adversarial noise. This paper demonstrates that open-set recognition systems are vulnerable to adversarial samples. Furthermore, this paper shows that adversarial defense mechanisms trained on known classes are unable to generalize well to open-set samples. Motivated by these observations, we emphasize the necessity of an Open-Set Adversarial Defense (OSAD) mechanism. This paper proposes an Open-Set Defense Network with Clean-Adversarial Mutual Learning (OSDN-CAML) as a solution to the OSAD problem. The proposed network designs an encoder with dual-attentive feature-denoising layers coupled with a classifier to learn a noise-free latent feature representation, which adaptively removes adversarial noise guided by channel and spatial-wise attentive filters. Several techniques are exploited to learn a noise-free and informative latent feature space with the aim of improving the performance of adversarial defense and open-set recognition. First, we incorporate a decoder to ensure that clean images can be well reconstructed from the obtained latent features. Then, self-supervision is used to ensure that the latent features are informative enough to carry out an auxiliary task. Finally, to exploit more complementary knowledge from clean image classification to facilitate feature denoising and search for a more generalized local minimum for open-set recognition, we further propose clean-adversarial mutual learning, where a peer network (classifying clean images) is further introduced to mutually learn with the classifier (classifying adversarial images).

preprint, 2020, arXiv

Open-set Adversarial Defense

Open-set recognition and adversarial defense study two key aspects of deep learning that are vital for real-world deployment. The objective of open-set recognition is to identify samples from open-set classes during testing, while adversarial defense aims to defend the network against images with imperceptible adversarial perturbations. In this paper, we show that open-set recognition systems are vulnerable to adversarial attacks. Furthermore, we show that adversarial defense mechanisms trained on known classes do not generalize well to open-set samples. Motivated by this observation, we emphasize the need for an Open-Set Adversarial Defense (OSAD) mechanism. This paper proposes an Open-Set Defense Network (OSDN) as a solution to the OSAD problem. The proposed network uses an encoder with feature-denoising layers coupled with a classifier to learn a noise-free latent feature representation. Two techniques are employed to obtain an informative latent feature space with the objective of improving open-set performance. First, a decoder is used to ensure that clean images can be reconstructed from the obtained latent features. Then, self-supervision is used to ensure that the latent features are informative enough to carry out an auxiliary task. We introduce a testing protocol to evaluate OSAD performance and show the effectiveness of the proposed method on multiple object classification datasets. The implementation code of the proposed method is available at: https://github.com/rshaojimmy/ECCV2020-OSAD.

preprint, 2020, arXiv

Self-supervised Temporal Discriminative Learning for Video Representation Learning

Temporal cues in videos provide important information for recognizing actions accurately. However, temporal-discriminative features can hardly be extracted without using an annotated large-scale video action dataset for training. This paper proposes a novel Video-based Temporal-Discriminative Learning (VTDL) framework trained in a self-supervised manner. Without labelled data for network pretraining, a temporal triplet is generated for each anchor video by using segments from the same or different time intervals, so as to enhance the capacity for temporal feature representation. Measuring temporal information by the time derivative, Temporal Consistent Augmentation (TCA) is designed to ensure that the time derivative (of any order) of the augmented positive is invariant up to a scaling constant. Finally, temporal-discriminative features are learnt by minimizing the distance between each anchor and its augmented positive, while maximizing the distance between each anchor and its augmented negative, as well as other videos saved in the memory bank, to enrich the representation diversity. In the downstream action recognition task, the proposed method significantly outperforms existing related works. Surprisingly, the proposed self-supervised approach outperforms fully-supervised methods on UCF101 and HMDB51 when a small-scale video dataset (with only thousands of videos) is used for pre-training. The code is publicly available at https://github.com/FingerRec/Self-Supervised-Temporal-Discriminative-Representation-Learning-for-Video-Action-Recognition.
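The invariance property stated for TCA can be checked on a toy example. The sketch below is an illustration only, not the augmentation used in the paper: it assumes a video is a list of frames, approximates the time derivative by a frame-to-frame finite difference, and uses a uniform intensity scaling as one augmentation whose temporal differences change only by a constant factor. The names `temporal_difference` and `scale_intensity` are hypothetical.

```python
from typing import List

# Hypothetical representation: one list of pixel values per frame.
Video = List[List[float]]


def temporal_difference(video: Video) -> Video:
    """First-order finite difference between consecutive frames."""
    return [
        [b - a for a, b in zip(prev, nxt)]
        for prev, nxt in zip(video, video[1:])
    ]


def scale_intensity(video: Video, factor: float) -> Video:
    """Apply the same multiplicative factor to every frame."""
    return [[factor * p for p in frame] for frame in video]


video = [[0.0, 1.0], [1.0, 3.0], [4.0, 3.0]]
augmented = scale_intensity(video, 2.0)

original_diff = temporal_difference(video)
augmented_diff = temporal_difference(augmented)

# The augmented difference equals the original difference times the factor.
print(augmented_diff == scale_intensity(original_diff, 2.0))  # True
```

Because the same factor is applied to every frame, applying `temporal_difference` repeatedly (a higher-order difference) is scaled by that same constant as well.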

preprint, 2012, arXiv

Interactive Character Posing by Sparse Coding

Character posing is of interest in computer animation. It is difficult because it depends on inverse kinematics (IK) techniques and on the articulated nature of human characters. To solve the IK problem, classical methods that rely on numerical solutions often suffer from under-determination and cannot guarantee naturalness. Existing data-driven methods address this problem by learning from motion capture data. When facing a large variety of poses, however, these methods may fail to capture the pose styles or may not be applicable in real-time environments. Inspired by the low-rank motion de-noising and completion model in \cite{lai2011motion}, we propose a novel model for character posing based on sparse coding. Unlike conventional approaches, our model directly captures the pose styles in Euclidean space to provide intuitive training error measurements and facilitate pose synthesis. A pose dictionary is learned in the training stage, and natural poses are synthesized from it to satisfy users' constraints. We compare our model with existing models on the tasks of pose de-noising and completion. Experiments show that our model obtains lower de-noising and completion errors. We also provide user interface (UI) examples illustrating that our model is effective for interactive character posing.

preprint, 2012, arXiv

ProPPA: A Fast Algorithm for $\ell_1$ Minimization and Low-Rank Matrix Completion

We propose a Projected Proximal Point Algorithm (ProPPA) for solving a class of optimization problems. The algorithm iteratively computes the proximal point of the last estimated solution, projected onto an affine space that is itself parallel to, and approaches, the feasible set. We provide a convergence analysis that theoretically supports the general algorithm, and then apply it to $\ell_1$-minimization problems and the matrix completion problem. These problems arise in many applications, including machine learning and image and signal processing. We compare our algorithm with existing state-of-the-art algorithms. Experimental results on these problems show that our algorithm is very efficient and competitive.
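For readers unfamiliar with the building blocks named in this abstract, the standard definitions are summarized below. They are textbook facts, not the specific ProPPA update, which the abstract does not spell out. The constraint form $Ax = b$ for the feasible set and the step parameter $\lambda > 0$ are assumptions introduced here for illustration.

```latex
% Equality-constrained l1 minimization (assumed form of the feasible set):
\min_{x} \; \lVert x \rVert_1 \quad \text{subject to} \quad A x = b

% Proximal point of v for a convex function f with parameter \lambda > 0:
\operatorname{prox}_{\lambda f}(v)
  = \arg\min_{x} \Big( f(x) + \frac{1}{2\lambda} \lVert x - v \rVert_2^2 \Big)

% For f = \lVert \cdot \rVert_1 this reduces to componentwise soft thresholding:
\big[ \operatorname{prox}_{\lambda \lVert \cdot \rVert_1}(v) \big]_i
  = \operatorname{sign}(v_i) \, \max\big( \lvert v_i \rvert - \lambda, \; 0 \big)
```

The closed form for the $\ell_1$ proximal point is what makes proximal methods cheap per iteration on such problems.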
