Researcher profile

Zhengming Ding

Zhengming Ding contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 (Emerging) | Verification L1 | Unclaimed author
16 works
0 followers
5 topics
4 close collaborators

Published work

16 published items

preprint, 2022, arXiv

Learnable Visual Words for Interpretable Image Recognition

To interpret deep models' predictions, attention-based visual cues are widely used to explain why deep models make such predictions. Beyond that, the research community has become more interested in reasoning about how deep models make predictions, and some prototype-based methods employ interpretable representations with their corresponding visual cues to reveal the black-box mechanism of deep model behavior. However, these pioneering attempts either learn only category-specific prototypes, which degrades their generalization capacity, or present several illustrative examples without a quantitative evaluation of visual interpretability, which limits their practical use. In this paper, we revisit the concept of visual words and propose Learnable Visual Words (LVW) to interpret model prediction behavior with two novel modules: semantic visual words learning and dual fidelity preservation. Semantic visual words learning relaxes the category-specific constraint, enabling general visual words to be shared across different categories. Beyond employing the visual words for prediction to align them with the base model, our dual fidelity preservation also includes an attention-guided semantic alignment that encourages the learned visual words to focus on the same conceptual regions for prediction. Experiments on six visual benchmarks demonstrate the superior effectiveness of the proposed LVW in both accuracy and model interpretation over state-of-the-art methods. Moreover, we provide various in-depth analyses to further explore the learned visual words and the generalizability of our method to unseen categories.

preprint, 2022, arXiv

Local Learning Matters: Rethinking Data Heterogeneity in Federated Learning

Federated learning (FL) is a promising strategy for performing privacy-preserving, distributed learning with a network of clients (i.e., edge devices). However, the data distribution among clients is often non-IID in nature, making efficient optimization difficult. To alleviate this issue, many FL algorithms focus on mitigating the effects of data heterogeneity across clients by introducing a variety of proximal terms, some incurring considerable compute and/or memory overheads, to restrain local updates with respect to the global model. Instead, we consider rethinking solutions to data heterogeneity in FL with a focus on local learning generality rather than proximal restriction. To this end, we first present a systematic study informed by second-order indicators to better understand algorithm effectiveness in FL. Interestingly, we find that standard regularization methods are surprisingly strong performers in mitigating data heterogeneity effects. Based on our findings, we further propose a simple and effective method, FedAlign, to overcome data heterogeneity and the pitfalls of previous methods. FedAlign achieves competitive accuracy with state-of-the-art FL methods across a variety of settings while minimizing computation and memory overhead. Code is available at https://github.com/mmendiet/FedAlign
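
The proximal terms mentioned in this abstract can be illustrated with a minimal sketch. It assumes a FedProx-style penalty of the form (mu / 2) * ||w - w_global||^2 added to a client's local loss; the quadratic client objective, the `local_step` helper, and all constants are hypothetical and chosen only for illustration. This is not FedAlign, which the abstract describes as focusing on local learning generality instead of proximal restriction.

```python
import numpy as np

def local_step(w, w_global, grad_fn, lr=0.1, mu=0.0):
    """One local gradient step on loss(w) + (mu / 2) * ||w - w_global||^2."""
    return w - lr * (grad_fn(w) + mu * (w - w_global))

# Hypothetical client objective 0.5 * ||w - target||^2; its optimum lies far
# from the global model, mimicking a client with heterogeneous data.
target = np.array([4.0, -2.0])
grad_fn = lambda w: w - target

w_global = np.zeros(2)
w_plain, w_prox = w_global.copy(), w_global.copy()
for _ in range(200):
    w_plain = local_step(w_plain, w_global, grad_fn, mu=0.0)  # unconstrained local training
    w_prox = local_step(w_prox, w_global, grad_fn, mu=1.0)    # proximally restrained training

# How far each local model drifted from the global model.
drift_plain = np.linalg.norm(w_plain - w_global)
drift_prox = np.linalg.norm(w_prox - w_global)
print(drift_plain, drift_prox)
```

With mu = 1 the local optimum moves halfway back toward the global model, so the drift is halved. The price is the extra term in every local step, and some methods additionally keep extra state per client, which is the kind of overhead the abstract refers to.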

preprint, 2022, arXiv

On the Equity of Nuclear Norm Maximization in Unsupervised Domain Adaptation

Nuclear norm maximization has been shown empirically to enhance the transferability of unsupervised domain adaptation (UDA) models. In this paper, we identify a new property termed equity, which indicates how balanced the predicted classes are, to explain the efficacy of nuclear norm maximization for UDA theoretically. With this in mind, we offer a new discriminability-and-equity maximization paradigm built on squares loss, such that predictions are equalized explicitly. To verify its feasibility and flexibility, two new losses, termed Class Weighted Squares Maximization (CWSM) and Normalized Squares Maximization (NSM), are proposed to maximize both predictive discriminability and equity at the class level and the sample level, respectively. Importantly, we theoretically relate these two losses to equity maximization under mild conditions, and empirically show the importance of predictive equity in UDA. Moreover, the equity constraints in both losses are very efficient to realize. Experiments on cross-domain image classification with three popular benchmark datasets show that both CWSM and NSM contribute to outperforming the corresponding counterparts.
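
A small numerical sketch can make the notion of equity concrete. It compares the nuclear norm and the Frobenius norm (the square root of the sum of squared predictions) on three hand-built prediction matrices. The matrices and the `nuclear_norm` helper are illustrative assumptions; the code does not implement CWSM or NSM, whose exact forms are not given in the abstract.

```python
import numpy as np

def nuclear_norm(P):
    """Sum of singular values of a batch prediction matrix (rows = samples)."""
    return np.linalg.norm(P, ord="nuc")

n, k = 6, 3
balanced = np.eye(k)[np.arange(n) % k]          # confident, classes used evenly
collapsed = np.eye(k)[np.zeros(n, dtype=int)]   # confident, everything in class 0
uniform = np.full((n, k), 1.0 / k)              # maximally uncertain predictions

for name, P in [("balanced", balanced), ("collapsed", collapsed), ("uniform", uniform)]:
    print(name, nuclear_norm(P), np.linalg.norm(P))  # nuclear norm, Frobenius norm
```

The balanced and collapsed matrices are equally confident and have the same Frobenius norm, yet the balanced one has the larger nuclear norm. In this toy case the nuclear norm rewards class balance in addition to confidence, which is the kind of behavior the abstract attributes to equity.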

preprint, 2021, arXiv

A Unified Joint Maximum Mean Discrepancy for Domain Adaptation

Domain adaptation has received a lot of attention in recent years, and many algorithms have been proposed with impressive progress. However, the joint probability distribution (P(X, Y)) distance for this problem has not been fully explored, since its empirical estimation derived from the maximum mean discrepancy (joint maximum mean discrepancy, JMMD) involves a complex tensor-product operator that is hard to manipulate. To solve this issue, this paper theoretically derives a unified form of JMMD that is easy to optimize, and proves that the marginal, class conditional and weighted class conditional probability distribution distances are special cases of it with different label kernels. Among these, the weighted class conditional one can not only realize feature alignment across domains at the category level, but also deal with imbalanced datasets using the class prior probabilities. From the unified JMMD, we show that JMMD degrades the feature-label dependence (discriminability) that benefits classification, and that it is sensitive to label distribution shift when the label kernel is the weighted class conditional one. Therefore, we leverage the Hilbert-Schmidt independence criterion and propose a novel MMD matrix to promote the dependence, and devise a novel label kernel that is robust to label distribution shift. Finally, we conduct extensive experiments on several cross-domain datasets to demonstrate the validity and effectiveness of the revealed theoretical results.
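
For readers unfamiliar with the marginal and class conditional distances named here, the following sketch computes the standard biased empirical estimate of squared MMD with a Gaussian kernel, plus a simple per-class average. The function names, the kernel bandwidth, and the synthetic data are assumptions made for illustration; the unified JMMD and its label kernels are not reproduced, and in practice target labels would be pseudo labels.

```python
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    """Pairwise Gaussian kernel values between rows of A and rows of B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2 * sigma**2))

def mmd2(Xs, Xt, sigma=1.0):
    """Biased empirical estimate of squared MMD (marginal distance)."""
    return (gaussian_kernel(Xs, Xs, sigma).mean()
            + gaussian_kernel(Xt, Xt, sigma).mean()
            - 2 * gaussian_kernel(Xs, Xt, sigma).mean())

def class_conditional_mmd2(Xs, ys, Xt, yt, sigma=1.0):
    """Unweighted average of per-class squared MMD over the shared classes."""
    classes = np.intersect1d(ys, yt)
    return np.mean([mmd2(Xs[ys == c], Xt[yt == c], sigma) for c in classes])

rng = np.random.default_rng(0)
Xs = rng.normal(0.0, 1.0, size=(100, 2))        # source features
Xt_same = rng.normal(0.0, 1.0, size=(100, 2))   # target drawn from the same distribution
Xt_shift = rng.normal(2.0, 1.0, size=(100, 2))  # target with a mean shift

print(mmd2(Xs, Xt_same), mmd2(Xs, Xt_shift))
```

The shifted target yields a much larger value than the matched one, which is what makes MMD usable as an alignment loss.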

preprint, 2021, arXiv

Generative Partial Visual-Tactile Fused Object Clustering

Visual-tactile fused sensing for object clustering has achieved significant progress recently, since the involvement of the tactile modality can effectively improve clustering performance. However, missing data (i.e., partial data) issues often occur due to occlusion and noise during data collection. This issue is not well solved by most existing partial multi-view clustering methods because of the heterogeneous modality challenge. Naively employing these methods would inevitably induce a negative effect and further hurt performance. To address these challenges, we propose a Generative Partial Visual-Tactile Fused (GPVTF) framework for object clustering. More specifically, we first extract partial visual and tactile features from the partial visual and tactile data, respectively, and encode the extracted features in modality-specific feature subspaces. A conditional cross-modal clustering generative adversarial network is then developed to synthesize one modality conditioned on the other, which can compensate for missing samples and align the visual and tactile modalities naturally through adversarial learning. Finally, two pseudo-label based KL-divergence losses are employed to update the corresponding modality-specific encoders. Extensive comparative experiments on three public visual-tactile datasets demonstrate the effectiveness of our method.

preprint, 2020, arXiv

Adaptively-Accumulated Knowledge Transfer for Partial Domain Adaptation

Partial domain adaptation (PDA) has attracted considerable attention because it deals with a realistic and challenging problem in which the source domain label space subsumes the target domain label space. Most conventional domain adaptation (DA) efforts concentrate on learning domain-invariant features to mitigate the distribution disparity across domains. However, for PDA it is crucial to explicitly alleviate the negative influence caused by the irrelevant source domain categories. In this work, we propose an Adaptively-Accumulated Knowledge Transfer framework (A²KT) to align the relevant categories across two domains for effective domain adaptation. Specifically, an adaptively-accumulated mechanism is explored to gradually filter out the most confident target samples and their corresponding source categories, promoting positive transfer with more knowledge across two domains. Moreover, a dual distinct classifier architecture consisting of a prototype classifier and a multilayer perceptron classifier is built to capture intrinsic data distribution knowledge across domains from various perspectives. By maximizing the inter-class center-wise discrepancy and minimizing the intra-class sample-wise compactness, the proposed model is able to obtain more domain-invariant and task-specific discriminative representations of the shared-category data. Comprehensive experiments on several partial domain adaptation benchmarks demonstrate the effectiveness of our proposed model compared with state-of-the-art PDA methods.

preprint, 2020, arXiv

Adversarial Dual Distinct Classifiers for Unsupervised Domain Adaptation

Unsupervised domain adaptation (UDA) attempts to recognize unlabeled target samples by building a learning model from a differently distributed labeled source domain. Conventional UDA concentrates on extracting domain-invariant features through deep adversarial networks. However, most such methods seek to match the feature distributions of the different domains without considering the task-specific decision boundaries across classes. In this paper, we propose a novel Adversarial Dual Distinct Classifiers Network (AD²CN) to align the source and target domain data distributions while simultaneously matching task-specific category boundaries. Specifically, a domain-invariant feature generator is exploited to embed the source and target data into a common latent space with the guidance of discriminative cross-domain alignment. Moreover, we design two classifiers with different structures to identify the unlabeled target samples under the supervision of the labeled source domain data. Such dual distinct classifiers with different architectures can capture diverse knowledge of the target data structure from different perspectives. Extensive experimental results on several cross-domain visual benchmarks demonstrate the model's effectiveness in comparison with other state-of-the-art UDA methods.

preprint, 2020, arXiv

Bi-Directional Generation for Unsupervised Domain Adaptation

Unsupervised domain adaptation supports learning on an unlabeled target domain by relying on well-established source domain information. Conventional methods that forcefully reduce the domain discrepancy in the latent space can destroy the intrinsic data structure. To balance mitigating the domain gap and preserving the inherent structure, we propose a Bi-Directional Generation domain adaptation model with consistent classifiers, interpolating two intermediate domains to bridge the source and target domains. Specifically, two cross-domain generators are employed to synthesize one domain conditioned on the other. The performance of our proposed method can be further enhanced by the consistent classifiers and the cross-domain alignment constraints. We also design two classifiers that are jointly optimized to maximize the consistency of target sample predictions. Extensive experiments verify that our proposed model outperforms the state of the art on standard cross-domain visual benchmarks.

preprint, 2020, arXiv

Deep Residual Correction Network for Partial Domain Adaptation

Deep domain adaptation methods have achieved appealing performance by learning transferable representations from a well-labeled source domain to a different but related unlabeled target domain. Most existing works assume that source and target data share an identical label space, which is often difficult to satisfy in real-world applications. With the emergence of big data, there is a more practical scenario called partial domain adaptation, where we have access to a larger-scale source domain while working on a relatively small-scale target domain. In this case, the conventional domain adaptation assumption should be relaxed, and the target label space tends to be a subset of the source label space. Intuitively, reinforcing the positive effects of the most relevant source subclasses and reducing the negative impacts of irrelevant source subclasses are of vital importance in addressing the partial domain adaptation challenge. This paper proposes an efficiently implemented Deep Residual Correction Network (DRCN) by plugging one residual block into the source network along with the task-specific feature layer, which effectively enhances the adaptation from source to target and explicitly weakens the influence of the irrelevant source classes. Specifically, the plugged residual block, which consists of several fully-connected layers, deepens the basic network and boosts its feature representation capability correspondingly. Moreover, we design a weighted class-wise domain alignment loss to couple the two domains by matching the feature distributions of shared classes between source and target. Comprehensive experiments on partial, traditional and fine-grained cross-domain visual recognition demonstrate that DRCN is superior to competitive deep domain adaptation approaches.

preprint, 2020, arXiv

Discriminative Cross-Domain Feature Learning for Partial Domain Adaptation

Partial domain adaptation aims to adapt knowledge from a larger and more diverse source domain to a smaller target domain with fewer classes, and it has attracted considerable attention. Recent practice in domain adaptation extracts effective features by incorporating pseudo labels for the target domain to better counter the cross-domain distribution divergence. However, it is essential to align target data with only a small set of source data. In this paper, we develop a novel Discriminative Cross-Domain Feature Learning (DCDF) framework to iteratively optimize target labels with a cross-domain graph in a weighted scheme. Specifically, a weighted cross-domain center loss and weighted cross-domain graph propagation are proposed to couple unlabeled target data to related source samples for discriminative cross-domain feature learning, where irrelevant source centers are ignored, to alleviate the marginal and conditional disparities simultaneously. Experimental evaluations on several popular benchmarks demonstrate the effectiveness of our proposed approach in facilitating recognition for the unlabeled target domain, in comparison with state-of-the-art partial domain adaptation approaches.

preprint, 2020, arXiv

Domain Conditioned Adaptation Network

Tremendous research efforts have been made to advance deep domain adaptation (DA) by seeking domain-invariant features. Most existing deep DA models only focus on aligning feature representations of task-specific layers across domains while using a totally shared convolutional architecture for source and target. However, we argue that such strongly shared convolutional layers might be harmful for domain-specific feature learning when the source and target data distributions differ to a large extent. In this paper, we relax the shared-convnets assumption made by previous DA methods and propose a Domain Conditioned Adaptation Network (DCAN), which aims to excite distinct convolutional channels with a domain conditioned channel attention mechanism. As a result, the critical low-level domain-dependent knowledge can be explored appropriately. As far as we know, this is the first work to explore domain-wise convolutional channel activation for deep DA networks. Moreover, to effectively align high-level feature distributions across the two domains, we further deploy domain conditioned feature correction blocks after task-specific layers, which explicitly correct the domain discrepancy. Extensive experiments on three cross-domain benchmarks demonstrate that the proposed approach outperforms existing methods by a large margin, especially on very tough cross-domain learning tasks.
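
The idea of exciting distinct channels per domain can be sketched with a squeeze-and-excitation style gate whose excitation weights are selected by domain. The architecture below (global average pooling, one hidden layer, sigmoid gate, separate weights for "source" and "target") is an assumption made for illustration; the abstract does not specify DCAN's layers, and the random weights stand in for learned parameters.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, W1, W2):
    """Squeeze-and-excitation style gating for a feature map of shape (C, H, W)."""
    squeezed = feat.mean(axis=(1, 2))        # squeeze: one statistic per channel, shape (C,)
    hidden = np.maximum(W1 @ squeezed, 0.0)  # bottleneck layer with ReLU, shape (C // r,)
    gate = sigmoid(W2 @ hidden)              # per-channel gate in [0, 1], shape (C,)
    return feat * gate[:, None, None], gate

rng = np.random.default_rng(0)
C, r = 8, 2  # channel count and reduction ratio (hypothetical values)

# Shared convolutional features, but a separate excitation branch per domain.
params = {
    domain: (rng.normal(size=(C // r, C)), rng.normal(size=(C, C // r)))
    for domain in ("source", "target")
}

feat = rng.normal(size=(C, 4, 4))
out_s, gate_s = channel_attention(feat, *params["source"])
out_t, gate_t = channel_attention(feat, *params["target"])
print(gate_s.round(2))
print(gate_t.round(2))
```

The same feature map is rescaled differently depending on which domain's branch is used, so each domain can emphasize its own subset of channels while the convolutional filters stay shared.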

preprint, 2020, arXiv

Generative Partial Multi-View Clustering

With the rapid development of data collection sources and feature extraction methods, multi-view data have become easy to obtain and have received increasing research attention in recent years. Within this area, multi-view clustering (MVC) forms a mainstream research direction and is widely used in data analysis. However, existing MVC methods mainly assume that each sample appears in all the views, without considering the incomplete view case caused by data corruption, sensor failure, equipment malfunction, etc. In this study, we design and build a generative partial multi-view clustering model, named GP-MVC, to address the incomplete multi-view problem by explicitly generating the data of missing views. The main idea of GP-MVC is two-fold. First, multi-view encoder networks are trained to learn common low-dimensional representations, followed by a clustering layer to capture the consistent cluster structure across multiple views. Second, view-specific generative adversarial networks are developed to generate the missing data of one view conditioned on the shared representation given by other views. These two steps promote each other: learning common representations facilitates data imputation, and the generated data further explore the view consistency. Moreover, a weighted adaptive fusion scheme is implemented to exploit the complementary information among different views. Experimental results on four benchmark datasets show the effectiveness of the proposed GP-MVC over state-of-the-art methods.

preprint, 2020, arXiv

Importance Filtered Cross-Domain Adaptation

In Domain Adaptation (DA), the category-relevant losses usually occupy a dominant position, and in existing models they are usually built with hard or soft labels. We observe that hard labels are overconfident because hard samples exist, and soft labels are ambiguous because too many small noisy probabilities are involved; both can easily cause negative transfer. Besides, the category-irrelevant losses in the Closed-Set DA (CSDA) paradigm fail to work in Open-Set DA (OSDA), where they also have to take a category-relevant form, since target data samples are split into shared and private classes. To this end, we propose a newly unified DA framework, Importance Filtered Cross-Domain Adaptation (IFCDA). First, an importance filtered mechanism is devised to generate filtered soft labels that mitigate negative transfer. Specifically, the soft labels are divided into confident and ambiguous ones. Then, only the maximum probability in each confident label is retained, and a threshold value is set to truncate each ambiguous label so that only prominent probabilities are kept. Moreover, a general graph-based label propagation is devised to obtain soft labels in both CSDA and OSDA, where an extra component is embedded into the label vector so that it can detect novel target classes. Finally, the category-relevant losses in both scenarios are reformulated using filtered soft labels, while the category-irrelevant MMD loss in CSDA is reformulated in a form like class-wise MMD using the newly designed importance filtered soft labels. Notably, the CSDA paradigm is a special case in which all extra components are set to 0, so the proposed approach is suited to both CSDA and OSDA. Comprehensive experiments on benchmark cross-domain object recognition datasets verify that the proposed approach outperforms several state-of-the-art methods in both scenarios.
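
The filtering step described in this abstract (keep only the maximum probability of confident labels, truncate ambiguous labels at a threshold) can be sketched as follows. The rule for deciding that a label is confident, the two threshold values, the renormalization, and the name `filter_soft_labels` are all assumptions; the abstract does not state them.

```python
import numpy as np

def filter_soft_labels(P, confident_at=0.8, keep_above=0.2):
    """Filter soft labels row by row (rows = samples, columns = classes).

    Assumed rules: a row is 'confident' if its largest probability is at least
    `confident_at`; confident rows keep only that entry, ambiguous rows keep
    entries of at least `keep_above`. Rows are renormalized to sum to 1.
    """
    P = np.asarray(P, dtype=float)
    rows = np.arange(len(P))
    top = P.argmax(axis=1)
    confident = P.max(axis=1) >= confident_at

    # Ambiguous rows keep prominent probabilities; confident rows keep nothing yet.
    mask = np.where(confident[:, None], False, P >= keep_above)
    mask[rows, top] = True  # every row always keeps its maximum entry

    filtered = P * mask
    return filtered / filtered.sum(axis=1, keepdims=True)

soft = np.array([
    [0.90, 0.05, 0.05],  # confident: collapses to a one-hot label
    [0.45, 0.40, 0.15],  # ambiguous: the small 0.15 entry is truncated
])
filtered = filter_soft_labels(soft)
print(filtered)
```

The confident row becomes a hard label, while the ambiguous row keeps its two prominent classes and drops the small noisy probability, which matches the compromise between hard and soft labels that the abstract describes.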

preprint, 2020, arXiv

Learning Robust Data Representation: A Knowledge Flow Perspective

Learning robust visual representations is in constant demand for various learning problems; however, this learning and maintenance process usually suffers from noise, incompleteness or knowledge domain mismatch. Thus, robust representation learning by removing noisy features or samples, complementing incomplete data, and mitigating the distribution difference becomes the key. Along this line of research, low-rank modeling has been widely applied to solving representation learning challenges. This survey covers the topic from a knowledge flow perspective in terms of: (1) robust knowledge recovery, (2) robust knowledge transfer, and (3) robust knowledge fusion, centered around several major applications. First, we deliver a unified formulation for robust knowledge discovery given a single dataset. Second, we discuss robust knowledge transfer and fusion given multiple datasets with different knowledge flows, followed by practical challenges, model variations, and remarks. Finally, we highlight future research on robust knowledge discovery for incomplete, unbalanced, large-scale data analysis. This should benefit the AI community, from literature review to future directions.
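
As a minimal illustration of why low-rank modeling helps with noisy data, the sketch below denoises a matrix by truncated SVD, which gives the best rank-r approximation in the Frobenius norm. This is the simplest member of the low-rank family and is chosen here as an assumption; the survey's unified formulation is not reproduced, and the data, rank, and noise level are synthetic.

```python
import numpy as np

def low_rank_approx(X, rank):
    """Best rank-`rank` approximation of X in the Frobenius norm (truncated SVD)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]

rng = np.random.default_rng(0)
clean = rng.normal(size=(50, 3)) @ rng.normal(size=(3, 40))  # rank-3 ground truth
noisy = clean + 0.1 * rng.normal(size=clean.shape)           # dense Gaussian noise

recovered = low_rank_approx(noisy, rank=3)
err_noisy = np.linalg.norm(noisy - clean)
err_recovered = np.linalg.norm(recovered - clean)
print(err_noisy, err_recovered)
```

Projecting onto the dominant three singular directions discards most of the noise energy, so the recovered matrix is closer to the clean one than the noisy observation is. Handling sparse gross corruption or missing entries requires more robust formulations than this plain truncation.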

preprint, 2020, arXiv

Rethink Maximum Mean Discrepancy for Domain Adaptation

Existing domain adaptation methods aim to reduce the distributional difference between the source and target domains and to respect their specific discriminative information by establishing the Maximum Mean Discrepancy (MMD) and the discriminative distances. However, they usually simply accumulate those statistics and handle their relationships by estimating parameters blindly. This paper theoretically proves two essential facts: 1) minimizing the MMD is equivalent to maximizing the source and target intra-class distances separately while jointly minimizing their variance with some implicit weights, so that the feature discriminability degrades; 2) the relationship between the intra-class and inter-class distances is that as one falls, the other rises. Based on this, we propose a novel discriminative MMD. On the one hand, we consider the intra-class and inter-class distances alone to remove a redundant parameter, and the revealed weights provide their approximate optimal ranges. On the other hand, we design two different strategies to boost the feature discriminability: 1) we directly impose a trade-off parameter on the implicit intra-class distance in MMD to regulate its change; 2) we impose weights similar to those revealed in MMD on the inter-class distance and maximize it, so that a balance factor can be introduced to quantitatively weigh the relative importance of feature transferability and discriminability. Experiments on several benchmark datasets not only confirm the validity of the theoretical results but also demonstrate that our approach performs substantially better than the compared state-of-the-art methods.
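
The second fact is reminiscent of a standard identity from discriminant analysis: total scatter equals within-class scatter plus between-class scatter, so for a fixed total one term can only fall if the other rises. The sketch below checks that identity numerically. It is offered as an analogy under the assumption that scatter is an acceptable stand-in for the paper's distances; the paper's own definitions and proof are not reproduced.

```python
import numpy as np

def scatter_terms(X, y):
    """Return (total, within-class, between-class) scatter as scalar traces."""
    mean_all = X.mean(axis=0)
    total = ((X - mean_all) ** 2).sum()
    within, between = 0.0, 0.0
    for c in np.unique(y):
        Xc = X[y == c]
        mean_c = Xc.mean(axis=0)
        within += ((Xc - mean_c) ** 2).sum()                   # spread inside class c
        between += len(Xc) * ((mean_c - mean_all) ** 2).sum()  # class mean vs. global mean
    return total, within, between

rng = np.random.default_rng(0)
y = np.repeat([0, 1, 2], 30)
X = rng.normal(size=(90, 4)) + y[:, None]  # three classes with shifted means

total, within, between = scatter_terms(X, y)
print(total, within + between)
```

The two printed numbers agree up to floating-point error, confirming that total scatter splits exactly into the within-class and between-class parts.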

preprint, 2020, arXiv

Sparsely-Labeled Source Assisted Domain Adaptation

Domain Adaptation (DA) aims to generalize the classifier learned from the source domain to the target domain. Existing DA methods usually assume that rich labels are available in the source domain. However, the source domain often contains a large amount of unlabeled data but only a few labeled samples, and how to transfer knowledge from this sparsely-labeled source domain to the target domain is still a challenge, which greatly limits the application of these methods in the wild. This paper proposes a novel Sparsely-Labeled Source Assisted Domain Adaptation (SLSA-DA) algorithm to address the challenge of limited labeled source domain samples. Specifically, because of the label scarcity problem, projected clustering is conducted on both the source and target domains, so that the discriminative structures of the data can be leveraged. Then label propagation is adopted to progressively propagate the labels from the limited labeled source samples to all the unlabeled data, so that the cluster labels are revealed correctly. Finally, we jointly align the marginal and conditional distributions to mitigate the cross-domain mismatch problem, and optimize those three procedures iteratively. However, it is nontrivial to incorporate those three procedures into a unified optimization framework seamlessly, since some variables to be optimized are only implicitly involved in their formulas, and thus the procedures cannot promote each other. Remarkably, we prove that the projected clustering and conditional distribution alignment can be reformulated as different expressions, so that the implicit variables are revealed in different optimization steps. As such, the variables related to those three quantities can be optimized in a unified optimization framework and facilitate each other, which clearly improves the recognition performance.
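
The label propagation step can be illustrated with the classic closed-form graph propagation F = (I - alpha * S)^(-1) Y, where S is the symmetrically normalized affinity matrix. This textbook variant is used here as an assumption; the SLSA-DA formulation, which couples propagation with projected clustering and distribution alignment, is not reproduced. The function name, kernel bandwidth, and toy data are hypothetical.

```python
import numpy as np

def propagate_labels(X, y, n_classes, alpha=0.9, sigma=1.0):
    """Spread a few known labels over a Gaussian affinity graph.

    y holds integer class labels, with -1 marking unlabeled samples.
    """
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-sq / (2 * sigma**2))
    np.fill_diagonal(W, 0.0)                 # no self-loops
    d = W.sum(axis=1)
    S = W / np.sqrt(np.outer(d, d))          # D^(-1/2) W D^(-1/2)

    Y = np.zeros((len(X), n_classes))
    labeled = np.flatnonzero(y >= 0)
    Y[labeled, y[labeled]] = 1.0             # one-hot rows for labeled samples only

    F = np.linalg.solve(np.eye(len(X)) - alpha * S, Y)
    return F.argmax(axis=1)

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, size=(20, 2)),   # cluster for class 0
               rng.normal(3.0, 0.3, size=(20, 2))])  # cluster for class 1
y = np.full(40, -1)
y[0], y[20] = 0, 1                                   # one labeled sample per cluster

pred = propagate_labels(X, y, n_classes=2)
print(pred)
```

With a single labeled sample in each well-separated cluster, the propagated labels follow the cluster structure, which is the role label propagation plays when source labels are sparse.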