Source author record

Yuhong Guo

Yuhong Guo appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Topics: Machine Learning; Computer Vision; Computation and Language; eess.IV; Information Retrieval; math.CO; Social and Information Networks

Catalog footprint: 14 works, 7 topics, 4 close collaborators


preprint, 2023, arXiv

Object Detection in 20 Years: A Survey

Object detection, as one of the most fundamental and challenging problems in computer vision, has received great attention in recent years. Over the past two decades, we have seen a rapid technological evolution of object detection and its profound impact on the entire computer vision field. If we consider today's object detection technique as a revolution driven by deep learning, then back in the 1990s, we would see the ingenious thinking and long-term perspective design of early computer vision. This paper extensively reviews this fast-moving research field in the light of technical evolution, spanning over a quarter-century's time (from the 1990s to 2022). A number of topics have been covered in this paper, including the milestone detectors in history, detection datasets, metrics, fundamental building blocks of the detection system, speed-up techniques, and the recent state-of-the-art detection methods.

preprint, 2021, arXiv

A novel method based on node correlation to evaluate the important nodes in complex networks

Finding the important nodes in complex networks by topological structure is of great significance to network invulnerability. Several centrality measures have been proposed recently to evaluate the performance of nodes based on their correlation, showing that the interaction between nodes has an influence on the importance of nodes. In this paper, a novel method based on node distribution and global influence in complex networks is proposed. Our main idea is that the importance of a node is linked not only to its relative position in the network but also to its correlations with other nodes. The nodes in the complex network are classified according to the distance matrix, and then the correlation coefficient between each pair of nodes is calculated. From a whole-network perspective, the global similarity centrality (GSC) is proposed based on the relevance and shortest distance between any two nodes. The efficiency, accuracy and monotonicity of the proposed method are analyzed on two artificial datasets and eight real datasets of different sizes. Experimental results show that the GSC method outperforms current state-of-the-art algorithms.

preprint, 2020, arXiv

A Transductive Multi-Head Model for Cross-Domain Few-Shot Learning

In this paper, we present a new method, Transductive Multi-Head Few-Shot learning (TMHFS), to address the Cross-Domain Few-Shot Learning (CD-FSL) challenge. The TMHFS method extends the Meta-Confidence Transduction (MCT) and Dense Feature-Matching Networks (DFMN) method [2] by introducing a new prediction head, i.e., an instance-wise global classification network based on semantic information, after the common feature embedding network. We train the embedding network with the multiple heads, i.e., the MCT loss, the DFMN loss and the semantic classifier loss, simultaneously in the source domain. For the few-shot learning in the target domain, we first perform fine-tuning on the embedding network with only the semantic global classifier and the support instances, and then use the MCT part to predict labels of the query set with the fine-tuned embedding network. Moreover, we further exploit data augmentation techniques during the fine-tuning and test stages to improve the prediction performance. The experimental results demonstrate that the proposed methods greatly outperform the strong baseline, fine-tuning, on four different target domains.

preprint, 2020, arXiv

Adaptive Object Detection with Dual Multi-Label Prediction

In this paper, we propose a novel end-to-end unsupervised deep domain adaptation model for adaptive object detection by exploiting multi-label object recognition as a dual auxiliary task. The model exploits multi-label prediction to reveal the object category information in each image and then uses the prediction results to perform conditional adversarial global feature alignment, such that the multi-modal structure of image features can be tackled to bridge the domain divergence at the global feature level while preserving the discriminability of the features. Moreover, we introduce a prediction consistency regularization mechanism to assist object detection, which uses the multi-label prediction results as auxiliary regularization information to ensure consistent object category discoveries between the object recognition task and the object detection task. Experiments are conducted on a few benchmark datasets and the results show that the proposed model outperforms the state-of-the-art comparison methods.

preprint, 2020, arXiv

Adversarial Partial Multi-Label Learning

Partial multi-label learning (PML), which tackles the problem of learning multi-label prediction models from instances with overcomplete noisy annotations, has recently started gaining attention from the research community. In this paper, we propose a novel adversarial learning model, PML-GAN, under a generalized encoder-decoder framework for partial multi-label learning. The PML-GAN model uses a disambiguation network to identify noisy labels and uses a multi-label prediction network to map the training instances to the disambiguated label vectors, while deploying a generative adversarial network as an inverse mapping from label vectors to data samples in the input feature space. The learning of the overall model corresponds to a minimax adversarial game, which enhances the correspondence of input features with the output labels in a bi-directional mapping. Extensive experiments are conducted on multiple datasets, and the proposed model demonstrates state-of-the-art performance for partial multi-label learning.

preprint, 2020, arXiv

Ensemble Model with Batch Spectral Regularization and Data Blending for Cross-Domain Few-Shot Learning with Unlabeled Data

In this paper, we present our proposed ensemble model with batch spectral regularization and data blending mechanisms for the Track 2 problem of the cross-domain few-shot learning (CD-FSL) challenge. We build a multi-branch ensemble framework by using diverse feature transformation matrices, while deploying batch spectral feature regularization on each branch to improve the model's transferability. Moreover, we propose a data blending method to exploit the unlabeled data and augment the sparse support set in the target domain. Our proposed model demonstrates effective performance on the CD-FSL benchmark tasks.

preprint, 2020, arXiv

Feature Transformation Ensemble Model with Batch Spectral Regularization for Cross-Domain Few-Shot Classification

In this paper, we propose a feature transformation ensemble model with batch spectral regularization for the Cross-Domain Few-Shot Learning (CD-FSL) challenge. Specifically, we propose to construct an ensemble prediction model by performing diverse feature transformations after a feature extraction network. On each branch prediction network of the model, we use a batch spectral regularization term to suppress the singular values of the feature matrix during pre-training to improve the generalization ability of the model. The proposed model can then be fine-tuned in the target domain to address few-shot classification. We also apply label propagation, entropy minimization and data augmentation to mitigate the shortage of labeled data in target domains. Experiments are conducted on a number of CD-FSL benchmark tasks with four target domains and the results demonstrate the superiority of our proposed model.
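
The abstract above describes a regularization term that suppresses the singular values of the batch feature matrix, but it does not state the exact formula. The following is a minimal sketch that assumes the penalty is the sum of squared singular values of a (batch_size, feature_dim) matrix. The function name `batch_spectral_penalty` and the random batch are hypothetical and used only for illustration.

```python
import numpy as np


def batch_spectral_penalty(features: np.ndarray) -> float:
    """Sum of squared singular values of a (batch_size, feature_dim) matrix.

    Assumption: the regularizer penalizes all singular values equally.
    The abstract does not specify the exact form.
    """
    # compute_uv=False returns only the singular values.
    singular_values = np.linalg.svd(features, compute_uv=False)
    return float(np.sum(singular_values ** 2))


# Hypothetical batch of 8 feature vectors with 16 dimensions each.
rng = np.random.default_rng(0)
batch = rng.normal(size=(8, 16))
penalty = batch_spectral_penalty(batch)
```

Under this assumption the penalty equals the squared Frobenius norm of the feature matrix, so shrinking the features by a factor of 2 reduces the penalty by a factor of 4. In training, such a term would typically be added to the classification loss with a small weight.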

preprint, 2020, arXiv

Multi-Level Generative Models for Partial Label Learning with Non-random Label Noise

Partial label (PL) learning tackles the problem where each training instance is associated with a set of candidate labels that include both the true label and irrelevant noise labels. In this paper, we propose a novel multi-level generative model for partial label learning (MGPLL), which tackles the problem by learning both a label level adversarial generator and a feature level adversarial generator under a bi-directional mapping framework between the label vectors and the data samples. Specifically, MGPLL uses a conditional noise label generation network to model the non-random noise labels and perform label denoising, and uses a multi-class predictor to map the training instances to the denoised label vectors, while a conditional data feature generator is used to form an inverse mapping from the denoised label vectors to data samples. Both the noise label generator and the data feature generator are learned in an adversarial manner to match the observed candidate labels and data features respectively. Extensive experiments are conducted on synthesized and real-world partial label datasets. The proposed approach demonstrates the state-of-the-art performance for partial label learning.

preprint, 2020, arXiv

Mutual Learning Network for Multi-Source Domain Adaptation

Early Unsupervised Domain Adaptation (UDA) methods have mostly assumed the setting of a single source domain, where all the labeled source data come from the same distribution. However, in practice the labeled data can come from multiple source domains with different distributions. In such scenarios, the single source domain adaptation methods can fail due to the existence of domain shifts across different source domains and multi-source domain adaptation methods need to be designed. In this paper, we propose a novel multi-source domain adaptation method, Mutual Learning Network for Multiple Source Domain Adaptation (ML-MSDA). Under the framework of mutual learning, the proposed method pairs the target domain with each single source domain to train a conditional adversarial domain adaptation network as a branch network, while taking the pair of the combined multi-source domain and target domain to train a conditional adversarial adaptive network as the guidance network. The multiple branch networks are aligned with the guidance network to achieve mutual learning by enforcing JS-divergence regularization over their prediction probability distributions on the corresponding target data. We conduct extensive experiments on multiple multi-source domain adaptation benchmark datasets. The results show the proposed ML-MSDA method outperforms the comparison methods and achieves the state-of-the-art performance.
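
The abstract above aligns each branch network with the guidance network through a JS-divergence regularizer over their prediction probability distributions. The sketch below computes the standard Jensen-Shannon divergence between two sets of class probabilities. The arrays `branch_probs` and `guidance_probs` are made-up values, and averaging over the batch is an assumption rather than a detail taken from the paper.

```python
import numpy as np


def kl_divergence(p: np.ndarray, q: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """KL(p || q) along the last axis, with clipping to avoid log(0)."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return np.sum(p * np.log(p / q), axis=-1)


def js_divergence(p: np.ndarray, q: np.ndarray) -> np.ndarray:
    """Jensen-Shannon divergence: the average KL to the mixture m = (p + q) / 2."""
    m = 0.5 * (p + q)
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical softmax outputs of one branch network and the guidance
# network on the same two target samples (three classes).
branch_probs = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]])
guidance_probs = np.array([[0.6, 0.3, 0.1], [0.2, 0.6, 0.2]])

# Assumption: the regularizer is the mean JS divergence over the batch.
js_loss = float(np.mean(js_divergence(branch_probs, guidance_probs)))
```

Unlike the KL divergence, the JS divergence is symmetric and bounded above by log 2 (with natural logarithms), which makes it a convenient consistency term between two networks.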

preprint, 2020, arXiv

Time-aware Large Kernel Convolutions

To date, most state-of-the-art sequence modeling architectures use attention to build generative models for language-based tasks. Some of these models use all the available sequence tokens to generate an attention distribution, which results in a time complexity of $O(n^2)$. Alternatively, they utilize depthwise convolutions with softmax-normalized kernels of size $k$ acting as a limited-window self-attention, resulting in a time complexity of $O(k \cdot n)$. In this paper, we introduce Time-aware Large Kernel (TaLK) Convolutions, a novel adaptive convolution operation that learns to predict the size of a summation kernel instead of using a fixed-sized kernel matrix. This method yields a time complexity of $O(n)$, effectively making the sequence encoding process linear in the number of tokens. We evaluate the proposed method on large-scale standard machine translation, abstractive summarization and language modeling datasets and show that TaLK Convolutions constitute an efficient improvement over other attention/convolution based approaches.
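
One way to see how a summation kernel with a per-token window size can run in $O(n)$ time is to use prefix sums: once the cumulative sum is known, any window sum costs two lookups regardless of its width. The sketch below illustrates that idea only and is not the paper's implementation. The integer offsets `left` and `right` are fixed by hand here (in the paper the sizes are predicted by the model), and dividing by the window width is an assumption.

```python
import numpy as np


def adaptive_window_mean(x: np.ndarray, left: np.ndarray, right: np.ndarray) -> np.ndarray:
    """Average x over a per-position window [i - left[i], i + right[i]].

    x has shape (n, d); left and right are non-negative integer arrays of
    shape (n,). Windows are clipped to the sequence boundaries.
    """
    n, d = x.shape
    # prefix[i] holds the sum of x[:i], so a window sum is one subtraction.
    prefix = np.concatenate([np.zeros((1, d)), np.cumsum(x, axis=0)], axis=0)
    idx = np.arange(n)
    start = np.clip(idx - left, 0, n - 1)
    end = np.clip(idx + right, 0, n - 1)
    window_sum = prefix[end + 1] - prefix[start]
    width = (end - start + 1)[:, None]
    return window_sum / width


# Toy sequence of 5 tokens with 1 feature each: 0, 1, 2, 3, 4.
tokens = np.arange(5.0).reshape(5, 1)
left = np.array([0, 1, 1, 2, 0])
right = np.array([0, 0, 1, 1, 0])
encoded = adaptive_window_mean(tokens, left, right)
```

The cumulative sum is computed once in a single pass, and each output position then needs a constant number of operations, so the total cost grows linearly with the sequence length even when some windows are large.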

preprint, 2020, arXiv

Unsupervised Domain Adaptation with Progressive Domain Augmentation

Domain adaptation aims to exploit a label-rich source domain for learning classifiers in a different label-scarce target domain. It is particularly challenging when there are significant divergences between the two domains. In this paper, we propose a novel unsupervised domain adaptation method based on progressive domain augmentation. The proposed method generates virtual intermediate domains via domain interpolation, progressively augments the source domain and bridges the source-target domain divergence by conducting multiple subspace alignment on the Grassmann manifold. We conduct experiments on multiple domain adaptation tasks and the results show that the proposed method achieves state-of-the-art performance.

preprint, 2012, arXiv

Convex Structure Learning for Bayesian Networks: Polynomial Feature Selection and Approximate Ordering

We present a new approach to learning the structure and parameters of a Bayesian network based on regularized estimation in an exponential family representation. Here we show that, given a fixed variable order, the optimal structure and parameters can be learned efficiently, even without restricting the size of the parent sets. We then consider the problem of optimizing the variable order for a given set of features. This is still a computationally hard problem, but we present a convex relaxation that yields an optimal 'soft' ordering in polynomial time. One novel aspect of the approach is that we do not perform a discrete search over DAG structures, nor over variable orders, but instead solve a continuous relaxation that can then be rounded to obtain a valid network structure. We conduct an experimental comparison against standard structure search procedures over standard objectives, which cope with local minima, and evaluate the advantages of using convex relaxations that reduce the effects of local minima.

preprint, 2012, arXiv

Cross Language Text Classification via Subspace Co-Regularized Multi-View Learning

In many multilingual text classification problems, the documents in different languages often share the same set of categories. To reduce the labeling cost of training a classification model for each individual language, it is important to transfer the label knowledge gained from one language to another language by conducting cross language classification. In this paper we develop a novel subspace co-regularized multi-view learning method for cross language text classification. This method is built on parallel corpora produced by machine translation. It jointly minimizes the training error of each classifier in each language while penalizing the distance between the subspace representations of parallel documents. Our empirical study on a large set of cross language text classification tasks shows the proposed method consistently outperforms a number of inductive methods, domain adaptation methods, and multi-view learning methods.

preprint, 2012, arXiv

Maximum Margin Bayesian Networks

We consider the problem of learning Bayesian network classifiers that maximize the margin over a set of classification variables. We find that this problem is harder for Bayesian networks than for undirected graphical models like maximum margin Markov networks. The main difficulty is that the parameters in a Bayesian network must satisfy additional normalization constraints that an undirected graphical model need not respect. These additional constraints complicate the optimization task. Nevertheless, we derive an effective training algorithm that solves the maximum margin training problem for a range of Bayesian network topologies, and converges to an approximate solution for arbitrary network topologies. Experimental results show that the method can demonstrate improved generalization performance over Markov networks when the directed graphical structure encodes relevant knowledge. In practice, the training technique allows one to combine prior knowledge expressed as a directed (causal) model with state-of-the-art discriminative learning methods.
