Source author record

Duo Li

Duo Li appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher | Unclaimed source record

Computer Vision, cond-mat.mes-hall, math.AG

Catalog footprint

What is connected

7 works

3 topics

4 close collaborators

Preprint, 2022, arXiv

Deep Reinforced Attention Learning for Quality-Aware Visual Recognition

In this paper, we build upon the weakly-supervised generation mechanism of intermediate attention maps in any convolutional neural network and disclose the effectiveness of attention modules more straightforwardly to fully exploit their potential. Given an existing neural network equipped with arbitrary attention modules, we introduce a meta critic network to evaluate the quality of attention maps in the main network. Due to the discreteness of our designed reward, the proposed learning method is arranged in a reinforcement learning setting, where the attention actors and recurrent critics are alternately optimized to provide instant critique and revision for the temporary attention representation, hence the name Deep REinforced Attention Learning (DREAL). It can be applied universally to network architectures with different types of attention modules and promotes their expressive ability by maximizing the relative gain in final recognition performance arising from each individual attention module, as demonstrated by extensive experiments on both category and instance recognition benchmarks.

Preprint, 2022, arXiv

E2-AEN: End-to-End Incremental Learning with Adaptively Expandable Network

Expandable networks have demonstrated their advantages in dealing with the catastrophic forgetting problem in incremental learning. Considering that different tasks may need different structures, recent methods design dynamic structures adapted to different tasks via sophisticated techniques. Their routine is to search for expandable structures first and then train on the new tasks, which, however, breaks tasks into multiple training stages, leading to suboptimal results or excessive computational cost. In this paper, we propose an end-to-end trainable adaptively expandable network named E2-AEN, which dynamically generates lightweight structures for new tasks without any accuracy drop on previous tasks. Specifically, the network contains a series of powerful feature adapters for augmenting the previously learned representations to new tasks and avoiding task interference. These adapters are controlled via an adaptive gate-based pruning strategy which decides whether the expanded structures can be pruned, making the network structure dynamically changeable according to the complexity of the new tasks. Moreover, we introduce a novel sparsity-activation regularization to encourage the model to learn discriminative features with limited parameters. E2-AEN reduces cost and can be built upon any feed-forward architecture in an end-to-end manner. Extensive experiments on both classification (i.e., CIFAR and VDD) and detection (i.e., COCO, VOC and ICCV2021 SSLAD challenge) benchmarks demonstrate the effectiveness of the proposed method, which achieves remarkable results.

Preprint, 2022, arXiv

Technical Report for ICCV 2021 Challenge SSLAD-Track3B: Transformers Are Better Continual Learners

In the SSLAD-Track 3B challenge on continual learning, we propose the method of COntinual Learning with Transformer (COLT). We find that transformers suffer less from catastrophic forgetting than convolutional neural networks. The major principle of our method is to equip the transformer-based feature extractor with old knowledge distillation and head expanding strategies to combat catastrophic forgetting. In this report, we first introduce the overall framework of continual learning for object detection. Then, we analyse the effect of the key elements of our solution on withstanding catastrophic forgetting. Our method achieves 70.78 mAP on the SSLAD-Track 3B challenge test set.

Preprint, 2020, arXiv

Learning to Learn Parameterized Classification Networks for Scalable Input Images

Convolutional Neural Networks (CNNs) do not behave predictably when the input resolution changes. This hinders the deployment of a given model on different input image resolutions. To achieve efficient and flexible image classification at runtime, we employ meta learners to generate convolutional weights of main networks for various input scales and maintain privatized Batch Normalization layers per scale. For improved training performance, we further utilize knowledge distillation on the fly over model predictions based on different input resolutions. The learned meta network can dynamically parameterize main networks to act on input images of arbitrary size with consistently better accuracy compared to individually trained models. Extensive experiments on ImageNet demonstrate that our method achieves an improved accuracy-efficiency trade-off during the adaptive inference process. By switching executable input resolutions, our method can satisfy the requirement of fast adaptation in different resource-constrained environments. Code and models are available at https://github.com/d-li14/SAN.
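The distillation step mentioned in this abstract can be illustrated with a generic temperature-scaled distillation loss. This is a minimal sketch of the standard technique, not the authors' exact objective: the function names, the temperature value, and the choice of which resolution acts as teacher are all assumptions made for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened predictions.

    The T^2 factor is the usual rescaling that keeps gradient magnitudes
    comparable across temperatures. The temperature is a hypothetical value.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2

# Assumed setup: logits of the same image at two input resolutions,
# with the higher-resolution prediction used as the teacher signal.
high_res_logits = [2.0, 0.5, -1.0]
low_res_logits = [1.2, 0.9, -0.5]
loss = distillation_loss(low_res_logits, high_res_logits)
```

The loss is zero when both predictions agree and grows as the softened distributions diverge.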

Preprint, 2020, arXiv

PSConv: Squeezing Feature Pyramid into One Compact Poly-Scale Convolutional Layer

Despite their strong modeling capacities, Convolutional Neural Networks (CNNs) are often scale-sensitive. To enhance the robustness of CNNs to scale variance, existing solutions focus on multi-scale feature fusion from different layers or filters, while the more granular kernel space is overlooked. We address this gap by exploiting multi-scale features at a finer granularity. The proposed convolution operation, named Poly-Scale Convolution (PSConv), mixes a spectrum of dilation rates and tactfully allocates them to the individual convolutional kernels of each filter within a single convolutional layer. Specifically, dilation rates vary cyclically along the axes of input and output channels of the filters, aggregating features over a wide range of scales in a neat style. PSConv can be a drop-in replacement for the vanilla convolution in many prevailing CNN backbones, allowing better representation learning without introducing additional parameters or computational complexity. Comprehensive experiments on the ImageNet and MS COCO benchmarks validate the superior performance of PSConv. Code and models are available at https://github.com/d-li14/PSConv.
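The cyclic allocation of dilation rates described in this abstract can be pictured as a grid indexed by output and input channel. The sketch below is one plausible reading, assuming the rate for kernel (o, i) is picked by shifting a fixed rate list by one position per output channel; the function name and the rate set (1, 2, 4, 8) are hypothetical and are not taken from the PSConv code.

```python
def dilation_grid(out_channels, in_channels, rates=(1, 2, 4, 8)):
    """Return grid[o][i], the dilation rate assigned to kernel (o, i).

    Rates cycle along the input-channel axis and are shifted by one
    position for each successive output channel (an assumed pattern).
    """
    period = len(rates)
    return [
        [rates[(i + o) % period] for i in range(in_channels)]
        for o in range(out_channels)
    ]

grid = dilation_grid(out_channels=4, in_channels=4)
for row in grid:
    print(row)
# [1, 2, 4, 8]
# [2, 4, 8, 1]
# [4, 8, 1, 2]
# [8, 1, 2, 4]
```

Under this pattern every filter sees every rate, and every input channel is covered at every rate across filters, without adding any parameters.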

Preprint, 2015, arXiv

Classification of two-dimensional algebraic projective semigroups

In this article, we address the classification of smooth projective algebraic surfaces over the complex numbers admitting algebraic semigroup structures. We give a full description of those surfaces $S$ which have at least one non-trivial algebraic semigroup structure, when the Kodaira dimension of $S$ is $-\infty$ or $0$. For the case $\kappa(S)=1$, we give a description of one special type of elliptic surface admitting non-trivial algebraic semigroup laws. For a given surface $S$, it is an interesting problem to describe all algebraic semigroup structures on it and determine the dimension of this moduli space. In this article, we solve this problem for the case $\kappa(S)\ge 0$.

Preprint, 2011, arXiv

Enhancement of shot noise due to the fluctuation of Coulomb interaction

We have developed a theoretical formalism to investigate the contribution of the fluctuation of Coulomb interaction to the shot noise, based on the Keldysh non-equilibrium Green's function method. We have applied our theory to study the behavior of dc shot noise of atomic junctions using the method of non-equilibrium Green's function combined with density functional theory (NEGF-DFT). In particular, for an atomic carbon wire consisting of 4 carbon atoms in contact with two Al(100) electrodes, a first-principles calculation within the NEGF-DFT formalism shows a negative differential resistance (NDR) region in the I-V curve at finite bias due to the effective band bottom of the Al lead. We have calculated the shot noise spectrum using the conventional gauge-invariant transport theory with Coulomb interaction considered explicitly at the Hartree level along with exchange and correlation effects. Although the Fano factor is enhanced from 0.6 to 0.8 in the NDR region, the expected super-Poissonian behavior in the NDR region is not observed. When the fluctuation of Coulomb interaction is included in the shot noise, our numerical results show that the Fano factor is greater than one in the NDR region, indicating super-Poissonian behavior.
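For readers unfamiliar with the Fano factor quoted above, the short sketch below uses the common definition F = S / (2e|I|), where S is the zero-frequency shot noise power and I is the average current. The abstract does not state its normalization, so this convention (Poissonian noise gives F = 1) and the helper names are assumptions.

```python
E_CHARGE = 1.602176634e-19  # elementary charge in coulombs

def fano_factor(noise_power, current):
    """F = S / (2 e |I|), assuming S is normalized so that Poissonian noise gives F = 1."""
    if current == 0:
        raise ValueError("Fano factor is undefined at zero current")
    return noise_power / (2.0 * E_CHARGE * abs(current))

def noise_regime(fano):
    """Label the regime: F > 1 is super-Poissonian, F < 1 is sub-Poissonian."""
    if fano > 1.0:
        return "super-Poissonian"
    if fano < 1.0:
        return "sub-Poissonian"
    return "Poissonian"

# The values 0.6 and 0.8 reported without Coulomb fluctuation are both below one.
regimes = [noise_regime(f) for f in (0.6, 0.8)]
```

With this definition, the enhancement from 0.6 to 0.8 stays sub-Poissonian, and only a Fano factor above one counts as super-Poissonian, which matches how the abstract uses the terms.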
