Source author record

Dan Xu

Dan Xu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Computer Vision, Machine Learning, eess.IV, Artificial Intelligence, Applications, cond-mat.soft, cond-mat.stat-mech, Graphics, Information Retrieval, Networking and Internet Architecture, physics.flu-dyn, physics.optics, Populations and Evolution, q-fin.ST

Catalog footprint

What is connected

22 works

14 topics

4 close collaborators

preprint, 2026, arXiv

Controllable Video Generation: A Survey

With the rapid development of AI-generated content (AIGC), video generation has emerged as one of its most dynamic and impactful subfields. In particular, the advancement of video generation foundation models has led to growing demand for controllable video generation methods that can more accurately reflect user intent. Most existing foundation models are designed for text-to-video generation, where text prompts alone are often insufficient to express complex, multi-modal, and fine-grained user requirements. This limitation makes it challenging for users to generate videos with precise control using current models. To address this issue, recent research has explored the integration of additional non-textual conditions, such as camera motion, depth maps, and human pose, to extend pretrained video generation models and enable more controllable video synthesis. These approaches aim to enhance the flexibility and practical applicability of AIGC-driven video generation systems. In this survey, we provide a systematic review of controllable video generation, covering both theoretical foundations and recent advances in the field. We begin by introducing the key concepts and commonly used open-source video generation models. We then focus on control mechanisms in video diffusion models, analyzing how different types of conditions can be incorporated into the denoising process to guide generation. Finally, we categorize existing methods based on the types of control signals they leverage, including single-condition generation, multi-condition generation, and universal controllable generation. For a complete list of the literature on controllable video generation reviewed, please visit our curated repository at https://github.com/mayuelala/Awesome-Controllable-Video-Generation.

preprint, 2026, arXiv

Policy-Grounded Dynamic Facet Suggestions for Job Search

Job seekers often initiate search with short, underspecified queries. At LinkedIn, over 80% of job-related queries contain three or fewer keywords, making accurate user intent inference and relevant job retrieval particularly challenging. We present dynamic facet suggestion (DFS), an interactive query refinement mechanism that facilitates intent disambiguation by surfacing personalized semantic attributes conditioned on the joint user-query context in real time. We propose a policy-grounded, retrieval-augmented ranking framework for facet suggestion, comprising offline taxonomy curation, embedding-based retrieval of top-K candidates, and distilled small language model (SLM) based candidate scoring. The system is optimized for real-time serving via pointwise single-token scoring with batching and prefix caching. Offline evaluation demonstrates high precision for generated suggestions, and online A/B tests show significant improvements in suggestion engagement and job search outcomes.
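The embedding-based retrieval step mentioned in this abstract can be illustrated with a minimal sketch. It assumes cosine similarity over small hand-written vectors; the facet names, the vectors, and the helper `top_k_facets` are hypothetical and are not taken from the paper, which does not specify its similarity measure or embedding model here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_k_facets(context_vec, facet_vecs, k):
    """Return the names of the k facets most similar to the user-query context vector."""
    scored = [(cosine(context_vec, vec), name) for name, vec in facet_vecs.items()]
    # Highest similarity first; ties are broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:k]]


# Hypothetical facet embeddings (illustrative values only).
FACETS = {
    "remote": [1.0, 0.0, 0.0],
    "entry level": [0.0, 1.0, 0.0],
    "full-time": [0.7, 0.7, 0.0],
    "contract": [0.0, 0.0, 1.0],
}

candidates = top_k_facets([1.0, 0.2, 0.0], FACETS, k=2)
```

In the pipeline the abstract describes, such candidates would then be passed to the SLM-based scorer; that stage is not sketched here.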

preprint, 2022, arXiv

An Adaptive and Scalable ANN-based Model-Order-Reduction Method for Large-Scale TO Designs

Topology Optimization (TO) provides a systematic approach for obtaining structural designs with optimal performance of interest. However, the process requires numerical evaluation of the objective function and constraints at each iteration, which is computationally expensive, especially for large-scale designs. Deep learning-based models have been developed to accelerate the process either by acting as surrogate models replacing the simulation process, or by completely replacing the optimization process. However, most of them require a large set of labelled training data, which are generated mostly through simulations. The data generation time scales rapidly with the design domain size, decreasing the efficiency of the method itself. Another major issue is the weak generalizability of most deep learning models. Most models are trained to work with design problems similar to those used for data generation and require retraining if the design problem changes. In this work, a scalable deep learning-based model-order-reduction method is proposed to accelerate the large-scale TO process by utilizing MapNet, a neural network which maps the field of interest from coarse scale to fine scale. The proposed method allows each simulation of the TO process to be performed on a coarser mesh, thereby greatly reducing the total computational time. Moreover, by using domain fragmentation, the transferability of the MapNet is largely improved. Specifically, it has been demonstrated that the MapNet trained using data from one cantilever beam design with a specific loading condition can be directly applied to other structure design problems with different domain shapes, sizes, boundary and loading conditions.

preprint, 2022, arXiv

Continual Attentive Fusion for Incremental Learning in Semantic Segmentation

Over the past years, semantic segmentation, like many other tasks in computer vision, has benefited from the progress in deep neural networks, resulting in significantly improved performance. However, deep architectures trained with gradient-based techniques suffer from catastrophic forgetting, which is the tendency to forget previously learned knowledge while learning new tasks. Aiming at devising strategies to counteract this effect, incremental learning approaches have gained popularity over the past years. However, the first incremental learning methods for semantic segmentation appeared only recently. While effective, these approaches do not account for a crucial aspect in pixel-level dense prediction problems, i.e. the role of attention mechanisms. To fill this gap, in this paper we introduce a novel attentive feature distillation approach to mitigate catastrophic forgetting while accounting for semantic spatial- and channel-level dependencies. Furthermore, we propose a continual attentive fusion structure, which takes advantage of the attention learned from the new and the old tasks while learning features for the new task. Finally, we also introduce a novel strategy to account for the background class in the distillation loss, thus preventing biased predictions. We demonstrate the effectiveness of our approach with an extensive evaluation on Pascal-VOC 2012 and ADE20K, setting a new state of the art.

preprint, 2022, arXiv

Depth-Aware Generative Adversarial Network for Talking Head Video Generation

Talking head video generation aims to produce a synthetic human face video that contains the identity and pose information respectively from a given source image and a driving video. Existing works for this task heavily rely on 2D representations (e.g. appearance and motion) learned from the input images. However, dense 3D facial geometry (e.g. pixel-wise depth) is extremely important for this task, as it is particularly beneficial for generating accurate 3D face structures and distinguishing noisy information from the possibly cluttered background. Nevertheless, dense 3D geometry annotations are prohibitively costly for videos and are typically not available for this video generation task. In this paper, we first introduce a self-supervised geometry learning method to automatically recover the dense 3D geometry (i.e. depth) from the face videos without requiring any expensive 3D annotation data. Based on the learned dense depth maps, we further propose to leverage them to estimate sparse facial keypoints that capture the critical movement of the human head. In a denser way, the depth is also utilized to learn 3D-aware cross-modal (i.e. appearance and depth) attention to guide the generation of motion fields for warping source image representations. All these contributions compose a novel depth-aware generative adversarial network (DaGAN) for talking head generation. Extensive experiments demonstrate that our proposed method can generate highly realistic faces and achieves significant results on unseen human faces.

preprint, 2022, arXiv

Dynamic Graph Message Passing Networks

Modelling long-range dependencies is critical for scene understanding tasks in computer vision. Although CNNs have excelled in many vision tasks, they are still limited in capturing long-range structured relationships as they typically consist of layers of local kernels. A fully-connected graph is beneficial for such modelling; however, its computational overhead is prohibitive. We propose a dynamic graph message passing network that significantly reduces the computational complexity compared to related works modelling a fully-connected graph. This is achieved by adaptively sampling nodes in the graph, conditioned on the input, for message passing. Based on the sampled nodes, we dynamically predict node-dependent filter weights and the affinity matrix for propagating information between them. Using this model, we show significant improvements with respect to strong, state-of-the-art baselines on three different tasks and backbone architectures. Our approach also outperforms fully-connected graphs while using substantially fewer floating-point operations and parameters. The project website is http://www.robots.ox.ac.uk/~lz/dgmn/

preprint, 2022, arXiv

Exploring Adversarial Examples and Adversarial Robustness of Convolutional Neural Networks by Mutual Information

A counter-intuitive property of convolutional neural networks (CNNs) is their inherent susceptibility to adversarial examples, which severely hinders the application of CNNs in security-critical fields. Adversarial examples are similar to original examples but contain malicious perturbations. Adversarial training is a simple and effective defense method to improve the robustness of CNNs to adversarial examples. The mechanisms behind adversarial examples and adversarial training are worth exploring. Therefore, this work investigates similarities and differences between normally trained CNNs (NT-CNNs) and adversarially trained CNNs (AT-CNNs) in information extraction from the mutual information perspective. We show that 1) for both NT-CNNs and AT-CNNs, on original and adversarial examples, the trends in mutual information are nearly the same throughout training; 2) compared with normal training, adversarial training is more difficult and the amount of information that AT-CNNs extract from the input is smaller; 3) CNNs trained with different methods have different preferences for certain types of information: NT-CNNs tend to extract texture-based information from the input, while AT-CNNs prefer shape-based information. The reason why adversarial examples mislead CNNs may be that they contain more texture-based information about other classes. Furthermore, we also analyze the mutual information estimators used in this work and find that they outline the geometric properties of the middle layer's output.

preprint, 2022, arXiv

Lipschitz Continuity Retained Binary Neural Network

Relying on the premise that the performance of a binary neural network can be largely restored by eliminating the quantization error between full-precision weight vectors and their corresponding binary vectors, existing works on network binarization frequently adopt the idea of model robustness to reach the aforementioned objective. However, robustness remains an ill-defined concept without solid theoretical support. In this work, we introduce Lipschitz continuity, a well-defined functional property, as the rigorous criterion to define model robustness for BNNs. We then propose to retain the Lipschitz continuity as a regularization term to improve the model robustness. In particular, while the popular Lipschitz-involved regularization methods often collapse in BNNs due to their extreme sparsity, we design the Retention Matrices to approximate spectral norms of the targeted weight matrices, which can be deployed as the approximation for the Lipschitz constant of BNNs without the exact Lipschitz constant computation (NP-hard). Our experiments show that our BNN-specific regularization method can effectively strengthen the robustness of BNNs (tested on ImageNet-C), achieving state-of-the-art performance on CIFAR and ImageNet.

preprint, 2022, arXiv

Multi-class Token Transformer for Weakly Supervised Semantic Segmentation

This paper proposes a new transformer-based framework to learn class-specific object localization maps as pseudo labels for weakly supervised semantic segmentation (WSSS). Inspired by the fact that the attended regions of the one-class token in the standard vision transformer can be leveraged to form a class-agnostic localization map, we investigate if the transformer model can also effectively capture class-specific attention for more discriminative object localization by learning multiple class tokens within the transformer. To this end, we propose a Multi-class Token Transformer, termed as MCTformer, which uses multiple class tokens to learn interactions between the class tokens and the patch tokens. The proposed MCTformer can successfully produce class-discriminative object localization maps from class-to-patch attentions corresponding to different class tokens. We also propose to use a patch-level pairwise affinity, which is extracted from the patch-to-patch transformer attention, to further refine the localization maps. Moreover, the proposed framework is shown to fully complement the Class Activation Mapping (CAM) method, leading to remarkably superior WSSS results on the PASCAL VOC and MS COCO datasets. These results underline the importance of the class token for WSSS.

preprint, 2022, arXiv

Network Binarization via Contrastive Learning

Neural network binarization accelerates deep models by quantizing their weights and activations to 1 bit. However, there is still a huge performance gap between Binary Neural Networks (BNNs) and their full-precision (FP) counterparts. As the quantization error caused by weight binarization has been reduced in earlier works, activation binarization becomes the major obstacle to further improving accuracy. BNNs have a unique and interesting structure, where the binary and latent FP activations exist in the same forward pass (i.e., $\text{Binarize}(\mathbf{a}_F) = \mathbf{a}_B$). To mitigate the information degradation caused by the binarization operation from FP to binary activations, we establish a novel contrastive learning framework while training BNNs through the lens of Mutual Information (MI) maximization. MI is introduced as the metric to measure the information shared between binary and FP activations, which assists binarization with contrastive learning. Specifically, the representation ability of the BNNs is greatly strengthened by pulling together the positive pairs of binary and FP activations from the same input samples, and pushing apart negative pairs from different samples (the number of negative pairs can be exponentially large). This benefits downstream tasks, including not only classification but also segmentation and depth estimation. The experimental results show that our method can be implemented as a pile-up module on existing state-of-the-art binarization methods and can remarkably improve the performance over them on CIFAR-10/100 and ImageNet, in addition to showing great generalization ability on NYUD-v2.
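The pairing idea in this abstract (binary and FP activations of the same sample form a positive pair, activations of other samples form negatives) can be sketched with a generic InfoNCE-style loss. This is an illustrative stand-in, not the paper's objective: the sign-based `binarize`, the dot-product similarity, the `temperature` parameter, and the toy activation values are all assumptions.

```python
import math


def binarize(a_f):
    """Assumed sign binarization: map each FP activation to +1.0 or -1.0."""
    return [1.0 if x >= 0 else -1.0 for x in a_f]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def info_nce(anchor, positive, negatives, temperature=1.0):
    """Generic InfoNCE loss: -log softmax score of the positive among all candidates."""
    logits = [dot(anchor, positive) / temperature]
    logits += [dot(anchor, neg) / temperature for neg in negatives]
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[0]


# Toy FP activations for two different input samples (illustrative values).
a_f_same = [0.5, -1.2, 0.3]
a_f_other = [-0.4, 0.9, -0.1]
a_b = binarize(a_f_same)

# Correct pairing: binary and FP activations of the same sample are the positive pair.
loss_matched = info_nce(a_b, a_f_same, [a_f_other])
# Wrong pairing, shown only for contrast: the loss is much larger.
loss_mismatched = info_nce(a_b, a_f_other, [a_f_same])
```

Minimizing a loss of this shape raises the similarity of the matched pair relative to the negatives, which is the pull and push behaviour the abstract describes.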

preprint, 2022, arXiv

Probabilistic Graph Attention Network with Conditional Kernels for Pixel-Wise Prediction

Multi-scale representations deeply learned via convolutional neural networks have shown tremendous importance for various pixel-level prediction problems. In this paper we present a novel approach that advances the state of the art on pixel-level prediction in a fundamental aspect, i.e. structured multi-scale feature learning and fusion. In contrast to previous works that directly consider multi-scale feature maps obtained from the inner layers of a primary CNN architecture and simply fuse the features with weighted averaging or concatenation, we propose a probabilistic graph attention network structure based on a novel Attention-Gated Conditional Random Fields (AG-CRFs) model for learning and fusing multi-scale representations in a principled manner. In order to further improve the learning capacity of the network structure, we propose to exploit feature-dependent conditional kernels within the deep probabilistic framework. Extensive experiments are conducted on four publicly available datasets (i.e. BSDS500, NYUD-V2, KITTI, and Pascal-Context) and on three challenging pixel-wise prediction problems involving both discrete and continuous labels (i.e. monocular depth estimation, object contour prediction, and semantic segmentation). Quantitative and qualitative results demonstrate the effectiveness of the proposed latent AG-CRF model and the overall probabilistic graph attention network with feature conditional kernels for structured feature learning and pixel-wise prediction.

preprint, 2022, arXiv

Uncertainty-aware Contrastive Distillation for Incremental Semantic Segmentation

A fundamental and challenging problem in deep learning is catastrophic forgetting, i.e. the tendency of neural networks to fail to preserve the knowledge acquired from old tasks when learning new tasks. This problem has been widely investigated in the research community and several Incremental Learning (IL) approaches have been proposed in the past years. While earlier works in computer vision have mostly focused on image classification and object detection, more recently some IL approaches for semantic segmentation have been introduced. These previous works showed that, despite its simplicity, knowledge distillation can be effectively employed to alleviate catastrophic forgetting. In this paper, we follow this research direction and, inspired by recent literature on contrastive learning, we propose a novel distillation framework, Uncertainty-aware Contrastive Distillation (UCD). In a nutshell, UCD operates by introducing a novel distillation loss that takes into account all the images in a mini-batch, enforcing similarity between features associated with all the pixels from the same classes, and pulling apart those corresponding to pixels from different classes. In order to mitigate catastrophic forgetting, we contrast features of the new model with features extracted by a frozen model learned at the previous incremental step. Our experimental results demonstrate the advantage of the proposed distillation technique, which can be used in synergy with previous IL approaches, and leads to state-of-the-art performance on three commonly adopted benchmarks for incremental semantic segmentation. The code is available at https://github.com/ygjwd12345/UCD.

preprint, 2020, arXiv

A Simple Prediction Model for the Development Trend of 2019-nCov Epidemics Based on Medical Observations

In order to predict the development trend of the 2019 coronavirus (2019-nCov), we established a prediction model to predict the number of diagnosed cases in China excluding Hubei Province. From January 25 to January 29, 2020, we optimized 6 prediction models, 5 of them based on the number of medical observations, to predict that the peak time of confirmed diagnosis will appear on the period of morning of January 29 from 24:00 to February 2 before 5 o'clock 24:00. We then tracked the data from 24 o'clock on January 29 to 24 o'clock on January 31, and found that the predicted value of the data on the 3rd has a small deviation from the actual value, and the actual value has always remained within the range predicted by the comprehensive prediction model 6. We therefore disclose this finding and will continue to track whether this pattern can be maintained for longer. We believe that changes in the number of medical observation cases may help to judge the trend of the epidemic situation in advance.

preprint, 2020, arXiv

Cycle In Cycle Generative Adversarial Networks for Keypoint-Guided Image Generation

In this work, we propose a novel Cycle In Cycle Generative Adversarial Network (C$^2$GAN) for the task of keypoint-guided image generation. The proposed C$^2$GAN is a cross-modal framework exploring a joint exploitation of the keypoint and the image data in an interactive manner. C$^2$GAN contains two different types of generators, i.e., a keypoint-oriented generator and an image-oriented generator. Both of them are mutually connected in an end-to-end learnable fashion and explicitly form three cycled sub-networks, i.e., one image generation cycle and two keypoint generation cycles. Each cycle not only aims at reconstructing the input domain but also produces useful output involved in the generation of another cycle. In doing so, the cycles constrain each other implicitly, which provides complementary information from the two different modalities and brings extra supervision across cycles, thus facilitating more robust optimization of the whole network. Extensive experimental results on two publicly available datasets, i.e., Radboud Faces and Market-1501, demonstrate that our approach is effective at generating more photo-realistic images compared with state-of-the-art models.

preprint, 2020, arXiv

Local Class-Specific and Global Image-Level Generative Adversarial Networks for Semantic-Guided Scene Generation

In this paper, we address the task of semantic-guided scene generation. One open challenge in scene generation is the difficulty of the generation of small objects and detailed local texture, which has been widely observed in global image-level generation methods. To tackle this issue, in this work we consider learning the scene generation in a local context, and correspondingly design a local class-specific generative network with semantic maps as a guidance, which separately constructs and learns sub-generators concentrating on the generation of different classes, and is able to provide more scene details. To learn more discriminative class-specific feature representations for the local generation, a novel classification module is also proposed. To combine the advantage of both the global image-level and the local class-specific generation, a joint generation network is designed with an attention fusion module and a dual-discriminator structure embedded. Extensive experiments on two scene image generation tasks show superior generation performance of the proposed model. The state-of-the-art results are established by large margins on both tasks and on challenging public benchmarks. The source code and trained models are available at https://github.com/Ha0Tang/LGGAN.

preprint, 2020, arXiv

Scope Head for Accurate Localization in Object Detection

Existing anchor-based and anchor-free object detectors in multi-stage or one-stage pipelines have achieved very promising detection performance. However, they still encounter the design difficulty in hand-crafted 2D anchor definition and the learning complexity in 1D direct location regression. To tackle these issues, in this paper, we propose a novel detector coined as ScopeNet, which models anchors of each location as a mutually dependent relationship. This approach quantizes the prediction space and employs a coarse-to-fine strategy for localization. It achieves flexibility comparable to regression-based anchor-free methods, while producing more precise predictions. Besides, an inherent anchor selection score is learned to indicate the localization quality of the detection result, and we propose to better represent the confidence of a detection box by combining the category-classification score and the anchor-selection score. With our concise and effective design, the proposed ScopeNet achieves state-of-the-art results on COCO

preprint, 2016, arXiv

Searching Action Proposals via Spatial Actionness Estimation and Temporal Path Inference and Tracking

In this paper, we address the problem of searching action proposals in unconstrained video clips. Our approach starts from actionness estimation on frame-level bounding boxes, and then aggregates the bounding boxes belonging to the same actor across frames via linking, associating, and tracking to generate spatial-temporal continuous action paths. To achieve this, a novel actionness estimation method is first proposed by utilizing both human appearance and motion cues. Then, the association of the action paths is formulated as a maximum set coverage problem with the results of actionness estimation as a prior. To further improve performance, we design an improved optimization objective for the problem and provide a greedy search algorithm to solve it. Finally, a tracking-by-detection scheme is designed to further refine the searched action paths. Extensive experiments on two challenging datasets, UCF-Sports and UCF-101, show that the proposed approach advances state-of-the-art proposal generation performance in terms of both accuracy and proposal quantity.
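For readers unfamiliar with maximum set coverage, the textbook greedy heuristic is sketched below. It is not the improved objective or the search procedure from the paper; the path names, the integer box identifiers, and the `budget` parameter are hypothetical and only illustrate the general problem shape.

```python
def greedy_max_coverage(candidates, budget):
    """Pick up to `budget` sets, each time taking the one that covers the most new elements.

    `candidates` maps a name to the set of elements it covers.
    Returns the chosen names in selection order and the union of covered elements.
    """
    covered = set()
    chosen = []
    remaining = dict(candidates)
    for _ in range(budget):
        best_name, best_gain = None, 0
        # Iterate in sorted order so ties are resolved deterministically.
        for name in sorted(remaining):
            gain = len(remaining[name] - covered)
            if gain > best_gain:
                best_name, best_gain = name, gain
        if best_name is None:
            break  # nothing left adds new coverage
        chosen.append(best_name)
        covered |= remaining.pop(best_name)
    return chosen, covered


# Hypothetical action paths, each covering some bounding-box identifiers.
paths = {
    "path_a": {1, 2, 3},
    "path_b": {3, 4},
    "path_c": {4, 5, 6, 7},
    "path_d": {1, 2},
}
selected, covered_boxes = greedy_max_coverage(paths, budget=2)
```

With these toy inputs the heuristic first takes the path covering four boxes, then the path that adds three more.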

preprint, 2016, arXiv

Transition from lognormal to chi-square superstatistics for financial time series

Share price returns on different time scales can be well modelled by superstatistical dynamics. Here we investigate which type of superstatistics is most suitable to properly describe share price dynamics on various time scales. It is shown that while chi-square superstatistics works well on a time scale of days, on a much smaller time scale of minutes the price changes are better described by lognormal superstatistics. The system dynamics thus exhibits a transition from lognormal to chi-square superstatistics as a function of time scale. We discuss a more general model interpolating between both statistics which fits the observed data very well. We also present results on correlation functions of the extracted superstatistical volatility parameter, which exhibits exponential decay for returns on large time scales, whereas for returns on small time scales there are long-range correlations and power-law decay.

preprint, 2015, arXiv

Joint time and frequency dissemination network over delay-stabilized fiber optic links

A precise fiber-based time and frequency dissemination scheme for multiple users with a tree-like branching topology is proposed. Through this scheme, ultra-stable signals can be easily accessed online anywhere along the fiber without affecting other sites. The scheme is tested through an experiment, in which a modulated frequency signal and a synchronized time signal are transferred to multiple remote sites over delay-stabilized fiber optic links that are over 50 km long. Results show that the relative stabilities are 5E-14 at 1 s and 2E-17 at 10000 s. Meanwhile, compared with each site, time synchronization precision is less than 80 ps. These results can pave the way to practical applications in joint time and frequency dissemination network systems.

preprint, 2015, arXiv

Learning Deep Representations of Appearance and Motion for Anomalous Event Detection

We present a novel unsupervised deep learning framework for anomalous event detection in complex video scenes. While most existing works merely use hand-crafted appearance and motion features, we propose Appearance and Motion DeepNet (AMDN) which utilizes deep neural networks to automatically learn feature representations. To exploit the complementary information of both appearance and motion patterns, we introduce a novel double fusion framework, combining both the benefits of traditional early fusion and late fusion strategies. Specifically, stacked denoising autoencoders are proposed to separately learn both appearance and motion features as well as a joint representation (early fusion). Based on the learned representations, multiple one-class SVM models are used to predict the anomaly scores of each input, which are then integrated with a late fusion strategy for final anomaly detection. We evaluate the proposed method on two publicly available video surveillance datasets, showing competitive performance with respect to state of the art approaches.

preprint, 2012, arXiv

Aggregation and settling in aqueous polydisperse alumina nanoparticle suspensions

Nanoparticle suspensions (also called nanofluids) are often polydisperse and tend to settle with time. Settling kinetics in these systems are known to be complex and hence challenging to understand. In this work, polydisperse spherical alumina (Al2O3) nanoparticles in the size range of ~10-100 nm were dispersed in water and examined for aggregation and settling behaviour near their isoelectric point (IEP). A series of settling experiments were conducted and the results were analysed by photography and by Small Angle X-ray Scattering (SAXS). The settling curve obtained from standard bed height measurement experiments indicated two different types of behaviour, both of which were also seen in the SAXS data. However, the SAXS data were also able to pick out the rapid settling regime as a result of the high temporal resolution (10 s) used. By monitoring the SAXS intensity, it was further possible to record the particle aggregation process for the first time. Optical microscopy images were produced on drying and dried droplets extracted from the suspension at various times. Dried deposits showed a rapid decrease in the number of very large particles with time, which qualitatively validates the SAXS prediction, and therefore its suitability as a tool to study unstable polydisperse colloids. Keywords: Nanoparticles, nanofluids, polydisperse, aggregation, settling, alumina, microscopy, SAXS

preprint, 2011, arXiv

Geographic Trough Filling for Internet Datacenters

To reduce datacenter energy consumption and cost, current practice has considered demand-proportional resource provisioning schemes, where servers are turned on/off according to the load of requests. Most existing work considers instantaneous (Internet) requests only, which are explicitly or implicitly assumed to be delay-sensitive. On the other hand, in datacenters, there exists a vast number of delay-tolerant jobs, such as background/maintenance jobs. In this paper, we explicitly differentiate delay-sensitive jobs and delay-tolerant jobs. We focus on the problem of using delay-tolerant jobs to fill the extra capacity of datacenters, referred to as trough/valley filling. Giving a higher priority to delay-sensitive jobs, our schemes complement most existing demand-proportional resource provisioning schemes. Our goal is to design intelligent trough filling mechanisms that are energy efficient and also achieve good delay performance. Specifically, we propose two joint dynamic speed scaling and traffic shifting schemes, one subgradient-based and the other queue-based. Our schemes assume little statistical information about the system, which is usually difficult to obtain in practice. In both schemes, energy cost saving comes from dynamic speed scaling, statistical multiplexing, electricity price diversity, and service efficiency diversity. In addition, good delay performance is achieved in the queue-based scheme via load shifting and capacity allocation based on queue conditions. Practical issues that may arise in datacenter networks are considered, including capacity and bandwidth constraints, service agility constraints, and load shifting cost. We use both artificial and real datacenter traces to evaluate the proposed schemes.
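The basic notion of trough filling, before any speed scaling or traffic shifting is considered, can be shown with a deliberately simple sketch. It assumes a single datacenter with fixed capacity, discrete time slots, and a single backlog of delay-tolerant work; the function `fill_troughs` and all numbers are hypothetical and do not reproduce the subgradient-based or queue-based schemes from the paper.

```python
def fill_troughs(capacity, sensitive_load, backlog):
    """Serve delay-tolerant work only from capacity left over by delay-sensitive load.

    capacity: work units the datacenter can serve per time slot.
    sensitive_load: delay-sensitive demand per slot (always served first).
    backlog: total delay-tolerant work waiting at the start.
    Returns the delay-tolerant work served in each slot and the leftover backlog.
    """
    schedule = []
    for load in sensitive_load:
        spare = max(capacity - load, 0)   # trough left after priority traffic
        served = min(spare, backlog)      # fill it with queued tolerant work
        schedule.append(served)
        backlog -= served
    return schedule, backlog


# Illustrative trace: four slots with capacity 10 and 8 units of deferred work.
served_per_slot, leftover = fill_troughs(10, [9, 4, 10, 6], 8)
```

In this toy trace most of the deferred work lands in the second slot, where delay-sensitive demand is lowest.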
