Researcher profile

Jun Guo

Jun Guo contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
27works
0followers
10topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

27 published item(s)

preprint2026arXiv

Detecting Axion-like Particles with Plasmon in Reactor-based Experiment

Axion and axion-like particles (ALPs), predicted by various extensions of the Standard Model, can be copiously produced in nuclear reactors via the Primakoff process. In this work, we explore the detection of such relativistic ALPs through the plasmon effect in silicon detectors located near reactors. Utilizing the data from the Connie and Atucha-II experiments, we set the 90\% confidence level upper limits on the ALP-photon coupling $g_{aγ}$ over the mass range $0.1< m_a <100$ keV. Furthermore, we present that the projected sensitivity of the Oscura experiment, with an exposure of 30 kg$\cdot$ yr, will surpass the current reach of the NEON experiment by approximately one order of magnitude in the same mass range. This improvement would substantially expand the explored region of the QCD axion and ALP parameter space.

preprint2026arXiv

Unified 4D World Action Modeling from Video Priors with Asynchronous Denoising

We propose X-WAM, a Unified 4D World Model that unifies real-time robotic action execution and high-fidelity 4D world synthesis (video + 3D reconstruction) in a single framework, addressing the critical limitations of prior unified world models (e.g., UWM) that only model 2D pixel-space and fail to balance action efficiency and world modeling quality. To leverage the strong visual priors of pretrained video diffusion models, X-WAM imagines the future world by predicting multi-view RGB-D videos, and obtains spatial information efficiently through a lightweight structural adaptation: replicating the final few blocks of the pretrained Diffusion Transformer into a dedicated depth prediction branch for the reconstruction of future spatial information. Moreover, we propose Asynchronous Noise Sampling (ANS) to jointly optimize generation quality and action decoding efficiency. ANS applies a specialized asynchronous denoising schedule during inference, which rapidly decodes actions with fewer steps to enable efficient real-time execution, while dedicating the full sequence of steps to generate high-fidelity video. Rather than entirely decoupling the timesteps during training, ANS samples from their joint distribution to align with the inference distribution. Pretrained on over 5,800 hours of robotic data, X-WAM achieves 79.2% and 90.7% average success rate on RoboCasa and RoboTwin 2.0 benchmarks, while producing high-fidelity 4D reconstruction and generation surpassing existing methods in both visual and geometric metrics.

preprint2023arXiv

Bi-directional Feature Reconstruction Network for Fine-Grained Few-Shot Image Classification

The main challenge for fine-grained few-shot image classification is to learn feature representations with higher inter-class and lower intra-class variations, with a mere few labelled samples. Conventional few-shot learning methods however cannot be naively adopted for this fine-grained setting -- a quick pilot study reveals that they in fact push for the opposite (i.e., lower inter-class variations and higher intra-class variations). To alleviate this problem, prior works predominately use a support set to reconstruct the query image and then utilize metric learning to determine its category. Upon careful inspection, we further reveal that such unidirectional reconstruction methods only help to increase inter-class variations and are not effective in tackling intra-class variations. In this paper, we for the first time introduce a bi-reconstruction mechanism that can simultaneously accommodate for inter-class and intra-class variations. In addition to using the support set to reconstruct the query set for increasing inter-class variations, we further use the query set to reconstruct the support set for reducing intra-class variations. This design effectively helps the model to explore more subtle and discriminative features which is key for the fine-grained problem in hand. Furthermore, we also construct a self-reconstruction module to work alongside the bi-directional module to make the features even more discriminative. Experimental results on three widely used fine-grained image classification datasets consistently show considerable improvements compared with other methods. Codes are available at: https://github.com/PRIS-CV/Bi-FRN.

preprint2022arXiv

A Survey on Long-Tailed Visual Recognition

The heavy reliance on data is one of the major reasons that currently limit the development of deep learning. Data quality directly dominates the effect of deep learning models, and the long-tailed distribution is one of the factors affecting data quality. The long-tailed phenomenon is prevalent due to the prevalence of power law in nature. In this case, the performance of deep learning models is often dominated by the head classes while the learning of the tail classes is severely underdeveloped. In order to learn adequately for all classes, many researchers have studied and preliminarily addressed the long-tailed problem. In this survey, we focus on the problems caused by long-tailed data distribution, sort out the representative long-tailed visual recognition datasets and summarize some mainstream long-tailed studies. Specifically, we summarize these studies into ten categories from the perspective of representation learning, and outline the highlights and limitations of each category. Besides, we have studied four quantitative metrics for evaluating the imbalance, and suggest using the Gini coefficient to evaluate the long-tailedness of a dataset. Based on the Gini coefficient, we quantitatively study 20 widely-used and large-scale visual datasets proposed in the last decade, and find that the long-tailed phenomenon is widespread and has not been fully studied. Finally, we provide several future directions for the development of long-tailed learning to provide more ideas for readers.

preprint2022arXiv

Cluster-guided Asymmetric Contrastive Learning for Unsupervised Person Re-Identification

Unsupervised person re-identification (Re-ID) aims to match pedestrian images from different camera views in unsupervised setting. Existing methods for unsupervised person Re-ID are usually built upon the pseudo labels from clustering. However, the quality of clustering depends heavily on the quality of the learned features, which are overwhelmingly dominated by the colors in images especially in the unsupervised setting. In this paper, we propose a Cluster-guided Asymmetric Contrastive Learning (CACL) approach for unsupervised person Re-ID, in which cluster structure is leveraged to guide the feature learning in a properly designed asymmetric contrastive learning framework. To be specific, we propose a novel cluster-level contrastive loss to help the siamese network effectively mine the invariance in feature learning with respect to the cluster structure within and between different data augmentation views, respectively. Extensive experiments conducted on three benchmark datasets demonstrate superior performance of our proposal.

preprint2022arXiv

Duplex Contextual Relation Network for Polyp Segmentation

Polyp segmentation is of great importance in the early diagnosis and treatment of colorectal cancer. Since polyps vary in their shape, size, color, and texture, accurate polyp segmentation is very challenging. One promising way to mitigate the diversity of polyps is to model the contextual relation for each pixel such as using attention mechanism. However, previous methods only focus on learning the dependencies between the position within an individual image and ignore the contextual relation across different images. In this paper, we propose Duplex Contextual Relation Network (DCRNet) to capture both within-image and cross-image contextual relations. Specifically, we first design Interior Contextual-Relation Module to estimate the similarity between each position and all the positions within the same image. Then Exterior Contextual-Relation Module is incorporated to estimate the similarity between each position and the positions across different images. Based on the above two types of similarity, the feature at one position can be further enhanced by the contextual region embedding within and across images. To store the characteristic region embedding from all the images, a memory bank is designed and operates as a queue. Therefore, the proposed method can relate similar features even though they come from different images. We evaluate the proposed method on the EndoScene, Kvasir-SEG and the recently released large-scale PICCOLO dataset. Experimental results show that the proposed DCRNet outperforms the state-of-the-art methods in terms of the widely-used evaluation metrics.

preprint2022arXiv

Erdős-Ko-Rado theorem for vector spaces over residue class rings

Let $h=\prod_{i=1}^{t}p_i^{s_i}$ be its decomposition into a product of powers of distinct primes, and $\mathbb{Z}_{h}$ be the residue class ring modulo $h$. Let $\mathbb{Z}_{h}^{n}$ be the $n$-dimensional row vector space over $\mathbb{Z}_{h}$. A generalized Grassmann graph for $\mathbb{Z}_{h}^n$, denoted by $G_r(m,n,\mathbb{Z}_{h})$ ($G_r$ for short), has all $m$-subspaces of $\mathbb{Z}_{h}^n$ as its vertices, and two distinct vertices are adjacent if their intersection is of dimension $>m-r$, where $2\leq r\leq m+1\leq n$. In this paper, we determine the clique number and geometric structures of maximum cliques of $G_r$. As a result, we obtain the Erdős-Ko-Rado theorem for $\mathbb{Z}_{h}^{n}$.

preprint2022arXiv

Learning Invariant Visual Representations for Compositional Zero-Shot Learning

Compositional Zero-Shot Learning (CZSL) aims to recognize novel compositions using knowledge learned from seen attribute-object compositions in the training set. Previous works mainly project an image and a composition into a common embedding space to measure their compatibility score. However, both attributes and objects share the visual representations learned above, leading the model to exploit spurious correlations and bias towards seen pairs. Instead, we reconsider CZSL as an out-of-distribution generalization problem. If an object is treated as a domain, we can learn object-invariant features to recognize the attributes attached to any object reliably. Similarly, attribute-invariant features can also be learned when recognizing the objects with attributes as domains. Specifically, we propose an invariant feature learning framework to align different domains at the representation and gradient levels to capture the intrinsic characteristics associated with the tasks. Experiments on two CZSL benchmarks demonstrate that the proposed method significantly outperforms the previous state-of-the-art.

preprint2022arXiv

Towards Comprehensive Testing on the Robustness of Cooperative Multi-agent Reinforcement Learning

While deep neural networks (DNNs) have strengthened the performance of cooperative multi-agent reinforcement learning (c-MARL), the agent policy can be easily perturbed by adversarial examples. Considering the safety critical applications of c-MARL, such as traffic management, power management and unmanned aerial vehicle control, it is crucial to test the robustness of c-MARL algorithm before it was deployed in reality. Existing adversarial attacks for MARL could be used for testing, but is limited to one robustness aspects (e.g., reward, state, action), while c-MARL model could be attacked from any aspect. To overcome the challenge, we propose MARLSafe, the first robustness testing framework for c-MARL algorithms. First, motivated by Markov Decision Process (MDP), MARLSafe consider the robustness of c-MARL algorithms comprehensively from three aspects, namely state robustness, action robustness and reward robustness. Any c-MARL algorithm must simultaneously satisfy these robustness aspects to be considered secure. Second, due to the scarceness of c-MARL attack, we propose c-MARL attacks as robustness testing algorithms from multiple aspects. Experiments on \textit{SMAC} environment reveals that many state-of-the-art c-MARL algorithms are of low robustness in all aspect, pointing out the urgent need to test and enhance robustness of c-MARL algorithms.

preprint2021arXiv

WIMP Dark Matter Hidden behind its Companion

The WIMP dark matter (DM) hypothesis now is in an awkward position, owing to the stronger and stronger exclusion from DM direct detection. In this article we design a mechanism to evade this constraint.The idea is simple. DM has a companion, and they are both charged under the DM protecting symmetry G; they admit the trilinear coupling DM-DM-companion, so the latter provides a portal to the standard model (SM) via, for instance, the coupling to Higgs doublet.Then, DM semi-annihilates into the companion to arrive correct relic density, without leaving DM-nucleon scattering signal. The idea can be realized for ZN symmetric models with N >2.We stress that this mechanism has the characteristics of co-annihilation, and as a matter of fact its effect becomes necessary near or above the TeV region. This means that it may be difficult to detect our dark matter directly or indirectly.

preprint2020arXiv

A Video Analysis Method on Wanfang Dataset via Deep Neural Network

The topic of object detection has been largely improved recently, especially with the development of convolutional neural network. However, there still exist a lot of challenging cases, such as small object, compact and dense or highly overlapping object. Existing methods can detect multiple objects wonderfully, but because of the slight changes between frames, the detection effect of the model will become unstable, the detection results may result in dropping or increasing the object. In the pedestrian flow detection task, such phenomenon can not accurately calculate the flow. To solve this problem, in this paper, we describe the new function for real-time multi-object detection in sports competition and pedestrians flow detection in public based on deep learning. Our work is to extract a video clip and solve this frame of clips efficiently. More specfically, our algorithm includes two stages: judge method and optimization method. The judge can set a maximum threshold for better results under the model, the threshold value corresponds to the upper limit of the algorithm with better detection results. The optimization method to solve detection jitter problem. Because of the occurrence of frame hopping in the video, and it will result in the generation of video fragments discontinuity. We use optimization algorithm to get the key value, and then the detection result value of index is replaced by key value to stabilize the change of detection result sequence. Based on the proposed algorithm, we adopt wanfang sports competition dataset as the main test dataset and our own test dataset for YOLOv3-Abnormal Number Version(YOLOv3-ANV), which is 5.4% average improvement compared with existing methods. Also, video above the threshold value can be obtained for further analysis. Spontaneously, our work also can used for pedestrians flow detection and pedestrian alarm tasks.

preprint2020arXiv

Attention-guided Context Feature Pyramid Network for Object Detection

For object detection, how to address the contradictory requirement between feature map resolution and receptive field on high-resolution inputs still remains an open question. In this paper, to tackle this issue, we build a novel architecture, called Attention-guided Context Feature Pyramid Network (AC-FPN), that exploits discriminative information from various large receptive fields via integrating attention-guided multi-path features. The model contains two modules. The first one is Context Extraction Module (CEM) that explores large contextual information from multiple receptive fields. As redundant contextual relations may mislead localization and recognition, we also design the second module named Attention-guided Module (AM), which can adaptively capture the salient dependencies over objects by using the attention mechanism. AM consists of two sub-modules, i.e., Context Attention Module (CxAM) and Content Attention Module (CnAM), which focus on capturing discriminative semantics and locating precise positions, respectively. Most importantly, our AC-FPN can be readily plugged into existing FPN-based models. Extensive experiments on object detection and instance segmentation show that existing models with our proposed CEM and AM significantly surpass their counterparts without them, and our model successfully obtains state-of-the-art results. We have released the source code at https://github.com/Caojunxu/AC-FPN.

preprint2020arXiv

Density-Adaptive Kernel based Efficient Reranking Approaches for Person Reidentification

Person reidentification (ReID) refers to the task of verifying the identity of a pedestrian observed from nonoverlapping views in a surveillance camera network. It has recently been validated that reranking can achieve remarkable performance improvements in person ReID systems. However, current reranking approaches either require feedback from users or suffer from burdensome computational costs. In this paper, we propose to exploit a density-adaptive smooth kernel technique to achieve efficient and effective reranking. Specifically, we adopt a smooth kernel function to formulate the neighbor relationships among data samples with a density-adaptive parameter. Based on this new formulation, we present two simple yet effective reranking methods, termed \emph{inverse} density-adaptive kernel based reranking (inv-DAKR) and \emph{bidirectional} density-adaptive kernel based reranking (bi-DAKR), in which the local density information in the vicinity of each gallery sample is elegantly exploited. Moreover, we extend the proposed inv-DAKR and bi-DAKR methods to incorporate the available extra probe samples and demonstrate that when and why these extra probe samples are able to improve the local neighborhood and thus further refine the ranking results. Extensive experiments are conducted on six benchmark datasets, including: PRID450s, VIPeR, CUHK03, GRID, Market-1501 and Mars. The experimental results demonstrate that our proposals are effective and efficient.

preprint2020arXiv

Dual-attention Guided Dropblock Module for Weakly Supervised Object Localization

Attention mechanisms is frequently used to learn the discriminative features for better feature representations. In this paper, we extend the attention mechanism to the task of weakly supervised object localization (WSOL) and propose the dual-attention guided dropblock module (DGDM), which aims at learning the informative and complementary visual patterns for WSOL. This module contains two key components, the channel attention guided dropout (CAGD) and the spatial attention guided dropblock (SAGD). To model channel interdependencies, the CAGD ranks the channel attentions and treats the top-k attentions with the largest magnitudes as the important ones. It also keeps some low-valued elements to increase their value if they become important during training. The SAGD can efficiently remove the most discriminative information by erasing the contiguous regions of feature maps rather than individual pixels. This guides the model to capture the less discriminative parts for classification. Furthermore, it can also distinguish the foreground objects from the background regions to alleviate the attention misdirection. Experimental results demonstrate that the proposed method achieves new state-of-the-art localization performance.

preprint2020arXiv

Erdős-Ko-Rado theorem and bilinear forms graphs for matrices over residue class rings

Let $h=\prod_{i=1}^{t}p_i^{s_i}$ be its decomposition into a product of powers of distinct primes, and $\mathbb{Z}_{h}$ be the residue class ring modulo $h$. Let $1\leq r\leq m\leq n$ and $\mathbb{Z}_{h}^{m\times n}$ be the set of all $m\times n$ matrices over $\mathbb{Z}_{h}$. The generalized bilinear forms graph over $\mathbb{Z}_{h}$, denoted by $\hbox{Bil}_r(\mathbb{Z}_{h}^{m\times n})$, has the vertex set $\mathbb{Z}_{h}^{m\times n}$, and two distinct vertices $A$ and $B$ are adjacent if the inner rank of $A-B$ is less than or equal to $r$. In this paper, we determine the clique number and geometric structures of maximum cliques of $\hbox{Bil}_r(\mathbb{Z}_{h}^{m\times n})$. As a result, the Erdős-Ko-Rado theorem for $\mathbb{Z}_h^{m\times n}$ is obtained.

preprint2020arXiv

Fine-Grained Instance-Level Sketch-Based Video Retrieval

Existing sketch-analysis work studies sketches depicting static objects or scenes. In this work, we propose a novel cross-modal retrieval problem of fine-grained instance-level sketch-based video retrieval (FG-SBVR), where a sketch sequence is used as a query to retrieve a specific target video instance. Compared with sketch-based still image retrieval, and coarse-grained category-level video retrieval, this is more challenging as both visual appearance and motion need to be simultaneously matched at a fine-grained level. We contribute the first FG-SBVR dataset with rich annotations. We then introduce a novel multi-stream multi-modality deep network to perform FG-SBVR under both strong and weakly supervised settings. The key component of the network is a relation module, designed to prevent model over-fitting given scarce training data. We show that this model significantly outperforms a number of existing state-of-the-art models designed for video analysis.

preprint2020arXiv

Fine-Grained Visual Classification via Progressive Multi-Granularity Training of Jigsaw Patches

Fine-grained visual classification (FGVC) is much more challenging than traditional classification tasks due to the inherently subtle intra-class object variations. Recent works mainly tackle this problem by focusing on how to locate the most discriminative parts, more complementary parts, and parts of various granularities. However, less effort has been placed to which granularities are the most discriminative and how to fuse information cross multi-granularity. In this work, we propose a novel framework for fine-grained visual classification to tackle these problems. In particular, we propose: (i) a progressive training strategy that effectively fuses features from different granularities, and (ii) a random jigsaw patch generator that encourages the network to learn features at specific granularities. We obtain state-of-the-art performances on several standard FGVC benchmark datasets, where the proposed method consistently outperforms existing methods or delivers competitive results. The code will be available at https://github.com/PRIS-CV/PMG-Progressive-Multi-Granularity-Training.

preprint2020arXiv

Impacts of Weather Conditions on District Heat System

Using artificial neural network for the prediction of heat demand has attracted more and more attention. Weather conditions, such as ambient temperature, wind speed and direct solar irradiance, have been identified as key input parameters. In order to further improve the model accuracy, it is of great importance to understand the influence of different parameters. Based on an Elman neural network (ENN), this paper investigates the impact of direct solar irradiance and wind speed on predicting the heat demand of a district heating network. Results show that including wind speed can generally result in a lower overall mean absolute percentage error (MAPE) (6.43%) than including direct solar irradiance (6.47%); while including direct solar irradiance can achieve a lower maximum absolute deviation (71.8%) than including wind speed (81.53%). In addition, even though including both wind speed and direct solar irradiance shows the best overall performance (MAPE=6.35%).

preprint2020arXiv

Mind the Gap: Enlarging the Domain Gap in Open Set Domain Adaptation

Unsupervised domain adaptation aims to leverage labeled data from a source domain to learn a classifier for an unlabeled target domain. Among its many variants, open set domain adaptation (OSDA) is perhaps the most challenging, as it further assumes the presence of unknown classes in the target domain. In this paper, we study OSDA with a particular focus on enriching its ability to traverse across larger domain gaps. Firstly, we show that existing state-of-the-art methods suffer a considerable performance drop in the presence of larger domain gaps, especially on a new dataset (PACS) that we re-purposed for OSDA. We then propose a novel framework to specifically address the larger domain gaps. The key insight lies with how we exploit the mutually beneficial information between two networks; (a) to separate samples of known and unknown classes, (b) to maximize the domain confusion between source and target domain without the influence of unknown samples. It follows that (a) and (b) will mutually supervise each other and alternate until convergence. Extensive experiments are conducted on Office-31, Office-Home, and PACS datasets, demonstrating the superiority of our method in comparison to other state-of-the-arts. Code available at https://github.com/dongliangchang/Mutual-to-Separate/

preprint2020arXiv

On the Convergence of Extended Variational Inference for Non-Gaussian Statistical Models

Variational inference (VI) is a widely used framework in Bayesian estimation. For most of the non-Gaussian statistical models, it is infeasible to find an analytically tractable solution to estimate the posterior distributions of the parameters. Recently, an improved framework, namely the extended variational inference (EVI), has been introduced and applied to derive analytically tractable solution by employing lower-bound approximation to the variational objective function. Two conditions required for EVI implementation, namely the weak condition and the strong condition, are discussed and compared in this paper. In practical implementation, the convergence of the EVI depends on the selection of the lower-bound approximation, no matter with the weak condition or the strong condition. In general, two approximation strategies, the single lower-bound (SLB) approximation and the multiple lower-bounds (MLB) approximation, can be applied to carry out the lower-bound approximation. To clarify the differences between the SLB and the MLB, we will also discuss the convergence properties of the aforementioned two approximations. Extensive comparisons are made based on some existing EVI-based non-Gaussian statistical models. Theoretical analysis are conducted to demonstrate the differences between the weak and the strong conditions. Qualitative and quantitative experimental results are presented to show the advantages of the SLB approximation.

preprint2020arXiv

OSLNet: Deep Small-Sample Classification with an Orthogonal Softmax Layer

A deep neural network of multiple nonlinear layers forms a large function space, which can easily lead to overfitting when it encounters small-sample data. To mitigate overfitting in small-sample classification, learning more discriminative features from small-sample data is becoming a new trend. To this end, this paper aims to find a subspace of neural networks that can facilitate a large decision margin. Specifically, we propose the Orthogonal Softmax Layer (OSL), which makes the weight vectors in the classification layer remain orthogonal during both the training and test processes. The Rademacher complexity of a network using the OSL is only $\frac{1}{K}$, where $K$ is the number of classes, of that of a network using the fully connected classification layer, leading to a tighter generalization error bound. Experimental results demonstrate that the proposed OSL has better performance than the methods used for comparison on four small-sample benchmark datasets, as well as its applicability to large-sample datasets. Codes are available at: https://github.com/dongliangchang/OSLNet.

preprint2020arXiv

OVC-Net: Object-Oriented Video Captioning with Temporal Graph and Detail Enhancement

Traditional video captioning requests a holistic description of the video, yet the detailed descriptions of the specific objects may not be available. Without associating the moving trajectories, these image-based data-driven methods cannot understand the activities from the spatio-temporal transitions in the inter-object visual features. Besides, adopting ambiguous clip-sentence pairs in training, it goes against learning the multi-modal functional mappings owing to the one-to-many nature. In this paper, we propose a novel task to understand the videos in object-level, named object-oriented video captioning. We introduce the video-based object-oriented video captioning network (OVC)-Net via temporal graph and detail enhancement to effectively analyze the activities along time and stably capture the vision-language connections under small-sample condition. The temporal graph provides useful supplement over previous image-based approaches, allowing to reason the activities from the temporal evolution of visual features and the dynamic movement of spatial locations. The detail enhancement helps to capture the discriminative features among different objects, with which the subsequent captioning module can yield more informative and precise descriptions. Thereafter, we construct a new dataset, providing consistent object-sentence pairs, to facilitate effective cross-modal learning. To demonstrate the effectiveness, we conduct experiments on the new dataset and compare it with the state-of-the-art video captioning methods. From the experimental results, the OVC-Net exhibits the ability of precisely describing the concurrent objects, and achieves the state-of-the-art performance.

preprint2020arXiv

PIC simulations of microinstabilities and waves at near-Sun solar wind perpendicular shocks: Predictions for Parker Solar Probe and Solar Orbiter

Microinstabilities and waves excited at moderate-Mach-number perpendicular shocks in the near-Sun solar wind are investigated by full particle-in-cell (PIC) simulations. By analyzing the dispersion relation of fluctuating field components directly issued from the shock simulation, we obtain key findings concerning wave excitations at the shock front: (1) at the leading edge of the foot, two types of electrostatic (ES) waves are observed. The relative drift of the reflected ions versus the electrons triggers an electron cyclotron drift instability (ECDI) which excites the first ES wave. Because the bulk velocity of gyro-reflected ions shifts to the direction of the shock front, the resulting ES wave propagates oblique to the shock normal. Immediately, a fraction of incident electrons are accelerated by this ES wave and a ring-like velocity distribution is generated. They can couple with the hot Maxwellian core and excite the second ES wave around the upper hybrid frequency. (2) from the middle of the foot all the way to the ramp, electrons can couple with both incident and reflected ions. ES waves excited by ECDI in different directions propagate across each other. Electromagnetic (EM) waves (X mode) emitted toward upstream are observed in both regions. They are probably induced by a small fraction of relativistic electrons. Results shed new insight on the mechanism for the occurrence of ES wave excitations and possible EM wave emissions at young CME-driven shocks in the near-Sun solar wind.

preprint2020arXiv

ReMarNet: Conjoint Relation and Margin Learning for Small-Sample Image Classification

Despite achieving state-of-the-art performance, deep learning methods generally require a large amount of labeled data during training and may suffer from overfitting when the sample size is small. To ensure good generalizability of deep networks under small sample sizes, learning discriminative features is crucial. To this end, several loss functions have been proposed to encourage large intra-class compactness and inter-class separability. In this paper, we propose to enhance the discriminative power of features from a new perspective by introducing a novel neural network termed Relation-and-Margin learning Network (ReMarNet). Our method assembles two networks of different backbones so as to learn the features that can perform excellently in both of the aforementioned two classification mechanisms. Specifically, a relation network is used to learn the features that can support classification based on the similarity between a sample and a class prototype; at the meantime, a fully connected network with the cross entropy loss is used for classification via the decision boundary. Experiments on four image datasets demonstrate that our approach is effective in learning discriminative features from a small set of labeled samples and achieves competitive performance against state-of-the-art methods. Codes are available at https://github.com/liyunyu08/ReMarNet.

preprint2020arXiv

SSKD: Self-Supervised Knowledge Distillation for Cross Domain Adaptive Person Re-Identification

Domain adaptive person re-identification (re-ID) is a challenging task due to the large discrepancy between the source domain and the target domain. To reduce the domain discrepancy, existing methods mainly attempt to generate pseudo labels for unlabeled target images by clustering algorithms. However, clustering methods tend to bring noisy labels and the rich fine-grained details in unlabeled images are not sufficiently exploited. In this paper, we seek to improve the quality of labels by capturing feature representation from multiple augmented views of unlabeled images. To this end, we propose a Self-Supervised Knowledge Distillation (SSKD) technique containing two modules, the identity learning and the soft label learning. Identity learning explores the relationship between unlabeled samples and predicts their one-hot labels by clustering to give exact information for confidently distinguished images. Soft label learning regards labels as a distribution and induces an image to be associated with several related classes for training peer network in a self-supervised manner, where the slowly evolving network is a core to obtain soft labels as a gentle constraint for reliable images. Finally, the two modules can resist label noise for re-ID by enhancing each other and systematically integrating label information from unlabeled images. Extensive experiments on several adaptation tasks demonstrate that the proposed method outperforms the current state-of-the-art approaches by large margins.