Source author record

Jun Zhou

Jun Zhou appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Catalog footprint

What is connected

72 works

21 topics

4 close collaborators

preprint · 2026 · arXiv

Privacy-Aware Video Anomaly Detection through Orthogonal Subspace Projection

Video anomaly detection (VAD) systems often prioritize accuracy while overlooking privacy concerns, limiting their suitability for real-world deployment. We propose the Orthogonal Projection Layer (OPL), a lightweight module that removes task-irrelevant variations to produce representations focused on anomaly-relevant cues. To address privacy risks in human-centered scenarios, we introduce Guided OPL (G-OPL), which suppresses facial attributes using weak supervision from face-presence signals while preserving non-identifying features such as pose and motion. A cosine alignment objective enforces consistent capture and removal of facial information without identity labels or adversarial training. We further present a privacy-aware evaluation framework that jointly assesses detection performance and privacy preservation, and enables analysis of how sensitive information is filtered. Experiments show that embedding privacy constraints into model design reduces sensitive information while maintaining or improving detection accuracy, supporting projection-based architectures as a principled approach for privacy-aware VAD.
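The projection idea in this abstract can be illustrated with plain linear algebra. The sketch below is not the authors' OPL or G-OPL (which are learned modules); it only shows, under the assumption that the unwanted information lies along known directions, how a feature vector is projected onto the orthogonal complement of those directions. The names `project_out`, `feature` and `nuisance` are hypothetical.

```python
import math

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def orthonormalize(vectors, eps=1e-12):
    """Gram-Schmidt: return an orthonormal basis for the span of `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        if norm > eps:  # skip directions already covered by the basis
            basis.append([wi / norm for wi in w])
    return basis

def project_out(x, directions):
    """Remove from x every component lying in span(directions)."""
    out = list(x)
    for b in orthonormalize(directions):
        c = dot(out, b)
        out = [oi - c * bi for oi, bi in zip(out, b)]
    return out

feature = [3.0, 4.0, 5.0]
nuisance = [[1.0, 1.0, 0.0]]  # hypothetical direction to suppress
cleaned = project_out(feature, nuisance)
print(cleaned)  # approximately [-0.5, 0.5, 5.0]
```

After the projection, `cleaned` has no component along the suppressed direction, while the remaining coordinate is left untouched.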

preprint · 2024 · arXiv

GLISP: A Scalable GNN Learning System by Exploiting Inherent Structural Properties of Graphs

As a powerful tool for modeling graph data, Graph Neural Networks (GNNs) have received increasing attention in both academia and industry. Nevertheless, it is notoriously difficult to deploy GNNs on industrial scale graphs, due to their huge data size and complex topological structures. In this paper, we propose GLISP, a sampling based GNN learning system for industrial scale graphs. By exploiting the inherent structural properties of graphs, such as power law distribution and data locality, GLISP addresses the scalability and performance issues that arise at different stages of the graph learning process. GLISP consists of three core components: graph partitioner, graph sampling service and graph inference engine. The graph partitioner adopts the proposed vertex-cut graph partitioning algorithm AdaDNE to produce balanced partitioning for power law graphs, which is essential for sampling based GNN systems. The graph sampling service employs a load balancing design that allows the one hop sampling request of high degree vertices to be handled by multiple servers. In conjunction with the memory efficient data structure, the efficiency and scalability are effectively improved. The graph inference engine splits the $K$-layer GNN into $K$ slices and caches the vertex embeddings produced by each slice in the data locality aware hybrid caching system for reuse, thus completely eliminating redundant computation caused by the data dependency of graph. Extensive experiments show that GLISP achieves up to $6.53\times$ and $70.77\times$ speedups over existing GNN systems for training and inference tasks, respectively, and can scale to the graph with over 10 billion vertices and 40 billion edges with limited resources.
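The slice-and-cache inference idea can be sketched on a toy graph. This is not GLISP's inference engine (which is distributed and uses a hybrid cache); it is a minimal single-process illustration, assuming a hypothetical mean-aggregation layer, of why computing one layer at a time for all vertices and reusing the stored embeddings avoids the repeated work of per-vertex recursive evaluation.

```python
def aggregate(h_self, h_neighbors):
    """Toy GNN layer: mean of a vertex's own value and its neighbors' values."""
    values = [h_self] + h_neighbors
    return sum(values) / len(values)

def naive_embedding(graph, features, v, k, counter):
    """Recompute the full k-hop neighborhood for a single vertex."""
    if k == 0:
        return features[v]
    counter[0] += 1  # one layer evaluation
    h_self = naive_embedding(graph, features, v, k - 1, counter)
    h_nbrs = [naive_embedding(graph, features, u, k - 1, counter) for u in graph[v]]
    return aggregate(h_self, h_nbrs)

def layerwise_embeddings(graph, features, k, counter):
    """Evaluate one layer (slice) at a time, reusing the previous slice's outputs."""
    cache = dict(features)
    for _ in range(k):
        nxt = {}
        for v in graph:
            counter[0] += 1  # one layer evaluation
            nxt[v] = aggregate(cache[v], [cache[u] for u in graph[v]])
        cache = nxt
    return cache

graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
features = {0: 1.0, 1: 2.0, 2: 3.0, 3: 4.0}
K = 2

naive_calls = [0]
naive = {v: naive_embedding(graph, features, v, K, naive_calls) for v in graph}
layer_calls = [0]
cached = layerwise_embeddings(graph, features, K, layer_calls)
print(naive_calls[0], layer_calls[0])  # 16 8
```

Both approaches produce the same embeddings; the layer-wise version needs K evaluations per vertex, whereas the recursive version's cost grows with the size of each K-hop neighborhood.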

preprint · 2022 · arXiv

Agriculture-Vision Challenge 2022 -- The Runner-Up Solution for Agricultural Pattern Recognition via Transformer-based Models

The Agriculture-Vision Challenge in CVPR is one of the most famous and competitive challenges for global researchers to break the boundary between the computer vision and agriculture sectors, aiming at agricultural pattern recognition from aerial images. In this paper, we present our solution to the third Agriculture-Vision Challenge in CVPR 2022. We leverage a data pre-processing scheme and several Transformer-based models as well as data augmentation techniques to achieve an mIoU of 0.582, taking 2nd place in this challenge.
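For readers unfamiliar with the metric, the sketch below computes the standard mean intersection-over-union (mIoU) over flat label lists. It is an illustration only: the challenge may define its metric differently (for example, to handle overlapping labels), and the function name `mean_iou` and the sample labels are hypothetical.

```python
def mean_iou(pred, target, num_classes):
    """Standard mIoU: average of per-class intersection / union."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # ignore classes absent from both prediction and target
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0

pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
print(mean_iou(pred, target, num_classes=3))  # 0.5, up to floating-point rounding
```

Here the per-class IoUs are 1/3, 2/3 and 1/2, so their mean is 0.5.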

preprint · 2022 · arXiv

AHEAD: A Triple Attention Based Heterogeneous Graph Anomaly Detection Approach

Graph anomaly detection on attributed networks has become a prevalent research topic due to its broad applications in many influential domains. In real-world scenarios, nodes and edges in attributed networks usually display distinct heterogeneity, i.e. attributes of different types of nodes show great variety, different types of relations represent diverse meanings. Anomalies usually perform differently from the majority in various perspectives of heterogeneity in these networks. However, existing graph anomaly detection approaches do not leverage heterogeneity in attributed networks, which is highly related to anomaly detection. In light of this problem, we propose AHEAD: a heterogeneity-aware unsupervised graph anomaly detection approach based on the encoder-decoder framework. Specifically, for the encoder, we design three levels of attention, i.e. attribute level, node type level, and edge level attentions to capture the heterogeneity of network structure, node properties and information of a single node, respectively. In the decoder, we exploit structure, attribute, and node type reconstruction terms to obtain an anomaly score for each node. Extensive experiments show the superiority of AHEAD on several real-world heterogeneous information networks compared with the state-of-arts in the unsupervised setting. Further experiments verify the effectiveness and robustness of our triple attention, model backbone, and decoder in general.

preprint · 2022 · arXiv

An Effective Graph Learning based Approach for Temporal Link Prediction: The First Place of WSDM Cup 2022

Temporal link prediction, one of the most crucial tasks on temporal graphs, has attracted much attention from the research community. The WSDM Cup 2022 seeks solutions that predict the existence probabilities of edges within time spans over temporal graphs. This paper introduces the solution of AntGraph, which won 1st place in the competition. We first analyze the theoretical upper bound of the performance obtained after removing temporal information, which implies that structure and attribute information on the graph alone could achieve great performance. Based on this hypothesis, we then introduce several well-designed features. Finally, experiments conducted on the competition datasets show the superiority of our proposal, which achieved an AUC score of 0.666 on dataset A and 0.902 on dataset B; the ablation studies also prove the efficiency of each feature. Code is publicly available at https://github.com/im0qianqian/WSDM2022TGP-AntGraph.

preprint · 2022 · arXiv

BadDet: Backdoor Attacks on Object Detection

Deep learning models have been deployed in numerous real-world applications such as autonomous driving and surveillance. However, these models are vulnerable in adversarial environments. Backdoor attack is emerging as a severe security threat which injects a backdoor trigger into a small portion of training data such that the trained model behaves normally on benign inputs but gives incorrect predictions when the specific trigger appears. While most research in backdoor attacks focuses on image classification, backdoor attacks on object detection have not been explored but are of equal importance. Object detection has been adopted as an important module in various security-sensitive applications such as autonomous driving. Therefore, backdoor attacks on object detection could pose severe threats to human lives and properties. We propose four kinds of backdoor attacks for object detection task: 1) Object Generation Attack: a trigger can falsely generate an object of the target class; 2) Regional Misclassification Attack: a trigger can change the prediction of a surrounding object to the target class; 3) Global Misclassification Attack: a single trigger can change the predictions of all objects in an image to the target class; and 4) Object Disappearance Attack: a trigger can make the detector fail to detect the object of the target class. We develop appropriate metrics to evaluate the four backdoor attacks on object detection. We perform experiments using two typical object detection models -- Faster-RCNN and YOLOv3 on different datasets. More crucially, we demonstrate that even fine-tuning on another benign dataset cannot remove the backdoor hidden in the object detection model. To defend against these backdoor attacks, we propose Detector Cleanse, an entropy-based run-time detection framework to identify poisoned testing samples for any deployed object detector.

preprint · 2022 · arXiv

Chiral Phonon Activated Spin Seebeck Effect

Efficient generation of spin polarization is the central focus of spintronics. In magnetic materials, spin currents can arise from heat currents by the conventional spin Seebeck effect. Recently, chiral phonons with definite handedness and angular momenta have also produced profound impacts on multiple research fields. In this paper, starting with nonequilibrium distribution of chiral phonons under temperature gradient, we find a new spin selectivity effect - chiral phonon activated spin Seebeck (CPASS) effect, in chiral materials without magnetic order nor spin-orbit coupling. With both phonon-drag and band transport contributions, the CPASS coefficients are computed based on the Boltzmann transport theory. The spin accumulations by the CPASS effect quadratically increase with temperature gradient, and vary with the chemical potential modulation, thus enabling highly efficient and tunable spin generation. The CPASS effect provides a promising explanation on the chiral-induced spin selectivity effect and opportunities for designing advanced spintronic devices based on nonmagnetic chiral materials.

preprint · 2022 · arXiv

Confidence May Cheat: Self-Training on Graph Neural Networks under Distribution Shift

Graph Convolutional Networks (GCNs) have recently attracted vast interest and achieved state-of-the-art performance on graphs, but its success could typically hinge on careful training with amounts of expensive and time-consuming labeled data. To alleviate labeled data scarcity, self-training methods have been widely adopted on graphs by labeling high-confidence unlabeled nodes and then adding them to the training step. In this line, we empirically make a thorough study for current self-training methods on graphs. Surprisingly, we find that high-confidence unlabeled nodes are not always useful, and even introduce the distribution shift issue between the original labeled dataset and the augmented dataset by self-training, severely hindering the capability of self-training on graphs. To this end, in this paper, we propose a novel Distribution Recovered Graph Self-Training framework (DR-GST), which could recover the distribution of the original labeled dataset. Specifically, we first prove the equality of loss function in self-training framework under the distribution shift case and the population distribution if each pseudo-labeled node is weighted by a proper coefficient. Considering the intractability of the coefficient, we then propose to replace the coefficient with the information gain after observing the same changing trend between them, where information gain is respectively estimated via both dropout variational inference and dropedge variational inference in DR-GST. However, such a weighted loss function will enlarge the impact of incorrect pseudo labels. As a result, we apply the loss correction method to improve the quality of pseudo labels. Both our theoretical analysis and extensive experiments on five benchmark datasets demonstrate the effectiveness of the proposed DR-GST, as well as each well-designed component in DR-GST.

preprint · 2022 · arXiv

CrowdMLP: Weakly-Supervised Crowd Counting via Multi-Granularity MLP

Existing state-of-the-art crowd counting algorithms rely excessively on location-level annotations, which are burdensome to acquire. When only count-level (weak) supervisory signals are available, it is arduous and error-prone to regress total counts due to the lack of explicit spatial constraints. To address this issue, a novel and efficient counter (referred to as CrowdMLP) is presented, which probes into modelling global dependencies of embeddings and regressing total counts by devising a multi-granularity MLP regressor. In specific, a locally-focused pre-trained frontend is cascaded to extract crude feature maps with intrinsic spatial cues, which prevent the model from collapsing into trivial outcomes. The crude embeddings, along with raw crowd scenes, are tokenized at different granularity levels. The multi-granularity MLP then proceeds to mix tokens at the dimensions of cardinality, channel, and spatial for mining global information. An effective proxy task, namely Split-Counting, is also proposed to evade the barrier of limited samples and the shortage of spatial hints in a self-supervised manner. Extensive experiments demonstrate that CrowdMLP significantly outperforms existing weakly-supervised counting algorithms and performs on par with state-of-the-art location-level supervised approaches.

preprint · 2022 · arXiv

Decisive role of electron-phonon coupling for phonon and electron instabilities in transition metal dichalcogenides

The origin of the charge density wave (CDW) in transition metal dichalcogenides has been hotly debated, and no conclusive agreement has been reached. Here, we propose an ab-initio framework for an accurate description of both Fermi surface nesting and electron-phonon coupling (EPC) and systematically investigate their roles in the formation of CDW. Using monolayer 1H-NbSe$_2$ and 1T-VTe$_2$ as representative examples, we show that it is the momentum-dependent EPC that softens the phonon frequencies, which become imaginary (phonon instabilities) at CDW vectors (indicating CDW formation). Besides, the distribution of the CDW gap opening (electron instabilities) can be correctly predicted only if EPC is included in the mean-field model. These results emphasize the decisive role of EPC in the CDW formation. Our analytical process is general and can be applied to other CDW systems.

preprint · 2022 · arXiv

Gaia: Graph Neural Network with Temporal Shift aware Attention for Gross Merchandise Value Forecast in E-commerce

E-commerce has come a long way in empowering merchants through the internet. To store goods efficiently and allocate marketing resources properly, it is important for merchants to predict gross merchandise value (GMV) accurately. However, accurate prediction is nontrivial when digitized data are scarce. In this article, we present a solution to better forecast GMV inside the Alipay app. Because graph neural networks (GNNs) can correlate different entities to enrich information, we propose Gaia, a GNN model with temporal shift aware attention. Gaia leverages the sales information of relevant e-sellers and learns neighbor correlations based on temporal dependencies. Tested on Alipay's real dataset and compared with other baselines, Gaia shows the best performance. Gaia is also deployed in a simulated online environment, where it achieves great improvement over the baselines.

preprint · 2022 · arXiv

Interpretable learning of voltage for electrode design of multivalent metal-ion batteries

Deep learning (DL) has indeed emerged as a powerful tool for rapidly and accurately predicting materials properties from big data, such as the design of current commercial Li-ion batteries. However, its practical utility for multivalent metal-ion batteries (MIBs), the most promising future solution of large-scale energy storage, is limited due to the scarce MIB data availability and poor DL model interpretability. Here, we develop an interpretable DL model as an effective and accurate method for learning electrode voltages of multivalent MIBs (divalent magnesium, calcium, zinc, and trivalent aluminum) at small dataset limits (150~500). Using the experimental results as validation, our model is much more accurate than machine-learning models which usually are better than DL in the small dataset regime. Besides the high accuracy, our feature-engineering-free DL model is explainable, which automatically extracts the atom covalent radius as the most important feature for the voltage learning by visualizing vectors from the layers of the neural network. The presented model potentially accelerates the design and optimization of multivalent MIB materials with fewer data and less domain-knowledge restriction, and is implemented into a publicly available online tool kit in http://batteries.2dmatpedia.org/ for the battery community.

preprint · 2022 · arXiv

KGNN: Distributed Framework for Graph Neural Knowledge Representation

Knowledge representation learning has been commonly adopted to incorporate knowledge graphs (KGs) into various online services. Although existing knowledge representation learning methods have achieved considerable performance improvement, they ignore high-order structure and abundant attribute information, resulting in unsatisfactory performance on semantics-rich KGs. Moreover, they fail to make predictions in an inductive manner and cannot scale to large industrial graphs. To address these issues, we develop a novel framework called KGNN to take full advantage of knowledge data for representation learning in the distributed learning system. KGNN is equipped with a GNN-based encoder and a knowledge-aware decoder, which aim to jointly explore high-order structure and attribute information in a fine-grained fashion and preserve the relation patterns in KGs, respectively. Extensive experiments on three datasets for link prediction and triplet classification tasks demonstrate the effectiveness and scalability of the KGNN framework.

preprint · 2022 · arXiv

Revisiting Domain Generalized Stereo Matching Networks from a Feature Consistency Perspective

Despite recent stereo matching networks achieving impressive performance given sufficient training data, they suffer from domain shifts and generalize poorly to unseen domains. We argue that maintaining feature consistency between matching pixels is a vital factor for promoting the generalization capability of stereo matching networks, which has not been adequately considered. Here we address this issue by proposing a simple pixel-wise contrastive learning across the viewpoints. The stereo contrastive feature loss function explicitly constrains the consistency between learned features of matching pixel pairs which are observations of the same 3D points. A stereo selective whitening loss is further introduced to better preserve the stereo feature consistency across domains, which decorrelates stereo features from stereo viewpoint-specific style information. Counter-intuitively, the generalization of feature consistency between two viewpoints in the same scene translates to the generalization of stereo matching performance to unseen domains. Our method is generic in nature as it can be easily embedded into existing stereo networks and does not require access to the samples in the target domain. When trained on synthetic data and generalized to four real-world testing sets, our method achieves superior performance over several state-of-the-art networks.

preprint · 2022 · arXiv

RVAE-LAMOL: Residual Variational Autoencoder to Enhance Lifelong Language Learning

Lifelong Language Learning (LLL) aims to train a neural network to learn a stream of NLP tasks while retaining knowledge from previous tasks. However, previous works that follow the data-free constraint still suffer from the catastrophic forgetting issue, where the model forgets what it just learned from previous tasks. In order to alleviate catastrophic forgetting, we propose the residual variational autoencoder (RVAE) to enhance LAMOL, a recent LLL model, by mapping different tasks into a limited unified semantic space. In this space, previous tasks are easily corrected to their own distributions by pseudo samples. Furthermore, we propose an identity task to make the model discriminative in recognizing which task a sample belongs to. To train RVAE-LAMOL better, we propose a novel training scheme, Alternate Lag Training. In the experiments, we test RVAE-LAMOL on permutations of three datasets from DecaNLP. The experimental results demonstrate that RVAE-LAMOL outperforms naïve LAMOL on all permutations and generates more meaningful pseudo-samples.

preprint · 2022 · arXiv

SO(3)-Pose: SO(3)-Equivariance Learning for 6D Object Pose Estimation

6D pose estimation of rigid objects from RGB-D images is crucial for object grasping and manipulation in robotics. Although RGB channels and the depth (D) channel are often complementary, providing respectively the appearance and geometry information, it is still non-trivial how to fully benefit from the two cross-modal data. From the simple yet new observation, when an object rotates, its semantic label is invariant to the pose while its keypoint offset direction is variant to the pose. To this end, we present SO(3)-Pose, a new representation learning network to explore SO(3)-equivariant and SO(3)-invariant features from the depth channel for pose estimation. The SO(3)-invariant features facilitate to learn more distinctive representations for segmenting objects with similar appearance from RGB channels. The SO(3)-equivariant features communicate with RGB features to deduce the (missed) geometry for detecting keypoints of an object with the reflective surface from the depth channel. Unlike most of existing pose estimation methods, our SO(3)-Pose not only implements the information communication between the RGB and depth channels, but also naturally absorbs the SO(3)-equivariance geometry knowledge from depth images, leading to better appearance and geometry representation learning. Comprehensive experiments show that our method achieves the state-of-the-art performance on three benchmarks.

preprint · 2022 · arXiv

Towards Scalable and Privacy-Preserving Deep Neural Network via Algorithmic-Cryptographic Co-design

Deep Neural Networks (DNNs) have achieved remarkable progress in various real-world applications, especially when abundant training data are provided. However, data isolation has become a serious problem currently. Existing works build privacy preserving DNN models from either algorithmic perspective or cryptographic perspective. The former mainly splits the DNN computation graph between data holders or between data holders and server, which demonstrates good scalability but suffers from accuracy loss and potential privacy risks. In contrast, the latter leverages time-consuming cryptographic techniques, which has strong privacy guarantee but poor scalability. In this paper, we propose SPNN - a Scalable and Privacy-preserving deep Neural Network learning framework, from algorithmic-cryptographic co-perspective. From algorithmic perspective, we split the computation graph of DNN models into two parts, i.e., the private data related computations that are performed by data holders and the rest heavy computations that are delegated to a server with high computation ability. From cryptographic perspective, we propose using two types of cryptographic techniques, i.e., secret sharing and homomorphic encryption, for the isolated data holders to conduct private data related computations privately and cooperatively. Furthermore, we implement SPNN in a decentralized setting and introduce user-friendly APIs. Experimental results conducted on real-world datasets demonstrate the superiority of SPNN.
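Of the two cryptographic techniques named in this abstract, additive secret sharing is simple enough to sketch. The code below is not SPNN's protocol; it is a minimal illustration, assuming a hypothetical modulus `MODULUS` and helper names, of how a value is split into random shares and how parties can add shared values without revealing them.

```python
import secrets

MODULUS = 2**61 - 1  # hypothetical modulus chosen for this sketch

def share(value, n_parties):
    """Split `value` into n additive shares that sum to it modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]

def reconstruct(shares):
    """Recover the secret by summing all shares modulo MODULUS."""
    return sum(shares) % MODULUS

def add_shared(a_shares, b_shares):
    """Each party adds its own two shares locally; no secret is revealed."""
    return [(a + b) % MODULUS for a, b in zip(a_shares, b_shares)]

x_shares = share(20, n_parties=3)
y_shares = share(22, n_parties=3)
print(reconstruct(add_shared(x_shares, y_shares)))  # 42
```

Any strict subset of the shares is uniformly random, so an individual party learns nothing about the secret; only the sum of all shares reveals it.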

preprint · 2022 · arXiv

Transfer Attacks Revisited: A Large-Scale Empirical Study in Real Computer Vision Settings

One intriguing property of adversarial attacks is their "transferability" -- an adversarial example crafted with respect to one deep neural network (DNN) model is often found effective against other DNNs as well. Intensive research has been conducted on this phenomenon under simplistic controlled conditions. Yet, thus far, there is still a lack of comprehensive understanding about transferability-based attacks ("transfer attacks") in real-world environments. To bridge this critical gap, we conduct the first large-scale systematic empirical study of transfer attacks against major cloud-based MLaaS platforms, taking the components of a real transfer attack into account. The study leads to a number of interesting findings which are inconsistent to the existing ones, including: (1) Simple surrogates do not necessarily improve real transfer attacks. (2) No dominant surrogate architecture is found in real transfer attacks. (3) It is the gap between posterior (output of the softmax layer) rather than the gap between logit (so-called $κ$ value) that increases transferability. Moreover, by comparing with prior works, we demonstrate that transfer attacks possess many previously unknown properties in real-world environments, such as (1) Model similarity is not a well-defined concept. (2) $L_2$ norm of perturbation can generate high transferability without usage of gradient and is a more powerful source than $L_\infty$ norm. We believe this work sheds light on the vulnerabilities of popular MLaaS platforms and points to a few promising research directions.

preprint · 2022 · arXiv

Unravelling Distance-Dependent Inter-Site Interactions and Magnetic Transition Effects of Heteronuclear Single Atom Catalysts on Electrochemical Oxygen Reduction

Inter-site interactions between single atom catalysts (SACs) in the high loading regime are critical to tuning the catalytic performance. However, the understanding on such interactions and their distance dependent effects remains elusive, especially for the heteronuclear SACs. In this study, we reveal the effects of the distance-dependent inter-site interaction on the catalytic performance of SACs. Using the density functional theory calculations, we systematically investigate the heteronuclear iron and cobalt single atoms co-supported on the nitrogen-doped graphene (FeN4-C and CoN4-C) for oxygen reduction reaction (ORR). We find that as the distance between Fe and Co SACs decreases, FeN4-C exhibits a reduced catalytic activity, which can be mitigated by the presence of an axial hydroxyl ligand, whereas the activity of CoN4-C shows a volcano-like evolution with the optimum reached at the intermediate distance. We further unravel that the transition towards the high-spin state upon adsorption of ORR intermediate adsorbates is responsible for the decreased activity of both FeN4-C and CoN4-C at short inter-site distance. Such high-spin state transition is also found to significantly shift the linear relation between hydroxyl (*OH) and hydroperoxyl (*OOH) adsorbates. These findings not only shed light on the SAC-specific effect of the distance-dependent inter-site interaction between heteronuclear SACs, but also pave a way towards shifting the long-standing linear relations observed in multiple-electron chemical reactions.

preprint · 2022 · arXiv

Vertically Federated Graph Neural Network for Privacy-Preserving Node Classification

Recently, Graph Neural Network (GNN) has achieved remarkable progresses in various real-world tasks on graph data, consisting of node features and the adjacent information between different nodes. High-performance GNN models always depend on both rich features and complete edge information in graph. However, such information could possibly be isolated by different data holders in practice, which is the so-called data isolation problem. To solve this problem, in this paper, we propose VFGNN, a federated GNN learning paradigm for privacy-preserving node classification task under data vertically partitioned setting, which can be generalized to existing GNN models. Specifically, we split the computation graph into two parts. We leave the private data (i.e., features, edges, and labels) related computations on data holders, and delegate the rest of computations to a semi-honest server. We also propose to apply differential privacy to prevent potential information leakage from the server. We conduct experiments on three benchmarks and the results demonstrate the effectiveness of VFGNN.

preprint · 2021 · arXiv

Cross-Domain Recommendation: Challenges, Progress, and Prospects

To address the long-standing data sparsity problem in recommender systems (RSs), cross-domain recommendation (CDR) has been proposed to leverage the relatively richer information from a richer domain to improve the recommendation performance in a sparser domain. Although CDR has been extensively studied in recent years, there is a lack of a systematic review of the existing CDR approaches. To fill this gap, in this paper, we provide a comprehensive review of existing CDR approaches, including challenges, research progress, and future directions. Specifically, we first summarize existing CDR approaches into four types, including single-target CDR, multi-domain recommendation, dual-target CDR, and multi-target CDR. We then present the definitions and challenges of these CDR approaches. Next, we propose a full-view categorization and new taxonomies on these approaches and report their research progress in detail. In the end, we share several promising research directions in CDR.

preprint · 2021 · arXiv

Dimension reduction induced anisotropic magnetic thermal conductivity in hematite nanowire

The thermophysical properties near the magnetic phase transition point are of great importance in the study of critical phenomena. Low-dimensional materials are suggested to have different thermophysical properties compared to their bulk counterparts due to the dimension induced quantum confinement and anisotropy. In this work, we measured the thermal conductivity of $α$-Fe$_2$O$_3$ nanowires along the [110] direction (growing direction) at temperatures from 100K to 150K and found a dip in thermal conductivity near the Morin temperature. We found that the thermal conductivity near the Morin temperature varies with the angle between the magnetic field and the [110] direction of the nanowire. More specifically, an angular-dependent thermal conductivity is observed, due to the magnetic field induced movement of the magnetic domain wall. The angle corresponding to the maximum of thermal conductivity varies near the Morin transition temperature, due to the different magnetic easy axis as suggested by our calculation based on magnetic anisotropy energy. This angular dependence of thermal conductivity indicates that the easy axis of $α$-Fe$_2$O$_3$ nanowires is different from that of bulk $α$-Fe$_2$O$_3$ due to the geometric anisotropy.

preprint · 2021 · arXiv

Goal-Oriented Gaze Estimation for Zero-Shot Learning

Zero-shot learning (ZSL) aims to recognize novel classes by transferring semantic knowledge from seen classes to unseen classes. Since semantic knowledge is built on attributes shared between different classes, which are highly local, strong prior for localization of object attribute is beneficial for visual-semantic embedding. Interestingly, when recognizing unseen images, human would also automatically gaze at regions with certain semantic clue. Therefore, we introduce a novel goal-oriented gaze estimation module (GEM) to improve the discriminative attribute localization based on the class-level attributes for ZSL. We aim to predict the actual human gaze location to get the visual attention regions for recognizing a novel object guided by attribute description. Specifically, the task-dependent attention is learned with the goal-oriented GEM, and the global image features are simultaneously optimized with the regression of local attribute features. Experiments on three ZSL benchmarks, i.e., CUB, SUN and AWA2, show the superiority or competitiveness of our proposed method against the state-of-the-art ZSL methods. The ablation analysis on real gaze data CUB-VWSW also validates the benefits and accuracy of our gaze estimation module. This work implies the promising benefits of collecting human gaze dataset and automatic gaze estimation algorithms on high-level computer vision tasks. The code is available at https://github.com/osierboy/GEM-ZSL.

preprint2021arXiv

Information Bottleneck Constrained Latent Bidirectional Embedding for Zero-Shot Learning

Zero-shot learning (ZSL) aims to recognize novel classes by transferring semantic knowledge from seen classes to unseen classes. Though many ZSL methods rely on a direct mapping between the visual and the semantic space, the calibration deviation and hubness problems limit their generalization capability to unseen classes. Recently emerged generative ZSL methods generate unseen image features to transform ZSL into a supervised classification problem. However, most generative models still suffer from the seen-unseen bias problem, as only seen data is used for training. To address these issues, we propose a novel bidirectional embedding based generative model with a tight visual-semantic coupling constraint. We learn a unified latent space that calibrates the embedded parametric distributions of both the visual and semantic spaces. Since the embedding of high-dimensional visual features contains much non-semantic information, the alignment of the visual and semantic spaces in the latent space would inevitably deviate. Therefore, we introduce an information bottleneck (IB) constraint to ZSL for the first time to preserve essential attribute information during the mapping. Specifically, we utilize uncertainty estimation and the wake-sleep procedure to alleviate feature noise and improve the model's abstraction capability. In addition, our method can be easily extended to the transductive ZSL setting by generating labels for unseen images. We then introduce a robust loss to address the resulting label noise. Extensive experimental results show that our method outperforms state-of-the-art methods in different ZSL settings on most benchmark datasets. The code will be available at https://github.com/osierboy/IBZSL.

preprint2021arXiv

Phase diagram and superlattice structures of monolayer phosphorus carbide (P$_x$C$_{1-x}$)

Phase stability and properties of two-dimensional phosphorus carbide, P$_x$C$_{1-x}$, are investigated using the first-principles method in combination with cluster expansion and Monte Carlo simulation. Monolayer P$_x$C$_{1-x}$ is found to be a phase-separating system, which indicates difficulty in fabricating monolayer P$_x$C$_{1-x}$ or crystalline P$_x$C$_{1-x}$ thin films. Nevertheless, a bottom-up design approach is used to determine the stable structures of P$_x$C$_{1-x}$ at various compositions, which turn out to be superlattices consisting of alternating carbon and phosphorus nanoribbons along the armchair direction. Results of first-principles calculations indicate that once these structures are produced, they are mechanically and thermodynamically stable. All the ordered structures are predicted to be semiconductors, with band gaps ranging from 0.2 to 1.2 eV. In addition, monolayer P$_x$C$_{1-x}$ is predicted to have high carrier mobility and high optical absorption in the ultraviolet region, which shows a red-shift as the P:C ratio increases. These properties make 2D P$_x$C$_{1-x}$ promising materials for applications in electronics and optoelectronics.

preprint2021arXiv

Role of Magnon-Magnon Scattering in Magnon Polaron Spin Seebeck Effect

The spin Seebeck effect (SSE) signal of magnon polarons in bulk-Y3Fe5O12 (YIG)/Pt heterostructures is found to drastically change as a function of temperature. It appears as a dip in the total SSE signal at low temperatures, but as the temperature increases, the dip gradually decreases before turning to a peak. We attribute the observed dip-to-peak transition to the rapid rise of the four-magnon scattering rate. Our analysis provides important insights into the microscopic origin of the hybridized excitations and the overall temperature dependence of the SSE anomalies.

preprint2021arXiv

Room temperature ferromagnetism of monolayer chromium telluride with perpendicular magnetic anisotropy

The realization of long-range magnetic ordering in two-dimensional (2D) systems can potentially revolutionize next-generation information technology. Here, we report the successful fabrication of crystalline Cr3Te4 monolayers with room temperature ferromagnetism. Using molecular beam epitaxy, the growth of 2D Cr3Te4 films with monolayer thickness is demonstrated at low substrate temperatures (~100 °C), compatible with Si CMOS technology. X-ray magnetic circular dichroism measurements reveal a Curie temperature (Tc) of ~344 K for the Cr3Te4 monolayer with an out-of-plane magnetic easy axis, which decreases to ~240 K for the thicker film (~7 nm) with an in-plane easy axis. The enhancement of ferromagnetic coupling and the magnetic anisotropy transition are ascribed to interfacial effects, in particular the orbital overlap at the monolayer Cr3Te4/graphite interface, supported by density-functional theory calculations. This work sheds light on the low-temperature scalable growth of 2D nonlayered materials with room temperature ferromagnetism for new magnetic and spintronic devices.

72 published item(s)

Privacy-Aware Video Anomaly Detection through Orthogonal Subspace Projection

GLISP: A Scalable GNN Learning System by Exploiting Inherent Structural Properties of Graphs

Agriculture-Vision Challenge 2022 -- The Runner-Up Solution for Agricultural Pattern Recognition via Transformer-based Models

AHEAD: A Triple Attention Based Heterogeneous Graph Anomaly Detection Approach

An Effective Graph Learning based Approach for Temporal Link Prediction: The First Place of WSDM Cup 2022

BadDet: Backdoor Attacks on Object Detection

Chiral Phonon Activated Spin Seebeck Effect

Confidence May Cheat: Self-Training on Graph Neural Networks under Distribution Shift

CrowdMLP: Weakly-Supervised Crowd Counting via Multi-Granularity MLP

Decisive role of electron-phonon coupling for phonon and electron instabilities in transition metal dichalcogenides

Gaia: Graph Neural Network with Temporal Shift aware Attention for Gross Merchandise Value Forecast in E-commerce

Interpretable learning of voltage for electrode design of multivalent metal-ion batteries

KGNN: Distributed Framework for Graph Neural Knowledge Representation

Revisiting Domain Generalized Stereo Matching Networks from a Feature Consistency Perspective

RVAE-LAMOL: Residual Variational Autoencoder to Enhance Lifelong Language Learning

SO(3)-Pose: SO(3)-Equivariance Learning for 6D Object Pose Estimation

Towards Scalable and Privacy-Preserving Deep Neural Network via Algorithmic-Cryptographic Co-design

Transfer Attacks Revisited: A Large-Scale Empirical Study in Real Computer Vision Settings

Unravelling Distance-Dependent Inter-Site Interactions and Magnetic Transition Effects of Heteronuclear Single Atom Catalysts on Electrochemical Oxygen Reduction

Vertically Federated Graph Neural Network for Privacy-Preserving Node Classification

Cross-Domain Recommendation: Challenges, Progress, and Prospects

Dimension reduction induced anisotropic magnetic thermal conductivity in hematite nanowire

Goal-Oriented Gaze Estimation for Zero-Shot Learning

Information Bottleneck Constrained Latent Bidirectional Embedding for Zero-Shot Learning

Phase diagram and superlattice structures of monolayer phosphorus carbide (P$_x$C$_{1-x}$)

Role of Magnon-Magnon Scattering in Magnon Polaron Spin Seebeck Effect

Room temperature ferromagnetism of monolayer chromium telluride with perpendicular magnetic anisotropy

3DPVNet: Patch-level 3D Hough Voting Network for 6D Pose Estimation

A Comprehensive Analysis of Information Leakage in Deep Transfer Learning

A Semi-supervised Graph Attentive Network for Financial Fraud Detection

A Time Attention based Fraud Transaction Detection Framework

Adapted tree boosting for Transfer Learning

AGL: a Scalable System for Industrial-purpose Graph Machine Learning

Bandit Samplers for Training Graph Neural Networks

Beyond Triplet Loss: Person Re-identification with Fine-grained Difference-aware Pairwise Loss

Data-Free Adversarial Perturbations for Practical Black-Box Attack

Deep Residual-Dense Lattice Network for Speech Enhancement

Distributed Deep Forest and its Application to Automatic Detection of Cash-out Fraud

DSSLP: A Distributed Framework for Semi-supervised Link Prediction

Generating Natural Language Adversarial Examples on a Large Scale with Generative Models

Graph Representation Learning for Merchant Incentive Optimization in Mobile Payment Marketing

Heterogeneous Graph Neural Network for Recommendation

Heterogeneous Graph Neural Networks for Malicious Account Detection

How Much Can A Retailer Sell? Sales Forecasting on Tmall

Industrial Scale Privacy Preserving Deep Neural Network

InfDetect: a Large Scale Graph-based Fraud Detection System for E-Commerce Insurance

Interlayer and Intralayer Scale Aggregation for Scale-invariant Crowd Counting

Learning-Based Stopping Power Mapping on Dual Energy CT for Proton Radiation Therapy

Learning-Based Synthetic Dual Energy CT Imaging from Single Energy CT for Stopping Power Ratio Calculation in Proton Radiation Therapy

NetDP: An Industrial-Scale Distributed Network Representation Framework for Default Prediction in Ant Credit Pay

Practical Privacy Preserving POI Recommendation

Privacy Preserving PCA for Multiparty Modeling

Privacy Preserving Point-of-interest Recommendation Using Decentralized Matrix Factorization

RNE: A Scalable Network Embedding for Billion-scale Recommendation

SAFE: Scalable Automatic Feature Engineering Framework for Industrial Tasks

Secret Sharing based Secure Regressions with Applications

Secure Social Recommendation based on Secret Sharing

Tailoring Magnetic Anisotropy in Cr$_2$Ge$_2$Te$_6$ by Electrostatic Gating

Uncovering Insurance Fraud Conspiracy with Network Learning

Unpack Local Model Interpretation for GBDT

A Thermal Resistance Network Model for Heat Conduction of Amorphous Polymers

Phonon renormalization induced by electric field in ferroelectric P(VDF-TrFE) nanofibers

Dimensional crossover of heat conduction in amorphous Polyimide nanofibers

Interfacial thermal conductance across metal-insulator/semiconductor interfaces due to surface states

Spin-dependent Seebeck Effect in Aharonov-Bohm Rings with Rashba and Dresselhaus Spin-orbit Interactions

The interplay of electronic reconstructions, lattice distortions, and surface oxygen vacancies in insulator-metal transition of LaAlO$_{3}$/SrTiO$_{3}$

An Electrohydrodynamics Model for Non-equilibrium Electron and Phonon Transport in Metal Films after Ultra-short Pulse Laser Heating

Inhomogeneous Thermal Conductivity Enhances Thermoelectric Cooling

On Transition Metal Catalyzed Reduction of N-nitrosodimethlamine

Spin Seebeck Effect in Asymmetric Four-Terminal Systems with Rashba Spin-Orbit Coupling

Thermal Boundary Conductance Across Metal-Nonmetal Interfaces: Effects of Electron-Phonon Coupling both in Metal and at Interface

Wave-packet rectification in nonlinear electronic systems: A tunable Aharonov-Bohm diode