Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
53 works
0 followers
30 topics
4 close collaborators


Published work

53 published item(s)

preprint, 2026, arXiv

Cluster-Aware Neural Collapse Prompt Tuning for Long-Tailed Generalization of Vision-Language Models

Prompt learning has emerged as an efficient alternative to fine-tuning pre-trained vision-language models (VLMs). Despite its promise, current methods still struggle to maintain tail-class discriminability when adapting to class-imbalanced datasets. In this work, we propose cluster-aware neural collapse prompt tuning (CPT), which enhances the discriminability of tail classes in prompt-tuned VLMs without sacrificing their overall generalization. First, we design a cluster-invariant space by mining semantic assignments from the pre-trained VLM and mapping them to prompt-tuned features. This computes cluster-level boundaries and restricts the constraints to local neighborhoods, which reduces interference with the global semantic structure of the pre-trained VLM. Second, we introduce neural-collapse-driven discriminability optimization with three losses: textual Equiangular Tight Frame (ETF) separation loss, class-wise convergence loss, and rotation stabilization loss. These losses work together to shape intra-cluster geometry for better inter-class separation and intra-class alignment. Extensive experiments on 11 diverse datasets demonstrate that CPT outperforms SOTA methods, with stronger performance on long-tail classes and good generalization to unseen classes.
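
For orientation, the geometry targeted by an Equiangular Tight Frame can be sketched in a few lines. This is a minimal illustration, not the CPT loss itself: it builds the standard simplex ETF for K classes, in which every class vector has unit norm and every pair of distinct vectors has cosine similarity -1/(K-1). The function names `simplex_etf` and `cosine` are hypothetical, and the abstract does not say how the paper applies this structure to text features inside each cluster.

```python
import math


def simplex_etf(num_classes):
    """Return K unit vectors in R^K that form a simplex ETF (illustrative only)."""
    k = num_classes
    scale = math.sqrt(k / (k - 1))
    # Row i is scale * (e_i - (1/k) * ones), i.e. a centered, rescaled basis vector.
    return [
        [scale * ((1.0 if i == j else 0.0) - 1.0 / k) for j in range(k)]
        for i in range(k)
    ]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


etf = simplex_etf(4)
# Every pair of distinct class vectors is separated by the same angle.
pairwise = [cosine(etf[i], etf[j]) for i in range(4) for j in range(i + 1, 4)]
```

For K = 4, each entry of `pairwise` equals -1/3 up to floating-point error, which is the largest uniform separation that K unit vectors can achieve.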

preprint, 2024, arXiv

Unifying Structured Data as Graph for Data-to-Text Pre-Training

Data-to-text (D2T) generation aims to transform structured data into natural language text. Data-to-text pre-training has proved to be powerful in enhancing D2T generation and yields impressive performances. However, previous pre-training methods either oversimplified structured data into a sequence without considering input structures or designed training objectives tailored for a specific data structure (e.g., table or knowledge graph). In this paper, we unify different types of structured data (i.e., table, key-value data, knowledge graph) into the graph format and cast different data-to-text generation tasks as graph-to-text generation. To effectively exploit the structural information of the input graph, we propose a structure-enhanced pre-training method for D2T generation by designing a structure-enhanced Transformer. Concretely, we devise a position matrix for the Transformer, encoding relative positional information of connected nodes in the input graph. In addition, we propose a new attention matrix to incorporate graph structures into the original Transformer by taking the available explicit connectivity structure into account. Extensive experiments on six benchmark datasets show the effectiveness of our model. Our source codes are available at https://github.com/AlibabaResearch/DAMO-ConvAI/tree/main/unid2t.

preprint, 2022, arXiv

Adaptive actuation of magnetic soft robots using deep reinforcement learning

Magnetic soft robots have attracted growing interest due to their unique advantages in terms of untethered actuation and excellent controllability. However, finding the magnetization patterns or magnetic fields required to achieve the desired functions of these robots is challenging in many cases. No unified design framework has been proposed yet, and existing methods mainly rely on manual heuristics, which struggle to cope with the high complexity of the desired robotic motion. Here, we develop an intelligent method to solve the related inverse-design problems, implemented by introducing a novel simulation platform for magnetic soft robots based on Cosserat rod models and a deep reinforcement learning framework based on TD3. We demonstrate that magnetic soft robots with different magnetization patterns can learn to move without human guidance in simulations, and that effective magnetic fields can be generated autonomously and then applied directly to real magnetic soft robots in an open-loop way.

preprint, 2022, arXiv

Adaptive Structural Similarity Preserving for Unsupervised Cross Modal Hashing

Cross-modal hashing is an important approach for multimodal data management and application. Existing unsupervised cross-modal hashing algorithms mainly rely on data features in pre-trained models to mine their similarity relationships. However, their optimization objectives are based on the static metric between the original uni-modal features, without further exploring data correlations during the training. In addition, most of them mainly focus on association mining and alignment among pairwise instances in continuous space but ignore the latent structural correlations contained in the semantic hashing space. In this paper, we propose an unsupervised hash learning framework, namely Adaptive Structural Similarity Preservation Hashing (ASSPH), to solve the above problems. Firstly, we propose an adaptive learning scheme, with limited data and training batches, to enrich semantic correlations of unlabeled instances during the training process and meanwhile to ensure a smooth convergence of the training process. Secondly, we present an asymmetric structural semantic representation learning scheme. We introduce structural semantic metrics based on graph adjacency relations during the semantic reconstruction and correlation mining stage and meanwhile align the structure semantics in the hash space with an asymmetric binary optimization process. Finally, we conduct extensive experiments to validate the enhancements of our work in comparison with existing works.

preprint, 2022, arXiv

An inequality regarding non-radiative linear waves via a geometric method

In this work we consider the operator \[ (\mathbf{T} G) (x)= \int_{\mathbb{S}^2} G(x\cdot \omega, \omega) d\omega, \quad x\in \mathbb{R}^3, \; G\in L^2(\mathbb{R}\times \mathbb{S}^2). \] This is the adjoint operator of the Radon transform. Using a geometric method, we give an optimal $L^6$ decay estimate of $\mathbf{T} G$ near infinity when the function $G$ is compactly supported. As an application, we give a decay estimate for non-radiative solutions to the 3D linear wave equation in the exterior region $\{(x,t)\in \mathbb{R}^3 \times \mathbb{R}: |x|>R+|t|\}$. This kind of decay estimate is useful in the channel of energy method for wave equations.
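
The adjoint relation stated above can be checked with a short formal computation. The sketch below assumes the Radon transform is normalized as $(\mathcal{R} f)(s,\omega) = \int_{x\cdot\omega = s} f(x)\, dS(x)$ (notation introduced here, not taken from the abstract) and that $f$ and $G$ are regular enough for Fubini's theorem to apply.

```latex
\begin{aligned}
\int_{\mathbb{S}^2}\int_{\mathbb{R}} (\mathcal{R} f)(s,\omega)\, G(s,\omega)\, ds\, d\omega
&= \int_{\mathbb{S}^2}\int_{\mathbb{R}} \int_{x\cdot\omega = s} f(x)\, G(s,\omega)\, dS(x)\, ds\, d\omega \\
&= \int_{\mathbb{S}^2}\int_{\mathbb{R}^3} f(x)\, G(x\cdot\omega,\omega)\, dx\, d\omega \\
&= \int_{\mathbb{R}^3} f(x)\, (\mathbf{T} G)(x)\, dx .
\end{aligned}
```

The second line uses the fact that, for a fixed unit vector $\omega$, integrating over the planes $x\cdot\omega = s$ and then over $s$ covers $\mathbb{R}^3$ exactly once.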

preprint, 2022, arXiv

Asymptotic behaviour of non-radiative solution to the wave equations

In this work we consider weakly non-radiative solutions to both linear and non-linear wave equations. We first characterize all weakly non-radiative free waves, without the radial assumption. Then in dimension 3 we show that the initial data of non-radiative solutions to a wide range of nonlinear wave equations are similar to those of non-radiative free waves in terms of asymptotic behaviour.

preprint, 2022, arXiv

Automatic Relation-aware Graph Network Proliferation

Graph neural architecture search has sparked much attention as Graph Neural Networks (GNNs) have shown powerful reasoning capability in many relational tasks. However, the currently used graph search space overemphasizes learning node features and neglects mining hierarchical relational information. Moreover, due to diverse mechanisms in the message passing, the graph search space is much larger than that of CNNs. This hinders the straightforward application of classical search strategies for exploring complicated graph search space. We propose Automatic Relation-aware Graph Network Proliferation (ARGNP) for efficiently searching GNNs with a relation-guided message passing mechanism. Specifically, we first devise a novel dual relation-aware graph search space that comprises both node and relation learning operations. These operations can extract hierarchical node/relational information and provide anisotropic guidance for message passing on a graph. Second, analogous to cell proliferation, we design a network proliferation search paradigm to progressively determine the GNN architectures by iteratively performing network division and differentiation. The experiments on six datasets for four graph learning tasks demonstrate that GNNs produced by our method are superior to the current state-of-the-art hand-crafted and search-based GNNs. Codes are available at https://github.com/phython96/ARGNP.

preprint, 2022, arXiv

Debiased Batch Normalization via Gaussian Process for Generalizable Person Re-Identification

Generalizable person re-identification aims to learn a model from only several labeled source domains that can perform well on unseen domains. Without access to the unseen domain, the feature statistics of the batch normalization (BN) layers learned from a limited number of source domains are inevitably biased for the unseen domain. This misleads feature representation learning for the unseen domain and degrades the generalization ability of the model. In this paper, we propose a novel Debiased Batch Normalization via Gaussian Process approach (GDNorm) for generalizable person re-identification, which models the feature statistic estimation from BN layers as a dynamically self-refining Gaussian process to alleviate the bias toward unseen domains and improve generalization. Specifically, we establish a lightweight model with multiple sets of domain-specific BN layers to capture the discriminability of each source domain, and learn the corresponding parameters of the domain-specific BN layers. These parameters of different source domains are employed to deduce a Gaussian process. We randomly sample several paths from this Gaussian process to serve as BN estimations of potential new domains outside the existing source domains; these samples further optimize the parameters learned from the source domains, which in turn yield a more accurate Gaussian process that tends toward the real data distribution. Even without a large number of source domains, GDNorm can still provide debiased BN estimation by using the mean path of the Gaussian process, while maintaining low computational cost during testing. Extensive experiments demonstrate that GDNorm effectively improves the generalization ability of the model on unseen domains.

preprint, 2022, arXiv

DSPNet: Towards Slimmable Pretrained Networks based on Discriminative Self-supervised Learning

Self-supervised learning (SSL) has achieved promising downstream performance. However, when facing various resource budgets in real-world applications, pretraining multiple networks of various sizes one by one incurs a huge computational burden. In this paper, we propose Discriminative-SSL-based Slimmable Pretrained Networks (DSPNet), which can be trained once and then slimmed to multiple sub-networks of various sizes, each of which learns a good representation and can serve as a good initialization for downstream tasks with various resource budgets. Specifically, we extend the idea of slimmable networks to a discriminative SSL paradigm by integrating SSL and knowledge distillation. Under the linear evaluation and semi-supervised evaluation protocols on ImageNet, DSPNet shows comparable or improved performance relative to networks individually pretrained one by one, while substantially reducing training cost. The pretrained models also generalize well on downstream detection and segmentation tasks. Code will be made public.

preprint, 2022, arXiv

Energy and Spectrum Efficient Federated Learning via High-Precision Over-the-Air Computation

Federated learning (FL) enables mobile devices to collaboratively learn a shared prediction model while keeping data local. However, there are two major research challenges in practically deploying FL over mobile devices: (i) frequent wireless updates of very large gradients vs. limited spectrum resources, and (ii) energy-hungry FL communication and local computing during training vs. battery-constrained mobile devices. To address these challenges, in this paper, we propose a novel multi-bit over-the-air computation (M-AirComp) approach for spectrum-efficient aggregation of local model updates in FL and further present an energy-efficient FL design for mobile devices. Specifically, a high-precision digital modulation scheme is designed and incorporated in M-AirComp, allowing mobile devices to upload model updates at the selected positions simultaneously in the multi-access channel. Moreover, we theoretically analyze the convergence property of our FL algorithm. Guided by the FL convergence analysis, we formulate a joint transmission probability and local computing control optimization, aiming to minimize the overall energy consumption (i.e., iterative local computing + multi-round communications) of mobile devices in FL. Extensive simulation results show that our proposed scheme outperforms existing ones in terms of spectrum utilization, energy efficiency, and learning accuracy.

preprint, 2022, arXiv

Energy Efficient Federated Learning over Heterogeneous Mobile Devices via Joint Design of Weight Quantization and Wireless Transmission

Federated learning (FL) is a popular collaborative distributed machine learning paradigm across mobile devices. However, practical FL over resource constrained mobile devices confronts multiple challenges, e.g., the local on-device training and model updates in FL are power hungry and radio resource intensive for mobile devices. To address these challenges, in this paper, we attempt to take FL into the design of future wireless networks and develop a novel joint design of wireless transmission and weight quantization for energy efficient FL over mobile devices. Specifically, we develop flexible weight quantization schemes to facilitate on-device local training over heterogeneous mobile devices. Based on the observation that the energy consumption of local computing is comparable to that of model updates, we formulate the energy efficient FL problem into a mixed-integer programming problem where the quantization and spectrum resource allocation strategies are jointly determined for heterogeneous mobile devices to minimize the overall FL energy consumption (computation + transmissions) while guaranteeing model performance and training latency. Since the optimization variables of the problem are strongly coupled, an efficient iterative algorithm is proposed, where the bandwidth allocation and weight quantization levels are derived. Extensive simulations are conducted to verify the effectiveness of the proposed scheme.
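
To make the notion of a weight quantization level concrete, the following sketch shows a plain uniform quantizer. The abstract does not describe the paper's actual quantization scheme, so this is an assumption chosen for illustration: `quantize_uniform` is a hypothetical helper that maps each weight to the nearest of 2^bits evenly spaced values between the minimum and maximum weight.

```python
def quantize_uniform(weights, bits):
    """Round each weight to the nearest of 2**bits evenly spaced levels.

    Illustrative uniform quantizer only; not the scheme from the paper.
    """
    lo, hi = min(weights), max(weights)
    if hi == lo:
        # All weights are identical, so there is nothing to quantize.
        return list(weights)
    step = (hi - lo) / (2 ** bits - 1)
    return [lo + round((w - lo) / step) * step for w in weights]


weights = [0.0, 0.26, 0.6, 0.9, 1.0]
quantized = quantize_uniform(weights, bits=2)
step = (max(weights) - min(weights)) / (2 ** 2 - 1)
```

Fewer bits mean fewer distinct values to store and transmit, at the price of a per-weight rounding error of at most half a step. That is the kind of trade-off a joint design of quantization and bandwidth allocation would have to balance.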

preprint, 2022, arXiv

Entity-enhanced Adaptive Reconstruction Network for Weakly Supervised Referring Expression Grounding

Weakly supervised Referring Expression Grounding (REG) aims to ground a particular target in an image described by a language expression while lacking the correspondence between target and expression. Two main problems exist in weakly supervised REG. First, the lack of region-level annotations introduces ambiguities between proposals and queries. Second, most previous weakly supervised REG methods ignore the discriminative location and context of the referent, causing difficulties in distinguishing the target from other same-category objects. To address the above challenges, we design an entity-enhanced adaptive reconstruction network (EARN). Specifically, EARN includes three modules: entity enhancement, adaptive grounding, and collaborative reconstruction. In entity enhancement, we calculate semantic similarity as supervision to select the candidate proposals. Adaptive grounding calculates the ranking score of candidate proposals upon subject, location and context with hierarchical attention. Collaborative reconstruction measures the ranking result from three perspectives: adaptive reconstruction, language reconstruction and attribute classification. The adaptive mechanism helps to alleviate the variance of different referring expressions. Experiments on five datasets show EARN outperforms existing state-of-the-art methods. Qualitative results demonstrate that the proposed EARN can better handle the situation where multiple objects of a particular category are situated together.

preprint, 2022, arXiv

EpiGNN: Exploring Spatial Transmission with Graph Neural Network for Regional Epidemic Forecasting

Epidemic forecasting is the key to effective control of epidemic transmission and helps the world mitigate the crisis that threatens public health. To better understand the transmission and evolution of epidemics, we propose EpiGNN, a graph neural network-based model for epidemic forecasting. Specifically, we design a transmission risk encoding module to characterize local and global spatial effects of regions in epidemic processes and incorporate them into the model. Meanwhile, we develop a Region-Aware Graph Learner (RAGL) that takes transmission risk, geographical dependencies, and temporal information into account to better explore spatial-temporal dependencies and makes regions aware of related regions' epidemic situations. The RAGL can also combine with external resources, such as human mobility, to further improve prediction performance. Comprehensive experiments on five real-world epidemic-related datasets (including influenza and COVID-19) demonstrate the effectiveness of our proposed method and show that EpiGNN outperforms state-of-the-art baselines by 9.48% in RMSE.

preprint, 2022, arXiv

FaceVerse: a Fine-grained and Detail-controllable 3D Face Morphable Model from a Hybrid Dataset

We present FaceVerse, a fine-grained 3D Neural Face Model, which is built from hybrid East Asian face datasets containing 60K fused RGB-D images and 2K high-fidelity 3D head scan models. A novel coarse-to-fine structure is proposed to take better advantage of our hybrid dataset. In the coarse module, we generate a base parametric model from large-scale RGB-D images, which is able to predict accurate rough 3D face models in different genders, ages, etc. Then in the fine module, a conditional StyleGAN architecture trained with high-fidelity scan models is introduced to enrich elaborate facial geometric and texture details. Note that different from previous methods, our base and detailed modules are both changeable, which enables an innovative application of adjusting both the basic attributes and the facial details of 3D face models. Furthermore, we propose a single-image fitting framework based on differentiable rendering. Rich experiments show that our method outperforms the state-of-the-art methods.

preprint, 2022, arXiv

Few Shot Generative Model Adaption via Relaxed Spatial Structural Alignment

Training a generative adversarial network (GAN) with limited data has been a challenging task. A feasible solution is to start with a GAN well-trained on a large-scale source domain and adapt it to the target domain with a few samples, termed few shot generative model adaption. However, existing methods are prone to model overfitting and collapse in the extremely few shot setting (fewer than 10 samples). To solve this problem, we propose a relaxed spatial structural alignment method to calibrate the target generative models during the adaption. We design a cross-domain spatial structural consistency loss comprising the self-correlation and disturbance correlation consistency losses. It helps align the spatial structural information between the synthesized image pairs of the source and target domains. To relax the cross-domain alignment, we compress the original latent space of generative models to a subspace. Image pairs generated from the subspace are pulled closer. Qualitative and quantitative experiments show that our method consistently surpasses the state-of-the-art methods in the few shot setting.

preprint, 2022, arXiv

Fingerprint of the Interbond Electron Hopping in Second-Order Harmonic Generation

We experimentally explore the fingerprint of the microscopic electron dynamics in second-order harmonic generation (SHG). It is shown that the interbond electron hopping induces a novel source of nonlinear polarization and plays an important role even when the driving laser intensity is 2 orders of magnitude lower than the characteristic atomic field. Our model predicts anomalous anisotropic structures of the SHG yield contributed by the interbond electron hopping, which is identified in our experiments with ZnO crystals. Moreover, a generalized second-order susceptibility with an explicit form is proposed, which provides a unified description in both the weak and strong field regimes. Our work reveals the nonlinear responses of materials at the electron scale and extends the nonlinear optics to a previously unexplored regime, where the nonlinearity related to the interbond electron hopping becomes dominant. It paves the way for realizing controllable nonlinearity on an ultrafast time scale.

preprint, 2022, arXiv

Impact of tensor force on quantum shell effects in quasifission reactions

Quantum shell effects drive many aspects of many-body quantal systems and their interactions. Among these are the quasifission reactions that impede the formation of a compound nucleus in superheavy element (SHE) searches. Fragment production in quasifission is influenced by shell effects as a nontrivial manifestation of microscopic dynamics hindering the full equilibration of the composite system to form the compound nucleus. In this Letter, we use the microscopic time-dependent Hartree-Fock (TDHF) theory to study 48Ca+249Bk collisions to investigate the influence of the tensor component of the effective nucleon-nucleon interaction. The results show that the inclusion of the tensor force causes the spherical shell effect to become more prominent, particularly for the neutron number yield whose peak is exactly at magic number N = 126. This suggests that the tensor force plays a compelling role in the evolution of dynamical shell effects in nuclear reactions, influencing the competition between spherical and deformed shell gaps.

preprint, 2022, arXiv

IR-GAN: Image Manipulation with Linguistic Instruction by Increment Reasoning

Conditional image generation is an active research topic that includes text2image and image translation. Recently, image manipulation with linguistic instruction has brought new challenges for multimodal conditional generation. However, traditional conditional image generation models mainly focus on generating high-quality and visually realistic images, and fail to resolve the partial consistency between image and instruction. To address this issue, we propose an Increment Reasoning Generative Adversarial Network (IR-GAN), which aims to reason about the consistency between the visual increment in images and the semantic increment in instructions. First, we introduce word-level and instruction-level instruction encoders to learn the user's intention from history-correlated instructions as the semantic increment. Second, we embed the representation of the semantic increment into that of the source image to generate the target image, where the source image serves as an auxiliary reference. Finally, we propose a reasoning discriminator to measure the consistency between the visual increment and the semantic increment, which refines the user's intention and ensures the logical coherence of the generated target image. Extensive experiments and visualizations on two datasets show the effectiveness of IR-GAN.

preprint, 2022, arXiv

Is magnetically dominated outflow required to explain GRBs?

The composition of relativistic outflows producing gamma-ray bursts is a long-standing open question. One of the main arguments in favor of magnetically dominated outflows is the absence of a photospheric component in their broadband time-resolved spectra, a notable example being GRB 080916C. Here, we perform a time-resolved analysis of this burst and confirm the previous detection of an additional spectral component. We show that this subdominant component is consistent with the photosphere of an ultrarelativistic baryonic outflow, deep in the coasting regime. We argue that, contrary to previous statements, magnetic dominance of the outflow is not required for the interpretation of this GRB. Moreover, simultaneous detection of high-energy emission in its prompt phase requires departure from a one-zone emission model.

preprint, 2022, arXiv

Local Sample-weighted Multiple Kernel Clustering with Consensus Discriminative Graph

Multiple kernel clustering (MKC) is committed to achieving optimal information fusion from a set of base kernels. Constructing precise and local kernel matrices has proved to be of vital significance in applications, since unreliable similarity estimation between distant samples degrades clustering performance. Although existing localized MKC algorithms exhibit improved performance compared to globally designed competitors, most of them adopt a KNN mechanism to localize the kernel matrix by accounting for the τ-nearest neighbors. However, such a coarse approach treats the ranking importance of different neighbors as equal, which is impractical in applications. To alleviate such problems, this paper proposes a novel local sample-weighted multiple kernel clustering (LSWMKC) model. We first construct a consensus discriminative affinity graph in kernel space, revealing the latent local structures. Further, an optimal neighborhood kernel for the learned affinity graph is output with a naturally sparse property and a clear block diagonal structure. Moreover, LSWMKC implicitly optimizes adaptive weights on different neighbors with corresponding samples. Experimental results demonstrate that our LSWMKC possesses better local manifold representation and outperforms existing kernel or graph-based clustering algorithms. The source code of LSWMKC can be publicly accessed from https://github.com/liliangnudt/LSWMKC.
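
The KNN-style localization that the abstract criticizes can be sketched as follows. This is an illustration of the baseline idea only, not of LSWMKC: the hypothetical helper `knn_localize` keeps the τ largest similarities in each row of a kernel matrix, unchanged and therefore with equal standing, and zeroes out the rest.

```python
def knn_localize(kernel, tau):
    """Keep the tau largest entries in each row of a kernel matrix, zero the rest.

    Illustrative baseline; each kept neighbor retains its raw similarity and
    no per-neighbor weight is learned.
    """
    localized = []
    for row in kernel:
        # Indices of the tau most similar samples (self-similarity included).
        ranked = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        keep = set(ranked[:tau])
        localized.append([v if j in keep else 0.0 for j, v in enumerate(row)])
    return localized


kernel = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.3],
    [0.1, 0.3, 1.0],
]
local_kernel = knn_localize(kernel, tau=2)
```

The hard cut-off at τ discards all similarities beyond the τ-th neighbor, and the result is generally not symmetric. A sample-weighted approach, as described in the abstract, instead learns how much each neighbor should count.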

preprint, 2022, arXiv

Modality-Adaptive Mixup and Invariant Decomposition for RGB-Infrared Person Re-Identification

RGB-infrared person re-identification is an emerging cross-modality re-identification task, which is very challenging due to the significant modality discrepancy between RGB and infrared images. In this work, we propose a novel modality-adaptive mixup and invariant decomposition (MID) approach for RGB-infrared person re-identification, aimed at learning modality-invariant and discriminative representations. MID designs a modality-adaptive mixup scheme to generate suitable mixed-modality images between RGB and infrared images, mitigating the inherent modality discrepancy at the pixel level. It formulates the modality mixup procedure as a Markov decision process, where an actor-critic agent learns a dynamic, local linear interpolation policy between different regions of cross-modality images under a deep reinforcement learning framework. Such a policy guarantees modality invariance in a more continuous latent space and avoids manifold intrusion by corrupted mixed-modality samples. Moreover, to further counter the modality discrepancy and enforce invariant visual semantics at the feature level, MID employs modality-adaptive convolution decomposition to disassemble a regular convolution layer into modality-specific basis layers and a modality-shared coefficient layer. Extensive experimental results on two challenging benchmarks demonstrate the superior performance of MID over state-of-the-art methods.
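
The local linear interpolation mentioned above can be pictured with a small sketch. It assumes each image has already been split into matching regions, each stored as a flat list of pixel values, and that a mixing ratio per region is given. In the paper these ratios would come from the learned actor-critic policy; here they are fixed by hand, and `mixup_regions` is a hypothetical name.

```python
def mixup_regions(rgb_regions, ir_regions, ratios):
    """Blend matching regions of two images with one mixing ratio per region.

    ratios[i] is the weight given to the RGB region; the infrared region
    receives 1 - ratios[i]. Illustrative only; ratios are not learned here.
    """
    return [
        [lam * a + (1.0 - lam) * b for a, b in zip(rgb, ir)]
        for rgb, ir, lam in zip(rgb_regions, ir_regions, ratios)
    ]


rgb = [[1.0, 1.0], [1.0, 1.0]]   # two regions, two pixels each
ir = [[0.0, 0.0], [0.0, 0.0]]
mixed = mixup_regions(rgb, ir, ratios=[0.25, 1.0])
```

With these inputs the first region becomes a 25/75 blend and the second stays pure RGB, which shows how a region-wise policy can mix the two modalities unevenly across an image.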

preprint, 2022, arXiv

Model order reduction for parameterized electromagnetic problems using matrix decomposition and deep neural networks

A non-intrusive model order reduction (MOR) method for solving parameterized electromagnetic scattering problems is proposed in this paper. A database collecting snapshots of high-fidelity solutions is built by solving the parameterized time-domain Maxwell equations for some values of the material parameters using a fullwave solver based on a high order discontinuous Galerkin time-domain (DGTD) method. To perform a prior dimensionality reduction, a set of reduced basis (RB) functions are extracted from the database via a two-step proper orthogonal decomposition (POD) method. Projection coefficients of the reduced basis functions are further compressed through a convolutional autoencoder (CAE) network. Singular value decomposition (SVD) is then used to extract the principal components of the reduced-order matrices generated by CAE, and a cubic spline interpolation-based (CSI) approach is employed for approximating the dominating time- and parameter-modes of the reduced-order matrices. The generation of the reduced basis and the training of the CAE and CSI are accomplished in the offline stage, thus the RB solution for given time/parameter values can be quickly recovered via outputs of the interpolation model and decoder network. In particular, the offline and online stages of the proposed RB method are completely decoupled, which ensures the validity of the method. The performance of the proposed CAE-CSI ROM is illustrated with numerical experiments for scattering of a plane wave by a 2-D dielectric disk and a multi-layer heterogeneous medium.

preprint, 2022, arXiv

Multiple Kernel Clustering with Dual Noise Minimization

Clustering is a representative unsupervised method widely applied in multi-modal and multi-view scenarios. Multiple kernel clustering (MKC) aims to group data by integrating complementary information from base kernels. As a representative approach, late fusion MKC first decomposes the kernels into orthogonal partition matrices, then learns a consensus one from them, and has recently achieved promising performance. However, these methods fail to consider the noise inside the partition matrix, preventing further improvement of clustering performance. We discover that the noise can be separated into dual parts, i.e., N-noise and C-noise (null space noise and column space noise). In this paper, we rigorously define dual noise and propose a novel parameter-free MKC algorithm that minimizes both parts. To solve the resultant optimization problem, we design an efficient two-step iterative strategy. To the best of our knowledge, this is the first investigation of dual noise within the partition in the kernel space. We observe that dual noise pollutes the block diagonal structures and degrades clustering performance, and that C-noise is more destructive than N-noise. Owing to our efficient mechanism for minimizing dual noise, the proposed algorithm surpasses recent methods by large margins.

preprint, 2022, arXiv

Nanoscale three-dimensional magnetic sensing with a probabilistic nanomagnet driven by spin-orbit torque

Detection of vector magnetic fields at nanoscale dimensions is critical in applications ranging from basic materials science to medical diagnostics. Meanwhile, all-electric operation is of great significance for achieving a simple and compact sensing system. Here, we propose and experimentally demonstrate a simple approach to sensing a vector magnetic field at nanoscale dimensions, by monitoring a probabilistic nanomagnet's transition probability from a metastable state, excited by a driving current via spin-orbit torque (SOT), to a settled state. We achieve sensitivities for Hx, Hy, and Hz of 1.02%/Oe, 1.09%/Oe and 3.43%/Oe, respectively, with a 200 x 200 nm^2 nanomagnet. The minimum detectable field depends on the number of driving pulse events N, and is expected to be as low as 1 μT if N = 3 x 10^6.
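
The dependence of the minimum detectable field on N can be illustrated with a back-of-the-envelope estimate. The model below is an assumption made here, not something stated in the abstract: the field is taken to be inferred from a switching probability estimated over N pulses, so the resolution is limited by binomial counting noise, and the probability is taken to sit near 0.5. The helper name `min_detectable_field_ut` is hypothetical.

```python
import math


def min_detectable_field_ut(sensitivity_pct_per_oe, num_pulses, p=0.5):
    """Estimate the smallest resolvable field in microtesla.

    Assumes binomial counting noise on a probability estimated from
    num_pulses independent pulse events (an assumption, not the paper's model).
    """
    # Standard error of the estimated probability, expressed in percent.
    sigma_pct = 100.0 * math.sqrt(p * (1.0 - p) / num_pulses)
    field_oe = sigma_pct / sensitivity_pct_per_oe
    # In free space, 1 Oe corresponds to 100 microtesla.
    return field_oe * 100.0


hz_limit = min_detectable_field_ut(3.43, 3_000_000)
hx_limit = min_detectable_field_ut(1.02, 3_000_000)
```

Under these assumptions the Hz estimate is about 0.84 μT, the same order as the 1 μT quoted above, while the less sensitive Hx axis gives roughly 2.8 μT. The estimate scales as 1/sqrt(N), so 100 times more pulse events improve the resolution tenfold.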

preprint2022arXiv

Open-Vocabulary One-Stage Detection with Hierarchical Visual-Language Knowledge Distillation

Open-vocabulary object detection aims to detect novel object categories beyond the training set. The advanced open-vocabulary two-stage detectors employ instance-level visual-to-visual knowledge distillation to align the visual space of the detector with the semantic space of the Pre-trained Visual-Language Model (PVLM). However, in the more efficient one-stage detector, the absence of class-agnostic object proposals hinders the knowledge distillation on unseen objects, leading to severe performance degradation. In this paper, we propose a hierarchical visual-language knowledge distillation method, i.e., HierKD, for open-vocabulary one-stage detection. Specifically, a global-level knowledge distillation is explored to transfer the knowledge of unseen categories from the PVLM to the detector. Moreover, we combine the proposed global-level knowledge distillation and the common instance-level knowledge distillation to learn the knowledge of seen and unseen categories simultaneously. Extensive experiments on MS-COCO show that our method significantly surpasses the previous best one-stage detector with 11.9\% and 6.7\% $AP_{50}$ gains under the zero-shot detection and generalized zero-shot detection settings, and reduces the $AP_{50}$ performance gap from 14\% to 7.3\% compared to the best two-stage detector.

preprint2022arXiv

Quantum photonics in triangular-cross-section nanodevices in silicon carbide

Silicon carbide is evolving as a prominent solid-state platform for the realization of quantum information processing hardware. Angle-etched nanodevices are emerging as a solution to photonic integration in bulk substrates where color centers are best defined. We model triangular cross-section waveguides and photonic crystal cavities using Finite-Difference Time-Domain and Finite-Difference Eigensolver approaches. We analyze optimal color center positioning within the modes of these devices and provide estimates on achievable Purcell enhancement in nanocavities with applications in quantum communications. Using open quantum system modeling, we explore emitter-cavity interactions of multiple non-identical color centers coupled to both a single cavity and a photonic crystal molecule in SiC. We observe polariton and subradiant state formation in the cavity-protected regime of cavity quantum electrodynamics applicable in quantum simulation.

preprint2022arXiv

Radiation fields and non-radiative solutions to the energy sub-critical wave equations

Radiation fields and the channel of energy method have become important tools in the study of nonlinear wave equations in recent years. In this work we give the basic theory of radiation fields of free waves in the energy sub-critical case. We also show that the asymptotic behaviours of non-radiative solutions to a wide range of nonlinear wave equations resemble those of non-radiative free waves. Our theory is formulated entirely in the critical Sobolev spaces of the corresponding nonlinear wave equation and avoids any assumption on the energy of the solutions.

preprint2022arXiv

Recursive Least-Squares Estimator-Aided Online Learning for Visual Tracking

Tracking visual objects from a single initial exemplar in the testing phase has been broadly cast as a one-/few-shot problem, i.e., one-shot learning for initial adaptation and few-shot learning for online adaptation. The recent few-shot online adaptation methods incorporate the prior knowledge from large amounts of annotated training data via complex meta-learning optimization in the offline phase. This helps the online deep trackers to achieve fast adaptation and reduce overfitting risk in tracking. In this paper, we propose a simple yet effective recursive least-squares estimator-aided online learning approach for few-shot online adaptation without requiring offline training. It allows an in-built memory retention mechanism for the model to remember the knowledge about the object seen before, and thus the seen data can be safely removed from training. This also bears certain similarities to the emerging continual learning field in preventing catastrophic forgetting. This mechanism enables us to unveil the power of modern online deep trackers without incurring too much extra computational cost. We evaluate our approach based on two networks in the online learning families for tracking, i.e., multi-layer perceptrons in RT-MDNet and convolutional neural networks in DiMP. The consistent improvements on several challenging tracking benchmarks demonstrate its effectiveness and efficiency.
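
The memory-retention property mentioned above can be illustrated with the textbook recursive least-squares (RLS) update. This is a minimal scalar sketch, not the tracker described in the abstract: the model y ≈ w·x, the function name `rls_update`, the initial values and the sample data are all hypothetical.

```python
def rls_update(w, p, x, y, forgetting=1.0):
    """One recursive least-squares step for the scalar model y ~ w * x.

    w is the current weight estimate and p the current inverse-covariance
    term; together they summarize all samples seen so far.
    """
    gain = p * x / (forgetting + x * p * x)
    w_new = w + gain * (y - w * x)          # correct by the prediction error
    p_new = (p - gain * x * p) / forgetting  # shrink uncertainty
    return w_new, p_new


# Hypothetical stream generated by y = 3 * x; each sample is used once.
w, p = 0.0, 1e6  # a large initial p means "no prior knowledge"
for x, y in [(1.0, 3.0), (2.0, 6.0), (-1.5, -4.5), (0.5, 1.5)]:
    w, p = rls_update(w, p, x, y)

print(round(w, 3))
```

Only `w` and `p` are carried between steps, so earlier samples can be discarded once they have been processed. This is the sense in which an RLS-style estimator retains memory without storing past data.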

preprint2022arXiv

Relativistic Effects and GRB Polarization in Power-Law Evolution

Despite decades of polarization observations, and although high-significance polarized $γ$-ray, X-ray, optical, and radio emissions in gamma-ray bursts (GRBs) have been accumulating in dozens of cases, a consistent scenario for understanding the globally observed timing properties of GRB polarization has yet to be found. Here, we report that the observed properties of GRB polarization exhibit a four-segment timing evolution at the cosmological distance: (I) an initial hump early on (within the first few seconds); (II) a subsequent power-law decay (from $\sim$10$^{1}$ to $\sim$10$^{4}$ s), which takes the form of $π_{\rm obs} \propto t^{-0.50 \pm 0.02}$; (III) afterwards a late-time rebrightening hump (from $\sim$10$^{4}$ to $\sim$10$^{5}$ s); and (IV) finally a flattening power-law decay (from $\sim$ 10$^{5}$ to $\sim$ 10$^{7}$ s), with the form of $π_{\rm obs} \propto t^{-0.21 \pm 0.08}$. These findings may present a challenge to mainstream polarization models that assume that the polarization time evolution changes in different emission regions. We show that these results can be explained by relativistic and geometric effects of a highly relativistic and magnetized jet generated by the central engine, and "magnetic patches" distributed in a globally random but locally coherent form. Our analysis suggests that there is a single dominant mechanism that might account for the global observational properties of GRB polarization, and other emission mechanisms and effects might play a role in spatially local and temporally short effects on GRB polarization.

preprint2022arXiv

Superconductivity in the nodal-line compound La$_3$Pt$_3$Bi$_4$

Owing to the specific topological states in nodal-line semimetals, novel topological superconductivity is expected to emerge in these systems. In this letter, by combination of the first-principles calculations and resistivity, susceptibility and specific heat measurements, we demonstrate that La$_3$Pt$_3$Bi$_4$ is a topologically nontrivial nodal-ring semimetal protected by the gliding-mirror symmetry even in the presence of spin-orbit coupling. Meanwhile, we discover bulk superconductivity with a transition temperature of $\sim$1.1 K, and an upper critical field of $\sim$0.41 T. These findings demonstrate that La$_3$Pt$_3$Bi$_4$ provides a material platform for studying novel superconductivity in the nodal-ring system.

preprint2022arXiv

The concentration of zero-noise limits of invariant measures for stochastic dynamical systems

In this paper, we study concentration phenomena of zero-noise limits of invariant measures for stochastic differential equations defined on $\mathbb{R}^d$ with locally Lipschitz continuous coefficients and more than one ergodic state. Under some dissipative conditions, by using Lyapunov-like functions and large deviations methods, we estimate the invariant measures in neighborhoods of stable sets, neighborhoods of unstable sets and their complement, respectively. Our result illustrates that invariant measures concentrate on the intersection of stable sets where a cost functional $W(K_i)$ is minimized and the Birkhoff center of the corresponding deterministic systems as noise tends down to zero. Furthermore, we prove the large deviations principle of invariant measures. At the end of this paper, we provide some explicit examples and their numerical simulations.

preprint2022arXiv

The white dwarf binary merger model of GRB 170817A

Following the GRB 170817A prompt emission lasting a fraction of a second, $10^8$ s of data in the X-rays, optical, and radio wavelengths have been acquired. We here present a model that fits the spectra, flux, and time variability of all these emissions, based on the thermal and synchrotron cooling of the expanding matter ejected in a binary white dwarf merger. The $10^{-3} M_\odot$ of ejecta, expanding at velocities of $10^9$ cm s$^{-1}$, are powered by the newborn massive, fast rotating, magnetized white dwarf with a mass of $1.3 M_\odot$, a rotation period of $\gtrsim 12$ s, and a dipole magnetic field $\sim 10^{10}$ G, born in the merger of a $1.0+0.8 M_\odot$ white dwarf binary. Therefore, the long-lasting mystery of the GRB 170817A nature is solved by the merger of a white dwarf binary that also explains the prompt emission energetics.

preprint2022arXiv

Unsupervised Coherent Video Cartoonization with Perceptual Motion Consistency

In recent years, creative content generation tasks such as style transfer and neural photo editing have attracted more and more attention. Among these, cartoonization of real-world scenes has promising applications in entertainment and industry. Unlike image translation, which focuses on improving the style effect of generated images, video cartoonization has additional requirements on temporal consistency. In this paper, we propose a spatially-adaptive semantic alignment framework with perceptual motion consistency for coherent video cartoonization in an unsupervised manner. The semantic alignment module is designed to restore deformation of semantic structure caused by spatial information lost in the encoder-decoder architecture. Furthermore, we devise the spatio-temporal correlative map as a style-independent, global-aware regularization on the perceptual motion consistency. Derived from similarity measurement of high-level features in photo and cartoon frames, it captures global semantic information beyond the raw pixel values in optical flow. Besides, the similarity measurement disentangles temporal relationships from domain-specific style properties, which helps regularize the temporal consistency without hurting style effects of cartoon images. Qualitative and quantitative experiments demonstrate that our method is able to generate highly stylistic and temporally consistent cartoon videos.

preprint2022arXiv

YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications

For years, the YOLO series has been the de facto industry-level standard for efficient object detection. The YOLO community has prospered overwhelmingly to enrich its use in a multitude of hardware platforms and abundant scenarios. In this technical report, we strive to push its limits to the next level, stepping forward with an unwavering mindset for industry application. Considering the diverse requirements for speed and accuracy in the real environment, we extensively examine the up-to-date object detection advancements either from industry or academia. Specifically, we heavily assimilate ideas from recent network design, training strategies, testing techniques, quantization, and optimization methods. On top of this, we integrate our thoughts and practice to build a suite of deployment-ready networks at various scales to accommodate diversified use cases. With the generous permission of YOLO authors, we name it YOLOv6. We also express our warm welcome to users and contributors for further enhancement. For a glimpse of performance, our YOLOv6-N hits 35.9% AP on the COCO dataset at a throughput of 1234 FPS on an NVIDIA Tesla T4 GPU. YOLOv6-S strikes 43.5% AP at 495 FPS, outperforming other mainstream detectors at the same scale (YOLOv5-S, YOLOX-S, and PPYOLOE-S). Our quantized version of YOLOv6-S even brings a new state-of-the-art 43.3% AP at 869 FPS. Furthermore, YOLOv6-M/L also achieves better accuracy performance (i.e., 49.5%/52.3%) than other detectors with a similar inference speed. We carefully conducted experiments to validate the effectiveness of each component. Our code is made available at https://github.com/meituan/YOLOv6.

preprint2021arXiv

Complex contraction on trees without proof of correlation decay

We prove complex contraction for zero-free regions of the counting weighted set cover problem, in which an element can appear in an unbounded number of sets, thus obtaining fully polynomial-time approximation schemes (FPTAS) via Barvinok's algorithmic paradigm \cite{barvinok2016combinatorics}. Relying on the computation tree expansion, our approach does not need a proof of correlation decay on the real axis. We directly look in the complex plane for a region that contracts into its interior as the tree recursion procedure goes from leaves to the root. For the class of problems under the framework of weighted set covers, we are able to give a general approach for describing the contraction regions and draw a unified algorithmic conclusion. Several previous results, including counting (weighted-)edge covers, counting bipartite independent sets and counting monotone CNFs can be completely or partially covered by our main theorem. In contrast to the correlation decay method, which also depends on tree expansions and needs different potential functions for different problems, our approach is more generic in the sense that our contraction regions for different problems share a common shape in the complex plane.

preprint2021arXiv

Dissecting the Energy Budget of a Gamma-Ray Burst Fireball

The jet composition and radiative efficiency of GRBs are poorly constrained from the data. If the jet composition is matter-dominated (i.e. a fireball), the GRB prompt emission spectra would include a dominant thermal component originating from the fireball photosphere, and a non-thermal component presumably originating from internal shocks whose radii are greater than the photosphere radius. We propose a method to directly dissect the GRB fireball energy budget into three components and measure their values by combining the prompt emission and early afterglow data. The measured parameters include the initial dimensionless specific enthalpy density ($η$), bulk Lorentz factors at the photosphere radius ($Γ_{\rm ph}$) and before fireball deceleration ($Γ_0$), the amount of mass loading ($M$), as well as the GRB radiative efficiency ($η_γ$). All the parameters can be derived from the data for a GRB with a dominant thermal spectral component, a deceleration bump feature in the early afterglow lightcurve, and a measured redshift. The results only weakly depend on the density $n$ of the interstellar medium when the composition ${\cal Y}$ parameter (typically unity) is specified.

preprint2021arXiv

Dynamical transition of hydromagnetic convection in a rotating fluid layer

In this article, we study the stability and dynamic transition of an electrically conducting fluid in the presence of an external uniform horizontal magnetic field and rotation based on a Boussinesq approximation model. By analyzing the spectrum of the linear part of the model and verifying the validity of the principle of exchange of stability, we take a hybrid approach combining theoretical analysis with numerical computation to study the transition from a simple real eigenvalue, a pair of complex conjugate eigenvalues and a real eigenvalue of multiplicity two, respectively. The center manifold reduction theory is applied to reduce the infinite dimensional system to the corresponding finite dimensional one together with one or several non-dimensional transition numbers that determine the dynamic transition types. Careful numerical computations are performed to determine these transition numbers as well as the related temporal and flow patterns. Our results indicate that both continuous and jump transitions can occur in certain parameter regions.

preprint2021arXiv

Magnetic field-tuned quantum criticality in optimally electron-doped cuprate thin films

Antiferromagnetic (AF) spin fluctuations are commonly believed to play a key role in electron pairing of cuprate superconductors. In electron-doped cuprates, the interplay among different electronic states in quantum perturbations, especially between superconducting and magnetic states, remains puzzling. Here, we report a systematic transport study on cation-optimized La2-xCexCuO4 (x = 0.10) thin films in high magnetic fields. We find an AF quantum phase transition near 60 T, where the Hall number jumps from nH = -x to nH = 1-x, resembling the change of nH at the AF boundary (xAF = 0.14) tuned by Ce doping. In the AF region a spin dependent state manifesting anomalous positive magnetoresistance is observed, which is closely related to superconductivity. Once the AF state is suppressed by magnetic field, a polarized ferromagnetic state is predicted, reminiscent of the recently reported ferromagnetic state at the quantum endpoint of the superconducting dome by Ce doping. The magnetic field, which drives phase transitions in a manner similar to but distinct from doping, thereby provides a unique perspective for understanding the quantum criticality of electron-doped cuprates.

preprint2021arXiv

Towards Energy Efficient Federated Learning over 5G+ Mobile Devices

The continuous convergence of machine learning algorithms, 5G and beyond (5G+) wireless communications, and artificial intelligence (AI) hardware implementation hastens the birth of federated learning (FL) over 5G+ mobile devices, which pushes AI functions to mobile devices and initiates a new era of on-device AI applications. Despite the remarkable progress made in FL, huge energy consumption is one of the most significant obstacles restricting the development of FL over battery-constrained 5G+ mobile devices. To address this issue, in this paper, we investigate how to develop energy efficient FL over 5G+ mobile devices by making a trade-off between energy consumption for "working" (i.e., local computing) and that for "talking" (i.e., wireless communications) in order to boost the overall energy efficiency. Specifically, we first examine energy consumption models for graphics processing unit (GPU) computation and wireless transmissions. Then, we overview the state of the art of integrating FL procedure with energy-efficient learning techniques (e.g., gradient sparsification, weight quantization, pruning, etc.). Finally, we present several potential future research directions for FL over 5G+ mobile devices from the perspective of energy efficiency.

preprint2020arXiv

An Update to the Letter of Intent for MATHUSLA: Search for Long-Lived Particles at the HL-LHC

We report on recent progress in the design of the proposed MATHUSLA Long Lived Particle (LLP) detector for the HL-LHC, updating the information in the original Letter of Intent (LoI), see CDS:LHCC-I-031, arXiv:1811.00927. A suitable site has been identified at LHC Point 5 that is closer to the CMS Interaction Point (IP) than assumed in the LoI. The decay volume has been increased from 20 m to 25 m in height. Engineering studies have been made in order to locate much of the decay volume below ground, bringing the detector even closer to the IP. With these changes, a 100 m x 100 m detector has the same physics reach for large c$τ$ as the 200 m x 200 m detector described in the LoI and other studies. The performance for small c$τ$ is improved because of the proximity to the IP. Detector technology has also evolved while retaining the strip-like sensor geometry in Resistive Plate Chambers (RPC) described in the LoI. The present design uses extruded scintillator bars read out using wavelength shifting fibers and silicon photomultipliers (SiPM). Operations will be simpler and more robust with much lower operating voltages and without the use of greenhouse gases. Manufacturing is straightforward and should result in cost savings. Understanding of backgrounds has also significantly advanced, thanks to new simulation studies and measurements taken at the MATHUSLA test stand operating above ATLAS in 2018. We discuss next steps for the MATHUSLA collaboration, and identify areas where new members can make particularly important contributions.

preprint2020arXiv

Atomic origin for hydrogenation promoted bulk oxygen vacancies removal in vanadium dioxide

Oxygen vacancies (VO), a common type of point defect in metal oxide materials, play important roles in their physical and chemical properties. To obtain a stoichiometric oxide crystal, the pre-existing VO is always removed via careful post-annealing treatment at high temperature in air or an oxygen atmosphere. However, the annealing conditions are difficult to control, and the removal of VO in the bulk phase is restrained by the high energy barrier of VO migration. Here, we selected a VO2 crystal film as the model system and developed an alternative annealing treatment aided by controllable hydrogen doping, which can realize effective removal of VO defects in VO2-δ crystal at lower temperature. This finding is attributed to the hydrogenation-accelerated recovery of oxygen vacancies in VO2-δ crystal. Theoretical calculations revealed that the H-doping-induced electrons are prone to accumulate around the oxygen defects in VO2-δ film, which facilitates the diffusion of VO and thus makes it easier to remove. The methodology is expected to be applicable to other metal oxides for the control of oxygen-related point defects.

preprint2020arXiv

Delay-Aware Multi-Agent Reinforcement Learning for Cooperative and Competitive Environments

Action and observation delays are prevalent in real-world cyber-physical systems and may pose challenges in reinforcement learning design. Handling them is particularly arduous in multi-agent systems, where the delay of one agent could spread to other agents. To resolve this problem, this paper proposes a novel framework to deal with delays as well as the non-stationary training issue of multi-agent tasks with model-free deep reinforcement learning. We formally define the Delay-Aware Markov Game that incorporates the delays of all agents in the environment. To solve Delay-Aware Markov Games, we apply centralized training and decentralized execution, which allows agents to use extra information to ease the non-stationarity issue of the multi-agent systems during training, without the need for a centralized controller during execution. Experiments are conducted in multi-agent particle environments including cooperative communication, cooperative navigation, and competitive experiments. We also test the proposed algorithm in traffic scenarios that require coordination of all autonomous vehicles to show the practical value of delay-awareness. Results show that the proposed delay-aware multi-agent reinforcement learning algorithm greatly alleviates the performance degradation introduced by delay. Codes and demo videos are available at: https://github.com/baimingc/delay-aware-MARL.

preprint2020arXiv

Metamagnetic transitions and anomalous magnetoresistance in EuAg$_4$As$_2$ single crystal

In this paper, the magnetic and transport properties were systematically studied for EuAg$_4$As$_2$ single crystals, crystallizing in a centrosymmetric trigonal CaCu$_4$P$_2$ type structure. It was confirmed that two magnetic transitions occur at $\textit{T}$$_{N1}$ = 10 K and $\textit{T}$$_{N2}$ = 15 K, respectively. With increasing field, the two transitions are noticeably driven to lower temperature. At low temperatures, applying a magnetic field in the $\textit{ab}$ plane induces two successive metamagnetic transitions. For both $\textit{H}$ $\parallel$ $\textit{ab}$ and $\textit{H}$ $\parallel$ $\textit{c}$, EuAg$_4$As$_2$ shows a positive, unexpectedly large magnetoresistance (up to 202\%) at low fields below 10 K, and a large negative magnetoresistance (up to -78\%) at high fields/intermediate temperatures. Such anomalous field dependence of magnetoresistance may have potential applications in future magnetic sensors. Finally, the magnetic phase diagrams of EuAg$_{4}$As$_{2}$ were constructed for both $\textit{H}$ $\parallel$ $\textit{ab}$ and $\textit{H}$ $\parallel$ $\textit{c}$.

preprint2020arXiv

Parsing-based View-aware Embedding Network for Vehicle Re-Identification

Vehicle Re-Identification aims to find images of the same vehicle from various views in the cross-camera scenario. The main challenges of this task are the large intra-instance distance caused by different views and the subtle inter-instance discrepancy caused by similar vehicles. In this paper, we propose a parsing-based view-aware embedding network (PVEN) to achieve view-aware feature alignment and enhancement for vehicle ReID. First, we introduce a parsing network to parse a vehicle into four different views, and then align the features by mask average pooling. Such alignment provides a fine-grained representation of the vehicle. Second, in order to enhance the view-aware features, we design a common-visible attention to focus on the common visible views, which not only shortens the distance among intra-instances, but also enlarges the discrepancy of inter-instances. The PVEN helps capture the stable discriminative information of a vehicle under different views. The experiments conducted on three datasets show that our model outperforms state-of-the-art methods by a large margin.

preprint2020arXiv

PrBi: Topology meets quadrupolar degrees of freedom

Novel materials incorporating electronic degrees of freedom other than charge, including spin, orbital, valley, etc., have proven to be of great interest and application potential. Recently, the multipolar degrees of freedom have attracted remarkable attention in the context of electronic correlation effects. In this work, we systematically studied the transport, magnetic and thermodynamic properties of the topological semimetal candidate PrBi in the framework of crystalline electric field theory. Our results demonstrate the $Γ_3$ non-Kramers doublet as the ground state of Pr$^{3+}$ (4$f^2$) ions. This ground state is nonmagnetic but carries a non-zero quadrupolar moment $\langle\hat{O}_2^0\rangle$. A quadrupolar phase transition is inferred below 0.08 K. No obvious quadrupolar Kondo effect can be identified. Ultrahigh-field quantum oscillation measurements confirm PrBi as a semimetal with non-trivial Berry phase and low total carrier density 0.06 /f.u. We discuss the interplay between low carrier density and $4f^2$ quadrupolar moment, and ascribe the weak quadrupolar ordering and Kondo effect to consequences of the low carrier density. PrBi, thus, opens a new window to the physics of topology and strongly correlated effects with quadrupolar degrees of freedom in the low-carrier-density limit, evoking the need for a reexamination of the Nozières exhaustion problem in the context of multi-channel Kondo effect.

preprint2020arXiv

Real-world Person Re-Identification via Degradation Invariance Learning

Person re-identification (Re-ID) in real-world scenarios usually suffers from various degradation factors, e.g., low resolution, weak illumination, blurring and adverse weather. On the one hand, these degradations lead to severe discriminative information loss, which significantly obstructs identity representation learning; on the other hand, the feature mismatch problem caused by low-level visual variations greatly reduces retrieval performance. An intuitive solution to this problem is to utilize low-level image restoration methods to improve the image quality. However, existing restoration methods cannot directly serve real-world Re-ID due to various limitations, e.g., the requirement of reference samples, the domain gap between synthesis and reality, and the incompatibility between low-level and high-level methods. In this paper, to solve the above problem, we propose a degradation invariance learning framework for real-world person Re-ID. By introducing a self-supervised disentangled representation learning strategy, our method is able to simultaneously extract identity-related robust features and remove real-world degradations without extra supervision. We use low-resolution images as the main demonstration, and experiments show that our approach is able to achieve state-of-the-art performance on several Re-ID benchmarks. In addition, our framework can be easily extended to other real-world degradation factors, such as weak illumination, with only a few modifications.

preprint2020arXiv

Secret Sharing based Secure Regressions with Applications

Nowadays, the utilization of the ever-expanding amount of data has made a huge impact on web technologies while also causing various types of security concerns. On one hand, potential gains are highly anticipated if different organizations could somehow collaboratively share their data for technological improvements. On the other hand, data security concerns may arise for both data holders and data providers due to commercial or sociological concerns. To strike a balance between technical improvements and security limitations, we implement secure and scalable protocols for multiple data holders to train linear regression and logistic regression models. We build our protocols based on the secret sharing scheme, which is scalable and efficient in applications. Moreover, our proposed paradigm can be generalized to any secure multiparty training scenario where only matrix summation and matrix multiplication are used. We demonstrate our approach with experiments that show the scalability and efficiency of our proposed protocols, and finally present its real-world applications.
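
To make the summation building block concrete, here is a minimal sketch of generic additive secret sharing over a public modulus, applied to scalars rather than matrices. It is not the protocol from the paper: the modulus, the function names (`share`, `reconstruct`, `add_shared`) and the example values are hypothetical, and real deployments add networking, fixed-point encoding and a multiplication protocol on top.

```python
import secrets

MODULUS = 2**61 - 1  # hypothetical public modulus agreed on by all parties


def share(value, n_parties):
    """Split value into n_parties random shares that sum to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def reconstruct(shares):
    """Recover the secret by summing all shares."""
    return sum(shares) % MODULUS


def add_shared(a_shares, b_shares):
    """Each party adds its own two shares locally; no communication is needed."""
    return [(a + b) % MODULUS for a, b in zip(a_shares, b_shares)]


# Two data holders with private values 1200 and 345, split across 3 parties.
a_shares = share(1200, 3)
b_shares = share(345, 3)
total = reconstruct(add_shared(a_shares, b_shares))
print(total)
```

Any subset of fewer than all shares is uniformly random and reveals nothing about the secret, yet the sum can be computed share by share. Summation is therefore the cheap operation in this kind of scheme, while multiplication requires additional interaction between parties.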

preprint2020arXiv

Secure Social Recommendation based on Secret Sharing

Nowadays, privacy-preserving machine learning has been drawing much attention in both industry and academia. Meanwhile, recommender systems have been extensively adopted by many commercial platforms (e.g. Amazon) and they are mainly built based on user-item interactions. Besides, social platforms (e.g. Facebook) have rich resources of user social information. It is well known that social information, which is rich on social platforms such as Facebook, is useful to recommender systems. Combining the social information with the user-item ratings is expected to improve the overall recommendation performance. Most existing recommendation models are built on the assumption that the social information is available. However, different platforms are usually reluctant to (or cannot) share their data due to certain concerns. In this paper, we first propose a SEcure SOcial RECommendation (SeSoRec) framework which can (1) collaboratively mine knowledge from the social platform to improve the recommendation performance of the rating platform, and (2) securely keep the raw data of both platforms. We then propose a Secret Sharing based Matrix Multiplication (SSMM) protocol to optimize SeSoRec and prove its correctness and security theoretically. By applying minibatch gradient descent, SeSoRec has linear time complexities in terms of both computation and communication. The comprehensive experimental results on three real-world datasets demonstrate the effectiveness of our proposed SeSoRec and SSMM.

preprint2020arXiv

State-Relabeling Adversarial Active Learning

Active learning aims to design label-efficient algorithms by sampling the most representative samples to be labeled by an oracle. In this paper, we propose a state relabeling adversarial active learning model (SRAAL) that leverages both the annotation and the labeled/unlabeled state information for deriving the most informative unlabeled samples. The SRAAL consists of a representation generator and a state discriminator. The generator uses the complementary annotation information with traditional reconstruction information to generate the unified representation of samples, which embeds the semantics into the whole data representation. Then, we design an online uncertainty indicator in the discriminator, which endows unlabeled samples with different importance. As a result, we can select the most informative samples based on the discriminator's predicted state. We also design an algorithm to initialize the labeled pool, which makes subsequent sampling more efficient. The experiments conducted on various datasets show that our model outperforms the previous state-of-the-art active learning methods and that our initial sampling algorithm achieves better performance.

preprint2020arXiv

Thermal Components in Gamma-ray Bursts. II. Constraining the Hybrid Jet Model

In explaining the physical origin of the jet composition of gamma-ray bursts (GRBs), a more general picture, i.e. the hybrid jet model (which introduced another magnetization parameter $σ_{0}$ on the basis of the traditional fireball model), has been well studied in Gao & Zhang. However, it has not yet been applied to a large GRB sample. Here, we first employ the "top-down" approach of Gao & Zhang to diagnose the photosphere properties at the central engine to see how the hybrid model can account for the observed data as well, by applying it to a Fermi GRB sample (eight bursts) with a detected photosphere component, as presented in Li (our Paper I). We infer all physical parameters of a hybrid problem with three typical values of the radius of the jet base ($r_{0}$ = 10$^{7}$, 10$^{8}$, and 10$^{9}$ cm). We find that the dimensionless entropy for all the bursts shows $η\gg$ 1 while the derived (1+$σ_{0}$) for five bursts (GRB 081224, GRB 110721A, GRB 090719, GRB 100707, and GRB 100724) is larger than unity, indicating that in addition to a hot fireball component, another cold Poynting-flux component may also play an important role. Our analysis also shows that in a few time bins for all $r_{0}$ in GRB 081224 and GRB 110721A, the magnetization parameter at $\sim$ 10$^{15}$ cm (1+$σ_{\rm r15}$) is greater than unity, which implies that internal-collision-induced magnetic reconnection and turbulence may be the mechanism to power the nonthermal emission, rather than internal shocks. We conclude that the majority of bursts (probably all) can be well explained by the hybrid jet model.

preprint2020arXiv

Towards Discriminability and Diversity: Batch Nuclear-norm Maximization under Label Insufficient Situations

The learning of deep networks largely relies on data with human-annotated labels. In some label-insufficient situations, the performance degrades on the decision boundary with high data density. A common solution is to directly minimize the Shannon entropy, but the side effect caused by entropy minimization, i.e., reduction of the prediction diversity, is mostly ignored. To address this issue, we reinvestigate the structure of the classification output matrix of a randomly selected data batch. We find by theoretical analysis that the prediction discriminability and diversity can be separately measured by the Frobenius norm and the rank of the batch output matrix. Moreover, the nuclear norm is an upper bound of the Frobenius norm and a convex approximation of the matrix rank. Accordingly, to improve both discriminability and diversity, we propose Batch Nuclear-norm Maximization (BNM) on the output matrix. BNM can boost learning under typical label-insufficient scenarios, such as semi-supervised learning, domain adaptation and open domain recognition. On these tasks, extensive experimental results show that BNM outperforms competitors and works well with existing well-known methods. The code is available at https://github.com/cuishuhao/BNM.
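
The relation between the three matrix quantities named in the abstract can be checked numerically. The sketch below is illustrative only and is not taken from the linked repository. The function names `softmax`, `batch_norms`, and `bnm_loss`, and the division by batch size, are assumptions.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for numerical stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def batch_norms(P):
    """Frobenius norm, nuclear norm and rank of a batch prediction matrix P."""
    fro = np.linalg.norm(P, ord="fro")
    nuc = np.linalg.norm(P, ord="nuc")  # sum of singular values
    rank = np.linalg.matrix_rank(P)
    return fro, nuc, rank


def bnm_loss(logits):
    """Negative nuclear norm of the softmax outputs, divided by batch size (assumed scaling)."""
    P = softmax(logits)
    return -np.linalg.norm(P, ord="nuc") / P.shape[0]


# Two confident batches of size 4 over 4 classes:
# one predicts a different class per sample, one collapses onto class 0.
P_diverse = np.eye(4)
P_collapsed = np.zeros((4, 4))
P_collapsed[:, 0] = 1.0
```

Both example batches are equally confident, so they have the same Frobenius norm, but the collapsed batch has rank 1 and a smaller nuclear norm. Maximizing the nuclear norm therefore favors the diverse batch, which is the effect the abstract attributes to BNM.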

preprint2020arXiv

Two CSCS-based iteration methods for solving absolute value equations

Recently, two families of HSS-based iteration methods have been constructed for solving the system of absolute value equations (AVEs), which is a class of non-differentiable NP-hard problems. In this study, we establish the Picard-CSCS iteration method and the nonlinear CSCS-like iteration method for AVEs involving a Toeplitz matrix. Then, we analyze the convergence of the Picard-CSCS iteration method for solving AVEs. By using the theory of nonsmooth analysis, we also prove the convergence of the nonlinear CSCS-like iteration solver for AVEs. The advantage of these methods is that they do not require the storage of coefficient matrices at all, and the sub-systems of linear equations can be solved efficiently via fast Fourier transforms (FFTs). Therefore, computational cost and storage can be saved in practical implementations. Numerical examples, including numerical solutions of nonlinear fractional diffusion equations, are reported to show the effectiveness of the proposed methods in comparison with some existing methods.
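
The two ingredients mentioned in the abstract, a Picard iteration for the AVE and matrix-free linear solves via FFTs, can be sketched together in the simplest setting. The sketch assumes the AVE takes the common form Ax - |x| = b and that A is purely circulant, so it omits the circulant and skew-circulant splitting itself and is not the Picard-CSCS method of the paper. The function names and the test matrix are assumptions.

```python
import numpy as np


def circulant_matvec(c, x):
    """Compute C @ x, where C is the circulant matrix whose first column is c."""
    return np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))


def circulant_solve(c, r):
    """Solve C x = r via the FFT; only the first column c is stored."""
    return np.real(np.fft.ifft(np.fft.fft(r) / np.fft.fft(c)))


def picard_ave(c, b, tol=1e-12, max_iter=200):
    """Picard iteration x_{k+1} = C^{-1}(|x_k| + b) for the AVE C x - |x| = b."""
    x = np.zeros_like(b)
    for k in range(max_iter):
        x_new = circulant_solve(c, np.abs(x) + b)
        if np.linalg.norm(x_new - x) <= tol * max(1.0, np.linalg.norm(x_new)):
            return x_new, k + 1
        x = x_new
    return x, max_iter


# Hypothetical test problem: symmetric circulant matrix with first column (4, 1, 0, ..., 0, 1).
# Its eigenvalues 4 + 2*cos(2*pi*k/n) lie in [2, 6], so the norm of its inverse is at most 1/2
# and the Picard map is a contraction.
n = 8
c = np.zeros(n)
c[0], c[1], c[-1] = 4.0, 1.0, 1.0
x_true = np.linspace(-1.0, 1.0, n)
b = circulant_matvec(c, x_true) - np.abs(x_true)
x_hat, iters = picard_ave(c, b)
```

Each step costs a few length-n FFTs and stores only the first column of the matrix, which mirrors the storage and cost argument made in the abstract.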

preprint2019arXiv

Determination of Electron Band Structure using Temporal Interferometry

We propose an all-optical method to directly reconstruct the band structure of semiconductors. Our scheme is based on a temporal Young's interferometer realized by high harmonic generation (HHG) with a few-cycle laser pulse. As a time-energy domain interferometric device, the temporal interferometer encodes the band structure into the fringes in the energy domain. The relation between the band structure and the emitted harmonic frequencies is established. This enables us to retrieve the band structure from the HHG spectrum with a single-shot measurement. Our scheme paves the way to studying matter under ambient conditions and to tracking the ultrafast modification of band structures.