Source author record

Bing Han

Bing Han appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

physics.ins-det, Computer Vision, Machine Learning, Artificial Intelligence, eess.AS, Sound, Applications, Computation and Language, cond-mat.mtrl-sci, Emerging Technologies, Information Retrieval, Neural and Evolutionary Computing, physics.chem-ph, physics.class-ph, Robotics

Catalog footprint

What is connected

16 works

15 topics

4 close collaborators

preprint, 2022, arXiv

AdaMixer: A Fast-Converging Query-Based Object Detector

Traditional object detectors employ the dense paradigm of scanning over locations and scales in an image. The recent query-based object detectors break this convention by decoding image features with a set of learnable queries. However, this paradigm still suffers from slow convergence, limited performance, and design complexity of extra networks between backbone and decoder. In this paper, we find that the key to these issues is the adaptability of decoders for casting queries to varying objects. Accordingly, we propose a fast-converging query-based detector, named AdaMixer, by improving the adaptability of query-based decoding processes in two aspects. First, each query adaptively samples features over space and scales based on estimated offsets, which allows AdaMixer to efficiently attend to the coherent regions of objects. Then, we dynamically decode these sampled features with an adaptive MLP-Mixer under the guidance of each query. Thanks to these two critical designs, AdaMixer enjoys architectural simplicity without requiring dense attentional encoders or explicit pyramid networks. On the challenging MS COCO benchmark, AdaMixer with ResNet-50 as the backbone, with 12 training epochs, reaches up to 45.0 AP on the validation set along with 27.9 AP$_s$ in detecting small objects. With the longer training scheme, AdaMixer with ResNeXt-101-DCN and Swin-S reaches 49.5 and 51.3 AP. Our work sheds light on a simple, accurate, and fast-converging architecture for query-based object detectors. The code is made available at https://github.com/MCG-NJU/AdaMixer

preprint, 2022, arXiv

Cross-Architecture Self-supervised Video Representation Learning

In this paper, we present a new cross-architecture contrastive learning (CACL) framework for self-supervised video representation learning. CACL consists of a 3D CNN and a video transformer, which are used in parallel to generate diverse positive pairs for contrastive learning. This allows the model to learn strong representations from such diverse yet meaningful pairs. Furthermore, we introduce a temporal self-supervised learning module that explicitly predicts the edit distance between two video sequences in temporal order. This enables the model to learn a rich temporal representation that strongly complements the video-level representation learned by CACL. We evaluate our method on video retrieval and action recognition on the UCF101 and HMDB51 datasets, where it achieves excellent performance, surpassing state-of-the-art methods such as VideoMoCo and MoCo+BE by a large margin. The code is made available at https://github.com/guoshengcv/CACL.
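The temporal module in this abstract is trained to predict an edit distance between two orderings of a video. As a minimal sketch of that target quantity only (not the authors' code; the function name and the representation of a video as a list of clip indices are assumptions made for illustration), the standard Levenshtein distance can be computed as follows:

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum number of single-element insertions,
    deletions, or substitutions needed to turn sequence a into sequence b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into an empty sequence takes i deletions
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


# Hypothetical example: four clips in original order vs. a shuffled order.
original_order = [0, 1, 2, 3]
shuffled_order = [0, 2, 1, 3]
label = edit_distance(original_order, shuffled_order)
print(label)
```

In a setup like the one described, such a value could serve as the regression or classification target for a pair of clip orderings; how the paper actually constructs the pairs is not stated in the abstract.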

preprint, 2022, arXiv

Multi-layer VI-GNSS Global Positioning Framework with Numerical Solution aided MAP Initialization

Motivated by the goal of achieving long-term drift-free camera pose estimation in complex scenarios, we propose a global positioning framework that fuses visual, inertial and Global Navigation Satellite System (GNSS) measurements in multiple layers. Unlike previous loosely and tightly coupled methods, the proposed multi-layer fusion allows us to correct the drift of visual odometry precisely and to maintain reliable positioning while GNSS degrades. In particular, local motion estimation is conducted in the inner layer, which addresses scale drift and inaccurate bias estimation in visual odometry by fusing GNSS velocity, Inertial Measurement Unit (IMU) pre-integration and camera measurements in a tightly coupled way. Global localization is achieved in the outer layer, where the local motion is further fused with GNSS position and course over a long-term period in a loosely coupled way. Furthermore, a dedicated initialization method is proposed to guarantee fast and accurate estimation of all state variables and parameters. We test the proposed framework exhaustively on indoor and outdoor public datasets. The mean localization error is reduced by up to 63%, with an improvement of 69% in initialization accuracy compared with state-of-the-art works. We have applied the algorithm to Augmented Reality (AR) navigation, crowd-sourced high-precision map updates and other large-scale applications.

preprint, 2022, arXiv

Poincaré Heterogeneous Graph Neural Networks for Sequential Recommendation

Sequential recommendation (SR) learns users' preferences by capturing the sequential patterns in the evolution of users' behaviors. As discussed in many works, user-item interactions in SR generally follow an intrinsic power-law distribution, which can be traced to hierarchy-like structures. Previous methods usually handle such hierarchical information by empirically partitioning users and items in Euclidean space, which may distort user-item representations in real online scenarios. In this paper, we propose a Poincaré-based heterogeneous graph neural network named PHGR to model the sequential pattern information and the hierarchical information contained in SR data simultaneously. Specifically, to capture the hierarchical information explicitly, we first construct a weighted user-item heterogeneous graph by aligning all the user-item interactions, which improves the perception domain of each user from a global view. The output of the global representation is then used to complement the local directed item-item homogeneous graph convolution. By defining a novel hyperbolic inner product operator, the global and local graph representation learning are conducted directly in the Poincaré ball instead of through the commonly used projection between the Poincaré ball and Euclidean space, which could alleviate the cumulative error of the general bidirectional translation process. Moreover, to capture the sequential dependency information explicitly, we design two types of temporal attention operations in the Poincaré ball space. Empirical evaluations on public and financial-industry datasets show that PHGR outperforms several comparison methods.
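As background for why the Poincaré ball suits hierarchy-like data, the sketch below computes the standard geodesic distance on the unit Poincaré ball, d(u, v) = arcosh(1 + 2‖u − v‖² / ((1 − ‖u‖²)(1 − ‖v‖²))). This is the textbook distance, not the hyperbolic inner product operator that the abstract introduces, and the function name is chosen here for illustration.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit ball."""
    def sq_norm(x):
        return sum(c * c for c in x)

    diff = sq_norm([a - b for a, b in zip(u, v)])
    denom = (1.0 - sq_norm(u)) * (1.0 - sq_norm(v))
    return math.acosh(1.0 + 2.0 * diff / denom)


# Points near the boundary are far apart even when their Euclidean gap is small,
# which leaves room for many "leaf" items around a few central "root" items.
near_center = poincare_distance((0.0, 0.0), (0.1, 0.0))
near_boundary = poincare_distance((0.9, 0.0), (0.9, 0.1))
print(near_center, near_boundary)
```

Both pairs are 0.1 apart in Euclidean terms, yet the pair near the boundary is several times farther apart in hyperbolic terms.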

preprint, 2022, arXiv

Self-Supervised Speaker Verification Using Dynamic Loss-Gate and Label Correction

For self-supervised speaker verification, the quality of the pseudo labels bounds the performance of the system, because a large share of those labels is unreliable. In this work, we propose dynamic loss-gate and label correction (DLG-LC) to alleviate the performance degradation caused by unreliable estimated labels. In DLG, we adopt a Gaussian Mixture Model (GMM) to dynamically model the loss distribution and use the estimated GMM to distinguish reliable from unreliable labels automatically. In addition, to make better use of the unreliable data instead of dropping them directly, we correct the unreliable labels with model predictions. Moreover, we apply the negative-pairs-free DINO framework in our experiments for further improvement. Compared to the best-known speaker verification system with self-supervised learning, our proposed DLG-LC converges faster and achieves 11.45%, 18.35% and 15.16% relative improvement on the Vox-O, Vox-E and Vox-H trials of the Voxceleb1 evaluation dataset.
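The loss-gate idea in this abstract can be illustrated with a small, self-contained sketch: fit a two-component 1-D Gaussian mixture to per-sample losses and treat samples that most likely belong to the low-loss component as reliable. Everything below is an assumption made for illustration (the function names, the EM initialization, the 0.5 posterior threshold, and the synthetic losses); it is not the authors' implementation and it omits the label-correction step.

```python
import math
import random


def component_posteriors(x, weights, means, variances):
    """Posterior probability of each mixture component given one loss value x."""
    dens = [
        w * math.exp(-((x - m) ** 2) / (2 * v)) / math.sqrt(2 * math.pi * v)
        for w, m, v in zip(weights, means, variances)
    ]
    total = sum(dens)
    if total == 0.0:  # all densities underflowed; fall back to a uniform split
        return [1.0 / len(dens)] * len(dens)
    return [d / total for d in dens]


def fit_gmm_1d(xs, iters=100):
    """Plain EM for a two-component 1-D Gaussian mixture."""
    lo, hi = min(xs), max(xs)
    means = [lo, hi]                         # start the components at the extremes
    spread = max((hi - lo) ** 2 / 4, 1e-6)
    variances = [spread, spread]
    weights = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibilities of each component for each sample
        resp = [component_posteriors(x, weights, means, variances) for x in xs]
        # M-step: re-estimate weights, means and variances
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            variances[k] = max(var, 1e-6)    # floor keeps the density finite
            weights[k] = nk / len(xs)
    return weights, means, variances


def loss_gate(losses, threshold=0.5):
    """True where a sample most likely belongs to the low-loss component."""
    weights, means, variances = fit_gmm_1d(losses)
    low = 0 if means[0] <= means[1] else 1
    return [
        component_posteriors(x, weights, means, variances)[low] > threshold
        for x in losses
    ]


# Synthetic losses: 200 "clean" samples with small loss, 50 "noisy" ones with large loss.
random.seed(0)
losses = ([random.gauss(0.2, 0.05) for _ in range(200)]
          + [random.gauss(2.0, 0.3) for _ in range(50)])
mask = loss_gate(losses)
print(sum(mask[:200]), "of 200 low-loss samples kept;",
      sum(mask[200:]), "of 50 high-loss samples kept")
```

Because the mixture is refit whenever `loss_gate` is called, the boundary between kept and gated samples moves with the loss distribution, which is one plausible reading of "dynamic" in the abstract.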

preprint, 2022, arXiv

Semi-Supervised Clustering with Contrastive Learning for Discovering New Intents

Most real-world dialogue systems rely on predefined intents and answers for QA service, so discovering potential intents from a large corpus in advance is important for building such dialogue services. Since in most scenarios only a few intents are already known and most are still waiting to be discovered, we focus on semi-supervised text clustering and aim to make the proposed method benefit from labeled samples for better overall clustering performance. In this paper, we propose Deep Contrastive Semi-supervised Clustering (DCSC), which clusters text samples in a semi-supervised way and provides grouped intents to operation staff. To make DCSC fully utilize the limited known intents, we propose a two-stage training procedure in which DCSC is trained on both labeled and unlabeled samples and achieves better text representation and clustering performance. We conduct experiments on two public datasets to compare our model with several popular methods, and the results show that DCSC achieves the best performance across all datasets and circumstances, indicating the effect of the improvements in our work.

preprint, 2022, arXiv

The SJTU System for Short-duration Speaker Verification Challenge 2021

This paper presents the SJTU system for both the text-dependent and text-independent tasks in the short-duration speaker verification (SdSV) challenge 2021. In this challenge, we explored different strong embedding extractors to extract robust speaker embeddings. For the text-independent task, language-dependent adaptive snorm is explored to improve system performance under the cross-lingual verification condition. For the text-dependent task, we mainly focus on in-domain fine-tuning strategies based on a model pre-trained on large-scale out-of-domain data. To improve the distinction between different speakers uttering the same phrase, we proposed several novel phrase-aware fine-tuning strategies and a phrase-aware neural PLDA. With such strategies, the system performance is further improved. Finally, we fused the scores of different systems, and our fusion systems achieved 0.0473 in Task1 (rank 3) and 0.0581 in Task2 (rank 8) on the primary evaluation metric.

preprint, 2020, arXiv

RMP-SNN: Residual Membrane Potential Neuron for Enabling Deeper High-Accuracy and Low-Latency Spiking Neural Network

Spiking Neural Networks (SNNs) have recently attracted significant research interest as the third generation of artificial neural networks that can enable low-power event-driven data analytics. The best performing SNNs for image recognition tasks are obtained by converting a trained Analog Neural Network (ANN), consisting of Rectified Linear Units (ReLU), to an SNN composed of integrate-and-fire neurons with "proper" firing thresholds. The converted SNNs typically incur a loss in accuracy compared to the original ANN and require a sizable number of inference time-steps to achieve the best accuracy. We find that performance degradation in the converted SNN stems from using a "hard reset" spiking neuron that is driven to a fixed reset potential once its membrane potential exceeds the firing threshold, leading to information loss during SNN inference. We propose ANN-SNN conversion using a "soft reset" spiking neuron model, referred to as the Residual Membrane Potential (RMP) spiking neuron, which retains the "residual" membrane potential above threshold at the firing instants. We demonstrate near loss-less ANN-SNN conversion using RMP neurons for VGG-16, ResNet-20, and ResNet-34 SNNs on challenging datasets including CIFAR-10 (93.63% top-1), CIFAR-100 (70.93% top-1), and ImageNet (73.09% top-1 accuracy). Our results also show that RMP-SNN surpasses the best inference accuracy provided by the converted SNN with "hard reset" spiking neurons using 2-8 times fewer inference time-steps across network architectures and datasets.
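The difference between the "hard reset" and "soft reset" neurons described in this abstract can be seen in a toy simulation of a single integrate-and-fire neuron. The sketch below is an illustration under stated assumptions, not the paper's code: the function name, the constant input, the reset potential of 0 and the `>=` firing convention are all choices made here.

```python
def count_spikes(inputs, threshold=1.0, soft_reset=True):
    """Simulate one integrate-and-fire neuron and return its spike count.

    soft_reset=True  subtracts the threshold on firing, so the residual
                     membrane potential above threshold is kept.
    soft_reset=False drives the potential to 0 on firing, so any residual
                     above threshold is discarded.
    """
    v = 0.0
    spikes = 0
    for current in inputs:
        v += current                 # integrate the input
        if v >= threshold:           # fire (assumed convention: fire at threshold)
            spikes += 1
            v = v - threshold if soft_reset else 0.0
    return spikes


# Constant input of 0.75 per step for 8 steps; an ideal rate code would fire
# on 75% of the steps, i.e. 6 times.
steps = [0.75] * 8
soft = count_spikes(steps, soft_reset=True)
hard = count_spikes(steps, soft_reset=False)
print(soft, hard)  # 6 4
```

With the soft reset, the firing rate (6/8) matches the input level of 0.75, whereas the hard reset discards the residual each time and fires only 4 times in 8 steps. That discarded residual is the information loss the abstract refers to.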

preprint, 2020, arXiv

Wettability and surface energy of parylene F

Parylenes are barrier materials employed as protective layers. However, many parylenes are unsuitable for applications under harsh conditions. A new material, parylene F, demonstrates considerable potential for a wide range of applications due to its high temperature and UV resistance. For the first time, the wettability and surface energy of parylene F were investigated to determine the feasibility of parylene F as an alternative to the commonly employed parylene C. The results show that parylene F has a hydrophobic surface with a water contact angle of 109.63 degrees. We found that 3.5 µL of probe liquid is an optimal volume for the contact angle measurement of parylene F. Moreover, we found that the Owens-Wendt-Kaelble and the Lifshitz-van der Waals/acid-base approaches are unsuitable for determining the surface energy of parylene F, whereas an approach based on the limitless liquid-solid interface wetting system is compatible. Furthermore, the results show that parylene F has a surface energy of 39.05 mJ/m². Considering the improved resistance, relatively low cost, and the desirable properties, parylene F can replace parylene C for applications under harsh conditions.

preprint, 2016, arXiv

Probabilistic Deep Spiking Neural Systems Enabled by Magnetic Tunnel Junction

Deep Spiking Neural Networks are becoming increasingly powerful tools for cognitive computing platforms. However, most of the existing literature on such computing models is developed with limited insight into the underlying hardware implementation, resulting in area- and power-expensive designs. Although several neuromimetic devices emulating neural operations have been proposed recently, their functionality has been limited to very simple neural models that may prove to be inefficient at complex recognition tasks. In this work, we venture into the relatively unexplored area of utilizing the inherent device stochasticity of such neuromimetic devices to model complex neural functionalities in a probabilistic framework in the time domain. We consider the implementation of a Deep Spiking Neural Network capable of performing high-accuracy and low-latency classification tasks where the neural computing unit is enabled by the stochastic switching behavior of a Magnetic Tunnel Junction. Simulation studies indicate an energy improvement of $20\times$ over a baseline CMOS design in 45 nm technology.

preprint, 2015, arXiv

An analytical algorithm for 3D magnetic field mapping of a watt balance magnet

A yoke-based permanent magnet, which has been employed in many watt balances at national metrology institutes, is supposed to generate a strong and uniform magnetic field in an air gap in the radial direction. In reality, however, the fringe effect due to the finite height of the air gap introduces an undesired vertical magnetic component into the air gap, which should be either measured or modeled in order to optimize the watt balance. A recent publication, Metrologia 52(4) 445 [1], presented a full field mapping method, which in theory supplies useful information for profile characterization and misalignment analysis. This article supplements [1] by developing a different analytical algorithm that represents the 3D magnetic field of a watt balance magnet based on only one measurement of the radial magnetic flux density along the vertical direction, $B_r(z)$. The new algorithm is based on the electromagnetic nature of the magnet and has a much better accuracy.

preprint, 2015, arXiv

Coils and the Electromagnet Used in the Joule Balance at the NIM

In the joule balance developed at the National Institute of Metrology (NIM), the dynamic phase of a watt balance is replaced by a mutual inductance measurement in an attempt to provide an alternative method for the redefinition of the kilogram. This method, however, needs a rather large current in the exciting coil to produce the necessary magnetic field in the force weighing phase, so coil heating becomes an important source of uncertainty. To reduce coil heating, a new coil system was recently designed in which a ferromagnetic material is used to increase the magnetic field. Adopting the ferromagnetic material, however, introduces difficulties arising from the material's nonlinear characteristics. This problem can be removed by directly measuring the magnetic flux linkage difference of the suspended coil at two vertical positions in place of the mutual inductance parameter. Some systematic effects of this magnet are discussed.

preprint, 2014, arXiv

Construction, Measurement, Shimming, and Performance of the NIST-4 Magnet System

The magnet system is one of the key elements of a watt balance. For the new watt balance currently under construction at the National Institute of Standards and Technology, a permanent magnet system was chosen. We describe the detailed construction of the magnet system, first measurements of the field profile, and shimming techniques that were used to achieve a flat field profile. The relative change of the radial magnetic flux density is less than $10^{-4}$ over a range of 5 cm. We further characterize the most important aspects of the magnet and give order of magnitude estimates for several systematic effects that originate from the magnet system.

preprint, 2014, arXiv

The Improvement of Joule Balance NIM-1 and the Design of New Joule Balance NIM-2

The development of the joule balance method to measure the Planck constant, in support of the redefinition of the kilogram, has been going on at the National Institute of Metrology of China (NIM) since 2007. The first prototype has been built to check the feasibility of the principle. In 2011, the relative uncertainty of the Planck constant measurement at NIM was $7.7\times10^{-5}$. Self-heating and swing of the coil were the main uncertainty contributions. Since 2012, improvements have been made to reduce these uncertainties. The relative uncertainty of the joule balance is reduced to $7.2\times10^{-6}$ at present. The Planck constant measured with the joule balance is $h=6.6261041(470)\times10^{-34}$ J s. The relative difference between the determined $h$ and the CODATA 2010 recommended value is $5\times10^{-6}$. Further improvements are still being carried out on the NIM-1 apparatus. At the same time, the design and construction of a brand new and compact joule balance, NIM-2, are also in progress and presented here.
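The figures in this abstract can be cross-checked with a few lines of arithmetic. The sketch below reads the concise notation 6.6261041(470) as an uncertainty of 470 in the last quoted digits and compares the result with the CODATA 2010 recommended value of the Planck constant, 6.62606957e-34 J s, which is supplied here and does not appear in the source text. The variable names are illustrative.

```python
# Measured value and standard uncertainty from the abstract's concise notation:
# 6.6261041(470)e-34 means 6.6261041e-34 +/- 0.0000470e-34 (units of J s).
h_nim = 6.6261041e-34
u_nim = 0.0000470e-34

# CODATA 2010 recommended value (assumption: taken from CODATA, not from the source).
h_codata_2010 = 6.62606957e-34

relative_uncertainty = u_nim / h_nim
relative_difference = (h_nim - h_codata_2010) / h_codata_2010

print(f"relative uncertainty: {relative_uncertainty:.2e}")
print(f"relative difference:  {relative_difference:.2e}")
```

The relative difference comes out near 5.2e-6, consistent with the quoted $5\times10^{-6}$. The relative uncertainty implied by the parenthesized digits comes out near 7.1e-6, close to but not identical with the quoted $7.2\times10^{-6}$; the abstract does not explain the small gap.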

preprint, 2014, arXiv

The NIM Inertial Mass Measurement Project

An inertial mass measurement project, which is expected to precisely measure the Planck constant, $h$, for possible comparisons with known gravitational mass measurement projects, e.g., the watt balance and the Avogadro project, is being carried out at the National Institute of Metrology, China. The principle, apparatus, and experimental investigations of the inertial mass measurement are presented. A prototype of the experiment has been built, and a value of the Planck constant with a relative uncertainty of several parts in $10^{4}$ has been obtained for principle testing.

preprint, 2010, arXiv

Detection of radioactive material entering national ports: A Bayesian approach to radiation portal data

Given the potential for illicit nuclear material being used for terrorism, most ports now inspect a large number of goods entering national borders for radioactive cargo. The U.S. Department of Homeland Security is moving toward one hundred percent inspection of all containers entering the U.S. at various ports of entry for nuclear material. We propose a Bayesian classification approach for the real-time data collected by the inline Polyvinyl Toluene radiation portal monitors. We study the computational and asymptotic properties of the proposed method and demonstrate its efficacy in simulations. Given data available to the authorities, it should be feasible to implement this approach in practice.
