Source author record

Chang Huang

Chang Huang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Computer Vision, physics.atom-ph, quant-ph, Artificial Intelligence, Hardware Architecture, Machine Learning, Neural and Evolutionary Computing, physics.optics

Catalog footprint

What is connected

13 works

8 topics

4 close collaborators


preprint, 2022, arXiv

A Many-ported and Shared Memory Architecture for High-Performance ADAS SoCs

Increasing investment in computing technologies and advancements in silicon technology have fueled rapid growth in advanced driver assistance systems (ADAS) and corresponding SoC developments. An ADAS SoC represents a heterogeneous architecture that consists of CPUs, GPUs and artificial intelligence (AI) accelerators. To guarantee safety and reliability, it must process massive amounts of raw data collected from multiple redundant sources such as high-definition video cameras, radars, and lidars to recognize objects correctly and to make the right decisions promptly. A domain-specific memory architecture is essential to achieve these goals. We present a shared memory architecture that enables high data throughput among multiple parallel accesses native to the ADAS applications. It also provides deterministic access latency with proper isolation under the stringent real-time QoS constraints. A prototype is built and analyzed. The results validate that the proposed architecture provides close to 100% throughput for both read and write accesses generated simultaneously by many accessing masters with full injection rate. It can also provide consistent QoS to the domain-specific payloads while enabling the scalability and modularity of the design.

preprint, 2022, arXiv

AziNorm: Exploiting the Radial Symmetry of Point Cloud for Azimuth-Normalized 3D Perception

Studying the inherent symmetry of data is of great importance in machine learning. Point cloud, the most important data format for 3D environmental perception, is naturally endowed with strong radial symmetry. In this work, we exploit this radial symmetry via a divide-and-conquer strategy to boost 3D perception performance and ease optimization. We propose Azimuth Normalization (AziNorm), which normalizes the point clouds along the radial direction and eliminates the variability brought by the difference of azimuth. AziNorm can be flexibly incorporated into most LiDAR-based perception methods. To validate its effectiveness and generalization ability, we apply AziNorm in both object detection and semantic segmentation. For detection, we integrate AziNorm into two representative detection methods, the one-stage SECOND detector and the state-of-the-art two-stage PV-RCNN detector. Experiments on the Waymo Open Dataset demonstrate that AziNorm improves SECOND and PV-RCNN by 7.03 mAPH and 3.01 mAPH respectively. For segmentation, we integrate AziNorm into KPConv. On the SemanticKITTI dataset, AziNorm improves KPConv by 1.6/1.1 mIoU on the val/test sets. Besides, AziNorm remarkably improves data efficiency and accelerates convergence, reducing the requirement of data amounts or training epochs by an order of magnitude. SECOND w/ AziNorm can significantly outperform fully trained vanilla SECOND, even when trained with only 10% of the data or 10% of the epochs. Code and models are available at https://github.com/hustvl/AziNorm.
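A minimal sketch of the azimuth-normalization idea in this abstract. It assumes (the abstract does not spell this out) that a local patch of points is rotated about the vertical axis so that the direction from the sensor to the patch center maps onto the +x axis. The function name `azimuth_normalize` and the `center` argument are hypothetical; the released code at the URL above is the authoritative implementation.

```python
import math


def azimuth_normalize(points, center):
    """Rotate (x, y, z) points about the z-axis so the patch center has azimuth 0.

    points: iterable of (x, y, z) tuples in sensor coordinates.
    center: (x, y) of the patch center in sensor coordinates (assumed input).
    """
    cx, cy = center
    theta = math.atan2(cy, cx)  # azimuth of the patch center
    cos_t, sin_t = math.cos(-theta), math.sin(-theta)  # rotate by -theta
    normalized = []
    for x, y, z in points:
        normalized.append((cos_t * x - sin_t * y, sin_t * x + cos_t * y, z))
    return normalized


patch = [(0.0, 2.0, 1.0), (1.0, 2.0, 0.0)]
rotated = azimuth_normalize(patch, center=(0.0, 2.0))
```

After this rotation, two patches that differ only in their azimuth around the sensor present the same local geometry to the downstream network, which is the kind of variability the abstract says AziNorm removes.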

preprint, 2022, arXiv

Sparse Instance Activation for Real-Time Instance Segmentation

In this paper, we propose a conceptually novel, efficient, and fully convolutional framework for real-time instance segmentation. Previously, most instance segmentation methods heavily rely on object detection and perform mask prediction based on bounding boxes or dense centers. In contrast, we propose a sparse set of instance activation maps, as a new object representation, to highlight informative regions for each foreground object. Then instance-level features are obtained by aggregating features according to the highlighted regions for recognition and segmentation. Moreover, based on bipartite matching, the instance activation maps can predict objects in a one-to-one style, thus avoiding non-maximum suppression (NMS) in post-processing. Owing to the simple yet effective designs with instance activation maps, SparseInst has extremely fast inference speed and achieves 40 FPS and 37.9 AP on the COCO benchmark, which significantly outperforms the counterparts in terms of speed and accuracy. Code and models are available at https://github.com/hustvl/SparseInst.
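The one-to-one prediction via bipartite matching mentioned in this abstract can be illustrated with a small sketch. It assumes a precomputed cost matrix between ground-truth objects and predictions; how SparseInst defines that cost is not stated here. The brute-force search and the name `match_one_to_one` are illustrative only. Practical systems typically use the Hungarian algorithm.

```python
from itertools import permutations


def match_one_to_one(cost):
    """Find the one-to-one assignment with minimal total cost.

    cost[g][p] is the (assumed, precomputed) cost of assigning prediction p
    to ground-truth object g. Requires at least as many predictions as
    ground-truth objects. Brute force: only suitable for tiny examples.
    """
    n_gt = len(cost)
    n_pred = len(cost[0])
    best_assignment, best_total = None, float("inf")
    for perm in permutations(range(n_pred), n_gt):
        total = sum(cost[g][p] for g, p in enumerate(perm))
        if total < best_total:
            best_assignment, best_total = list(perm), total
    return best_assignment, best_total


# Two ground-truth objects, three predictions (hypothetical costs).
costs = [[0.9, 0.1, 0.5],
         [0.2, 0.8, 0.7]]
assignment, total = match_one_to_one(costs)
```

Because each ground-truth object is paired with exactly one prediction during training, duplicate predictions are discouraged, which is why the abstract can drop NMS at inference.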

preprint, 2021, arXiv

Dark-state sideband cooling in an atomic ensemble

We utilize the dark state in a Λ-type three-level system to cool an ensemble of 85Rb atoms in an optical lattice [Morigi et al., Phys. Rev. Lett. 85, 4458 (2000)]. The common suppression of the carrier transition of atoms with different vibrational frequencies allows them to reach a subrecoil temperature of 100 nK after being released from the optical lattice. A nearly zero vibrational quantum number is determined from the time-of-flight measurements and adiabatic expansion process. The features of sideband cooling are examined in various parameter spaces. Our results show that dark-state sideband cooling is a simple and compelling method for preparing a large ensemble of atoms into their vibrational ground state of a harmonic potential and can be generalized to different species of atoms and molecules for studying ultracold physics that demands recoil temperature and below.

preprint, 2020, arXiv

Long Light Storage Time in an Optical Fiber

Light storage in an optical fiber is an attractive component in quantum optical delay line technologies. Although silica-core optical fibers are excellent in transmitting broadband optical signals, it is challenging to tailor their dispersive property to slow down a light pulse or store it in the silica-core for a long delay time. Coupling a dispersive and coherent medium with an optical fiber is promising in supporting long optical delay. Here, we load cold Rb atomic vapor into an optical trap inside a hollow-core photonic crystal fiber, and store the phase of the light in a long-lived spin-wave formed by atoms and retrieve it after a fully controllable delay time using electromagnetically-induced-transparency (EIT). We achieve over 50 ms of storage time and the result is equivalent to $8.7\times10^{-5}$ dB s$^{-1}$ of propagation loss in an optical fiber. Our demonstration could be used for buffering and regulating classical and quantum information flow between remote networks.

preprint, 2020, arXiv

Quantum-Enhanced Velocimetry with Doppler-Broadened Atomic Vapor

Traditionally, measuring the center-of-mass (c.m.) velocity of an atomic ensemble relies on measuring the Doppler shift of the absorption spectrum of single atoms in the ensemble. Mapping out the velocity distribution of the ensemble is indispensable when determining the c.m. velocity using this technique. As a result, highly sensitive measurements require preparation of an ensemble with a narrow Doppler width. Here, we use a dispersive measurement of light passing through a moving room temperature atomic vapor cell to determine the velocity of the cell in a single shot with a short-term sensitivity of 5.5 $\mu$m s$^{-1}$ Hz$^{-1/2}$. The dispersion of the medium is enhanced by creating quantum interference through an auxiliary transition for the probe light under electromagnetically induced transparency condition. In contrast to measurement of single atoms, this method is based on the collective motion of atoms and can sense the c.m. velocity of an ensemble without knowing its velocity distribution. Our results improve the previous measurements by 3 orders of magnitude and can be used to design a compact motional sensor based on thermal atoms.

preprint, 2019, arXiv

Diversity Transfer Network for Few-Shot Learning

Few-shot learning is a challenging task that aims at training a classifier for unseen classes with only a few training examples. The main difficulty of few-shot learning lies in the lack of intra-class diversity within insufficient training samples. To alleviate this problem, we propose a novel generative framework, Diversity Transfer Network (DTN), that learns to transfer latent diversities from known categories and composite them with support features to generate diverse samples for novel categories in feature space. The learning problem of the sample generation (i.e., diversity transfer) is solved via minimizing an effective meta-classification loss in a single-stage network, instead of the generative loss in previous works. Besides, an organized auxiliary task co-training over known categories is proposed to stabilize the meta-training process of DTN. We perform extensive experiments and ablation studies on three datasets, i.e., miniImageNet, CIFAR100 and CUB. The results show that DTN, with single-stage training and faster convergence speed, obtains the state-of-the-art results among the feature generation based few-shot learning methods. Code and supplementary material are available at https://github.com/Yuxin-CV/DTN.

preprint, 2016, arXiv

CNN-RNN: A Unified Framework for Multi-label Image Classification

While deep convolutional neural networks (CNNs) have shown great success in single-label image classification, it is important to note that real world images generally contain multiple labels, which could correspond to different objects, scenes, actions and attributes in an image. Traditional approaches to multi-label image classification learn independent classifiers for each category and employ ranking or thresholding on the classification results. These techniques, although working well, fail to explicitly exploit the label dependencies in an image. In this paper, we utilize recurrent neural networks (RNNs) to address this problem. Combined with CNNs, the proposed CNN-RNN framework learns a joint image-label embedding to characterize the semantic label dependency as well as the image-label relevance, and it can be trained end-to-end from scratch to integrate both information in a unified framework. Experimental results on public benchmark datasets demonstrate that the proposed architecture achieves better performance than the state-of-the-art multi-label classification model.

preprint, 2016, arXiv

Conditional Random Fields as Recurrent Neural Networks

Pixel-level labelling tasks, such as semantic segmentation, play a central role in image understanding. Recent approaches have attempted to harness the capabilities of deep learning techniques for image recognition to tackle pixel-level labelling tasks. One central issue in this methodology is the limited capacity of deep learning techniques to delineate visual objects. To solve this problem, we introduce a new form of convolutional neural network that combines the strengths of Convolutional Neural Networks (CNNs) and Conditional Random Fields (CRFs)-based probabilistic graphical modelling. To this end, we formulate mean-field approximate inference for the Conditional Random Fields with Gaussian pairwise potentials as Recurrent Neural Networks. This network, called CRF-RNN, is then plugged in as a part of a CNN to obtain a deep network that has desirable properties of both CNNs and CRFs. Importantly, our system fully integrates CRF modelling with CNNs, making it possible to train the whole deep network end-to-end with the usual back-propagation algorithm, avoiding offline post-processing methods for object delineation. We apply the proposed method to the problem of semantic image segmentation, obtaining top results on the challenging Pascal VOC 2012 segmentation benchmark.
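The mean-field inference that this abstract recasts as a recurrent network can be sketched on a toy problem. This is a simplified illustration under stated assumptions, not the CRF-RNN implementation. It uses a handful of "pixels" with one scalar feature each, a single Gaussian kernel, and a Potts label compatibility. The `bandwidth` and `weight` parameters are hypothetical stand-ins for quantities that the actual network learns or filters efficiently.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def mean_field(unary, feats, n_iters=5, bandwidth=1.0, weight=1.0):
    """Approximate CRF marginals with parallel mean-field updates.

    unary[i][l]: cost of label l at pixel i.
    feats[i]: scalar feature of pixel i (assumption: 1-D features).
    Each loop iteration plays the role of one recurrent step.
    """
    n, n_labels = len(unary), len(unary[0])
    q = [softmax([-u for u in row]) for row in unary]  # initialization
    for _ in range(n_iters):
        new_q = []
        for i in range(n):
            # Message passing: Gaussian-weighted sum of the other marginals.
            msg = [0.0] * n_labels
            for j in range(n):
                if j == i:
                    continue
                diff = feats[i] - feats[j]
                k = weight * math.exp(-diff * diff / (2 * bandwidth ** 2))
                for l in range(n_labels):
                    msg[l] += k * q[j][l]
            # Potts compatibility: label l is penalized by mass on other labels.
            total = sum(msg)
            pairwise = [total - msg[l] for l in range(n_labels)]
            # Add unary costs and renormalize.
            new_q.append(softmax([-unary[i][l] - pairwise[l]
                                  for l in range(n_labels)]))
        q = new_q
    return q


# Two confident pixels and one ambiguous pixel with similar features.
marginals = mean_field(unary=[[0.0, 2.0], [0.0, 2.0], [0.6, 0.4]],
                       feats=[0.0, 0.1, 0.05])
```

In this toy run the ambiguous third pixel is pulled toward the label favored by its similar neighbors. Every operation in the loop is differentiable, which is what allows such iterations to be trained end-to-end with back-propagation as the abstract describes.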

preprint, 2016, arXiv

Large Fizeau's light-dragging effect in a moving electromagnetically induced transparent medium

As one of the most influential experiments on the development of modern macroscopic theory from Newtonian mechanics to Einstein's special theory of relativity, the phenomenon of light dragging in a moving medium has been discussed and observed extensively in different types of systems. To have a significant dragging effect, the long duration of light travelling in the medium is preferred. Here we demonstrate a light-dragging experiment in an electromagnetically induced transparent cold atomic ensemble and enhance the dragging effect by at least three orders of magnitude compared with the previous experiments. With a large enhancement of the dragging effect, we realize an atom-based velocimeter that has a sensitivity two orders of magnitude higher than the velocity width of the atomic medium used. Such a demonstration could pave the way for motional sensing using the collective state of atoms in a room temperature vapour cell or solid state material.

preprint, 2016, arXiv

Text Flow: A Unified Text Detection System in Natural Scene Images

The prevalent scene text detection approach follows four sequential steps comprising character candidate detection, false character candidate removal, text line extraction, and text line verification. However, errors occur and accumulate throughout each of these sequential steps, which often leads to low detection performance. To address these issues, we propose a unified scene text detection system, namely Text Flow, by utilizing the minimum cost (min-cost) flow network model. With character candidates detected by cascade boosting, the min-cost flow network model integrates the last three sequential steps into a single process which solves the error accumulation problem at both character level and text line level effectively. The proposed technique has been tested on three public datasets, i.e., the ICDAR2011 dataset, the ICDAR2013 dataset and a multilingual dataset, and it outperforms the state-of-the-art methods on all three datasets with much higher recall and F-score. The good performance on the multilingual dataset shows that the proposed technique can be used for the detection of texts in different languages.

preprint, 2015, arXiv

Targeting Ultimate Accuracy: Face Recognition via Deep Embedding

Face recognition has been studied for many decades. As opposed to traditional hand-crafted features such as LBP and HOG, much more sophisticated features can be learned automatically by deep learning methods in a data-driven way. In this paper, we propose a two-stage approach that combines a multi-patch deep CNN and deep metric learning, which extracts low dimensional but very discriminative features for face verification and recognition. Experiments show that this method outperforms other state-of-the-art methods on the LFW dataset, achieving 99.77% pair-wise verification accuracy and significantly better accuracy under two other, more practical protocols. This paper also discusses the importance of data size and the number of patches, showing a clear path to practical high-performance face recognition systems in the real world.

preprint, 2012, arXiv

Large Scale Strongly Supervised Ensemble Metric Learning, with Applications to Face Verification and Retrieval

Learning Mahalanobis distance metrics in a high-dimensional feature space is very difficult, especially when structural sparsity and low rank are enforced to improve computational efficiency in the testing phase. This paper addresses both aspects by an ensemble metric learning approach that consists of sparse block diagonal metric ensembling and joint metric learning as two consecutive steps. The former step pursues a highly sparse block diagonal metric by selecting effective feature groups while the latter one further exploits correlations between selected feature groups to obtain an accurate and low rank metric. Our algorithm considers all pairwise or triplet constraints generated from training samples with explicit class labels, and possesses good scalability with respect to increasing feature dimensionality and growing data volumes. Its applications to face verification and retrieval outperform existing state-of-the-art methods in accuracy while retaining high efficiency.
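To make the efficiency argument concrete, here is a minimal sketch of evaluating a squared Mahalanobis distance under a sparse block diagonal metric. It assumes the metric is stored as a list of selected feature groups, each with its own small matrix. That storage layout and the name `block_diag_sq_distance` are hypothetical and are not taken from the paper.

```python
def block_diag_sq_distance(x, y, blocks):
    """Squared Mahalanobis distance (x - y)^T M (x - y) for block diagonal M.

    blocks: list of (start_index, block) pairs, where block is a square
    matrix (list of lists) acting on features start_index onward.
    Feature groups not listed contribute nothing (their block is zero).
    """
    diff = [a - b for a, b in zip(x, y)]
    total = 0.0
    for start, block in blocks:
        size = len(block)
        seg = diff[start:start + size]
        total += sum(seg[r] * block[r][c] * seg[c]
                     for r in range(size) for c in range(size))
    return total


x = [1.0, 2.0, 3.0, 4.0]
y = [0.0, 0.0, 0.0, 0.0]
# Only the last two features are selected, with a diagonal 2x2 block.
d = block_diag_sq_distance(x, y, [(2, [[2.0, 0.0], [0.0, 1.0]])])
```

Only the selected groups are touched, so the cost scales with the total size of the retained blocks rather than with the square of the full feature dimension.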
