Source author record

Z. Jane Wang

Z. Jane Wang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Computer Vision, Information Theory, Machine Learning, math.IT, cond-mat.dis-nn, eess.SP, physics.flu-dyn, Biological Physics, chao-dyn, Human-Computer Interaction, math-ph, math.MP, nlin.CD, physics.comp-ph, quant-ph, Robotics

Catalog footprint

What is connected

19 works

16 topics

4 close collaborators

preprint, 2022, arXiv

AdaptPose: Cross-Dataset Adaptation for 3D Human Pose Estimation by Learnable Motion Generation

This paper addresses the problem of cross-dataset generalization of 3D human pose estimation models. Testing a pre-trained 3D pose estimator on a new dataset results in a major performance drop. Previous methods have mainly addressed this problem by improving the diversity of the training data. We argue that diversity alone is not sufficient and that the characteristics of the training data need to be adapted to those of the new dataset such as camera viewpoint, position, human actions, and body size. To this end, we propose AdaptPose, an end-to-end framework that generates synthetic 3D human motions from a source dataset and uses them to fine-tune a 3D pose estimator. AdaptPose follows an adversarial training scheme. From a source 3D pose the generator generates a sequence of 3D poses and a camera orientation that is used to project the generated poses to a novel view. Without any 3D labels or camera information AdaptPose successfully learns to create synthetic 3D poses from the target dataset while only being trained on 2D poses. In experiments on the Human3.6M, MPI-INF-3DHP, 3DPW, and Ski-Pose datasets our method outperforms previous work in cross-dataset evaluations by 14% and previous semi-supervised learning methods that use partial 3D annotations by 16%.
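The projection step mentioned in the abstract, where a camera orientation maps generated 3D poses to a novel 2D view, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the single rotation angle about the vertical axis, the pinhole model, and the `focal` and `depth` parameters are hypothetical and are not taken from AdaptPose.

```python
import math

def rotate_y(point, angle):
    """Rotate a 3D point about the vertical (y) axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x + s * z, y, -s * x + c * z)

def project(point, focal=1.0, depth=5.0):
    """Pinhole projection; the camera sits `depth` units in front of the origin."""
    x, y, z = point
    zc = z + depth  # distance from the (hypothetical) camera along its optical axis
    if zc <= 0:
        raise ValueError("point lies behind the camera")
    return (focal * x / zc, focal * y / zc)

def project_pose(joints, angle, focal=1.0, depth=5.0):
    """Project every joint of a 3D pose to 2D after re-orienting the view."""
    return [project(rotate_y(j, angle), focal, depth) for j in joints]

# Toy three-joint "pose" viewed from two orientations.
pose = [(0.0, 1.7, 0.0), (0.3, 1.4, 0.1), (-0.3, 1.4, -0.1)]
front_view = project_pose(pose, 0.0)
side_view = project_pose(pose, math.pi / 2)
```

In a learned setting the angle would be produced by the generator rather than fixed by hand; the sketch only shows how one 3D pose yields different 2D poses under different orientations.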

preprint, 2022, arXiv

Efficient Subsampling of Realistic Images From GANs Conditional on a Class or a Continuous Variable

Recently, subsampling or refining images generated from unconditional GANs has been actively studied to improve the overall image quality. Unfortunately, these methods are often found to be less effective or inefficient when handling conditional GANs (cGANs), whether conditioned on a class (class-conditional GANs) or on a continuous variable (continuous cGANs, or CcGANs). In this work, we introduce an effective and efficient subsampling scheme, named conditional density ratio-guided rejection sampling (cDR-RS), to sample high-quality images from cGANs. Specifically, we first develop a novel conditional density ratio estimation method, termed cDRE-F-cSP, by proposing the conditional Softplus (cSP) loss and an improved feature extraction mechanism. We then derive the error bound of a density ratio model trained with the cSP loss. Finally, we accept or reject a fake image according to its estimated conditional density ratio. A filtering scheme is also developed to increase the label consistency of fake images without losing diversity when sampling from CcGANs. We extensively test the effectiveness and efficiency of cDR-RS in sampling from both class-conditional GANs and CcGANs on five benchmark datasets. When sampling from class-conditional GANs, cDR-RS outperforms modern state-of-the-art methods by a large margin (except DRE-F-SP+RS) in terms of effectiveness. Although the effectiveness of cDR-RS is often comparable to that of DRE-F-SP+RS, cDR-RS is substantially more efficient. When sampling from CcGANs, the superiority of cDR-RS is even more noticeable in terms of both effectiveness and efficiency. Notably, with reasonable computational resources, cDR-RS can substantially reduce Label Score without decreasing the diversity of CcGAN-generated images, while other methods often need to trade much diversity for a slightly improved Label Score.
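The accept-or-reject step described above can be sketched as plain rejection sampling driven by a density ratio. This is a minimal illustration under stated assumptions, not the cDR-RS implementation: the function name `rejection_subsample` is hypothetical, the "images" are scalars, and the density ratio is given in closed form for a toy problem rather than estimated by a trained model.

```python
import random

def rejection_subsample(samples, density_ratio, rng):
    """Keep each sample with probability r(x) / max r over the pool.

    `density_ratio` stands in for an estimated ratio p_real(x) / p_fake(x);
    here it is an assumption supplied by the caller.
    """
    ratios = [density_ratio(x) for x in samples]
    r_max = max(ratios)
    if r_max <= 0:
        raise ValueError("density ratio must be positive for at least one sample")
    return [x for x, r in zip(samples, ratios) if rng.random() <= r / r_max]

# Toy setup: the "generator" is uniform on [0, 1) and the "real" density is 2x,
# so the exact density ratio is r(x) = 2x.
rng = random.Random(0)
pool = [rng.random() for _ in range(10_000)]
kept = rejection_subsample(pool, lambda x: 2.0 * x, rng)
kept_mean = sum(kept) / len(kept)  # mean under the density 2x is 2/3
```

Samples with a high ratio survive more often, which shifts the kept set toward the target distribution at the cost of discarding part of the pool.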

preprint, 2022, arXiv

Joint Precoding for Active Intelligent Transmitting Surface Empowered Outdoor-to-Indoor Communication in mmWave Cellular Networks

Outdoor-to-indoor communication in millimeter-wave (mmWave) cellular networks is a challenging research problem because of the severe attenuation and high penetration loss caused by the propagation characteristics of mmWave signals. We propose a viable solution that implements the outdoor-to-indoor mmWave communication system with the aid of an active intelligent transmitting surface (active-ITS), which allows the incoming signal from an outdoor base station (BS) to pass through the surface and be received by the indoor user equipments (UEs) after shifting its phase and magnifying its amplitude. The problem of joint precoding of the BS and active-ITS is then investigated to maximize the weighted sum-rate (WSR) of the communication system. An efficient block coordinate descent (BCD) based algorithm is developed to solve it, yielding suboptimal solutions in nearly closed form. In addition, to reduce the size and hardware cost of an active-ITS, we provide a block-amplifying architecture that partially removes the circuit components for power amplification, where multiple transmissive-type elements (TEs) in each block share the same power amplifier. Simulations indicate that an active-ITS has the potential to achieve a given performance with far fewer TEs than a passive-ITS under the same total system power consumption, which makes it suitable for scenarios with size and aesthetic constraints, and that the inevitable performance degradation caused by the block-amplifying architecture is acceptable.

preprint, 2022, arXiv

Multi-modal Streaming 3D Object Detection

Modern autonomous vehicles rely heavily on mechanical LiDARs for perception. Current perception methods generally require 360° point clouds, collected sequentially as the LiDAR scans the azimuth and acquires consecutive wedge-shaped slices. The acquisition latency of a full scan (~100 ms) may lead to outdated perception, which is detrimental to safe operation. Recent streaming perception works proposed directly processing LiDAR slices and compensating for the narrow field of view (FoV) of a slice by reusing features from preceding slices. These works, however, are all based on a single modality and require past information, which may be outdated. Meanwhile, images from high-frequency cameras can support streaming models as they provide a larger FoV compared to a LiDAR slice. However, this difference in FoV complicates sensor fusion. To address this research gap, we propose an innovative camera-LiDAR streaming 3D object detection framework that uses camera images instead of past LiDAR slices to provide an up-to-date, dense, and wide context for streaming perception. The proposed method outperforms prior streaming models on the challenging NuScenes benchmark. It also outperforms powerful full-scan detectors while being much faster. Our method is shown to be robust to missing camera images, narrow LiDAR slices, and small camera-LiDAR miscalibration.

preprint, 2022, arXiv

SSD-KD: A Self-supervised Diverse Knowledge Distillation Method for Lightweight Skin Lesion Classification Using Dermoscopic Images

Skin cancer is one of the most common types of malignancy, affecting a large population and causing a heavy economic burden worldwide. Over the last few years, computer-aided diagnosis has developed rapidly and made great progress in healthcare and medical practice, owing to advances in artificial intelligence. However, most studies in skin cancer detection keep pursuing high prediction accuracy without considering the limited computing resources of portable devices. In this setting, knowledge distillation (KD) has proven to be an efficient tool for improving the adaptability of lightweight models under limited resources while retaining a high-level representation capability. To bridge the gap, this study proposes a novel method, termed SSD-KD, that unifies diverse knowledge into a generic KD framework for skin disease classification. Our method models an intra-instance relational feature representation and integrates it with existing KD research. A dual relational knowledge distillation architecture is trained in a self-supervised manner, while the weighted softened outputs are also exploited to enable the student model to capture richer knowledge from the teacher model. To demonstrate the effectiveness of our method, we conduct experiments on ISIC 2019, a large-scale open-access benchmark of dermoscopic images of skin diseases. Experiments show that our distilled lightweight model can achieve an accuracy as high as 85% for the classification of 8 different skin diseases with minimal parameters and computing requirements. Ablation studies confirm the effectiveness of our intra- and inter-instance relational knowledge integration strategy. Compared with state-of-the-art knowledge distillation techniques, the proposed method demonstrates improved performance for multi-disease classification on the large-scale dermoscopy database.
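The "softened outputs" that a student learns from can be illustrated with the classic temperature-scaled distillation loss. The sketch below is the standard formulation (KL divergence between temperature-softened teacher and student distributions, scaled by the squared temperature), not the weighted variant or the relational terms of SSD-KD; the function names and the example logits are hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; higher temperature gives a flatter distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened outputs, scaled by temperature squared."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    return temperature ** 2 * kl

# Hypothetical logits for an 8-class problem.
teacher = [6.0, 2.5, 1.0, 0.5, 0.2, 0.1, 0.0, -1.0]
student = [3.0, 2.0, 1.5, 0.0, 0.0, 0.3, -0.2, -0.5]
loss = distillation_loss(student, teacher)
```

Softening matters because the teacher's small probabilities on the wrong classes carry information about class similarity that a hard label discards.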

preprint, 2021, arXiv

A Low Complexity Quantum Principal Component Analysis Algorithm

In this paper, we propose a low complexity quantum principal component analysis (qPCA) algorithm. Similar to the state-of-the-art qPCA, it achieves dimension reduction by extracting the principal components of the data matrix, rather than all components of the data matrix, to quantum registers, so that the number of measurement samples required can be reduced considerably. However, the major advantage of our qPCA over the state-of-the-art qPCA is that it requires far fewer quantum gates. In addition, it is more accurate due to the simplification of the quantum circuit. We implement the proposed qPCA on the IBM quantum computing platform, and the experimental results are consistent with our expectations.

preprint, 2021, arXiv

SCIDA: Self-Correction Integrated Domain Adaptation from Single- to Multi-label Aerial Images

Most publicly available datasets for image classification are single-labeled, while images in daily life are inherently multi-labeled. This annotation gap makes many pre-trained single-label classification models fail in practical scenarios. The issue is of greater concern for aerial images: aerial data collected from sensors naturally cover a relatively large land area with multiple labels, while the publicly available annotated aerial datasets (e.g., UCM, AID) are single-labeled. As manually annotating multi-label aerial images would be time- and labor-consuming, we propose a novel self-correction integrated domain adaptation (SCIDA) method for automatic multi-label learning. SCIDA is weakly supervised, i.e., it automatically learns the multi-label image classification model from massive, publicly available single-label images. To achieve this goal, we propose a novel Label-Wise self-Correction (LWC) module to better explore underlying label correlations. This module also makes unsupervised domain adaptation (UDA) from single- to multi-label data possible. For training, the proposed model uses only single-label information and requires no prior knowledge of multi-labeled data, yet it predicts labels for multi-label aerial images. In our experiments, trained with the single-labeled MAI-AID-s and MAI-UCM-s datasets, the proposed model is tested directly on our collected Multi-scene Aerial Image (MAI) dataset.

preprint, 2021, arXiv

Toward Open-World Electroencephalogram Decoding Via Deep Learning: A Comprehensive Survey

Electroencephalogram (EEG) decoding aims to identify the perceptual, semantic, and cognitive content of neural processing based on non-invasively measured brain activity. Traditional EEG decoding methods have achieved moderate success when applied to data acquired in static, well-controlled lab environments. However, an open-world environment is a more realistic setting, where situations affecting EEG recordings can emerge unexpectedly, significantly weakening the robustness of existing methods. In recent years, deep learning (DL) has emerged as a potential solution for such problems due to its superior capacity in feature extraction. It overcomes the limitations of defining 'handcrafted' features or features extracted using shallow architectures, but typically requires large amounts of costly, expertly labelled data, which is not always obtainable. Combining DL with domain-specific knowledge may allow for the development of robust approaches to decode brain activity even with small-sample data. Although various DL methods have been proposed to tackle some of the challenges in EEG decoding, a systematic tutorial overview, particularly for open-world applications, is currently lacking. This article therefore provides a comprehensive survey of DL methods for open-world EEG decoding, and identifies promising research directions to inspire future studies for EEG decoding in real-world applications.

preprint, 2020, arXiv

CHAIN: Concept-harmonized Hierarchical Inference Interpretation of Deep Convolutional Neural Networks

With the great success of deep networks, there is an increasing demand for interpreting the internal network mechanism, especially the net decision-making logic. To tackle this challenge, the Concept-harmonized HierArchical INference (CHAIN) is proposed to interpret the net decision-making process. For a net decision being interpreted, the proposed method presents a CHAIN interpretation in which the net decision can be hierarchically deduced into visual concepts from high to low semantic levels. To achieve this, we propose three models sequentially, i.e., the concept harmonizing model, the hierarchical inference model, and the concept-harmonized hierarchical inference model. Firstly, in the concept harmonizing model, visual concepts from high to low semantic levels are aligned with net-units from deep to shallow layers. Secondly, in the hierarchical inference model, the concept in a deep layer is disassembled into units in shallow layers. Finally, in the concept-harmonized hierarchical inference model, a deep-layer concept is inferred from its shallow-layer concepts. After several rounds, the concept-harmonized hierarchical inference is conducted backward from the highest semantic level to the lowest semantic level. As a result, net decision-making is explained as a form of concept-harmonized hierarchical inference, which is comparable to human decision-making. Meanwhile, the net layer structure for feature learning can be explained based on the hierarchical visual concepts. In quantitative and qualitative experiments, we demonstrate the effectiveness of CHAIN at the instance and class levels.

preprint, 2020, arXiv

Subsampling Generative Adversarial Networks: Density Ratio Estimation in Feature Space with Softplus Loss

Filtering out unrealistic images from trained generative adversarial networks (GANs) has attracted considerable attention recently. Two density-ratio-based subsampling methods, Discriminator Rejection Sampling (DRS) and Metropolis-Hastings GAN (MH-GAN), were recently proposed, and their effectiveness in improving GANs was demonstrated on multiple datasets. However, DRS and MH-GAN are based on discriminator-based density ratio estimation (DRE) methods, so they may not work well if the discriminator in the trained GAN is far from optimal. Moreover, they do not apply to some GANs (e.g., MMD-GAN). In this paper, we propose a novel Softplus (SP) loss for DRE. Based on it, we develop a sample-based DRE method in a feature space learned by a specially designed and pre-trained ResNet-34 (DRE-F-SP). We derive the rate of convergence of a density ratio model trained under the SP loss. Then, we propose three different density ratio subsampling methods (DRE-F-SP+RS, DRE-F-SP+MH, and DRE-F-SP+SIR) for GANs based on DRE-F-SP. Our subsampling methods do not rely on the optimality of the discriminator and are suitable for all types of GANs. We empirically show that our subsampling approach can substantially outperform DRS and MH-GAN on a synthetic dataset and the CIFAR-10 dataset, using multiple GANs.

preprint, 2016, arXiv

Real-time 2D/3D Registration via CNN Regression

In this paper, we present a Convolutional Neural Network (CNN) regression approach for real-time 2-D/3-D registration. Unlike optimization-based methods, which iteratively optimize the transformation parameters over a scalar-valued metric function representing the quality of the registration, the proposed method exploits the information embedded in the appearances of the Digitally Reconstructed Radiograph and X-ray images, and employs CNN regressors to directly estimate the transformation parameters. The CNN regressors are trained for local zones and applied in a hierarchical manner to break down the complex regression task into simpler sub-tasks that can be learned separately. Our experimental results demonstrate the advantage of the proposed method in computational efficiency, with negligible degradation of registration accuracy compared to intensity-based methods.

preprint, 2014, arXiv

On the Instability and Critical Damping Conditions, $k\tau = 1/e$ and $k\tau = \pi/2$, of the equation $\dot{\theta} = -k\theta(t-\tau)$

In this note, I show that it is possible to use elementary mathematics, instead of the machinery of the Lambert function, the Laplace transform, or numerics, to derive the instability condition, $k\tau = \pi/2$, and the critical damping condition, $k\tau = 1/e$, for the time-delayed equation $\dot{\theta} = -k\theta(t-\tau)$. I hope it will be useful for newcomers to this equation, and perhaps even to experts if this is a simpler method compared to other versions.
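For orientation, both conditions can be recovered from the characteristic equation of the delay equation. The sketch below is the standard exponential-ansatz route, assuming $k > 0$ and $\tau > 0$; it is not necessarily the elementary argument used in the note itself.

```latex
\begin{align*}
\theta(t) = e^{\lambda t} \;&\Rightarrow\; \lambda = -k\,e^{-\lambda\tau}
  && \text{(characteristic equation)} \\[4pt]
\text{Instability threshold: } \lambda = i\omega,\ \omega > 0 \;&\Rightarrow\;
  i\omega = -k\cos(\omega\tau) + i\,k\sin(\omega\tau) \\
&\Rightarrow\; \cos(\omega\tau) = 0,\quad \omega = k\sin(\omega\tau) \\
&\Rightarrow\; \omega\tau = \tfrac{\pi}{2},\quad \omega = k
  \;\Rightarrow\; k\tau = \tfrac{\pi}{2} \\[4pt]
\text{Critical damping: real double root} \;&\Rightarrow\;
  \lambda + k\,e^{-\lambda\tau} = 0,\quad 1 - k\tau\,e^{-\lambda\tau} = 0 \\
&\Rightarrow\; e^{-\lambda\tau} = -\lambda/k \;\Rightarrow\; 1 + \lambda\tau = 0
  \;\Rightarrow\; \lambda = -1/\tau \\
&\Rightarrow\; -\tfrac{1}{\tau} + k\,e = 0 \;\Rightarrow\; k\tau = \tfrac{1}{e}
\end{align*}
```

The instability line takes the smallest positive solution of $\cos(\omega\tau) = 0$, which is the first pair of roots to cross the imaginary axis as $k\tau$ grows. The double-root condition marks the boundary between monotonic decay and decaying oscillation.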

preprint, 2014, arXiv

Performance of General STCs over Spatially Correlated MIMO Single-keyhole Channels

For MIMO Rayleigh channels, it has been shown that transmitter correlations always degrade the performance of general space-time codes (STCs) in high SNR regimes. In this correspondence, however, we show that when MIMO channels experience single-keyhole conditions, the effect of spatial correlations between transmission antennas is more complicated for general STCs: when $M>N$ (i.e., the number of transmission antennas is greater than the number of receiving antennas), depending on how the correlation matrix $\mathbf{P}$ beamforms the code word difference matrix $\mathbf{\Delta}$, the PEP performance of general STCs can be either degraded or improved in high SNR regimes. We provide a new measure, based on the eigenvalues of $\mathbf{\Delta}$ and the numbers of transmission and receiving antennas, to examine whether there exist correlation matrices that can improve the performance of general STCs in high SNR regimes. Previous studies on the effect of spatial correlations over single-keyhole channels concentrated only on orthogonal STCs, while our study here is for general STCs and can also be used to explain previous findings for orthogonal STCs.

preprint, 2014, arXiv

Unitary Query for the $M \times L \times N$ MIMO Backscatter RFID Channel

A MIMO backscatter RFID system consists of three operational ends: the query end (with $M$ reader transmitting antennas), the tag end (with $L$ tag antennas) and the receiving end (with $N$ reader receiving antennas). Such an $M \times L \times N$ setting in RFID can bring spatial diversity and has been studied for STC at the tag end. The current understanding of the query end is that it is only an energy provider for the tag and that query signal designs cannot improve performance. However, we propose a novel unitary query scheme, which creates time diversity within the channel coherence time and can yield significant performance improvements. To overcome the difficulty of evaluating the performance when the unitary query is employed at the query end and STC is employed at the tag end, we derive a new measure based on the ranks of certain carefully constructed matrices. The measure implies that the unitary query has superior performance. Simulations show that the unitary query can bring a 5 to 10 dB gain in mid SNR regimes. In addition, the unitary query can also significantly improve the performance of single-antenna tags, allowing low-complexity, small-size single-antenna tags to be employed for high performance. This improvement is unachievable for single-antenna tags when the conventional uniform query is employed.

preprint, 2013, arXiv

An Adaptive Descriptor Design for Object Recognition in the Wild

Digital images nowadays have various styles of appearance, in terms of color tones, contrast, vignetting, etc. These 'picture styles' are directly related to the scene radiance, the image pipeline of the camera, and post-processing functions. Due to the complexity and nonlinearity of these causes, popular gradient-based image descriptors are not invariant to different picture styles, which degrades the performance of object recognition. Given that images shared online or created by individual users are taken with a wide range of devices and may be processed by various post-processing functions, finding a robust object recognition system is both useful and challenging. In this paper, we present the first study on the influence of picture styles on object recognition, and propose an adaptive approach based on the kernel view of gradient descriptors and multiple kernel learning, without estimating or specifying the styles of images used in training and testing. We conduct experiments on the Domain Adaptation data set and the Oxford Flower data set. The experiments also include several variants of the flower data set obtained by processing the images with popular photo effects. The results demonstrate that our proposed method improves on standard descriptors in all cases.

preprint, 2010, arXiv

Falling Particles in Fluids at Intermediate Reynolds Numbers

In this video, we present the dynamics of an array of falling particles at intermediate Reynolds numbers. The film shows the vorticity plots of 3, 4, 7, and 16 falling particles at $Re = 200$. We highlight the effect of parity on the falling configuration of the array. In steady state, an initially uniformly spaced array forms a convex shape when $n = 3$, i.e., the middle particle leads, but forms a concave shape when $n = 4$. For larger odd numbers of particles, the final state consists of a mixture of concave and convex shapes. For larger even numbers of particles, the steady state remains a concave shape. Below a threshold of initial particle spacing, particles cluster in groups of 2 to 3.

preprint, 2009, arXiv

Fruit flies modulate passive wing pitching to generate in-flight turns

Flying insects execute aerial maneuvers through subtle manipulations of their wing motions. Here, we measure the free flight kinematics of fruit flies and determine how they modulate their wing pitching to induce sharp turns. By analyzing the torques these insects exert to pitch their wings, we infer that the wing hinge acts as a torsional spring that passively resists the wing's tendency to flip in response to aerodynamic and inertial forces. To turn, the insects asymmetrically change the spring rest angles to generate rowing motions of their wings. Thus, insects can generate these maneuvers using only a slight active actuation that biases their wing motion.

preprint, 1999, arXiv

Spectrum of the Fokker-Planck operator representing diffusion in a random velocity field

We study spectral properties of the Fokker-Planck operator that represents particles moving via a combination of diffusion and advection in a time-independent random velocity field, presenting in detail work outlined elsewhere [J. T. Chalker and Z. J. Wang, Phys. Rev. Lett. 79, 1797 (1997)]. We calculate analytically the ensemble-averaged one-particle Green function and the eigenvalue density for this Fokker-Planck operator, using a diagrammatic expansion developed for resolvents of non-Hermitian random operators, together with a mean-field approximation (the self-consistent Born approximation) which is well-controlled in the weak-disorder regime for dimension d>2. The eigenvalue density in the complex plane is non-zero within a wedge that encloses the negative real axis. Particle motion is diffusive at long times, but for short times we find a novel time-dependence of the mean-square displacement, $\langle r^2 \rangle \sim t^{2/d}$ in dimension d>2, associated with the imaginary parts of eigenvalues.

preprint, 1997, arXiv

Diffusion in a Random Velocity Field: Spectral Properties of a Non-Hermitian Fokker-Planck Operator

We study spectral properties of the Fokker-Planck operator that describes particles diffusing in a quenched random velocity field. This random operator is non-Hermitian and has eigenvalues occupying a finite area in the complex plane. We calculate the eigenvalue density and averaged one-particle Green's function, for weak disorder and dimension d>2. We relate our results to the time-evolution of particle density, and compare them with numerical simulations.
