Researcher profile

Quan Zhou

Quan Zhou contributes to research discovery and scholarly infrastructure.

Trust snapshot

Quick read

Trust 21 (Emerging), Verification L1, unclaimed author
17 works
0 followers
16 topics
4 close collaborators

Published work

17 published items

Preprint, 2023, arXiv

Fast Replica Exchange Stochastic Gradient Langevin Dynamics

Application of the replica exchange (i.e., parallel tempering) technique to Langevin Monte Carlo algorithms, especially stochastic gradient Langevin dynamics (SGLD), has scored great success in non-convex learning problems, but one potential limitation is the computational cost caused by running multiple chains. Upon observing that a large variance of the gradient estimator in SGLD essentially increases the temperature of the stationary distribution, we propose expediting tempering schemes for SGLD by directly estimating the bias caused by the stochastic gradient estimator. This simple idea enables us to simulate high-temperature chains at a negligible computational cost (compared to that of the low-temperature chain) while preserving the convergence to the target distribution. Our method is fundamentally different from the recently proposed m-reSGLD (multi-variance replica exchange SGLD) method in that the latter suffers from the low accuracy of the gradient estimator (e.g., the chain can fail to converge to the target) while our method benefits from it. Further, we derive a swapping rate that can be easily evaluated, providing another significant improvement over m-reSGLD. To theoretically demonstrate the advantage of our method, we develop convergence bounds in Wasserstein distances. Numerical examples for Gaussian mixture and inverse PDE models are also provided, which show that our method can converge quicker than the vanilla multi-variance replica exchange method.
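For orientation, the swap step of a generic two-chain replica exchange scheme can be sketched as follows. This is the textbook parallel-tempering acceptance rule evaluated with exact energies, not the corrected swapping rate derived in the paper; the function name `swap_probability` and its parameters are hypothetical and chosen only for illustration.

```python
import math


def swap_probability(u_low, u_high, tau_low, tau_high):
    """Metropolis probability of swapping the states of two tempered chains.

    u_low / u_high are the energies U(x) of the current states of the
    low- and high-temperature chains; tau_low < tau_high are their
    temperatures. Assumes exact energies (no stochastic-gradient noise).
    """
    log_ratio = (1.0 / tau_low - 1.0 / tau_high) * (u_low - u_high)
    # Cap the exponent at 0 so the result never exceeds 1.
    return math.exp(min(0.0, log_ratio))


# The swap is always accepted when the cold chain holds the higher-energy state.
p_favourable = swap_probability(u_low=2.0, u_high=1.0, tau_low=1.0, tau_high=10.0)
# Otherwise it is accepted with probability exp((1/tau_low - 1/tau_high) * (u_low - u_high)).
p_unfavourable = swap_probability(u_low=1.0, u_high=2.0, tau_low=1.0, tau_high=10.0)
```

With noisy energy estimates, as in SGLD, this naive rule is biased, which is why replica exchange SGLD methods need a corrected swapping rate.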

Preprint, 2022, arXiv

Deep Neural Network for Blind Visual Quality Assessment of 4K Content

4K content can deliver a more immersive visual experience to consumers thanks to its much higher spatial resolution. However, existing blind image quality assessment (BIQA) methods are not suitable for original and upscaled 4K content because of the expanded resolution and specific distortions. In this paper, we propose a deep learning-based BIQA model for 4K content, which on one hand can recognize true and pseudo 4K content and on the other hand can evaluate their perceptual visual quality. Considering that high spatial resolution can represent more abundant high-frequency information, we first propose a Grey-level Co-occurrence Matrix (GLCM) based texture complexity measure to select three representative image patches from a 4K image, which reduces the computational complexity and is shown through experiments to be very effective for overall quality prediction. Then we extract different kinds of visual features from the intermediate layers of the convolutional neural network (CNN) and integrate them into the quality-aware feature representation. Finally, two multilayer perceptron (MLP) networks map the quality-aware features to the class probability and the quality score for each patch, respectively. The overall quality index is obtained by average pooling of the patch results. The proposed model is trained in a multi-task learning manner, and we introduce an uncertainty principle to balance the losses of the classification and regression tasks. The experimental results show that the proposed model outperforms all compared BIQA metrics on four 4K content quality assessment databases.

Preprint, 2022, arXiv

Dimension-free Mixing for High-dimensional Bayesian Variable Selection

Yang et al. (2016) proved that the symmetric random walk Metropolis--Hastings algorithm for Bayesian variable selection is rapidly mixing under mild high-dimensional assumptions. We propose a novel MCMC sampler using an informed proposal scheme, which we prove achieves a much faster mixing time that is independent of the number of covariates, under the same assumptions. To the best of our knowledge, this is the first high-dimensional result which rigorously shows that the mixing rate of informed MCMC methods can be fast enough to offset the computational cost of local posterior evaluation. Motivated by the theoretical analysis of our sampler, we further propose a new approach called "two-stage drift condition" to studying convergence rates of Markov chains on general state spaces, which can be useful for obtaining tight complexity bounds in high-dimensional settings. The practical advantages of our algorithm are illustrated by both simulation studies and real data analysis.

Preprint, 2022, arXiv

Rapid Convergence of Informed Importance Tempering

Informed Markov chain Monte Carlo (MCMC) methods have been proposed as scalable solutions to Bayesian posterior computation on high-dimensional discrete state spaces, but theoretical results about their convergence behavior in general settings are lacking. In this article, we propose a class of MCMC schemes called informed importance tempering (IIT), which combine importance sampling and informed local proposals, and derive generally applicable spectral gap bounds for IIT estimators. Our theory shows that IIT samplers have remarkable scalability when the target posterior distribution concentrates on a small set. Further, both our theory and numerical experiments demonstrate that the informed proposal should be chosen with caution: the performance of some proposals may be very sensitive to the shape of the target distribution. We find that the "square-root proposal weighting" scheme tends to perform well in most settings.
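As a rough illustration of how importance sampling and informed local proposals fit together, the sketch below builds the transition matrix of an always-accept informed chain with square-root weighting on a small discrete space, and checks that reweighting by the inverse normalising constant recovers the target. The symmetric ring neighbourhood, the toy target `pi`, and the function name `iit_transition_matrix` are assumptions made for this example, not details taken from the abstract.

```python
import numpy as np


def iit_transition_matrix(pi, neighbors, h=np.sqrt):
    """Transition matrix of an informed chain that always accepts its move.

    From state x the chain jumps to a neighbour y with probability
    h(pi[y] / pi[x]) / Z[x], where Z[x] sums the weights over neighbours.
    Assumes symmetric neighbourhoods and a balancing function h
    (h(r) = r * h(1 / r)), which the square root satisfies.
    """
    n = len(pi)
    P = np.zeros((n, n))
    Z = np.zeros(n)
    for x in range(n):
        weights = np.array([h(pi[y] / pi[x]) for y in neighbors[x]])
        Z[x] = weights.sum()
        P[x, neighbors[x]] = weights / Z[x]
    return P, Z


# Toy target on a ring of five states; each state has two neighbours.
pi = np.array([0.10, 0.20, 0.30, 0.25, 0.15])
neighbors = [[(x - 1) % 5, (x + 1) % 5] for x in range(5)]
P, Z = iit_transition_matrix(pi, neighbors)

# The chain is stationary with respect to pi * Z rather than pi ...
stationary = pi * Z / np.sum(pi * Z)
# ... so each visited state x carries importance weight 1 / Z[x].
reweighted = stationary / Z
reweighted /= reweighted.sum()
```

Here `reweighted` matches `pi` up to floating-point error, which is the sense in which the importance weights correct for the informed proposal.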

Preprint, 2022, arXiv

The CDF W-mass, muon g-2, and dark matter in a $U(1)_{L_μ-L_τ}$ model with vector-like leptons

We study the CDF $W$-mass, muon $g-2$, and dark matter observables in a local $U(1)_{L_μ-L_τ}$ model in which the new particles include three vector-like leptons ($E_1,~ E_2,~ N$), a new gauge boson $Z'$, a scalar $S$ (breaking $U(1)_{L_μ-L_τ}$), a scalar dark matter $X_I$ and its partner $X_R$. We find that the CDF $W$-mass disfavors $m_{E_1}= m_{E_2}={m_N}$ or $s_L=s_R=0$, where $s_{L(R)}$ is the mixing parameter of the left (right)-handed fields of the vector-like leptons. A large mass splitting between $E_1$ and $E_2$ is favored when the difference between $s_L$ and $s_R$ becomes small. The muon $g-2$ anomaly can be simultaneously explained for an appropriate difference between $s_L$ $(m_{E_1})$ and $s_R$ $(m_{E_2})$, and some regions are excluded by the diphoton signal data of the 125 GeV Higgs. Combined with the CDF $W$-mass, muon $g-2$ anomaly and other relevant constraints, the correct dark matter relic density is mainly obtained in two different scenarios: (i) $X_IX_I\to Z'Z',~ SS$ for $m_{Z'}(m_S)<m_{X_I}$ and (ii) the co-annihilation processes for $\min(m_{E_1},m_{E_2},m_N,m_{X_R})$ close to $m_{X_I}$. Finally, we use the direct searches for $2\ell+E_T^{miss}$ events at the LHC to constrain the model, and show the allowed mass ranges of the vector-like leptons and dark matter.

Preprint, 2021, arXiv

An Energy Sharing Mechanism Achieving the Same Flexibility as Centralized Dispatch

Deploying distributed renewable energy at the demand side is an important measure to implement a sustainable society. However, the massive small solar and wind generation units are beyond the control of a central operator. To encourage users to participate in energy management and reduce the dependence on dispatchable resources, a peer-to-peer energy sharing scheme is proposed which releases the flexibility at the demand side. Every user makes decisions individually considering only local constraints; the microgrid operator announces the sharing prices subject to the coupling constraints without knowing users' local constraints. This can help protect privacy. We prove that the proposed mechanism can achieve the same disutility and flexibility as centralized dispatch, and develop an effective modified best response based algorithm for reaching the market equilibrium. The concept of absorbable region is presented to measure the operating flexibility under the proposed energy sharing mechanism. A linear programming based polyhedral projection algorithm is developed to compute that region. Case studies validate the theoretical results and show that the proposed method is scalable.

Preprint, 2021, arXiv

BubbleNet: Inferring micro-bubble dynamics with semi-physics-informed deep learning

Micro-bubbles and bubbly flows are widely observed and applied in chemical engineering and medicine, and involve deformation, rupture, and collision of bubbles, phase mixture, etc. We study bubble dynamics by setting up two numerical simulation cases: bubbly flow with a single bubble and with multiple bubbles, both confined in a microchannel, with parameters corresponding to their medical backgrounds. Both cases have medical applications. Multiphase flow simulation requires high computational accuracy because sparse meshing may cause component losses during the computation. Hence, data-driven methods can be adopted as a useful tool. Based on physics-informed neural networks (PINNs), we propose a novel deep learning framework, BubbleNet, which entails three main parts: deep neural networks (DNN) with sub-nets for predicting different physics fields; the semi-physics-informed part, with only the fluid continuum condition and the pressure Poisson equation $\mathcal{P}$ encoded within; and the time discretized normalizer (TDN), an algorithm to normalize field data per time step before training. We apply the traditional DNN and our BubbleNet to train on the coarsened simulation data and predict the physics fields of both bubbly flow cases. The BubbleNets are trained both with and without $\mathcal{P}$, from which we conclude that the 'physics-informed' part can serve as inner supervision. Results indicate that our framework can predict the physics fields more accurately, as estimated by the absolute prediction errors. Our deep learning predictions outperform traditional numerical methods computed with meshing of similar data density. The proposed network can potentially be applied to many other engineering fields.

Preprint, 2021, arXiv

Fairness in Forecasting and Learning Linear Dynamical Systems

In machine learning, training data often capture the behaviour of multiple subgroups of some underlying human population. When the amounts of training data for the subgroups are not controlled carefully, under-representation bias arises. We introduce two natural notions of subgroup fairness and instantaneous fairness to address such under-representation bias in time-series forecasting problems. In particular, we consider the subgroup-fair and instant-fair learning of a linear dynamical system (LDS) from multiple trajectories of varying lengths, and the associated forecasting problems. We provide globally convergent methods for the learning problems using hierarchies of convexifications of non-commutative polynomial optimisation problems. Our empirical results on a biased data set motivated by insurance applications and the well-known COMPAS data set demonstrate both the beneficial impact of fairness considerations on statistical performance and encouraging effects of exploiting sparsity on run time.

Preprint, 2021, arXiv

The Sensitivity of Word Embeddings-based Author Detection Models to Semantic-preserving Adversarial Perturbations

Authorship analysis is an important subject in the field of natural language processing. It allows the detection of the most likely writer of articles, news, books, or messages. This technique has multiple uses in tasks related to authorship attribution, detection of plagiarism, style analysis, sources of misinformation, etc. The focus of this paper is to explore the limitations and sensitivity of established approaches to adversarial manipulations of inputs. To this end, and using those established techniques, we first developed an experimental framework for author detection and input perturbations. Next, we experimentally evaluated the performance of the authorship detection model under a collection of semantic-preserving adversarial perturbations of input narratives. Finally, we compare and analyze the effects of different perturbation strategies and of input and model configurations on the author detection model.

Preprint, 2021, arXiv

Tighter Bound Estimation of Sensitivity Analysis for Incremental and Decremental Data Modification

In large-scale classification problems, the data set is often faced with frequent updates in which part of the data is added to or removed from the original data set. In this case, conventional incremental learning, which updates an existing classifier by explicitly modeling the data modification, is more efficient than retraining a new classifier from scratch. However, sometimes we are more interested in determining whether we should update the classifier or in performing some sensitivity analysis tasks. To deal with such tasks, we propose an algorithm to make rational inferences about the updated linear classifier without exactly updating the classifier. Specifically, the proposed algorithm can be used to estimate the upper and lower bounds of the updated classifier's coefficient matrix with a low computational complexity related to the size of the updated dataset. Both theoretical analysis and experimental results show that the proposed approach is superior to existing methods in terms of the tightness of the coefficients' bounds and computational complexity.

Preprint, 2020, arXiv

BiCANet: Bi-directional Contextual Aggregating Network for Image Semantic Segmentation

Exploring contextual information in convolutional neural networks (CNNs) has gained substantial attention in recent years for semantic segmentation. This paper introduces a Bi-directional Contextual Aggregating Network, called BiCANet, for semantic segmentation. Unlike previous approaches that encode context in feature space, BiCANet aggregates contextual cues from a categorical perspective and mainly consists of three parts: a contextual condensed projection block (CCPB), a bi-directional context interaction block (BCIB), and a multi-scale contextual fusion block (MCFB). More specifically, CCPB learns a category-based mapping through a split-transform-merge architecture, which condenses contextual cues with different receptive fields from intermediate layers. BCIB, on the other hand, employs dense skip connections to enhance class-level context exchange. Finally, MCFB integrates multi-scale contextual cues by investigating short- and long-range spatial dependencies. To evaluate BiCANet, we have conducted extensive experiments on three semantic segmentation datasets: PASCAL VOC 2012, Cityscapes, and ADE20K. The experimental results demonstrate that BiCANet outperforms recent state-of-the-art networks without any post-processing techniques. In particular, BiCANet achieves mIoU scores of 86.7%, 82.4% and 38.66% on the PASCAL VOC 2012, Cityscapes and ADE20K test sets, respectively.

Preprint, 2020, arXiv

Bridge the Domain Gap Between Ultra-wide-field and Traditional Fundus Images via Adversarial Domain Adaptation

For decades, advances in retinal imaging technology have enabled effective diagnosis and management of retinal disease using fundus cameras. Recently, ultra-wide-field (UWF) fundus imaging with the Optos camera has gradually come into use because it offers a broader view of the fundus, revealing some lesions that are not typically seen in traditional fundus images. Research on traditional fundus images is an active topic, but studies on UWF fundus images are few. One of the most important reasons is that UWF fundus images are hard to obtain. In this paper, for the first time, we explore domain adaptation from traditional fundus to UWF fundus images. We propose a flexible framework to bridge the gap between the two domains and co-train a UWF fundus diagnosis model by pseudo-labelling and adversarial learning. We design a regularisation technique to regulate the domain adaptation. We also apply MixUp to overcome the over-fitting caused by incorrectly generated pseudo-labels. Our experimental results on either single or both domains demonstrate that the proposed method can adapt well and transfer knowledge from traditional fundus images to UWF fundus images, improving the performance of retinal disease recognition.

Preprint, 2020, arXiv

Entropy Minimization vs. Diversity Maximization for Domain Adaptation

Entropy minimization has been widely used in unsupervised domain adaptation (UDA). However, existing works reveal that entropy minimization alone may result in collapsed trivial solutions. In this paper, we propose to avoid trivial solutions by further introducing diversity maximization. In order to achieve the minimum possible target risk for UDA, we show that diversity maximization should be carefully balanced with entropy minimization, the degree of which can be finely controlled with the use of deep embedded validation in an unsupervised manner. The proposed minimal-entropy diversity maximization (MEDM) can be directly implemented by stochastic gradient descent without the use of adversarial learning. Empirical evidence demonstrates that MEDM outperforms the state-of-the-art methods on four popular domain adaptation datasets.
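One common way to express this trade-off is sketched below: the average per-sample prediction entropy is minimised while the entropy of the batch-averaged prediction is maximised. The exact form of the MEDM objective is not given in the abstract, so the function `entropy_diversity_loss`, the trade-off weight `lam`, and the use of the mean prediction as the diversity term are assumptions for illustration only.

```python
import numpy as np


def entropy_diversity_loss(probs, lam=1.0, eps=1e-12):
    """Mean prediction entropy minus lam times the entropy of the mean prediction.

    probs: array of shape (batch, classes) holding softmax outputs on
    unlabeled target samples. Lower values mean confident predictions
    that are also spread across classes.
    """
    probs = np.asarray(probs, dtype=float)
    # Per-sample entropy, averaged over the batch (to be minimised).
    sample_entropy = -np.sum(probs * np.log(probs + eps), axis=1).mean()
    # Entropy of the batch-averaged prediction (to be maximised).
    mean_pred = probs.mean(axis=0)
    diversity = -np.sum(mean_pred * np.log(mean_pred + eps))
    return sample_entropy - lam * diversity


# A collapsed solution: every sample is confidently assigned to class 0.
collapsed = entropy_diversity_loss([[1.0, 0.0], [1.0, 0.0]])
# A diverse solution: confident predictions spread over both classes.
diverse = entropy_diversity_loss([[1.0, 0.0], [0.0, 1.0]])
```

Entropy minimization by itself cannot tell these two batches apart, since both have zero per-sample entropy; the diversity term is what penalises the collapsed one.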

Preprint, 2019, arXiv

A lossless data hiding scheme in JPEG images with segment coding

In this paper, we propose a lossless data hiding scheme in JPEG images. After the quantized DCT transform, the coefficients at high frequencies are relatively sparse and have small absolute values. To improve encoding efficiency, we put forward an encoding algorithm that searches for a high-frequency termination point and recodes the coefficients above it, so that spare space is reserved to embed secret data and appended data with no file expansion. The receiver can obtain the termination point through data analysis, extract the additional data, and losslessly recover the original JPEG image. Experimental results show that the proposed method has a larger capacity than state-of-the-art works.

Preprint, 2012, arXiv

Thermal boundary layer profiles in turbulent Rayleigh-Bénard convection in a cylindrical sample

We numerically investigate the structures of the near-plate temperature profiles close to the bottom and top plates of turbulent Rayleigh-Bénard flow in a cylindrical sample at Rayleigh numbers $Ra=10^8$ to $Ra=2\times10^{12}$ and Prandtl numbers $Pr=6.4$ and $Pr=0.7$ with the dynamical frame method [Q. Zhou and K.-Q. Xia, Phys. Rev. Lett. 104, 104301 (2010)], thus extending previous results for quasi-2-dimensional systems to 3D systems for the first time. The dynamical frame method shows that the measured temperature profiles in the spatially and temporally local frame are much closer to the temperature profile of a laminar, zero-pressure gradient boundary layer according to Pohlhausen than in the fixed reference frame. The deviation between the measured profiles in the dynamical reference frame and the laminar profiles increases with decreasing $Pr$, where the thermal BL is more exposed to the bulk fluctuations due to the thinner kinetic BL, and increasing $Ra$, where more plumes are passing the measurement location.

Preprint, 2011, arXiv

Horizontal Structures of Velocity and Temperature Boundary Layers in 2D Numerical Turbulent Rayleigh-Bénard Convection

We investigate the structures of the near-plate velocity and temperature profiles at different horizontal positions along the conducting bottom (and top) plate of a Rayleigh-Bénard convection cell, using two-dimensional (2D) numerical data obtained at the Rayleigh number $Ra=10^8$ and the Prandtl number $Pr=4.4$ of an Oberbeck-Boussinesq flow with constant material parameters. The results show that most of the time, and for both velocity and temperature, the instantaneous profiles scaled by the dynamical frame method [Q. Zhou and K.-Q. Xia, Phys. Rev. Lett. 104, 104301 (2010)] agree well with the classical Prandtl-Blasius laminar boundary layer (BL) profiles. Therefore, when averaging in the dynamical reference frames, which fluctuate with the respective instantaneous kinematic and thermal BL thicknesses, the obtained mean velocity and temperature profiles are also of Prandtl-Blasius type for nearly all horizontal positions. We further show that in certain situations the traditional definitions based on the time-averaged profiles can lead to unphysical BL thicknesses, while the dynamical method can still provide a well-defined BL thickness in such cases for both the kinematic and the thermal BLs.

Preprint, 2010, arXiv

Prandtl-Blasius temperature and velocity boundary layer profiles in turbulent Rayleigh-Bénard convection

The shapes of the velocity and temperature profiles near the horizontal conducting plates in turbulent Rayleigh-Bénard convection are studied numerically and experimentally over the Rayleigh number range $10^8\lesssim Ra\lesssim3\times10^{11}$ and the Prandtl number range $0.7\lesssim Pr\lesssim5.4$. The results show that both the temperature and velocity profiles agree well with the classical Prandtl-Blasius laminar boundary-layer profiles if they are re-sampled in the respective dynamical reference frames that fluctuate with the instantaneous thermal and velocity boundary-layer thicknesses.