Source author record

Feng Zhou

Feng Zhou appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Catalog footprint

What is connected

43 works

28 topics

4 close collaborators

preprint · 2026 · arXiv

Controllable Generation with Text-to-Image Diffusion Models: A Survey

In the rapidly advancing realm of visual generation, diffusion models have revolutionized the landscape, marking a significant shift in capabilities with their impressive text-guided generative functions. However, relying solely on text for conditioning these models does not fully cater to the varied and complex requirements of different applications and scenarios. Acknowledging this shortfall, a variety of studies aim to control pre-trained text-to-image (T2I) models to support novel conditions. In this survey, we undertake a thorough review of the literature on controllable generation with T2I diffusion models, covering both the theoretical foundations and practical advancements in this domain. Our review begins with a brief introduction to the basics of denoising diffusion probabilistic models (DDPMs) and widely used T2I diffusion models. We then reveal the controlling mechanisms of diffusion models, theoretically analyzing how novel conditions are introduced into the denoising process for conditional generation. Additionally, we offer a detailed overview of research in this area, organizing it into distinct categories from the condition perspective: generation with specific conditions, generation with multiple conditions, and universal controllable generation. For an exhaustive list of the controllable generation literature surveyed, please refer to our curated repository at https://github.com/PRIV-Creation/Awesome-Controllable-T2I-Diffusion-Models.

preprint · 2026 · arXiv

Meow-Omni 1: A Multimodal Large Language Model for Feline Ethology

Deciphering animal intent is a fundamental challenge in computational ethology, largely because of semantic aliasing, the phenomenon where identical external signals (e.g., a cat's purr) correspond to radically different internal states depending on physiological context. Existing Multimodal Large Language Models (MLLMs) are blind to high-frequency biological time-series data, restricting them to superficial behavioural pattern matching rather than genuine latent-state reasoning. To bridge this gap, we introduce Meow-Omni 1, the first open-source, quad-modal MLLM purpose-built for computational ethology. It natively fuses video, audio, and physiological time-series streams with textual reasoning. Through targeted architectural adaptation, we integrate specialized scientific encoders into a unified backbone and formalize intent inference via physiologically grounded cross-modal alignment. Evaluated on MeowBench, a novel, expert-verified quad-modal benchmark, Meow-Omni 1 achieves state-of-the-art intent-recognition accuracy (71.16%), substantially outperforming leading vision-language and omni-modal baselines. We release the complete open-source pipeline including model weights, training framework, and the Meow-10K dataset, to establish a scalable paradigm for inter-species intent understanding and to advance foundation models toward real-world veterinary diagnostics and wildlife conservation.

preprint · 2026 · arXiv

Mud-Standoff Effect Correction Based on Open-Short Calibration and Resistivity Consistency-Constrained Iterative Inversion for Oil-Based Mud Imagers

We propose a mud-standoff effect correction method and a set of approximate apparent resistivity inversion methods suitable for oil-based mud micro-resistivity imaging logging. To calibrate the influence of the mud layer on electrode measurement signals, this study integrates the Open-Short calibration (OSC) method with the three-layer impedance model of the oil-based mud resistivity imager. By treating the electrode and the mud layer as an integrated whole and simulating the open/short-circuit states via the finite element method, the independent extraction of the mud layer impedance signal is achieved, and the formation impedance signal is separated from the total impedance. For fast inversion of formation resistivity, standoff thickness (mud layer thickness), and the relative permittivity of the formation, a resistivity consistency-constrained iterative inversion method is further proposed. In this method, the formation impedance is first converted into resistivity, and then the consistency residual of the resistivity at different frequencies is used as the objective function to invert the approximate apparent resistivity of the formation through an iterative optimization algorithm. The effectiveness of the finite element-simulated OSC method, the objective function construction scheme, and the consistency inversion method is verified through both numerical models and an example of field data.

preprint · 2026 · arXiv

Upstream Laser-based Longitudinal Enhancement of Relativistic Photoelectrons

Controlling the longitudinal phase space of high-brightness relativistic electron beams is crucial for advancing a broad spectrum of charged-particle-based instrumentation and scientific frontiers. A generalized method for achieving this control involves manipulating the photoemission laser's temporal distribution at the picosecond level, a long-standing technical challenge. Recent developments in laser shaping have enabled the creation of high-power, picosecond-scale symmetrical and asymmetrical temporal profiles, capable of fine-tuning complex space-charge dynamics and external field effects in relativistic charged-particle beams. Here, we demonstrate that rather than deviations from theorized, idealized laser distributions, a controlled asymmetry can be harnessed to counteract accelerator-induced distortions. By implementing spatiotemporal shaping of the ultraviolet photocathode laser at the LCLS-II superconducting injector, we achieve deterministic control over the longitudinal phase space without downstream corrections. We find that this optical asymmetry induces a self-linearizing effect across both low (40 pC) and high (80 pC) charge regimes, effectively suppressing nonlinear compression and energy chirp. Consequently, this approach is expected to preserve a low emittance comparable to that of ideal flattop or regular Gaussian profiles, while delivering superior current uniformity and shot-to-shot stability. These results establish spatiotemporal laser shaping as a compact, generalizable tool for directly optimizing beam brightness at the source.

preprint · 2025 · arXiv

Federated Neural Nonparametric Point Processes

Temporal point processes (TPPs) are effective for modeling event occurrences over time, but they struggle with sparse and uncertain events in federated systems, where privacy is a major concern. To address this, we propose FedPP, a Federated neural nonparametric Point Process model. On the client side, FedPP integrates neural embeddings into Sigmoidal Gaussian Cox Processes (SGCPs), a flexible and expressive class of TPPs, allowing it to generate highly flexible intensity functions that capture client-specific event dynamics and uncertainties while efficiently summarizing historical records. For global aggregation, FedPP introduces a divergence-based mechanism that communicates the distributions of SGCPs' kernel hyperparameters between the server and clients, while keeping client-specific parameters local to ensure privacy and personalization. FedPP effectively captures event uncertainty and sparsity, and extensive experiments demonstrate its superior performance in federated settings, particularly with KL divergence and Wasserstein distance-based global aggregation.

preprint · 2022 · arXiv

Anticipated emotions associated with trust in autonomous vehicles

Trust in automation has been studied mainly from a cognitive perspective, though some researchers have shown that trust is also influenced by emotion. Therefore, it is essential to investigate the relationships between emotions and trust. In this study, we explored the pattern of 19 anticipated emotions associated with two levels of trust (i.e., low vs. high) elicited by two levels of autonomous vehicle (AV) performance (i.e., failure and non-failure) among 105 participants recruited from Amazon Mechanical Turk (AMT). Trust was assessed at three layers, i.e., dispositional, initial learned, and situational trust. The study was designed to measure how emotions are affected by low and high levels of trust. Situational trust was significantly correlated with emotions: a high level of trust significantly improved participants' positive emotions, and vice versa. We also identified the underlying factors of emotions associated with situational trust. Our results offer important implications for anticipated emotions associated with trust in AVs.

preprint · 2022 · arXiv

Basket-based Softmax

Softmax-based losses have achieved state-of-the-art performance on various tasks such as face recognition and re-identification. However, these methods rely heavily on clean datasets with global labels, which limits their use in many real-world applications. An important reason is that merging and organizing datasets from various temporal and spatial scenarios is usually not realistic, as noisy labels can be introduced and exponentially increasing resources are required. To address this issue, we propose a novel mining-during-training strategy called Basket-based Softmax (BBS), as well as its parallel version, to effectively train models on multiple datasets in an end-to-end fashion. Specifically, for each training sample, we use similarity scores as clues to mine negative classes from other datasets and dynamically add them to assist the learning of discriminative features. Experimentally, we demonstrate the efficiency and superiority of BBS on the tasks of face recognition and re-identification, with both simulated and real-world datasets.

preprint · 2022 · arXiv

Cause-and-Effect Analysis of ADAS: A Comparison Study between Literature Review and Complaint Data

Advanced driver assistance systems (ADAS) are designed to improve vehicle safety. However, it is difficult to achieve such benefits without understanding the causes and limitations of the current ADAS and their possible solutions. This study 1) investigated the limitations and solutions of ADAS through a literature review, 2) identified the causes and effects of ADAS through consumer complaints using natural language processing models, and 3) compared the major differences between the two. These two lines of research identified similar categories of ADAS causes, including human factors, environmental factors, and vehicle factors. However, academic research focused more on human factors of ADAS issues and proposed advanced algorithms to mitigate such issues while drivers complained more of vehicle factors of ADAS failures, which led to associated top consequences. The findings from these two sources tend to complement each other and provide important implications for the improvement of ADAS in the future.

preprint · 2022 · arXiv

De-biased Representation Learning for Fairness with Unreliable Labels

Removing bias while keeping all task-relevant information is challenging for fair representation learning methods since they would yield random or degenerate representations w.r.t. labels when the sensitive attributes correlate with labels. Existing works proposed to inject the label information into the learning procedure to overcome such issues. However, the assumption that the observed labels are clean is not always met. In fact, label bias is acknowledged as the primary source inducing discrimination. In other words, the fair pre-processing methods ignore the discrimination encoded in the labels either during the learning procedure or the evaluation stage. This contradiction puts a question mark on the fairness of the learned representations. To circumvent this issue, we explore the following question: Can we learn fair representations predictable to latent ideal fair labels given only access to unreliable labels? In this work, we propose a De-Biased Representation Learning for Fairness (DBRF) framework which disentangles the sensitive information from non-sensitive attributes whilst keeping the learned representations predictable to ideal fair labels rather than observed biased ones. We formulate the de-biased learning framework through information-theoretic concepts such as mutual information and information bottleneck. The core concept is that DBRF advocates not to use unreliable labels for supervision when sensitive information benefits the prediction of unreliable labels. Experimental results over both synthetic and real-world data demonstrate that DBRF effectively learns de-biased representations towards ideal labels.

preprint · 2022 · arXiv

Fully-integrated multipurpose microwave frequency identification system on a single chip

We demonstrate a fully-integrated multipurpose microwave frequency identification system on a silicon-on-insulator platform. Thanks to its multipurpose features, the chip is able to identify different types of microwave signals, including single-frequency, multiple-frequency, chirped and frequency-hopping microwave signals, as well as discriminate instantaneous frequency variation among the frequency-modulated signals. This demonstration exhibits a fully integrated solution and fully functional microwave frequency identification, which can meet the requirements for reduced size, weight and power in future advanced microwave photonic processors.

preprint · 2022 · arXiv

Improving Federated Learning Face Recognition via Privacy-Agnostic Clusters

Growing public concerns about data privacy in face recognition can be largely addressed by the federated learning (FL) paradigm. However, conventional FL methods perform poorly due to the uniqueness of the task: broadcasting class centers among clients is crucial for recognition performance but leads to privacy leakage. To resolve the privacy-utility paradox, this work proposes PrivacyFace, a framework that largely improves federated learning face recognition by communicating auxiliary and privacy-agnostic information among clients. PrivacyFace mainly consists of two components: First, a practical Differentially Private Local Clustering (DPLC) mechanism is proposed to distill sanitized clusters from local class centers. Second, a consensus-aware recognition loss subsequently encourages global consensuses among clients, which in turn results in more discriminative features. The proposed framework is mathematically proven to be differentially private, introduces a lightweight overhead, and yields prominent performance boosts (e.g., +9.63% and +10.26% for TAR@FAR=1e-4 on IJB-B and IJB-C respectively). Extensive experiments and ablation studies on a large-scale dataset have demonstrated the efficacy and practicability of our method.

preprint · 2022 · arXiv

Intrinsically accurate sensing with an optomechanical accelerometer

We demonstrate a microfabricated optomechanical accelerometer that is capable of percent-level accuracy without external calibration. To achieve this capability, we use a mechanical model of the device behavior that can be characterized by the thermal noise response along with an optical frequency comb readout method that enables high sensitivity, high bandwidth, high dynamic range, and SI-traceable displacement measurements. The resulting intrinsic accuracy was evaluated over a wide frequency range by comparing to a primary vibration calibration system and local gravity. The average agreement was found to be 2.1 % for the calibration system between 0.1 kHz and 15 kHz and better than 0.2 % for the static acceleration. This capability has the potential to replace costly external calibrations and improve the accuracy of inertial guidance systems and remotely deployed accelerometers. Due to the fundamental nature of the intrinsic accuracy approach, it could be extended to other optomechanical transducers, including force and pressure sensors.

preprint · 2022 · arXiv

Investigating Explanations in Conditional and Highly Automated Driving: The Effects of Situation Awareness and Modality

As the level of automation in vehicles increases, as in conditional and highly automated vehicles (AVs), drivers are becoming increasingly out of the control loop, especially in unexpected driving scenarios. Although it might not be necessary to require the drivers to intervene on most occasions, it is still important to improve drivers' situation awareness (SA) in unexpected driving scenarios to improve their trust in and acceptance of AVs. In this study, we conceptualized SA at the levels of perception (SA L1), comprehension (SA L2), and projection (SA L3), and proposed an SA level-based explanation framework based on explainable AI. Then, we examined the effects of these explanations and their modalities on drivers' situational trust, cognitive workload, as well as explanation satisfaction. A three (SA levels: SA L1, SA L2 and SA L3) by two (explanation modalities: visual, visual + audio) between-subjects experiment was conducted with 340 participants recruited from Amazon Mechanical Turk. The results indicated that by designing the explanations using the proposed SA-based framework, participants could redirect their attention to the important objects in the traffic and understand their meaning for the AV system. This improved their SA and helped them understand how the AV's behavior corresponded to the particular situation, which also increased their situational trust in the AV. The results showed that participants reported the highest trust with SA L2 explanations, although the mental workload was rated higher at this level. The results also provided insights into the relationship between the amount of information in explanations and modalities, showing that participants were more satisfied with visual-only explanations in the SA L1 and SA L2 conditions and were more satisfied with visual and auditory explanations in the SA L3 condition.

preprint · 2022 · arXiv

Isolated singularities for fractional Lane-Emden equations in the Serrin's supercritical case

In this paper, we give a classification of the isolated singularities of positive solutions to the semilinear fractional elliptic equation $$(E) \quad\quad (-Δ)^s u = |x|^θ u^{p}\quad {\rm in}\ \ B_1\setminus\{0\},\quad u= h\quad{\rm in}\ \ \mathbb{R}^N\setminus B_1,\quad $$ where $s\in(0,1)$, $θ>-2s$, $p>\frac{N+θ}{N-2s}$, $B_1$ is the unit ball centered at the origin of $\mathbb{R}^N$ with $N>2s$, and $h$ is a nonnegative Hölder continuous function in $\mathbb{R}^N\setminus B_1$. Our analysis of isolated singularities of $(E)$ is based on integral upper bounds and the study of the Poisson problem with the fractional Hardy operators. It is worth noting that our classification of isolated singularities holds in the Sobolev supercritical case $p>\frac{N+2s+2θ}{N-2s}$ for $s\in(0,1]$ under suitable assumptions on $h$.

preprint · 2022 · arXiv

Kerr optical parametric oscillation in a photonic crystal microring for accessing the infrared

Continuous wave optical parametric oscillation (OPO) provides a flexible approach for accessing mid-infrared wavelengths between 2 $μ$m and 5 $μ$m, but has not yet been integrated into silicon nanophotonics. Typically, Kerr OPO uses a single transverse mode family for pump, signal, and idler modes, and relies on a delicate balance to achieve normal (but close-to-zero) dispersion near the pump and the requisite higher-order dispersion needed for phase- and frequency-matching. Within integrated photonics platforms, this approach results in two major problems. First, the dispersion is very sensitive to geometry, so that small fabrication errors can have a large impact. Second, the device is susceptible to competing nonlinear processes near the pump. In this letter, we propose a flexible solution to infrared OPO that addresses these two problems, by using a silicon nitride photonic crystal microring (PhCR). The frequency shifts created by the PhCR bandgap enable OPO that would otherwise be forbidden. We report an intrinsic optical quality factor up to (1.2 $\pm$ 0.1)$\times$10$^6$ in the 2 $μ$m band, and use a PhCR ring to demonstrate an OPO with threshold power of (90 $\pm$ 20) mW dropped into the cavity, with the pump wavelength at 1998 nm, and the signal and idler wavelengths at 1937 nm and 2063 nm, respectively. We further discuss how to extend OPO spectral coverage in the mid-infrared. These results establish the PhCR OPO as a promising route for integrated laser sources in the infrared.
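
The wavelengths quoted in the abstract above can be checked against the energy-conservation condition for a singly pumped Kerr OPO, in which two pump photons convert into one signal and one idler photon, so that 2/λ_pump = 1/λ_signal + 1/λ_idler. The following is a minimal sketch of that check only; the function name is illustrative and not taken from the paper.

```python
def idler_wavelength_nm(pump_nm: float, signal_nm: float) -> float:
    """Idler wavelength implied by 2/lambda_pump = 1/lambda_signal + 1/lambda_idler."""
    inverse_idler = 2.0 / pump_nm - 1.0 / signal_nm
    if inverse_idler <= 0:
        raise ValueError("signal wavelength is too short for this pump wavelength")
    return 1.0 / inverse_idler


# Pump and signal wavelengths as quoted in the abstract (in nm).
predicted_idler = idler_wavelength_nm(1998.0, 1937.0)
print(f"predicted idler wavelength: {predicted_idler:.1f} nm")
```

The predicted value lies within 1 nm of the reported 2063 nm idler, which is consistent with the rounding of the quoted wavelengths.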

preprint · 2022 · arXiv

Quantifying Wetting Dynamics with Triboelectrification

Wetting is often perceived as an intrinsic surface property of materials, but determining its evolution is complicated by its complex dependence on roughness across the scales. The Wenzel state, where liquids have intimate contact with the rough substrate, and the Cassie-Baxter state, where liquids sit on air pockets formed between asperities, are only two states among the plethora of wetting behaviors. Furthermore, transitions from the Cassie-Baxter to the Wenzel state dictate completely different surface performance, such as anti-contamination, anti-icing, drag reduction, etc.; however, little is known about how transitions between the several wetting modes occur over time. In this paper, we show that wetting dynamics can be accurately quantified and tracked using solid-liquid triboelectrification. Theoretical underpinning reveals how surface micro-/nano-geometries regulate stability/infiltration, also demonstrating the generality of our theoretical approach in understanding wetting transitions.

preprint · 2022 · arXiv

Towards Privacy-Preserving, Real-Time and Lossless Feature Matching

Most visual retrieval applications store feature vectors for downstream matching tasks. These vectors, from which user information can be extracted, will cause privacy leakage if not carefully protected. To mitigate privacy risks, current works primarily utilize non-invertible transformations or fully cryptographic algorithms. However, transformation-based methods usually fail to achieve satisfactory matching performance, while cryptosystems suffer from heavy computational overheads. In addition, the security levels of current methods should be improved to withstand potential adversarial attacks. To address these issues, this paper proposes a plug-in module called SecureVector that protects features by random permutations, 4L-DEC converting and existing homomorphic encryption techniques. For the first time, SecureVector achieves real-time and lossless feature matching among sanitized features, along with much higher security levels than current state-of-the-art methods. Extensive experiments on face recognition, person re-identification, image retrieval, and privacy analyses demonstrate the effectiveness of our method. Given limited public projects in this field, the code for our method and the implemented baselines is open-sourced at https://github.com/IrvingMeng/SecureVector.

preprint · 2021 · arXiv

Combat COVID-19 Infodemic Using Explainable Natural Language Processing Models

Misinformation about COVID-19 is prevalent on social media as the pandemic unfolds, and the associated risks are extremely high. Thus, it is critical to detect and combat such misinformation. Recently, deep learning models using natural language processing techniques, such as BERT (Bidirectional Encoder Representations from Transformers), have achieved great success in detecting misinformation. In this paper, we proposed an explainable natural language processing model based on DistilBERT and SHAP (Shapley Additive exPlanations), chosen for their efficiency and effectiveness, to combat misinformation about COVID-19. First, we collected a dataset of 984 claims about COVID-19 with fact checking. By augmenting the data using back-translation, we doubled the sample size of the dataset and the DistilBERT model was able to obtain good performance (accuracy: 0.972; area under the curve: 0.993) in detecting misinformation about COVID-19. Our model was also tested on a larger dataset for AAAI2021 - COVID-19 Fake News Detection Shared Task and obtained good performance (accuracy: 0.938; area under the curve: 0.985). The performance on both datasets was better than traditional machine learning models. Second, in order to boost public trust in model prediction, we employed SHAP to improve model explainability, which was further evaluated using a between-subjects experiment with three conditions, i.e., text (T), text+SHAP explanation (TSE), and text+SHAP explanation+source and evidence (TSESE). The participants were significantly more likely to trust and share information related to COVID-19 in the TSE and TSESE conditions than in the T condition. Our results provided good implications in detecting misinformation about COVID-19 and improving public trust.

preprint · 2021 · arXiv

Efficient Inference of Flexible Interaction in Spiking-neuron Networks

The Hawkes process provides an effective statistical framework for analyzing the time-dependent interaction of neuronal spiking activities. Although utilized in many real applications, the classic Hawkes process is incapable of modelling inhibitory interactions among neurons. Instead, the nonlinear Hawkes process allows for a more flexible influence pattern with excitatory or inhibitory interactions. In this paper, three sets of auxiliary latent variables (Pólya-Gamma variables, latent marked Poisson processes and sparsity variables) are augmented to make functional connection weights in a Gaussian form, which allows for a simple iterative algorithm with analytical updates. As a result, an efficient expectation-maximization (EM) algorithm is derived to obtain the maximum a posteriori (MAP) estimate. We demonstrate the accuracy and efficiency of our algorithm on synthetic and real data. For real neural recordings, we show our algorithm can estimate the temporal dynamics of interaction and reveal the interpretable functional connectivity underlying neural spike trains.
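
To illustrate why a nonlinear link lets a Hawkes process represent inhibition, the sketch below evaluates an intensity of the form λ(t) = λ_max · sigmoid(base + Σ w_j · kernel(t − s)) over past spikes s of each source neuron j. This is an assumption-laden illustration, not the paper's model: the sigmoid link, the exponential kernel, and all names and parameter values are chosen here for simplicity.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def intensity(t, history, base, weights, lam_max=10.0, decay=1.0):
    """Firing intensity of one target neuron at time t.

    history: dict mapping source-neuron name -> list of past spike times
    weights: dict mapping source-neuron name -> signed influence on the target
             (positive = excitatory, negative = inhibitory)
    """
    activation = base
    for source, spikes in history.items():
        for spike_time in spikes:
            if spike_time < t:
                # Exponentially decaying influence of each past spike (illustrative kernel).
                activation += weights[source] * math.exp(-decay * (t - spike_time))
    # The sigmoid keeps the intensity in (0, lam_max) even for negative activation.
    return lam_max * sigmoid(activation)


rate_rest = intensity(1.0, {}, 0.0, {})
rate_excited = intensity(1.0, {"a": [0.5]}, 0.0, {"a": 2.0})
rate_inhibited = intensity(1.0, {"b": [0.8]}, 0.0, {"b": -3.0})
print(rate_rest, rate_excited, rate_inhibited)
```

With a linear link, a negative weight could drive the intensity below zero; the bounded link avoids this, which is what makes inhibitory interactions admissible.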

preprint · 2021 · arXiv

Hybrid-mode-family Kerr optical parametric oscillation for robust coherent light generation on chip

Optical parametric oscillation (OPO) using the third-order nonlinearity ($χ^{(3)}$) in integrated photonics platforms is an emerging approach for coherent light generation, and has shown great promise in achieving broad spectral coverage with small device footprints and at low pump powers. However, current $χ^{(3)}$ nanophotonic OPO devices use pump, signal, and idler modes of the same transverse spatial mode family. As a result, such single-mode-family OPO (sOPO) is inherently sensitive to dispersion and can be challenging to scalably fabricate and implement. In this work, we propose to use different families of transverse spatial modes for pump, signal, and idler, which we term hybrid-mode-family OPO (hOPO). We demonstrate its unprecedented robustness in dispersion versus device geometry, pump frequency, and temperature. Moreover, we show the capability of the hOPO scheme to generate a few milliwatts of output signal power with a power conversion efficiency of approximately 8% and without competitive processes. The hOPO scheme is an important counterpoint to existing sOPO approaches, and is particularly promising as a robust method to generate coherent on-chip visible and infrared light sources.

preprint · 2021 · arXiv

Predicting Driver Fatigue in Automated Driving with Explainability

Research indicates that monotonous automated driving increases the incidence of fatigued driving. Although many prediction models based on advanced machine learning techniques were proposed to monitor driver fatigue, especially in manual driving, little is known about how these black-box machine learning models work. In this paper, we proposed combining eXtreme Gradient Boosting (XGBoost) and SHAP (SHapley Additive exPlanations), chosen for their efficiency and accuracy, to predict driver fatigue with explanations. First, in order to obtain the ground truth of driver fatigue, PERCLOS (percentage of eyelid closure over the pupil over time) between 0 and 100 was used as the response variable. Second, we built a driver fatigue regression model using both physiological and behavioral measures with XGBoost and it outperformed other selected machine learning models with a root-mean-squared error (RMSE) of 3.847, a mean absolute error (MAE) of 1.768 and an adjusted $R^2$ of 0.996. Third, we employed SHAP to identify the most important predictor variables and uncovered the black-box XGBoost model by showing the main effects of most important predictor variables globally and explaining individual predictions locally. Such an explainable driver fatigue prediction model offered insights into how to intervene in automated driving when necessary, such as during the takeover transition period from automated driving to manual driving.
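
For readers unfamiliar with the three metrics quoted in the abstract above, the sketch below computes RMSE, MAE and adjusted R² from their standard definitions. The PERCLOS-like values and the predictor count are made up for illustration and are not data from the paper.

```python
import math


def regression_metrics(y_true, y_pred, n_predictors):
    """Return (RMSE, MAE, adjusted R^2) for paired observations and predictions."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(r) for r in residuals) / n
    mean_y = sum(y_true) / n
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    # Adjusted R^2 penalizes R^2 for the number of predictors used.
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)
    return rmse, mae, adj_r2


# Hypothetical PERCLOS values on the 0-100 scale (not from the paper).
observed = [10.0, 20.0, 30.0, 40.0, 50.0]
predicted = [12.0, 18.0, 33.0, 39.0, 50.0]
rmse, mae, adj_r2 = regression_metrics(observed, predicted, n_predictors=1)
print(f"RMSE={rmse:.3f} MAE={mae:.3f} adjusted R^2={adj_r2:.3f}")
```

Because RMSE squares the residuals before averaging, it is never smaller than MAE on the same data, which matches the ordering of the two values reported in the abstract.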

preprint · 2020 · arXiv

Additive Poisson Process: Learning Intensity of Higher-Order Interaction in Stochastic Processes

We present the Additive Poisson Process (APP), a novel framework that can model the higher-order interaction effects of the intensity functions in stochastic processes using lower dimensional projections. Our model combines the techniques in information geometry to model higher-order interactions on a statistical manifold and in generalized additive models to use lower-dimensional projections to overcome the effects from the curse of dimensionality. Our approach solves a convex optimization problem by minimizing the KL divergence from a sample distribution in lower dimensional projections to the distribution modeled by an intensity function in the stochastic process. Our empirical results show that our model is able to use samples observed in the lower dimensional space to estimate the higher-order intensity function with extremely sparse observations.

preprint · 2020 · arXiv

Asymptotic behaviors of governing equation of Gauged Sigma model for Heisenberg ferromagnet

In this note, we study weak solutions of the equation \begin{equation} Δu =\frac{4e^u}{1+e^u} -4π\sum^{N}_{i=1}δ_{p_i}+4π\sum^{M}_{j=1}δ_{q_j} \quad{\rm in}\;\; \mathbb{R}^2, \end{equation} where $\{δ_{p_i}\}_{i=1}^N$ (resp. $\{δ_{q_j}\}_{j=1}^M$) are Dirac masses concentrated at the points $p_i, i=1,\cdots, N$ (resp. $q_j, j=1,\cdots, M$) and $N-M>1$. This is a governing equation of the Gauged Sigma model for the Heisenberg ferromagnet. We prove that it has a sequence of solutions $u_β$ behaving as $-2πβ\ln |x|+O(1)$ at infinity with a free parameter $β\in(2,2(N-M))$, and we study asymptotic estimates in the extremal cases where $β$ is near $2$ and $2(N-M)$.

preprint · 2020 · arXiv

Broadband Optomechanical Sensing at the Thermodynamic Limit

Cavity optomechanics has opened new avenues of research in both fundamental physics and precision measurement by significantly advancing the sensitivity achievable in detecting attonewton forces, nanoparticles, magnetic fields, and gravitational waves. A fundamental limit to sensitivity for these measurements is energy exchange with the environment as described by the fluctuation-dissipation theorem. While the limiting sensitivity can be increased by increasing the mass or reducing the damping of the mechanical sensing element, these design tradeoffs lead to larger detectors or limit the range of mechanical frequencies that can be measured, excluding the bandwidth requirements for many real-world applications. We report on a microfabricated optomechanical sensing platform based on a Fabry-Perot microcavity and show that when operating as an accelerometer it can achieve nearly ideal broadband performance at the thermodynamic limit (Brownian motion of the proof mass) with the highest sensitivity reported to date over a wide frequency range ($314\,nm \cdot s^{-2}/\sqrt{Hz}$ over 6.8 kHz). This approach is applicable to a range of measurements from pressure and force sensing to seismology and gravimetry, including searches for new physics such as non-Newtonian gravity or dark matter.
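
The trade-off described in the abstract above (larger mass or lower damping improves the limiting sensitivity) follows from the standard textbook expression for the thermomechanical acceleration noise of a damped harmonic oscillator below resonance, a_th = sqrt(4 k_B T ω_0 / (m Q)). The sketch below evaluates that expression; the device parameters are placeholders chosen for illustration and are not the values of the reported sensor.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def thermal_accel_noise(mass_kg, resonance_hz, quality_factor, temperature_k=300.0):
    """Thermomechanical acceleration noise density in m s^-2 / sqrt(Hz)."""
    omega_0 = 2.0 * math.pi * resonance_hz
    return math.sqrt(4.0 * K_B * temperature_k * omega_0 / (mass_kg * quality_factor))


# Placeholder device parameters (assumptions, not from the paper).
noise_light = thermal_accel_noise(mass_kg=1e-6, resonance_hz=10e3, quality_factor=100.0)
noise_heavy = thermal_accel_noise(mass_kg=2e-6, resonance_hz=10e3, quality_factor=100.0)
noise_high_q = thermal_accel_noise(mass_kg=1e-6, resonance_hz=10e3, quality_factor=400.0)
print(noise_light, noise_heavy, noise_high_q)
```

Doubling the proof mass lowers the noise floor by a factor of sqrt(2), and quadrupling Q halves it, but a high Q also narrows the flat part of the response, which is the bandwidth cost the abstract refers to.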

preprint2020arXiv

Deep Spatial Gradient and Temporal Depth Learning for Face Anti-spoofing

Face anti-spoofing is critical to the security of face recognition systems. Depth-supervised learning has proven to be one of the most effective methods for face anti-spoofing. Despite this success, most previous works still formulate the problem as a single-frame multi-task one by simply augmenting the loss with depth, while neglecting detailed fine-grained information and the interplay between facial depths and moving patterns. In contrast, we design a new approach to detect presentation attacks from multiple frames based on two insights: 1) detailed discriminative clues (e.g., spatial gradient magnitude) between living and spoofing faces may be discarded through stacked vanilla convolutions, and 2) the dynamics of 3D moving faces provide important clues for detecting spoofing faces. The proposed method captures discriminative details via the Residual Spatial Gradient Block (RSGB) and efficiently encodes spatio-temporal information with the Spatio-Temporal Propagation Module (STPM). Moreover, a novel Contrastive Depth Loss is presented for more accurate depth supervision. To assess the efficacy of our method, we also collect a Double-modal Anti-spoofing Dataset (DMAD), which provides actual depth for each sample. Experiments demonstrate that the proposed approach achieves state-of-the-art results on five benchmark datasets: OULU-NPU, SiW, CASIA-MFSD, Replay-Attack, and the new DMAD. Code will be available at https://github.com/clks-wzz/FAS-SGTD.

preprint2020arXiv

Do not forget interaction: Predicting fatality of COVID-19 patients using logistic regression

Amid the ongoing COVID-19 pandemic, whether high-risk COVID-19 patients recover depends, to a large extent, on how early they are treated appropriately, before the virus causes irreversible consequences. In this research, we report an explainable, intuitive, and accurate machine learning model based on logistic regression to predict the fatality rate of COVID-19 patients using only three important blood biomarkers (lactic dehydrogenase, lymphocyte (%), and high-sensitivity C-reactive protein) and their interactions. We found that the model performed best when the fatality probability produced by the logistic regression model was over 0.8: it predicted patient fatalities more than 11.30 days in advance on average, and up to 34.91 days in advance, with an accumulative F1-score of 93.76% and an accumulative accuracy of 93.92%. Such a model can be used to identify high-risk COVID-19 patients from three blood biomarkers and help medical systems around the world plan critical medical resources amid this pandemic.
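
As a rough illustration of the kind of model this abstract describes, the sketch below scores a patient with a logistic model over three biomarkers plus interaction terms and flags probabilities above 0.8. It assumes the interactions are pairwise products. The coefficient values, the function names and the input units are hypothetical placeholders, not fitted values or code from the paper.

```python
import math
from itertools import combinations

def interaction_features(ldh, lymph_pct, hs_crp):
    """Return the three biomarkers followed by their pairwise products.

    Pairwise products are an assumption about what "interactions" means.
    """
    base = [ldh, lymph_pct, hs_crp]
    pairs = [a * b for a, b in combinations(base, 2)]
    return base + pairs

def fatality_probability(features, weights, bias):
    """Logistic model: sigmoid of a weighted sum of the features."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical placeholder coefficients, NOT values from the paper.
# Order: ldh, lymph_pct, hs_crp, ldh*lymph_pct, ldh*hs_crp, lymph_pct*hs_crp
WEIGHTS = [0.004, -0.08, 0.01, 0.0, 0.0, -0.0005]
BIAS = -2.0
THRESHOLD = 0.8  # probability cut-off quoted in the abstract

def is_high_risk(ldh, lymph_pct, hs_crp):
    """Flag a patient whose predicted probability exceeds the threshold."""
    features = interaction_features(ldh, lymph_pct, hs_crp)
    return fatality_probability(features, WEIGHTS, BIAS) > THRESHOLD
```

In practice the weights would come from fitting on patient data; the sketch only shows where the interaction terms enter the linear predictor.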

preprint2020arXiv

Exact sparse reconstruction form Vandermonde matrices

As a classical result in linear algebra, an underdetermined system of linear equations usually has infinitely many solutions. The sparsest of these solutions is significant in many applications. Finding it can be modeled as $l_0$-minimization; however, finding the sparsest solution of an underdetermined linear system is NP-hard. An important alternative is therefore to solve the $l_p$-minimization problem ($0<p\leq1$), which seeks a solution of minimal $p$-norm instead of the sparsest one. To study the equivalence between $l_0$-minimization and $l_p$-minimization, most related work adopts the Restricted Isometry Property (RIP) and the Restricted Isometry Constant (RIC). Under RIP and RIC, these works only handle the case where the solution $\breve{x}$ of $l_0$-minimization satisfies $\|\breve{x}\|_0<k$, where $k$ is a known fixed constant with $k<\frac{spark(A)}{2}$. One of the results of this paper is an analytic expression $p^*$ such that $l_p$-minimization is equivalent to $l_0$-minimization for every $\|\breve{x}\|_0<\frac{spark(A)}{2}$. We also consider the case where $A$ is a Vandermonde matrix and present an analytic expression $p^*$ such that the solution of $l_p$-minimization also solves $l_0$-minimization. Compared with similar results based on RIP and RIC, we do not need the uniqueness assumption, i.e., the solution $x^*$ of $l_0$-minimization does not have to be assumed unique, which is the main breakthrough of our result. Another advantage of our result is its computability, i.e., each part of the analytic expression can be easily calculated.
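
For orientation, the two problems compared in this abstract are usually written as follows. This is the standard textbook formulation, not a quotation from the paper; the right-hand side $b$ and the labels $(P_0)$, $(P_p)$ are notation assumed here.

```latex
\begin{align}
(P_0):\quad & \min_{x}\ \|x\|_0 \quad \text{subject to } Ax = b, \\
(P_p):\quad & \min_{x}\ \|x\|_p^p \quad \text{subject to } Ax = b,
  \qquad \|x\|_p^p = \sum_i |x_i|^p,\quad 0 < p \le 1 .
\end{align}
```

Here $\|x\|_0$ counts the nonzero entries of $x$, and $spark(A)$ is the smallest number of linearly dependent columns of $A$.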

preprint2020arXiv

Examining the Effects of Emotional Valence and Arousal on Takeover Performance in Conditionally Automated Driving

In conditionally automated driving, drivers have difficulty in takeover transitions as they become increasingly decoupled from the operational level of driving. Factors influencing takeover performance, such as takeover lead time and the engagement of non-driving related tasks, have been studied in the past. However, despite the important role emotions play in human-machine interaction and in manual driving, little is known about how emotions influence drivers' takeover performance. This study, therefore, examined the effects of emotional valence and arousal on drivers' takeover timeliness and quality in conditionally automated driving. We conducted a driving simulation experiment with 32 participants. Movie clips were played for emotion induction. Participants with different levels of emotional valence and arousal were required to take over control from automated driving, and their takeover time and quality were analyzed. Results indicate that positive valence led to better takeover quality in the form of a smaller maximum resulting acceleration and a smaller maximum resulting jerk. However, high arousal did not yield an advantage in takeover time. This study contributes to the literature by demonstrating how emotional valence and arousal affect takeover performance. The benefits of positive emotions carry over from manual driving to conditionally automated driving while the benefits of arousal do not.

preprint2020arXiv

Physical-Layer Security for Two-Hop Air-to-Underwater Communication Systems With Fixed-Gain Amplify-and-Forward Relaying

We analyze a secure two-hop mixed radio frequency (RF) and underwater wireless optical communication (UWOC) system using a fixed-gain amplify-and-forward (AF) relay. The UWOC channel is modeled using a unified mixture exponential-generalized Gamma distribution to consider the combined effects of air bubbles and temperature gradients on transmission characteristics. Both legitimate and eavesdropping RF channels are modeled using flexible $α$-$μ$ distributions. Specifically, we first derive both the probability density function (PDF) and cumulative distribution function (CDF) of the received signal-to-noise ratio (SNR) of the mixed RF and UWOC system. Based on the PDF and CDF expressions, we derive closed-form expressions for the tight lower bound of the secrecy outage probability (SOP) and the probability of non-zero secrecy capacity (PNZ), both of which are expressed in terms of the bivariate Fox's $H$-function. To utilize these analytical expressions, we derive asymptotic expressions of SOP and PNZ using only elementary functions. Also, we use the asymptotic expressions to determine the optimal transmitting power to maximize energy efficiency. Further, we thoroughly investigate the effect of levels of air bubbles and temperature gradients in the UWOC channel, and study the effect of the nonlinear characteristics of the transmission medium and the number of multipath clusters of the RF channel on the secrecy performance. Finally, all analyses are validated using Monte Carlo simulation.

preprint2020arXiv

Searching Central Difference Convolutional Networks for Face Anti-Spoofing

Face anti-spoofing (FAS) plays a vital role in face recognition systems. Most state-of-the-art FAS methods 1) rely on stacked convolutions and expert-designed networks, which are weak at describing detailed fine-grained information and easily become ineffective when the environment varies (e.g., different illumination), and 2) prefer long sequences as input to extract dynamic features, making them difficult to deploy in scenarios that need a quick response. Here we propose a novel frame-level FAS method based on Central Difference Convolution (CDC), which captures intrinsic detailed patterns by aggregating both intensity and gradient information. A network built with CDC, called the Central Difference Convolutional Network (CDCN), provides more robust modeling capacity than its counterpart built with vanilla convolution. Furthermore, over a specifically designed CDC search space, Neural Architecture Search (NAS) is used to discover a more powerful network structure (CDCN++), which can be assembled with a Multiscale Attention Fusion Module (MAFM) to further boost performance. Comprehensive experiments on six benchmark datasets show that the proposed method 1) achieves superior performance in intra-dataset testing (notably 0.2% ACER in Protocol-1 of the OULU-NPU dataset), and 2) generalizes well in cross-dataset testing (notably 6.5% HTER from CASIA-MFSD to Replay-Attack). The code is available at https://github.com/ZitongYu/CDCN.
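
One way to picture "aggregating both intensity and gradient information" is to blend an ordinary 3x3 filter response with the response of the same filter applied to differences from the center pixel. The sketch below does this in plain Python. The blend parameter `theta`, its default value, and the exact form of the blend are assumptions made for illustration, since the abstract does not state them. The authors' repository is the reference for the actual operator.

```python
def central_difference_conv2d(image, kernel, theta=0.7):
    """3x3 filter (no padding) blending intensity and central-difference terms.

    image:  list of rows of floats (at least 3x3)
    kernel: 3x3 list of weights, applied without flipping (cross-correlation,
            as is conventional in deep learning frameworks)
    theta:  hypothetical blend factor; 0 gives the plain filter response,
            1 gives the pure central-difference response
    """
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(1, rows - 1):
        out_row = []
        for c in range(1, cols - 1):
            center = image[r][c]
            vanilla = 0.0      # intensity term: sum of w * x
            difference = 0.0   # gradient term: sum of w * (x - center)
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    w = kernel[dr + 1][dc + 1]
                    x = image[r + dr][c + dc]
                    vanilla += w * x
                    difference += w * (x - center)
            out_row.append(theta * difference + (1.0 - theta) * vanilla)
        out.append(out_row)
    return out
```

On a flat image the difference term vanishes, so with `theta=1` the output is zero everywhere, while `theta=0` reduces to the ordinary filter.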

preprint2020arXiv

Singular Behavior of an Electrostatic--Elastic Membrane System with an External Pressure

We analyze nonnegative solutions of the nonlinear elliptic problem $Δu=\frac{λf(x)}{u^2}+P$, where $λ>0$ and $P\geq0$, on a bounded domain $Ω$ of $\mathbb{R}^N$ ($N\geq 1$) with a Dirichlet boundary condition. This equation models an electrostatic--elastic membrane system with an external pressure $P\geq 0$, where $λ>0$ denotes the applied voltage. First, we completely address the existence and nonexistence of positive solutions. The classification of all possible singularities at $|x|=0$ for nonnegative solutions $u(x)$ satisfying $u(0)=0$ is then analyzed for the special case where $Ω=B_1(0)\subset \mathbb{R}^2$ and $f(x)=|x|^α$ with $α\geq0$. In particular, we show that for some $α,$ $u(x)$ admits only the "isotropic" singularity at $|x|=0$, and otherwise $u(x)$ may admit the "anisotropic" singularity at $|x|=0$. When $u(x)$ admits the "isotropic" singularity at $|x|=0$, the refined singularity of $u(x)$ at $|x|=0$ is further investigated, depending on whether $P>0$, by applying Fourier analysis.

preprint2020arXiv

Solutions of nonhomogeneous equations involving Hardy potentials with singularities on the boundary

In this paper, we present a new distributional identity for the solutions of elliptic equations involving Hardy potentials with singularities located on the boundary of the domain. Then we use it to obtain the boundary isolated singular solutions of nonhomogeneous problems.

preprint2020arXiv

Visible Feature Guidance for Crowd Pedestrian Detection

Heavy occlusion and dense gathering in crowd scenes make pedestrian detection a challenging problem, because it is difficult to infer a precise full bounding box from the invisible parts of a person. To address this, we propose a mechanism called Visible Feature Guidance (VFG) for both training and inference. During training, we use the visible feature to regress the visible bounding box and the full bounding box simultaneously. During inference, we perform NMS only on the visible bounding boxes to obtain the best-fitting full box. This alleviates the adverse effect of NMS in crowd scenes and makes the full bounding box more precise. Furthermore, to ease feature association in downstream applications such as pedestrian tracking, we apply the Hungarian algorithm to associate parts of a human instance. Our proposed method consistently brings about 2~3% improvements in mAP and AP50 for both two-stage and one-stage detectors. It is also more effective for MR-2, especially with a stricter IoU. Experiments on the Crowdhuman, Cityperson, Caltech and KITTI datasets show that visible feature guidance helps detectors achieve better performance. Moreover, parts association produces a strong benchmark on Crowdhuman for the vision community.
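
The inference step described in this abstract, suppressing duplicates by visible-box overlap while reporting full boxes, can be sketched with a greedy NMS loop. The detection layout (dicts with `visible`, `full` and `score` keys), the function names and the 0.5 threshold are assumptions for illustration, not the authors' implementation.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def visible_guided_nms(detections, iou_threshold=0.5):
    """Greedy NMS where overlap is measured on visible boxes only.

    detections: list of dicts with hypothetical keys "visible", "full"
    (boxes as (x1, y1, x2, y2)) and "score". Returns the full boxes of the
    detections that survive suppression, highest score first.
    """
    remaining = sorted(detections, key=lambda d: d["score"], reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)
        kept.append(best)
        # Drop candidates whose visible box overlaps the kept one too much.
        remaining = [
            d for d in remaining
            if iou(best["visible"], d["visible"]) <= iou_threshold
        ]
    return [d["full"] for d in kept]
```

Two people standing close together can have heavily overlapping full boxes yet nearly disjoint visible boxes, so measuring overlap on the visible boxes keeps both of them.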

preprint2016arXiv

Classification of isolated singularities of positive solutions for Choquard equations

In this paper we classify the isolated singularities of positive solutions to the Choquard equation and prove the existence of isolated singular solutions.

preprint2016arXiv

Deep Deformation Network for Object Landmark Localization

We propose a novel cascaded framework, namely deep deformation network (DDN), for localizing landmarks in non-rigid objects. The hallmarks of DDN are its incorporation of geometric constraints within a convolutional neural network (CNN) framework, ease and efficiency of training, as well as generality of application. A novel shape basis network (SBN) forms the first stage of the cascade, whereby landmarks are initialized by combining the benefits of CNN features and a learned shape basis to reduce the complexity of the highly nonlinear pose manifold. In the second stage, a point transformer network (PTN) estimates local deformation parameterized as a thin-plate spline transformation for finer refinement. Our framework does not incorporate either handcrafted features or part connectivity, which enables an end-to-end shape prediction pipeline during both training and testing. In contrast to prior cascaded networks for landmark localization that learn a mapping from feature space to landmark locations, we demonstrate that the regularization induced through geometric priors in the DDN makes it easier to train, yet produces superior results. The efficacy and generality of the architecture are demonstrated through state-of-the-art performance on several benchmarks for multiple tasks such as facial landmark localization, human body pose estimation and bird part localization.

preprint2016arXiv

Embedding Label Structures for Fine-Grained Feature Representation

Recent algorithms in convolutional neural networks (CNN) have considerably advanced fine-grained image classification, which aims to differentiate subtle differences among subordinate classes. However, previous studies have rarely focused on learning a fine-grained and structured feature representation that is able to locate similar images at different levels of relevance, e.g., discovering cars from the same make or the same model, both of which require high precision. In this paper, we make two main contributions to tackle this problem. 1) A multi-task learning framework is designed to effectively learn fine-grained feature representations by jointly optimizing both classification and similarity constraints. 2) To model the multi-level relevance, label structures such as hierarchy or shared attributes are seamlessly embedded into the framework by generalizing the triplet loss. Extensive experiments have been conducted on three fine-grained datasets, i.e., the Stanford car, the car-333, and the food datasets, which contain either hierarchical labels or shared attributes. Our proposed method achieves very competitive performance, i.e., classification accuracy among the state of the art. More importantly, it significantly outperforms previous fine-grained feature representations for image retrieval at different levels of relevance.

preprint2016arXiv

Fine-grained Categorization and Dataset Bootstrapping using Deep Metric Learning with Humans in the Loop

Existing fine-grained visual categorization methods often suffer from three challenges: lack of training data, large number of fine-grained categories, and high intra-class vs. low inter-class variance. In this work we propose a generic iterative framework for fine-grained categorization and dataset bootstrapping that handles these three challenges. Using deep metric learning with humans in the loop, we learn a low dimensional feature embedding with anchor points on manifolds for each category. These anchor points capture intra-class variances and remain discriminative between classes. In each round, images with high confidence scores from our model are sent to humans for labeling. By comparing with exemplar images, labelers mark each candidate image as either a "true positive" or a "false positive". True positives are added into our current dataset and false positives are regarded as "hard negatives" for our metric learning model. Then the model is retrained with an expanded dataset and hard negatives for the next round. To demonstrate the effectiveness of the proposed framework, we bootstrap a fine-grained flower dataset with 620 categories from Instagram images. The proposed deep metric learning scheme is evaluated on both our dataset and the CUB-200-2011 Birds dataset. Experimental evaluations show significant performance gain using dataset bootstrapping and demonstrate state-of-the-art results achieved by the proposed deep metric learning methods.

preprint2016arXiv

Symmetry-Breaking Zeeman-Coherence Parametric Wave Mixing Magnetometry

The nonlinear magneto-optical effect has significantly impacted modern society with prolific applications ranging from precision mapping of the Earth's magnetic field to bio-magnetic sensing. Pioneering works on collisional spin-exchange effects have led to ultra-high magnetic field detection sensitivities at the level of $fT/\sqrt{Hz}$ using a single linearly-polarized probe light field. Here we demonstrate a nonlinear Zeeman-coherence parametric wave-mixing optical-atomic magnetometer using room temperature rubidium vapor that results in more than a three-order-of-magnitude optical signal-to-noise ratio (SNR) enhancement for extremely weak magnetic field sensing. This unprecedented enhancement was achieved with nearly a two-order-of-magnitude reduction in laser power while preserving the sensitivity of the widely-used single-probe beam optical-atomic magnetometry method. This new method opens a myriad of applications ranging from bio-magnetic imaging to precision measurement of the magnetic properties of subatomic particles.

preprint2015arXiv

Fine-grained Image Classification by Exploring Bipartite-Graph Labels

Given a food image, can a fine-grained object recognition engine tell "which restaurant which dish" the food belongs to? Such ultra-fine grained image recognition is the key for many applications like search by images, but it is very challenging because it needs to discern subtle differences between classes while dealing with the scarcity of training data. Fortunately, the ultra-fine granularity naturally brings rich relationships among object classes. This paper proposes a novel approach to exploit the rich relationships through bipartite-graph labels (BGL). We show how to model BGL in an overall convolutional neural network, and the resulting system can be optimized through back-propagation. We also show that it is computationally efficient in inference thanks to the bipartite structure. To facilitate the study, we construct a new food benchmark dataset, which consists of 37,885 food images collected from 6 restaurants and a total of 975 menus. Experimental results on this new food dataset and three other datasets demonstrate that BGL advances previous work in fine-grained object recognition. An online demo is available at http://www.f-zhou.com/fg_demo/.

preprint2015arXiv

On semi-linear elliptic equation arising from Micro-Electromechanical Systems with contacting elastic membrane

This paper is concerned with the nonlinear elliptic problem $-Δu=\frac{λ}{(a-u)^2}$ on a bounded domain $Ω$ of $\mathbb{R}^N$ with Dirichlet boundary conditions. This problem arises from Micro-Electromechanical Systems devices in the case where the elastic membrane contacts the ground plate on the boundary. We analyze the properties of minimal solutions to this equation when $λ>0$ and the function $a:\bar{Ω}\to[0,1]$ satisfies $a(x)\ge κ\,{\rm dist}(x,\partial Ω)^γ$ for some $κ>0$ and $γ\in(0,1)$. Our results show how the boundary decay of the membrane affects the solutions and the pull-in voltage $λ$.

preprint2011arXiv

Regularity of the extremal solution for some elliptic problems with advection

In this note, we investigate the regularity of the extremal solution $u^*$ for the semilinear elliptic equation $-\triangle u+c(x)\cdot\nabla u=λf(u)$ on a bounded smooth domain of $\mathbb{R}^n$ with Dirichlet boundary condition. Here $f$ is a positive nondecreasing convex function, exploding at a finite value $a\in (0, \infty)$. We show that the extremal solution is regular in the low-dimensional case. In particular, we prove that in the radial case, every extremal solution is regular in dimension two.

preprint2011arXiv

Semigroups and sequential importance sampling for multiway tables and beyond

When an interval of integers between the lower bound $l_i$ and the upper bound $u_i$ is the support of the marginal distribution $n_i|(n_{i-1}, \ldots, n_1)$, Chen et al. 2005 noticed that sampling $n_i$ from the interval at each step of the sequential importance sampling (SIS) procedure always produces a table which satisfies the marginal constraints. However, in general, the interval may not be equal to the support of the marginal distribution. In this case, the SIS procedure may produce tables which do not satisfy the marginal constraints, leading to rejection [Chen et al. 2006]. Rejecting tables is computationally expensive, and incorrect proposal distributions result in biased estimators for the number of tables given their marginal sums. This paper has three focuses: (1) we propose a correction coefficient which corrects an interval of integers between the lower bound $l_i$ and the upper bound $u_i$ to the support of the marginal distribution asymptotically, even with rejections and with the same time complexity as the original SIS procedure; (2) using univariate and bivariate logistic regression models, we present extensive experiments on simulated data sets for estimating the number of tables; and (3) we apply the volume test proposed by Diaconis and Efron 1985 to $2\times2\times6$ randomly generated tables to compare the performance of SIS versus MCMC. When estimating the number of tables in our simulation study, we used univariate and bivariate logistic regression models, since under these models the SIS procedure seems to have a higher rate of rejections even with small tables. We also apply our correction coefficients to data sets on coronary heart disease and occurrence of esophageal cancer.

preprint2010arXiv

Films with the discrete nano-DLC-particles as the field emission cascade

Films with discrete diamond-like-carbon (DLC) nanoparticles were prepared by deposition of a carbon nanoparticle beam. Their morphologies were imaged by Scanning Electron Microscopy (SEM) and Atomic Force Microscopy (AFM). The nanoparticles were found to be distributed discretely on the silicon (100) substrate. Their hemispherical shape was demonstrated by the AFM line profile. Electron energy loss spectroscopy (EELS) measurements showed an sp$^3$ ratio as high as 86%. The field-induced electron emission of the as-prepared cascade (nanoDLC/Si) was tested, and a current density of 1 mA/cm$^2$ was achieved at 10.2 V/μm.

43 published item(s)

Controllable Generation with Text-to-Image Diffusion Models: A Survey

Meow-Omni 1: A Multimodal Large Language Model for Feline Ethology

Mud-Standoff Effect Correction Based on Open-Short Calibration and Resistivity Consistency-Constrained Iterative Inversion for Oil-Based Mud Imagers

Upstream Laser-based Longitudinal Enhancement of Relativistic Photoelectrons

Federated Neural Nonparametric Point Processes

Anticipated emotions associated with trust in autonomous vehicles

Basket-based Softmax

Cause-and-Effect Analysis of ADAS: A Comparison Study between Literature Review and Complaint Data

De-biased Representation Learning for Fairness with Unreliable Labels

Fully-integrated multipurpose microwave frequency identification system on a single chip

Improving Federated Learning Face Recognition via Privacy-Agnostic Clusters

Intrinsically accurate sensing with an optomechanical accelerometer

Investigating Explanations in Conditional and Highly Automated Driving: The Effects of Situation Awareness and Modality

Isolated singularities for fractional Lane-Emden equations in the Serrin's supercritical case

Kerr optical parametric oscillation in a photonic crystal microring for accessing the infrared

Quantifying Wetting Dynamics with Triboelectrification

Towards Privacy-Preserving, Real-Time and Lossless Feature Matching

Combat COVID-19 Infodemic Using Explainable Natural Language Processing Models

Efficient Inference of Flexible Interaction in Spiking-neuron Networks

Hybrid-mode-family Kerr optical parametric oscillation for robust coherent light generation on chip

Predicting Driver Fatigue in Automated Driving with Explainability

Additive Poisson Process: Learning Intensity of Higher-Order Interaction in Stochastic Processes

Asymptotic behaviors of governing equation of Gauged Sigma model for Heisenberg ferromagnet

Broadband Optomechanical Sensing at the Thermodynamic Limit

Deep Spatial Gradient and Temporal Depth Learning for Face Anti-spoofing

Do not forget interaction: Predicting fatality of COVID-19 patients using logistic regression

Exact sparse reconstruction form Vandermonde matrices

Examining the Effects of Emotional Valence and Arousal on Takeover Performance in Conditionally Automated Driving

Physical-Layer Security for Two-Hop Air-to-Underwater Communication Systems With Fixed-Gain Amplify-and-Forward Relaying

Searching Central Difference Convolutional Networks for Face Anti-Spoofing

Singular Behavior of an Electrostatic--Elastic Membrane System with an External Pressure

Solutions of nonhomogeneous equations involving Hardy potentials with singularities on the boundary

Visible Feature Guidance for Crowd Pedestrian Detection

Classification of isolated singularities of positive solutions for Choquard equations

Deep Deformation Network for Object Landmark Localization

Embedding Label Structures for Fine-Grained Feature Representation

Fine-grained Categorization and Dataset Bootstrapping using Deep Metric Learning with Humans in the Loop

Symmetry-Breaking Zeeman-Coherence Parametric Wave Mixing Magnetometry

Fine-grained Image Classification by Exploring Bipartite-Graph Labels

On semi-linear elliptic equation arising from Micro-Electromechanical Systems with contacting elastic membrane

Regularity of the extremal solution for some elliptic problems with advection

Semigroups and sequential importance sampling for multiway tables and beyond

Films with the discrete nano-DLC-particles as the field emission cascade