Researcher profile

Le Zhang

Le Zhang contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
28 works
0 followers
14 topics
4 close collaborators

Published work

28 published items

preprint 2026 arXiv

Diffusion-APO: Trajectory-Aware Direct Preference Alignment for Video Diffusion Transformers

Efficiently aligning large-scale video diffusion models with human intent requires a scalable and trajectory-aware pathway that bridges the inherent discrepancy between training noise distributions and practical inference trajectories. While existing paradigms such as Direct Preference Optimization (DPO) and Group Relative Policy Optimization (GRPO) attempt to address this, they are often hindered by either reliance on bias-prone, complex reward models or suboptimal timestep sampling. In this paper, we propose Diffusion-APO (Aligned Preference Optimization), a trajectory-aware algorithm that resolves this misalignment by synchronizing training noise with inference-time denoising paths to maximize gradient signal efficacy. To translate this algorithmic innovation into a practical solution, we introduce a unified and modular RLHF framework that integrates online ranking, half-online anchoring, offline refinement, and distillation-aware drift correction. This framework enables flexible, multi-stage preference alignment across diverse data and computational constraints without relying on scalar-reward-based policy gradients. Through extensive experiments, we demonstrate that Diffusion-APO consistently outperforms standard baselines in visual quality and instruction following, while effectively preserving generative fidelity during model acceleration, providing a robust, end-to-end pathway for scalable video diffusion alignment.

preprint 2022 arXiv

Aesthetic Visual Question Answering of Photographs

Aesthetic assessment of images can be categorized into two main forms: numerical assessment and language assessment. Aesthetic captioning of photographs is the only aesthetic language assessment task that has been addressed so far. In this paper, we propose a new task of aesthetic language assessment: aesthetic visual question answering (AVQA) of images. Given a question about an image's aesthetics, the model predicts the answer. We use images from www.flickr.com. The objective QA pairs are generated by the proposed aesthetic attribute analysis algorithms. Moreover, we introduce subjective QA pairs that are converted from aesthetic numerical labels and sentiment analysis from large-scale pre-trained models. We build the first aesthetic visual question answering dataset, AesVQA, which contains 72,168 high-quality images and 324,756 pairs of aesthetic questions. Two methods for adjusting the data distribution are proposed and shown to improve the accuracy of existing models. This is the first work that both addresses the task of aesthetic VQA and introduces subjectiveness into VQA tasks. The experimental results reveal that our methods outperform other VQA models on this new task.

preprint 2022 arXiv

Attribute Controllable Beautiful Caucasian Face Generation by Aesthetics Driven Reinforcement Learning

In recent years, image generation has made great strides in image quality, producing high-fidelity results. More recently, architecture designs have emerged that enable GANs to learn, without supervision, the semantic attributes represented in different layers. However, there is still a lack of research on generating face images more consistent with human aesthetics. Based on EigenGAN [He et al., ICCV 2021], we build reinforcement learning techniques into the generator of EigenGAN. The agent tries to figure out how to alter the semantic attributes of the generated human faces towards more preferable ones. To accomplish this, we trained an aesthetics scoring model that can conduct facial beauty prediction. We can also use this scoring model to analyze the correlation between face attributes and aesthetics scores. Empirically, off-the-shelf reinforcement learning techniques do not work well. Instead, we present a new variant incorporating ingredients that have emerged in the reinforcement learning community in recent years. Compared to the original generated images, the adjusted ones show clear distinctions concerning various attributes. Experimental results obtained using MindSpore show the effectiveness of the proposed method. Altered facial images are commonly more attractive, with significantly improved aesthetic levels.

preprint 2022 arXiv

Cross-correlation of Planck CMB lensing with DESI galaxy groups

We measure the cross-correlation between galaxy groups constructed from DESI Legacy Imaging Survey DR8 and Planck CMB lensing, over an overlapping sky area of 16876 $\rm deg^2$. The detections are significant and consistent with the expected signal of the large-scale structure of the universe, over group samples of various redshift, mass, richness $N_{\rm g}$ and over various scale cuts. The overall S/N is 40 for a conservative sample with $N_{\rm g}\geq 5$, and increases to $50$ for the sample with $N_{\rm g}\geq 2$. Adopting the Planck 2018 cosmology, we constrain the density bias of groups with $N_{\rm g}\geq 5$ as $b_{\rm g}=1.31\pm 0.10$, $2.22\pm 0.10$, $3.52\pm 0.20$ at $0.1<z\leq 0.33$, $0.33<z\leq 0.67$, $0.67<z\leq1$ respectively. The group catalog provides the estimation of group halo mass and therefore allows us to detect the dependence of bias on group mass with high significance. It also allows us to compare the measured bias with the theoretically predicted one using the estimated group mass. We find excellent agreement for the two high redshift bins. However, the measured bias is lower than the theoretical prediction by $\sim 3σ$ for the lowest redshift bin. Another interesting finding is the significant impact of the thermal Sunyaev-Zel'dovich (tSZ) effect. It contaminates the galaxy group-CMB lensing cross-correlation at the $\sim 30\%$ level, and must be deprojected first in CMB lensing reconstruction.

preprint 2022 arXiv

EDN: Salient Object Detection via Extremely-Downsampled Network

Recent progress on salient object detection (SOD) mainly benefits from multi-scale learning, where the high-level and low-level features collaborate in locating salient objects and discovering fine details, respectively. However, most efforts are devoted to low-level feature learning by fusing multi-scale features or enhancing boundary representations. High-level features, although long proven effective for many other tasks, have barely been studied for SOD. In this paper, we address this gap and show that enhancing high-level features is essential for SOD as well. To this end, we introduce an Extremely-Downsampled Network (EDN), which employs an extreme downsampling technique to effectively learn a global view of the whole image, leading to accurate salient object localization. To accomplish better multi-level feature fusion, we construct the Scale-Correlated Pyramid Convolution (SCPC) to build an elegant decoder for recovering object details from the above extreme downsampling. Extensive experiments demonstrate that EDN achieves state-of-the-art performance with real-time speed. Our efficient EDN-Lite also achieves competitive performance with a speed of 316fps. Hence, this work is expected to spark some new thinking in SOD. Code is available at https://github.com/yuhuan-wu/EDN.

preprint 2022 arXiv

Effects of Active Galactic Nucleus Feedback on Cold Gas Depletion and Quenching of Central Galaxies

We investigate the influence of active galactic nucleus (AGN) feedback on the galaxy cold gas content and its connection to galaxy quenching in three hydrodynamical simulations of Illustris, IllustrisTNG and SIMBA. By comparing to the observed atomic and molecular neutral hydrogen measurements for central galaxies, we find that Illustris over-predicts the cold gas masses in star-forming galaxies and significantly under-predicts them for quenched galaxies. IllustrisTNG performs better in this comparison than Illustris, but quenched galaxies retain too much cold gas compared with observations. SIMBA shows good agreement with observations, by depleting the global cold gas reservoir for quenched galaxies. We find that the discrepancies in IllustrisTNG are caused by its weak kinetic AGN feedback that only redistributes the cold gas from the inner disks to the outer regions and reduces the inner cold gas densities. It agrees with observations much better when only the cold gas within the stellar disk is considered to infer the star formation rates. From dependences of cold gas reservoir on the black hole mass and Eddington ratio, we find that the cumulative energy release during the black hole growth is the dominant reason for the cold gas depletion and thus the galaxy quenching. We further measure the central stellar surface density within 1 kpc ($Σ_1$) for the high-resolution run of IllustrisTNG and find a tight correlation between $Σ_1$ and black hole mass. It suggests that the observed decreasing trend of cold gas mass with $Σ_1$ is also a reflection of the black hole growth.

preprint 2022 arXiv

Forecasts on CMB lensing observations with AliCPT-1

AliCPT-1 is the first Chinese CMB experiment aiming for high precision measurement of Cosmic Microwave Background B-mode polarization. The telescope, currently under deployment in Tibet, will observe in two frequency bands centered at 90 and 150 GHz. We forecast the CMB lensing reconstruction, lensing-galaxy as well as lensing-CIB (Cosmic Infrared Background) cross correlation signal-to-noise ratio (SNR) for AliCPT-1. We consider two stages with different integrated observation time, namely "4 module*yr" (first stage) and "48 module*yr" (final stage). For lensing reconstruction, we use three different quadratic estimators, namely temperature-only, polarization-only and minimum-variance estimators, using curved sky geometry. We take into account the impact of inhomogeneous hit counts as well as of the mean-field bias due to incomplete sky coverage. In the first stage, our results show that the 150 GHz channel is able to measure the lensing signal at $15σ$ significance with the minimum-variance estimator. In the final stage, the measurement significance will increase to $31σ$. We also combine the data from the two frequency bands in the harmonic domain to optimize the SNR. Our results show that the coadding procedure can significantly reduce the reconstruction bias in the multipole range l>800. Thanks to the high quality of the polarization data in the final stage of AliCPT-1, the EB estimator will dominate the lensing reconstruction in this stage. We also estimate the SNR of cross-correlations between AliCPT-1 CMB lensing and other tracers of the large scale structure of the universe. For its cross-correlation with DESI galaxies/quasars, we report the cross-correlation SNR = 10-20 for the 4 redshift bins at 0.05<z<2.1. In the first stage, the total SNR is about $32$. In the final stage, the lensing-galaxy cross-correlation can reach SNR=52.

preprint 2022 arXiv

Mass of the dynamically hot inner stellar halo predicts the ancient accreted stellar mass

Galactic dynamical structures are fossil records of the assembly histories of galaxies. By analyzing the cosmological hydrodynamical simulation TNG50, we find that a dynamical structure that we call the "hot inner stellar halo," defined by stars on dynamically hot orbits with circularity $λ_z < 0.5$ at $3.5\,{\rm kpc}<r \lesssim 2\,R_e$, is a strong indicator of the mass of accreted satellite galaxies. We find a strong correlation between the mass of this hot inner stellar halo and the total ex situ stellar mass. There is a similarly strong correlation with the stellar mass of the most massive secondary galaxy ever merged. These TNG50 correlations are compatible with those predicted by other simulations, for example by TNG100 across the whole mass range under study (galaxy stellar masses, $M_*$, in the $10^{10.3-11.6}\,M_\odot$ range) and by EAGLE for $M_* \gtrsim 10^{10.6}\,M_\odot$ galaxies. This shows that our predictions are robust across different galaxy formation and feedback models and hold across a wide range of numerical resolution. The hot inner stellar halo is a product of massive and typically ancient mergers, with inner-halo stars exhibiting three main physical origins: accreted and stripped from massive satellites, dynamically heated by mergers from the bulge and/or disk in the main progenitor, and formed from star formation triggered during mergers. The mass of the hot inner stellar halo defined in this paper is a quantity that can be robustly obtained for real galaxies by applying a population-orbit superposition method to integral-field-unit spectroscopy data, out to a distance of $\sim2\,R_e$, which is possible with current observations. Hence, this paper shows that integral-field-unit observations and dynamical models of the inner regions of galaxies provide a way to quantitatively determine the mass of ancient accreted satellites.

preprint 2022 arXiv

Mining Relations among Cross-Frame Affinities for Video Semantic Segmentation

The essence of video semantic segmentation (VSS) is how to leverage temporal information for prediction. Previous efforts are mainly devoted to developing new techniques to calculate the cross-frame affinities such as optical flow and attention. Instead, this paper contributes from a different angle by mining relations among cross-frame affinities, upon which better temporal information aggregation could be achieved. We explore relations among affinities in two aspects: single-scale intrinsic correlations and multi-scale relations. Inspired by traditional feature processing, we propose Single-scale Affinity Refinement (SAR) and Multi-scale Affinity Aggregation (MAA). To make it feasible to execute MAA, we propose a Selective Token Masking (STM) strategy to select a subset of consistent reference tokens for different scales when calculating affinities, which also improves the efficiency of our method. At last, the cross-frame affinities strengthened by SAR and MAA are adopted for adaptively aggregating temporal information. Our experiments demonstrate that the proposed method performs favorably against state-of-the-art VSS methods. The code is publicly available at https://github.com/GuoleiSun/VSS-MRCFA

preprint 2022 arXiv

Probing Simile Knowledge from Pre-trained Language Models

Simile interpretation (SI) and simile generation (SG) are challenging tasks for NLP because models require adequate world knowledge to produce predictions. Previous works have employed many hand-crafted resources to inject knowledge into models, which is time-consuming and labor-intensive. In recent years, approaches based on pre-trained language models (PLMs) have become the de facto standard in NLP since they learn generic knowledge from a large corpus. The knowledge embedded in PLMs may be useful for SI and SG tasks. Nevertheless, few works have explored it. In this paper, we probe simile knowledge from PLMs to solve the SI and SG tasks in the unified framework of simile triple completion for the first time. The backbone of our framework is to construct masked sentences with manual patterns and then predict the candidate words in the masked position. In this framework, we adopt a secondary training process (Adjective-Noun mask Training) with the masked language model (MLM) loss to enhance the prediction diversity of candidate words in the masked position. Moreover, pattern ensemble (PE) and pattern search (PS) are applied to improve the quality of predicted words. Finally, automatic and human evaluations demonstrate the effectiveness of our framework in both SI and SG tasks.

preprint 2022 arXiv

Realization of a photonic topological insulator in Kagome crystals at terahertz wavelengths

Topological systems are inherently robust to disorder and continuous perturbations, resulting in dissipation-free edge transport of electrons in quantum solids, or reflectionless guiding of photons and phonons in classical wave systems characterized by topological invariants. Despite considerable efforts, the direct experimental demonstration of theoretically predicted robust, lossless energy transport in topological insulators operating at terahertz frequencies still requires further investigation to shed light on the unique properties enabled by topological protection. Here, we introduce a Kagome lattice that exhibits a new class of symmetry-protected topological phases with very low Berry curvature but nontrivial bulk polarization, and fabricate an optical topological insulator that exhibits the valley Hall effect. Theoretical analysis shows that four types of edge states can be obtained. Measurements of THz-TDs with high time-resolution demonstrate that terahertz waves propagating along the straight topological edge and the Z-shaped edge with sharp turns have almost the same high transmission in the 0.440 THz to 0.457 THz range. These results quantitatively illustrate the suppression of backscattering due to the non-trivial topology of the structure. The THz-TDs measurement yields amplitude and phase information, a significant advantage over general broadband infrared, single wavelength continuous-wave THz measurements and visible spectroscopy. It allows further exploration of the effective refractive index, group velocity and dispersion relations of edge states. Our work offers possibilities for advanced control of the propagation and manipulation of THz waves, and facilitates applications including sixth-generation (6G) wireless communication, terahertz integrated circuits, and interconnects for intrachip and interchip communication.

preprint 2022 arXiv

Ret3D: Rethinking Object Relations for Efficient 3D Object Detection in Driving Scenes

Current efficient LiDAR-based detection frameworks make little use of object relations, which are naturally present both spatially and temporally. To this end, we introduce a simple, efficient, and effective two-stage detector, termed Ret3D. At the core of Ret3D is the utilization of novel intra-frame and inter-frame relation modules to capture the spatial and temporal relations, respectively. More specifically, the intra-frame relation module (IntraRM) encapsulates the intra-frame objects into a sparse graph and thus allows us to refine the object features through efficient message passing. On the other hand, the inter-frame relation module (InterRM) densely connects each object in its corresponding tracked sequences dynamically, and leverages such temporal information to further enhance its representations efficiently through a lightweight transformer network. We instantiate our novel designs of IntraRM and InterRM with general center-based or anchor-based detectors and evaluate them on Waymo Open Dataset (WOD). With negligible extra overhead, Ret3D achieves state-of-the-art performance, being 5.5% and 3.2% higher than the recent competitor in terms of the LEVEL 1 and LEVEL 2 mAPH metrics on vehicle detection, respectively.

preprint 2022 arXiv

Sensitivity tests of cosmic velocity fields to massive neutrinos

We investigate the impacts of massive neutrinos on the cosmic velocity fields, employing high-resolution cosmological $N$-body simulations provided by the information-optimized CUBE code, where cosmic neutrinos are evolved using collisionless hydrodynamics and their perturbations can be accurately resolved. In this study we focus, for the first time, on the analysis of massive-neutrino induced suppression effects in various cosmic velocity field components: velocity magnitude, divergence, vorticity and dispersion. By varying the neutrino mass sum $M_ν$ from 0 to 0.4 eV, the simulations show that the power spectrum of vorticity, which is exclusively sourced by non-linear structure formation that is significantly affected by massive neutrinos, is very sensitive to the mass sum, potentially providing novel signatures for detecting massive neutrinos. Furthermore, using the chi-square statistic, we quantitatively test the sensitivity of the density and velocity power spectra to the neutrino mass sum. Indeed, we find that the vorticity spectrum has the highest sensitivity, and the null hypothesis of massless neutrinos is incompatible with both vorticity and divergence spectra from $M_ν=0.1$ eV at high significance ($p$-value $= 0.03$ and $0.07$, respectively). These results demonstrate clearly the importance of peculiar velocity field measurements, in particular of the vorticity and divergence components, in determining the neutrino mass and mass hierarchy.

preprint 2022 arXiv

Strong Neel ordering and luminescence correlation in a two-dimensional antiferromagnet

The magneto-optical effect has been widely used in light modulation, optical sensing and information storage. Recently discovered two-dimensional (2D) van der Waals layered magnets are considered promising platforms for investigating novel magneto-optical phenomena and devices, due to their long-range magnetic ordering down to atomically-thin thickness, rich species and tunable properties. However, the majority of 2D antiferromagnets suffer from low luminescence efficiency, which hinders their magneto-optical investigations and applications. Here, we uncover strong light-magnetic ordering interactions in 2D antiferromagnetic MnPS3 utilizing a newly-emerged near-infrared photoluminescence (PL) mode far below its intrinsic bandgap. This in-gap PL mode shows strong correlation with the Neel ordering and persists down to monolayer thickness. Combining DFT, STEM and XPS, we illustrate the origin of the PL mode and its correlation with Neel ordering, which can be attributed to the oxygen ion-mediated states. Moreover, the PL strength can be further tuned and enhanced using ultraviolet-ozone treatment. Our studies offer an effective approach to investigate light-magnetic ordering interactions in 2D antiferromagnetic semiconductors.

preprint 2022 arXiv

SUBS: Subtree Substitution for Compositional Semantic Parsing

Although sequence-to-sequence models often achieve good performance in semantic parsing for i.i.d. data, their performance is still inferior in compositional generalization. Several data augmentation methods have been proposed to alleviate this problem. However, prior work only leveraged superficial grammar or rules for data augmentation, which resulted in limited improvement. We propose to use subtree substitution for compositional data augmentation, where we consider subtrees with similar semantic functions as exchangeable. Our experiments showed that such augmented data led to significantly better performance on SCAN and GeoQuery, and reached new SOTA on compositional split of GeoQuery.

preprint 2022 arXiv

TreeMix: Compositional Constituency-based Data Augmentation for Natural Language Understanding

Data augmentation is an effective approach to tackle over-fitting. Many previous works have proposed different data augmentation strategies for NLP, such as noise injection, word replacement, and back-translation. Though effective, they miss one important characteristic of language, compositionality: the meaning of a complex expression is built from its sub-parts. Motivated by this, we propose a compositional data augmentation approach for natural language understanding called TreeMix. Specifically, TreeMix leverages constituency parse trees to decompose sentences into constituent sub-structures and the Mixup data augmentation technique to recombine them to generate new sentences. Compared with previous approaches, TreeMix introduces greater diversity to the samples generated and encourages models to learn the compositionality of NLP data. Extensive experiments on text classification and SCAN demonstrate that TreeMix outperforms current state-of-the-art data augmentation methods.

preprint 2021 arXiv

The design of the Ali CMB Polarization Telescope receiver

Ali CMB Polarization Telescope (AliCPT-1) is the first CMB degree-scale polarimeter to be deployed on the Tibetan plateau at 5,250m above sea level. AliCPT-1 is a 90/150 GHz 72 cm aperture, two-lens refracting telescope cooled down to 4 K. Alumina lenses, 800mm in diameter, image the CMB in a 33.4° field of view on a 636mm wide focal plane. The modularized focal plane consists of dichroic polarization-sensitive Transition-Edge Sensors (TESes). Each module includes 1,704 optically active TESes fabricated on a 150mm diameter silicon wafer. Each TES array is read out with a microwave multiplexing readout system capable of a multiplexing factor up to 2,048. Such a large multiplexing factor has allowed the practical deployment of tens of thousands of detectors, enabling the design of a receiver that can operate up to 19 TES arrays for a total of 32,376 TESes. AliCPT-1 leverages the technological advancements in the detector design from multiple generations of previously successful feedhorn-coupled polarimeters, and in the instrument design from BICEP-3, but applied on a larger scale. The cryostat receiver is currently under integration and testing. During the first deployment year, the focal plane will be populated with up to 4 TES arrays. Further TES arrays will be deployed in the following years, fully populating the focal plane with 19 arrays on the fourth deployment year. Here we present the AliCPT-1 receiver design, and how the design has been optimized to meet the experimental requirements.

preprint 2020 arXiv

A novel centroid update approach for clustering-based superpixel methods and superpixel-based edge detection

Superpixels are widely used in image processing. Among the methods for superpixel generation, clustering-based methods combine high speed with good performance. However, most clustering-based superpixel methods are sensitive to noise. To solve this problem, in this paper, we first analyze the features of noise. Then, according to the statistical features of noise, we propose a novel centroid update approach to enhance the robustness of clustering-based superpixel methods. In addition, we propose a novel superpixel-based edge detection method. Experiments on the BSD500 dataset show that our approach can significantly enhance the performance of clustering-based superpixel methods in noisy environments. Moreover, we also show that our proposed edge detection method outperforms other classical methods.

preprint 2020 arXiv

First-Principles Study of Hydrogen Behaviors in $α$-Pu$_{2}$O$_{3}$

An in-depth understanding of hydrogen permeation through plutonium-oxide overlayers is a prerequisite for evaluating the complex hydriding induction period of Pu. In this work, the incorporation, diffusion and dissolution of hydrogen in $α$-Pu$_{2}$O$_{3}$ are investigated by first-principles calculations and the $\textit{ab initio}$ thermodynamic method based on DFT+U and DFT-D3 schemes. Our study reveals that hydrogen incorporation is endothermic and that the separated H atoms prefer to recombine as H$_{2}$ molecules rather than react with $α$-Pu$_{2}$O$_{3}$. Both H and H$_{2}$ diffusion are feasible; generally, H will first recombine as H$_{2}$ and then migrate. Both pressure P$_{H2}$ and temperature can promote hydrogen dissolution in $α$-Pu$_{2}$O$_{3}$. The single H$_{2}$ molecule incorporation and (H+H$_{2}$) mixed dissolution will successively appear when increasing P$_{H2}$. This work indicates that, compared to PuO$_{2}$, Pu sesquioxide is hardly reduced by hydrogen, but the porous $α$-Pu$_{2}$O$_{3}$ facilitates hydrogen transport in Pu oxide layers. We present the microscopic picture of hydrogen behaviors in the defect-free $α$-Pu$_{2}$O$_{3}$, which could shed some light on the study of the hydriding induction period of Pu.

preprint 2020 arXiv

Fuzzy SLIC: Fuzzy Simple Linear Iterative Clustering

Most superpixel methods are sensitive to noise and cannot control the superpixel number precisely. To solve these problems, in this paper, we propose a robust superpixel method called fuzzy simple linear iterative clustering (Fuzzy SLIC), which adopts a local spatial fuzzy C-means clustering and dynamic fuzzy superpixels. We develop a fast and precise superpixel number control algorithm called onion peeling (OP) algorithm. Fuzzy SLIC is insensitive to most types of noise, including Gaussian, salt and pepper, and multiplicative noise. The OP algorithm can control the superpixel number accurately without reducing much computational efficiency. In the validation experiments, we tested the Fuzzy SLIC and OP algorithm and compared them with state-of-the-art methods on the BSD500 and Pascal VOC2007 benchmarks. The experiment results show that our methods outperform state-of-the-art techniques in both noise-free and noisy environments.

preprint 2020 arXiv

Generalized Zero-Shot Learning via VAE-Conditioned Generative Flow

Generalized zero-shot learning (GZSL) aims to recognize both seen and unseen classes by transferring knowledge from semantic descriptions to visual representations. Recent generative methods formulate GZSL as a missing data problem, mainly adopting GANs or VAEs to generate visual features for unseen classes. However, GANs often suffer from instability, and VAEs can only optimize the lower bound on the log-likelihood of observed data. To overcome the above limitations, we resort to generative flows, a family of generative models with the advantage of accurate likelihood estimation. More specifically, we propose a conditional version of generative flows for GZSL, i.e., VAE-Conditioned Generative Flow (VAE-cFlow). Using a VAE, the semantic descriptions are first encoded into tractable latent distributions, conditioned on which the generative flow optimizes the exact log-likelihood of the observed visual features. We ensure the conditional latent distribution to be both semantically meaningful and inter-class discriminative by i) adopting the VAE reconstruction objective, ii) releasing the zero-mean constraint in VAE posterior regularization, and iii) adding a classification regularization on the latent variables. Our method achieves state-of-the-art GZSL results on five well-known benchmark datasets, with especially significant improvement in the large-scale setting. Code is released at https://github.com/guyuchao/VAE-cFlow-ZSL.

preprint 2020 arXiv

HIR4: cosmology from a simulated neutral hydrogen full sky using Horizon Run 4

The distribution of cosmological neutral hydrogen will provide a new window into the large-scale structure of the Universe with the next generation of radio telescopes and surveys. The observation of this material, through 21cm line emission, will be confused by foreground emission in the same frequencies. Even after these foregrounds are removed, the reconstructed map may not exactly match the original cosmological signal, which will introduce systematic errors and offsets into the measured correlations. In this paper, we simulate future surveys of neutral hydrogen using the Horizon Run 4 (HR4) cosmological N-body simulation. We generate HI intensity maps from the HR4 halo catalogue, and combine them with foreground radio emission maps from the Global Sky Model, to create accurate simulations over the entire sky. We simulate the HI sky for the frequency range 700-800 MHz, matching the sensitivity of the Tianlai pathfinder. We test the accuracy of the fastICA, PCA and log-polynomial fitting foreground removal methods to recover the input cosmological angular power spectrum and measure the parameters. We show the effect of survey noise levels and beam sizes on the recovered cosmological constraints. We find that while the reconstruction removes power from the cosmological 21cm distribution on large scales, we can correct for this and recover the input parameters in the noise-free case. However, the effect of noise and beam size of the Tianlai pathfinder prevents accurate recovery of the cosmological parameters when using only intensity mapping information.

preprint2020arXiv

Polar Coupling Enabled Nonlinear Optical Filtering at MoS$_2$/Ferroelectric Heterointerfaces

Complex oxide heterointerfaces and van der Waals heterostructures present two versatile but intrinsically different platforms for exploring emergent quantum phenomena and designing new functionalities. The rich opportunity offered by the synergy between these two classes of materials, however, is yet to be charted. Here, we report an unconventional nonlinear optical filtering effect resulting from the interfacial polar alignment between monolayer MoS$_2$ and a neighboring ferroelectric oxide thin film. The second harmonic generation response at the heterointerface is either substantially enhanced or almost entirely quenched by an underlying ferroelectric domain wall depending on its chirality, and can be further tailored by the polar domains. Unlike the extensively studied coupling mechanisms driven by charge, spin, and lattice, the interfacial tailoring effect is solely mediated by the polar symmetry, as well explained via our density functional theory calculations, pointing to a new material strategy for the functional design of nanoscale reconfigurable optical applications.

preprint2020arXiv

Structure-based Sybil Detection in Social Networks via Local Rule-based Propagation

Sybil detection in social networks is a basic security research problem. Structure-based methods have been shown to be promising at detecting Sybils. Existing structure-based methods can be classified into Random Walk (RW)-based methods and Loop Belief Propagation (LBP)-based methods. RW-based methods cannot leverage labeled Sybils and labeled benign users simultaneously, which limits their detection accuracy, and/or they are not robust to noisy labels. LBP-based methods are not scalable and cannot guarantee convergence. In this work, we propose SybilSCAR, a novel structure-based method to detect Sybils in social networks. SybilSCAR is Scalable, Convergent, Accurate, and Robust to label noise. We first propose a framework to unify RW-based and LBP-based methods. Under our framework, these methods can be viewed as iteratively applying a (different) local rule to every user, which propagates label information across a social graph. Second, we design a new local rule, which SybilSCAR iteratively applies to every user to detect Sybils. We compare SybilSCAR with state-of-the-art RW-based and LBP-based methods theoretically and empirically. Theoretically, we show that, with proper parameter settings, SybilSCAR has a tighter asymptotic bound on the number of Sybils that are falsely accepted into a social network than existing structure-based methods. Empirically, we perform evaluation using both social networks with synthesized Sybils and a large-scale Twitter dataset (41.7M nodes and 1.2B edges) with real Sybils. Our results show that 1) SybilSCAR is substantially more accurate and more robust to label noise than state-of-the-art RW-based methods; 2) SybilSCAR is more accurate and one order of magnitude more scalable than state-of-the-art LBP-based methods.
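
The idea of iteratively applying a local rule to every user can be sketched with a generic linear propagation rule. This is an assumption-laden toy, not necessarily the rule SybilSCAR uses: the update `score = prior + weight * (adj @ score)`, the edge weight of 0.1, the eight-node graph, and the function name `propagate` are all hypothetical choices made for illustration.

```python
import numpy as np

def propagate(adj, prior, weight=0.1, n_iter=20):
    """Repeatedly apply a linear local rule to every node.

    Each node's score is its prior plus a weighted sum of its
    neighbors' current scores. Hypothetical rule for illustration;
    it converges when weight times the spectral radius of adj is < 1.
    """
    score = prior.copy()
    for _ in range(n_iter):
        score = prior + weight * (adj @ score)
    return score

# Toy graph: nodes 0-3 form a benign clique, nodes 4-7 a Sybil clique,
# and a single "attack edge" connects node 3 to node 4.
n = 8
adj = np.zeros((n, n))
for group in (range(0, 4), range(4, 8)):
    for i in group:
        for j in group:
            if i != j:
                adj[i, j] = 1.0
adj[3, 4] = adj[4, 3] = 1.0

# Priors: negative means labeled benign, positive means labeled Sybil,
# zero means unlabeled. Only one label per side is given.
prior = np.zeros(n)
prior[0] = -0.4
prior[7] = 0.4

score = propagate(adj, prior)
predicted_sybils = [i for i in range(n) if score[i] > 0]
```

In this toy setting, label information from one labeled node on each side spreads through the dense intra-community edges, while the single attack edge carries little influence across the two regions.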

preprint2020arXiv

Valley-Hall-like second-order photonic topological insulators in Kagome lattice

Valley-Hall-like second-order photonic topological insulators are designed in Kagome-lattice photonic crystals with C3v point-group symmetry. The photonic crystal consists of circular air holes in pure dielectric materials. Different from conventional valley-Hall photonic topological insulators characterized by valley Chern numbers, the proposed insulators have topological invariants described by quantized electric polarization. A topological transition can be realized by tuning the structural size, and topological edge states appear at the interface between photonic crystals with different topological phases, preserving important features of valley-Hall photonic insulators such as valley transport with little backscattering. The proposed photonic crystal also supports zero-dimensional corner states in oblique corners, showing its second-order topological insulator signature. This work presents the possibility of realizing topologically protected, reflection-suppressed waveguides and local cavities in the same platform.

preprint2019arXiv

Fractional Dark Matter decay: cosmological imprints and observational constraints

If a fraction $f_{\rm dcdm}$ of the Dark Matter decays into invisible and massless particles (so-called "dark radiation") with the decay rate (or inverse lifetime) $\Gamma_{\rm dcdm}$, such decay will leave distinctive imprints on cosmological observables. With a full consideration of the Boltzmann hierarchy, we calculate the decay-induced impacts not only on the CMB but also on the redshift distortion and the kinetic Sunyaev-Zel'dovich effect, while providing detailed physical interpretations based on evaluating the evolution of gravitational potential. By using the current cosmological data with a combination of Planck 2015, Baryon Acoustic Oscillation and redshift distortion measurements which can improve the constraints, we update the $1\sigma$ bound on the fraction of decaying DM from $f_{\rm dcdm}\lesssim5.26\%$ to $f_{\rm dcdm}\lesssim2.73\%$ for the short-lived DM (assuming $\Gamma_{\rm dcdm}/H_0\gtrsim10^4$). However, no constraints are improved from RSD data ($f_{\rm dcdm}\lesssim0.94\%$) for the long-lived DM (i.e., $\Gamma_{\rm dcdm}/H_0\lesssim10^4$). We also find the fractional DM decay can only slightly reduce the $H_0$ and $\sigma_8$ tensions, which is consistent with other previous works. Furthermore, our calculations show that the kSZ effect would provide further constraining power on the decaying DM in the future.

preprint2019arXiv

Holographic Complexity from the Crofton's Formula in Lorentzian AdS$_3$

We study the Crofton's formula in the Lorentzian AdS$_3$ and find that the area of a generic space-like two dimensional surface is given by the flux of space-like geodesics. The "complexity=volume" conjecture then implies a new holographic representation of the complexity in terms of the number of geodesics. Finally, we explore the possible explanation of this result from the standpoint of information theory.

preprint2019arXiv

Semi-Supervised Self-Taught Deep Learning for Finger Bones Segmentation

Segmentation stands at the forefront of many high-level vision tasks. In this study, we focus on segmenting finger bones within a newly introduced semi-supervised self-taught deep learning framework, which consists of a student network and a stand-alone teacher module. The whole system is boosted in a life-long learning manner wherein, at each step, the teacher module provides a refinement for the student network to learn with new unlabeled data. Experimental results demonstrate the superiority of the proposed method over conventional supervised deep learning methods.