Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
45 works
0 followers
25 topics
4 close collaborators

Published work

45 published item(s)

preprint, 2026, arXiv

Brightest GRB flare observed in GRB 221009A: bridge the last gap between flare and prompt emission in GRB

Flares are usually observed during the afterglow phase of Gamma-Ray Bursts (GRBs) in the soft X-ray, optical and radio bands, but rarely in the gamma-ray band. Despite the extraordinary brightness of GRB 221009A, GECAM-C accurately measured both its bright prompt emission and its flare emission without instrumental effects, offering a good opportunity to study the relation between them. In this work, we present a comprehensive analysis of the flare emission of GRB 221009A, which is composed of a series of flares. Among them, we identify an exceptionally bright flare with an isotropic energy of $E_{\rm iso} = 1.82 \times 10^{53}$ erg, a record among GRB flares. It exhibits the highest peak energy ever detected in GRB flares, $E_{\rm peak} \sim 300$ keV, making it a genuine gamma-ray flare. It also shows rapid rise and decay timescales, significantly shorter than those of typical flares observed in the soft X-ray or optical bands, but comparable to those observed in prompt emission. Despite these exceptional properties, the flare shares several common properties with typical GRB flares. We note that this is the first observation of a GRB flare in the keV-MeV band with sufficiently high temporal resolution and high statistics, which bridges the last gap between prompt emission and flares.

preprint, 2026, arXiv

DPWriter: Reinforcement Learning with Diverse Planning Branching for Creative Writing

Reinforcement learning (RL)-based enhancement of large language models (LLMs) often leads to reduced output diversity, undermining their utility in open-ended tasks like creative writing. Current methods lack explicit mechanisms for guiding diverse exploration and instead prioritize optimization efficiency and performance over diversity. This paper proposes an RL framework structured around a semi-structured long Chain-of-Thought (CoT), in which the generation process is decomposed into explicitly planned intermediate steps. We introduce a Diverse Planning Branching method that strategically introduces divergence at the planning phase based on diversity variation, alongside a group-aware diversity reward to encourage distinct trajectories. Experimental results on creative writing benchmarks demonstrate that our approach significantly improves output diversity without compromising generation quality, consistently outperforming existing baselines.

preprint, 2025, arXiv

PlotGen-Bench: Evaluating VLMs on Generating Visualization Code from Diverse Plots across Multiple Libraries

Recent advances in vision-language models (VLMs) have expanded their multimodal code generation capabilities, yet their ability to generate executable visualization code from plots, especially for complex 3D, animated, plot-to-plot transformation, or multi-library scenarios, remains underexplored. To address this gap, we introduce PlotGen-Bench, a comprehensive benchmark for evaluating plot-to-code generation under realistic and complex visualization scenarios. The benchmark spans 9 major categories, 30 subcategories, and 3 core tasks (plot replication, plot transformation, and multi-library generation), covering 2D, 3D and animated plots across 5 widely used visualization libraries. Through systematic evaluation of state-of-the-art open- and closed-source VLMs, we find that open-source models still lag considerably behind in visual fidelity and semantic consistency, despite achieving comparable code executability. Moreover, all models exhibit substantial degradation on reasoning-intensive tasks such as chart type conversion and animation generation. PlotGen-Bench establishes a rigorous foundation for advancing research toward more capable and reliable VLMs for visualization authoring and code synthesis, with all data and code available at https://plotgen.github.io.

preprint, 2023, arXiv

Independence number of hypergraphs under degree conditions

A well-known result of Ajtai et al. from 1982 states that every $k$-graph $H$ on $n$ vertices, with girth at least five and average degree $t^{k-1}$, contains an independent set of size $c n (\log t)^{1/(k-1)}/t$ for some $c>0$. In this paper we show that an independent set of the same size can be found under weaker conditions allowing certain cycles of length 2, 3 and 4. Our work is motivated by a problem of Lo and Zhao, who asked, for $k\ge 4$, how large an independent set a $k$-graph $H$ on $n$ vertices necessarily has when its maximum $(k-2)$-degree satisfies $\Delta_{k-2}(H)\le dn$. (The corresponding problem with respect to $(k-1)$-degrees was solved by Kostochka, Mubayi, and Verstraëte [Random Structures & Algorithms 44, 224--239, 2014].) In this paper we show that every $k$-graph $H$ on $n$ vertices with $\Delta_{k-2}(H)\le dn$ contains an independent set of size $c (\frac nd \log\log \frac nd)^{1/(k-1)}$, and under additional conditions, an independent set of size $c (\frac nd \log \frac nd)^{1/(k-1)}$. The former assertion gives a new upper bound for the $(k-2)$-degree Turán density of complete $k$-graphs.

preprint, 2023, arXiv

Pressure-Induced Superconductivity in Topological Heterostructure (PbSe)5(Bi2Se3)6

Recently, the natural heterostructure (PbSe)5(Bi2Se3)6 has been theoretically predicted and experimentally confirmed to be a topological insulator. In this work, we induce superconductivity in (PbSe)5(Bi2Se3)6 by applying high pressure. As the pressure increases up to 10 GPa, superconductivity with Tc ~ 4.6 K suddenly appears, followed by an abrupt decrease. Remarkably, upon further compression above 30 GPa, a new superconducting state arises, with Tc rising under pressure to 6.0 K, without saturating, within the pressure range of our study. Combining XRD and Raman spectroscopy, we suggest that the emergence of two distinct superconducting states occurs concurrently with the pressure-induced structural transition in this topological heterostructure (PbSe)5(Bi2Se3)6.

preprint, 2022, arXiv

Caging-Pnictogen-Induced Superconductivity in Skutterudites IrX3 (X = As, P)

Here we report on a new kind of compound, XδIr4X12-δ (X = P, As), the first hole-doped skutterudite superconductor. We provide atomic-resolution images of the caging As atoms using scanning transmission electron microscopy (STEM). By inserting As atoms into the caged structure under high pressure, superconductivity emerges with a maximum transition temperature (Tc) of 4.4 K (4.8 K) in IrAs3 (IrP3). In contrast to all of the electron-doped skutterudites, the electronic states around the Fermi level in XδIr4X12-δ are dominated by the caged X atom, which can be described by a simple body-centered tight-binding model, implying a distinct pairing mechanism. Our density functional theory (DFT) calculations reveal an intimate relationship between the pressure-dependent local-phonon mode and the enhancement of Tc. The discovery of XδIr4X12-δ provides an arena to investigate the uncharted territory of hole-doped skutterudites, and the method proposed here represents a new strategy of carrier doping in caged structures, without introducing extra elements.

preprint, 2022, arXiv

CM-MLP: Cascade Multi-scale MLP with Axial Context Relation Encoder for Edge Segmentation of Medical Image

Convolution-based methods provide good performance in medical image segmentation. However, these methods face the following challenges when dealing with the edges of medical images: (1) Previous convolution-based methods do not focus on the boundary relationship between foreground and background around the segmentation edge, which degrades segmentation performance when the edge changes in complex ways. (2) The inductive bias of the convolutional layer cannot adapt to complex edge changes and the aggregation of multiple segmented areas, so its performance improvement is mostly limited to segmenting the body of segmented areas rather than the edge. To address these challenges, we propose the CM-MLP framework, built on the MFI (Multi-scale Feature Interaction) block and the ACRE (Axial Context Relation Encoder) block, for accurate segmentation of the edges of medical images. In the MFI block, we propose the cascade multi-scale MLP (Cascade MLP) to process all local information from the deeper layers of the network simultaneously and utilize a cascade multi-scale mechanism to fuse discrete local information gradually. Then, the ACRE block is used to make the deep supervision focus on exploring the boundary relationship between foreground and background to refine the edge of the medical image. The segmentation accuracy (Dice) of our proposed CM-MLP framework reaches 96.96%, 96.76%, and 82.54% on three benchmark datasets: the CVC-ClinicDB dataset, the sub-Kvasir dataset, and our in-house dataset, respectively, which significantly outperforms the state-of-the-art method. The source code and trained models will be available at https://github.com/ProgrammerHyy/CM-MLP.

preprint, 2022, arXiv

Emergent superconductivity in van der Waals Kagome material Pd3P2S8 under high pressure

Kagome lattice systems have been proposed to host rich physics, providing an excellent platform to explore unusual quantum states. Here, we report on the discovery of superconductivity in the van der Waals material Pd3P2S8 under pressure. Superconductivity is observed in Pd3P2S8 at those pressures where the temperature dependence of the resistivity changes from semiconducting-like behavior to that of a normal metal. The superconducting transition temperature Tc increases with applied pressure and reaches ~ 6.83 K at 79.5 GPa. Combining high-pressure XRD, Raman spectroscopy and theoretical calculations, our results demonstrate that the observed pressure-induced superconductivity in Pd3P2S8 is closely related to the formation of an amorphous phase, which results from the structural instability due to the enhanced coupling between interlayer Pd and S atoms upon compression.

preprint, 2022, arXiv

Existence of global solutions to the nonlocal Schrödinger equation on the line

In this paper, we address the existence of global solutions to the Cauchy problem for the integrable nonlocal nonlinear Schrödinger (nonlocal NLS) equation with initial data $q_0(x)\in H^{1,1}(\mathbb{R})$ under the $L^1(\mathbb{R})$ small-norm assumption. We rigorously show that the spectral problem for the nonlocal NLS equation admits no eigenvalues or resonances, and that Zhou's vanishing lemma is effective, under the $L^1(\mathbb{R})$ small-norm assumption. With inverse scattering theory and the Riemann-Hilbert approach, we rigorously establish the bijectivity and Lipschitz continuity of the direct and inverse scattering maps between the initial data and the reflection coefficients. By using the reconstruction formula and the Plemelj projection estimates of the reflection coefficients, we further obtain the existence of a local solution and a priori estimates, which ensure the existence of a global solution to the Cauchy problem for the nonlocal NLS equation.

preprint, 2022, arXiv

Fishtail effect and the vortex phase diagram of high-entropy alloy superconductor

High-entropy alloys (HEAs) are an emerging topic attracting attention in materials science and condensed matter physics. Although several types of superconductors have been discovered in HEAs, the critical currents (Jc) of HEA superconductors have remained uncharacterized until now. Here, we systematically study the current-carrying ability of the (TaNb)0.7(HfZrTi)0.5 HEA under various heat treatment conditions. We obtain a high upper critical field and a large current-carrying ability, which point to promising applications. Interestingly, the fishtail or second peak effect is found for the first time in HEA superconductors, and the vortex pinning force shows a maximum at a reduced field of 0.72, which is quite different from the cuprates and iron-based high-Tc superconductors. Together with the resistive measurements, the vortex phase diagram is obtained for the HEA superconductor.

preprint, 2022, arXiv

Fusion of Self-supervised Learned Models for MOS Prediction

We participated in the mean opinion score (MOS) prediction challenge, 2022. This challenge aims to predict MOS scores of synthetic speech on two tracks, the main track and a more challenging sub-track: out-of-domain (OOD). To improve the accuracy of the predicted scores, we explored several model fusion-related strategies and proposed a fused framework in which seven pretrained self-supervised learned (SSL) models are engaged. These pretrained SSL models are derived from three ASR frameworks: Wav2Vec, Hubert, and WavLM. For the OOD track, we followed the 7 SSL models selected on the main track and adopted a semi-supervised learning method to exploit the unlabeled data. According to the official analysis results, our system achieved 1st rank in 6 out of 16 metrics and is one of the top 3 systems for 13 out of 16 metrics. Specifically, we achieved the highest LCC, SRCC, and KTAU scores at the system level on the main track, as well as the best performance on the LCC, SRCC, and KTAU evaluation metrics at the utterance level on the OOD track. Compared with the basic SSL models, the prediction accuracy of the fused system is greatly improved, especially on the OOD sub-track.
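
The three evaluation metrics named in this abstract (LCC, SRCC, KTAU) are the linear, Spearman rank, and Kendall rank correlation coefficients between predicted and reference scores. A minimal pure-Python sketch follows; the function names and sample scores are illustrative assumptions rather than the challenge's official scoring code, and the Kendall variant shown is tau-a, which ignores tie corrections.

```python
from math import sqrt

def pearson(x, y):
    """Linear correlation coefficient (LCC)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank correlation (SRCC): Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def kendall_tau_a(x, y):
    """Kendall rank correlation (KTAU), tau-a variant without tie correction."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            score += (product > 0) - (product < 0)  # +1 concordant, -1 discordant
    return score / (n * (n - 1) / 2)

# Hypothetical reference and predicted MOS values for four systems.
reference = [3.1, 4.2, 2.5, 3.8]
predicted = [3.0, 4.0, 2.9, 3.5]
print(pearson(reference, predicted), spearman(reference, predicted),
      kendall_tau_a(reference, predicted))
```

Because the hypothetical predictions preserve the ordering of the reference scores, both rank-based metrics equal 1 even though the linear correlation is below 1.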

preprint, 2022, arXiv

Gamma-Ray Polarimetry of the Crab Pulsar Observed by POLAR

X/$\gamma$-ray polarimetry of the Crab pulsar/nebula is believed to hold crucial information on their emission models. In the past, several missions have shown evidence of polarized emission from the Crab. The significance of these measurements, however, remains limited. New measurements are therefore required. POLAR is a wide Field of View Compton-scattering polarimeter (sensitive in 50-500 keV) onboard the Chinese spacelab Tiangong-2, which took data from September 2016 to April 2017. Although POLAR was not designed to perform polarization measurements of pulsars, we present here a novel method which can be applied to POLAR as well as to other wide Field of View polarimeters. The novel polarimetric joint-fitting method for the Crab pulsar observations with POLAR allows us to obtain constraining measurements of the pulsar component. The best-fit values and corresponding 1$\sigma$ deviations are, for the averaged phase interval: (PD=$14^{+15}_{-10}\%$, PA=$(108^{+33}_{-54})^{\circ}$), for Peak 1: (PD=$17^{+18}_{-12}\%$, PA=$(174^{+39}_{-36})^{\circ}$) and for Peak 2: (PD=$16^{+16}_{-11}\%$, PA=$(78^{+39}_{-30})^{\circ}$). Furthermore, the 3$\sigma$ upper limits on the polarization degree are 55% for the averaged phase interval, 66% for Peak 1 and 57% for Peak 2. Finally, to illustrate the capabilities of this method in the future, we simulated two years of observation of the Crab pulsar with POLAR-2. The results show that POLAR-2 is able to confirm the emission to be polarized at the $5\sigma$ and $4\sigma$ confidence levels if the Crab pulsar is polarized at $20\%$ and $10\%$ respectively.

preprint, 2022, arXiv

Insight-HXMT dedicated 33-day observation of SGR J1935+2154 I. Burst Catalog

Magnetars are neutron stars with extremely strong magnetic fields that sometimes manifest as soft gamma-ray repeaters (SGRs). SGR J1935+2154 is one of the most prolific bursters and the first confirmed source of a fast radio burst (FRB 200428). Encouraged by the discovery of the first X-ray counterpart of an FRB, the Insight-Hard X-ray Modulation Telescope (Insight-HXMT) implemented a dedicated 33-day ToO observation of SGR J1935+2154 starting on April 28, 2020. With the HE, ME, and LE telescopes, Insight-HXMT provides thorough monitoring of the burst activity evolution of SGR J1935+2154 in a very broad energy range (1-250 keV) with high temporal resolution and high sensitivity, resulting in a uniquely valuable data set for detailed studies of SGR J1935+2154. In this work, we conduct a comprehensive analysis of this observation including detailed burst search, identification and temporal analyses. After carefully removing false triggers, we find a total of 75 bursts from SGR J1935+2154, out of which 70 are single-pulsed. The maximum burst rate is about 56 bursts/day. Both the burst duration and the waiting time between two successive bursts follow log-normal distributions, consistent with previous studies. We also find that bursts with longer duration (some are multi-pulsed) tend to occur during the period with a relatively high burst rate. There is no correlation between the waiting time and the fluence or duration of either the former or latter burst. It also seems that there is no correlation between burst duration and hardness ratio, in contrast to some previous reports. In addition, we do not find any X-ray burst associated with any reported radio bursts except for FRB 200428.

preprint, 2022, arXiv

Insight-HXMT dedicated 33-day observation of SGR J1935+2154 II. Burst Spectral Catalog

Starting on April 28, 2020, Insight-HXMT carried out a dedicated observation of the magnetar SGR J1935+2154. Thanks to the wide energy band (1-250 keV) and high sensitivity of Insight-HXMT, we obtained 75 bursts from SGR J1935+2154 during a month-long activity episode after the emission of FRB 200428. Here, we report the detailed time-integrated spectral analysis of these bursts and the statistical distribution of the spectral parameters. We find that for 15% (11/75) of SGR J1935+2154 bursts, the CPL model is preferred, and most of them occurred in the latter part of this active epoch. In the cumulative fluence distribution, we find that the fluence of bursts in our sample is about an order of magnitude weaker than that of Fermi/GBM, but follows the same power-law distribution. Finally, we find a burst with a peak energy similar to that of the time-integrated spectrum of the X-ray burst associated with FRB 200428 (FRB 200428-Associated Burst), but with a harder low-energy index.

preprint, 2022, arXiv

Longitudinal regression of covariance matrix outcomes

In this study, a longitudinal regression model for covariance matrix outcomes is introduced. The proposal considers a multilevel generalized linear model for regressing covariance matrices on (time-varying) predictors. This model simultaneously identifies covariate associated components from covariance matrices, estimates regression coefficients, and estimates the within-subject variation in the covariance matrices. Optimal estimators are proposed for both low-dimensional and high-dimensional cases by maximizing the (approximated) hierarchical likelihood function and are proved to be asymptotically consistent, where the proposed estimator is the most efficient under the low-dimensional case and achieves the uniformly minimum quadratic loss among all linear combinations of the identity matrix and the sample covariance matrix under the high-dimensional case. Through extensive simulation studies, the proposed approach achieves good performance in identifying the covariate related components and estimating the model parameters. Applying to a longitudinal resting-state fMRI dataset from the Alzheimer's Disease Neuroimaging Initiative (ADNI), the proposed approach identifies brain networks that demonstrate the difference between males and females at different disease stages. The findings are in line with existing knowledge of AD and the method improves the statistical power over the analysis of cross-sectional data.

preprint, 2022, arXiv

MC-UNet: Multi-module Concatenation based on U-shape Network for Retinal Blood Vessels Segmentation

Accurate segmentation of the blood vessels of the retina is an important step in the clinical diagnosis of ophthalmic diseases. Many deep learning frameworks have been proposed for retinal blood vessel segmentation. However, the complex vascular structure and uncertain pathological features still make blood vessel segmentation very challenging. In this paper, a novel U-shaped network named Multi-module Concatenation, which is based on atrous convolution and multi-kernel pooling, is put forward for retinal vessel segmentation. The proposed network retains three layers of the essential U-Net structure, in which the atrous convolution and multi-kernel pooling blocks are designed to obtain more contextual information. The spatial attention module is concatenated with the dense atrous convolution module and the multi-kernel pooling module to form a multi-module concatenation. Different dilation rates are selected by cascading to acquire a larger receptive field in atrous convolution. Comparative experiments are conducted on three public retinal datasets: DRIVE, STARE and CHASE_DB1. The results show that the proposed method is effective, especially for microvessels. The code will be released at https://github.com/Rebeccala/MC-UNet

preprint, 2022, arXiv

Mediation Analysis with Multiple Exposures and Multiple Mediators

A mediation analysis approach is proposed for multiple exposures, multiple mediators, and a continuous scalar outcome under the linear structural equation modeling framework. It assumes that there exist orthogonal components that demonstrate parallel mediation mechanisms on the outcome, and thus is named Principal Component Mediation Analysis (PCMA). Likelihood-based estimators are introduced for simultaneous estimation of the component projections and effect parameters. The asymptotic distribution of the estimators is derived for low-dimensional data. A bootstrap procedure is introduced for inference. Simulation studies illustrate the superior performance of the proposed approach. Applied to a proteomics-imaging dataset from the Alzheimer's Disease Neuroimaging Initiative (ADNI), the proposed framework identifies protein deposition - brain atrophy - memory deficit mechanisms consistent with existing knowledge and suggests potential AD pathology by integrating data collected from different modalities.

preprint, 2022, arXiv

Multi-cell Content Caching: Optimization for Cost and Information Freshness

In multi-access edge computing (MEC) systems, multiple local cache servers cache contents to satisfy the users' requests, instead of letting the users download via the remote cloud server. In this paper, a multi-cell content scheduling problem (MCSP) in MEC systems is considered. Jointly taking into account the freshness of the cached contents and the traffic data costs, we study how to schedule content updates over time in a multi-cell setting. Unlike in single-cell scenarios, a user may have multiple candidate local cache servers, and thus the caching decisions in all cells must be jointly optimized. We first prove that MCSP is NP-hard, then we formulate MCSP using integer linear programming, by which the optimal scheduling can be obtained for small-scale instances. For large-scale instances, via a mathematical reformulation, we derive a scalable optimization algorithm based on repeated column generation. Our performance evaluation shows the effectiveness of the proposed algorithm in comparison to an off-the-shelf commercial solver and a popularity-based caching scheme.

preprint, 2022, arXiv

On subgraphs of tripartite graphs

Bollobás, Erdős, and Szemerédi [Discrete Math 13 (1975), 97--107] investigated a tripartite generalization of the Zarankiewicz problem: what minimum degree forces a tripartite graph with $n$ vertices in each part to contain an octahedral graph $K_3(2)$? They proved that $n+2^{-1/2}n^{3/4}$ suffices and suggested it could be weakened to $n+cn^{1/2}$ for some constant $c>0$. In this note we show that their method only gives $n+ (1+o(1)) n^{11/12}$ and provide many constructions showing that, if true, $n+ c n^{1/2}$ is best possible.

preprint, 2022, arXiv

Pattern formation of parasite-host model induced by fear effect

In this paper, based on the epidemiological microparasite model, a parasite-host model is established by considering the fear effect of susceptible individuals on infectors. We explore pattern formation with the help of numerical simulation and analyze, to different degrees, how the fear effect, infected-host mortality, population diffusion rate and the reduced reproductive ability of infected hosts affect population activities. Theoretically, we give general conditions for the stability of the model without diffusion and for the Turing instability caused by diffusion. Our results indicate how fear affects the distribution of the uninfected and infected hosts in the habitat and quantify the influence of the fear factor on the spatiotemporal pattern of the population. In addition, we analyze the influence of the natural death rate, the reproductive ability of infected hosts, and the diffusion level of uninfected (infected) hosts on the spatiotemporal pattern. The results show that the growth of patterns induced by an intensified fear effect follows a certain rule: cold spots $\rightarrow$ cold spots-stripes $\rightarrow$ cold stripes $\rightarrow$ hot stripes $\rightarrow$ hot spots-stripes $\rightarrow$ hot spots. Interestingly, natural mortality and the fear effect have opposite effects on the growth order of the pattern. From the perspective of biological significance, we find that the degree of the fear effect can reshape the distribution of the population in accordance with this rule.

preprint, 2022, arXiv

Pressure-Induced Superconductivity and Structural Phase Transitions in Magnetic Topological Insulator Candidate MnSb4Te7

The magnetic van der Waals crystals (MnX2Te4)m(X2Te3)n (X = Sb, Bi) have drawn significant attention due to their rich topological properties and their tunability by an external magnetic field. In this work, we report on the discovery of superconductivity in the magnetic topological insulator candidate MnSb4Te7 (m = 1, n = 1) via the application of high pressure. The antiferromagnetic ordering is robust to pressure up to 8 GPa and is then fully suppressed. The carrier type converts from hole- to electron-type, accompanied by a structural phase transition at around 15 GPa. Superconductivity emerges near the critical pressure of 30 GPa, where MnSb4Te7 converts into a simple cubic phase. Interestingly, MnSb4Te7 shows a dome-like phase diagram with a maximum Tc of 2.2 K at 50.7 GPa. The results demonstrate that MnSb4Te7, with its nontrivial topology of electronic states, displays new ground states upon compression.

preprint, 2022, arXiv

Shadows of 3-uniform hypergraphs under a minimum degree condition

We prove a minimum degree version of the Kruskal--Katona theorem: given $d\ge 1/4$ and a triple system $F$ on $n$ vertices with minimum degree at least $d\binom n2$, we obtain asymptotically tight lower bounds for the size of its shadow. Equivalently, for $t\ge n/2-1$, we asymptotically determine the minimum size of a graph on $n$ vertices, in which every vertex is contained in at least $\binom t2$ triangles. This can be viewed as a variant of the Rademacher--Turán problem.
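
The two objects in this abstract, the shadow of a triple system and the triangle count at each vertex of the resulting graph, can be made concrete on small examples. The sketch below is illustrative only: the helper names are assumptions, and the examples (the complete triple system on 5 vertices and the Fano plane) are chosen for convenience and are not taken from the paper. Every triple of the system is a triangle in its shadow, which is the link behind the "equivalently" in the abstract.

```python
from itertools import combinations

def shadow(triples):
    """Pairs contained in at least one triple of the system."""
    return {frozenset(pair) for t in triples for pair in combinations(sorted(t), 2)}

def min_degree(triples, vertices):
    """Smallest number of triples containing a single vertex."""
    return min(sum(1 for t in triples if v in t) for v in vertices)

def triangles_at(vertex, edges, vertices):
    """Number of triangles of the graph `edges` that contain `vertex`."""
    neighbours = [u for u in vertices if frozenset((u, vertex)) in edges]
    return sum(1 for a, b in combinations(neighbours, 2) if frozenset((a, b)) in edges)

# Complete triple system on 5 vertices: 10 triples, every vertex in C(4,2) = 6 of them.
vertices5 = range(5)
complete5 = [frozenset(t) for t in combinations(vertices5, 3)]
shadow5 = shadow(complete5)
print(len(shadow5), min_degree(complete5, vertices5))

# Fano plane: 7 triples, every vertex in exactly 3 of them, every pair covered once.
vertices7 = range(7)
fano = [frozenset(t) for t in
        [(0, 1, 2), (0, 3, 4), (0, 5, 6), (1, 3, 5), (1, 4, 6), (2, 3, 6), (2, 4, 5)]]
shadow7 = shadow(fano)
print(len(shadow7), min_degree(fano, vertices7), triangles_at(0, shadow7, vertices7))
```

In the Fano example each vertex lies in only 3 triples, yet the shadow is the complete graph on 7 vertices, so each vertex lies in 15 triangles: the triangle count in the shadow is at least the degree in the triple system, but can be much larger.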

preprint, 2021, arXiv

Complete subgraphs in a multipartite graph

In 1975 Bollobás, Erdős, and Szemerédi asked the following question: given positive integers $n, t, r$ with $2\le t\le r-1$, what is the largest minimum degree $\delta(G)$ among all $r$-partite graphs $G$ with parts of size $n$ and which do not contain a copy of $K_{t+1}$? The $r=t+1$ case has attracted a lot of attention and was fully resolved by Haxell and Szabó, and Szabó and Tardos in 2006. In this paper we investigate the $r>t+1$ case of the problem, which has remained dormant for over forty years. We resolve the problem exactly in the case when $r \equiv -1 \pmod{t}$, and up to an additive constant for many other cases, including when $r \geq (3t-1)(t-1)$. Our approach utilizes a connection to the related problem of determining the maximum of the minimum degrees among the family of balanced $r$-partite $rn$-vertex graphs of chromatic number at most $t$.

preprint, 2021, arXiv

Learning Disentangled Phone and Speaker Representations in a Semi-Supervised VQ-VAE Paradigm

We present a new approach to disentangle speaker voice and phone content by introducing new components to the VQ-VAE architecture for speech synthesis. The original VQ-VAE does not generalize well to unseen speakers or content. To alleviate this problem, we have incorporated a speaker encoder and speaker VQ codebook that learns global speaker characteristics entirely separate from the existing sub-phone codebooks. We also compare two training methods: self-supervised with global conditions and semi-supervised with speaker labels. Adding a speaker VQ component improves objective measures of speech synthesis quality (estimated MOS, speaker similarity, ASR-based intelligibility) and provides learned representations that are meaningful. Our speaker VQ codebook indices can be used in a simple speaker diarization task and perform slightly better than an x-vector baseline. Additionally, phones can be recognized from sub-phone VQ codebook indices in our semi-supervised VQ-VAE better than self-supervised with global conditions.

preprint, 2021, arXiv

Minimum degree thresholds for Hamilton $(k/2)$-cycles in $k$-uniform hypergraphs

For any even integer $k\ge 6$, integer $d$ such that $k/2\le d\le k-1$, and sufficiently large $n\in (k/2)\mathbb N$, we find a tight minimum $d$-degree condition that guarantees the existence of a Hamilton $(k/2)$-cycle in every $k$-uniform hypergraph on $n$ vertices. When $n\in k\mathbb N$, the degree condition coincides with the one for the existence of perfect matchings provided by Rödl, Ruciński and Szemerédi (for $d=k-1$) and Treglown and Zhao (for $d\ge k/2$), and thus our result strengthens theirs in this case.

preprint, 2021, arXiv

Pressure induced superconductivity in WB2 and ReB2 through modifying the B layers

The recent discovery of superconductivity up to 32 K in pressurized MoB2 has reignited interest in exploring high-Tc superconductors in transition-metal diborides. Inspired by that work, we turn our attention to the 5d transition-metal diborides. Here we systematically investigate the responses of both the structural and physical properties of WB2 and ReB2, which possess different types of boron layers, to external pressure. Similar to MoB2, pressure-induced superconductivity is observed in WB2 above 60 GPa with a maximum Tc of 15 K at 100 GPa, while no superconductivity is detected in ReB2 in this pressure range. Interestingly, the ambient-pressure structures of both WB2 and ReB2 persist to high pressure without structural phase transitions. Theoretical calculations suggest that the ratio of flat boron layers in this class of transition-metal diborides may be crucial for the appearance of high Tc. The combined theoretical and experimental results highlight the effect of the geometry of boron layers on superconductivity and shed light on the exploration of novel high-Tc superconductors in borides.

preprint, 2021, arXiv

Pressure-induced Superconductivity in dual-topological semimetal Pt2HgSe3

Recently, monolayer jacutingaite (Pt2HgSe3), a naturally occurring exfoliable mineral discovered in Brazil in 2008, has been theoretically predicted to be a candidate quantum spin Hall system with a 0.5 eV band gap, while the bulk form is one of only a few known dual-topological insulators, which may host different surface states protected by symmetries. In this work, we systematically investigate both the structural and electronic evolution of bulk Pt2HgSe3 under high pressure up to 96 GPa. The nontrivial topology persists up to the structural phase transition observed in the high-pressure regime. Interestingly, we find that this phase transition is accompanied by the appearance of superconductivity at around 55 GPa, and the critical transition temperature Tc increases with applied pressure. Our results demonstrate that Pt2HgSe3, with its nontrivial topology of electronic states, displays new ground states upon compression and shows potential for application in next-generation spintronic devices.

preprint2021arXiv

Quantum oscillations in Noncentrosymmetric Weyl semimetals RAlSi (R = Sm and Ce)

Weyl semimetals (WSMs), a new type of quantum state of matter hosting low-energy relativistic quasiparticles, have attracted significant attention both from the scientific community and for potential quantum device applications. Here, we report a comprehensive investigation of the structural, magnetic and transport properties of noncentrosymmetric RAlSi (R = Sm, Ce), which have been predicted to be new magnetic WSM candidates. Both samples exhibit non-saturating magnetoresistance (MR), with ~900% for SmAlSi and 80% for CeAlSi at 1.8 K and 9 T. The carrier densities of SmAlSi and CeAlSi change markedly around the magnetic transition temperatures, signifying that the electronic states are sensitive to the magnetic ordering of the rare-earth elements. At low temperatures, SmAlSi shows prominent Shubnikov-de Haas (SdH) oscillations associated with a nontrivial Berry phase. High-pressure experiments demonstrate that the magnetic order is robust and survives under high pressure. Our results yield valuable insights into WSM physics and into the potential of the RAX family for next-generation spintronic devices.

preprint2021arXiv

Rainbow Pancyclicity in Graph Systems

Let $G_1,...,G_n$ be graphs on the same vertex set of size $n$, each graph with minimum degree $\delta(G_i)\ge n/2$. A recent conjecture of Aharoni asserts that there exists a rainbow Hamiltonian cycle, i.e., a cycle with edge set $\{e_1,...,e_n\}$ such that $e_i\in E(G_i)$ for $1\leq i \leq n$. This can be viewed as a rainbow version of the well-known Dirac theorem. In this paper, we prove this conjecture asymptotically by showing that for every $\varepsilon>0$, there exists an integer $N>0$ such that when $n>N$, for any graphs $G_1,...,G_n$ on the same vertex set of size $n$ with $\delta(G_i)\ge (\frac{1}{2}+\varepsilon)n$, there exists a rainbow Hamiltonian cycle. Our main tool is the absorption technique. Additionally, we prove that with $\delta(G_i)\geq \frac{n+1}{2}$ for each $i$, one can find rainbow cycles of length $3,...,n-1$.
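To make the definition above concrete, the following is a minimal sketch of a checker for rainbow Hamiltonian cycles. It is not taken from the paper: the function names, the representation of each graph as a set of `frozenset` edges on vertices `0..n-1`, and the `colours` list (which graph supplies which cycle edge) are all assumptions made for illustration.

```python
def min_degree(n, edges):
    """Minimum degree of a graph on vertices 0..n-1 given as a set of frozenset edges."""
    deg = [0] * n
    for edge in edges:
        for v in edge:
            deg[v] += 1
    return min(deg)


def is_rainbow_hamiltonian_cycle(graphs, cycle, colours):
    """Check that `cycle` visits every vertex once and that the k-th cycle edge
    (cycle[k], cycle[k+1 mod n]) lies in graphs[colours[k]], with each graph used exactly once.

    Hypothetical helper: the argument layout is an illustrative assumption.
    """
    n = len(cycle)
    if len(graphs) != n or len(colours) != n:
        return False
    # Every vertex appears exactly once, and every graph index is used exactly once.
    if sorted(cycle) != list(range(n)) or sorted(colours) != list(range(n)):
        return False
    for k in range(n):
        edge = frozenset((cycle[k], cycle[(k + 1) % n]))
        if edge not in graphs[colours[k]]:
            return False
    return True


# Toy instance: four copies of the complete graph K_4, so delta(G_i) = 3 >= n/2.
n = 4
k4 = {frozenset((u, v)) for u in range(n) for v in range(u + 1, n)}
graphs = [set(k4) for _ in range(n)]
```

With this toy instance, `is_rainbow_hamiltonian_cycle(graphs, [0, 1, 2, 3], [0, 1, 2, 3])` accepts the cycle, while reusing a graph index (for example `colours = [0, 0, 2, 3]`) is rejected because the edges would no longer come from distinct graphs.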

preprint2020arXiv

3D U-Net for Segmentation of Plant Root MRI Images in Super-Resolution

Magnetic resonance imaging (MRI) enables plant scientists to non-invasively study root system development and root-soil interaction. However, challenging recording conditions, such as low resolution and a high level of noise, hamper the performance of traditional root extraction algorithms. We propose to increase signal-to-noise ratio and resolution by segmenting the scanned volumes into root and soil in super-resolution using a 3D U-Net. Tests on real data show that the trained network is capable of detecting most roots successfully and even finds roots that were missed by human annotators. Our experiments show that the segmentation performance can be further improved with modifications of the loss function.

preprint2020arXiv

Cross-regional oil palm tree counting and detection via multi-level attention domain adaptation network

Providing an accurate evaluation of palm tree plantations in a large region can bring meaningful economic and ecological impacts. However, the enormous spatial scale and the variety of geological features across regions have made it a grand challenge, with limited solutions based on manual human monitoring efforts. Although deep-learning-based algorithms have demonstrated potential for an automated approach in recent years, the labelling effort needed to cover different features in different regions largely constrains their effectiveness in large-scale problems. In this paper, we propose a novel domain-adaptive oil palm tree detection method, i.e., a Multi-level Attention Domain Adaptation Network (MADAN), to achieve cross-regional oil palm tree counting and detection. MADAN consists of four procedures. First, we adopted a batch-instance normalization network (BIN) based feature extractor, integrating batch normalization and instance normalization, to improve the generalization ability of the model. Second, we embedded a multi-level attention mechanism (MLA), including a feature-level attention and an entropy-level attention, into our architecture to enhance transferability. Third, we designed a minimum entropy regularization (MER) to increase the confidence of the classifier predictions by assigning the entropy-level attention value to the entropy penalty. Finally, we employed a sliding-window-based prediction and an IoU-based post-processing approach to obtain the final detection results. We conducted comprehensive ablation experiments using three different satellite images of large-scale oil palm plantation areas with six transfer tasks. MADAN improves the detection accuracy by 14.98% in terms of average F1-score compared with the Baseline method (without DA), and performs 3.55%-14.49% better than existing domain adaptation methods.
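The abstract does not spell out the IoU-based post-processing step, so the following is only a generic sketch of how overlapping sliding-window detections are commonly merged: compute the intersection over union (IoU) of two boxes and greedily keep the highest-scoring box among overlapping ones. The function names, the `(x1, y1, x2, y2)` box layout, and the 0.5 threshold are illustrative assumptions, not details from the paper.

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def merge_detections(detections, threshold=0.5):
    """Greedy overlap suppression (an assumed stand-in for the paper's post-processing).

    `detections` is a list of (box, score) pairs; a box is kept only if its IoU
    with every already-kept, higher-scoring box is below `threshold`.
    """
    kept = []
    for box, score in sorted(detections, key=lambda d: d[1], reverse=True):
        if all(iou(box, kept_box) < threshold for kept_box, _ in kept):
            kept.append((box, score))
    return kept


# Two overlapping windows on the same tree plus one distant tree (made-up coordinates).
detections = [((0, 0, 10, 10), 0.9), ((1, 1, 11, 11), 0.8), ((50, 50, 60, 60), 0.7)]
trees = merge_detections(detections)
```

In this toy input the first two windows overlap heavily, so only the higher-scoring one survives and `len(trees)` gives the tree count for the region.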

preprint2020arXiv

Deep Learning Detection of Inaccurate Smart Electricity Meters: A Case Study

Detecting inaccurate smart meters and targeting them for replacement can save significant resources. For this purpose, a novel deep-learning method was developed based on long short-term memory (LSTM) and a modified convolutional neural network (CNN) to predict electricity usage trajectories based on historical data. From the significant difference between the predicted trajectory and the observed one, the meters that cannot measure electricity accurately are located. In a case study, a proof of principle was demonstrated in detecting inaccurate meters with high accuracy for practical usage to prevent unnecessary replacement and increase the service life span of smart meters.

preprint2020arXiv

Deep Learning-Based Gait Recognition Using Smartphones in the Wild

Compared to other biometrics, gait is difficult to conceal and has the advantage of being unobtrusive. Inertial sensors, such as accelerometers and gyroscopes, are often used to capture gait dynamics. These inertial sensors are commonly integrated into smartphones and are widely used by the average person, which makes gait data convenient and inexpensive to collect. In this paper, we study gait recognition using smartphones in the wild. In contrast to traditional methods, which often require a person to walk along a specified road and/or at a normal walking speed, the proposed method collects inertial gait data under unconstrained conditions without knowing when, where, and how the user walks. To obtain good person identification and authentication performance, deep-learning techniques are presented to learn and model the gait biometrics based on walking data. Specifically, a hybrid deep neural network is proposed for robust gait feature representation, where features in the space and time domains are successively abstracted by a convolutional neural network and a recurrent neural network. In the experiments, two datasets collected by smartphones for a total of 118 subjects are used for evaluations. The experiments show that the proposed method achieves higher than 93.5% and 93.7% accuracies in person identification and authentication, respectively.

preprint2020arXiv

Existence thresholds and Ramsey properties of random posets

Let $\mathcal P(n)$ denote the power set of $[n]$, ordered by inclusion, and let $\mathcal P (n,p)$ denote the random poset obtained from $\mathcal P(n)$ by retaining each element from $\mathcal P (n)$ independently at random with probability $p$ and discarding it otherwise. Given any fixed poset $F$ we determine the threshold for the property that $\mathcal P(n,p)$ contains $F$ as an induced subposet. We also asymptotically determine the number of copies of a fixed poset $F$ in $\mathcal P(n)$. Finally, we obtain a number of results on the Ramsey properties of the random poset $\mathcal P(n,p)$.
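The random poset model above is easy to simulate for small $n$. The sketch below samples $\mathcal P(n,p)$ and tests whether it contains a chain of $k$ elements, which is one simple choice of fixed poset $F$ (an induced copy of a chain is just $k$ nested sets). The function names and the chain example are assumptions made for illustration; the paper's results concern thresholds, not this brute-force check.

```python
import random


def random_poset(n, p, rng):
    """Sample P(n, p): keep each subset of {1, ..., n} independently with probability p."""
    elements = []
    for mask in range(1 << n):
        if rng.random() < p:
            elements.append(frozenset(i + 1 for i in range(n) if (mask >> i) & 1))
    return elements


def contains_chain(elements, k):
    """Return True if some k retained sets are totally ordered by strict inclusion."""
    longest = {}
    # Process sets by size so every proper subset of s is handled before s.
    for s in sorted(elements, key=len):
        longest[s] = 1 + max((longest[t] for t in longest if t < s), default=0)
    return max(longest.values(), default=0) >= k


full = random_poset(3, 1.0, random.Random(0))   # p = 1 keeps all 8 subsets
empty = random_poset(3, 0.0, random.Random(0))  # p = 0 keeps nothing
```

For the full power set of $[3]$ the longest chain has 4 elements (for example $\emptyset \subset \{1\} \subset \{1,2\} \subset \{1,2,3\}$), so `contains_chain(full, 4)` holds while `contains_chain(full, 5)` does not.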

preprint2020arXiv

Hierarchical Scene Coordinate Classification and Regression for Visual Localization

Visual localization is critical to many applications in computer vision and robotics. To address single-image RGB localization, state-of-the-art feature-based methods match local descriptors between a query image and a pre-built 3D model. Recently, deep neural networks have been exploited to regress the mapping between raw pixels and 3D coordinates in the scene, and thus the matching is implicitly performed by the forward pass through the network. However, in a large and ambiguous environment, learning such a regression task directly can be difficult for a single network. In this work, we present a new hierarchical scene coordinate network to predict pixel scene coordinates in a coarse-to-fine manner from a single RGB image. The network consists of a series of output layers, each of them conditioned on the previous ones. The final output layer predicts the 3D coordinates and the others produce progressively finer discrete location labels. The proposed method outperforms the baseline regression-only network and allows us to train compact models which scale robustly to large environments. It sets a new state-of-the-art for single-image RGB localization performance on the 7-Scenes, 12-Scenes, and Cambridge Landmarks datasets, and on three combined scenes. Moreover, for large-scale outdoor localization on the Aachen Day-Night dataset, we present a hybrid approach which outperforms existing scene coordinate regression methods and significantly reduces the performance gap w.r.t. explicit feature matching methods.

preprint2020arXiv

Improved Prosody from Learned F0 Codebook Representations for VQ-VAE Speech Waveform Reconstruction

Vector Quantized Variational AutoEncoders (VQ-VAE) are a powerful representation learning framework that can discover discrete groups of features from a speech signal without supervision. Until now, the VQ-VAE architecture has modeled individual types of speech features, such as only phones or only F0. This paper introduces an important extension to VQ-VAE for learning F0-related suprasegmental information simultaneously with traditional phone features. The proposed framework uses two encoders, so that the F0 trajectory and the speech waveform are both input to the system and two separate codebooks are learned. We used a WaveRNN vocoder as the decoder component of VQ-VAE. Our speaker-independent VQ-VAE was trained with raw speech waveforms from multi-speaker Japanese speech databases. Experimental results show that the proposed extension reduces F0 distortion of reconstructed speech for all unseen test speakers, and results in significantly higher preference scores in a listening test. We additionally conducted experiments using single-speaker Mandarin speech to demonstrate the advantages of our architecture in another language that relies heavily on F0.

preprint2020arXiv

Learning from Suspected Target: Bootstrapping Performance for Breast Cancer Detection in Mammography

Deep learning object detection algorithms have been widely used in medical image analysis. Currently, all object detection tasks are based on data annotated with object classes and their bounding boxes. On the other hand, medical images such as mammograms usually contain normal regions or objects that are similar to the lesion region and may be misclassified in the testing stage if they are not taken care of. In this paper, we address this problem by introducing a novel top likelihood loss together with a new sampling procedure to select and train the suspected target regions, and by proposing a similarity loss to further distinguish suspected targets from targets. Mean average precision (mAP) according to the predicted targets, and specificity, sensitivity, accuracy, and AUC values according to the classification of patients, are adopted for performance comparisons. We first test our proposed method on a private dense mammogram dataset. Results show that our proposed method greatly reduces the false positive rate, and the specificity is increased by 0.25 on detecting mass-type cancer. It is worth mentioning that dense breasts typically carry a higher risk of developing breast cancer and are also harder to assess for cancer in diagnosis, and our method outperforms a reported result on the performance of radiologists. Our method is also validated on the public Digital Database for Screening Mammography (DDSM) dataset, where it brings significant improvement on mass-type cancer detection and outperforms the state-of-the-art work.

preprint2020arXiv

PBRnet: Pyramidal Bounding Box Refinement to Improve Object Localization Accuracy

Many recently developed object detectors focus on a coarse-to-fine framework, which contains several stages that classify and regress proposals from coarse-grained to fine-grained and gradually obtain more accurate detections. Multi-resolution models such as Feature Pyramid Network (FPN) integrate information from different levels of resolution and effectively improve performance. Previous research has also revealed that localization can be further improved by: 1) using fine-grained information, which is more translation-variant; 2) refining local areas, which focuses more on local boundary information. Based on these principles, we designed a novel boundary refinement architecture that improves localization accuracy by combining the coarse-to-fine framework with a feature pyramid structure. Named the Pyramidal Bounding Box Refinement network (PBRnet), it parameterizes gradually focused boundary areas of objects and leverages lower-level feature maps to extract finer local information when refining the predicted bounding boxes. Extensive experiments are performed on the MS-COCO dataset. PBRnet brings a significant performance gain of roughly 3 points of $mAP$ when added to FPN or Libra R-CNN. Moreover, treating Cascade R-CNN as a coarse-to-fine detector and replacing its localization branch with the regressor of PBRnet yields an extra performance improvement of 1.5 $mAP$, for a total performance boost of as much as 5 points of $mAP$.

preprint2020arXiv

Predictions of Subjective Ratings and Spoofing Assessments of Voice Conversion Challenge 2020 Submissions

The Voice Conversion Challenge 2020 is the third edition under its flagship that promotes intra-lingual semiparallel and cross-lingual voice conversion (VC). While the primary evaluation of the challenge submissions was done through crowd-sourced listening tests, we also performed an objective assessment of the submitted systems. The aim of the objective assessment is to provide complementary performance analysis that may be more beneficial than the time-consuming listening tests. In this study, we examined five types of objective assessments using automatic speaker verification (ASV), neural speaker embeddings, spoofing countermeasures, predicted mean opinion scores (MOS), and automatic speech recognition (ASR). Each of these objective measures assesses the VC output along different aspects. We observed that the correlations of these objective assessments with the subjective results were high for ASV, neural speaker embedding, and ASR, which makes them more influential for predicting subjective test results. In addition, we performed spoofing assessments on the submitted systems and identified some of the VC methods showing a potentially high security risk.

preprint2020arXiv

Pressure-induced Topological and Structural Phase Transitions in an Antiferromagnetic Topological Insulator

Recently, natural van der Waals heterostructures of (MnBi2Te4)m(Bi2Te3)n have been theoretically predicted and experimentally shown to host tunable magnetic properties and topologically nontrivial surface states. In this work, we systematically investigate both the structural and electronic responses of MnBi2Te4 and MnBi4Te7 to external pressure. In addition to the suppression of antiferromagnetic order, MnBi2Te4 is found to undergo a metal-semiconductor-metal transition upon compression. The resistivity of MnBi4Te7 changes dramatically under high pressure, and a non-monotonic evolution of ρ(T) is observed. The nontrivial topology is shown to persist up to the structural phase transition observed in the high-pressure regime. We find that the bulk and surface states respond differently to pressure, which is consistent with the non-monotonic change of the resistivity. Interestingly, a pressure-induced amorphous state is observed in MnBi2Te4, while two high-pressure phase transitions are revealed in MnBi4Te7. Our combined theoretical and experimental research establishes MnBi2Te4 and MnBi4Te7 as highly tunable magnetic topological insulators, in which phase transitions and new ground states emerge upon compression.

preprint2020arXiv

Principal Regression for High Dimensional Covariance Matrices

This manuscript presents an approach to perform generalized linear regression with multiple high-dimensional covariance matrices as the outcome. We propose estimating the model parameters by maximizing a pseudo-likelihood. When the data are high dimensional, the normal likelihood function is ill-posed as the sample covariance matrix is rank-deficient. Thus, a well-conditioned linear shrinkage estimator of the covariance matrix is introduced. With multiple covariance matrices, the shrinkage coefficients are proposed to be common across matrices. Theoretical studies demonstrate that the proposed covariance matrix estimator is optimal, asymptotically achieving the uniformly minimum quadratic loss among all linear combinations of the identity matrix and the sample covariance matrix. Under regularity conditions, the proposed estimator of the model parameters is consistent. The superior performance of the proposed approach over existing methods is illustrated through simulation studies. Applied to a resting-state functional magnetic resonance imaging study from the Alzheimer's Disease Neuroimaging Initiative, the proposed approach identified a brain network within which functional connectivity is significantly associated with Apolipoprotein E $\varepsilon$4, a strong genetic marker for Alzheimer's disease.

preprint2020arXiv

Voice Conversion Challenge 2020: Intra-lingual semi-parallel and cross-lingual voice conversion

The voice conversion challenge is a biennial scientific event held to compare and understand different voice conversion (VC) systems built on a common dataset. In 2020, we organized the third edition of the challenge and constructed and distributed a new database for two tasks, intra-lingual semi-parallel and cross-lingual VC. After a two-month challenge period, we received 33 submissions, including 3 baselines built on the database. From the results of crowd-sourced listening tests, we observed that VC methods have progressed rapidly thanks to advanced deep learning methods. In particular, speaker similarity scores of several systems turned out to be as high as those of the target speakers in the intra-lingual semi-parallel VC task. However, we confirmed that none of them has yet achieved human-level naturalness for the same task. The cross-lingual conversion task is, as expected, a more difficult task, and the overall naturalness and similarity scores were lower than those for the intra-lingual conversion task. However, we observed encouraging results, and the MOS scores of the best systems were higher than 4.0. We also show a few additional analysis results to aid in understanding cross-lingual VC better.

preprint2019arXiv

B-Value and Empirical Equivalence Bound: A New Procedure of Hypothesis Testing

In this study, we propose a two-stage procedure for hypothesis testing, where the first stage is conventional hypothesis testing and the second is an equivalence testing procedure using an introduced Empirical Equivalence Bound. In 2016, the American Statistical Association released a policy statement on P-values to clarify the proper use and interpretation in response to the criticism of reproducibility and replicability in scientific findings. A recent solution to improve reproducibility and transparency in statistical hypothesis testing is to integrate P-values (or confidence intervals) with practical or scientific significance. Similar ideas have been proposed via the equivalence test, where the goal is to infer equality under a presumption (null) of inequality of parameters. However, in these testing procedures, the definition of scientific significance/equivalence can be subjective. To circumvent this drawback, we introduce a B-value and the Empirical Equivalence Bound, which are both estimated from the data. Performing a second-stage equivalence test, our procedure offers an opportunity to correct for false positive discoveries and improve the reproducibility in findings across studies.

preprint2019arXiv

Inverse scattering transformation for the Fokas-Lenells equation with nonzero boundary conditions

In this article, we focus on the inverse scattering transformation for the Fokas-Lenells (FL) equation with nonzero boundary conditions via the Riemann-Hilbert (RH) approach. Based on the Lax pair of the FL equation, the analyticity, symmetry, and asymptotic behavior of the Jost solutions and the scattering matrix are discussed in detail. With these results, we further present a generalized RH problem, from which a reconstruction formula between the solution of the FL equation and the RH problem is obtained. The N-soliton solutions of the FL equation are obtained by solving the RH problem.