Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
17 works
0 followers
16 topics
4 close collaborators

Published work

17 published item(s)

preprint, 2026, arXiv

Deep learning-driven atmospheric parameter prediction for hot subdwarf stars with synthetic and observed spectra

We design a convolutional neural network (CNN) incorporating channel attention and spatial attention mechanisms to predict atmospheric parameters of hot subdwarfs. The experimental dataset comprises spectra at nine distinct signal-to-noise ratio (SNR) levels, with each SNR level containing 11 396 synthetic spectra and 945 observed spectra. The trained deep learning models achieve mean absolute errors (MAE) in predicting hot subdwarf atmospheric parameters of 730 K for effective temperature (Teff), 0.09 dex for surface gravity (log g), and 0.03 dex for helium abundance (log(nHe/nH)), which matches the accuracy of traditional spectral fitting methods. Utilizing the trained deep learning models and low-resolution spectra from LAMOST DR12, we confirm 1512 hot subdwarfs from the catalog of hot subdwarf candidates, of which 291 are newly identified. Our results demonstrate that the deep learning model not only achieves accuracy comparable to traditional methods in obtaining hot subdwarf atmospheric parameters, but also far exceeds them in speed and efficiency, making it particularly suitable for the analysis of large datasets of hot subdwarf spectra.

preprint, 2022, arXiv

Continual Learning for CTR Prediction: A Hybrid Approach

Click-through rate (CTR) prediction is a core task in cost-per-click (CPC) advertising systems and has been studied extensively by machine learning practitioners. While many existing methods have been successfully deployed in practice, most of them are built upon the i.i.d. (independent and identically distributed) assumption, ignoring that the click data used for training and inference is collected through time and is intrinsically non-stationary and drifting. This mismatch will inevitably lead to sub-optimal performance. To address this problem, we formulate CTR prediction as a continual learning task and propose COLF, a hybrid COntinual Learning Framework for CTR prediction, which has a memory-based modular architecture that is designed to adapt, learn and give predictions continuously when faced with non-stationary drifting click data streams. Married with a memory population method that explicitly controls the discrepancy between memory and target data, COLF is able to gain positive knowledge from its historical experience and make improved CTR predictions. Empirical evaluations on click logs collected from a major shopping app in China demonstrate our method's superiority over existing methods. Additionally, we have deployed our method online and observed significant CTR and revenue improvement, which further demonstrates our method's efficacy.

preprint, 2022, arXiv

Hybrid CNN Based Attention with Category Prior for User Image Behavior Modeling

User historical behaviors have proved useful for Click Through Rate (CTR) prediction in online advertising systems. In Meituan, one of the largest e-commerce platforms in China, an item is typically displayed with its image, and whether a user clicks the item or not is usually influenced by its image, which implies that a user's image behaviors are helpful for understanding the user's visual preference and improving the accuracy of CTR prediction. Existing user image behavior models typically use a two-stage architecture, which extracts visual embeddings of images through off-the-shelf Convolutional Neural Networks (CNNs) in the first stage, and then jointly trains a CTR model with those visual embeddings and non-visual features. We find that the two-stage architecture is sub-optimal for CTR prediction. Meanwhile, precisely labeled categories in online ad systems contain abundant visual prior information, which can enhance the modeling of user image behaviors. However, off-the-shelf CNNs without category prior may extract category-unrelated features, limiting the CNN's expressive ability. To address the two issues, we propose a hybrid CNN based attention module, unifying users' image behaviors and category prior, for CTR prediction. Our approach achieves significant improvements in both online and offline experiments on a billion-scale real serving dataset.

preprint, 2022, arXiv

Improving Deliberation by Text-Only and Semi-Supervised Training

Text-only and semi-supervised training based on audio-only data has gained popularity recently due to the wide availability of unlabeled text and speech data. In this work, we propose incorporating text-only and semi-supervised training into an attention-based deliberation model. By incorporating text-only data in training a bidirectional encoder representation from transformer (BERT) for the deliberation text encoder, and large-scale text-to-speech and audio-only utterances using joint acoustic and text decoder (JATD) and semi-supervised training, we achieved 4%-12% WER reduction for various tasks compared to the baseline deliberation. Compared to a state-of-the-art language model (LM) rescoring method, the deliberation model reduces the Google Voice Search WER by 11% relative. We show that the deliberation model also achieves a positive human side-by-side evaluation compared to the state-of-the-art LM rescorer with reasonable endpointer latencies.

preprint, 2022, arXiv

Intense high-harmonic optical vortices generated from a micro-plasma-waveguide irradiated by a circularly polarized laser pulse

A scheme for generating intense high-harmonic optical vortices is proposed. It relies on spin-orbit interaction of light when a relativistically strong circularly polarized laser pulse irradiates a micro-plasma-waveguide. The intense laser field drives a strong surface wave at the inner boundary of the waveguide, which leads to high-order harmonic generation as the laser travels inside. For a circularly polarized drive laser, the optical chirality is imprinted on the surface wave, which facilitates conversion of spin angular momentum of the fundamental light into orbital angular momenta of the harmonics. A "shaken waveguide" model is developed showing that the aforementioned phenomena arise due to a nonlinear plasma response that modifies the electromagnetic mode at high intensities. We further show that the phase velocities of all the harmonic beams are automatically matched to the driving laser, so that the harmonic intensities increase with propagation distance. The efficiency of harmonic production is related to the surface wave breaking effect, which can be significantly enhanced using a tightly focused laser. Our simulation suggests that an overall conversion efficiency of $\sim5\%$ can be achieved.

preprint, 2022, arXiv

Streaming Align-Refine for Non-autoregressive Deliberation

We propose a streaming non-autoregressive (non-AR) decoding algorithm to deliberate the hypothesis alignment of a streaming RNN-T model. Our algorithm facilitates a simple greedy decoding procedure, and at the same time is capable of producing the decoding result at each frame with limited right context, thus enjoying both high efficiency and low latency. These advantages are achieved by converting the offline Align-Refine algorithm to be streaming-compatible, with a novel transformer decoder architecture that performs local self-attentions for both text and audio, and a time-aligned cross-attention at each layer. Furthermore, we perform discriminative training of our model with the minimum word error rate (MWER) criterion, which has not been done in the non-AR decoding literature. Experiments on voice search datasets and Librispeech show that with reasonable right context, our streaming model performs as well as the offline counterpart, and discriminative training leads to further WER gain when the first-pass model has small capacity.

preprint, 2021, arXiv

Polarized skylight orientation determination artificial neural network

This paper proposes an artificial neural network to determine orientation using polarized skylight. This neural network has specific dilated convolution, which can extract light intensity information of different polarization directions. Then, the degree of polarization (DOP) and angle of polarization (AOP) are directly extracted in the network. In addition, the exponential function encoding of orientation is designed as the network output, which can better reflect the insect's encoding of polarization information, and improve the accuracy of orientation determination. Finally, training and testing were conducted on a public polarized skylight navigation dataset, and the experimental results proved the stability and effectiveness of the network.
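The abstract does not say how the network computes DOP and AOP internally. As a rough illustration only, the sketch below uses the standard Stokes-parameter formulas for four intensity measurements taken behind linear polarizers at 0°, 45°, 90° and 135°. The function name `dop_aop` and the sample intensities are assumptions made for this example, not part of the paper.

```python
import math

def dop_aop(i0, i45, i90, i135):
    """Return (degree of polarization, angle of polarization in radians).

    Inputs are intensities measured behind linear polarizers oriented at
    0, 45, 90 and 135 degrees (standard linear Stokes formulation).
    """
    s0 = (i0 + i45 + i90 + i135) / 2.0  # total intensity
    s1 = i0 - i90                       # 0/90 degree difference
    s2 = i45 - i135                     # 45/135 degree difference
    if s0 <= 0:
        raise ValueError("total intensity must be positive")
    dop = math.hypot(s1, s2) / s0
    aop = 0.5 * math.atan2(s2, s1)
    return dop, aop

# Hypothetical sample: half-polarized light with its polarization at 45 degrees.
dop, aop = dop_aop(1.0, 1.5, 1.0, 0.5)
print(dop, math.degrees(aop))
```

A learned dilated convolution could in principle gather the same four directional intensities from neighbouring pixels, but that mapping is a guess about the architecture rather than something the abstract states.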

preprint, 2021, arXiv

Transformer Based Deliberation for Two-Pass Speech Recognition

Interactive speech recognition systems must generate words quickly while also producing accurate results. Two-pass models excel at these requirements by employing a first-pass decoder that quickly emits words, and a second-pass decoder that requires more context but is more accurate. Previous work has established that a deliberation network can be an effective second-pass model. The model attends to two kinds of inputs at once: encoded audio frames and the hypothesis text from the first-pass model. In this work, we explore using transformer layers instead of long short-term memory (LSTM) layers for deliberation rescoring. In transformer layers, we generalize the "encoder-decoder" attention to attend to both encoded audio and first-pass text hypotheses. The output context vectors are then combined by a merger layer. Compared to LSTM-based deliberation, our best transformer deliberation achieves 7% relative word error rate improvements along with a 38% reduction in computation. We also compare against non-deliberation transformer rescoring, and find a 9% relative improvement.
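For readers less familiar with the metric, here is a minimal sketch of how a word error rate and a relative WER improvement are usually computed. The helper names and the example sentences and WER values are hypothetical, chosen only to show the arithmetic; they are not results from the paper.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    prev = list(range(len(hyp) + 1))  # distances against an empty reference
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)

def relative_reduction(baseline_wer, new_wer):
    """Relative improvement of new_wer over baseline_wer (0.07 means 7%)."""
    return (baseline_wer - new_wer) / baseline_wer

# Hypothetical example: one substitution plus one deletion over four words.
print(word_error_rate("play some jazz music", "play sum jazz"))
# Hypothetical WERs: a drop from 10.0% to 9.3% is a 7% relative improvement.
print(relative_reduction(0.100, 0.093))
```

A relative figure depends on the baseline, so a 7% relative improvement corresponds to a much smaller absolute change in WER.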

preprint, 2020, arXiv

A Streaming On-Device End-to-End Model Surpassing Server-Side Conventional Model Quality and Latency

Thus far, end-to-end (E2E) models have not been shown to outperform state-of-the-art conventional models with respect to both quality, i.e., word error rate (WER), and latency, i.e., the time the hypothesis is finalized after the user stops speaking. In this paper, we develop a first-pass Recurrent Neural Network Transducer (RNN-T) model and a second-pass Listen, Attend, Spell (LAS) rescorer that together surpass a conventional model in both quality and latency. On the quality side, we incorporate a large number of utterances across varied domains to increase acoustic diversity and the vocabulary seen by the model. We also train with accented English speech to make the model more robust to different pronunciations. In addition, given the increased amount of training data, we explore a varied learning rate schedule. On the latency front, we explore using the end-of-sentence decision emitted by the RNN-T model to close the microphone, and also introduce various optimizations to improve the speed of LAS rescoring. Overall, we find that RNN-T+LAS offers a better WER and latency tradeoff compared to a conventional model. For example, for the same latency, RNN-T+LAS obtains an 8% relative improvement in WER, while being more than 400 times smaller in model size.

preprint, 2020, arXiv

Deliberation Model Based Two-Pass End-to-End Speech Recognition

End-to-end (E2E) models have made rapid progress in automatic speech recognition (ASR) and perform competitively relative to conventional models. To further improve the quality, a two-pass model has been proposed to rescore streamed hypotheses using the non-streaming Listen, Attend and Spell (LAS) model while maintaining a reasonable latency. The model attends to acoustics to rescore hypotheses, as opposed to a class of neural correction models that use only first-pass text hypotheses. In this work, we propose to attend to both acoustics and first-pass hypotheses using a deliberation network. A bidirectional encoder is used to extract context information from first-pass hypotheses. The proposed deliberation model achieves 12% relative WER reduction compared to LAS rescoring in Google Voice Search (VS) tasks, and 23% reduction on a proper noun test set. Compared to a large conventional model, our best model performs 21% relatively better for VS. In terms of computational complexity, the deliberation decoder has a larger size than the LAS decoder, and hence requires more computations in second-pass decoding.

preprint, 2020, arXiv

HazeDose: Design and Analysis of a Personal Air Pollution Inhaled Dose Estimation System using Wearable Sensors

Air pollution has become one of the biggest global issues in both developing and developed countries. The traditional way to help individuals understand their air pollution exposure and health risks is for government agencies to use data from static monitoring stations to estimate air quality over a large area. Data from such sensing systems is very sparse and cannot reflect real personal exposure. In recent years, several research groups have developed participatory air pollution sensing systems which use wearable or portable units coupled with smartphones to crowd-source urban air pollution data. These systems have shown remarkable improvement in spatial granularity over government-operated fixed monitoring systems. In this paper, we extend the paradigm to the HazeDose system, which can personalize individuals' air pollution exposure. Specifically, we combine the pollution concentrations obtained from an air pollution estimation system with the activity data from the individual's on-body activity monitors to estimate the personal inhalation dosage of air pollution. Users can visualize their personalized air pollution exposure information via a mobile application. We show that different activities, such as walking, cycling, or driving, impact the dosage, and that commuting patterns contribute a significant proportion of an individual's daily air pollution dosage. Moreover, we propose a dosage minimization algorithm, with trial results showing that up to 14.1% of a biker's daily exposure can be reduced, while a driver using alternative routes can inhale 25.9% less than usual. A heuristic algorithm is also introduced to balance execution time and dosage reduction in alternative-route scenarios. The results show that up to 20.3% dosage reduction can be achieved when the execution time is almost one seventieth of the original.
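The abstract does not give the dose formula. One common way to combine concentration and activity data, used here purely as an assumed sketch, is to sum concentration × ventilation rate × duration over the segments of a day. The ventilation rates, the table name `VENTILATION_M3_PER_H` and the sample commute below are illustrative placeholders, not values from the HazeDose paper.

```python
# Hypothetical breathing (ventilation) rates in cubic metres per hour.
# Illustrative placeholders only; real rates vary by person and effort.
VENTILATION_M3_PER_H = {"driving": 0.6, "walking": 1.2, "cycling": 2.4}

def inhaled_dose_ug(segments):
    """Sum concentration * ventilation rate * duration over activity segments.

    Each segment is (activity, concentration in ug/m^3, duration in hours).
    Returns the estimated inhaled mass in micrograms.
    """
    total = 0.0
    for activity, concentration_ug_m3, hours in segments:
        total += concentration_ug_m3 * VENTILATION_M3_PER_H[activity] * hours
    return total

# Hypothetical commute: 30 minutes walking, then 15 minutes cycling.
commute = [("walking", 50.0, 0.5), ("cycling", 80.0, 0.25)]
print(inhaled_dose_ug(commute))
```

Under this assumed model, a route with lower concentration can still yield a higher dose if it takes longer or requires more exertion, which is the kind of trade-off a dosage minimization algorithm would have to weigh.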

preprint, 2020, arXiv

Highly-efficient terahertz radiation generated by surface electrons from laser-foil interactions

A novel scheme for generating powerful terahertz (THz) radiation based on laser-solid interactions is proposed. When a $p$-polarized femtosecond laser impinges obliquely on a plane solid target and the target partially blocks the laser energy, surface electrons are extracted and accelerated by the laser fields, forming a low-divergence electron beam. A half-cycle THz radiation pulse is emitted simultaneously as the beam passes by the edge of the target, due to coherent diffraction radiation. Our particle-in-cell simulations show that the relativistic THz pulse can have an energy of a few tens of millijoules and the conversion efficiency can be over $1\%$ with existing $\sim$J-level femtosecond laser sources.

preprint, 2020, arXiv

Multimillijoule terahertz radiation from laser interactions with microplasma-waveguides

When a relativistic, femtosecond laser pulse enters a waveguide, the pulse energy is coupled into waveguide optical modes. The longitudinal laser field effectively accelerates electrons along the axis of the channel, while the asymmetric transverse electromagnetic fields tend to expel fast electrons radially outwards. At the exit of the waveguide, the $\sim{\rm nC}$, $\sim 10\ {\rm MeV}$ electron bunch converts its energy to a $\sim 10\ {\rm mJ}$ terahertz (THz) laser pulse through coherent diffraction radiation. In this paper, we present 3D particle-in-cell simulations and theoretical analyses of the aforementioned interaction process. We investigate the process of longitudinal acceleration and radial expulsion of fast electrons, as well as the dependence of the properties of the resulting THz radiation on laser and plasma parameters and the effects of the preplasma. The simulation results indicate that the conversion efficiency of energy can be over $5\%$ if the waveguide length is optimal and a high contrast pump laser is used. These results guide the design of more intense and powerful THz sources.

preprint, 2020, arXiv

Studying the local magnetic field and anisotropy of magnetic turbulence by synchrotron polarization derivative

Due to the inevitable accumulation of the observed information along the line of sight, it is difficult to measure the local magnetic field of MHD turbulence. However, a correct understanding of the local magnetic field is a prerequisite for reconstructing the Galactic 3D magnetic field. We study how to reveal the local magnetic field direction and the eddy anisotropy on the basis of the statistics of the synchrotron polarization derivative with respect to the squared wavelength, $dP/d\lambda^2$. In the low-frequency and strong Faraday rotation regime, we implement numerical simulations combining multiple statistical techniques, such as the structure function, quadrupole ratio modulus, spectral correlation function, correlation function anisotropy and spatial gradient techniques. We find that (1) statistical analysis of $dP/d\lambda^2$ indeed reveals the anisotropy of the underlying MHD turbulence, the degree of which increases with increasing radiation frequency; (2) the synergy of correlation function anisotropy and gradient calculation of $dP/d\lambda^2$ enables the measurement of the local magnetic field direction.

preprint, 2013, arXiv

Robustness of Link-prediction Algorithm Based on Similarity and Application to Biological Networks

Many algorithms have been proposed to predict missing links in a variety of real networks. These studies focus mainly on the accuracy and efficiency of these algorithms. However, little attention is paid to their robustness against either noise or the irrationality of links present in almost all real networks. In this paper, we investigate the robustness of several typical node-similarity-based algorithms and find that these algorithms are sensitive to the strength of noise. Moreover, we find that this robustness also depends on the networks' structural properties, especially on network efficiency, clustering coefficient and average degree. In addition, we attempt to enhance the robustness by using a link weighting method to transform an unweighted network into a weighted one and then using the weights of links to characterize their reliability. The results show that a proper link weighting scheme can significantly enhance both the robustness and the accuracy of these algorithms in biological networks while adding little computational cost.

preprint, 2012, arXiv

Link Prediction in Complex Networks by Multi Degree Preferential-Attachment Indices

In principle, the rules of link formation in a network model can be considered a kind of link prediction algorithm. By revisiting the preferential attachment mechanism for generating a scale-free network, here we propose a class of preferential attachment indices which differ from the previous one. Traditionally, the preferential attachment index is defined by the product of the related nodes' degrees, while the new indices define the similarity score of a pair of nodes by either the maximum of the two nodes' degrees or the sum of their degrees. Extensive experiments are carried out on fourteen real-world networks. Compared with the traditional preferential attachment index, the new ones, especially the degree-summarization similarity index, can provide more accurate prediction on most of the networks. Due to their improved prediction accuracy and low computational complexity, these proposed preferential attachment indices may help guide the mining of unknown links in incomplete networks.
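The three degree-based scores described in the abstract (product, maximum and sum of the two nodes' degrees) are simple enough to sketch directly. The code below is only an illustration under stated assumptions: the function name `degree_indices`, the dictionary keys and the toy edge list are invented for the example, and the graph is assumed to be undirected with no duplicate edges or self-loops.

```python
from itertools import combinations

def degree_indices(edges):
    """Score every non-adjacent node pair by product, max and sum of degrees."""
    degree = {}
    adjacent = set()
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
        adjacent.add(frozenset((u, v)))
    scores = {}
    for u, v in combinations(sorted(degree), 2):
        if frozenset((u, v)) in adjacent:
            continue  # only currently unlinked pairs are candidate links
        ku, kv = degree[u], degree[v]
        scores[(u, v)] = {
            "product": ku * kv,   # traditional preferential attachment index
            "max": max(ku, kv),   # maximum-degree variant
            "sum": ku + kv,       # degree-summarization variant
        }
    return scores

# Hypothetical toy network.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]
for pair, s in degree_indices(edges).items():
    print(pair, s)
```

Ranking the candidate pairs by any one of the three scores gives a link prediction list. All three need only node degrees, which is consistent with the low computational complexity the abstract mentions.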

preprint, 2011, arXiv

Evolving network models under a dynamic growth rule

Evolving network models under a dynamic growth rule which comprises the addition and deletion of nodes are investigated. By adding a node with a probability $P_a$ or deleting a node with the probability $P_d=1-P_a$ at each time step, where $P_a$ and $P_d$ are determined by the Logistic population equation, topological properties of networks are studied. All the fat-tailed degree distributions observed in real systems are obtained, providing evidence that the mechanism of addition and deletion can lead to the diversity of degree distributions of real systems. Moreover, it is found that the networks exhibit nonstationary degree distributions, changing from the power-law to the exponential one or from the exponential to the Gaussian one. These results can be expected to shed some light on the formation and evolution of real-world complex networks.