Source author record

Tao Luo

Tao Luo appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Catalog footprint

What is connected

43 works

29 topics

4 close collaborators

preprint · 2026 · arXiv

Focus and Dilution: The Multi-stage Learning Process of Attention

Transformer-based models have achieved remarkable success across a wide range of domains, yet our understanding of their training dynamics remains limited. In this work, we identify a recurrent focus-dilution cycle in attention learning and provide a rigorous explanation in a one-layer Transformer setting for Markovian data via gradient-flow analysis. Using stage-wise linearization around critical points, we show that a single focus-dilution cycle can be decomposed into a sequence of distinct stages. First, embedding and projection rapidly condense to a rank-one structure, while attention parameters remain effectively frozen. Then, the attention parameters begin to increase, inducing a frequency-driven focus toward high-frequency tokens. As attention continues to evolve, it generates next-order perturbations in embeddings, leading to a mass-redistribution mechanism that progressively dilutes this focus. Finally, small asymmetries among low-frequency tokens lift a degenerate critical point, opening new embedding directions and initiating the next cycle. Experiments on synthetic Markovian data as well as WikiText and TinyStories corroborate the predicted stages and cyclical dynamics.

preprint · 2023 · arXiv

Optimization of Image Transmission in a Cooperative Semantic Communication Networks

In this paper, a semantic communication framework for image transmission is developed. In the investigated framework, a set of servers cooperatively transmit images to a set of users utilizing semantic communication techniques. To evaluate the performance of the studied semantic communication system, a multimodal metric is proposed to measure the correlation between the extracted semantic information and the original image. To meet the ISS requirement of each user, each server must jointly determine the semantic information to be transmitted and the resource blocks (RBs) used for semantic information transmission. We formulate this problem as an optimization problem aiming to minimize each server's transmission latency while reaching the ISS requirement. To solve this problem, a value decomposition based entropy-maximized multi-agent reinforcement learning (RL) algorithm is proposed, which enables servers to coordinate for training and execute RB allocation in a distributed manner to approach globally optimal performance with fewer training iterations. Compared to traditional multi-agent RL, the proposed RL algorithm improves the valuable action exploration of servers and the probability of finding a globally optimal RB allocation policy based on local observation. Simulation results show that the proposed algorithm can reduce the transmission delay by up to 16.1% compared to traditional multi-agent RL.

preprint · 2022 · arXiv

A Resource-efficient Spiking Neural Network Accelerator Supporting Emerging Neural Encoding

Spiking neural networks (SNNs) have recently gained momentum due to their low-power, multiplication-free computing and their closer resemblance to biological processes in the human nervous system. However, SNNs require very long spike trains (up to 1000) to reach an accuracy similar to their artificial neural network (ANN) counterparts for large models, which offsets efficiency and inhibits their application to low-power systems for real-world use cases. To alleviate this problem, emerging neural encoding schemes have been proposed to shorten the spike train while maintaining high accuracy. However, current SNN accelerators do not support these emerging encoding schemes well. In this work, we present a novel hardware architecture that can efficiently support SNNs with emerging neural encoding. Our implementation features energy- and area-efficient processing units with increased parallelism and reduced memory accesses. We verified the accelerator on FPGA and achieved 25% and 90% improvement over previous work in power consumption and latency, respectively. At the same time, high area efficiency allows us to scale to large neural network models. To the best of our knowledge, this is the first work to deploy the large neural network model VGG on physical FPGA-based neuromorphic hardware.

preprint · 2022 · arXiv

An Upper Limit of Decaying Rate with Respect to Frequency in Deep Neural Network

A deep neural network (DNN) usually learns the target function from low to high frequency, which is called the frequency principle or spectral bias. This frequency principle sheds light on a high-frequency curse of DNNs: the difficulty of learning high-frequency information. Inspired by the frequency principle, a series of works are devoted to developing algorithms for overcoming the high-frequency curse. A natural question arises: what is the upper limit of the decaying rate w.r.t. frequency when one trains a DNN? In this work, our theory, confirmed by numerical experiments, suggests that there is a critical decaying rate w.r.t. frequency in DNN training. Below the upper limit of the decaying rate, the DNN interpolates the training data by a function with a certain regularity. However, above the upper limit, the DNN interpolates the training data by a trivial function, i.e., a function that is non-zero only at the training data points. Our results indicate that a better way to overcome the high-frequency curse is to design a proper pre-conditioning approach that shifts high-frequency information to low-frequency information, which coincides with several previously developed algorithms for fast learning of high-frequency information. More importantly, this work rigorously proves that the high-frequency curse is an intrinsic difficulty of DNNs.

preprint · 2022 · arXiv

Bunching instability and asymptotic properties in epitaxial growth with elasticity effects: continuum model

We study the continuum epitaxial model for elastic interacting atomic steps on vicinal surfaces proposed by Xiang and E (Xiang, SIAM J. Appl. Math. 63:241-258, 2002; Xiang and E, Phys. Rev. B 69:035409, 2004). The non-local term and the singularity complicate the analysis of its PDE. In this paper, we first generalize this model to the Lennard-Jones (m,n) interaction between steps. Based on several important formulations of the non-local energy, we prove the existence, symmetry, unimodality, and regularity of the energy minimizer in the periodic setting. In particular, the symmetry and unimodality of the minimizer imply that it has a bunching profile. Furthermore, we derive the minimum energy scaling law for the original continuum model. All results are consistent with the corresponding results proved for discrete models by Luo et al. (Luo et al., Multiscale Model. Simul. 14:737-771, 2016).

preprint · 2022 · arXiv

Embedding Principle of Loss Landscape of Deep Neural Networks

Understanding the structure of the loss landscape of deep neural networks (DNNs) is obviously important. In this work, we prove an embedding principle that the loss landscape of a DNN "contains" all the critical points of all the narrower DNNs. More precisely, we propose a critical embedding such that any critical point, e.g., local or global minima, of a narrower DNN can be embedded to a critical point/hyperplane of the target DNN with higher degeneracy and preserving the DNN output function. The embedding structure of critical points is independent of loss function and training data, showing a stark difference from other nonconvex problems such as protein-folding. Empirically, we find that a wide DNN is often attracted by highly-degenerate critical points that are embedded from narrow DNNs. The embedding principle provides an explanation for the general easy optimization of wide DNNs and unravels a potential implicit low-complexity regularization during the training. Overall, our work provides a skeleton for the study of loss landscape of DNNs and its implication, by which a more exact and comprehensive understanding can be anticipated in the near

preprint · 2022 · arXiv

Existence, uniqueness, and energy scaling of 2+1 dimensional continuum model for stepped epitaxial surfaces with elastic effects

We study the 2+1 dimensional continuum model for the evolution of stepped epitaxial surface under long-range elastic interaction proposed by Xu and Xiang (SIAM J. Appl. Math. 69, 1393-1414, 2009). The long-range interaction term and the two length scales in this model make PDE analysis challenging. Moreover, unlike in the 1+1 dimensional case, there is a nonconvexity contribution in the total energy in the 2+1 dimensional case, and it is not easy to prove that the solution is always in the well-posed regime during the evolution. In this paper, we propose a modified 2+1 dimensional continuum model based on the underlying physics. This modification fixes the problem of possible ill-posedness due to the nonconvexity of the energy functional. We prove the existence and uniqueness of both the static and dynamic solutions and derive a minimum energy scaling law for them. We show that the minimum energy surface profile is mainly attained by surfaces with step meandering instability. This is essentially different from the energy scaling law for the 1+1 dimensional epitaxial surfaces under elastic effects attained by step bunching surface profiles. We also discuss the transition from the step bunching instability to the step meandering instability in 2+1 dimensions.

preprint · 2022 · arXiv

Feasibility Study of $D_s^+ \to τ^+ ν_τ$ Decay and Test of Lepton Flavor Universality with Leptonic $D_s^+$ Decays at STCF

We report a sensitive study of $D_s^+ \to τ^+ ν_τ$ decay via $τ^+ \to e^+ ν_e \barν_τ$ with an integrated luminosity of 1 ab$^{-1}$ at the center-of-mass energy of $4.009$ GeV at a future Super Tau Charm Facility (STCF). With the help of the fast simulation software package, the statistical sensitivity for the absolute branching fraction of $D_s^+ \to τ^+ ν_τ$ is determined to be $2\times10^{-4}$. Combined with our previous prospect of $D_s^+ \to μ^+ ν_μ$, the ratio of the branching fractions for $D_s^+ \to τ^+ ν_τ$ over $D_s^+ \to μ^+ ν_μ$ can achieve a relative statistical precision of 0.5%, which will provide the most stringent test of the $τ$-$μ$ lepton flavor universality in heavy quark decays. Taking the decay constant $f_{D_s^+}$ from lattice QCD calculations or the CKM matrix element $|V_{cs}|$ from the CKMfitter group as an input, the relative statistical uncertainties for $|V_{cs}|$ and $f_{D_s^+}$ are estimated to be 0.3% and 0.2%, respectively.

preprint · 2022 · arXiv

From Mean Field Games To Navier-Stokes Equations

This work establishes the equivalence between Mean Field Game and a class of PDE systems closely related to compressible Navier-Stokes equations. The solvability of the PDE system via the existence of the Nash Equilibrium of the Mean Field Game is provided under a set of conditions.

preprint · 2022 · arXiv

Meta-Reinforcement Learning for Reliable Communication in THz/VLC Wireless VR Networks

In this paper, the problem of enhancing the quality of virtual reality (VR) services is studied for an indoor terahertz (THz)/visible light communication (VLC) wireless network. In the studied model, small base stations (SBSs) transmit high-quality VR images to VR users over THz bands and light-emitting diodes (LEDs) provide accurate indoor positioning services for them using VLC. Here, VR users move in real time and their movement patterns change over time according to their applications, where both THz and VLC links can be blocked by the bodies of VR users. To control the energy consumption of the studied THz/VLC wireless VR network, VLC access points (VAPs) must be selectively turned on so as to ensure accurate and extensive positioning for VR users. Based on the user positions, each SBS must generate corresponding VR images and establish THz links without body blockage to transmit the VR content. The problem is formulated as an optimization problem whose goal is to maximize the reliability of the VR network by selecting the appropriate VAPs to be turned on and controlling the user association with SBSs. To solve this problem, a policy gradient-based reinforcement learning (RL) algorithm that adopts a meta-learning approach is proposed. The proposed meta policy gradient (MPG) algorithm enables the trained policy to quickly adapt to new user movement patterns. In order to solve the problem of maximizing the average number of successfully served users for VR scenarios with a large number of users, a dual method based MPG algorithm (D-MPG) with a low complexity is proposed. Simulation results demonstrate that, compared to the trust region policy optimization algorithm (TRPO), the proposed MPG and D-MPG algorithms yield up to 26.8% and 21.9% improvement in the reliability as well as 81.2% and 87.5% gains in the convergence speed, respectively.

preprint · 2022 · arXiv

Performance Optimization for Semantic Communications: An Attention-based Reinforcement Learning Approach

In this paper, a semantic communication framework is proposed for textual data transmission. In the studied model, a base station (BS) extracts the semantic information from textual data, and transmits it to each user. The semantic information is modeled by a knowledge graph (KG) that consists of a set of semantic triples. After receiving the semantic information, each user recovers the original text using a graph-to-text generation model. To measure the performance of the considered semantic communication framework, a metric of semantic similarity (MSS) that jointly captures the semantic accuracy and completeness of the recovered text is proposed. Due to wireless resource limitations, the BS may not be able to transmit the entire semantic information to each user and satisfy the transmission delay constraint. Hence, the BS must select an appropriate resource block for each user as well as determine and transmit part of the semantic information to the users. As such, we formulate an optimization problem whose goal is to maximize the total MSS by jointly optimizing the resource allocation policy and determining the partial semantic information to be transmitted. To solve this problem, a proximal-policy-optimization-based reinforcement learning (RL) algorithm integrated with an attention network is proposed. The proposed algorithm can evaluate the importance of each triple in the semantic information using an attention network and then build a relationship between the importance distribution of the triples in the semantic information and the total MSS. Compared to traditional RL algorithms, the proposed algorithm can dynamically adjust its learning rate, thus ensuring convergence to a locally optimal solution.

preprint · 2022 · arXiv

Quasi-periodic oscillations of the X-ray burst from the magnetar SGR J1935+2154 and associated with the fast radio burst FRB 200428

The origin(s) and mechanism(s) of fast radio bursts (FRBs), which are short radio pulses from cosmological distances, have remained a major puzzle since their discovery. We report a strong quasi-periodic oscillation (QPO) of 40 Hz in the X-ray burst from the magnetar SGR J1935+2154 and associated with FRB 200428, significantly detected with the Hard X-ray Modulation Telescope (Insight-HXMT) and also hinted at by the Konus-Wind data. QPOs from magnetar bursts have only been rarely detected; our 3.4 sigma (p-value is 2.9e-4) detection of the QPO reported here reveals the strongest QPO signal observed from magnetars (except in some very rare giant flares), making this X-ray burst unique among magnetar bursts. The two X-ray spikes coinciding with the two FRB pulses are also among the peaks of the QPO. Our results suggest that at least some FRBs are related to strong oscillation processes of neutron stars. We also show that we may overestimate the significance of the QPO signal and underestimate the errors of QPO parameters if the QPO exists only in a fraction of the time series of an X-ray burst which we use to calculate the Leahy-normalized periodogram.
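For readers unfamiliar with the Leahy-normalized periodogram mentioned above, the following is a minimal sketch of the standard normalization, P_j = 2|a_j|^2 / N_ph, where a_j is the discrete Fourier transform of the binned light curve and N_ph is the total number of counts. It is not the authors' analysis pipeline: the function name `leahy_periodogram` and its arguments are hypothetical, and a plain O(n^2) DFT is used only to keep the example dependency-free.

```python
import cmath
import math


def leahy_periodogram(counts, dt):
    """Leahy-normalized periodogram of an evenly binned light curve.

    counts: counts per time bin (illustrative input, not real data)
    dt:     bin width in seconds
    Returns (frequencies in Hz, Leahy powers) for the positive frequencies.
    """
    n = len(counts)
    n_photons = sum(counts)
    if n < 2 or n_photons <= 0:
        raise ValueError("need at least two bins and a positive total count")

    freqs, powers = [], []
    for j in range(1, n // 2 + 1):
        # Discrete Fourier amplitude at frequency index j (plain DFT).
        a_j = sum(
            x * cmath.exp(-2j * math.pi * j * k / n)
            for k, x in enumerate(counts)
        )
        freqs.append(j / (n * dt))
        # Leahy normalization: pure Poisson noise averages to a power of 2.
        powers.append(2.0 * abs(a_j) ** 2 / n_photons)
    return freqs, powers
```

With this normalization a QPO shows up as excess power above the noise level of 2 around its centroid frequency. The abstract's caveat applies directly: if the oscillation is present in only part of the time series passed in as `counts`, the significance derived from these powers can be overestimated.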

preprint · 2022 · arXiv

Singular Limits for the Navier-Stokes-Poisson Equations of Viscous Plasma with Strong Density Boundary Layer

The quasi-neutral limit of the Navier-Stokes-Poisson system modeling a viscous plasma with vanishing viscosity coefficients in the half-space $\mathbb{R}^{3}_{+}$ is rigorously proved under a Navier-slip boundary condition for velocity and the Dirichlet boundary condition for electric potential. This is achieved by establishing the nonlinear stability of the approximation solutions involving the strong boundary layer in density and electric potential, which comes from the break-down of the quasi-neutrality near the boundary, and dealing with the difficulty of the interaction of this strong boundary layer with the weak boundary layer of the velocity field.

preprint · 2022 · arXiv

Synthesis and Superconductivity in Yttrium-Cerium Hydrides at Moderate Pressures

Inspired by the high critical temperature in yttrium superhydride and the low stabilized pressure in superconducting cerium superhydride, we carry out four independent runs to synthesize yttrium-cerium alloy hydrides. The phases are examined by Raman scattering and x-ray diffraction measurements. The superconductivity is detected with the zero-resistance state at the critical temperature in the range of 97-140 K at pressures ranging from 114 GPa to 120$\pm$4 GPa. The maximum critical temperature of the synthesized hydrides is larger than those reported for cerium hydrides, while the corresponding stabilized pressure is much lower than those for superconducting yttrium hydrides. The structural analysis and theoretical calculations suggest that the phase of Y$_{0.5}$Ce$_{0.5}$H$_9$ has the space group $P6_3/mmc$ with the calculated critical temperature of 119 K, in fair agreement with the experiments. These results indicate that alloying superhydrides indeed can maintain relatively high critical temperature at modest pressures accessible by many laboratories.

preprint · 2022 · arXiv

Synthesis of Superconducting Phase of La$_{0.5}$Ce$_{0.5}$H$_{10}$ at High Pressures

Clathrate hydride $Fm\bar{3}m$-LaH$_{10}$ has been proven in recent years to be the most extraordinary superconductor, with a critical temperature $T_c$ above 250 K upon compression of hundreds of GPa. A general hope is to reduce the stabilization pressure and maintain the high $T_c$ value of the specific phase in LaH$_{10}$. However, strong structural instability distorts the $Fm\bar{3}m$ structure and leads to a rapid decrease of $T_c$ at low pressures. Here, we investigate the phase stability and superconducting behaviors of $Fm\bar{3}m$-LaH$_{10}$ with enhanced chemical pre-compression through partly replacing La by Ce atoms from both experiments and calculations. For explicitly characterizing the synthesized hydride, we choose a lanthanum-cerium alloy with a stoichiometric composition of 1:1. X-ray diffraction and Raman scattering measurements reveal the stabilization of $Fm\bar{3}m$-La$_{0.5}$Ce$_{0.5}$H$_{10}$ in the pressure range of 140-160 GPa. Superconductivity with $T_c$ of 175$\pm$2 K at 155 GPa is confirmed with the observation of the zero-resistivity state and supported by the theoretical calculations. These findings provide applicability in the future explorations for a large variety of hydrogen-rich hydrides.

preprint · 2022 · arXiv

Winograd Convolution: A Perspective from Fault Tolerance

Winograd convolution was originally proposed to reduce computing overhead by replacing multiplications in neural networks (NNs) with additions via linear transformation. Beyond computing efficiency, we observe its great potential for improving NN fault tolerance and evaluate its fault tolerance comprehensively for the first time. Then, we explore the use of the fault tolerance of Winograd convolution for either fault-tolerant or energy-efficient NN processing. According to our experiments, Winograd convolution can be utilized to reduce fault-tolerant design overhead by 27.49% or energy consumption by 7.19% without any accuracy loss, compared with designs that are unaware of this fault tolerance.
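To make the multiplication-for-addition trade concrete, here is a minimal sketch of the standard one-dimensional Winograd minimal filtering form F(2,3), which produces two outputs of a 3-tap filter with 4 multiplications instead of 6. It illustrates the general transform only, not the fault-tolerance evaluation in the paper; the function names are hypothetical.

```python
def direct_conv_1d(d, g):
    """Two outputs of a 3-tap filter over 4 inputs: 6 multiplications."""
    return [
        d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
        d[1] * g[0] + d[2] * g[1] + d[3] * g[2],
    ]


def winograd_f23(d, g):
    """Same two outputs via Winograd F(2,3): 4 data multiplications."""
    # Filter transform: depends only on g, so it can be precomputed once.
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) / 2
    u2 = (g[0] - g[1] + g[2]) / 2
    u3 = g[2]

    # Input transform uses only additions and subtractions,
    # followed by four element-wise multiplications.
    m1 = (d[0] - d[2]) * u0
    m2 = (d[1] + d[2]) * u1
    m3 = (d[2] - d[1]) * u2
    m4 = (d[1] - d[3]) * u3

    # Output transform: additions and subtractions only.
    return [m1 + m2 + m3, m2 - m3 - m4]
```

Calling both functions on the same input tile `d` (four values) and filter `g` (three values) should give matching results up to floating-point rounding. Fewer multiplications per output is the property whose effect on fault tolerance the abstract investigates.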

preprint · 2021 · arXiv

E3NE: An End-to-End Framework for Accelerating Spiking Neural Networks with Emerging Neural Encoding on FPGAs

Compiler frameworks are crucial for the widespread use of FPGA-based deep learning accelerators. They allow researchers and developers, who are not familiar with hardware engineering, to harness the performance attained by domain-specific logic. There exists a variety of frameworks for conventional artificial neural networks. However, not much research effort has been put into the creation of frameworks optimized for spiking neural networks (SNNs). This new generation of neural networks becomes increasingly interesting for the deployment of AI on edge devices, which have tight power and resource constraints. Our end-to-end framework E3NE automates the generation of efficient SNN inference logic for FPGAs. Based on a PyTorch model and user parameters, it applies various optimizations and assesses trade-offs inherent to spike-based accelerators. Multiple levels of parallelism and the use of an emerging neural encoding scheme result in an efficiency superior to previous SNN hardware implementations. For a similar model, E3NE uses less than 50% of hardware resources and 20% less power, while reducing the latency by an order of magnitude. Furthermore, scalability and generality allowed the deployment of the large-scale SNN models AlexNet and VGG.

preprint · 2021 · arXiv

Improving "Fast Iterative Shrinkage-Thresholding Algorithm": Faster, Smarter and Greedier

The "fast iterative shrinkage-thresholding algorithm", a.k.a. FISTA, is one of the most well-known first-order optimisation schemes in the literature, as it achieves the worst-case $O(1/k^2)$ optimal convergence rate in terms of objective function value. However, despite such an optimal theoretical convergence rate, in practice the (local) oscillatory behaviour of FISTA often damps its efficiency. Over the past years, various efforts have been made in the literature to improve the practical performance of FISTA, such as monotone FISTA, restarting FISTA and backtracking strategies. In this paper, we propose a simple yet effective modification to the original FISTA scheme which has two advantages: it allows us to 1) prove the convergence of the generated sequence; 2) design a so-called "lazy-start" strategy which can be up to an order faster than the original scheme. Moreover, by exploring the properties of the FISTA scheme, we propose novel adaptive and greedy strategies which probe the limit of the algorithm. The advantages of the proposed schemes are tested through problems arising from inverse problems, machine learning and signal/image processing.
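As background for the abstract above, the following is a minimal sketch of the original FISTA iteration for minimizing f(x) + g(x), with f smooth and g handled through its proximal operator. It shows the baseline scheme only; the modified, "lazy-start", adaptive and greedy variants proposed in the paper are not reproduced here. The function names and the callback signatures (`grad_f`, `prox_g`) are assumptions made for illustration.

```python
import math


def soft_threshold(v, t):
    """Proximal operator of t * ||x||_1, applied element-wise."""
    return [math.copysign(max(abs(x) - t, 0.0), x) for x in v]


def fista(grad_f, prox_g, x0, step, n_iter=200):
    """Baseline FISTA with a fixed step size.

    grad_f: gradient of the smooth term f
    prox_g: prox_g(v, step) returns the proximal point of step * g at v
    step:   fixed step size, assumed to be at most 1 / L for L-smooth f
    """
    x = list(x0)
    y = list(x0)
    t = 1.0
    for _ in range(n_iter):
        # Forward-backward (proximal gradient) step at the extrapolated point.
        g = grad_f(y)
        x_new = prox_g([yi - step * gi for yi, gi in zip(y, g)], step)
        # Momentum parameter update of the original scheme.
        t_new = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        beta = (t - 1.0) / t_new
        # Extrapolation: this inertial term is the source of the
        # oscillatory behaviour discussed in the abstract.
        y = [xn + beta * (xn - xo) for xn, xo in zip(x_new, x)]
        x, t = x_new, t_new
    return x
```

For an l1-regularized least-squares problem with weight `lam`, one would pass `prox_g=lambda v, s: soft_threshold(v, lam * s)` together with the gradient of the quadratic term and a step size no larger than the reciprocal of its Lipschitz constant.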

preprint · 2021 · arXiv

MOD-Net: A Machine Learning Approach via Model-Operator-Data Network for Solving PDEs

In this paper, we propose a machine learning approach via model-operator-data network (MOD-Net) for solving PDEs. A MOD-Net is driven by a model to solve PDEs based on operator representation with regularization from data. For linear PDEs, we use a DNN to parameterize the Green's function and obtain the neural operator to approximate the solution according to the Green's method. To train the DNN, the empirical risk consists of the mean squared loss with the least square formulation or the variational formulation of the governing equation and boundary conditions. For complicated problems, the empirical risk also includes a few labels, which are computed on coarse grid points with cheap computation cost and significantly improve the model accuracy. Intuitively, the labeled dataset works as a regularization in addition to the model constraints. The MOD-Net solves a family of PDEs rather than a specific one and is much more efficient than the original neural operator because few expensive labels are required. We numerically show that MOD-Net is very efficient in solving the Poisson equation and the one-dimensional radiative transfer equation. For nonlinear PDEs, the nonlinear MOD-Net can be similarly used as an ansatz for solving nonlinear PDEs, exemplified by solving several nonlinear PDE problems, such as the Burgers equation.

preprint · 2021 · arXiv

SG-PBFT: a Secure and Highly Efficient Blockchain PBFT Consensus Algorithm for Internet of Vehicles

The Internet of Vehicles (IoV) is an application of the Internet of Things (IoT). It faces two main security problems: (1) the central server of the IoV may not be powerful enough to support the centralized authentication of the rapidly increasing number of connected vehicles, (2) the IoV itself may not be robust enough against single-node attacks. To solve these problems, this paper proposes SG-PBFT: a secure and highly efficient PBFT consensus algorithm for Internet of Vehicles, which is based on a distributed blockchain structure. The distributed structure can reduce the pressure on the central server and decrease the risk of single-node attacks. The SG-PBFT consensus algorithm improves the traditional PBFT consensus algorithm by using a score grouping mechanism to achieve a higher consensus efficiency. The experimental results show that our method can greatly improve the consensus efficiency and prevent single-node attacks. Specifically, when the number of consensus nodes reaches 1000, the consensus time of our algorithm is only about 27% of what is required for the state-of-the-art consensus algorithm (PBFT). Our proposed SG-PBFT is versatile and can be used in other application scenarios which require high consensus efficiency.

preprint · 2020 · arXiv

A regularized deep matrix factorized model of matrix completion for image restoration

Matrix completion has been an important approach to image restoration. Most previous works on matrix completion focus on the low-rank property by imposing explicit constraints on the recovered matrix, such as the constraint of the nuclear norm or limiting the dimension of the matrix factorization component. Recently, theoretical works suggest that deep linear neural networks have an implicit bias towards low rank on matrix completion. However, low rank is not adequate to reflect the intrinsic characteristics of a natural image. Thus, algorithms with only the constraint of low rank are insufficient to perform image restoration well. In this work, we propose a Regularized Deep Matrix Factorized (RDMF) model for image restoration, which utilizes the implicit bias of the low rank of deep neural networks and the explicit bias of total variation. We demonstrate the effectiveness of our RDMF model with extensive experiments, in which our method surpasses the state-of-the-art models in common examples, especially for the restoration from very few observations. Our work sheds light on a more general framework for solving other inverse problems by combining the implicit bias of deep learning with explicit regularization.

preprint · 2020 · arXiv

A type of generalization error induced by initialization in deep neural networks

How initialization and loss function affect the learning of a deep neural network (DNN), specifically its generalization error, is an important problem in practice. In this work, by exploiting the linearity of DNN training dynamics in the NTK regime (Jacot et al., 2018; Lee et al., 2019), we provide an explicit and quantitative answer to this problem. Focusing on the regression problem, we prove that, in the NTK regime, for any loss in a general class of functions, the DNN finds the same global minimum: the one that is nearest to the initial value in the parameter space, or equivalently, the one that is closest to the initial DNN output in the corresponding reproducing kernel Hilbert space. Using these optimization problems, we quantify the impact of initial output and prove that a random non-zero one increases the generalization error. We further propose an antisymmetrical initialization (ASI) trick that eliminates this type of error and accelerates the training. To understand whether the above results hold in general, we also perform experiments for DNNs in the non-NTK regime, which demonstrate the effectiveness of our theoretical results and the ASI trick in a qualitative sense. Overall, our work serves as a baseline for the further investigation of the impact of initialization and loss function on the generalization of DNNs, which can potentially guide and improve the training of DNNs in practice.

preprint · 2020 · arXiv

Deep Learning for Optimal Deployment of UAVs with Visible Light Communications

In this paper, the problem of dynamical deployment of unmanned aerial vehicles (UAVs) equipped with visible light communication (VLC) capabilities for optimizing the energy efficiency of UAV-enabled networks is studied. In the studied model, the UAVs can simultaneously provide communications and illumination to service ground users. Since ambient illumination increases the interference over VLC links while reducing the illumination threshold of the UAVs, it is necessary to consider the illumination distribution of the target area for UAV deployment optimization. This problem is formulated as an optimization problem which jointly optimizes UAV deployment, user association, and power efficiency while meeting the illumination and communication requirements of users. To solve this problem, an algorithm that combines the machine learning framework of gated recurrent units (GRUs) with convolutional neural networks (CNNs) is proposed. Using GRUs and CNNs, the UAVs can model the long-term historical illumination distribution and predict the future illumination distribution. Given the prediction of illumination distribution, the original nonconvex optimization problem can be divided into two sub-problems and is then solved using a low-complexity, iterative algorithm. Then, the proposed algorithm enables UAVs to determine their deployment and user association to minimize the total transmit power. Simulation results using real data from the Earth observations group (EOG) at NOAA/NCEI show that the proposed approach can achieve up to 68.9% reduction in total transmit power compared to a conventional optimal UAV deployment that does not consider the illumination distribution and user association.

preprint · 2020 · arXiv

Discovery of Extended Structures around Two Evolved Planetary Nebulae M 2-55 and Abell 2

We report a multi-wavelength study of two evolved planetary nebulae (PNs) M 2-55 and Abell 2. Deep optical narrow-band images ([O III], Hα, and [N II]) of M 2-55 reveal two pairs of bipolar lobes and a new faint arc-like structure. This arc-shaped filament around M 2-55 appears as a well-defined boundary from southwest to southeast, strongly suggesting that this nebula is in interaction with its surrounding interstellar medium. From the imaging data of the Wide-field Infrared Survey Explorer (WISE) all-sky survey, we discovered extensive mid-infrared halos around these PNs, which are approximately twice as large as their main nebulae seen in the visible. We also present a mid-resolution optical spectrum of M 2-55, which shows that it is a high-excitation evolved PN with a low electron density of 250 cm$^{-3}$. Furthermore, we investigate the properties of these nebulae from their spectral energy distributions (SEDs) by means of archival data.

preprint · 2020 · arXiv

Discovery of oscillations above 200 keV in a black hole X-ray binary with Insight-HXMT

Low-frequency quasi-periodic oscillations (LFQPOs) are commonly found in black hole X-ray binaries, and their origin is still under debate. The properties of LFQPOs at high energies (above 30 keV) are closely related to the nature of the accretion flow in the innermost regions, and thus play a crucial role in critically testing various theoretical models. The Hard X-ray Modulation Telescope (Insight-HXMT) is capable of detecting emissions above 30 keV, and is therefore an ideal instrument to do so. Here we report the discovery of LFQPOs above 200 keV in the new black hole MAXI J1820+070 in the X-ray hard state, which allows us to understand the behaviours of LFQPOs at hundreds of kiloelectronvolts. The phase lag of the LFQPO is constant around zero below 30 keV, and becomes a soft lag (that is, the high-energy photons arrive first) above 30 keV. The soft lag gradually increases with energy and reaches ~0.9 s in the 150-200 keV band. The detection at energies above 200 keV, the large soft lag and the energy-related behaviours of the LFQPO pose a great challenge for most currently existing models, but suggest that the LFQPO probably originates from the precession of a small-scale jet.

preprint2020arXiv

EDCompress: Energy-Aware Model Compression for Dataflows

Edge devices demand low energy consumption, low cost, and a small form factor. To efficiently deploy convolutional neural network (CNN) models on edge devices, energy-aware model compression becomes extremely important. However, existing work has not studied this problem well because it does not consider the diversity of dataflow types in hardware architectures. In this paper, we propose EDCompress, an Energy-aware model compression method for various Dataflows. It can effectively reduce the energy consumption of various edge devices with different dataflow types. Considering the very nature of model compression procedures, we recast the optimization process as a multi-step problem and solve it with reinforcement learning algorithms. Experiments show that EDCompress improves energy efficiency by 20X, 17X, and 37X in the VGG-16, MobileNet, and LeNet-5 networks, respectively, with negligible loss of accuracy. EDCompress can also find the optimal dataflow type for specific neural networks in terms of energy consumption, which can guide the deployment of CNN models on hardware systems.

preprint2020arXiv

The Background Model of the Medium Energy X-ray telescope of Insight-HXMT

The Medium Energy X-ray Telescope (ME) is one of the main payloads of the Hard X-ray Modulation Telescope (dubbed Insight-HXMT). The background of Insight-HXMT/ME is mainly caused by environmental charged particles, and the background intensity is modulated remarkably by the geomagnetic field as well as the geographical location. At the same geographical location, the background spectral shape is stable but the intensity varies with the level of the environmental charged particles. In this paper, we develop a model to estimate the ME background based on the ME database established from two years of blank-sky observations at high Galactic latitude. In this model, the entire geographical area covered by Insight-HXMT is divided into grids of $5^{\circ}\times5^{\circ}$ in the geographical coordinate system. For each grid, the background spectral shape can be obtained from the background database and the intensity can be corrected by the contemporary count rate of the blind FOV detectors. Thus the background spectrum can be obtained by accumulating the background of all the grids passed by Insight-HXMT during the effective observational time. The model test with the blank-sky observations shows that the systematic error of the background estimation in $8.9-44.0$ keV is $\sim1.3\%$ for a pointing observation with an average exposure of $\sim5.5$ ks. We also find that the systematic error is anti-correlated with the exposure, which indicates that the systematic error partly arises from the statistical error of the count rate measured by the blind FOV detectors.
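
The grid-and-accumulate procedure in this abstract can be illustrated with a minimal Python sketch. It is not the Insight-HXMT pipeline: the database layout, the function names `grid_cell` and `estimate_background`, and the linear scaling of intensity by the ratio of the contemporary blind-detector rate to a stored reference rate are all assumptions made here for illustration. Only the 5°×5° grid size comes from the abstract.

```python
import math

CELL_DEG = 5.0  # grid size stated in the abstract


def grid_cell(lon_deg, lat_deg, cell=CELL_DEG):
    """Map a geographic position to the (column, row) index of its grid cell."""
    col = math.floor((lon_deg % 360.0) / cell)
    row = math.floor((lat_deg + 90.0) / cell)
    return (col, row)


def estimate_background(samples, database):
    """Accumulate a background spectrum (counts per channel) along a track.

    samples:  iterable of (lon_deg, lat_deg, exposure_s, blind_rate) tuples,
              one per time slice of the observation (hypothetical layout).
    database: {cell: (shape, reference_blind_rate)} where shape is a list of
              per-channel rates (counts/s) recorded at reference_blind_rate
              (hypothetical layout).
    """
    total = None
    for lon, lat, exposure, blind_rate in samples:
        shape, ref_rate = database[grid_cell(lon, lat)]
        # Assumed correction: scale the stored shape by the ratio of the
        # contemporary blind-detector rate to the reference rate.
        scale = exposure * blind_rate / ref_rate
        if total is None:
            total = [0.0] * len(shape)
        for channel, rate in enumerate(shape):
            total[channel] += scale * rate
    return total if total is not None else []
```

Each time slice contributes its cell's spectral shape, rescaled and weighted by exposure, so the sum over slices plays the role of "accumulating the background of all the grids passed" during the observation.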

preprint2019arXiv

Overview to the Hard X-ray Modulation Telescope (Insight-HXMT) Satellite

As China's first X-ray astronomical satellite, the Hard X-ray Modulation Telescope (HXMT), which was dubbed Insight-HXMT after its launch on June 15, 2017, is a wide-band (1-250 keV) slat-collimator-based X-ray astronomy satellite with the capability of all-sky monitoring in 0.2-3 MeV. It was designed to perform pointing, scanning and gamma-ray burst (GRB) observations and, based on the Direct Demodulation Method (DDM), the image of the scanned sky region can be reconstructed. Here we give an overview of the mission and its progress, including payload, core sciences, ground calibration/facility, ground segment, data archive, software, in-orbit performance, calibration, background model, observations and some preliminary results.

preprint2019arXiv

Polarization in $Ξ_c^0$ decays

Measurements of the weak decay asymmetry parameters of charmed baryons, such as $Ξ_c$, provide more information on the $W$-emission and $W$-exchange mechanisms controlled by the strong and weak interactions. Taking advantage of the spin polarization in charmed baryon decays, we investigate the possibility of measuring the weak decay asymmetry parameters in the $e^{+}e^{-}\to Ξ_c^0\barΞ_c^0$ process. We analyze the transverse polarization spontaneously produced in this process and the spin transfer in the subsequent $Ξ_c$ decays. The sensitivity to the asymmetry parameters is estimated for the decay $Ξ_c\toΞπ$.

preprint2019arXiv

The Medium Energy (ME) X-ray telescope onboard the Insight-HXMT astronomy satellite

The Medium Energy X-ray telescope (ME) is one of the three main telescopes on board the Insight Hard X-ray Modulation Telescope (Insight-HXMT) astronomy satellite. ME contains 1728 pixels of Si-PIN detectors sensitive in 5-30 keV with a total geometrical area of 952 cm^2. Application Specific Integrated Circuit (ASIC) chips, VA32TA6, are used to achieve low power consumption and low readout noise. The collimators define three kinds of fields of view (FOVs) for the telescope: 1°×4°, 4°×4°, and blocked ones. Combinations of such FOVs can be used to estimate the in-orbit X-ray and particle background components. The energy resolution of ME is ~3 keV at 17.8 keV (FWHM) and the time resolution is 255 μs. In this paper, we introduce the design and performance of ME.

preprint2016arXiv

Dislocation climb models from atomistic scheme to dislocation dynamics

We develop a mesoscopic dislocation dynamics model for vacancy-assisted dislocation climb by upscaling from a stochastic model on the atomistic scale. Our models incorporate microscopic mechanisms of (i) bulk diffusion of vacancies, (ii) vacancy exchange dynamics between bulk and dislocation core, (iii) vacancy pipe diffusion along the dislocation core, and (iv) vacancy attachment-detachment kinetics at jogs leading to the motion of jogs. Our mesoscopic model consists of the vacancy bulk diffusion equation and a dislocation climb velocity formula. The effects of pipe diffusion and the jog structure on dislocations are incorporated by a Robin boundary condition near the dislocations for the bulk diffusion equation and a new contribution in the dislocation climb velocity due to vacancy pipe diffusion driven by the stress variation along the dislocation. Our climb formulation is able to quantitatively describe the translation of prismatic loops at low temperatures when the bulk diffusion is negligible. Using this new formulation, we derive analytical formulas for the climb velocity of a straight edge dislocation and a prismatic circular loop. Our dislocation climb formulation can be implemented in dislocation dynamics simulations to incorporate all four of the above microscopic mechanisms of dislocation climb.

preprint2016arXiv

Systematic theoretical analysis of dual-parameters RF readout by a novel LC-type passive sensor

This paper systematically studies, from a theoretical perspective, the simultaneous measurement of two parameters by an LC-type passive sensor. Based on the lumped circuit model of a typical LC-type passive dual-parameter sensor system, the factors influencing the sensor's signal strength and those influencing signal crosstalk are both analyzed. The RF readout signal strength is found to depend mainly on the quality factors (Q factors) of the LC tanks, the coupling coefficients, and the resonant frequency interval of the two LC tanks. The signal crosstalk depends mainly on the coupling coefficient between the sensor inductance coils and the resonant frequency interval of the two LC tanks. The specific influence of each factor on signal strength and crosstalk is illustrated by a series of curves from numerical results simulated in MATLAB. Additionally, a decoupling scheme for solving the crosstalk problem algorithmically is proposed and a corresponding function is derived. Overall, the theoretical analysis conducted in this work can provide design guidelines for making the dual-parameter LC-type passive sensor useful in practical applications.

preprint2015arXiv

Asymptotic Stability of the Lane-Emden Solutions for the Viscous Gaseous Star Problem with Degenerate Density Dependent Viscosities

This paper proves the nonlinear asymptotic stability of Lane-Emden solutions for spherically symmetric motions of viscous gaseous stars with density-dependent shear and bulk viscosities that vanish at the vacuum, when the adiabatic exponent $γ$ lies in the stability regime $(4/3, 2)$. The proof establishes global-in-time regularity uniformly up to the vacuum boundary for the vacuum free boundary problem of the compressible Navier-Stokes-Poisson system with spherical symmetry. This regularity ensures the global existence of strong solutions capturing the precise physical behavior that the sound speed is $C^{1/2}$-Hölder continuous across the vacuum boundary, the large-time asymptotic uniform convergence of the evolving vacuum boundary, density and velocity to those of Lane-Emden solutions with detailed convergence rates, and the detailed large-time behavior of solutions near the vacuum boundary. The results extend those in \cite{LXZ} of the authors for constant viscosities to the case of density-dependent viscosities that are degenerate at vacuum states.

preprint2015arXiv

On Nonlinear Asymptotic Stability of the Lane-Emden Solutions for the Viscous Gaseous Star Problem

This paper proves the nonlinear asymptotic stability of the Lane-Emden solutions for spherically symmetric motions of viscous gaseous stars if the adiabatic constant $γ$ lies in the stability range $(4/3, 2)$. It is shown that for small perturbations of a Lane-Emden solution with the same mass, there exists a unique global (in time) strong solution to the vacuum free boundary problem of the compressible Navier-Stokes-Poisson system with spherical symmetry for viscous stars, and the solution captures the precise physical behavior that the sound speed is $C^{1/2}$-Hölder continuous across the vacuum boundary provided that $γ$ lies in $(4/3, 2)$. The key is to establish the global-in-time regularity uniformly up to the vacuum boundary, which ensures the large-time asymptotic uniform convergence of the evolving vacuum boundary, density and velocity to those of the Lane-Emden solution with detailed convergence rates, and the detailed large-time behavior of solutions near the vacuum boundary.

preprint2014arXiv

Existence of Magnetic Compressible Fluid Stars

The existence of magnetic star solutions which are axi-symmetric stationary solutions for the Euler-Poisson system of compressible fluids coupled to a magnetic field is proved in this paper by a variational method. Our method of proof consists of deriving an elliptic equation for the magnetic potential in cylindrical coordinates in $\mathbb{R}^3$, and obtaining the estimates of the Green's function for this elliptic equation by transforming it to 5-Laplacian.

preprint2014arXiv

Global Existence of Smooth Solutions and Convergence to Barenblatt Solutions for the Physical Vacuum Free Boundary Problem of Compressible Euler Equations with Damping

For the physical vacuum free boundary problem of the one-dimensional compressible Euler equations with damping, with the sound speed being $C^{1/2}$-Hölder continuous near vacuum boundaries, the global existence of the smooth solution is proved. This solution is shown to converge to the Barenblatt self-similar solution for the porous media equation with the same total mass when the initial data is a small perturbation of the Barenblatt solution. The pointwise convergence with a rate of density, the convergence rate of velocity in the supremum norm and the precise expanding rate of the physical vacuum boundaries are also given. The proof is based on a construction of higher-order weighted functionals with both space and time weights capturing the behavior of solutions both near vacuum states and in large time, an introduction of a new ansatz, higher-order nonlinear energy estimates and elliptic estimates.

preprint2014arXiv

Study of the calibration of X-T relation for the BESIII drift chamber

This paper introduces the calibration of the time-to-distance relation for the BESIII drift chamber. The parameterization of the time-to-distance relation is presented. The studies of left-right asymmetry and the variation with the entrance angle are performed. The impact of dead channels on the time-to-distance relation is given special attention in order to reduce the shifts of the measured momenta for the tracks passing near dead cells. Finally we present the spatial resolution (123 μm) for barrel Bhabha events (|cosθ|<0.8) from J/ψ data taken in 2012.

preprint2014arXiv

Well-Posedness for the Motion of Physical Vacuum of the Three-dimensional Compressible Euler Equations with or without Self-Gravitation

This paper concerns the well-posedness theory of the motion of physical vacuum for the compressible Euler equations with or without self-gravitation. First, a general uniqueness theorem of classical solutions is proved for the three-dimensional general motion. Second, for the spherically symmetric motions, without imposing the compatibility condition of the first derivative being zero at the center of symmetry, a new local-in-time existence theory is established in a functional space involving fewer derivatives than those constructed for three-dimensional motions in \cite{10',7,16'} by constructing suitable weights and cutoff functions featuring the behavior of solutions near both the center of the symmetry and the moving vacuum boundary.

preprint2013arXiv

A priori estimates for free boundary problem of incompressible inviscid magnetohydrodynamic flows

In the present paper, we prove the a priori estimates of Sobolev norms for a free boundary problem of the incompressible inviscid MHD equations in all physical spatial dimensions $n=2$ and 3 by adopting a geometrical point of view used in Christodoulou-Lindblad CPAM 2000, and estimating quantities such as the second fundamental form and the velocity of the free surface. We identify the well-posedness condition that the outer normal derivative of the total pressure including the fluid and magnetic pressures is negative on the free boundary, which is similar to the physical condition (Taylor sign condition) for the incompressible Euler equations of fluids.

preprint2010arXiv

Probing Dark force at BES-III/BEPCII

We study an experimental search for a GeV-scale vector boson at BES-III/BEPCII. This boson mediates a new U(1)$_d$ interaction, as recently exploited in the context of weakly interacting massive particle dark matter. At low-energy $e^+ e^-$ colliders this dark state can be efficiently probed. We discuss the direct production of this light vector $U$ boson and its decay, using BES-III data and the larger data sets foreseen. In particular, we show that Higgs' strahlung in the dark sector can lead to multilepton signatures, which probe the physics range for kinetic mixing parameter $ε\sim 10^{-4} -10^{-3}$ over a large portion of the parameter space.

preprint2010arXiv

Stability of Transonic Shock Solutions for One-Dimensional Euler-Poisson Equations

In this paper, both structural and dynamical stabilities of steady transonic shock solutions for the one-dimensional Euler-Poisson system are investigated. First, a steady transonic shock solution with supersonic background charge is shown to be structurally stable with respect to small perturbations of the background charge, provided that the electric field is positive at the shock location. Second, any steady transonic shock solution with the supersonic background charge is proved to be dynamically and exponentially stable with respect to small perturbations of the initial data, provided the electric field is not too negative at the shock location. The proof of the first stability result relies on a monotonicity argument for the shock position and the downstream density, and a stability analysis for subsonic and supersonic solutions. The dynamical stability of the steady transonic shock for the Euler-Poisson equations can be transformed to the global well-posedness of a free boundary problem for a quasilinear second order equation with nonlinear boundary conditions. The analysis for the associated linearized problem plays an essential role.

preprint2007arXiv

Nonlinear Dynamical Stability of Newtonian Rotating White Dwarfs and Supermassive Stars

We prove general nonlinear stability and existence theorems for rotating star solutions which are axi-symmetric steady-state solutions of the compressible isentropic Euler-Poisson equations in 3 spatial dimensions. We apply our results to rotating and non-rotating white dwarfs, and to rotating high-density supermassive (extreme relativistic) stars, that is, stars which are in convective equilibrium and have uniform chemical composition. This paper is a continuation of our earlier work ([28]).

preprint2006arXiv

Analytical Study of Electronic Structure in Armchair Graphene Nanoribbons

We present the analytical solution of the wavefunction and energy dispersion of armchair graphene nanoribbons (GNRs) based on the tight-binding approximation. By imposing a hard-wall boundary condition, we find that the wavevector in the confined direction is discretized. This discrete wavevector serves as the index of different subbands. Our analytical solutions of the wavefunction and associated energy dispersion reproduce the numerical tight-binding results and the solutions based on the k·p approximation. In addition, we also find that all armchair GNRs with edge deformation have energy gaps, which agrees with recently reported first-principles calculations.
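
The role of the discretized transverse wavevector as a subband index can be seen in a small Python sketch. It uses the standard nearest-neighbor tight-binding closed form for an ideal armchair ribbon with N dimer lines, E_p(k) = ±t|1 + 2cos(pπ/(N+1)) e^{ik/2}| with p = 1, ..., N and k in units of the inverse translation period. The notation may differ from the paper's own expressions, the function names are hypothetical, and the edge deformation mentioned in the abstract is not modeled.

```python
import cmath
import math


def armchair_bands(n_dimer_lines, k, t=1.0):
    """Return the N non-negative subband energies E_p(k); valence bands are -E_p.

    n_dimer_lines: ribbon width N (number of dimer lines).
    k:             longitudinal wavevector times the translation period.
    t:             nearest-neighbor hopping (energies are in units of t if t=1).
    """
    energies = []
    for p in range(1, n_dimer_lines + 1):
        # Hard-wall boundary condition: transverse phase takes N discrete values.
        theta_p = p * math.pi / (n_dimer_lines + 1)
        eps_p = 2.0 * math.cos(theta_p)
        energies.append(t * abs(1.0 + eps_p * cmath.exp(0.5j * k)))
    return energies


def band_gap(n_dimer_lines, t=1.0, n_k=401):
    """Gap between lowest conduction and highest valence band over the zone."""
    ks = [-math.pi + 2.0 * math.pi * i / (n_k - 1) for i in range(n_k)]
    return 2.0 * min(min(armchair_bands(n_dimer_lines, k, t)) for k in ks)
```

In this undeformed model the gap closes whenever N = 3m + 2 (for example N = 5), while N = 3 gives a gap of 2(√2 − 1)t. That is the baseline against which the abstract's statement about edge deformation opening gaps in all armchair GNRs should be read.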

43 published item(s)

Focus and Dilution: The Multi-stage Learning Process of Attention

Optimization of Image Transmission in a Cooperative Semantic Communication Networks

A Resource-efficient Spiking Neural Network Accelerator Supporting Emerging Neural Encoding

An Upper Limit of Decaying Rate with Respect to Frequency in Deep Neural Network

Bunching instability and asymptotic properties in epitaxial growth with elasticity effects: continuum model

Embedding Principle of Loss Landscape of Deep Neural Networks

Existence, uniqueness, and energy scaling of 2+1 dimensional continuum model for stepped epitaxial surfaces with elastic effects

Feasibility Study of $D_s^+ \to τ^+ ν_τ$ Decay and Test of Lepton Flavor Universality with Leptonic $D_s^+$ Decays at STCF

From Mean Field Games To Navier-Stokes Equations

Meta-Reinforcement Learning for Reliable Communication in THz/VLC Wireless VR Networks

Performance Optimization for Semantic Communications: An Attention-based Reinforcement Learning Approach

Quasi-periodic oscillations of the X-ray burst from the magnetar SGR J1935+2154 and associated with the fast radio burst FRB 200428

Singular Limits for the Navier-Stokes-Poisson Equations of Viscous Plasma with Strong Density Boundary Layer

Synthesis and Superconductivity in Yttrium-Cerium Hydrides at Moderate Pressures

Synthesis of Superconducting Phase of La$_{0.5}$Ce$_{0.5}$H$_{10}$ at High Pressures

Winograd Convolution: A Perspective from Fault Tolerance

E3NE: An End-to-End Framework for Accelerating Spiking Neural Networks with Emerging Neural Encoding on FPGAs

Improving "Fast Iterative Shrinkage-Thresholding Algorithm": Faster, Smarter and Greedier

MOD-Net: A Machine Learning Approach via Model-Operator-Data Network for Solving PDEs

SG-PBFT: a Secure and Highly Efficient Blockchain PBFT Consensus Algorithm for Internet of Vehicles

A regularized deep matrix factorized model of matrix completion for image restoration

A type of generalization error induced by initialization in deep neural networks

Deep Learning for Optimal Deployment of UAVs with Visible Light Communications

Discovery of Extended Structures around Two Evolved Planetary Nebulae M 2-55 and Abell 2

Discovery of oscillations above 200 keV in a black hole X-ray binary with Insight-HXMT

EDCompress: Energy-Aware Model Compression for Dataflows

The Background Model of the Medium Energy X-ray telescope of Insight-HXMT

Overview to the Hard X-ray Modulation Telescope (Insight-HXMT) Satellite

Polarization in $Ξ_c^0$ decays

The Medium Energy (ME) X-ray telescope onboard the Insight-HXMT astronomy satellite

Dislocation climb models from atomistic scheme to dislocation dynamics

Systematic theoretical analysis of dual-parameters RF readout by a novel LC-type passive sensor

Asymptotic Stability of the Lane-Emden Solutions for the Viscous Gaseous Star Problem with Degenerate Density Dependent Viscosities

On Nonlinear Asymptotic Stability of the Lane-Emden Solutions for the Viscous Gaseous Star Problem

Existence of Magnetic Compressible Fluid Stars

Global Existence of Smooth Solutions and Convergence to Barenblatt Solutions for the Physical Vacuum Free Boundary Problem of Compressible Euler Equations with Damping

Study of the calibration of X-T relation for the BESIII drift chamber

Well-Posedness for the Motion of Physical Vacuum of the Three-dimensional Compressible Euler Equations with or without Self-Gravitation

A priori estimates for free boundary problem of incompressible inviscid magnetohydrodynamic flows

Probing Dark force at BES-III/BEPCII

Stability of Transonic Shock Solutions for One-Dimensional Euler-Poisson Equations

Nonlinear Dynamical Stability of Newtonian Rotating White Dwarfs and Supermassive Stars

Analytical Study of Electronic Structure in Armchair Graphene Nanoribbons