Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
49 works
0 followers
25 topics
4 close collaborators

Published work

49 published item(s)

preprint · 2026 · arXiv

EmoTransCap: Dataset and Pipeline for Emotion Transition-Aware Speech Captioning in Discourses

Emotion perception and adaptive expression are fundamental capabilities in human-agent interaction. While recent advances in speech emotion captioning (SEC) have improved fine-grained emotional modeling, existing systems remain limited to static, single-emotion characterization within isolated sentences, neglecting dynamic emotional transitions at the discourse level. To address this gap, we propose Emotion Transition-Aware Speech Captioning (EmoTransCap), a paradigm that integrates temporal emotion dynamics with discourse-level speech description. To construct a dataset rich in emotion transitions while enabling scalable expansion, we design an automated pipeline for dataset creation. This is the first large-scale dataset explicitly designed to capture discourse-level emotion transitions. To generate semantically rich descriptions, we incorporate acoustic attributes and temporal cues from discourse-level speech. Our Multi-Task Emotion Transition Recognition (MTETR) model performs joint emotion transition detection and diarization. Leveraging the semantic analysis capabilities of LLMs, we produce two annotation versions: descriptive and instruction-oriented. These data and annotations offer a valuable resource for advancing emotion perception and emotional expressiveness. The dataset enables speech captions that capture emotional transitions, facilitating temporal-dynamic and fine-grained emotion understanding. We also introduce a controllable, transition-aware emotional speech synthesis system at the discourse level, enhancing anthropomorphic emotional expressiveness and supporting emotionally intelligent conversational agents.

preprint · 2026 · arXiv

LLMs Improving LLMs: Agentic Discovery for Test-Time Scaling

Test-time scaling (TTS) has become an effective approach for improving large language model performance by allocating additional computation during inference. However, existing TTS strategies are largely hand-crafted: researchers manually design reasoning patterns and tune heuristics by intuition, leaving much of the computation-allocation space unexplored. We propose an environment-driven framework, AutoTTS, that changes what researchers design: from individual TTS heuristics to environments where TTS strategies can be discovered automatically. The key to AutoTTS lies in environment construction: the discovery environment must make the control space tractable and provide cheap, frequent feedback for TTS search. As a concrete instantiation, we formulate width-depth TTS as controller synthesis over pre-collected reasoning trajectories and probe signals, where controllers decide when to branch, continue, probe, prune, or stop and can be evaluated cheaply without repeated LLM calls. We further introduce beta parameterization to make the search tractable and fine-grained execution trace feedback to improve discovery efficiency by helping the agent diagnose why a TTS program fails. Experiments on mathematical reasoning benchmarks show that the discovered strategies improve the overall accuracy-cost tradeoff over strong manually designed baselines. The discovered strategies generalize to held-out benchmarks and model scales, while the entire discovery costs only $39.9 and 160 minutes. Our data and code will be open-sourced at https://github.com/zhengkid/AutoTTS.

preprint · 2025 · arXiv

CAML: Collaborative Auxiliary Modality Learning for Multi-Agent Systems

Multi-modal learning has emerged as a key technique for improving performance across domains such as autonomous driving, robotics, and reasoning. However, in certain scenarios, particularly in resource-constrained environments, some modalities available during training may be absent during inference. While existing frameworks effectively utilize multiple data sources during training and enable inference with reduced modalities, they are primarily designed for single-agent settings. This poses a critical limitation in dynamic environments such as connected autonomous vehicles (CAV), where incomplete data coverage can lead to decision-making blind spots. Conversely, some works explore multi-agent collaboration but without addressing missing modality at test time. To overcome these limitations, we propose Collaborative Auxiliary Modality Learning (CAML), a novel multi-modal multi-agent framework that enables agents to collaborate and share multi-modal data during training, while allowing inference with reduced modalities during testing. Experimental results in collaborative decision-making for CAV in accident-prone scenarios demonstrate that CAML achieves up to a 58.1% improvement in accident detection. Additionally, we validate CAML on real-world aerial-ground robot data for collaborative semantic segmentation, achieving up to a 10.6% improvement in mIoU.

preprint · 2022 · arXiv

Accurate Emotion Strength Assessment for Seen and Unseen Speech Based on Data-Driven Deep Learning

Emotion classification of speech and assessment of the emotion strength are required in applications such as emotional text-to-speech and voice conversion. An emotion attribute ranking function based on a Support Vector Machine (SVM) was previously proposed to predict emotion strength for an emotional speech corpus. However, the trained ranking function doesn't generalize to new domains, which limits the scope of applications, especially for out-of-domain or unseen speech. In this paper, we propose a data-driven deep learning model, i.e. StrengthNet, to improve the generalization of emotion strength assessment for seen and unseen speech. This is achieved by the fusion of emotional data from various domains. We follow a multi-task learning network architecture that includes an acoustic encoder, a strength predictor, and an auxiliary emotion predictor. Experiments show that the predicted emotion strength of the proposed StrengthNet is highly correlated with ground truth scores for both seen and unseen speech. We release the source code at: https://github.com/ttslr/StrengthNet.

preprint · 2022 · arXiv

Aiming in Harsh Environments: A New Framework for Flexible and Adaptive Resource Management

The harsh environment imposes a unique set of challenges on networking strategies. In such circumstances, the environmental impact on network resources and long-time unattended maintenance has not been well investigated yet. To address these challenges, we propose a flexible and adaptive resource management framework that incorporates the environment awareness functionality. In particular, we propose a new network architecture and introduce the new functionalities against the traditional network components. The novelties of the proposed architecture include a deep-learning-based environment resource prediction module and a self-organized service management module. Specifically, the available network resource under various environmental conditions is predicted by using the prediction module. Then based on the prediction, an environment-oriented resource allocation method is developed to optimize the system utility. To demonstrate the effectiveness and efficiency of the proposed new functionalities, we examine the method via an experiment in a case study. Finally, we introduce several promising directions of resource management in harsh environments that can be extended from this paper.

preprint · 2022 · arXiv

Collective Conditioned Reflex: A Bio-Inspired Fast Emergency Reaction Mechanism for Designing Safe Multi-Robot Systems

A multi-robot system (MRS) is a group of coordinated robots designed to cooperate with each other and accomplish given tasks. Due to the uncertainties in operating environments, the system may encounter emergencies, such as unobserved obstacles, moving vehicles, and extreme weather. Animal groups such as bee colonies initiate collective emergency reaction behaviors such as bypassing obstacles and avoiding predators, similar to muscle-conditioned reflex which organizes local muscles to avoid hazards in the first response without delaying passage through the brain. Inspired by this, we develop a similar collective conditioned reflex mechanism for multi-robot systems to respond to emergencies. In this study, Collective Conditioned Reflex (CCR), a bio-inspired emergency reaction mechanism, is developed based on animal collective behavior analysis and multi-agent reinforcement learning (MARL). The algorithm uses a physical model to determine if the robots are experiencing an emergency; then, rewards for robots involved in the emergency are augmented with corresponding heuristic rewards, which evaluate emergency magnitudes and consequences and decide local robots' participation. CCR is validated on three typical emergency scenarios: turbulence, strong wind, and hidden obstacle. Simulation results demonstrate that CCR improves robot teams' emergency reaction capability with faster reaction speed and safer trajectory adjustment compared with baseline methods.

preprint · 2022 · arXiv

Correlation-corrected band topology and topological surface states in iron-based superconductors

Iron-based superconductors offer an ideal platform for studying topological superconductivity and Majorana fermions. In this paper, we carry out a comprehensive study of the band topology and topological surface states of a number of iron-based superconductors using a combination of density functional theory (DFT) and dynamical mean field theory. We find that the strong electronic correlation of Fe 3d electrons plays a crucial role in determining the band topology and topological surface states of iron-based superconductors. Electronic correlation not only strongly renormalizes the bandwidth of Fe 3d electrons, but also shifts the band positions of both Fe 3d and As/Se p electrons. As a result, electronic correlation moves the DFT-calculated topological surface states of many iron-based superconductors much closer to the Fermi level, which is crucial for realizing topological superconducting surface states and observing Majorana zero modes as well as achieving practical applications, such as quantum computation. More importantly, electronic correlation can change the band topology and make some iron-based superconductors topologically nontrivial with topological surface states whereas they have trivial band topology and no topological surface states in DFT calculations. Our paper demonstrates that it is important to take into account electronic correlation effects in order to accurately determine the band topology and topological surface states of iron-based superconductors and other strongly correlated materials.

preprint · 2022 · arXiv

Correlation-enhanced electron-phonon coupling and superconductivity in (Ba,K)SbO$_3$ superconductors

The electronic structure, lattice dynamics, and electron-phonon coupling (EPC) of the newly discovered (Ba,K)SbO$_3$ superconductors are investigated by first-principles calculations. The EPC of (Ba,K)SbO$_3$ is significantly enhanced by considering non-local electronic correlation using the Heyd-Scuseria-Ernzerhof hybrid exchange-correlation functional (HSE06). The EPC strength λ of Ba$_{0.35}$K$_{0.65}$SbO$_3$ is strongly increased from 0.33 in local-density approximation calculations to 0.59 in HSE06 calculations, resulting in a superconducting transition temperature Tc of about 14.9 K, which is in excellent agreement with the experimental value of ~15 K. Our findings suggest (Ba,K)SbO$_3$ are extraordinary conventional superconductors, where non-local electronic correlation expands the bandwidth, enhances the EPC, and boosts the Tc. Moreover, we find both λ and Tc depend crucially on the K-doping level for (Ba,K)SbO$_3$ and (Ba,K)BiO$_3$ compounds. (Ba,K)SbO$_3$ have stronger EPC strength and higher Tc than those of (Ba,K)BiO$_3$ at the same K-doping level.

preprint · 2022 · arXiv

Distributed TD(0) with Almost No Communication

We provide a new non-asymptotic analysis of distributed TD(0) with linear function approximation. Our approach relies on "one-shot averaging," where $N$ agents run local copies of TD(0) and average the outcomes only once at the very end. We consider two models: one in which the agents interact with an environment they can observe and whose transitions depend on all of their actions (which we call the global state model), and one in which each agent can run a local copy of an identical Markov Decision Process, which we call the local state model. In the global state model, we show that the convergence rate of our distributed one-shot averaging method matches the known convergence rate of TD(0). By contrast, the best convergence rate in the previous literature could, according to the worst-case bounds given, underperform the non-distributed version by $O(N^3)$ in terms of the number of agents $N$. In the local state model, we demonstrate a version of the linear time speedup phenomenon, where the convergence time of the distributed process is a factor of $N$ faster than the convergence time of TD(0). As far as we are aware, this is the first result rigorously showing benefits from parallelism for temporal difference methods.
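
The one-shot averaging idea in the abstract above can be sketched roughly as follows, for the local state model only. This is an illustrative toy, not the authors' implementation: the function names `td0_linear` and `one_shot_average`, the fixed policy folded into a transition matrix `P`, the constant step size `alpha`, and the per-agent seeding are all assumptions made for the example.

```python
import random

def td0_linear(P, r, phi, gamma, alpha, steps, rng):
    """Run TD(0) with linear features along one sampled trajectory.

    P[s] is the transition row of state s, r[s] its reward, and
    phi[s] its feature vector (all hypothetical inputs for this sketch).
    """
    theta = [0.0] * len(phi[0])
    s = rng.randrange(len(P))
    for _ in range(steps):
        # Sample the next state from the transition row of the current state.
        s_next = rng.choices(range(len(P)), weights=P[s])[0]
        v = sum(t * f for t, f in zip(theta, phi[s]))
        v_next = sum(t * f for t, f in zip(theta, phi[s_next]))
        delta = r[s] + gamma * v_next - v  # TD error
        theta = [t + alpha * delta * f for t, f in zip(theta, phi[s])]
        s = s_next
    return theta

def one_shot_average(n_agents, seed=0, **kw):
    """Each agent runs its own local copy; parameters are averaged once at the end."""
    thetas = [td0_linear(rng=random.Random(seed + i), **kw) for i in range(n_agents)]
    dim = len(thetas[0])
    return [sum(th[j] for th in thetas) / n_agents for j in range(dim)]

# Toy two-state chain with one-hot features (assumed for illustration).
theta = one_shot_average(
    4, seed=1,
    P=[[0.0, 1.0], [1.0, 0.0]], r=[1.0, 0.0],
    phi=[[1.0, 0.0], [0.0, 1.0]],
    gamma=0.5, alpha=0.1, steps=2000,
)
```

In this toy chain the exact values solve V0 = 1 + 0.5 V1 and V1 = 0.5 V0, so V0 = 4/3 and V1 = 2/3. The agents never communicate during the run; the only exchange is the final average.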

preprint · 2022 · arXiv

Electronic structure and magnetism of the Hund insulator CrI3

CrI3 is a two-dimensional ferromagnetic van der Waals material with a charge gap of 1.1-1.2 eV. In this study, the electronic structure and magnetism of CrI3 are investigated by using density functional theory and dynamical mean-field theory. Our calculations successfully reproduce a charge gap of 1.1 eV in the paramagnetic state when a Hund coupling JH = 0.7 eV is included with an on-site Hubbard U = 5 eV. In contrast, with a large U value of 8 eV and negligible Hund coupling JH, CrI3 is predicted to be a moderately correlated metal in the paramagnetic state. We conclude that CrI3 is a Mott-Hund insulator due to the half-filled configuration of the Cr 3d t2g orbitals. The Cr 3d eg orbitals are occupied by approximately one electron, which leads to strong valence fluctuations so that the Cr 3d orbitals cannot be described by a single state. Moreover, at finite temperature, the calculated ordered static magnetic moment in the ferromagnetic state is significantly larger in the R3 phase than in the C2/m phase. This observation indicates that the structural phase transition from the C2/m phase to the R3 phase with decreasing temperature is driven by ferromagnetic spin fluctuations.

preprint · 2022 · arXiv

Emotional Voice Conversion: Theory, Databases and ESD

In this paper, we first provide a review of the state-of-the-art emotional voice conversion research, and the existing emotional speech databases. We then motivate the development of a novel emotional speech database (ESD) that addresses the increasing research need. With this paper, the ESD database is now made available to the research community. The ESD database consists of 350 parallel utterances spoken by 10 native English and 10 native Chinese speakers and covers 5 emotion categories (neutral, happy, angry, sad and surprise). More than 29 hours of speech data were recorded in a controlled acoustic environment. The database is suitable for multi-speaker and cross-lingual emotional voice conversion studies. As case studies, we implement several state-of-the-art emotional voice conversion systems on the ESD database. This paper provides a reference study on ESD in conjunction with its release.

preprint · 2022 · arXiv

Gating Dropout: Communication-efficient Regularization for Sparsely Activated Transformers

Sparsely activated transformers, such as Mixture of Experts (MoE), have received great interest due to their outrageous scaling capability, which enables dramatic increases in model size without significant increases in computational cost. To achieve this, MoE models replace the feedforward sub-layer with a Mixture-of-Experts sub-layer in transformers and use a gating network to route each token to its assigned experts. Since the common practice for efficient training of such models requires distributing experts and tokens across different machines, this routing strategy often incurs huge cross-machine communication cost because tokens and their assigned experts likely reside in different machines. In this paper, we propose Gating Dropout, which allows tokens to ignore the gating network and stay at their local machines, thus reducing the cross-machine communication. Similar to traditional dropout, we also show that Gating Dropout has a regularization effect during training, resulting in improved generalization performance. We validate the effectiveness of Gating Dropout on multilingual machine translation tasks. Our results demonstrate that Gating Dropout improves a state-of-the-art MoE model with faster wall-clock time convergence rates and better BLEU scores for a variety of model sizes and datasets.
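
The routing behavior described in the abstract above might look like the following minimal sketch. It assumes one expert per machine, a single dropout decision per training step, and a hypothetical dropout probability `p_drop`; the abstract specifies none of these, and the function names are invented for illustration.

```python
import random

def route_tokens(token_machines, gate_choice, p_drop, rng, training=True):
    """Return the expert (identified by machine id) each token is sent to.

    token_machines[i] is the machine where token i currently resides;
    gate_choice[i] is the expert the gating network picked for it.
    """
    if training and rng.random() < p_drop:
        # Gating dropped for this step: every token stays on its local machine.
        return list(token_machines)
    # Otherwise follow the gating network as usual.
    return list(gate_choice)

def cross_machine_sends(token_machines, routes):
    """Count tokens that must travel to a different machine."""
    return sum(1 for m, e in zip(token_machines, routes) if m != e)

rng = random.Random(0)
machines = [0, 0, 1, 1]
gate = [1, 0, 0, 1]
routes = route_tokens(machines, gate, p_drop=0.3, rng=rng)
sends = cross_machine_sends(machines, routes)
```

When the gate is dropped, `cross_machine_sends` is zero for that step, which is where the communication saving comes from. At inference (`training=False`) the gate is always followed.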

preprint · 2022 · arXiv

Influence of magnetic reconnection on the eruptive catastrophes of coronal magnetic flux ropes

Large-scale solar eruptive activities have a close relationship with coronal magnetic flux ropes. Previous numerical studies have found that the equilibrium of a coronal flux rope system could be disrupted if the axial magnetic flux of the rope exceeds a critical value, so that the catastrophe occurs, initiating the flux rope to erupt. Further studies discovered that the catastrophe does not necessarily exist: the flux rope system with certain photospheric flux distributions could be non-catastrophic. It is noteworthy that most previous numerical studies are under the ideal magnetohydrodynamic (MHD) condition, so that it is still elusive whether there is the catastrophe associated with the critical axial flux if magnetic reconnection is included in the flux rope system. In this paper, we carried out numerical simulations to investigate the evolutions of coronal magnetic rope systems under the ideal MHD and the resistive condition. Under the ideal MHD condition, our simulation results demonstrate that the flux rope systems with either too compact or too weak photospheric magnetic source regions are non-catastrophic versus varying axial flux of the rope, and thus no eruption could be initiated; if there is magnetic reconnection in the rope system, however, those flux rope systems could change to be capable of erupting via the catastrophe associated with increasing axial flux. Therefore, magnetic reconnection could significantly influence the catastrophic behaviors of flux rope system. It should be both the magnetic topology and the local physical parameters related to magnetic reconnection that determine whether the increasing axial flux is able to cause flux rope eruptions.

preprint · 2022 · arXiv

Local Rotational Jamming and Multi-Scale Hyperuniformities in an Active Spinner System

An active system consisting of many self-spinning dimers is simulated, and a distinct local rotational jamming transition is observed as the density increases. In the low density regime, the system stays in an absorbing state, in which each dimer rotates independently subject to the applied torque. While in the high density regime, a fraction of the dimers become rotationally jammed into local clusters, and the system exhibits spinodal-decomposition like two-phase morphologies. For high enough densities, the system becomes completely jammed in both rotational and translational degrees of freedom. Such a simple system is found to exhibit rich and multiscale disordered hyperuniformities among the above phases: the absorbing state shows a critical hyperuniformity of the strongest class and subcritically preserves the vanishing density-fluctuation scaling up to some length scale; the locally-jammed state shows a two-phase hyperuniformity conversely beyond some length scale with respect to the phase cluster sizes; the totally jammed state appears to be a monomer crystal, but intrinsically loses large-scale hyperuniformity. These results are inspiring for designing novel phase-separation and disordered hyperuniform systems through dynamical organization.

preprint · 2022 · arXiv

MnTTS2: An Open-Source Multi-Speaker Mongolian Text-to-Speech Synthesis Dataset

Text-to-Speech (TTS) synthesis for low-resource languages is an attractive research issue in academia and industry nowadays. Mongolian is the official language of the Inner Mongolia Autonomous Region and a representative low-resource language spoken by over 10 million people worldwide. However, there is a relative lack of open-source datasets for Mongolian TTS. Therefore, we make public an open-source multi-speaker Mongolian TTS dataset, named MnTTS2, for the benefit of related researchers. In this work, we prepare the transcription from various topics and invite three professional Mongolian announcers to form a three-speaker TTS dataset, in which each announcer records 10 hours of speech in Mongolian, resulting in 30 hours in total. Furthermore, we build the baseline system based on the state-of-the-art FastSpeech2 model and HiFi-GAN vocoder. The experimental results suggest that the constructed MnTTS2 dataset is sufficient to build robust multi-speaker TTS models for real-world applications. The MnTTS2 dataset, training recipe, and pretrained models are released at: https://github.com/ssmlkl/MnTTS2

preprint · 2022 · arXiv

Neutral Utterances are Also Causes: Enhancing Conversational Causal Emotion Entailment with Social Commonsense Knowledge

Conversational Causal Emotion Entailment aims to detect causal utterances for a non-neutral targeted utterance from a conversation. In this work, we build conversations as graphs to overcome implicit contextual modelling of the original entailment style. Following the previous work, we further introduce the emotion information into graphs. Emotion information can markedly promote the detection of causal utterances whose emotion is the same as the targeted utterance. However, it is still hard to detect causal utterances with different emotions, especially neutral ones. The reason is that models are limited in reasoning causal clues and passing them between utterances. To alleviate this problem, we introduce social commonsense knowledge (CSK) and propose a Knowledge Enhanced Conversation graph (KEC). KEC propagates the CSK between two utterances. As not all CSK is emotionally suitable for utterances, we therefore propose a sentiment-realized knowledge selecting strategy to filter CSK. To process KEC, we further construct the Knowledge Enhanced Directed Acyclic Graph networks. Experimental results show that our method outperforms baselines and infers more causes with different emotions from the targeted utterance.

preprint · 2022 · arXiv

NTIRE 2022 Challenge on High Dynamic Range Imaging: Methods and Results

This paper reviews the challenge on constrained high dynamic range (HDR) imaging that was part of the New Trends in Image Restoration and Enhancement (NTIRE) workshop, held in conjunction with CVPR 2022. This manuscript focuses on the competition set-up, datasets, the proposed methods and their results. The challenge aims at estimating an HDR image from multiple respective low dynamic range (LDR) observations, which might suffer from under- or over-exposed regions and different sources of noise. The challenge is composed of two tracks with an emphasis on fidelity and complexity constraints: In Track 1, participants are asked to optimize objective fidelity scores while imposing a low-complexity constraint (i.e. solutions cannot exceed a given number of operations). In Track 2, participants are asked to minimize the complexity of their solutions while imposing a constraint on fidelity scores (i.e. solutions are required to obtain a higher fidelity score than the prescribed baseline). Both tracks use the same data and metrics: Fidelity is measured by means of PSNR with respect to a ground-truth HDR image (computed both directly and with a canonical tonemapping operation), while complexity metrics include the number of Multiply-Accumulate (MAC) operations and runtime (in seconds).
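
For readers unfamiliar with the fidelity metric named above, a minimal PSNR sketch follows. It covers only the direct computation on flattened pixel lists; the `peak` value of 1.0 is an assumption about the normalization, and the tonemapped variant is omitted because the abstract does not state which tonemapping operator is used.

```python
import math

def psnr(pred, target, peak=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)

# Every pixel is off by 0.1, so the MSE is 0.01 and the PSNR is about 20 dB.
score = psnr([0.5, 0.5], [0.4, 0.6])
```

A tonemapped score would apply the chosen tonemapping function to both `pred` and `target` before calling `psnr`.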

preprint · 2022 · arXiv

Peng Cheng Object Detection Benchmark for Smart City

Object detection recognizes and locates the objects in an image and has a wide range of applications in the visual understanding of complex urban scenes. Existing object detection benchmarks mainly focus on a single specific scenario, and their annotation attributes are not rich enough, so object detection models trained on them do not generalize well to smart city scenes. Considering the diversity and complexity of scenes in intelligent city governance, we build a large-scale object detection benchmark for the smart city. Our benchmark contains about 500K images and includes three scenarios: intelligent transportation, intelligent security, and drones. To reflect the complexity of real smart city scenes, the images in the three scenarios are annotated with weather, occlusion, and other complex-environment attributes. The characteristics of the benchmark are analyzed, and extensive experiments with current state-of-the-art object detection algorithms are conducted on our benchmark to show their performance.

preprint · 2022 · arXiv

Real-Time Anomaly Detection in Edge Streams

Given a stream of graph edges from a dynamic graph, how can we assign anomaly scores to edges in an online manner, for the purpose of detecting unusual behavior, using constant time and memory? Existing approaches aim to detect individually surprising edges. In this work, we propose MIDAS, which focuses on detecting microcluster anomalies, or suddenly arriving groups of suspiciously similar edges, such as lockstep behavior, including denial of service attacks in network traffic data. We further propose MIDAS-F, to solve the problem by which anomalies are incorporated into the algorithm's internal states, creating a 'poisoning' effect that can allow future anomalies to slip through undetected. MIDAS-F introduces two modifications: 1) We modify the anomaly scoring function, aiming to reduce the 'poisoning' effect of newly arriving edges; 2) We introduce a conditional merge step, which updates the algorithm's data structures after each time tick, but only if the anomaly score is below a threshold value, also to reduce the 'poisoning' effect. Experiments show that MIDAS-F has significantly higher accuracy than MIDAS. MIDAS has the following properties: (a) it detects microcluster anomalies while providing theoretical guarantees about its false positive probability; (b) it is online, thus processing each edge in constant time and constant memory, and also processes the data orders-of-magnitude faster than state-of-the-art approaches; (c) it provides up to 62% higher ROC-AUC than state-of-the-art approaches.
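
The conditional merge step mentioned above can be illustrated with the simplified sketch below. It is not the MIDAS-F algorithm: exact dictionaries stand in for the constant-memory data structures, the score is a placeholder deviation measure rather than the paper's scoring function, and the class name and `threshold` parameter are hypothetical.

```python
from collections import defaultdict

class ConditionalMergeDetector:
    """Toy streaming detector that merges a tick's counts only if they looked normal."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.total = defaultdict(float)    # counts merged from past ticks
        self.current = defaultdict(float)  # counts within the current tick
        self.last_score = {}               # latest score per edge in this tick
        self.tick = 1

    def _end_tick(self):
        # Conditional merge: skip edges whose score reached the threshold,
        # so a burst does not inflate the history ("poisoning").
        for edge, count in self.current.items():
            if self.last_score[edge] < self.threshold:
                self.total[edge] += count
        self.current.clear()
        self.last_score.clear()

    def process(self, edge, t):
        if t > self.tick:
            self._end_tick()
            self.tick = t
        self.current[edge] += 1
        a = self.current[edge]
        past_ticks = self.tick - 1
        mean = self.total[edge] / past_ticks if past_ticks > 0 else a
        # Placeholder score: squared deviation of the current count from the past mean.
        score = (a - mean) ** 2 / max(mean, 1.0)
        self.last_score[edge] = score
        return score

det = ConditionalMergeDetector(threshold=5.0)
for t in range(1, 6):
    det.process(("a", "b"), t)  # one edge per tick: normal traffic
burst = [det.process(("a", "b"), 6) for _ in range(10)]  # sudden burst at tick 6
after = det.process(("a", "b"), 7)
```

Because the burst at tick 6 scores above the threshold, its ten edges are never added to `total`, so later scores are still measured against the pre-burst history.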

preprint · 2022 · arXiv

Smart Power Supply for UAV Agility Enhancement Using Deep Neural Networks

Recently, unmanned aerial vehicles (UAVs) have been widely deployed in various real-world scenarios such as disaster rescue and package delivery. Many of these working environments are unstructured, with uncertain and dynamic obstacles, and UAV collisions happen frequently. A UAV with high agility is highly desired to adjust its motions to adapt to these environmental dynamics. However, UAV agility is restricted by its battery power output; in particular, a UAV's power system cannot be aware of its actual power need in motion planning, while the need changes dynamically as the environment and UAV condition vary. It is difficult to align the power supply with power needs in motion planning accurately and in a timely manner. This mismatch leads to an insufficient power supply to the UAV and causes delayed motion adjustments, largely increasing the risk of collisions with obstacles and therefore undermining UAV agility. To improve UAV agility, a novel intelligent power solution, Agility-Enhanced Power Supply (AEPS), was developed to proactively prepare appropriate amounts of power at the right time to support motion planning with enhanced agility. This method builds a bridge between the physical power system and UAV planning. With agility-enhanced motion planning, the safety of UAVs in complex working environments will be enhanced. To evaluate AEPS effectiveness, community-security patrol missions with unexpected obstacles were adopted; the power supply is realized by hybrid integration of a fuel cell, battery, and capacitor. The effectiveness of AEPS in improving UAV agility was validated by the successful and timely power supply, improved task success rate and system safety, and reduced mission duration.

preprint · 2022 · arXiv

Ta2NiSe5: a candidate topological excitonic insulator with multiple band inversions

The electronic structures and topological properties of the orthorhombic and monoclinic phases of the quasi-one-dimensional excitonic insulator Ta2NiSe5 are investigated based on density functional theory. In contrast to a single parity or band inversion across the Fermi level in many topological insulators studied previously, there are multiple parity and band inversions with or without spin-orbit coupling in both phases of Ta2NiSe5, resulting in more complex and topologically nontrivial electronic structures. The Dirac cone type surface states of the low-temperature monoclinic phase are also obtained. In this paper, we demonstrate that Ta2NiSe5 is a promising candidate as a three-dimensional topological excitonic insulator.

preprint · 2022 · arXiv

The generality of uncooperative and cooperative effects in elementary hydrogen-bonded systems

The cooperative effect plays a significant role in understanding the intermolecular donor-acceptor interactions of hydrogen bonds (H-bonds, D-H...A). Herein, using high-precision ab initio benchmark methods, the well-known cooperative effect is reproduced in elementary H-bonded systems with different D and A atoms. That is, as the intermolecular distance decreases, the D-H bond length first increases and then decreases, while the H...A bond length decreases. In contrast, when D and A are the same, as the intermolecular distance decreases, the D-H bond length decreases without increasing, which is referred to as the uncooperative effect. Further analyses conclude that compared to cooperative H-bonded systems, uncooperative systems at their respective equilibrium positions have a larger core-valence bifurcation (CVB) index (>0.022) and lower binding energies (<0.25 eV), showing a clear linear inverse relationship related to H-bond strength. Therefore, the intermolecular non-H-bonding interactions are predicted to reflect the uncooperative characteristics, which is confirmed by high-precision ab initio calculations. These findings provide a direction for the comprehensive understanding of H-bonds.

preprint · 2022 · arXiv

The Rotation of Magnetic Flux Rope Formed during Solar Eruption

The eruptions of solar filaments often show rotational motion about their rising direction, but it remains elusive what mechanism governs such rotation and how the rotation is related to the initial morphology of the pre-eruptive filament (and co-spatial sigmoid), filament chirality, and magnetic helicity. The conventional view, which regards the rotation as a result of a magnetic flux rope (MFR) undergoing the ideal kink instability, still struggles to explain these relationships. Here we propose an alternative explanation for the rotation during eruptions, by analyzing a magnetohydrodynamic simulation in which magnetic reconnection initiates an eruption from a sheared arcade configuration and an MFR is formed during the eruption through the reconnection. The simulation reproduces a reverse S-shaped MFR with dextral chirality, and the axis of this MFR rotates counterclockwise while rising, which compares favorably with a typical filament eruption observed from dual viewing angles. By calculating the twist and writhe numbers of the modeled MFR during its eruption, we found that, accompanying the rotation, the nonlocal writhe of the MFR's axis decreases while the twist of its surrounding field lines increases, and this is distinct from the kink instability, which converts magnetic twist into writhe of the MFR axis.

preprint2022arXiv

Transformer with Memory Replay

Transformers achieve state-of-the-art performance for natural language processing tasks by pre-training on large-scale text corpora. They are extremely compute-intensive and have very high sample complexity. Memory replay is a mechanism that remembers and reuses past examples by saving to and replaying from a memory buffer. It has been successfully used in reinforcement learning and GANs due to better sample efficiency. In this paper, we propose Transformer with Memory Replay (TMR), which integrates memory replay with the transformer, making it more sample-efficient. Experiments on GLUE and SQuAD benchmark datasets show that Transformer with Memory Replay achieves at least a $1\%$ point increase compared to the baseline transformer model when pretrained with the same number of examples. Further, by adopting a careful design that reduces the wall-clock time overhead of memory replay, we also empirically achieve a better runtime efficiency.
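
The memory replay mechanism described above (saving examples to a buffer and replaying them later) can be sketched in a few lines. This is only a generic illustration, not the TMR design: the class name `ReplayBuffer`, the helper `batches_with_replay`, the first-in-first-out eviction and the uniform sampling are all assumptions made for the example.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of past training examples (hypothetical helper)."""

    def __init__(self, capacity, seed=0):
        # A bounded deque evicts the oldest example once capacity is reached.
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def add(self, example):
        self.buffer.append(example)

    def sample(self, k):
        # Uniform sampling without replacement; returns fewer items if the
        # buffer does not yet hold k examples.
        k = min(k, len(self.buffer))
        return self.rng.sample(list(self.buffer), k)


def batches_with_replay(stream, buffer, replay_size):
    """Yield each fresh batch extended with examples replayed from the buffer."""
    for batch in stream:
        replayed = buffer.sample(replay_size)
        yield list(batch) + replayed
        # Save the fresh examples only after they have been used once.
        for example in batch:
            buffer.add(example)


stream = [["a", "b"], ["c", "d"], ["e", "f"]]
mixed = list(batches_with_replay(stream, ReplayBuffer(capacity=4), replay_size=2))
```

In this sketch the first batch is returned unchanged because the buffer is still empty; later batches carry two replayed examples each. How TMR chooses what to store and replay is not stated in the abstract.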

preprint2022arXiv

VisualTTS: TTS with Accurate Lip-Speech Synchronization for Automatic Voice Over

In this paper, we formulate a novel task to synthesize speech in sync with a silent pre-recorded video, denoted as automatic voice over (AVO). Unlike traditional speech synthesis, AVO seeks to generate not only human-sounding speech, but also perfect lip-speech synchronization. A natural solution to AVO is to condition the speech rendering on the temporal progression of lip sequence in the video. We propose a novel text-to-speech model that is conditioned on visual input, named VisualTTS, for accurate lip-speech synchronization. The proposed VisualTTS adopts two novel mechanisms that are 1) textual-visual attention, and 2) visual fusion strategy during acoustic decoding, which both contribute to forming accurate alignment between the input text content and lip motion in input lip sequence. Experimental results show that VisualTTS achieves accurate lip-speech synchronization and outperforms all baseline systems.

preprint2022arXiv

Where and how does a decay-index profile become saddle-like?

The decay index of solar magnetic fields is known as an important parameter in regulating solar eruptions from the standpoint of the torus instability. In particular, a saddle-like profile of decay index, which hosts a local torus-stable regime at higher altitudes than where the decay index first exceeds the instability threshold, is found to be associated with some confined or two-step eruptions. To understand the occurrence of such a profile, we employed dipoles to emulate different kinds of photospheric flux distributions. Corroborated by observations of representative active regions (ARs), our major results are: 1) in bipolar configurations the critical height increases away from the AR center along the polarity inversion line (PIL) and its average is roughly half of the centroid distance between opposite polarities; 2) in quadrupolar configurations saddle-like profiles appear above the PIL when the two dipoles oriented in the same direction are significantly more separated in this direction than in the perpendicular direction, and when the two dipoles are oriented differently or have unequal fluxes; 3) saddle-like profiles in quadrupolar configurations are associated with magnetic skeletons such as a null point or a hyperbolic flux tube, and the role of such profiles in eruptions is anticipated to be double-edged if magnetic reconnection is involved.
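
The abstract does not spell out the decay index. A commonly used definition is n(h) = -d ln B / d ln h, where B is the strength of the external (strapping) field at height h. The sketch below evaluates it numerically for a single buried point dipole, whose field directly above falls off as (h + depth)^-3. The function names, the depth value and the threshold of 1.5 (a value often quoted for an idealized current ring) are assumptions for illustration, not numbers taken from the paper.

```python
import math


def dipole_field(h, depth):
    """Field strength directly above a buried horizontal point dipole (arbitrary units)."""
    return 1.0 / (h + depth) ** 3


def decay_index(field, h, dh=1e-6):
    """n(h) = -d ln B / d ln h, estimated with a central finite difference."""
    dlnb = math.log(field(h + dh)) - math.log(field(h - dh))
    dlnh = math.log(h + dh) - math.log(h - dh)
    return -dlnb / dlnh


def critical_height(field, threshold=1.5, lo=1e-3, hi=1e3):
    """Bisection for the height where n(h) crosses the threshold.

    Assumes n(h) increases monotonically on [lo, hi]; a saddle-like
    profile would violate this and need a scan instead.
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if decay_index(field, mid) < threshold:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


depth = 10.0  # assumed dipole depth below the surface, arbitrary length units
h_crit = critical_height(lambda h: dipole_field(h, depth))
```

For this single-dipole field the decay index is n(h) = 3h / (h + depth) analytically, so the crossing of 1.5 sits at h = depth. Saddle-like profiles of the kind studied in the paper need more than one dipole.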

preprint2021arXiv

A Physics-Informed Machine Learning Model for Porosity Analysis in Laser Powder Bed Fusion Additive Manufacturing

To control part quality, it is critical to analyze pore generation mechanisms, laying a theoretical foundation for future porosity control. Current porosity analysis models use machine setting parameters, such as laser angle and part pose. However, these setting-based models are machine dependent; hence they often do not transfer to porosity analysis for a different machine. To address this problem, a physics-informed, data-driven model (PIM) is proposed which, instead of directly using machine setting parameters to predict porosity levels of printed parts, first interprets machine settings into physical effects, such as laser energy density and laser radiation pressure. Then, these physical, machine-independent effects are used to predict porosity levels according to pass, flag, and fail categories instead of focusing on quantitative pore size prediction. Evaluated with six learning methods, PIM achieved good performance, with prediction errors of 10$\sim$26%. Finally, pore-encouraging and pore-suppressing influences were analyzed for quality analysis.

preprint2021arXiv

Computational design of a new layered superconductor LaOTlF2

A new layered compound LaOTlF2 is designed and investigated using first-principles calculations in this work. The parent compound is an insulator with an indirect band gap of 2.65 eV. Electron-doping of the parent compound makes the material metallic. In the meantime, several lattice vibrational modes couple strongly to the conduction band, leading to a large electron-phonon coupling constant and conventional superconductivity. The highest superconducting transition temperature Tc is predicted to be approximately 8.6 K with λ about 1.25 in the optimally doped LaO0.95F0.05TlF2, where λ is calculated using the Wannier interpolation technique.

preprint2021arXiv

Pre-eruption Splitting of the Double-Decker Structure in a Solar Filament

Solar filaments often erupt partially. Although how they split remains elusive, the splitting process has the potential of revealing the filament structure and eruption mechanism. Here we investigate the pre-eruption splitting of an apparently single filament and its subsequent partial eruption on 2012 September 27. The evolution is characterized by three stages with distinct dynamics. During the quasi-static stage, the splitting proceeds gradually for about 1.5 hrs, with the upper branch rising at a few kilometers per second and displaying swirling motions about its axis. During the precursor stage that lasts for about 10 min, the upper branch rises at tens of kilometers per second, with a pair of conjugated dimming regions starting to develop at its footpoints; with the swirling motions turning chaotic, the axis of the upper branch whips southward, which drives an arc-shaped EUV front propagating in a similar direction. During the eruption stage, the upper branch erupts with the onset of a C3.7-class two-ribbon flare, while the lower branch remains stable. Judging from the well-separated footpoints of the upper branch from those of the lower one, we suggest that the pre-eruption filament possesses a double-decker structure composed of two distinct flux bundles, whose formation is associated with gradual magnetic flux cancellations and converging photospheric flows around the polarity inversion line.

preprint2021arXiv

Resolving Two Distinct Thermal X-ray Components in A compound Solar Flare

X-ray emission provides the most direct diagnostics of the energy-release process in solar flares. Occasionally, a superhot X-ray source is found to be above hot flare loops of ~10 MK temperature. While the origin of the superhot plasma is still elusive, it has conjured up an intriguing image of in-situ plasma heating near the reconnection site high above the flare loops, in contrast to the conventional picture of chromospheric evaporation. Here we investigate an extremely long-duration solar flare, in which EUV images show two distinct flare loop systems that appear successively along a Gamma-shaped polarity inversion line (PIL). When both flare loop systems are present, the HXR spectrum is found to be well fitted by combining a hot component (Te ~12 MK) and a superhot component (Te ~30 MK). Associated with a fast CME, the superhot X-ray source is located at the top of the flare arcade that appears earlier, straddling and extending along the long "arm" of the Gamma-shaped PIL. Associated with a slow CME, the hot X-ray source is located at the top of the flare arcade that appears later and sits astride the short "arm" of the Gamma-shaped PIL. Aided by observations from a different viewing angle, we are able to verify that the superhot X-ray source is above the hot one in projection, but the two sources belong to different flare loop systems. Thus, this case study provides a stereoscopic observation explaining the co-existence of superhot and hot X-ray emitting plasmas in solar flares.

preprint2021arXiv

Seen and Unseen emotional style transfer for voice conversion with a new emotional speech dataset

Emotional voice conversion aims to transform emotional prosody in speech while preserving the linguistic content and speaker identity. Prior studies show that it is possible to disentangle emotional prosody using an encoder-decoder network conditioned on discrete representation, such as one-hot emotion labels. Such networks learn to remember a fixed set of emotional styles. In this paper, we propose a novel framework based on variational auto-encoding Wasserstein generative adversarial network (VAW-GAN), which makes use of a pre-trained speech emotion recognition (SER) model to transfer emotional style during training and at run-time inference. In this way, the network is able to transfer both seen and unseen emotional style to a new utterance. We show that the proposed framework achieves remarkable performance by consistently outperforming the baseline framework. This paper also marks the release of an emotional speech dataset (ESD) for voice conversion, which has multiple speakers and languages.

preprint2021arXiv

Statistical Characteristics of Driver Acceleration Behavior and Its Probability Model

Naturalistic driving data were applied to study driver acceleration behaviour, and a probability model of the driver was proposed. First, the question of whether the database is large enough is resolved using kernel density estimation and Kullback-Leibler divergence. Next, the converged database is utilised to obtain the bivariate acceleration distribution pattern. Subsequently, two probability models are proposed to explain the pattern. Finally, the statistical characteristics of the acceleration behaviours are studied to verify the probability models. The longitudinal and lateral acceleration behaviours always approximate a similar Pareto distribution. The braking, accelerating, and steering manoeuvres become more intense at first and then less intense as the velocity increases. These behavioural characteristics reveal the mechanism of the quadrangle bivariate acceleration distribution pattern. The bivariate acceleration behaviour of the driver will never reach a circle-shaped pattern. The bivariate Pareto distribution model can be applied to describe the bivariate acceleration behaviour of the driver.
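
One way to read the "is the database large enough" check is to compare a kernel density estimate built from a growing subset with one built from the full data, using the Kullback-Leibler divergence. The sketch below does this in one dimension on synthetic Gaussian samples. The data, bandwidth, grid and subset sizes are all assumptions for illustration; the paper's actual procedure and its bivariate setting are not reproduced here.

```python
import math
import random


def gaussian_kde(samples, grid, bandwidth):
    """Gaussian kernel density estimate on grid points, normalized to sum to 1."""
    density = []
    for x in grid:
        density.append(
            sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in samples)
        )
    total = sum(density)
    return [d / total for d in density]


def kl_divergence(p, q, eps=1e-12):
    """Discrete KL divergence D(p || q); eps guards against log(0)."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


rng = random.Random(0)
# Synthetic stand-in for acceleration samples (assumption, not real driving data).
data = [rng.gauss(0.0, 1.0) for _ in range(4000)]
grid = [-4.0 + 0.2 * i for i in range(41)]
bandwidth = 0.3

reference = gaussian_kde(data, grid, bandwidth)
kl_by_size = {
    n: kl_divergence(gaussian_kde(data[:n], grid, bandwidth), reference)
    for n in (50, 500, 4000)
}
```

If the divergence flattens out near zero as the subset grows, adding more data no longer changes the estimated distribution much, which is one plausible convergence criterion.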

preprint2021arXiv

Trust Aware Emergency Response for A Resilient Human-Swarm Cooperative System

A human-swarm cooperative system, which mixes multiple robots and a human supervisor to form a heterogeneous team, is widely used for emergent scenarios such as criminal tracking in social security and victim assistance in a natural disaster. These emergent scenarios require a cooperative team to quickly terminate the current task and transition the system to a new task, bringing difficulty in motion planning. Moreover, due to the immediate task transitions, uncertainty from both physical systems and prior tasks accumulates and decreases swarm performance, causing robot failures and influencing the cooperation effectiveness between the human and the robot swarm. Therefore, given the quick-transition requirements and the introduced uncertainty, it is challenging for a human-swarm system to respond to emergent tasks, compared with executing normal tasks where a gradual transition between tasks is allowed. Human trust reveals the behavior expectations of others and is used to adjust unsatisfactory behaviors for better cooperation. Inspired by human trust, in this paper, a trust-aware reflective control (Trust-R) is developed to dynamically calibrate human-swarm cooperation. Trust-R, based on a weighted mean subsequence reduced algorithm (WMSR) and human trust modeling, helps a swarm to self-reflect on its performance from the perspective of human trust and then proactively correct its faulty behaviors at an early stage before a human intervenes. One typical task scenario, emergency response, was designed in the real-gravity simulation environment, and a human user study with 145 volunteers was conducted. Trust-R's effectiveness in correcting faulty behaviors in emergency response was validated by the improved swarm performance and increased trust scores.
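
The weighted mean subsequence reduced (W-MSR) rule mentioned above is a resilient consensus update: each agent discards the most extreme neighbor values on either side of its own value before averaging. The sketch below shows one update step with equal weights. The function name, the equal weighting and the example numbers are assumptions; how Trust-R combines this rule with its trust model is not described in the abstract.

```python
def wmsr_update(own, neighbor_values, f):
    """One W-MSR step with equal weights (illustrative sketch).

    Values strictly above `own` lose their f largest entries (all of them if
    there are f or fewer); values strictly below `own` lose their f smallest
    entries in the same way. The remaining values are averaged with `own`.
    """
    larger = sorted(v for v in neighbor_values if v > own)
    smaller = sorted(v for v in neighbor_values if v < own)
    equal = [v for v in neighbor_values if v == own]

    kept_larger = larger[: max(len(larger) - f, 0)]  # drop the f largest
    kept_smaller = smaller[min(f, len(smaller)):]    # drop the f smallest

    kept = [own] + equal + kept_smaller + kept_larger
    return sum(kept) / len(kept)


# Hypothetical heading values; 50.0 plays the role of a faulty robot.
new_value = wmsr_update(own=1.0, neighbor_values=[0.9, 1.1, 1.05, 50.0], f=1)
```

With f = 1 the outlier 50.0 and the single lower value 0.9 are both discarded, so the update averages 1.0, 1.05 and 1.1.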

preprint2020arXiv

Asymptotic Convergence Rate of Alternating Minimization for Rank One Matrix Completion

We study alternating minimization for matrix completion in the simplest possible setting: completing a rank-one matrix from a revealed subset of the entries. We bound the asymptotic convergence rate by the variational characterization of eigenvalues of a reversible consensus problem. This leads to a polynomial upper bound on the asymptotic rate in terms of number of nodes as well as the largest degree of the graph of revealed entries.
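
Alternating minimization for rank-one completion can be sketched directly: fix the column factor and solve a least-squares problem for each row factor over the revealed entries, then swap roles. The code below is a minimal version of that idea. The function name, the all-ones initialization, the iteration count and the small example matrix are assumptions, and the sketch says nothing about the convergence-rate bound derived in the paper.

```python
def altmin_rank_one(observed, n_rows, n_cols, iters=200):
    """Fit x[i] * y[j] to revealed entries given as {(i, j): value}."""
    x = [1.0] * n_rows
    y = [1.0] * n_cols
    for _ in range(iters):
        # Least-squares update of each row factor with y held fixed.
        for i in range(n_rows):
            num = sum(v * y[j] for (r, j), v in observed.items() if r == i)
            den = sum(y[j] ** 2 for (r, j) in observed if r == i)
            if den > 0:
                x[i] = num / den
        # Least-squares update of each column factor with x held fixed.
        for j in range(n_cols):
            num = sum(v * x[i] for (i, c), v in observed.items() if c == j)
            den = sum(x[i] ** 2 for (i, c) in observed if c == j)
            if den > 0:
                y[j] = num / den
    return x, y


# Hypothetical ground truth M = u v^T with six of nine entries revealed.
u = [1.0, 2.0, 3.0]
v = [2.0, 1.0, 4.0]
revealed = [(0, 0), (0, 1), (1, 1), (1, 2), (2, 2), (2, 0)]
observed = {(i, j): u[i] * v[j] for (i, j) in revealed}

x, y = altmin_rank_one(observed, 3, 3)
completed = [[x[i] * y[j] for j in range(3)] for i in range(3)]
```

The revealed entries here form a connected bipartite graph between rows and columns, which is what lets the unrevealed entries be recovered in this example.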

preprint2020arXiv

Concept of the Solar Ring Mission: Overview

The concept of the Solar Ring mission was gradually formed from the L5/L4 mission concept, and the proposal of its pre-phase study was funded by the National Natural Science Foundation of China in November 2018 and then by the Strategic Priority Program of Chinese Academy of Sciences in space sciences in May 2019. The Solar Ring mission will be the first attempt to routinely monitor and study the Sun and inner heliosphere from a full 360-degree perspective in the ecliptic plane. The current preliminary design of the Solar Ring mission is to deploy six spacecraft, grouped in three pairs, on a sub-AU orbit around the Sun. The two spacecraft in each group are separated by about 30 degrees and every two groups by about 120 degrees. This configuration with necessary science payloads will allow us to establish three unprecedented capabilities: (1) determine the photospheric vector magnetic field without ambiguity, (2) provide 360-degree maps of the Sun and the inner heliosphere routinely, and (3) resolve the solar wind structures at multiple scales and multiple longitudes. With these capabilities, the Solar Ring mission aims to address the origin of the solar cycle, the origin of solar eruptions, the origin of solar wind structures and the origin of severe space weather events. The successful accomplishment of the mission will advance our understanding of the star and the space environment that sustain our life and enhance our capability of expanding into the next new territory of humankind.

preprint2020arXiv

Eruption of Solar Magnetic Flux Ropes Caused by Flux Feeding

Large-scale solar eruptions are believed to have a magnetic flux rope as the core structure. However, it remains elusive as to how the flux rope builds up and what triggers its eruption. Recent observations found that a prominence erupted following multiple episodes of "flux feeding". During each episode, a chromospheric fibril rose and merged with the prominence lying above. In this letter, we carried out 2.5-dimensional magnetohydrodynamic (MHD) numerical simulations to investigate whether the flux-feeding mechanism can explain such an eruption. The simulations demonstrate that the discrete emergence of small flux ropes can initiate eruptions by feeding axial flux into the preexistent flux rope until its total axial flux reaches a critical value. The onset of the eruption is dominated by an ideal MHD process. Our simulation results corroborate that the flux feeding is a viable mechanism to cause the eruption of solar magnetic flux ropes.

preprint2020arXiv

Inner Attention Supported Adaptive Cooperation for Heterogeneous Multi Robots Teaming based on Multi-agent Reinforcement Learning

Humans can selectively focus on different information based on different task requirements and on other people's abilities and availability. Therefore, they can adapt quickly to completely different and complex environments. If robots could obtain the same abilities, their adaptability to new and unexpected situations would greatly increase. Recent efforts in Heterogeneous Multi Robots Teaming have tried to achieve this ability, for example through methods based on communication and multi-modal information fusion strategies. However, these methods not only suffer from the exponential explosion problem as the number of robots increases but also need huge computational resources. To that end, we introduce an inner attention actor-critic method that replicates aspects of flexible human cooperation. By bringing the attention mechanism from computer vision and natural language processing into the realm of multi-robot cooperation, our attention method is able to dynamically select which robots to attend to. To test the effectiveness of our proposed method, several simulation experiments have been designed. The results show that the inner attention mechanism can enable flexible cooperation and lower resource consumption in rescuing tasks.
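
The idea of dynamically selecting which robots to attend to can be illustrated with generic scaled dot-product attention: one robot's query vector is scored against its teammates' key vectors, and the softmax weights decide how much of each teammate's information is mixed in. This is a stand-in sketch only. The vectors, their dimensions and the function names are assumptions, and the paper's inner attention actor-critic architecture is not reproduced.

```python
import math


def attention_weights(query, keys):
    """Softmax over scaled dot products between one query and several keys."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Weighted sum of teammate value vectors, plus the weights themselves."""
    weights = attention_weights(query, keys)
    dim = len(values[0])
    mixed = [sum(w * v[t] for w, v in zip(weights, values)) for t in range(dim)]
    return mixed, weights


# Hypothetical embeddings for one robot (query) and three teammates.
query = [1.0, 0.0]
teammate_keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
teammate_values = [[0.5, 0.1], [0.2, 0.9], [0.7, 0.3]]

context, weights = attend(query, teammate_keys, teammate_values)
```

Because the cost of scoring grows linearly with the number of teammates, a mechanism of this kind avoids enumerating joint combinations of robots, which may be part of the resource savings the abstract refers to.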

preprint2020arXiv

Keyphrase Prediction With Pre-trained Language Model

Recently, generative methods have been widely used in keyphrase prediction, thanks to their capability to produce both present keyphrases that appear in the source text and absent keyphrases that do not match any source text. However, the absent keyphrases are generated at the cost of the performance on present keyphrase prediction, since previous works mainly use generative models that rely on the copying mechanism and select words step by step. Besides, the extractive model that directly extracts a text span is more suitable for predicting the present keyphrase. Considering the different characteristics of extractive and generative methods, we propose to divide the keyphrase prediction into two subtasks, i.e., present keyphrase extraction (PKE) and absent keyphrase generation (AKG), to fully exploit their respective advantages. On this basis, a joint inference framework is proposed to make the most of BERT in two subtasks. For PKE, we tackle this task as a sequence labeling problem with the pre-trained language model BERT. For AKG, we introduce a Transformer-based architecture, which fully integrates the present keyphrase knowledge learned from PKE by the fine-tuned BERT. The experimental results show that our approach can achieve state-of-the-art results on both tasks on benchmark datasets.
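
Treating present keyphrase extraction as sequence labeling means each token receives a tag and contiguous tagged tokens are read off as keyphrases. The sketch below decodes spans from a B/I/O tag sequence. The BIO scheme, the function name and the example sentence are assumptions (the abstract only says "sequence labeling"), and the tags are hand-written here rather than predicted by BERT.

```python
def extract_spans(tokens, tags):
    """Collect phrases from B (begin), I (inside) and O (outside) tags."""
    spans, current = [], []
    for token, tag in zip(tokens, tags):
        if tag == "B":
            # A new phrase starts; close any phrase still open.
            if current:
                spans.append(" ".join(current))
            current = [token]
        elif tag == "I" and current:
            current.append(token)
        else:
            # "O", or a stray "I" with no open phrase, ends the current span.
            if current:
                spans.append(" ".join(current))
            current = []
    if current:
        spans.append(" ".join(current))
    return spans


tokens = "we study keyphrase prediction with pre-trained language models".split()
tags = ["O", "O", "B", "I", "O", "B", "I", "I"]
present_keyphrases = extract_spans(tokens, tags)
```

Absent keyphrases cannot be recovered this way, since they do not appear as spans in the source text; that is the part the abstract assigns to the generative AKG subtask.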

preprint2020arXiv

Modeling Prosodic Phrasing with Multi-Task Learning in Tacotron-based TTS

Tacotron-based end-to-end speech synthesis has shown remarkable voice quality. However, the rendering of prosody in the synthesized speech remains to be improved, especially for long sentences, where prosodic phrasing errors can occur frequently. In this paper, we extend the Tacotron-based speech synthesis framework to explicitly model the prosodic phrase breaks. We propose a multi-task learning scheme for Tacotron training that optimizes the system to predict both Mel spectrum and phrase breaks. To the best of our knowledge, this is the first implementation of multi-task learning for Tacotron-based TTS with a prosodic phrasing model. Experiments show that our proposed training scheme consistently improves the voice quality for both Chinese and Mongolian systems.

preprint2020arXiv

Reconstructing solar wind inhomogeneous structures from stereoscopic observations in white-light: Solar wind transients in 3D

White-light images from Heliospheric Imager-1 (HI1) onboard the Solar Terrestrial Relations Observatory (STEREO) provide 2-dimensional (2D) global views of solar wind transients traveling in the inner heliosphere from two perspectives. How to retrieve the hidden three-dimensional (3D) features of the transients from these 2D images is intriguing but challenging. In our previous work (Li et al., 2018), a 'correlation-aided' method was developed to recognize the solar wind transients propagating along the Sun-Earth line based on simultaneous HI1 images from two STEREO spacecraft. Here the method is extended from the Sun-Earth line to the whole 3D space to reconstruct the solar wind transients in the common field of view of STEREO HI1 cameras. We demonstrate the capability of the method by showing the 3D shapes and propagation directions of a coronal mass ejection (CME) and three small-scale blobs during 3-4 April 2010. Compared with some forward modeling methods, we found our method reliable in terms of the position, angular width and propagation direction. Based on our 3D reconstruction result, an angular distorted, nearly North-South oriented CME on 3 April 2010 is revealed, manifesting the complexity of a CME's 3D structure.

preprint2020arXiv

Solar Flare-CME Coupling Throughout Two Acceleration Phases of a Fast CME

Solar flares and coronal mass ejections (CMEs) are closely coupled through magnetic reconnection. CMEs are usually accelerated impulsively within the low solar corona, synchronized with the impulsive flare energy release. We investigate the dynamic evolution of a fast CME and its associated X2.8 flare occurring on 2013 May 13. The CME experiences two distinct phases of enhanced acceleration, an impulsive one with a peak value of ~5 km s$^{-2}$ followed by an extended phase with accelerations up to 0.7 km s$^{-2}$. The two-phase CME dynamics is associated with a two-episode flare energy release. While the first episode is consistent with the &#34;standard&#34; eruption of a magnetic flux rope, the second episode of flare energy release is initiated by the reconnection of a large-scale loop in the aftermath of the eruption and produces stronger nonthermal emission up to $γ$-rays. In addition, this long-duration flare reveals clear signs of ongoing magnetic reconnection during the decay phase, evidenced by extended HXR bursts with energies up to 100--300 keV and intermittent downflows of reconnected loops for >4 hours. The observations reveal that the two-step flare reconnection substantially contributes to the two-phase CME acceleration, and the impulsive CME acceleration precedes the most intense flare energy release. The implications of this non-standard flare/CME observation are discussed.

preprint2020arXiv

StereoGAN: Bridging Synthetic-to-Real Domain Gap by Joint Optimization of Domain Translation and Stereo Matching

Large-scale synthetic datasets are beneficial to stereo matching but usually introduce known domain bias. Although unsupervised image-to-image translation networks represented by CycleGAN show great potential in dealing with domain gap, it is non-trivial to generalize this method to stereo matching due to the problem of pixel distortion and stereo mismatch after translation. In this paper, we propose an end-to-end training framework with domain translation and stereo matching networks to tackle this challenge. First, joint optimization between domain translation and stereo matching networks in our end-to-end framework lets the former facilitate the latter to the maximum extent. Second, this framework introduces two novel losses, i.e., bidirectional multi-scale feature re-projection loss and correlation consistency loss, to help translate all synthetic stereo images into realistic ones as well as maintain epipolar constraints. The effective combination of the above two contributions leads to impressive stereo-consistent translation and disparity estimation accuracy. In addition, a mode seeking regularization term is added to endow the synthetic-to-real translation results with higher fine-grained diversity. Extensive experiments demonstrate the effectiveness of the proposed framework in bridging the synthetic-to-real domain gap on stereo matching.

preprint2020arXiv

Teacher-Student Training for Robust Tacotron-based TTS

While neural end-to-end text-to-speech (TTS) is superior to conventional statistical methods in many ways, the exposure bias problem in the autoregressive models remains an issue to be resolved. The exposure bias problem arises from the mismatch between the training and inference processes, which results in unpredictable performance for out-of-domain test data at run-time. To overcome this, we propose a teacher-student training scheme for Tacotron-based TTS by introducing a distillation loss function in addition to the feature loss function. We first train a Tacotron2-based TTS model by always providing natural speech frames to the decoder; this model serves as the teacher. We then train another Tacotron2-based model as a student model, whose decoder takes the predicted speech frames as input, similar to how the decoder works during run-time inference. With the distillation loss, the student model learns the output probabilities from the teacher model, a process called knowledge distillation. Experiments show that our proposed training scheme consistently improves the voice quality for out-of-domain test data in both Chinese and English systems.

preprint2020arXiv

The Relationship between Chirality, Sense of Rotation, and Hemispheric Preference of Solar Eruptive Filaments

The orientation, chirality, and dynamics of solar eruptive filaments are key to understanding the magnetic field of coronal mass ejections (CMEs) and therefore to predicting the geoeffectiveness of CMEs arriving at Earth. However, confusion and contention remain over the relationship between the filament chirality, magnetic helicity, and sense of rotation during eruption. To resolve the ambiguity in observations, in this paper, we used stereoscopic observations to determine the rotation direction of the filament apex and the method proposed by Chen et al. (2014) to determine the filament chirality. Our sample of 12 eruptive active-region filaments establishes a strong one-to-one relationship, i.e., during the eruption, sinistral/dextral filaments (located in the southern/northern hemisphere) rotate clockwise/counterclockwise when viewed from above, and corroborates a weak hemispheric preference, i.e., a filament and related sigmoid both exhibit a forward (reverse) S shape in the southern (northern) hemisphere, which suggests that the sigmoidal filament is associated with a low-lying magnetic flux rope with its axis dipped in the middle. As a result of rotation, the projected S shape of a filament is anticipated to be reversed during eruption.

preprint2020arXiv

Trust Repairing for Human-Swarm Cooperation in Dynamic Task Response

Emergencies arise in human-UAV cooperation, such as criminal activity tracking and urgent needs for ground assistance. Emergency response usually places high requirements on the motion control of the multi-UAV system, which must maintain both team performance and team behaviors. However, when a UAV swarm executes tasks in a real-world environment, factors such as system reliability and environmental disturbances cause some robots in the swarm to behave abnormally, for example with slow flocking speed, wrong heading direction, or poor spatial relations. Meanwhile, incorrect trust between the human and the UAV swarm could map the abnormal behavior of a faulty robot to the whole swarm and request a time-consuming intervention from the human supervisor, damaging the UAV swarm's response to a dynamic task and even leading to task failure because of accumulated error. To correctly reflect the trust between humans and the UAV swarm, and to rebuild that trust and recover the performance degraded by incorrect trust, we propose a dynamic trust repair model. The dynamic trust model focuses on the human-supervisory UAV system; it can help the UAV swarm reduce the negative influence of faulty UAVs on swarm performance and achieve a flexible reaction and stable human-supervisory UAV task performance. Results show that the trust model could improve the performance of the swarm for dynamic task response and regain human trust.

preprint2020arXiv

WaveTTS: Tacotron-based TTS with Joint Time-Frequency Domain Loss

Tacotron-based text-to-speech (TTS) systems directly synthesize speech from text input. Such frameworks typically consist of a feature prediction network that maps character sequences to frequency-domain acoustic features, followed by a waveform reconstruction algorithm or a neural vocoder that generates the time-domain waveform from acoustic features. As the loss function is usually calculated only for frequency-domain acoustic features, it doesn't directly control the quality of the generated time-domain waveform. To address this problem, we propose a new training scheme for Tacotron-based TTS, referred to as WaveTTS, that has two loss functions: 1) time-domain loss, denoted as the waveform loss, that measures the distortion between the natural and generated waveform; and 2) frequency-domain loss, that measures the Mel-scale acoustic feature loss between the natural and generated acoustic features. WaveTTS ensures both the quality of the acoustic features and the resulting speech waveform. To the best of our knowledge, this is the first implementation of Tacotron with joint time-frequency domain loss. Experimental results show that the proposed framework outperforms the baselines and achieves high-quality synthesized speech.

preprint2019arXiv

Homoepitaxial growth of SrTiO$_3$ by Pulsed Laser Deposition: energetic vs thermal growth

Pulsed Laser Deposition (PLD) is widely used to grow epitaxial thin films of quantum materials such as complex oxides. Here, we use in-situ X-ray scattering to study homoepitaxy of SrTiO$_3$ by energetic (e-) and thermalized (th-) PLD. We find that e-PLD suppresses the lateral growth of two-dimensional islands, which suggests that energetic particles break up smaller islands. Fast interlayer transport occurs for both e-PLD and th-PLD, implying a process operating on sub-microsecond timescales that doesn't depend strongly on the kinetic energy of the incident particles.

preprint2019arXiv

Mass motion in a prominence bubble revealing a kinked flux rope configuration

Prominence bubbles are cavities rising into quiescent prominences from below. The bubble-prominence interface is often the active location for the formation of plumes, which flow turbulently into quiescent prominences. Not only is the origin of prominence bubbles poorly understood, but most of their physical characteristics are still largely unknown. Here, we investigate the dynamical properties of a bubble, which was observed from its early emergence beneath the spine of a quiescent prominence on 20 October 2017 in the H$α$ line-center and in $\pm$0.4 angstrom line-wing wavelengths by the 1-m New Vacuum Solar Telescope. We report that the prominence bubble exhibits a disparate morphology in the H$α$ line-center images compared to its line-wing images, indicating a complex pattern of mass motion along the line-of-sight. Combining Doppler maps with flow maps in the plane of sky derived from a Nonlinear Affine Velocity Estimator, we obtained a comprehensive picture of mass motions revealing a counter-clockwise rotation inside the bubble, with blue-shifted material flowing upward and red-shifted material flowing downward. This sequence of mass motions is interpreted to be either outlining a kinked flux rope configuration of the prominence bubble or providing observational evidence of the internal kink instability in the prominence plasma.

preprint2019arXiv

Role of ferroelectric polarization during growth of highly strained ferroelectrics revealed by in-situ x-ray diffraction

Strain engineering of perovskite oxide thin films has proven to be an extremely powerful method for enhancing and inducing ferroelectric behavior. In ferroelectric thin films and superlattices, the polarization is intricately linked to crystal structure, but we show here that it can also play an important role in the growth process, influencing growth rates, relaxation mechanisms, electrical properties and domain structures. We have studied this effect in detail by focusing on the properties of BaTiO$_{3}$ thin films grown on very thin layers of PbTiO$_{3}$ using a combination of x-ray diffraction, piezoforce microscopy, electrical characterization and rapid in-situ x-ray diffraction reciprocal space maps during the growth using synchrotron radiation. Using a simple model we show that the changes in growth are driven by the energy cost for the top material to sustain the polarization imposed upon it by the underlying layer, and these effects may be expected to occur in other multilayer systems where polarization is present during growth. Our research motivates the concept of polarization engineering during the growth process as a new and complementary approach to strain engineering.