Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
34 works
0 followers
17 topics
4 close collaborators

Published work

34 published items

preprint 2026 arXiv

SWE Atlas: Benchmarking Coding Agents Beyond Issue Resolution

We introduce SWE Atlas, a benchmark suite for coding agents spanning three professional software engineering workflows: Codebase Q&A (124 tasks), Test Writing (90 tasks), and Refactoring (70 tasks). SWE Atlas differs from prior SWE benchmarks in three key ways: it targets underrepresented but practically important task categories, uses comprehensive category-specific evaluation protocols, and adopts under-specified, agentic task formulations that better reflect real-world usage. Its evaluation framework combines programmatic checks with rubric-based assessment. This goes beyond functional correctness, evaluating software engineering quality, including test and refactor completeness, maintainability, reusable abstractions, and codebase hygiene. We evaluate a range of frontier and open-weight models on SWE Atlas and find that GPT-5.4 and Opus 4.7 achieve the strongest overall performance, while even the best open-weight models score poorly. Our analysis suggests that top models rely on extensive codebase exploration and runtime-driven reasoning. However, even top models consistently struggle with subtle edge cases, complex runtime analysis, and adherence to software engineering best practices. Overall, SWE Atlas provides a complementary evaluation suite for measuring both correctness and engineering quality in coding agents.

preprint 2025 arXiv

Ultrahigh-Energy Gamma-ray Emission Associated with Black Hole-Jet Systems

Black holes (BHs), among the most intriguing objects in the universe, can manifest themselves through electromagnetic radiation initiated by the accretion flow. Some stellar-mass BHs drive relativistic jets when accreting matter from their companion stars, forming microquasars. Non-thermal emission from the radio to tera-electronvolt (TeV) gamma-ray band has been observed from microquasars, indicating the acceleration of relativistic particles. Here we report detection of four microquasars (SS 433, V4641 Sgr, GRS 1915+105, MAXI J1820+070) with spectra extending to the ultrahigh-energy (UHE; photon energy $E>100$ TeV) band and one microquasar (Cygnus X-1) with a spectrum approaching 100 TeV, using the Large High Altitude Air Shower Observatory (LHAASO). Notably, the total emission associated with SS 433 cannot be interpreted with a single leptonic component. In the UHE band, its emission is in spatial coincidence with a giant atomic cloud, which is consistent with a hadronic origin. An elongated source is discovered from V4641 Sgr with the spectrum continuing up to 800 TeV. The detection of UHE gamma rays demonstrates that accreting BHs and their environments can operate as extremely efficient accelerators of particles out of 1 peta-electronvolt (PeV), suggesting microquasars to be important contributors to Galactic cosmic rays especially around the 'knee' region.

preprint 2023 arXiv

Effective Shielding of $\lesssim$ 10 GeV Cosmic Rays from Dense Molecular Clumps

The density of cosmic rays inside molecular clouds determines the ionization rate in the dense cores where stars form. It is also one of the drivers of astrochemistry leading to the creation of complex molecules. Through Fermi Large Area Telescope observations of nearby giant molecular clouds, we observed deficits (holes) in the gamma-ray residual map when modelling with the expected gamma-ray diffuse emission from uniform cosmic rays interacting with the molecular content. We propose that the deficit is due to the lack of penetration of the low-energy (sub-GeV to GeV) cosmic rays into denser regions or clumps. This differs from the prevailing view of fast cosmic ray transport in giant molecular clouds where the magnetic turbulence is suppressed by neutral-ion damping, as our results require a slow diffusion inside dense molecular clumps. Through modelling we find that while the shielding is negligible on the cloud scale, it becomes important in the denser, parsec-sized regions where the gravitational collapse is already at play, changing the initial condition of star formation and astrochemistry.

preprint 2023 arXiv

Zero-shot information extraction from radiological reports using ChatGPT

Electronic health records contain an enormous amount of valuable information, but much of it is recorded as free text. Information extraction is the strategy to transform the sequence of characters into structured data, which can be employed for secondary analysis. However, the traditional information extraction components, such as named entity recognition and relation extraction, require annotated data to optimize the model parameters, which has become one of the major bottlenecks in building information extraction systems. With large language models achieving good performance on various downstream NLP tasks without parameter tuning, it becomes possible to use large language models for zero-shot information extraction. In this study, we aim to explore whether the most popular large language model, ChatGPT, can extract useful information from radiological reports. We first design the prompt template for the information of interest in the CT reports. Then, we generate the prompts by combining the prompt template with the CT reports as the inputs of ChatGPT to obtain the responses. A post-processing module is developed to transform the responses into structured extraction results. We conducted the experiments with 847 CT reports collected from Peking University Cancer Hospital. The experimental results indicate that ChatGPT can achieve competitive performance for some extraction tasks compared with the baseline information extraction system, but some limitations remain to be addressed.

preprint 2022 arXiv

A Multi-Head Model for Continual Learning via Out-of-Distribution Replay

This paper studies class incremental learning (CIL) of continual learning (CL). Many approaches have been proposed to deal with catastrophic forgetting (CF) in CIL. Most methods incrementally construct a single classifier for all classes of all tasks in a single head network. To prevent CF, a popular approach is to memorize a small number of samples from previous tasks and replay them during training of the new task. However, this approach still suffers from serious CF as the parameters learned for previous tasks are updated or adjusted with only the limited number of saved samples in the memory. This paper proposes an entirely different approach that builds a separate classifier (head) for each task (called a multi-head model) using a transformer network, called MORE. Instead of using the saved samples in memory to update the network for previous tasks/classes in the existing approach, MORE leverages the saved samples to build a task specific classifier (adding a new classification head) without updating the network learned for previous tasks/classes. The model for the new task in MORE is trained to learn the classes of the task and also to detect samples that are not from the same data distribution (i.e., out-of-distribution (OOD)) of the task. This enables the classifier for the task to which the test instance belongs to produce a high score for the correct class and the classifiers of other tasks to produce low scores because the test instance is not from the data distributions of these classifiers. Experimental results show that MORE outperforms state-of-the-art baselines and is also naturally capable of performing OOD detection in the continual learning setting.
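The inference idea in this abstract, where the head of the matching task scores high and the other heads score low, can be illustrated with a minimal sketch. This is not the MORE implementation: the function name, the dictionary layout, and the toy scores are all hypothetical, and the sketch only assumes that each task head returns one score per class.

```python
from typing import Dict, List, Tuple


def predict_class_incremental(head_scores: Dict[int, List[float]]) -> Tuple[int, int]:
    """Return the (task_id, class_index) pair with the highest score.

    head_scores maps a hypothetical task id to that head's per-class scores
    for a single test instance. No task id is supplied at test time, which
    is the class incremental learning setting.
    """
    best_task, best_class, best_score = -1, -1, float("-inf")
    for task_id, scores in head_scores.items():
        for class_index, score in enumerate(scores):
            if score > best_score:
                best_task, best_class, best_score = task_id, class_index, score
    return best_task, best_class


# Toy scores (made up): the head for task 1 sees an in-distribution sample
# and scores one of its classes highly; the other heads treat it as OOD.
scores = {0: [0.05, 0.10], 1: [0.15, 0.90], 2: [0.20, 0.08]}
prediction = predict_class_incremental(scores)
```

In this toy setup the prediction depends entirely on the heads of non-matching tasks producing low scores, which is why the abstract ties the approach to OOD detection.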

preprint 2022 arXiv

Beyond Opinion Mining: Summarizing Opinions of Customer Reviews

Customer reviews are vital for making purchasing decisions in the Information Age. Such reviews can be automatically summarized to provide the user with an overview of opinions. In this tutorial, we present various aspects of opinion summarization that are useful for researchers and practitioners. First, we will introduce the task and major challenges. Then, we will present existing opinion summarization solutions, both pre-neural and neural. We will discuss how summarizers can be trained in the unsupervised, few-shot, and supervised regimes. Each regime has roots in different machine learning methods, such as auto-encoding, controllable text generation, and variational inference. Finally, we will discuss resources and evaluation methods and conclude with future directions. This three-hour tutorial will provide a comprehensive overview of major advances in opinion summarization. The listeners will be well-equipped with knowledge that is useful for both research and practical applications.

preprint 2022 arXiv

Continual Learning Based on OOD Detection and Task Masking

Existing continual learning techniques focus on either task incremental learning (TIL) or class incremental learning (CIL) problem, but not both. CIL and TIL differ mainly in that the task-id is provided for each test sample during testing for TIL, but not provided for CIL. Continual learning methods intended for one problem have limitations on the other problem. This paper proposes a novel unified approach based on out-of-distribution (OOD) detection and task masking, called CLOM, to solve both problems. The key novelty is that each task is trained as an OOD detection model rather than a traditional supervised learning model, and a task mask is trained to protect each task to prevent forgetting. Our evaluation shows that CLOM outperforms existing state-of-the-art baselines by large margins. The average TIL/CIL accuracy of CLOM over six experiments is 87.6/67.9% while that of the best baselines is only 82.4/55.0%.

preprint 2022 arXiv

Ensemble Semi-supervised Entity Alignment via Cycle-teaching

Entity alignment is to find identical entities in different knowledge graphs. Although embedding-based entity alignment has recently achieved remarkable progress, training data insufficiency remains a critical challenge. Conventional semi-supervised methods also suffer from the incorrect entity alignment in newly proposed training data. To resolve these issues, we design an iterative cycle-teaching framework for semi-supervised entity alignment. The key idea is to train multiple entity alignment models (called aligners) simultaneously and let each aligner iteratively teach its successor the proposed new entity alignment. We propose a diversity-aware alignment selection method to choose reliable entity alignment for each aligner. We also design a conflict resolution mechanism to resolve the alignment conflict when combining the new alignment of an aligner and that from its teacher. Besides, considering the influence of cycle-teaching order, we elaborately design a strategy to arrange the optimal order that can maximize the overall performance of multiple aligners. The cycle-teaching process can break the limitations of each model's learning capability and reduce the noise in new training data, leading to improved performance. Extensive experiments on benchmark datasets demonstrate the effectiveness of the proposed cycle-teaching framework, which significantly outperforms the state-of-the-art models when the training data is insufficient and the new entity alignment has much noise.

preprint 2022 arXiv

Gamma-ray observation towards the young massive star cluster NGC 6618 in the M17 region

Young massive clusters have been established as a new population of gamma-ray sources and potential cosmic ray (CR) accelerators. In this paper, we report the detection of gamma-ray emission near the young star cluster NGC 6618, which is one of the youngest star clusters in our Galaxy. The detected gamma-ray emission can be divided into two components. One component is point-like and shows a harder spectrum, while the other is extended and has a softer spectrum. Such spectral features are significantly different from those of other young massive clusters and may be due to the propagation effects of CRs accelerated in NGC 6618.

preprint 2022 arXiv

GeV Gamma-ray Emission and Molecular Clouds towards Supernova Remnant G35.6$-$0.4 and the TeV Source HESS J1858+020

It is difficult to distinguish the hadronic process from the leptonic one in $γ$-ray observations, which is however crucial in revealing the origin of cosmic rays. As an endeavor in this regard, we focus in this work on the complex $γ$-ray emitting region, which partially overlaps with the unidentified TeV source HESS J1858+020 and includes supernova remnant (SNR) G35.6$-$0.4 and HII region G35.6$-$0.5. We reanalyze CO-line, HI, and Fermi-LAT GeV $γ$-ray emission data of this region. The analysis of the molecular and HI data suggests that SNR G35.6$-$0.4 and HII region G35.6$-$0.5 are located at different distances. The analysis of the GeV $γ$-rays shows that the GeV emission arises from two point sources: one (SrcA) coincident with the SNR, and the other (SrcB) coincident with both HESS J1858+020 and HII region G35.6$-$0.5. The GeV emission of SrcA can be explained by the hadronic process in the SNR-MC association scenario. The GeV-band spectrum of SrcB and the TeV-band spectrum of HESS J1858+020 can be smoothly connected by a power-law function, with an index of $\sim$2.2. The connected spectrum is well explained by hadronic emission, with the cutoff energy of protons above 1 PeV. This indicates that there is a potential PeVatron in the HII region, which should be further verified with ultra-high energy observations with, e.g., LHAASO.

preprint 2022 arXiv

High-quality Task Division for Large-scale Entity Alignment

Entity Alignment (EA) aims to match equivalent entities that refer to the same real-world objects and is a key step for Knowledge Graph (KG) fusion. Most neural EA models cannot be applied to large-scale real-life KGs due to their excessive consumption of GPU memory and time. One promising solution is to divide a large EA task into several subtasks such that each subtask only needs to match two small subgraphs of the original KGs. However, it is challenging to divide the EA task without losing effectiveness. Existing methods display low coverage of potential mappings, insufficient evidence in context graphs, and largely differing subtask sizes. In this work, we design the DivEA framework for large-scale EA with high-quality task division. To include in the EA subtasks a high proportion of the potential mappings originally present in the large EA task, we devise a counterpart discovery method that exploits the locality principle of the EA task and the power of trained EA models. Unique to our counterpart discovery method is the explicit modelling of the chance of a potential mapping. We also introduce an evidence passing mechanism to quantify the informativeness of context entities and find the most informative context graphs with flexible control of the subtask size. Extensive experiments show that DivEA achieves higher EA performance than alternative state-of-the-art solutions.

preprint 2022 arXiv

KETOD: Knowledge-Enriched Task-Oriented Dialogue

Existing studies in dialogue system research mostly treat task-oriented dialogue and chit-chat as separate domains. Towards building a human-like assistant that can converse naturally and seamlessly with users, it is important to build a dialogue system that conducts both types of conversations effectively. In this work, we investigate how task-oriented dialogue and knowledge-grounded chit-chat can be effectively integrated into a single model. To this end, we create a new dataset, KETOD (Knowledge-Enriched Task-Oriented Dialogue), where we naturally enrich task-oriented dialogues with chit-chat based on relevant entity knowledge. We also propose two new models, SimpleToDPlus and Combiner, for the proposed task. Experimental results on both automatic and human evaluations show that the proposed methods can significantly improve the performance in knowledge-enriched response generation while maintaining a competitive task-oriented dialog performance. We believe our new dataset will be a valuable resource for future studies. Our dataset and code are publicly available at https://github.com/facebookresearch/ketod.

preprint 2022 arXiv

Neural Collaborative Graph Machines for Table Structure Recognition

Recently, table structure recognition has achieved impressive progress with the help of deep graph models. Most of them exploit single visual cues of tabular elements or simply combine visual cues with other modalities via early fusion to reason their graph relationships. However, neither early fusion nor individually reasoning in terms of multiple modalities can be appropriate for all varieties of table structures with great diversity. Instead, different modalities are expected to collaborate with each other in different patterns for different table cases. In the community, the importance of intra-inter modality interactions for table structure reasoning is still unexplored. In this paper, we define it as the heterogeneous table structure recognition (Hetero-TSR) problem. With the aim of filling this gap, we present novel Neural Collaborative Graph Machines (NCGM) equipped with stacked collaborative blocks, which alternately extract intra-modality context and model inter-modality interactions in a hierarchical way. It can represent the intra-inter modality relationships of tabular elements more robustly, which significantly improves the recognition performance. We also show that the proposed NCGM can modulate the collaborative pattern of different modalities conditioned on the context of intra-modality cues, which is vital for diversified table cases. Experimental results on benchmarks demonstrate that our proposed NCGM achieves state-of-the-art performance and beats other contemporary methods by a large margin, especially under challenging scenarios.

preprint 2022 arXiv

Novel boron nitride polymorphs with graphite-diamond hybrid structure

Both boron nitride (BN) and carbon (C) have sp, sp2 and sp3 hybridization modes, resulting in a variety of BN and C polymorphs with similar structures, such as hexagonal BN (hBN) and graphite, cubic BN (cBN) and diamond. Here, five types of BN polymorph structures were proposed theoretically, inspired by the graphite-diamond hybrid structures discovered in a recent experiment. These BN polymorphs with graphite-diamond hybrid structures possessed excellent mechanical properties with combined high hardness and high ductility, and also exhibited various electronic properties such as semi-conductivity, semi-metallicity, and even one- and two-dimensional conductivity, differing from the known insulators hBN and cBN. The simulated diffraction patterns of these BN hybrid structures could account for the unsolved diffraction patterns of intermediate products composed of "compressed hBN" and diamond-like BN, caused by phase transitions in previous experiments. Thus, this work provides a theoretical basis for the presence of these types of hybrid materials during phase transitions between graphite-like and diamond-like BN polymorphs.

preprint 2022 arXiv

Open-set Recognition via Augmentation-based Similarity Learning

The primary assumption of conventional supervised learning or classification is that the test samples are drawn from the same distribution as the training samples, which is called closed set learning or classification. In many practical scenarios, this is not the case because there are unknowns or unseen class samples in the test data, which is called the open set scenario, and the unknowns need to be detected. This problem is referred to as the open set recognition problem and is important in safety-critical applications. We propose to detect unknowns (or unseen class samples) through learning pairwise similarities. The proposed method works in two steps. It first learns a closed set classifier using the seen classes that have appeared in training and then learns how to compare seen classes with pseudo-unseen (automatically generated unseen class samples). The pseudo-unseen generation is carried out by performing distribution shifting augmentations on the seen or training samples. We call our method OPG (Open set recognition based on Pseudo unseen data Generation). The experimental evaluation shows that the learned similarity-based features can successfully distinguish seen from unseen in benchmark datasets for open set recognition.

preprint 2022 arXiv

Reinforcement Learning of Multi-Domain Dialog Policies Via Action Embeddings

Learning task-oriented dialog policies via reinforcement learning typically requires large amounts of interaction with users, which in practice renders such methods unusable for real-world applications. In order to reduce the data requirements, we propose to leverage data from across different dialog domains, thereby reducing the amount of data required from each given domain. In particular, we propose to learn domain-agnostic action embeddings, which capture general-purpose structure that informs the system how to act given the current dialog context, and are then specialized to a specific domain. We show how this approach is capable of learning with significantly less interaction with users, with a reduction of 35% in the number of dialogs required to learn, and to a higher level of proficiency than training separate policies for each domain on a set of simulated domains.

preprint 2022 arXiv

TextDCT: Arbitrary-Shaped Text Detection via Discrete Cosine Transform Mask

Arbitrary-shaped scene text detection is a challenging task due to the variety of text changes in font, size, color, and orientation. Most existing regression-based methods regress the masks or contour points of text regions to model the text instances. However, regressing the complete masks requires high training complexity, and contour points are not sufficient to capture the details of highly curved texts. To tackle the above limitations, we propose a novel light-weight anchor-free text detection framework called TextDCT, which adopts the discrete cosine transform (DCT) to encode the text masks as compact vectors. Further, considering the imbalanced number of training samples among pyramid layers, we only employ a single-level head for top-down prediction. To model the multi-scale texts in a single-level head, we introduce a novel positive sampling strategy by treating the shrunk text region as positive samples, and design a feature awareness module (FAM) for spatial-awareness and scale-awareness by fusing rich contextual information and focusing on more significant features. Moreover, we propose a segmented non-maximum suppression (S-NMS) method that can filter low-quality mask regressions. Extensive experiments are conducted on four challenging datasets, which demonstrate that TextDCT obtains competitive performance on both accuracy and efficiency. Specifically, TextDCT achieves F-measure of 85.1 at 17.2 frames per second (FPS) and F-measure of 84.9 at 15.1 FPS for CTW1500 and Total-Text datasets, respectively.
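As a rough illustration of encoding a mask as a compact DCT vector, the sketch below applies an orthonormal 2D DCT-II to a small binary mask and keeps only a low-frequency block of coefficients. It is not the TextDCT code: the function names, the 8x8 toy mask, and the choice to keep the top-left `keep` x `keep` block are assumptions made for illustration, and the abstract does not say how the paper selects coefficients.

```python
import math
from typing import List

Matrix = List[List[float]]


def dct_matrix(n: int) -> Matrix:
    """Orthonormal DCT-II basis: row k holds the k-th cosine basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a: Matrix) -> Matrix:
    return [list(col) for col in zip(*a)]


def encode_mask(mask: Matrix, keep: int) -> List[float]:
    """2D DCT of a square mask; keep the top-left keep x keep block, flattened."""
    c = dct_matrix(len(mask))
    coeffs = matmul(matmul(c, mask), transpose(c))  # C * M * C^T
    return [coeffs[u][v] for u in range(keep) for v in range(keep)]


def decode_mask(vector: List[float], keep: int, n: int) -> List[List[int]]:
    """Zero-pad the kept coefficients, invert the DCT, threshold at 0.5."""
    coeffs = [[0.0] * n for _ in range(n)]
    for idx, value in enumerate(vector):
        coeffs[idx // keep][idx % keep] = value
    c = dct_matrix(n)
    recon = matmul(matmul(transpose(c), coeffs), c)  # C^T * X * C
    return [[1 if x >= 0.5 else 0 for x in row] for row in recon]


# Toy 8x8 mask with a 4x4 square of ones (hypothetical text region).
n = 8
mask = [[1 if 2 <= r <= 5 and 2 <= c <= 5 else 0 for c in range(n)]
        for r in range(n)]
full_code = encode_mask(mask, keep=n)     # all 64 coefficients, lossless
compact_code = encode_mask(mask, keep=4)  # 16 coefficients, lossy
```

Keeping all coefficients reconstructs the mask exactly, while a smaller `keep` trades reconstruction detail for a shorter vector to regress.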

preprint 2022 arXiv

Zero-Shot Aspect-Based Sentiment Analysis

Aspect-based sentiment analysis (ABSA) typically requires in-domain annotated data for supervised training/fine-tuning. It is a big challenge to scale ABSA to a large number of new domains. This paper aims to train a unified model that can perform zero-shot ABSA without using any annotated data for a new domain. We propose a method called contrastive post-training on review Natural Language Inference (CORN). Later ABSA tasks can be cast into NLI for zero-shot transfer. We evaluate CORN on ABSA tasks, ranging from aspect extraction (AE), aspect sentiment classification (ASC), to end-to-end aspect-based sentiment analysis (E2E ABSA), which show ABSA can be conducted without any human annotated ABSA data.

preprint 2021 arXiv

Analyzing the Forgetting Problem in the Pretrain-Finetuning of Dialogue Response Models

In this work, we study how the finetuning stage in the pretrain-finetune framework changes the behavior of a pretrained neural language generator. We focus on the transformer encoder-decoder model for the open-domain dialogue response generation task. Our major finding is that after standard finetuning, the model forgets some of the important language generation skills acquired during large-scale pretraining. We demonstrate the forgetting phenomenon through a set of detailed behavior analysis from the perspectives of knowledge transfer, context sensitivity, and function space projection. As a preliminary attempt to alleviate the forgetting problem, we propose an intuitive finetuning strategy named "mix-review". We find that mix-review effectively regularizes the finetuning process, and the forgetting problem is alleviated to some extent. Finally, we discuss interesting behavior of the resulting dialogue model and its implications.

preprint 2021 arXiv

Discovery of carbon-based strongest and hardest amorphous material

Carbon is likely the most fascinating element of the periodic table because of the diversity of its allotropes stemming from its variable (sp, sp2, and sp3) bonding motifs. Exploration of new forms of carbon has been an eternal theme of contemporary scientific research. Here we report on novel amorphous carbon phases containing a high fraction of sp3 bonded atoms recovered after compressing fullerene C60 to previously unexplored high pressure and temperature. The synthesized carbons are the hardest and strongest amorphous materials known to date, capable of scratching diamond crystal and approaching its strength, as evidenced by complementary mechanical tests. Photoluminescence and absorption spectra of the materials demonstrate they are semiconductors with tunable bandgaps in the range of 1.5-2.2 eV, comparable to that of amorphous silicon. A remarkable combination of outstanding mechanical and electronic properties makes this class of amorphous carbons an excellent candidate for photovoltaic applications demanding ultrahigh strength and wear resistance.

preprint 2021 arXiv

Kekule Lattice in Graphdiyne: Coexistence of Phononic and Electronic Higher-Order Band Topology

Topological physics has been extensively studied in different kinds of bosonic and fermionic systems, ranging from artificial structures to natural materials. However, the coexistence of topological phonons and electrons in one single material is seldom reported. Recently, graphdiyne was proposed to be a two-dimensional (2D) electronic second-order topological insulator (SOTI). In this work, based on density-functional tight-binding calculations, we found that graphdiyne is equivalent to the Kekule lattice, also realizing a 2D phononic SOTI in both out-of-plane and in-plane modes. Depending on edge terminations, the characterized topological corner states can be either inside or outside the bulk gap, and are tunable by local corner potential. Most remarkably, a unique selectivity of space and symmetry is revealed in electron-phonon coupling between the localized phononic and electronic topological corner states. Our results not only demonstrate the phononic higher-order band topology in a real carbon material, but also provide an opportunity to investigate the interplay between phononic and electronic higher-order topological states.

preprint 2021 arXiv

Lifelong Learning Dialogue Systems: Chatbots that Self-Learn On the Job

Dialogue systems, also called chatbots, are now used in a wide range of applications. However, they still have some major weaknesses. One key weakness is that they are typically trained from manually-labeled data and/or written with handcrafted rules, and their knowledge bases (KBs) are also compiled by human experts. Due to the huge amount of manual effort involved, they are difficult to scale and also tend to produce many errors owing to their limited ability to understand natural language and the limited knowledge in their KBs. Thus, the level of user satisfaction is often low. In this paper, we propose to dramatically improve this situation by endowing the system with the ability to continually learn (1) new world knowledge, (2) new language expressions to ground them to actions, and (3) new conversational skills, during conversation or "on the job" by themselves so that as the systems chat more and more with users, they become more and more knowledgeable and are better and better able to understand diverse natural language expressions and improve their conversational skills. A key approach to achieving these is to exploit the multi-user environment of such systems to self-learn through interactions with users via verbal and non-verbal means. The paper discusses not only key challenges and promising directions to learn from users during conversation but also how to ensure the correctness of the learned knowledge.

preprint 2021 arXiv

Nuclear de-excitation lines as a probe of low-energy cosmic rays

Low-energy cosmic rays (LECRs) contribute substantially to the energy balance of the interstellar medium. They also play a significant role in the heating and chemistry of gas and, consequently, in the star formation process. Because of the slow propagation coupled with enhanced energy losses of subrelativistic particles, LECRs are concentrated around their acceleration sites. LECRs effectively interact with the ambient gas through nuclear reactions. Although these processes are energetically less effective compared to heating and ionization, they are extremely important from the point of view of nuclear de-excitation lines, which carry unique information about LECRs. We present results on the production of de-excitation lines, combining the numerical treatment of nuclear reactions using the code TALYS with the propagation and energy losses of LECRs.

preprint 2021 arXiv

On the surface brightness radial profile of the extended $γ$-ray sources

The morphology of an extended $γ$-ray source is governed by the propagation process of parent relativistic particles. In this paper, we investigate the surface brightness radial profile of extended $γ$-ray sources illuminated by cosmic ray protons and electrons, considering the radiation mechanisms, projection effects, and the response of instruments. We found that the parent particle species and the propagation process can cause considerable differences in the observed radial profiles. Thus, the surface brightness profile can be used as a unique tool to identify the radiation mechanism and the propagation process of the parent particles. In addition, we discuss the possible implications regarding the latest discoveries from very/ultra-high energy $γ$-ray instruments like LHAASO and HAWC.

preprint 2020 arXiv

A Deep Learning Framework for Hydrogen-fueled Turbulent Combustion Simulation

The high cost of high-resolution computational fluid/flame dynamics (CFD) has hindered its application in combustion related design, research and optimization. In this study, we propose a new framework for turbulent combustion simulation based on the deep learning approach. An optimized deep convolutional neural network (CNN) inspired by the U-Net architecture and the inception module is designed for constructing the framework of the deep learning solver, named CFDNN. CFDNN is then trained on the simulation results of hydrogen combustion in a cavity with different inlet velocities. After training, CFDNN can not only accurately predict the flow and combustion fields within the range of the training set, but also shows an extrapolation ability for prediction outside the training set. The results from the CFDNN solver show excellent consistency with the conventional CFD results in terms of both predicted spatial distributions and temporal dynamics. Meanwhile, two orders of magnitude of acceleration is achieved by using the CFDNN solver compared to the conventional CFD solver. The successful development of such a deep learning-based solver opens up new possibilities of low-cost, high-accuracy simulations, fast prototyping, design optimization and real-time control of combustion systems such as gas turbines and scramjets.

preprint2020arXiv

A Model Checking-based Analysis Framework for Systems Biology Models

Biological systems are often modeled as a system of ordinary differential equations (ODEs) with time-invariant parameters. However, cell signaling events or pharmacological interventions may alter the cellular state and induce multi-mode dynamics of the system. Such systems are naturally modeled as hybrid automata, which possess multiple operational modes with specific nonlinear dynamics in each mode. In this paper we introduce a model checking-enabled framework that can model and analyze both single- and multi-mode biological systems. We tackle the central problem in systems biology--identifying parameter values such that a model satisfies desired behaviors--using bounded model checking. We resort to the delta-decision procedures to solve satisfiability modulo theories (SMT) problems and sidestep undecidability of reachability problems. Our framework enables several analysis tasks including model calibration and falsification, therapeutic strategy identification, and Lyapunov stability analysis. We demonstrate the applicability of these methods using case studies of prostate cancer progression, cardiac cell action potential and radiation diseases.

preprint2020arXiv

Artificial neural network based chemical mechanisms for computationally efficient modeling of kerosene combustion

To effectively simulate the combustion of hydrocarbon-fueled supersonic engines, such as rocket-based combined cycle (RBCC) engines, a detailed mechanism for chemistry is usually required but computationally prohibitive. In order to accelerate chemistry calculation, an artificial neural network (ANN) based methodology was introduced in this study. This methodology consists of two different layers: a self-organizing map (SOM) and a back-propagation neural network (BPNN). The SOM clusters the dataset into subsets to reduce the nonlinearity, while the BPNN performs regression for each subset. The entire methodology was subsequently employed to establish a skeleton mechanism of kerosene combustion with 41 species. The training data was generated by RANS simulations of the RBCC combustion chamber, and then fed into the SOM-BPNN with six different topologies (three different SOM topologies and two different BPNN topologies). By comparing the predicted results of the six cases with those of the conventional ODE solver, it is found that if the topology is properly designed, high-precision results in terms of ignition, quenching and mass fraction prediction can be achieved. As for efficiency, a speedup of 8 to 20 times in the chemical system integration was achieved, indicating that the methodology has great potential for application in complex chemical mechanisms for a variety of fuels.

preprint2020arXiv

Computational Performance of a Germline Variant Calling Pipeline for Next Generation Sequencing

With the boom of next generation sequencing technology and its implementation in clinical practice and life science research, the need for faster and more efficient data analysis methods has become pressing in the field of sequencing. Here we report on the evaluation of an optimized germline mutation calling pipeline, HummingBird, by assessing its performance against the widely accepted BWA-GATK pipeline. We found that the HummingBird pipeline can significantly reduce the running time of the primary data analysis for whole genome sequencing and whole exome sequencing without significantly sacrificing the variant calling accuracy. Thus, we conclude that expanded use of such software will help to improve the primary data analysis efficiency for next generation sequencing.

preprint2020arXiv

Continual Learning in Task-Oriented Dialogue Systems

Continual learning in task-oriented dialogue systems can allow us to add new domains and functionalities through time without incurring the high cost of a whole system retraining. In this paper, we propose a continual learning benchmark for task-oriented dialogue systems with 37 domains to be learned continuously in four settings: intent recognition, state tracking, natural language generation, and end-to-end. Moreover, we implement and compare multiple existing continual learning baselines, and we propose a simple yet effective architectural method based on residual adapters. Our experiments demonstrate that the proposed architectural method and a simple replay-based strategy perform comparably well, but both achieve inferior performance to the multi-task learning baseline, in which all the data are shown at once, showing that continual learning in task-oriented dialogue systems is a challenging task. Furthermore, we reveal several trade-offs between different continual learning methods in terms of parameter usage and memory size, which are important in the design of a task-oriented dialogue system. The proposed benchmark is released together with several baselines to promote more research in this direction.

preprint2020arXiv

COVID-19 Evolves in Human Hosts

Today, we are all threatened by an unprecedented pandemic: COVID-19. How different is it from other coronaviruses? Will it be attenuated or become more virulent? Which animals may be its original host? In this study, we collected and analyzed nearly thirty thousand publicly available complete genome sequences for COVID-19 virus from 79 different countries, the previously known flu-causing coronaviruses (HCov-229E, HCov-OC43, HCov-NL63 and HCov-HKU1) and the lethal, pathogenic viruses, SARS, MERS, Victoria, Lassa, Yamagata, Ebola, and Dengue. We found strong similarities between the currently circulating COVID-19 and SARS and MERS, as well as COVID-19 in rhinolophines and pangolins. On the contrary, COVID-19 shares little similarity with the flu-causing coronaviruses and the other known viruses. Strikingly, we observed that the divergence of COVID-19 strains isolated from human hosts has steadily increased from December 2019 to May 2020, suggesting COVID-19 is actively evolving in human hosts. In this paper, we first propose a novel MLCS algorithm, NP-MLCS1, for big sequence analysis, which can calculate the common model for COVID-19 complete genome sequences to provide important information for vaccine and antibody development. Geographic and time-course analysis of the evolution trees of the human COVID-19 reveals possible evolutionary paths among strains from 79 countries. This finding has important implications for the management of COVID-19 and the development of vaccines and medications.

preprint2020arXiv

Detecting Domain Polarity-Changes of Words in a Sentiment Lexicon

Sentiment lexicons are instrumental for sentiment analysis. One can use a set of sentiment words provided in a sentiment lexicon and a lexicon-based classifier to perform sentiment classification. One major issue with this approach is that many sentiment words are domain dependent. That is, they may be positive in some domains but negative in some others. We refer to this problem as domain polarity-changes of words. Detecting such words and correcting their sentiment for an application domain is very important. In this paper, we propose a graph-based technique to tackle this problem. Experimental results show its effectiveness on multiple real-world datasets.

preprint2020arXiv

DomBERT: Domain-oriented Language Model for Aspect-based Sentiment Analysis

This paper focuses on learning domain-oriented language models driven by end tasks, which aims to combine the worlds of both general-purpose language models (such as ELMo and BERT) and domain-specific language understanding. We propose DomBERT, an extension of BERT to learn from both an in-domain corpus and relevant domain corpora. This helps in learning domain language models with low resources. Experiments are conducted on an assortment of tasks in aspect-based sentiment analysis, demonstrating promising results.

preprint2020arXiv

User Memory Reasoning for Conversational Recommendation

We study a conversational recommendation model which dynamically manages users' past (offline) preferences and current (online) requests through a structured and cumulative user memory knowledge graph, to allow for natural interactions and accurate recommendations. For this study, we create a new Memory Graph (MG) <--> Conversational Recommendation parallel corpus called MGConvRex with 7K+ human-to-human role-playing dialogs, grounded on a large-scale user memory bootstrapped from real-world user scenarios. MGConvRex captures human-level reasoning over user memory and has disjoint training/testing sets of users for zero-shot (cold-start) reasoning for recommendation. We propose a simple yet expandable formulation for constructing and updating the MG, and a reasoning model that predicts optimal dialog policies and recommendation items in unconstrained graph space. The prediction of our proposed model inherits the graph structure, providing a natural way to explain the model's recommendation. Experiments are conducted for both offline metrics and online simulation, showing competitive results.

preprint2019arXiv

Real-Space Investigation of the Charge Density Wave in VTe2 Monolayer with Rotational and Mirror Symmetries Broken

Recently the charge density wave (CDW) in vanadium dichalcogenides has attracted increasing research interest, but a real-space investigation of the symmetry breaking of the CDW state in VTe2 monolayer is still lacking. We have investigated the CDW of VTe2 monolayer by low energy electron diffraction (LEED) and scanning tunneling microscopy (STM). While the LEED experiments revealed a (4×4) CDW transition at 192±2 K, our low-temperature STM experiments resolved the (4×4) lattice distortions and charge-density modulation in real space, and further unveiled a 1D modulation that breaks the three-fold rotational and mirror symmetries in the CDW state. In accordance with the CDW state at low temperature, a CDW gap of 12 meV was detected by scanning tunneling spectroscopy (STS) at 4.9 K. Our work provides real-space evidence of the symmetry breaking of the (4×4) CDW state in VTe2 monolayer, and implies that a mechanism beyond the conventional Fermi surface nesting or the q-dependent electron-phonon coupling is responsible for the formation of the CDW state in VTe2 monolayer.