Researcher profile

Ke-Yin Chen

Ke-Yin Chen contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 13 - Unverified | Verification L1 | Unclaimed author
2 works
0 followers
5 topics
4 close collaborators

Published work

2 published items

Preprint | 2022 | arXiv

Pretrained Domain-Specific Language Model for General Information Retrieval Tasks in the AEC Domain

As an essential task for the architecture, engineering, and construction (AEC) industry, information retrieval (IR) from unstructured textual data based on natural language processing (NLP) is gaining increasing attention. Although various deep learning (DL) models for IR tasks have been investigated in the AEC domain, it is still unclear how domain corpora and domain-specific pretrained DL models can improve performance in various IR tasks. To this end, this work systematically explores the impacts of domain corpora and various transfer learning techniques on the performance of DL models for IR tasks and proposes a pretrained domain-specific language model for the AEC domain. First, both in-domain and close-domain corpora are developed. Then, two types of pretrained models, traditional word embedding models and BERT-based models, are pretrained based on various domain corpora and transfer learning strategies. Finally, several widely used DL models for IR tasks are further trained and tested based on various configurations and pretrained models. The results show that domain corpora have opposite effects on traditional word embedding models for text classification and named entity recognition tasks but can further improve the performance of BERT-based models in all tasks. Meanwhile, BERT-based models dramatically outperform traditional methods in all IR tasks, with maximum improvements of 5.4% and 10.1% in the F1 score, respectively. This research contributes to the body of knowledge in two ways: 1) demonstrating the advantages of domain corpora and pretrained DL models and 2) opening the first domain-specific dataset and pretrained language model for the AEC domain, to the best of our knowledge. Thus, this work sheds light on the adoption and application of pretrained models in the AEC domain.

Preprint | 2022 | arXiv

The Classification of Blazars Candidates of Uncertain Types

In this work, the support vector machine (SVM) method is adopted to separate BL Lacertae objects (BL Lacs) and flat spectrum radio quasars (FSRQs) in three plots: photon spectrum index against photon flux, $α_{\rm ph} \sim {\rm log}\,F$; photon spectrum index against variability index, $α_{\rm ph} \sim {\rm log}\,{V\!I}$; and variability index against photon flux, ${\rm log}\,{V\!I} \sim {\rm log}\,F$. We then use the dividing lines to distinguish BL Lacs from FSRQs among the blazar candidates of uncertain type in the \textit{Fermi}/LAT catalogue. Our main conclusions are:
1. BL Lacs and FSRQs are separated by $α_{\rm ph} = -0.123\,{\rm log}\,F + 1.170$ in the $α_{\rm ph} \sim {\rm log}\,F$ plot, $α_{\rm ph} = -0.161\,{\rm log}\,{V\!I} + 2.594$ in the $α_{\rm ph} \sim {\rm log}\,{V\!I}$ plot, and ${\rm log}\,{V\!I} = 0.792\,{\rm log}\,F + 9.203$ in the ${\rm log}\,{V\!I} \sim {\rm log}\,F$ plot.
2. We obtain 932 BL Lac candidates and possible BL Lac candidates, and 585 FSRQ candidates and possible FSRQ candidates.
3. We discuss how these results compare with those in the literature.
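
To make the three dividing lines quoted in the abstract concrete, the following is a minimal Python sketch that evaluates how far a source lies above or below each line. The coefficients come from the abstract; the names `Line` and `offsets` and the sample input values are hypothetical and chosen only for illustration. The abstract does not state which side of each line corresponds to BL Lacs or FSRQs, so the sketch reports signed offsets and deliberately assigns no class label.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Line:
    """A dividing line y = slope * x + intercept in one of the three planes."""
    slope: float
    intercept: float

    def offset(self, x: float, y: float) -> float:
        """Vertical offset of (x, y) from the line; positive means above it."""
        return y - (self.slope * x + self.intercept)


# Coefficients as quoted in the abstract.
ALPHA_VS_LOGF = Line(slope=-0.123, intercept=1.170)   # alpha_ph against log F
ALPHA_VS_LOGVI = Line(slope=-0.161, intercept=2.594)  # alpha_ph against log VI
LOGVI_VS_LOGF = Line(slope=0.792, intercept=9.203)    # log VI against log F


def offsets(alpha_ph: float, log_f: float, log_vi: float) -> dict:
    """Signed offsets of one source from each of the three dividing lines.

    No class is assigned here: mapping the sign of an offset to BL Lac or
    FSRQ would be an assumption beyond what the abstract states.
    """
    return {
        "alpha_vs_logF": ALPHA_VS_LOGF.offset(log_f, alpha_ph),
        "alpha_vs_logVI": ALPHA_VS_LOGVI.offset(log_vi, alpha_ph),
        "logVI_vs_logF": LOGVI_VS_LOGF.offset(log_f, log_vi),
    }


if __name__ == "__main__":
    # Hypothetical source values, not taken from any catalogue.
    for name, value in offsets(alpha_ph=2.0, log_f=-9.0, log_vi=1.5).items():
        print(f"{name}: {value:+.4f}")
```

Keeping the three planes as separate `Line` objects mirrors the abstract, which reports each dividing line independently and does not describe how the three are combined into the "candidate" and "possible candidate" groups.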