Source author record

Sheng Yu

Sheng Yu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Computation and Language, Formal Languages and Automata Theory, Machine Learning, Social and Information Networks, Artificial Intelligence, math.OC, physics.soc-ph, quant-ph, Computational Complexity, cond-mat.mes-hall, cond-mat.mtrl-sci, physics.comp-ph

Catalog footprint

What is connected

24 works

12 topics

4 close collaborators

preprint · 2022 · arXiv

An Accurate Unsupervised Method for Joint Entity Alignment and Dangling Entity Detection

Knowledge graph integration typically suffers from widespread dangling entities that have no counterpart across knowledge graphs (KGs). The dangling entity set is unavailable in most real-world scenarios, and manually mining entity pairs with the same meaning is labor-intensive. In this paper, we propose a novel accurate Unsupervised method for joint Entity alignment (EA) and Dangling entity detection (DED), called UED. UED mines literal semantic information to generate pseudo entity pairs and globally guided alignment information for EA, and then uses the EA results to assist DED. We construct a medical cross-lingual knowledge graph dataset, MedED, providing data for both the EA and DED tasks. Extensive experiments demonstrate that in the EA task, UED achieves results comparable to those of state-of-the-art supervised EA baselines and outperforms the current state-of-the-art EA methods when combined with supervised EA data. For the DED task, UED obtains high-quality results without supervision.

preprint · 2022 · arXiv

Automatic Biomedical Term Clustering by Learning Fine-grained Term Representations

Term clustering is important in biomedical knowledge graph construction. Similarities between term embeddings are helpful for term clustering. State-of-the-art term embeddings leverage pretrained language models to encode terms, and use synonyms and relation knowledge from knowledge graphs to guide contrastive learning. These embeddings place terms belonging to the same concept close together. However, our probing experiments show that these embeddings are not sensitive to minor textual differences, which leads to failures in biomedical term clustering. To alleviate this problem, we adjust the sampling strategy in pretraining term embeddings by providing dynamic hard positive and negative samples during contrastive learning, so as to learn fine-grained representations that result in better biomedical term clustering. We name our proposed method CODER++, and it has been applied to clustering biomedical concepts in the newly released biomedical knowledge graph named BIOS.

preprint · 2022 · arXiv

BioBART: Pretraining and Evaluation of A Biomedical Generative Language Model

Pretrained language models have served as important backbones for natural language processing. Recently, in-domain pretraining has been shown to benefit various domain-specific downstream tasks. In the biomedical domain, natural language generation (NLG) tasks are of critical importance, while understudied. Approaching natural language understanding (NLU) tasks as NLG achieves satisfying performance in the general domain through constrained language generation or language prompting. We emphasize the lack of in-domain generative language models and the unsystematic generative downstream benchmarks in the biomedical domain, hindering the development of the research community. In this work, we introduce the generative language model BioBART that adapts BART to the biomedical domain. We collate various biomedical language generation tasks including dialogue, summarization, entity linking, and named entity recognition. BioBART pretrained on PubMed abstracts has enhanced performance compared to BART and set strong baselines on several tasks. Furthermore, we conduct ablation studies on the pretraining tasks for BioBART and find that sentence permutation has negative effects on downstream tasks.

preprint · 2022 · arXiv

BIOS: An Algorithmically Generated Biomedical Knowledge Graph

Biomedical knowledge graphs (BioMedKGs) are essential infrastructures for biomedical and healthcare big data and artificial intelligence (AI), facilitating natural language processing, model development, and data exchange. For decades, these knowledge graphs have been developed via expert curation; however, this method can no longer keep up with today's AI development, and a transition to algorithmically generated BioMedKGs is necessary. In this work, we introduce the Biomedical Informatics Ontology System (BIOS), the first large-scale publicly available BioMedKG generated completely by machine learning algorithms. BIOS currently contains 4.1 million concepts, 7.4 million terms in two languages, and 7.3 million relation triplets. We present the methodology for developing BIOS, including the curation of raw biomedical terms, computational identification of synonymous terms and aggregation of these terms to create concept nodes, semantic type classification of the concepts, relation identification, and biomedical machine translation. We provide statistics on the current BIOS content and perform preliminary assessments of term quality, synonym grouping, and relation extraction. The results suggest that machine learning-based BioMedKG development is a viable alternative to traditional expert curation.

preprint · 2022 · arXiv

Efficient Symptom Inquiring and Diagnosis via Adaptive Alignment of Reinforcement Learning and Classification

Medical automatic diagnosis aims to imitate human doctors in real-world diagnostic processes and to achieve accurate diagnoses by interacting with the patients. The task is formulated as a sequential decision-making problem with a series of symptom inquiring steps and the final diagnosis. Recent research has studied incorporating reinforcement learning for symptom inquiring and classification techniques for disease diagnosis, respectively. However, studies on efficiently and effectively combining the two procedures are still lacking. To address this issue, we devise an adaptive mechanism to align reinforcement learning and classification methods using distribution entropy as the medium. Additionally, we created a new dataset for patient simulation to address the lack of large-scale evaluation benchmarks. The dataset is extracted from the MedlinePlus knowledge base and contains significantly more diseases and more comprehensive symptoms and examination information than existing datasets. Experimental evaluation shows that our method outperforms three current state-of-the-art methods on different datasets by achieving higher medical diagnosis accuracy with fewer inquiring turns.
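
The abstract above names distribution entropy as the link between symptom inquiry and classification. The following is a minimal sketch of that general idea, not the paper's adaptive mechanism: it assumes a fixed entropy threshold as the stopping rule. The function names, the threshold value, and the example distributions are all hypothetical.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def should_stop_inquiring(disease_probs, threshold=0.5):
    """Hypothetical rule: stop asking about symptoms once the classifier's
    disease distribution has entropy at or below the threshold."""
    return shannon_entropy(disease_probs) <= threshold


# Illustrative classifier outputs over four candidate diseases (made-up numbers).
uncertain = [0.25, 0.25, 0.25, 0.25]
confident = [0.94, 0.02, 0.02, 0.02]

print(should_stop_inquiring(uncertain))  # high entropy: keep inquiring
print(should_stop_inquiring(confident))  # low entropy: hand over to diagnosis
```

A flat distribution has maximal entropy, so the sketch keeps inquiring. A peaked distribution falls below the threshold and triggers the final diagnosis.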

preprint · 2022 · arXiv

Generative Biomedical Entity Linking via Knowledge Base-Guided Pre-training and Synonyms-Aware Fine-tuning

Entities lie at the heart of biomedical natural language understanding, and the biomedical entity linking (EL) task remains challenging due to fine-grained and diverse concept names. Generative methods achieve remarkable performance in general-domain EL with less memory usage, but require expensive pre-training. Previous biomedical EL methods leverage synonyms from knowledge bases (KBs), which are not trivial to inject into a generative method. In this work, we use a generative approach to model biomedical EL and propose to inject synonym knowledge into it. We propose KB-guided pre-training, which constructs synthetic samples with synonyms and definitions from the KB and requires the model to recover concept names. We also propose synonyms-aware fine-tuning to select concept names for training, and propose a decoder prompt and a multi-synonym constrained prefix tree for inference. Our method achieves state-of-the-art results on several biomedical EL tasks without candidate selection, which demonstrates the effectiveness of the proposed pre-training and fine-tuning strategies.

preprint · 2022 · arXiv

Multimodal Learning on Graphs for Disease Relation Extraction

Objective: Disease knowledge graphs are a way to connect, organize, and access disparate information about diseases with numerous benefits for artificial intelligence (AI). To create knowledge graphs, it is necessary to extract knowledge from multimodal datasets in the form of relationships between disease concepts and normalize both concepts and relationship types. Methods: We introduce REMAP, a multimodal approach for disease relation extraction and classification. The REMAP machine learning approach jointly embeds a partial, incomplete knowledge graph and a medical language dataset into a compact latent vector space, followed by aligning the multimodal embeddings for optimal disease relation extraction. Results: We apply the REMAP approach to a disease knowledge graph with 96,913 relations and a text dataset of 1.24 million sentences. On a dataset annotated by human experts, REMAP improves text-based disease relation extraction by 10.0% (accuracy) and 17.2% (F1-score) by fusing disease knowledge graphs with text information. Further, REMAP leverages text information to recommend new relationships in the knowledge graph, outperforming graph-based methods by 8.4% (accuracy) and 10.4% (F1-score). Conclusion: REMAP is a multimodal approach for extracting and classifying disease relationships by fusing structured knowledge and text information. REMAP provides a flexible neural architecture to easily find, access, and validate AI-driven relationships between disease concepts.

preprint · 2022 · arXiv

Semi-constraint Optimal Transport for Entity Alignment with Dangling Cases

Entity alignment (EA) merges knowledge graphs (KGs) by identifying the equivalent entities in different graphs, which can effectively enrich knowledge representations of KGs. However, in practice, different KGs often include dangling entities whose counterparts cannot be found in the other graph, which limits the performance of EA methods. To improve EA with dangling entities, we propose an unsupervised method called Semi-constraint Optimal Transport for Entity Alignment in Dangling cases (SoTead). Our main idea is to model the entity alignment between two KGs as an optimal transport problem from one KG's entities to the other's. First, we set pseudo entity pairs between KGs based on pretrained word embeddings. Then, we conduct contrastive metric learning to obtain the transport cost between each entity pair. Finally, we introduce a virtual entity for each KG to "align" the dangling entities from the other KG, which relaxes the optimization constraints and leads to a semi-constraint optimal transport. In the experiments, we first show the superiority of SoTead on a commonly used entity alignment dataset. In addition, to compare its dangling entity detection ability with other baselines, we construct a medical cross-lingual knowledge graph dataset, MedED, where SoTead also reaches state-of-the-art performance.

preprint · 2022 · arXiv

Sentence Alignment with Parallel Documents Facilitates Biomedical Machine Translation

Objective: Today's neural machine translation (NMT) can achieve near human-level translation quality and greatly facilitates international communications, but the lack of parallel corpora poses a key problem to the development of translation systems for highly specialized domains, such as biomedicine. This work presents an unsupervised algorithm for deriving parallel corpora from document-level translations by using sentence alignment and explores how training materials affect the performance of biomedical NMT systems. Materials and Methods: Document-level translations are mixed to train bilingual word embeddings (BWEs) for the evaluation of cross-lingual word similarity, and sentence distance is defined by combining semantic and positional similarities of the sentences. The alignment of sentences is formulated as an extended earth mover's distance problem. A Chinese-English biomedical parallel corpus is derived with the proposed algorithm using bilingual articles from UpToDate and translations of PubMed abstracts, which is then used for the training and evaluation of NMT. Results: On two manually aligned translation datasets, the proposed algorithm achieved accurate sentence alignment in the 1-to-1 cases and outperformed competing algorithms in the many-to-many cases. The NMT model fine-tuned on biomedical data significantly improved the in-domain translation quality (zh-en: +17.72 BLEU; en-zh: +17.02 BLEU). Both the size of the training data and the combination of different corpora can significantly affect the model's performance. Conclusion: The proposed algorithm relaxes the assumption for sentence alignment and effectively generates accurate translation pairs that facilitate training high quality biomedical NMT models.

preprint · 2022 · arXiv

SpinQ Triangulum: a commercial three-qubit desktop quantum computer

SpinQ Triangulum is the second generation of the desktop quantum computers designed and manufactured by SpinQ Technology. SpinQ's desktop quantum computer series, based on a room-temperature NMR spectrometer, provides lightweight, cost-effective and maintenance-free quantum computing platforms that aim to provide real-device experience for quantum computing education at the K-12 and college levels. These platforms also feature quantum control design capabilities for studying quantum control and quantum noise. Compared with the first-generation product, the two-qubit SpinQ Gemini, Triangulum features a three-qubit QPU, smaller dimensions (61 × 33 × 56 cm^3) and a lighter weight (40 kg). Furthermore, the magnetic field is more stable and the quantum control is more accurate. This paper introduces the system design of Triangulum and its new features. As an example of performing quantum computing tasks, we present the implementation of the Harrow-Hassidim-Lloyd (HHL) algorithm on Triangulum, demonstrating Triangulum's capability of undertaking complex quantum computing tasks. SpinQ will continue to develop desktop quantum computing platforms with more qubits. Meanwhile, a simplified version of SpinQ Gemini, namely Gemini Mini (https://www.spinq.cn/products#geminiMini-anchor), has been recently realised. Gemini Mini is much more portable (20 × 35 × 26 cm^3, 14 kg) and affordable for most K-12 schools around the world.

preprint · 2021 · arXiv

SpinQ Gemini: a desktop quantum computer for education and research

SpinQ Gemini is a commercial desktop quantum computer designed and manufactured by SpinQ Technology. It is an integrated hardware-software system. The first generation product with two qubits was launched in January 2020. The hardware is based on an NMR spectrometer, with permanent magnets providing a ~1 T magnetic field. SpinQ Gemini operates at room temperature (0-30 °C) and is lightweight (55 kg with a volume of 70 × 40 × 80 cm^3), cost-effective (under 50k USD), and maintenance-free. SpinQ Gemini aims to provide real-device experience for quantum computing education at the K-12 and college levels. It also features quantum control design capabilities that benefit researchers studying quantum control and quantum noise. Since its first launch, SpinQ Gemini has been shipped to institutions in Canada, Taiwan and Mainland China. This paper introduces the system design of SpinQ Gemini, from hardware to software. We also demonstrate examples of performing quantum computing tasks on SpinQ Gemini, including one task for a variational quantum eigensolver of a two-qubit Heisenberg model. The next generations of SpinQ quantum computing devices will adopt models with more qubits and advanced control functions for researchers at comparable cost, as well as simplified models at much lower cost (under 5k USD) for K-12 education. We believe that low-cost portable quantum computer products will facilitate hands-on experience for teaching quantum computing at all levels and prepare younger generations of students and researchers well for the future of quantum technologies.

preprint · 2020 · arXiv

High-throughput relation extraction algorithm development associating knowledge articles and electronic health records

Objective: Medical relations are the core components of medical knowledge graphs that are needed for healthcare artificial intelligence. However, the requirement of expert annotation by conventional algorithm development processes creates a major bottleneck for mining new relations. In this paper, we present Hi-RES, a framework for high-throughput relation extraction algorithm development. We also show that combining knowledge articles with electronic health records (EHRs) significantly increases the classification accuracy. Methods: We use relation triplets obtained from structured databases and semistructured webpages to label sentences from target corpora as positive training samples. Two methods are also provided for creating improved negative samples by combining positive samples with naïve negative samples. We propose a common model that summarizes sentence information using large-scale pretrained language models and multi-instance attention, which then joins with the concept embeddings trained from the EHRs for relation prediction. Results: We apply the Hi-RES framework to develop classification algorithms for disorder-disorder relations and disorder-location relations. Millions of sentences are created as training data. Using pretrained language models and EHR-based embeddings individually provides considerable accuracy increases over those of previous models. Joining them together further tremendously increases the accuracy to 0.947 and 0.998 for the two sets of relations, respectively, which are 10-17 percentage points higher than those of previous models. Conclusion: Hi-RES is an efficient framework for achieving high-throughput and accurate relation extraction algorithm development.

preprint · 2016 · arXiv

Dirac Fermions induced in strained zigzag phosphorus nanotubes and the applications in field effect transistors

In this work, Dirac fermions have been obtained and engineered in one-dimensional (1D) zigzag phosphorus nanotubes (ZPNTs). We have performed a comprehensive first-principles computational study of the electronic properties of ZPNTs with various diameters. The results indicate that as the lattice parameter (Lc) along the axial direction increases, ZPNTs undergo transitions from metal to semimetal and from semimetal to semiconductor, whereas Dirac fermions appear at Lc ranging from 3.90 Å to 4.10 Å. In particular, a field effect transistor (FET) based on a 12-ZPNT (with 12 unit cells in the transverse direction) exhibits semiconductor behavior with efficient gate-effect modulation at Lc = 4.60 Å. However, only weak gate modulation is demonstrated when the nanotube becomes a semimetal at Lc = 4.10 Å. This study indicates that ZPNTs are highly appealing for applications in strain sensors. Our findings pave the way for the development of high-performance strain-engineered electronics based on Dirac fermions in 1D materials.

preprint · 2016 · arXiv

Strain effect engineered in α-Al2O3/monolayer MoS2 interface by first principle calculations

With the advances in metal oxide semiconductor field effect transistors (MOSFETs) based on low-dimensional transition metal dichalcogenides (TMDCs), the interface between semiconductors and dielectrics has received considerable attention due to its dramatic effects on the morphology and charge transport of semiconductors. In this study, first-principles calculations were used to investigate the strain effect induced by the interface between Al2O3 (0001) and monolayer MoS2. The results indicate that Al2O3 with a thickness of 1.3 nm can apply a strain of 0.3% on the MoS2 monolayer. The strain effect increases monotonically with the thickness of the dielectric layer. The study of the temperature effect also indicates a monotonic lattice expansion induced by higher temperature. Our study proposes that dielectric engineering can be an effective tool for controlling strain in nanotechnology.

preprint · 2015 · arXiv

A Survey on Operational State Complexity

Descriptional complexity is the study of the conciseness of the various models representing formal languages. The state complexity of a regular language is the size, measured by the number of states of the smallest, either deterministic or nondeterministic, finite automaton that recognises it. Operational state complexity is the study of the state complexity of operations over languages. In this survey, we review the state complexities of individual regularity preserving language operations on regular and some subregular languages. Then we revisit the state complexities of the combination of individual operations. We also review methods of estimation and approximation of state complexity of more complex combined operations.
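
As an illustration of the deterministic state complexity defined above, the following sketch counts the states of the minimal complete DFA equivalent to a given complete DFA. It is a minimal sketch under stated assumptions: it uses Moore-style partition refinement (a standard textbook technique, not something taken from the survey), the function name `minimal_dfa_size` and the dictionary encoding of the transition function are hypothetical, and it does not cover the nondeterministic case.

```python
def minimal_dfa_size(alphabet, delta, start, finals):
    """Number of states of the minimal complete DFA equivalent to the given
    complete DFA. delta maps (state, symbol) -> state."""
    # Keep only the states reachable from the start state.
    reachable, stack = {start}, [start]
    while stack:
        q = stack.pop()
        for a in alphabet:
            r = delta[(q, a)]
            if r not in reachable:
                reachable.add(r)
                stack.append(r)

    # Start from the final / non-final split and refine until it is stable.
    block = {q: int(q in finals) for q in reachable}
    while True:
        ids, refined = {}, {}
        for q in reachable:
            # Two states stay together only if they are in the same block
            # and their successors agree block-wise on every symbol.
            sig = (block[q], tuple(block[delta[(q, a)]] for a in alphabet))
            refined[q] = ids.setdefault(sig, len(ids))
        if len(ids) == len(set(block.values())):
            return len(ids)
        block = refined


# A redundant 3-state DFA for the language (a+b)*b: states 0 and 2 are equivalent.
delta = {
    (0, "a"): 2, (0, "b"): 1,
    (1, "a"): 0, (1, "b"): 1,
    (2, "a"): 0, (2, "b"): 1,
}
print(minimal_dfa_size("ab", delta, start=0, finals={1}))
```

Each refinement round can only split blocks, so the loop stops as soon as the number of blocks no longer grows.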

preprint · 2015 · arXiv

Computing Supply Function Equilibria via Spline Approximations

The supply function equilibrium (SFE) is a model for competition in markets where each firm offers a schedule of prices and quantities to face demand uncertainty, and has been successfully applied to wholesale electricity markets. However, characterizing the SFE is difficult, both analytically and numerically. In this paper, we first present a specialized algorithm for capacity constrained asymmetric duopoly markets with affine costs. We show that solving the first order conditions (a system of differential equations) using spline approximations is equivalent to solving a least squares problem, which makes the algorithm highly efficient. We also propose using splines as a way to improve a recently introduced general algorithm, so that the equilibrium can be found more easily and faster with less user intervention. We show asymptotic convergence of the approximations to the true equilibria for both algorithms, and illustrate their performance with numerical examples.

preprint · 2012 · arXiv

A Survey of Prediction Using Social Media

Social media comprises interactive applications and platforms for creating, sharing and exchanging user-generated content. The past ten years have brought huge growth in social media, especially online social networking services, which is changing the ways we organize and communicate. It aggregates the opinions and feelings of diverse groups of people at low cost. Mining the attributes and contents of social media gives us an opportunity to discover social structure characteristics, analyze action patterns qualitatively and quantitatively, and sometimes predict future human-related events. In this paper, we first discuss the realms that can be predicted with current social media, then give an overview of available predictors and prediction techniques, and finally discuss challenges and possible future directions.

preprint · 2012 · arXiv

An Empirical Study of How Users Adopt Famous Entities

Users of social networking services construct their personal social networks by creating asymmetric and symmetric social links. Users usually follow friends and selected famous entities that include celebrities and news agencies. In this paper, we investigate how users follow famous entities. We statically and dynamically analyze data within a huge social networking service with a manually classified set of famous entities. The results show that the in-degree of famous entities does not fit a power-law distribution. Conversely, the maximum number of famous followees in one category for each user shows a power-law property. To the best of our knowledge, there is no prior research on this topic using a real-life, human-chosen dataset of famous entities. These findings might be helpful in microblogging marketing and user classification.

preprint · 2011 · arXiv

Introducing the Adaptive Convex Enveloping

Convexity, though extremely important in mathematical programming, has not drawn enough attention in the field of dynamic programming. This paper gives conditions for verifying convexity of the cost-to-go functions, and introduces an accurate, fast and reliable algorithm for solving convex dynamic programs with multivariate continuous states and actions, called Adaptive Convex Enveloping. This is a short introduction to the core technique created and used in my dissertation, so it is less formal than a full journal paper and omits some parts, such as a literature review and references.

preprint · 2010 · arXiv

State Complexity of Catenation Combined with Star and Reversal

This paper is a continuation of our research work on state complexity of combined operations. Motivated by applications, we study the state complexities of two particular combined operations: catenation combined with star and catenation combined with reversal. We show that the state complexities of both of these combined operations are considerably less than the compositions of the state complexities of their individual participating operations.

preprint · 2010 · arXiv

State Complexity of Two Combined Operations: Reversal-Catenation and Star-Catenation

In this paper, we show that, due to the structural properties of the resulting automaton obtained from a prior operation, the state complexity of a combined operation may not be equal but close to the mathematical composition of the state complexities of its component operations. In particular, we provide two witness combined operations: reversal combined with catenation and star combined with catenation.

preprint · 2010 · arXiv

State complexity of union and intersection combined with star and reversal

In this paper, we study the state complexities of union and intersection combined with star and reversal, respectively. We obtain the state complexities of these combined operations on regular languages and show that they are less than the mathematical composition of the state complexities of their individual participating operations.
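
To make "mathematical composition" concrete, the block below lists the well-known tight bounds for the individual operations on regular languages $L_1$ and $L_2$ accepted by complete DFAs with $m$ and $n$ states, and then composes two of them. The combination $L_1^{*} \cup L_2$ is chosen here purely as an illustration and is an assumption; the paper's exact results are not reproduced.

```latex
% Known worst-case state complexities of the individual operations
% (the star bound holds for m >= 2):
\begin{align*}
  \mathrm{sc}(L_1 \cup L_2) &\le mn, &
  \mathrm{sc}(L_1 \cap L_2) &\le mn, \\
  \mathrm{sc}(L_1^{*}) &\le 2^{m-1} + 2^{m-2}, &
  \mathrm{sc}(L_1^{R}) &\le 2^{m}.
\end{align*}
% Naive composition for the illustrative combination L_1^* \cup L_2:
% plug the star bound into the union bound.
\[
  \mathrm{sc}(L_1^{*} \cup L_2) \;\le\; \bigl(2^{m-1} + 2^{m-2}\bigr)\, n
  \;=\; 3 \cdot 2^{m-2}\, n .
\]
```

The abstract's claim is that the true state complexity of such combined operations lies below the bound obtained by this kind of composition.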

preprint · 2010 · arXiv

Transition Complexity of Incomplete DFAs

In this paper, we consider the transition complexity of regular languages based on the incomplete deterministic finite automata. A number of results on Boolean operations have been obtained. It is shown that the transition complexity results for union and complementation are very different from the state complexity results for the same operations. However, for intersection, the transition complexity result is similar to that of state complexity.

preprint · 2009 · arXiv

Hierarchy and equivalence of multi-letter quantum finite automata

Multi-letter quantum finite automata (QFAs) are a new one-way QFA model proposed recently by Belovs, Rosmanis, and Smotrovs (LNCS, Vol. 4588, Springer, Berlin, 2007, pp. 60-71), who showed that multi-letter QFAs can accept with no error some regular languages (such as $(a+b)^{*}b$) that are unacceptable by one-way QFAs. In this paper, we continue to study multi-letter QFAs. We mainly focus on two issues: (1) we show that $(k+1)$-letter QFAs are computationally more powerful than $k$-letter QFAs, that is, $(k+1)$-letter QFAs can accept some regular languages that are unacceptable by any $k$-letter QFA. A comparison with one-way QFAs is made through some examples; (2) we prove that a $k_{1}$-letter QFA $\mathcal{A}_{1}$ and another $k_{2}$-letter QFA $\mathcal{A}_{2}$ are equivalent if and only if they are $((n_{1}+n_{2})^{4}+k-1)$-equivalent, and the time complexity of determining the equivalence of two multi-letter QFAs using this method is $O(n^{12}+k^{2}n^{4}+kn^{8})$, where $n_{1}$ and $n_{2}$ are the numbers of states of $\mathcal{A}_{1}$ and $\mathcal{A}_{2}$, respectively, and $k=\max(k_{1},k_{2})$. Some other issues are addressed for further consideration.
