Source author record

Mutian He

Mutian He appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Computation and Language, Artificial Intelligence, eess.AS, Machine Learning, Sound

Catalog footprint

What is connected

3 works

5 topics

4 close collaborators

Preprint, 2022, arXiv

Neural Lexicon Reader: Reduce Pronunciation Errors in End-to-end TTS by Leveraging External Textual Knowledge

End-to-end TTS requires a large amount of paired speech/text data to cover all necessary knowledge, particularly how to pronounce different words in diverse contexts, so that a neural model can learn this knowledge. In real applications, however, such a high demand for training data is hard to satisfy, and additional knowledge often needs to be injected manually. For example, to capture pronunciation knowledge for languages without regular orthography, a complicated grapheme-to-phoneme pipeline needs to be built on top of a large structured pronunciation lexicon, leading to extra, sometimes high, costs when extending neural TTS to such languages. In this paper, we propose a framework that learns to automatically extract knowledge from unstructured external resources using a novel Token2Knowledge attention module. The framework is applied to build a TTS model named Neural Lexicon Reader that extracts pronunciations from raw lexicon texts in an end-to-end manner. Experiments show that the proposed model significantly reduces pronunciation errors in low-resource, end-to-end Chinese TTS, and that the lexicon-reading capability can be transferred to other languages with a smaller amount of data.

Preprint, 2020, arXiv

Neural Subgraph Isomorphism Counting

In this paper, we study a new graph learning problem: learning to count subgraph isomorphisms. Unlike traditional graph learning problems such as node classification and link prediction, subgraph isomorphism counting is NP-complete and requires more global inference over the whole graph. To make it scalable to large graphs and patterns, we propose a learning framework that augments different representation learning architectures and iteratively attends to pattern and target data graphs to memorize subgraph isomorphisms for global counting. We develop both a small-graph set (<= 1,024 subgraph isomorphisms in each graph) and a large-graph set (<= 4,096 subgraph isomorphisms in each graph) to evaluate different models. A mutagenic compound dataset, MUTAG, is also used to evaluate neural models and demonstrate the success of transfer learning. While the learning-based approach is inexact, it generalizes to counting on large patterns and data graphs in linear time, compared to the exponential time of the original NP-complete problem. Experimental results show that learning-based subgraph isomorphism counting can speed up the traditional algorithm, VF2, by 10 to 1,000 times with acceptable errors. Domain adaptation based on fine-tuning also shows the usefulness of our approach in real-world applications.

Preprint, 2020, arXiv

On the Role of Conceptualization in Commonsense Knowledge Graph Construction

Commonsense knowledge graphs (CKGs) like Atomic and ASER are substantially different from conventional KGs, as they consist of a much larger number of nodes formed by loosely structured text. This enables them to handle highly diverse natural-language queries related to commonsense, but it also leads to unique challenges for automatic KG construction methods. Besides identifying relations between nodes that are absent from the KG, such methods are also expected to explore absent nodes represented by text, in which different real-world things, or entities, may appear. To deal with the innumerable entities involved with commonsense in the real world, we introduce conceptualization to CKG construction methods, i.e., viewing entities mentioned in text as instances of specific concepts or vice versa. We build synthetic triples by conceptualization and further formulate the task as triple classification, handled by a discriminative model with knowledge transferred from pretrained language models and fine-tuned with negative sampling. Experiments demonstrate that our methods can effectively identify plausible triples and expand the KG with triples of both new nodes and new edges of high diversity and novelty.
