Researcher profile

Catherine Lai

Catherine Lai contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 13 - UnverifiedVerification L1Unclaimed author
2works
0followers
4topics
3close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

2 published item(s)

preprint2022arXiv

A Cross-Domain Approach for Continuous Impression Recognition from Dyadic Audio-Visual-Physio Signals

The impression we make on others depends not only on what we say, but also, to a large extent, on how we say it. As a sub-branch of affective computing and social signal processing, impression recognition has proven critical in both human-human conversations and spoken dialogue systems. However, most research has studied impressions only from the signals expressed by the emitter, ignoring the response from the receiver. In this paper, we perform impression recognition using a proposed cross-domain architecture on the dyadic IMPRESSION dataset. This improved architecture makes use of cross-domain attention and regularization. The cross-domain attention consists of intra- and inter-attention mechanisms, which capture intra- and inter-domain relatedness, respectively. The cross-domain regularization includes knowledge distillation and similarity enhancement losses, which strengthen the feature connections between the emitter and receiver. The experimental evaluation verified the effectiveness of our approach. Our approach achieved a concordance correlation coefficient of 0.770 in competence dimension and 0.748 in warmth dimension.

preprint2022arXiv

Analysis of Voice Conversion and Code-Switching Synthesis Using VQ-VAE

This paper presents an analysis of speech synthesis quality achieved by simultaneously performing voice conversion and language code-switching using multilingual VQ-VAE speech synthesis in German, French, English and Italian. In this paper, we utilize VQ code indices representing phone information from VQ-VAE to perform code-switching and a VQ speaker code to perform voice conversion in a single system with a neural vocoder. Our analysis examines several aspects of code-switching including the number of language switches and the number of words involved in each switch. We found that speech synthesis quality degrades after increasing the number of language switches within an utterance and decreasing the number of words. We also found some evidence of accent transfer when performing voice conversion across languages as observed when a speaker's original language differs from the language of a synthetic target utterance. We present results from our listening tests and discuss the inherent difficulties of assessing accent transfer in speech synthesis. Our work highlights some of the limitations and strengths of using a semi-supervised end-to-end system like VQ-VAE for handling multilingual synthesis. Our work provides insight into why multilingual speech synthesis is challenging and we suggest some directions for expanding work in this area.