Source author record

Andrew Koh

Andrew Koh appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Computation and Language, Audio and Speech Processing (eess.AS), Information Retrieval, Sound

Catalog footprint

What is connected

2 works

4 topics

3 close collaborators

Preprint, 2022, arXiv

Automated Audio Captioning with Epochal Difficult Captions for Curriculum Learning

In this paper, we propose an algorithm, Epochal Difficult Captions, to supplement the training of any model for the Automated Audio Captioning (AAC) task. Epochal Difficult Captions is an elegant evolution of the keyword estimation task that previous work has used to train the encoder of the AAC model. Epochal Difficult Captions modifies the target captions based on a curriculum and a difficulty level determined as a function of the current epoch. Epochal Difficult Captions can be used with any model architecture and is a lightweight function that does not increase training time. We test our method on three systems and show that using Epochal Difficult Captions consistently improves performance.
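
To make the idea of epoch-dependent target captions concrete, here is a minimal sketch. It is not the paper's procedure: the abstract does not say how captions are modified, so this assumes one plausible reading in which early epochs train on keyword-only captions and stopwords are restored as training progresses. The `STOPWORDS` set, the linear `difficulty` schedule, and the function names are all hypothetical.

```python
# Hypothetical stopword list, for illustration only (not from the paper).
STOPWORDS = {"a", "an", "the", "is", "are", "in", "on", "of", "and", "with"}


def difficulty(epoch: int, total_epochs: int) -> float:
    """Assumed linear schedule: 0.0 at epoch 0, 1.0 at the final epoch."""
    if total_epochs <= 1:
        return 1.0
    return min(1.0, max(0.0, epoch / (total_epochs - 1)))


def epochal_caption(caption: str, epoch: int, total_epochs: int) -> str:
    """Return a target caption whose completeness grows with the epoch.

    All content words are always kept. Of the stopwords, only the first
    floor(difficulty * count) occurrences are kept, so the caption is
    keyword-only at difficulty 0.0 and unchanged at difficulty 1.0.
    """
    words = caption.split()
    n_stop = sum(1 for w in words if w.lower() in STOPWORDS)
    keep = int(difficulty(epoch, total_epochs) * n_stop)

    out = []
    seen = 0  # stopwords encountered so far
    for w in words:
        if w.lower() in STOPWORDS:
            seen += 1
            if seen > keep:
                continue  # drop this stopword at the current difficulty
        out.append(w)
    return " ".join(out)


if __name__ == "__main__":
    for e in range(5):
        print(e, epochal_caption("a dog is barking in the park", e, 5))
```

Because the function only rewrites target strings, it could sit in a data-loading step without touching the model, which matches the abstract's description of a lightweight, architecture-independent addition.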

Preprint, 2022, arXiv

Language-Based Audio Retrieval with Converging Tied Layers and Contrastive Loss

In this paper, we tackle the new Language-Based Audio Retrieval task proposed in DCASE 2022. Firstly, we introduce a simple, scalable architecture which ties the audio and text encoders together. Secondly, we show that using this architecture along with contrastive loss allows the model to significantly beat the performance of the baseline model. Finally, in addition to having an extremely low training memory requirement, we are able to use pretrained models as is, without needing to fine-tune them. We test our methods and show that using a combination of our methods beats the baseline scores significantly.
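
As an illustration of how a contrastive loss pairs audio and text embeddings, here is a minimal sketch. The abstract does not state which contrastive formulation is used, so this assumes a common symmetric, InfoNCE-style loss over cosine similarities. The `temperature` value and all function names are hypothetical, and the tied-layer architecture itself is not modeled.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cross_entropy_row(logits, target):
    """Cross-entropy of one row of logits against a target index."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[target]


def contrastive_loss(audio, text, temperature=0.07):
    """Symmetric contrastive loss; audio[i] and text[i] form a matched pair.

    Each audio clip should score highest against its own caption
    (audio-to-text), and each caption against its own clip (text-to-audio).
    The temperature default is an assumed value, not one from the paper.
    """
    n = len(audio)
    sim = [[cosine(a, t) / temperature for t in text] for a in audio]
    a2t = sum(cross_entropy_row(sim[i], i) for i in range(n)) / n
    t2a = sum(
        cross_entropy_row([sim[j][i] for j in range(n)], i) for i in range(n)
    ) / n
    return 0.5 * (a2t + t2a)


if __name__ == "__main__":
    audio = [[1.0, 0.0], [0.0, 1.0]]
    print(contrastive_loss(audio, [[1.0, 0.0], [0.0, 1.0]]))  # aligned pairs
    print(contrastive_loss(audio, [[0.0, 1.0], [1.0, 0.0]]))  # swapped pairs
```

The loss is near zero when each audio embedding points in the same direction as its paired text embedding, and large when the pairs are swapped. That is the signal a retrieval model trained this way would use to rank captions against clips.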
