Source author record

Ander Arriandiaga

Ander Arriandiaga appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Machine Learning, Computer Vision, eess.AS, eess.IV, Robotics

Catalog footprint

What is connected

2 works

5 topics

4 close collaborators


Preprint, 2021, arXiv

Audio-Visual Target Speaker Enhancement on Multi-Talker Environment using Event-Driven Cameras

We propose a method for audio-visual target speaker enhancement in multi-talker environments using event-driven cameras. State-of-the-art audio-visual speech separation methods show that the movement of the facial landmarks related to speech production carries crucial information. However, all approaches proposed so far work offline on frame-based video input, which makes it difficult to process an audio-visual signal with low latency for online applications. To overcome this limitation, we propose using event-driven cameras and exploiting their compression, high temporal resolution and low latency for low-cost, low-latency motion feature extraction, moving towards online embedded audio-visual speech processing. We use the event-driven optical flow estimate of the facial landmarks as input to a stacked bidirectional LSTM trained to predict an Ideal Amplitude Mask, which is then used to filter the noisy audio and obtain the audio signal of the target speaker. The presented approach performs almost on par with the frame-based approach, with very low latency and computational cost.
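The masking step described in the abstract can be illustrated with a small sketch. It assumes the common definition of the ideal amplitude mask as the ratio of target to mixture magnitudes, and it computes the oracle mask directly from a known clean signal instead of predicting it with a network. The function names, the non-overlapping rectangular-window STFT, the gain clipping and the synthetic sine signals are all illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def stft_frames(x, frame_len=256):
    """Non-overlapping, rectangular-window STFT (a simplification for illustration)."""
    n_frames = len(x) // frame_len
    frames = x[: n_frames * frame_len].reshape(n_frames, frame_len)
    return np.fft.rfft(frames, axis=1)

def istft_frames(spec, frame_len=256):
    """Inverse of stft_frames: invert each frame and concatenate."""
    return np.fft.irfft(spec, n=frame_len, axis=1).reshape(-1)

def ideal_amplitude_mask(target_spec, mixture_spec, eps=1e-8, max_gain=10.0):
    """Oracle mask |S| / |Y|, clipped to keep gains bounded (clipping is an assumption)."""
    mask = np.abs(target_spec) / (np.abs(mixture_spec) + eps)
    return np.clip(mask, 0.0, max_gain)

def apply_mask(mask, mixture_spec):
    """Scale the mixture magnitude by the mask while keeping the mixture phase."""
    return mask * mixture_spec

# Synthetic stand-ins for a target speaker and an interfering talker.
sample_rate = 16000
t = np.arange(4096) / sample_rate
target = np.sin(2 * np.pi * 440.0 * t)
interferer = 0.8 * np.sin(2 * np.pi * 1250.0 * t)
mixture = target + interferer

S = stft_frames(target)
Y = stft_frames(mixture)
mask = ideal_amplitude_mask(S, Y)
enhanced = istft_frames(apply_mask(mask, Y))

err_before = np.mean((mixture - target) ** 2)
err_after = np.mean((enhanced - target) ** 2)
print(f"MSE before masking: {err_before:.4f}, after masking: {err_after:.6f}")
```

In a learned system, a network would estimate `mask` from the noisy audio and the visual motion features; the oracle version above only shows what such a network would be trained to approximate.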

Preprint, 2020, arXiv

Exploiting Event Cameras for Spatio-Temporal Prediction of Fast-Changing Trajectories

This paper investigates trajectory prediction for robotics, with the aim of improving how robots interact with moving targets, for example when catching a bouncing ball. Unexpected, highly non-linear trajectories cannot easily be predicted with regression-based fitting procedures, so we apply state-of-the-art machine learning, specifically Long Short-Term Memory (LSTM) architectures. In addition, fast-moving targets are better sensed with event cameras, which produce an asynchronous output triggered by spatial change rather than at fixed temporal intervals, as traditional cameras do. We investigate how LSTM models can be adapted to event camera data and, in particular, examine the benefit of using asynchronously sampled data.
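The difference between fixed-interval and change-triggered sampling can be sketched on a simulated bouncing ball. This is a toy illustration under stated assumptions: the simulator, the position threshold `delta`, and the idea of passing the elapsed time alongside the position as a sequence-model input are hypothetical choices made here, not the paper's pipeline, and a real event camera triggers per pixel rather than on a tracked position.

```python
import numpy as np

def simulate_bounce(duration=2.0, step=1e-3, y0=1.0, g=9.81, restitution=0.8):
    """Height of a ball dropped from y0, integrated with a simple explicit scheme."""
    n = int(round(duration / step))
    t = np.arange(n) * step
    y = np.empty(n)
    pos, vel = y0, 0.0
    for i in range(n):
        y[i] = pos
        vel -= g * step
        pos += vel * step
        if pos < 0.0:  # bounce: reflect position and damp velocity
            pos, vel = -pos, -vel * restitution
    return t, y

def sample_fixed(t, y, period):
    """Frame-like sampling: one sample every `period` seconds."""
    stride = max(1, int(round(period / (t[1] - t[0]))))
    return t[::stride], y[::stride]

def sample_on_change(t, y, delta):
    """Event-like sampling: emit a sample once the position has moved by `delta`."""
    keep = [0]
    for i in range(1, len(y)):
        if abs(y[i] - y[keep[-1]]) >= delta:
            keep.append(i)
    idx = np.array(keep)
    return t[idx], y[idx]

def to_sequence_features(ts, ys):
    """Pair each position with the time since the previous sample (0 for the first)."""
    dt = np.concatenate(([0.0], np.diff(ts)))
    return np.stack([dt, ys], axis=1)

t, y = simulate_bounce()
t_fixed, y_fixed = sample_fixed(t, y, period=0.02)
t_event, y_event = sample_on_change(t, y, delta=0.02)

fixed_features = to_sequence_features(t_fixed, y_fixed)
event_features = to_sequence_features(t_event, y_event)
print("fixed-interval samples:", len(t_fixed))
print("change-triggered samples:", len(t_event))
```

With change-triggered sampling the time gaps are no longer constant, which is why this sketch keeps the elapsed time as an explicit feature: samples cluster where the ball moves fast and thin out near the top of each bounce.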