Source author record

Jean-Samuel Lauzon

Jean-Samuel Lauzon appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

eess.AS, Sound, Computer Vision, eess.IV, Robotics

Catalog footprint

What is connected

5 works

5 topics

4 close collaborators

Preprint, 2022, arXiv

ODAS: Open embeddeD Audition System

Artificial audition aims at providing hearing capabilities to machines, computers and robots. Existing frameworks in robot audition offer interesting sound source localization, tracking and separation performance, but they involve a significant amount of computation, which limits their use on robots with embedded computing capabilities. This paper presents ODAS, the Open embeddeD Audition System framework, which includes strategies to reduce the computational load and perform robot audition tasks on low-cost embedded computing systems. It presents key features of ODAS, along with cases illustrating its use in different robots and artificial audition applications.

Preprint, 2022, arXiv

SMP-PHAT: Lightweight DoA Estimation by Merging Microphone Pairs

This paper introduces SMP-PHAT, which estimates the direction of arrival (DoA) of sound with a microphone array by merging pairs of microphones that are parallel in space. This approach reduces the number of pairwise cross-correlation computations, and brings down the number of flops and memory lookups when searching for the DoA. Experiments on low-cost hardware with commonly used microphone arrays show that the proposed method provides the same accuracy as the existing SRP-PHAT approach, while reducing the computational load by 39% in some cases.
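To make the idea of "pairs of microphones that are parallel in space" concrete, the following minimal sketch groups the microphone pairs of a planar array by the direction of the line joining them. It is only an illustration of the geometry, not the SMP-PHAT algorithm itself: the function names, the 2D layout and the rounding tolerance are assumptions, and the abstract does not say whether the spacing of the pairs also has to match before they are merged.

```python
from itertools import combinations
from math import hypot


def direction_key(p, q, decimals=6):
    """Sign-normalized unit direction of the segment from p to q (2D)."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    norm = hypot(dx, dy)
    ux, uy = dx / norm, dy / norm
    # Flip the vector so that antiparallel pairs share the same key.
    if ux < 0 or (abs(ux) < 1e-9 and uy < 0):
        ux, uy = -ux, -uy
    # Adding 0.0 turns a possible -0.0 into 0.0.
    return (round(ux, decimals) + 0.0, round(uy, decimals) + 0.0)


def group_parallel_pairs(mics):
    """Map each direction to the list of microphone index pairs along it."""
    groups = {}
    for i, j in combinations(range(len(mics)), 2):
        key = direction_key(mics[i], mics[j])
        groups.setdefault(key, []).append((i, j))
    return groups


# Hypothetical square array with 1 m sides (illustrative values only).
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
groups = group_parallel_pairs(square)
n_pairs = sum(len(pairs) for pairs in groups.values())
print(f"{n_pairs} pairs fall into {len(groups)} parallel groups")
```

For this square layout, the two horizontal sides share one direction and the two vertical sides share another, while each diagonal stands alone, so fewer distinct directions than pairs remain to be processed.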

Preprint, 2020, arXiv

3D Localization of a Sound Source Using Mobile Microphone Arrays Referenced by SLAM

A microphone array can provide a mobile robot with the capability of localizing, tracking and separating distant sound sources in 2D, i.e., estimating their relative elevation and azimuth. To combine acoustic data with visual information in real-world settings, spatial correlation must be established. The approach explored in this paper consists of having two robots, each equipped with a microphone array, localizing themselves in a shared reference map using SLAM. Based on their locations, data from the microphone arrays are used to triangulate in 3D the location of a sound source in relation to the same map. This strategy results in a novel cooperative sound mapping approach using mobile microphone arrays. Trials are conducted using two mobile robots localizing a static or a moving sound source to examine under which conditions this is possible. Results suggest that errors under 0.3 m are observed when the relative angle between the two robots is above 30 degrees for a static sound source, while errors under 0.3 m for angles between 40 degrees and 140 degrees are observed with a moving sound source.
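The triangulation step can be sketched as finding the point closest to two bearing rays, one per robot. This is a generic geometric construction under stated assumptions, not the method used in the paper: the function names are hypothetical, both bearings are assumed to be already expressed in the shared map frame, and the estimate is taken as the midpoint of the shortest segment between the two rays.

```python
from math import atan2, cos, hypot, sin


def bearing_to_unit(azimuth, elevation):
    """Convert azimuth/elevation (radians, map frame) to a unit vector."""
    return (cos(elevation) * cos(azimuth),
            cos(elevation) * sin(azimuth),
            sin(elevation))


def bearing_to(origin, target):
    """Azimuth and elevation (radians) of target as seen from origin."""
    dx, dy, dz = (t - o for t, o in zip(target, origin))
    return atan2(dy, dx), atan2(dz, hypot(dx, dy))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def triangulate(p1, az1, el1, p2, az2, el2, eps=1e-9):
    """Midpoint of the shortest segment between two bearing rays."""
    d1 = bearing_to_unit(az1, el1)
    d2 = bearing_to_unit(az2, el2)
    w0 = tuple(a - b for a, b in zip(p1, p2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w0), dot(d2, w0)
    denom = a * c - b * b  # equals sin^2 of the angle between the rays
    if denom < eps:
        raise ValueError("bearings are nearly parallel; cannot triangulate")
    t = (b * e - c * d) / denom  # distance along ray 1
    s = (a * e - b * d) / denom  # distance along ray 2
    q1 = tuple(p + t * u for p, u in zip(p1, d1))
    q2 = tuple(p + s * u for p, u in zip(p2, d2))
    return tuple((x + y) / 2 for x, y in zip(q1, q2))


# Hypothetical robot and source positions in the shared map (metres).
robot_a, robot_b, source = (0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (1.0, 1.0, 1.0)
estimate = triangulate(robot_a, *bearing_to(robot_a, source),
                       robot_b, *bearing_to(robot_b, source))
print(estimate)
```

The denominator shrinks as the two rays become parallel, so small bearing errors are amplified when the angle between the robots is small. The sketch does not check whether the closest point lies behind either robot.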

Preprint, 2020, arXiv

Dynamic Object Tracking and Masking for Visual SLAM

In dynamic environments, the performance of visual SLAM techniques can be impaired by visual features taken from moving objects. One solution is to identify those objects so that their visual features can be removed for localization and mapping. This paper presents a simple and fast pipeline that uses deep neural networks, extended Kalman filters and visual SLAM to improve both localization and mapping in dynamic environments (around 14 fps on a GTX 1080). Results on the dynamic sequences from the TUM dataset using RTAB-Map as visual SLAM suggest that the approach achieves localization performance similar to other state-of-the-art methods, while also providing the position of the tracked dynamic objects, a 3D map free of those dynamic objects, and better loop closure detection, with the whole pipeline able to run on a robot moving at moderate speed.
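The feature-removal step described above can be illustrated with a minimal sketch that discards keypoints falling inside regions flagged as dynamic objects. Axis-aligned bounding boxes are an assumption made for brevity; the abstract does not say whether the pipeline uses boxes or pixel-wise masks, and the function name and pixel values are hypothetical.

```python
def filter_static_keypoints(keypoints, boxes):
    """Keep only keypoints that lie outside every dynamic-object box.

    keypoints: list of (x, y) pixel coordinates.
    boxes:     list of (x_min, y_min, x_max, y_max) boxes, assumed to
               come from an object detector (hypothetical input).
    """
    def inside(point, box):
        x, y = point
        x_min, y_min, x_max, y_max = box
        return x_min <= x <= x_max and y_min <= y <= y_max

    return [pt for pt in keypoints
            if not any(inside(pt, box) for box in boxes)]


# Illustrative values: one detected moving object, three keypoints.
keypoints = [(10, 10), (50, 50), (200, 120)]
dynamic_boxes = [(40, 40, 100, 100)]
static_only = filter_static_keypoints(keypoints, dynamic_boxes)
print(static_only)
```

Only the remaining keypoints would then be passed on to localization and mapping.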

Preprint, 2020, arXiv

GEV Beamforming Supported by DOA-based Masks Generated on Pairs of Microphones

Distant speech processing is a challenging task, especially when dealing with the cocktail party effect. Sound source separation is thus often required as a preprocessing step prior to speech recognition to improve the signal-to-distortion ratio (SDR). Recently, a combination of beamforming and speech separation networks has been proposed to improve the target source quality in the direction of arrival of interest. However, with this type of approach, the neural network needs to be trained in advance for a specific microphone array geometry, which limits versatility when adding or removing microphones, or changing the shape of the array. The solution presented in this paper is to train a neural network on pairs of microphones with different spacing and acoustic environmental conditions, and then use this network to estimate a time-frequency mask from all the pairs of microphones forming an array of arbitrary shape. Using this mask, the target and noise covariance matrices can be estimated, and then used to perform generalized eigenvalue (GEV) beamforming. Results show that the proposed approach improves the SDR from 4.78 dB to 7.69 dB on average, for various microphone array geometries that correspond to commercially available hardware.
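The last two steps, estimating covariance matrices from a mask and deriving a GEV beamformer from them, can be sketched for a single frequency bin as follows. This is a minimal sketch under several assumptions, not the paper's implementation: it handles only two microphones, the mask is supplied by hand rather than by a neural network, all function names are hypothetical, and plain power iteration stands in for a proper generalized eigenvalue solver.

```python
def outer(x):
    """Return x x^H for a 2-element complex vector as a nested list."""
    return [[x[i] * x[j].conjugate() for j in range(2)] for i in range(2)]


def masked_covariances(frames, mask):
    """Mask-weighted target and noise covariances for one frequency bin.

    frames: list of 2-element complex vectors, one per time frame.
    mask:   list of values in [0, 1]; 1 means target-dominated.
    The sums are left unnormalized, which rescales the eigenvalue but
    not the direction of the eigenvector.
    """
    phi_xx = [[0j, 0j], [0j, 0j]]
    phi_nn = [[0j, 0j], [0j, 0j]]
    for x, m in zip(frames, mask):
        o = outer(x)
        for i in range(2):
            for j in range(2):
                phi_xx[i][j] += m * o[i][j]
                phi_nn[i][j] += (1.0 - m) * o[i][j]
    return phi_xx, phi_nn


def inv2(m):
    """Inverse of a 2x2 matrix (assumes a non-zero determinant)."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def matvec(m, v):
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]


def gev_filter(phi_xx, phi_nn, iterations=200):
    """Principal generalized eigenvector of (phi_xx, phi_nn).

    Runs power iteration on inv(phi_nn) @ phi_xx; the fixed start
    vector is assumed not to be orthogonal to the solution.
    """
    inv_nn = inv2(phi_nn)
    w = [1 + 0j, 1 + 0j]
    for _ in range(iterations):
        w = matvec(inv_nn, matvec(phi_xx, w))
        norm = (abs(w[0]) ** 2 + abs(w[1]) ** 2) ** 0.5
        w = [w[0] / norm, w[1] / norm]
    return w
```

The beamformer output for a frame `x` would then be the inner product of the conjugated filter with `x`. With more than two microphones, or for production use, a library eigenvalue solver would be the more natural choice than this hand-written iteration.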
