Researcher profile

Martin Vetterli

Martin Vetterli contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging · Verification L1 · Unclaimed author
11 works
0 followers
9 topics
4 close collaborators

Published work

11 published items

preprint · 2023 · arXiv

Blind as a bat: audible echolocation on small robots

For safe and efficient operation, mobile robots need to perceive their environment, and in particular, perform tasks such as obstacle detection, localization, and mapping. Although robots are often equipped with microphones and speakers, the audio modality is rarely used for these tasks. Compared to the localization of sound sources, for which many practical solutions exist, algorithms for active echolocation are less developed and often rely on hardware requirements that are out of reach for small robots. We propose an end-to-end pipeline for sound-based localization and mapping that is targeted at, but not limited to, robots equipped with only simple buzzers and low-end microphones. The method is model-based, runs in real time, and requires no prior calibration or training. We successfully test the algorithm on the e-puck robot with its integrated audio hardware, and on the Crazyflie drone, for which we design a reproducible audio extension deck. We achieve centimeter-level wall localization on both platforms when the robots are static during the measurement process. Even in the more challenging setting of a flying drone, we can successfully localize walls, which we demonstrate in a proof-of-concept multi-wall localization and mapping demo.

preprint · 2022 · arXiv

How Asynchronous Events Encode Video

As event-based sensing gains in popularity, theoretical understanding is needed to harness this technology's potential. Instead of recording video by capturing frames, event-based cameras have sensors that emit events when their inputs change, thus encoding information in the timing of events. This creates new challenges in establishing reconstruction guarantees and algorithms, but also provides advantages over frame-based video. We use time encoding machines to model event-based sensors: TEMs also encode their inputs by emitting events characterized by their timing and reconstruction from time encodings is well understood. We consider the case of time encoding bandlimited video and demonstrate a dependence between spatial sensor density and overall spatial and temporal resolution. Such a dependence does not occur in frame-based video, where temporal resolution depends solely on the frame rate of the video and spatial resolution depends solely on the pixel grid. However, this dependence arises naturally in event-based video and allows oversampling in space to provide better time resolution. As such, event-based vision encourages using more sensors that emit fewer events over time.

preprint · 2022 · arXiv

Learning rich optical embeddings for privacy-preserving lensless image classification

By replacing the lens with a thin optical element, lensless imaging enables new applications and solutions beyond those supported by traditional camera design and post-processing, e.g. compact and lightweight form factors and visual privacy. The latter arises from the highly multiplexed measurements of lensless cameras, which require knowledge of the imaging system to recover a recognizable image. In this work, we exploit this unique multiplexing property: casting the optics as an encoder that produces learned embeddings directly at the camera sensor. We do so in the context of image classification, where we jointly optimize the encoder's parameters and those of an image classifier in an end-to-end fashion. Our experiments show that jointly learning the lensless optical encoder and the digital processing allows for lower resolution embeddings at the sensor, and hence better privacy as it is much harder to recover meaningful images from these measurements. Additional experiments show that such an optimization allows for lensless measurements that are more robust to typical real-world image transformations. While this work focuses on classification, the proposed programmable lensless camera and end-to-end optimization can be applied to other computational imaging tasks.

preprint · 2022 · arXiv

LenslessPiCam: A Hardware and Software Platform for Lensless Computational Imaging with a Raspberry Pi

Lensless imaging seeks to replace/remove the lens in a conventional imaging system. The earliest cameras were in fact lensless, relying on long exposure times to form images on the other end of a small aperture in a darkened room/container (camera obscura). The introduction of a lens allowed for more light throughput and therefore shorter exposure times, while retaining sharp focus. The incorporation of digital sensors readily enabled the use of computational imaging techniques to post-process and enhance raw images (e.g. via deblurring, inpainting, denoising, sharpening). Recently, imaging scientists have started leveraging computational imaging as an integral part of lensless imaging systems, allowing them to form viewable images from the highly multiplexed raw measurements of lensless cameras (see [5] and references therein for a comprehensive treatment of lensless imaging). This represents a real paradigm shift in camera system design as there is more flexibility to cater the hardware to the application at hand (e.g. lightweight or flat designs). This increased flexibility comes however at the price of a more demanding post-processing of the raw digital recordings and a tighter integration of sensing and computation, often difficult to achieve in practice due to inefficient interactions between the various communities of scientists involved. With LenslessPiCam, we provide an easily accessible hardware and software framework to enable researchers, hobbyists, and students to implement and explore practical and computational aspects of lensless imaging. We also provide detailed guides and exercises so that LenslessPiCam can be used as an educational resource, and point to results from our graduate-level signal processing course.

preprint · 2022 · arXiv

Lippmann Photography: A Signal Processing Perspective

Lippmann (or interferential) photography is the first and only analog photography method that can capture the full color spectrum of a scene in a single take. This technique, invented more than a hundred years ago, records the colors by creating interference patterns inside the photosensitive plate. Lippmann photography provides a great opportunity to demonstrate several fundamental concepts in signal processing. Conversely, a signal processing perspective enables us to shed new light on the technique. In our previous work, we analyzed the spectra of historical Lippmann plates using our own mathematical model. In this paper, we provide the derivation of this model and validate it experimentally. We highlight new behaviors whose explanations were ignored by physicists to date. In particular, we show that the spectra generated by Lippmann plates are in fact distorted versions of the original spectra. We also show that these distortions are influenced by the thickness of the plate and the reflection coefficient of the reflective medium used in the capture of the photographs. We verify our model with extensive experiments on our own Lippmann photographs.

preprint · 2021 · arXiv

Asynchrony Increases Efficiency: Time Encoding of Videos and Low-Rank Signals

In event-based sensing, many sensors independently and asynchronously emit events when there is a change in their input. Event-based sensing can present significant improvements in power efficiency when compared to traditional sampling, because (1) the output is a stream of events where the important information lies in the timing of the events, and (2) the sensor can easily be controlled to output information only when interesting activity occurs at the input. Moreover, event-based sampling can often provide better resolution than standard uniform sampling. Not only does this occur because individual event-based sensors have higher temporal resolution, it also occurs because the asynchrony of events allows for less redundant and more informative encoding. We would like to explain how such curious results come about. To do so, we use ideal time encoding machines as a proxy for event-based sensors. We explore time encoding of signals with low rank structure, and apply the resulting theory to video. We then see how the asynchronous firing times of the time encoding machines allow for better reconstruction than in the standard sampling case, if we have a high spatial density of time encoding machines that fire less frequently.

preprint · 2021 · arXiv

CPGD: Cadzow Plug-and-Play Gradient Descent for Generalised FRI

Finite rate of innovation (FRI) is a powerful reconstruction framework enabling the recovery of sparse Dirac streams from uniform low-pass filtered samples. An extension of this framework, called generalised FRI (genFRI), has been recently proposed for handling cases with arbitrary linear measurement models. In this context, signal reconstruction amounts to solving a joint constrained optimisation problem, yielding estimates of both the Fourier series coefficients of the Dirac stream and its so-called annihilating filter, involved in the regularisation term. This optimisation problem is, however, highly non-convex and nonlinear in the data. Moreover, the proposed numerical solver is computationally intensive and lacks a convergence guarantee. In this work, we propose an implicit formulation of the genFRI problem. To this end, we leverage a novel regularisation term which does not depend explicitly on the unknown annihilating filter yet enforces sufficient structure in the solution for stable recovery. The resulting optimisation problem is still non-convex, but simpler since it is linear in the data and has fewer unknowns. We solve it by means of a provably convergent proximal gradient descent (PGD) method. Since the proximal step does not admit a simple closed-form expression, we propose an inexact PGD method, coined Cadzow plug-and-play gradient descent (CPGD). The latter approximates the proximal steps by means of Cadzow denoising, a well-known denoising algorithm in FRI. We provide local fixed-point convergence guarantees for CPGD. Through extensive numerical simulations, we demonstrate the superiority of CPGD over the state of the art in the case of non-uniform time samples.
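
The Cadzow denoising step named in this abstract can be illustrated with a minimal sketch. It shows the classical alternation between a rank-truncated SVD and a structure-restoring average over anti-diagonals. It is not the authors' CPGD solver: the Hankel layout, the function name `cadzow_denoise`, the iteration count, and the test signal are all assumptions chosen for illustration.

```python
import numpy as np

def cadzow_denoise(samples, rank, n_iter=20):
    """Classical Cadzow iteration (illustrative sketch, not the paper's code).

    Alternates between (1) truncating a Hankel matrix built from the samples
    to the given rank and (2) averaging anti-diagonals to restore Hankel
    structure.
    """
    x = np.asarray(samples, dtype=complex)
    n = len(x)
    rows = n // 2 + 1
    cols = n - rows + 1
    # idx[i, j] = i + j, so x[idx] is the Hankel matrix of the samples.
    idx = np.arange(rows)[:, None] + np.arange(cols)[None, :]
    flat = idx.ravel()
    counts = np.bincount(flat, minlength=n)  # entries per anti-diagonal
    for _ in range(n_iter):
        hankel = x[idx]
        u, s, vh = np.linalg.svd(hankel, full_matrices=False)
        low_rank = (u[:, :rank] * s[:rank]) @ vh[:rank]
        # np.bincount needs real weights, so average real and imaginary parts.
        real = np.bincount(flat, weights=low_rank.real.ravel(), minlength=n)
        imag = np.bincount(flat, weights=low_rank.imag.ravel(), minlength=n)
        x = (real + 1j * imag) / counts
    return x

# Hypothetical test signal: two complex exponentials plus noise.
n = 41
k = np.arange(n)
clean = np.exp(2j * np.pi * 0.11 * k) + 0.7 * np.exp(2j * np.pi * 0.27 * k)
rng = np.random.default_rng(0)
noisy = clean + 0.1 * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
denoised = cadzow_denoise(noisy, rank=2)
print(np.linalg.norm(noisy - clean), np.linalg.norm(denoised - clean))
```

A sum of K complex exponentials yields a Hankel matrix of rank K, which is why the rank is set to 2 here. In an inexact proximal scheme of the kind the abstract describes, a routine like this would stand in for the proximal step.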

preprint · 2020 · arXiv

Encoding and Decoding Mixed Bandlimited Signals using Spiking Integrate-and-Fire Neurons

Conventional sampling focuses on encoding and decoding bandlimited signals by recording signal amplitudes at known time points. Alternatively, sampling can be approached using biologically inspired schemes. Among these are integrate-and-fire time encoding machines (IF-TEMs). They behave like simplified versions of spiking neurons and encode their input using spike times rather than amplitudes. Moreover, when several of these neurons jointly process a set of mixed signals, they form one layer in a feedforward spiking neural network. In this paper, we investigate the encoding and decoding potential of such a layer. We propose a setup to sample a set of bandlimited signals by mixing them and sampling the result using different IF-TEMs. We provide conditions for perfect recovery of the set of signals from the samples in the noiseless case, and suggest an algorithm to perform the reconstruction.

preprint · 2020 · arXiv

Matrix recovery from bilinear and quadratic measurements

Matrix (or operator) recovery from linear measurements is a well-studied problem. However, there are situations where only bilinear or quadratic measurements are available. A bilinear or quadratic problem can easily be transformed into a linear one, but this raises the questions of when the linearized problem is solvable and what the cost of linearization is. In this work, we study a few specific cases of this general problem and show when the bilinear problem is solvable. Using this result and certain properties of polynomial rings, we present a scenario in which the quadratic problem can be linearized at the cost of just a linear number of additional measurements. Finally, we link our results back to the two applications that inspired them: Time Encoding Machines and Continuous Localisation.

preprint · 2019 · arXiv

Sampling and Reconstruction of Bandlimited Signals with Multi-Channel Time Encoding

Sampling is classically performed by recording the amplitude of an input signal at given time instants; however, sampling and reconstructing a signal using multiple devices in parallel becomes a more difficult problem to solve when the devices have an unknown shift in their clocks. Alternatively, one can record the times at which a signal (or its integral) crosses given thresholds. This can model integrate-and-fire neurons, for example, and has been studied by Lazar and Tóth under the name of "Time Encoding Machines". This sampling method is closer to what is found in nature. In this paper, we show that, when using time encoding machines, reconstruction from multiple channels has a more intuitive solution, and does not require the knowledge of the shifts between machines. We show that, if single-channel time encoding can sample and perfectly reconstruct a $2\Omega$-bandlimited signal, then $M$-channel time encoding with shifted integrators can sample and perfectly reconstruct a signal with $M$ times the bandwidth. Furthermore, we present an algorithm to perform this reconstruction and prove that it converges to the correct unique solution, in the noiseless case, without knowledge of the relative shifts between the integrators of the machines. This is quite unlike classical multi-channel sampling, where unknown shifts between sampling devices pose a problem for perfect reconstruction.
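
The threshold-crossing idea in this abstract can be illustrated with a minimal single-channel sketch of an integrate-and-fire time encoder. It covers encoding only, not the paper's multi-channel reconstruction algorithm. The function name `if_tem_encode`, the discrete-time integration, and the values of `bias` and `threshold` are assumptions chosen for illustration.

```python
import numpy as np

def if_tem_encode(signal, dt, bias, threshold):
    """Spike times of an idealised integrate-and-fire time encoder.

    Illustrative sketch: integrates (signal + bias) with a simple Riemann
    sum and emits a spike time whenever the integral reaches the threshold,
    then subtracts the threshold from the integrator.
    """
    integral = 0.0
    spike_times = []
    for k, x in enumerate(signal):
        integral += (x + bias) * dt
        if integral >= threshold:
            spike_times.append((k + 1) * dt)
            integral -= threshold
    return np.array(spike_times)

# Hypothetical input: a 3 Hz sinusoid over one second.
dt = 1e-4
t = np.arange(0.0, 1.0, dt)
x = 0.5 * np.sin(2 * np.pi * 3 * t)

# The bias exceeds max|x|, so the integrand stays positive and spikes keep firing.
spikes = if_tem_encode(x, dt, bias=1.0, threshold=0.05)
print(len(spikes), spikes[:5])
```

The output carries information only in the spike times: between two consecutive spikes, the integral of the biased input equals the threshold up to discretisation error. A multi-channel variant in the spirit of the abstract would run several such encoders whose integrators start from different, unknown initial values.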

preprint · 2019 · arXiv

Shapes from Echoes: Uniqueness from Point-to-Plane Distance Matrices

We study the problem of localizing a configuration of points and planes from the collection of point-to-plane distances. This problem models simultaneous localization and mapping from acoustic echoes as well as the notable "structure from sound" approach to microphone localization with unknown sources. In our earlier work we proposed computational methods for localization from point-to-plane distances and noted that such localization suffers from various ambiguities beyond the usual rigid body motions; in this paper we provide a complete characterization of uniqueness. We enumerate equivalence classes of configurations which lead to the same distance measurements as a function of the number of planes and points, and algebraically characterize the related transformations in both 2D and 3D. Here we only discuss uniqueness; computational tools and heuristics for practical localization from point-to-plane distances using sound will be addressed in a companion paper.