Source author record

Tomer Michaeli

Tomer Michaeli appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Machine Learning, Computer Vision, Information Theory, math.IT, math.CA, math.FA, physics.optics

Catalog footprint

What is connected

12 works

7 topics

4 close collaborators

preprint, 2026, arXiv

Illumination Angular Spectrum Encoding for Controlling the Functionality of Diffractive Networks

Diffractive neural networks have recently emerged as a promising framework for all-optical computing. However, these networks are typically trained for a single task, limiting their potential adoption in systems requiring multiple functionalities. Existing approaches to achieving multi-task functionality either modify the mechanical configuration of the network per task or use a different illumination wavelength or polarization state for each task. In this work, we propose a new control mechanism, which is based on the illumination's angular spectrum. Specifically, we shape the illumination using an amplitude mask that selectively controls its angular spectrum. We employ different illumination masks for achieving different network functionalities, so that the mask serves as a unique task encoder. Interestingly, we show that effective control can be achieved over a very narrow angular range, within the paraxial regime. We numerically illustrate the proposed approach by training a single diffractive network to perform multiple image-to-image translation tasks. In particular, we demonstrate translating handwritten digits into typeset digits of different values, and translating handwritten English letters into typeset numbers and typeset Greek letters, where the type of the output is determined by the illumination's angular components. As we show, the proposed framework can work under different coherence conditions, and can be combined with existing control strategies, such as different wavelengths. Our results establish the illumination angular spectrum as a powerful degree of freedom for controlling diffractive networks, enabling a scalable and versatile framework for multi-task all-optical computing.

preprint, 2022, arXiv

Energy awareness in low precision neural networks

Power consumption is a major obstacle in the deployment of deep neural networks (DNNs) on end devices. Existing approaches for reducing power consumption rely on quite general principles, including avoidance of multiplication operations and aggressive quantization of weights and activations. However, these methods do not take into account the precise power consumed by each module in the network, and are therefore not optimal. In this paper we develop accurate power consumption models for all arithmetic operations in the DNN, under various working conditions. We reveal several important factors that have been overlooked to date. Based on our analysis, we present PANN (power-aware neural network), a simple approach for approximating any full-precision network by a low-power fixed-precision variant. Our method can be applied to a pre-trained network, and can also be used during training to achieve improved performance. In contrast to previous methods, PANN incurs only a minor degradation in accuracy relative to the full-precision version of the network, even when working at the power budget of a 2-bit quantized variant. In addition, our scheme makes it possible to seamlessly traverse the power-accuracy trade-off at deployment time, which is a major advantage over existing quantization methods that are constrained to specific bit widths.

preprint, 2021, arXiv

GAN "Steerability" without optimization

Recent research has shown remarkable success in revealing "steering" directions in the latent spaces of pre-trained GANs. These directions correspond to semantically meaningful image transformations (e.g., shift, zoom, color manipulations), and have similar interpretable effects across all categories that the GAN can generate. Some methods focus on user-specified transformations, while others discover transformations in an unsupervised manner. However, all existing techniques rely on an optimization procedure to expose those directions, and offer no control over the degree of allowed interaction between different transformations. In this paper, we show that "steering" trajectories can be computed in closed form directly from the generator's weights without any form of training or optimization. This applies to user-prescribed geometric transformations, as well as to unsupervised discovery of more complex effects. Our approach allows determining both linear and nonlinear trajectories, and has many advantages over previous methods. In particular, we can control whether one transformation is allowed to come at the expense of another (e.g., zoom-in with or without allowing translation to keep the object centered). Moreover, we can determine the natural end-point of the trajectory, which corresponds to the largest extent to which a transformation can be applied without incurring degradation. Finally, we show how transferring attributes between images can be achieved without optimization, even across different categories.

preprint, 2020, arXiv

Explorable Super Resolution

Single image super resolution (SR) has seen major performance leaps in recent years. However, existing methods do not allow exploring the infinitely many plausible reconstructions that might have given rise to the observed low-resolution (LR) image. These different explanations of the LR image may vary dramatically in their textures and fine details, and may often encode completely different semantic information. In this paper, we introduce the task of explorable super resolution. We propose a framework comprising a graphical user interface with a neural network backend, allowing editing the SR output so as to explore the abundance of plausible high-resolution (HR) explanations of the LR input. At the heart of our method is a novel module that can wrap any existing SR network, analytically guaranteeing that its SR outputs would precisely match the LR input when downsampled. Besides its importance in our setting, this module is guaranteed to decrease the reconstruction error of any SR network it wraps, and can be used to cope with blur kernels that are different from the one the network was trained for. We illustrate our approach in a variety of use cases, ranging from medical imaging and forensics to graphics.
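The consistency guarantee described in this abstract can be illustrated with a small projection step. The sketch below is not the authors' module: it assumes the degradation is plain box-average downsampling by an integer factor, for which pixel replication acts as the pseudo-inverse. The function names (downsample, upsample_nn, enforce_consistency) and the noisy stand-in for an SR network output are hypothetical.

```python
import numpy as np

def downsample(x, s):
    """Box-average downsampling by integer factor s (assumed degradation model)."""
    h, w = x.shape
    return x.reshape(h // s, s, w // s, s).mean(axis=(1, 3))

def upsample_nn(y, s):
    """Pixel replication; for box averaging this is the pseudo-inverse of downsample."""
    return np.repeat(np.repeat(y, s, axis=0), s, axis=1)

def enforce_consistency(x_sr, y_lr, s):
    """Orthogonally project x_sr onto the images whose downsampled version equals y_lr."""
    return x_sr + upsample_nn(y_lr - downsample(x_sr, s), s)

rng = np.random.default_rng(0)
s = 4
x_true = rng.random((32, 32))                          # hypothetical ground-truth HR image
y_lr = downsample(x_true, s)                           # observed LR image
x_sr = x_true + 0.1 * rng.standard_normal((32, 32))    # stand-in for an SR network output
x_fixed = enforce_consistency(x_sr, y_lr, s)

err_before = np.linalg.norm(x_sr - x_true)
err_after = np.linalg.norm(x_fixed - x_true)
```

Because the correction is an orthogonal projection onto a set that contains the true HR image, the corrected output matches the LR input exactly under this assumed degradation and its reconstruction error cannot increase.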

preprint, 2020, arXiv

Unique Properties of Flat Minima in Deep Networks

It is well known that (stochastic) gradient descent has an implicit bias towards flat minima. In deep neural network training, this mechanism serves to screen out minima. However, the precise effect that this has on the trained network is not yet fully understood. In this paper, we characterize the flat minima in linear neural networks trained with a quadratic loss. First, we show that linear ResNets with zero initialization necessarily converge to the flattest of all minima. We then prove that these minima correspond to nearly balanced networks whereby the gain from the input to any intermediate representation does not change drastically from one layer to the next. Finally, we show that consecutive layers in flat minima solutions are coupled. That is, one of the left singular vectors of each weight matrix equals one of the right singular vectors of the next matrix. This forms a distinct path from input to output that, as we show, is dedicated to the signal that experiences the largest gain end-to-end. Experiments indicate that these properties are characteristic of both linear and nonlinear models trained in practice.

preprint, 2016, arXiv

LMMSE Filtering in Feedback Systems with White Random Modes: Application to Tracking in Clutter

A generalized state space representation of dynamical systems with random modes switching according to a white random process is presented. The new formulation includes a term, in the dynamics equation, that depends on the most recent linear minimum mean squared error (LMMSE) estimate of the state. This can model the behavior of a feedback control system featuring a state estimator. The measurement equation is allowed to depend on the previous LMMSE estimate of the state, which can represent the fact that measurements are obtained from a validation window centered about the predicted measurement and not from the entire surveillance region. The LMMSE filter is derived for the considered problem. The approach is demonstrated in the context of target tracking in clutter and is shown to be competitive with several popular nonlinear methods.

preprint, 2016, arXiv

Nonparametric Canonical Correlation Analysis

Canonical correlation analysis (CCA) is a classical representation learning technique for finding correlated variables in multi-view data. Several nonlinear extensions of the original linear CCA have been proposed, including kernel and deep neural network methods. These approaches seek maximally correlated projections among families of functions, which the user specifies (by choosing a kernel or neural network structure), and are computationally demanding. Interestingly, the theory of nonlinear CCA, without functional restrictions, had been studied in the population setting by Lancaster already in the 1950s, but these results have not inspired practical algorithms. We revisit Lancaster's theory to devise a practical algorithm for nonparametric CCA (NCCA). Specifically, we show that the solution can be expressed in terms of the singular value decomposition of a certain operator associated with the joint density of the views. Thus, by estimating the population density from data, NCCA reduces to solving an eigenvalue system, superficially like kernel CCA but, importantly, without requiring the inversion of any kernel matrix. We also derive a partially linear CCA (PLCCA) variant in which one of the views undergoes a linear projection while the other is nonparametric. Using a kernel density estimate based on a small number of nearest neighbors, our NCCA and PLCCA algorithms are memory-efficient, often run much faster, and perform better than kernel CCA and comparable to deep CCA.
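For orientation, the classical linear CCA that this abstract takes as its starting point can be computed from an SVD of the whitened cross-covariance matrix. The sketch below shows only that linear baseline, not NCCA or PLCCA; the function name linear_cca, the small ridge term reg, and the synthetic two-view data are assumptions made for illustration.

```python
import numpy as np

def linear_cca(X, Y, k, reg=1e-8):
    """Classical linear CCA. X: (n, dx), Y: (n, dy). Returns projections A, B and correlations."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    n = X.shape[0]
    # Sample covariances; reg is a small ridge term added for numerical stability (assumption).
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)

    def inv_sqrt(S):
        # Inverse matrix square root of a symmetric positive definite matrix.
        w, V = np.linalg.eigh(S)
        return V @ np.diag(w ** -0.5) @ V.T

    Wx, Wy = inv_sqrt(Sxx), inv_sqrt(Syy)
    # Singular values of the whitened cross-covariance are the canonical correlations.
    U, rho, Vt = np.linalg.svd(Wx @ Sxy @ Wy)
    return Wx @ U[:, :k], Wy @ Vt.T[:, :k], rho[:k]

# Synthetic two-view data sharing one latent variable z (illustrative only).
rng = np.random.default_rng(0)
n = 2000
z = rng.standard_normal(n)
X = np.column_stack([z + 0.1 * rng.standard_normal(n), rng.standard_normal(n)])
Y = np.column_stack([rng.standard_normal(n), -z + 0.1 * rng.standard_normal(n)])

A, B, rho = linear_cca(X, Y, k=1)
u = (X - X.mean(axis=0)) @ A[:, 0]
v = (Y - Y.mean(axis=0)) @ B[:, 0]
corr = np.corrcoef(u, v)[0, 1]   # empirical correlation of the top canonical pair
```

The top singular value rho[0] agrees with the empirical correlation of the projected views, which is the quantity linear CCA maximizes.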

preprint, 2012, arXiv

Semi-Supervised Single- and Multi-Domain Regression with Multi-Domain Training

We address the problems of multi-domain and single-domain regression based on distinct and unpaired labeled training sets for each of the domains and a large unlabeled training set from all domains. We formulate these problems as a Bayesian estimation with partial knowledge of statistical relations. We propose a worst-case design strategy and study the resulting estimators. Our analysis explicitly accounts for the cardinality of the labeled sets and includes the special cases in which one of the labeled sets is very large or, in the other extreme, completely missing. We demonstrate our estimators in the context of removing expressions from facial images and in the context of audio-visual word recognition, and provide comparisons to several recently proposed multi-modal learning algorithms.

preprint, 2011, arXiv

Partially Linear Estimation with Application to Sparse Signal Recovery From Measurement Pairs

We address the problem of estimating a random vector X from two sets of measurements Y and Z, such that the estimator is linear in Y. We show that the partially linear minimum mean squared error (PLMMSE) estimator does not require knowing the joint distribution of X and Y in full, but rather only its second-order moments. This renders it of potential interest in various applications. We further show that the PLMMSE method is minimax-optimal among all estimators that solely depend on the second-order statistics of X and Y. We demonstrate our approach in the context of recovering a signal, which is sparse in a unitary dictionary, from noisy observations of it and of a filtered version of it. We show that in this setting PLMMSE estimation has a clear computational advantage, while its performance is comparable to state-of-the-art algorithms. We apply our approach both in static and dynamic estimation applications. In the former category, we treat the problem of image enhancement from blurred/noisy image pairs, where we show that PLMMSE estimation performs only slightly worse than state-of-the-art algorithms, while running an order of magnitude faster. In the dynamic setting, we provide a recursive implementation of the estimator and demonstrate its utility in the context of tracking maneuvering targets from position and acceleration measurements.
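The reliance on second-order moments alone is easiest to see in the fully linear special case, where there is no second measurement Z. The sketch below implements only that standard LMMSE estimator, mean of X plus Cov(X, Y) Cov(Y)^-1 (Y minus mean of Y), and is not the PLMMSE estimator from the paper. The linear observation model, the matrix H and the noise level sigma are assumptions chosen for illustration.

```python
import numpy as np

def lmmse(Y, mu_x, mu_y, C_xy, C_yy):
    """Standard LMMSE estimate of X from Y; each row of Y is one observation."""
    G = np.linalg.solve(C_yy, C_xy.T)   # C_yy^{-1} C_yx, shape (dy, dx)
    return mu_x + (Y - mu_y) @ G

rng = np.random.default_rng(0)
n, sigma = 5000, 0.5
H = np.array([[1.0, 0.5], [0.2, 1.0]])               # hypothetical observation matrix
X = rng.standard_normal((n, 2))                      # zero mean, identity covariance
Y = X @ H.T + sigma * rng.standard_normal((n, 2))    # noisy linear observations

# Only first- and second-order moments enter the estimator.
C_xy = H.T                                           # Cov(X, Y) when Cov(X) = I
C_yy = H @ H.T + sigma**2 * np.eye(2)                # Cov(Y)
X_hat = lmmse(Y, np.zeros(2), np.zeros(2), C_xy, C_yy)

mse = np.mean(np.sum((X_hat - X) ** 2, axis=1))      # error after using Y
prior_mse = np.mean(np.sum(X ** 2, axis=1))          # error of the prior-mean guess
```

Nothing about the distribution of X or the noise beyond means and covariances is used, which is the property the abstract extends to the partially linear setting.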

preprint, 2011, arXiv

Xampling at the Rate of Innovation

We address the problem of recovering signals from samples taken at their rate of innovation. Our only assumption is that the sampling system is such that the parameters defining the signal can be stably determined from the samples, a condition that lies at the heart of every sampling theorem. Consequently, our analysis subsumes previously studied nonlinear acquisition devices and nonlinear signal classes. In particular, we do not restrict attention to memoryless nonlinear distortions or to union-of-subspace models. This allows treatment of various finite-rate-of-innovation (FRI) signals that were not previously studied, including, for example, continuous phase modulation transmissions. Our strategy relies on minimizing the error between the measured samples and those corresponding to our signal estimate. This least-squares (LS) objective is generally non-convex and might possess many local minima. Nevertheless, we prove that under the stability hypothesis, any optimization method designed to trap a stationary point of the LS criterion necessarily converges to the true solution. We demonstrate our approach in the context of recovering pulse streams in settings that were not previously treated. Furthermore, in situations for which other algorithms are applicable, we show that our method is often preferable in terms of noise robustness.

preprint, 2010, arXiv

Performance Bounds and Design Criteria for Estimating Finite Rate of Innovation Signals

In this paper, we consider the problem of estimating finite rate of innovation (FRI) signals from noisy measurements, and specifically analyze the interaction between FRI techniques and the underlying sampling methods. We first obtain a fundamental limit on the estimation accuracy attainable regardless of the sampling method. Next, we provide a bound on the performance achievable using any specific sampling approach. Essential differences between the noisy and noise-free cases arise from this analysis. In particular, we identify settings in which noise-free recovery techniques deteriorate substantially under slight noise levels, thus quantifying the numerical instability inherent in such methods. This instability, which is only present in some families of FRI signals, is shown to be related to a specific type of structure, which can be characterized by viewing the signal model as a union of subspaces. Finally, we develop a methodology for choosing the optimal sampling kernels based on a generalization of the Karhunen-Loève transform. The results are illustrated for several types of time-delay estimation problems.

preprint, 2009, arXiv

Non-Invertible Gabor Transforms

Time-frequency analysis, such as the Gabor transform, plays an important role in many signal processing applications. The redundancy of such representations is often directly related to the computational load of any algorithm operating in the transform domain. To reduce complexity, it may be desirable to increase the time and frequency sampling intervals beyond the point where the transform is invertible, at the cost of an inevitable recovery error. In this paper we initiate the study of recovery procedures for non-invertible Gabor representations. We propose using fixed analysis and synthesis windows, chosen, e.g., according to implementation constraints, and processing the Gabor coefficients prior to synthesis in order to shape the reconstructed signal. We develop three methods to tackle this problem. The first follows from the consistency requirement, namely that the recovered signal has the same Gabor representation as the input signal. The second is based on the minimization of a worst-case error criterion. Last, we develop a recovery technique based on the assumption that the input signal lies in some subspace of $L_2$. We show that for each of the criteria, the manipulation of the transform coefficients amounts to a 2D twisted convolution operation, which we show how to perform using a filter bank. When the under-sampling factor is an integer, the processing reduces to standard 2D convolution. We provide simulation results to demonstrate the advantages and weaknesses of each of the algorithms.
