Source author record

M. Salman Asif

M. Salman Asif appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Computer Vision eess.IV Machine Learning Information Theory math.IT math.OC Computation Cryptography and Security physics.optics

Catalog footprint

What is connected

16 works

9 topics

4 close collaborators

preprint, 2023, arXiv

Coded Illumination for Improved Lensless Imaging

Mask-based lensless cameras can be flat, thin, and light-weight, which makes them suitable for novel designs of computational imaging systems with large surface areas and arbitrary shapes. Despite recent progress in lensless cameras, the quality of images recovered from the lensless cameras is often poor due to the ill-conditioning of the underlying measurement system. In this paper, we propose to use coded illumination to improve the quality of images reconstructed with lensless cameras. In our imaging model, the scene/object is illuminated by multiple coded illumination patterns as the lensless camera records sensor measurements. We designed and tested a number of illumination patterns and observed that shifting dots (and related orthogonal) patterns provide the best overall performance. We propose a fast and low-complexity recovery algorithm that exploits the separability and block-diagonal structure in our system. We present simulation results and hardware experiment results to demonstrate that our proposed method can significantly improve the reconstruction quality.

preprint, 2022, arXiv

H2-Stereo: High-Speed, High-Resolution Stereoscopic Video System

High-speed, high-resolution stereoscopic (H2-Stereo) video allows us to perceive dynamic 3D content at fine granularity. The acquisition of H2-Stereo video, however, remains challenging with commodity cameras. Existing spatial super-resolution or temporal frame interpolation methods provide compromised solutions that lack temporal or spatial details, respectively. To alleviate this problem, we propose a dual camera system, in which one camera captures high-spatial-resolution low-frame-rate (HSR-LFR) videos with rich spatial details, and the other captures low-spatial-resolution high-frame-rate (LSR-HFR) videos with smooth temporal details. We then devise a Learned Information Fusion network (LIFnet) that exploits the cross-camera redundancies to enhance both camera views to high spatiotemporal resolution (HSTR) for reconstructing the H2-Stereo video effectively. We utilize a disparity network to transfer spatiotemporal information across views even in large disparity scenes, based on which, we propose disparity-guided flow-based warping for LSR-HFR view and complementary warping for HSR-LFR view. A multi-scale fusion method in feature domain is proposed to minimize occlusion-induced warping ghosts and holes in HSR-LFR view. The LIFnet is trained in an end-to-end manner using our collected high-quality Stereo Video dataset from YouTube. Extensive experiments demonstrate that our model outperforms existing state-of-the-art methods for both views on synthetic data and camera-captured real data with large disparity. Ablation studies explore various aspects, including spatiotemporal resolution, camera baseline, camera desynchronization, long/short exposures and applications, of our system to fully understand its capability for potential applications.

preprint, 2022, arXiv

Incremental Task Learning with Incremental Rank Updates

Incremental task learning (ITL) is a category of continual learning that seeks to train a single network for multiple tasks (one after another), where training data for each task is only available during the training of that task. Neural networks tend to forget older tasks when they are trained for the newer tasks; this property is often known as catastrophic forgetting. To address this issue, ITL methods use episodic memory, parameter regularization, masking and pruning, or extensible network structures. In this paper, we propose a new incremental task learning framework based on low-rank factorization. In particular, we represent the network weights for each layer as a linear combination of several rank-1 matrices. To update the network for a new task, we learn a rank-1 (or low-rank) matrix and add that to the weights of every layer. We also introduce an additional selector vector that assigns different weights to the low-rank matrices learned for the previous tasks. We show that our approach performs better than the current state-of-the-art methods in terms of accuracy and forgetting. Our method also offers better memory efficiency compared to episodic memory- and mask-based approaches. Our code will be available at https://github.com/CSIPlab/task-increment-rank-update.git
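
The weight representation described in this abstract can be illustrated with a minimal sketch. It assumes a single layer whose weights are W = sum_k s_k * u_k * v_k^T, where each (u_k, v_k) pair is a rank-1 factor and s is the per-task selector vector. The function names and the toy numbers are hypothetical and are not taken from the authors' repository; no training is shown.

```python
def outer(u, v):
    """Rank-1 matrix u v^T as a list of rows."""
    return [[ui * vj for vj in v] for ui in u]


def layer_weights(factors, selector):
    """Combine rank-1 factors into one weight matrix: W = sum_k s_k * u_k v_k^T."""
    rows, cols = len(factors[0][0]), len(factors[0][1])
    weights = [[0.0] * cols for _ in range(rows)]
    for s, (u, v) in zip(selector, factors):
        rank_one = outer(u, v)
        for i in range(rows):
            for j in range(cols):
                weights[i][j] += s * rank_one[i][j]
    return weights


# Task 1: a single rank-1 factor (toy values, assumed already learned).
factors = [([1.0, 0.0], [1.0, 2.0])]
w_task1 = layer_weights(factors, [1.0])

# Task 2: earlier factors stay frozen; one new factor is appended and a
# task-specific selector reweights the factor learned for task 1.
factors.append(([0.0, 1.0], [3.0, 1.0]))
w_task2 = layer_weights(factors, [0.5, 1.0])
```

Under these assumptions, each new task stores only one extra (u, v) pair per layer plus a short selector vector, and the weights for an earlier task can be rebuilt from its own selector and the first factors.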

preprint, 2022, arXiv

Zero-Query Transfer Attacks on Context-Aware Object Detectors

Adversarial attacks perturb images such that a deep neural network produces incorrect classification results. A promising approach to defend against adversarial attacks on natural multi-object scenes is to impose a context-consistency check, wherein, if the detected objects are not consistent with an appropriately defined context, then an attack is suspected. Stronger attacks are needed to fool such context-aware detectors. We present the first approach for generating context-consistent adversarial attacks that can evade the context-consistency check of black-box object detectors operating on complex, natural scenes. Unlike many black-box attacks that perform repeated attempts and open themselves to detection, we assume a "zero-query" setting, where the attacker has no knowledge of the classification decisions of the victim system. First, we derive multiple attack plans that assign incorrect labels to victim objects in a context-consistent manner. Then we design and use a novel data structure that we call the perturbation success probability matrix, which enables us to filter the attack plans and choose the one most likely to succeed. This final attack plan is implemented using a perturbation-bounded adversarial attack algorithm. We compare our zero-query attack against a few-query scheme that repeatedly checks if the victim system is fooled. We also compare against state-of-the-art context-agnostic attacks. Against a context-aware defense, the fooling rate of our zero-query approach is significantly higher than context-agnostic approaches and higher than that achievable with up to three rounds of the few-query scheme.

preprint, 2020, arXiv

A Dual Camera System for High Spatiotemporal Resolution Video Acquisition

This paper presents a dual camera system for high spatiotemporal resolution (HSTR) video acquisition, where one camera shoots a video with high spatial resolution and low frame rate (HSR-LFR) and another one captures a low spatial resolution and high frame rate (LSR-HFR) video. Our main goal is to combine videos from LSR-HFR and HSR-LFR cameras to create an HSTR video. We propose an end-to-end learning framework, AWnet, mainly consisting of a FlowNet and a FusionNet that learn an adaptive weighting function in pixel domain to combine inputs in a frame recurrent fashion. To improve the reconstruction quality for cameras used in reality, we also introduce noise regularization under the same framework. Our method has demonstrated noticeable performance gains in terms of both objective PSNR measurement in simulation with different publicly available video and light-field datasets and subjective evaluation with real data captured by dual iPhone 7 and Grasshopper3 cameras. Ablation studies are further conducted to investigate and explore various aspects (such as reference structure, camera parallax, exposure time, etc.) of our system to fully understand its capability for potential applications.

preprint, 2020, arXiv

Joint Image and Depth Estimation with Mask-Based Lensless Cameras

Mask-based lensless cameras replace the lens of a conventional camera with a custom mask. These cameras can potentially be very thin and even flexible. Recently, it has been demonstrated that such mask-based cameras can recover light intensity and depth information of a scene. Existing depth recovery algorithms either assume that the scene consists of a small number of depth planes or solve a sparse recovery problem over a large 3D volume. Both these approaches fail to recover the scenes with large depth variations. In this paper, we propose a new approach for depth estimation based on an alternating gradient descent algorithm that jointly estimates a continuous depth map and light distribution of the unknown scene from its lensless measurements. We present simulation results on image and depth reconstruction for a variety of 3D test scenes. A comparison between the proposed algorithm and other methods shows that our algorithm is more robust for natural scenes with a large range of depths. We built a prototype lensless camera and present experimental results for reconstruction of intensity and depth maps of different real objects.

preprint, 2020, arXiv

Learning Illumination Patterns for Coded Diffraction Phase Retrieval

Signal recovery from nonlinear measurements involves solving an iterative optimization problem. In this paper, we present a framework to optimize the sensing parameters to improve the quality of the signal recovered by the given iterative method. In particular, we learn illumination patterns to recover signals from coded diffraction patterns using a fixed-cost alternating minimization-based phase retrieval method. Coded diffraction phase retrieval is a physically realistic system in which the signal is first modulated by a sequence of codes before the sensor records its Fourier amplitude. We represent the phase retrieval method as an unrolled network with a fixed number of layers and minimize the recovery error by optimizing over the measurement parameters. Since the number of iterations/layers is fixed, the recovery incurs a fixed cost. We present extensive simulation results on a variety of datasets under different conditions and a comparison with existing methods. Our results demonstrate that the proposed method provides near-perfect reconstruction using patterns learned with a small number of training images. Our proposed method provides significant improvements over existing methods both in terms of accuracy and speed.

preprint, 2020, arXiv

Non-Adversarial Video Synthesis with Learned Priors

Most of the existing works in video synthesis focus on generating videos using adversarial learning. Despite their success, these methods often require an input reference frame or fail to generate diverse videos from the given data distribution, with little to no uniformity in the quality of videos that can be generated. Different from these methods, we focus on the problem of generating videos from latent noise vectors, without any reference input frames. To this end, we develop a novel approach that jointly optimizes the input latent space, the weights of a recurrent neural network and a generator through non-adversarial learning. Optimizing for the input latent space along with the network weights allows us to generate videos in a controlled environment, i.e., we can faithfully generate all videos the model has seen during the learning process as well as new unseen videos. Extensive experiments on three challenging and diverse datasets demonstrate that our approach generates superior quality videos compared to the existing state-of-the-art methods.

preprint, 2020, arXiv

Solving Phase Retrieval with a Learned Reference

Fourier phase retrieval is a classical problem that deals with the recovery of an image from the amplitude measurements of its Fourier coefficients. Conventional methods solve this problem via iterative (alternating) minimization by leveraging some prior knowledge about the structure of the unknown image. The inherent ambiguities about shift and flip in the Fourier measurements make this problem especially difficult, and most of the existing methods use several random restarts with different permutations. In this paper, we assume that a known (learned) reference is added to the signal before capturing the Fourier amplitude measurements. Our method is inspired by the principle of adding a reference signal in holography. To recover the signal, we implement an iterative phase retrieval method as an unrolled network. Then we use back propagation to learn the reference that provides us the best reconstruction for a fixed number of phase retrieval iterations. We performed a number of simulations on a variety of datasets under different conditions and found that our proposed method for phase retrieval via unrolled network and learned reference provides near-perfect recovery at fixed (small) computational cost. We compared our method with standard Fourier phase retrieval methods and observed significant performance enhancement using the learned reference.
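
The flip ambiguity mentioned in this abstract, and the effect of adding a known reference before measuring, can be checked on a toy 1D signal. The sketch below only illustrates the measurement model y = |F(x + u)|; it does not implement the unrolled network or learn the reference. The reference u is a hand-picked hypothetical vector, and the helper names are illustrative.

```python
import cmath


def fourier_amplitude(x):
    """Magnitudes of the discrete Fourier transform of a real sequence."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
        for k in range(n)
    ]


def circular_flip(x):
    """Reverse a sequence circularly: flipped[t] = x[-t mod n]."""
    n = len(x)
    return [x[(-t) % n] for t in range(n)]


def close(a, b, tol=1e-9):
    """Elementwise comparison with a small numerical tolerance."""
    return all(abs(p - q) < tol for p, q in zip(a, b))


def add(a, b):
    return [p + q for p, q in zip(a, b)]


x = [1.0, 2.0, 3.0, 0.0]
x_flipped = circular_flip(x)

# Without a reference, the signal and its flip give identical amplitudes.
same_without_ref = close(fourier_amplitude(x), fourier_amplitude(x_flipped))

# Hypothetical fixed reference (not learned); chosen to be asymmetric.
u = [0.0, 1.0, 0.0, 0.0]
same_with_ref = close(
    fourier_amplitude(add(x, u)), fourier_amplitude(add(x_flipped, u))
)
```

In this toy case the two signals are indistinguishable from amplitude measurements alone, but become distinguishable once the same known reference is added to both.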

preprint, 2019, arXiv

Generative Models for Low-Rank Video Representation and Reconstruction

Finding a compact representation of videos is an essential component in almost every problem related to video processing or understanding. In this paper, we propose a generative model to learn compact latent codes that can efficiently represent and reconstruct a video sequence from its missing or under-sampled measurements. We use a generative network that is trained to map a compact code into an image. We first demonstrate that if a video sequence belongs to the range of the pretrained generative network, then we can recover it by estimating the underlying compact latent codes. Then we demonstrate that even if the video sequence does not belong to the range of a pretrained network, we can still recover the true video sequence by jointly updating the latent codes and the weights of the generative network. To avoid overfitting in our model, we regularize the recovery problem by imposing low-rank and similarity constraints on the latent codes of the neighboring frames in the video sequence. We use our methods to recover a variety of videos from compressive measurements at different compression rates. We also demonstrate that we can generate missing frames in a video sequence by interpolating the latent codes of the observed frames in the low-dimensional space.

preprint, 2016, arXiv

FlatCam: Thin, Bare-Sensor Cameras using Coded Aperture and Computation

FlatCam is a thin form-factor lensless camera that consists of a coded mask placed on top of a bare, conventional sensor array. Unlike a traditional, lens-based camera where an image of the scene is directly recorded on the sensor pixels, each pixel in FlatCam records a linear combination of light from multiple scene elements. A computational algorithm is then used to demultiplex the recorded measurements and reconstruct an image of the scene. FlatCam is an instance of a coded aperture imaging system; however, unlike the vast majority of related work, we place the coded mask extremely close to the image sensor, which can enable a thin system. We employ a separable mask to ensure that both calibration and image reconstruction are scalable in terms of memory requirements and computational complexity. We demonstrate the potential of the FlatCam design using two prototypes: one at visible wavelengths and one at infrared wavelengths.
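
The scalability argument for a separable mask can be made concrete with a small sketch. It assumes the separable measurement model Y = Phi_L X Phi_R^T for an n x n scene X and an m x m sensor, and compares how many matrix entries must be calibrated and stored against a general (non-separable) system matrix. The matrices and function names below are toy, hypothetical values, not calibration data from the prototypes.

```python
def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(a):
    return [list(row) for row in zip(*a)]


def separable_measure(phi_l, scene, phi_r):
    """Sensor measurements under the assumed model Y = Phi_L X Phi_R^T."""
    return matmul(matmul(phi_l, scene), transpose(phi_r))


def storage_entries(m, n):
    """Entries to store: (general m^2-by-n^2 matrix, two m-by-n factors)."""
    return (m * m) * (n * n), 2 * m * n


# Toy 2 x 2 example: every sensor value mixes several scene elements.
phi_l = [[1.0, 1.0], [1.0, -1.0]]
phi_r = [[1.0, 0.0], [1.0, 1.0]]
scene = [[1.0, 2.0], [3.0, 4.0]]
measurements = separable_measure(phi_l, scene, phi_r)

general, separable = storage_entries(1000, 1000)
```

For a megapixel scene and sensor (m = n = 1000), the general matrix would need 10^12 entries while the two separable factors need 2 x 10^6, which is the kind of saving the abstract refers to.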

preprint, 2015, arXiv

FPA-CS: Focal Plane Array-based Compressive Imaging in Short-wave Infrared

Cameras for imaging in short and mid-wave infrared spectra are significantly more expensive than their counterparts in visible imaging. As a result, high-resolution imaging in those spectra remains beyond the reach of most consumers. Over the last decade, compressive sensing (CS) has emerged as a potential means to realize inexpensive short-wave infrared cameras. One approach for doing this is the single-pixel camera (SPC) where a single detector acquires coded measurements of a high-resolution image. A computational reconstruction algorithm is then used to recover the image from these coded measurements. Unfortunately, the measurement rate of an SPC is insufficient to enable imaging at high spatial and temporal resolutions. We present a focal plane array-based compressive sensing (FPA-CS) architecture that achieves high spatial and temporal resolutions. The idea is to use an array of SPCs that sense in parallel to increase the measurement rate, and consequently, the achievable spatio-temporal resolution of the camera. We develop a proof-of-concept prototype in the short-wave infrared using a sensor with $64 \times 64$ pixels; the prototype provides a $4096\times$ increase in the measurement rate compared to the SPC and achieves a megapixel resolution at video rate using CS techniques.
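
The measurement-rate figure quoted in this abstract follows from simple counting, assuming each sensor pixel acts as an independent single-pixel detector running at the same rate $R_{\text{SPC}}$ (the symbols $R_{\text{SPC}}$ and $R_{\text{FPA-CS}}$ are introduced here for illustration only):

```latex
\[
  \frac{R_{\text{FPA-CS}}}{R_{\text{SPC}}}
  = \underbrace{64 \times 64}_{\text{parallel detectors}}
  = 4096 .
\]
```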

preprint, 2015, arXiv

Toward Long Distance, Sub-diffraction Imaging Using Coherent Camera Arrays

In this work, we propose using camera arrays coupled with coherent illumination as an effective method of improving spatial resolution in long distance images by a factor of ten and beyond. Recent advances in ptychography have demonstrated that one can image beyond the diffraction limit of the objective lens in a microscope. We demonstrate a similar imaging system to image beyond the diffraction limit in long range imaging. We emulate a camera array with a single camera attached to an X-Y translation stage. We show that an appropriate phase retrieval based reconstruction algorithm can be used to effectively recover the lost high resolution details from the multiple low resolution acquired images. We analyze the effects of noise, required degree of image overlap, and the effect of increasing synthetic aperture size on the reconstructed image quality. We show that coherent camera arrays have the potential to greatly improve imaging performance. Our simulations show resolution gains of 10x and more are achievable. Furthermore, experimental results from our proof-of-concept systems show resolution gains of 4x-7x for real scenes. Finally, we introduce and analyze in simulation a new strategy to capture macroscopic Fourier Ptychography images in a single snapshot, albeit using a camera array.

preprint, 2013, arXiv

Sparse Recovery of Streaming Signals Using L1-Homotopy

Most of the existing methods for sparse signal recovery assume a static system: the unknown signal is a finite-length vector for which a fixed set of linear measurements and a sparse representation basis are available and an L1-norm minimization program is solved for the reconstruction. However, the same representation and reconstruction framework is not readily applicable in a streaming system: the unknown signal changes over time, and it is measured and reconstructed sequentially over small time intervals. In this paper, we discuss two such streaming systems and a homotopy-based algorithm for quickly solving the associated L1-norm minimization programs: 1) Recovery of a smooth, time-varying signal for which, instead of using block transforms, we use lapped orthogonal transforms for sparse representation. 2) Recovery of a sparse, time-varying signal that follows a linear dynamic model. For both systems, we iteratively process measurements over a sliding interval and estimate sparse coefficients by solving a weighted L1-norm minimization program. Instead of solving a new L1 program from scratch at every iteration, we use an available signal estimate as a starting point in a homotopy formulation. Starting with a warm-start vector, our homotopy algorithm updates the solution in a small number of computationally inexpensive steps as the system changes. The homotopy algorithm presented in this paper is highly versatile as it can update the solution for the L1 problem in a number of dynamical settings. We demonstrate with numerical experiments that our proposed streaming recovery framework outperforms the methods that represent and reconstruct a signal as independent, disjoint blocks, in terms of quality of reconstruction, and that our proposed homotopy-based updating scheme outperforms current state-of-the-art solvers in terms of the computation time and complexity.

preprint, 2012, arXiv

Fast and Accurate Algorithms for Re-Weighted L1-Norm Minimization

To recover a sparse signal from an underdetermined system, we often solve a constrained L1-norm minimization problem. In many cases, the signal sparsity and the recovery performance can be further improved by replacing the L1 norm with a "weighted" L1 norm. Without any prior information about nonzero elements of the signal, the procedure for selecting weights is iterative in nature. Common approaches update the weights at every iteration using the solution of a weighted L1 problem from the previous iteration. In this paper, we present two homotopy-based algorithms that efficiently solve reweighted L1 problems. First, we present an algorithm that quickly updates the solution of a weighted L1 problem as the weights change. Since the solution changes only slightly with small changes in the weights, we develop a homotopy algorithm that replaces the old weights with the new ones in a small number of computationally inexpensive steps. Second, we propose an algorithm that solves a weighted L1 problem by adaptively selecting the weights while estimating the signal. This algorithm integrates the reweighting into every step along the homotopy path by changing the weights according to the changes in the solution and its support, allowing us to achieve a high quality signal reconstruction by solving a single homotopy problem. We compare the performance of both algorithms, in terms of reconstruction accuracy and computational complexity, against state-of-the-art solvers and show that our methods have smaller computational cost. In addition, we will show that the adaptive selection of the weights inside the homotopy often yields reconstructions of higher quality.
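
The iterative reweighting loop described in this abstract can be sketched in a deliberately simplified setting. The example assumes an identity measurement matrix, so each weighted L1 problem min 0.5 * ||x - y||^2 + lam * sum_i w_i * |x_i| has a closed-form soft-thresholding solution, and it assumes the common weight rule w_i = 1 / (|x_i| + eps). It is not the homotopy algorithm from the paper, and the function names, lam and eps are illustrative choices.

```python
def soft_threshold(value, threshold):
    """Shrink a scalar toward zero by the given threshold."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def reweighted_l1_denoise(y, lam, eps=0.1, iterations=2):
    """Iteratively reweighted L1 for the special case of an identity matrix.

    Each pass solves the weighted problem in closed form, then updates the
    weights from the current solution (assumed rule: 1 / (|x_i| + eps)).
    """
    weights = [1.0] * len(y)
    x = list(y)
    for _ in range(iterations):
        x = [soft_threshold(yi, lam * wi) for yi, wi in zip(y, weights)]
        weights = [1.0 / (abs(xi) + eps) for xi in x]
    return x


observed = [3.0, 0.2]
first_pass = reweighted_l1_denoise(observed, lam=0.5, iterations=1)
second_pass = reweighted_l1_denoise(observed, lam=0.5, iterations=2)
```

In this toy run the small entry stays at zero while the large entry is shrunk less on the second pass, which is the intended effect of reweighting: large coefficients are penalized less and small ones more.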

preprint, 2009, arXiv

Channel Protection: Random Coding Meets Sparse Channels

Multipath interference is a ubiquitous phenomenon in modern communication systems. The conventional way to compensate for this effect is to equalize the channel by estimating its impulse response by transmitting a set of training symbols. The primary drawback to this type of approach is that it can be unreliable if the channel is changing rapidly. In this paper, we show that randomly encoding the signal can protect it against channel uncertainty when the channel is sparse. Before transmission, the signal is mapped into a slightly longer codeword using a random matrix. From the received signal, we are able to simultaneously estimate the channel and recover the transmitted signal. We discuss two schemes for the recovery. Both of them exploit the sparsity of the underlying channel. We show that if the channel impulse response is sufficiently sparse, the transmitted signal can be recovered reliably.
