Source author record

Kaushik Mitra

Kaushik Mitra appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Computer Vision eess.IV cond-mat.mes-hall cond-mat.other cond-mat.str-el Machine Learning math-ph math.MP

Catalog footprint

What is connected

12works

8topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

LWGNet: Learned Wirtinger Gradients for Fourier Ptychographic Phase Retrieval

Fourier Ptychographic Microscopy (FPM) is an imaging procedure that overcomes the traditional limit on Space-Bandwidth Product (SBP) of conventional microscopes through computational means. It utilizes multiple images captured using a low numerical aperture (NA) objective and enables high-resolution phase imaging through frequency domain stitching. Existing FPM reconstruction methods can be broadly categorized into two approaches: iterative optimization based methods, which are based on the physics of the forward imaging model, and data-driven methods which commonly employ a feed-forward deep learning framework. We propose a hybrid model-driven residual network that combines the knowledge of the forward imaging system with a deep data-driven network. Our proposed architecture, LWGNet, unrolls traditional Wirtinger flow optimization algorithm into a novel neural network design that enhances the gradient images through complex convolutional blocks. Unlike other conventional unrolling techniques, LWGNet uses fewer stages while performing at par or even better than existing traditional and deep learning techniques, particularly, for low-cost and low dynamic range CMOS sensors. This improvement in performance for low-bit depth and low-cost sensors has the potential to bring down the cost of FPM imaging setup significantly. Finally, we show consistently improved performance on our collected real data.

preprint2022arXiv

Synthesizing Light Field Video from Monocular Video

The hardware challenges associated with light-field(LF) imaging has made it difficult for consumers to access its benefits like applications in post-capture focus and aperture control. Learning-based techniques which solve the ill-posed problem of LF reconstruction from sparse (1, 2 or 4) views have significantly reduced the requirement for complex hardware. LF video reconstruction from sparse views poses a special challenge as acquiring ground-truth for training these models is hard. Hence, we propose a self-supervised learning-based algorithm for LF video reconstruction from monocular videos. We use self-supervised geometric, photometric and temporal consistency constraints inspired from a recent self-supervised technique for LF video reconstruction from stereo video. Additionally, we propose three key techniques that are relevant to our monocular video input. We propose an explicit disocclusion handling technique that encourages the network to inpaint disoccluded regions in a LF frame, using information from adjacent input temporal frames. This is crucial for a self-supervised technique as a single input frame does not contain any information about the disoccluded regions. We also propose an adaptive low-rank representation that provides a significant boost in performance by tailoring the representation to each input scene. Finally, we also propose a novel refinement block that is able to exploit the available LF image data using supervised learning to further refine the reconstruction quality. Our qualitative and quantitative analysis demonstrates the significance of each of the proposed building blocks and also the superior results compared to previous state-of-the-art monocular LF reconstruction techniques. We further validate our algorithm by reconstructing LF videos from monocular videos acquired using a commercial GoPro camera.

preprint2020arXiv

Deep Atrous Guided Filter for Image Restoration in Under Display Cameras

Under Display Cameras present a promising opportunity for phone manufacturers to achieve bezel-free displays by positioning the camera behind semi-transparent OLED screens. Unfortunately, such imaging systems suffer from severe image degradation due to light attenuation and diffraction effects. In this work, we present Deep Atrous Guided Filter (DAGF), a two-stage, end-to-end approach for image restoration in UDC systems. A Low-Resolution Network first restores image quality at low-resolution, which is subsequently used by the Guided Filter Network as a filtering input to produce a high-resolution output. Besides the initial downsampling, our low-resolution network uses multiple, parallel atrous convolutions to preserve spatial resolution and emulates multi-scale processing. Our approach's ability to directly train on megapixel images results in significant performance improvement. We additionally propose a simple simulation scheme to pre-train our model and boost performance. Our overall framework ranks 2nd and 5th in the RLQ-TOD'20 UDC Challenge for POLED and TOLED displays, respectively.

preprint2020arXiv

Monocular Retinal Depth Estimation and Joint Optic Disc and Cup Segmentation using Adversarial Networks

One of the important parameters for the assessment of glaucoma is optic nerve head (ONH) evaluation, which usually involves depth estimation and subsequent optic disc and cup boundary extraction. Depth is usually obtained explicitly from imaging modalities like optical coherence tomography (OCT) and is very challenging to estimate depth from a single RGB image. To this end, we propose a novel method using adversarial network to predict depth map from a single image. The proposed depth estimation technique is trained and evaluated using individual retinal images from INSPIRE-stereo dataset. We obtain a very high average correlation coefficient of 0.92 upon five fold cross validation outperforming the state of the art. We then use the depth estimation process as a proxy task for joint optic disc and cup segmentation.

preprint2020arXiv

multi-patch aggregation models for resampling detection

Images captured nowadays are of varying dimensions with smartphones and DSLR's allowing users to choose from a list of available image resolutions. It is therefore imperative for forensic algorithms such as resampling detection to scale well for images of varying dimensions. However, in our experiments, we observed that many state-of-the-art forensic algorithms are sensitive to image size and their performance quickly degenerates when operated on images of diverse dimensions despite re-training them using multiple image sizes. To handle this issue, we propose a novel pooling strategy called ITERATIVE POOLING. This pooling strategy can dynamically adjust input tensors in a discrete without much loss of information as in ROI Max-pooling. This pooling strategy can be used with any of the existing deep models and for demonstration purposes, we show its utility on Resnet-18 for the case of resampling detection a fundamental operation for any image sought of image manipulation. Compared to existing strategies and Max-pooling it gives up to 7-8% improvement on public datasets.

preprint2020arXiv

Optimal HDR and Depth from Dual Cameras

Dual camera systems have assisted in the proliferation of various applications, such as optical zoom, low-light imaging and High Dynamic Range (HDR) imaging. In this work, we explore an optimal method for capturing the scene HDR and disparity map using dual camera setups. Hasinoff et al. (2010) have developed a noise optimal framework for HDR capture from a single camera. We generalize this to the dual camera set-up for estimating both HDR and disparity map. It may seem that dual camera systems can capture HDR in a shorter time. However, disparity estimation is a necessary step, which requires overlap among the images captured by the two cameras. This may lead to an increase in the capture time. To address this conflicting requirement, we propose a novel framework to find the optimal exposure and ISO sequence by minimizing the capture time under the constraints of an upper bound on the disparity error and a lower bound on the per-exposure SNR. We show that the resulting optimization problem is non-convex in general and propose an appropriate initialization technique. To obtain the HDR and disparity map from the optimal capture sequence, we propose a pipeline which alternates between estimating the camera ICRFs and the scene disparity map. We demonstrate that our optimal capture sequence leads to better results than other possible capture sequences. Our results are also close to those obtained by capturing the full stereo stack spanning the entire dynamic range. Finally, we present for the first time a stereo HDR dataset consisting of dense ISO and exposure stack captured from a smartphone dual camera. The dataset consists of 6 scenes, with an average of 142 exposure-ISO image sequence per scene.

preprint2020arXiv

UDC 2020 Challenge on Image Restoration of Under-Display Camera: Methods and Results

This paper is the report of the first Under-Display Camera (UDC) image restoration challenge in conjunction with the RLQ workshop at ECCV 2020. The challenge is based on a newly-collected database of Under-Display Camera. The challenge tracks correspond to two types of display: a 4k Transparent OLED (T-OLED) and a phone Pentile OLED (P-OLED). Along with about 150 teams registered the challenge, eight and nine teams submitted the results during the testing phase for each track. The results in the paper are state-of-the-art restoration performance of Under-Display Camera Restoration. Datasets and paper are available at https://yzhouas.github.io/projects/UDC/udc.html.

preprint2019arXiv

Photorealistic Image Reconstruction from Hybrid Intensity and Event based Sensor

Event sensors output a stream of asynchronous brightness changes (called ``events'') at a very high temporal rate. Previous works on recovering the lost intensity information from the event sensor data have heavily relied on the event stream, which makes the reconstructed images non-photorealistic and also susceptible to noise in the event stream. We propose to reconstruct photorealistic intensity images from a hybrid sensor consisting of a low frame rate conventional camera, which has the scene texture information, along with the event sensor. To accomplish our task, we warp the low frame rate intensity images to temporally dense locations of the event data by estimating a spatially dense scene depth and temporally dense sensor ego-motion. The results obtained from our algorithm are more photorealistic compared to any of the previous state-of-the-art algorithms. We also demonstrate our algorithm's robustness to abrupt camera motion and noise in the event sensor data.

preprint2015arXiv

Spatial Phase-Sweep: Increasing temporal resolution of transient imaging using a light source array

Transient imaging or light-in-flight techniques capture the propagation of an ultra-short pulse of light through a scene, which in effect captures the optical impulse response of the scene. Recently, it has been shown that we can capture transient images using commercially available Time-of-Flight (ToF) systems such as Photonic Mixer Devices (PMD). In this paper, we propose `spatial phase-sweep', a technique that exploits the speed of light to increase the temporal resolution beyond the 100 picosecond limit imposed by current electronics. Spatial phase-sweep uses a linear array of light sources with spatial separation of about 3 mm between them, thereby resulting in a time shift of about 10 picoseconds, which translates into 100 Gfps of transient imaging in theory. We demonstrate a prototype and transient imaging results using spatial phase-sweep.

preprint2014arXiv

A Framework for the Analysis of Computational Imaging Systems with Practical Applications

Over the last decade, a number of Computational Imaging (CI) systems have been proposed for tasks such as motion deblurring, defocus deblurring and multispectral imaging. These techniques increase the amount of light reaching the sensor via multiplexing and then undo the deleterious effects of multiplexing by appropriate reconstruction algorithms. Given the widespread appeal and the considerable enthusiasm generated by these techniques, a detailed performance analysis of the benefits conferred by this approach is important. Unfortunately, a detailed analysis of CI has proven to be a challenging problem because performance depends equally on three components: (1) the optical multiplexing, (2) the noise characteristics of the sensor, and (3) the reconstruction algorithm. A few recent papers have performed analysis taking multiplexing and noise characteristics into account. However, analysis of CI systems under state-of-the-art reconstruction algorithms, most of which exploit signal prior models, has proven to be unwieldy. In this paper, we present a comprehensive analysis framework incorporating all three components. In order to perform this analysis, we model the signal priors using a Gaussian Mixture Model (GMM). A GMM prior confers two unique characteristics. Firstly, GMM satisfies the universal approximation property which says that any prior density function can be approximated to any fidelity using a GMM with appropriate number of mixtures. Secondly, a GMM prior lends itself to analytical tractability allowing us to derive simple expressions for the `minimum mean square error' (MMSE), which we use as a metric to characterize the performance of CI systems. We use our framework to analyze several previously proposed CI techniques, giving conclusive answer to the question: `How much performance gain is due to use of a signal prior and how much is due to multiplexing?

preprint2014arXiv

Free-energy functional of the electronic potential for Schrödinger-Poisson theory

In the study of model electronic device systems where electrons are typically under confinement, a key obstacle is the need to iteratively solve the coupled Schrödinger-Poisson (SP) equation. It is possible to bypass this obstacle by adopting a variational approach and obtaining the solution of the SP equation by minimizing a functional. Further, using molecular dynamics methods that treat the electronic potential as a dynamical variable, the functional can be minimized on the fly in conjunction with the update of other dynamical degrees of freedom leading to considerable reduction in computational costs. But such approaches require access to a true free-energy functional, one that evaluates to the equilibrium free energy at its minimum. In this paper, we present a variational formulation of the Schrödinger-Poisson (SP) theory with the needed free-energy functional of the electronic potential. We apply our formulation to semiconducting nanostructures and provide the expression of the free-energy functional for narrow channel quantum wells where the local density approximation yields accurate physics and for the case of wider channels where Thomas-Fermi approximation is valid.

preprint2008arXiv

Superfluid and Fermi liquid phases of Bose-Fermi mixtures in optical lattices

We describe interacting mixtures of ultracold bosonic and fermionic atoms in harmonically confined optical lattices. For a suitable choice of parameters we study the emergence of superfluid and Fermi liquid (non-insulating) regions out of Bose-Mott and Fermi-band insulators, due to finite Boson and Fermion hopping. We obtain the shell structure for the system and show that angular momentum can be transferred to the non-insulating regions from Laguerre-Gaussian beams, which combined with Bragg spectroscopy can reveal all superfluid and Fermi liquid shells.

Kaushik Mitra

What is connected

Connect this record

See the researcher in context

Building this map preview

12 published item(s)

LWGNet: Learned Wirtinger Gradients for Fourier Ptychographic Phase Retrieval

Synthesizing Light Field Video from Monocular Video

Deep Atrous Guided Filter for Image Restoration in Under Display Cameras

Monocular Retinal Depth Estimation and Joint Optic Disc and Cup Segmentation using Adversarial Networks

multi-patch aggregation models for resampling detection

Optimal HDR and Depth from Dual Cameras

UDC 2020 Challenge on Image Restoration of Under-Display Camera: Methods and Results

Photorealistic Image Reconstruction from Hybrid Intensity and Event based Sensor

Spatial Phase-Sweep: Increasing temporal resolution of transient imaging using a light source array

A Framework for the Analysis of Computational Imaging Systems with Practical Applications

Free-energy functional of the electronic potential for Schrödinger-Poisson theory

Superfluid and Fermi liquid phases of Bose-Fermi mixtures in optical lattices