Source author record

Deepak Mishra

Deepak Mishra appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher: unclaimed source record

Computer Vision; eess.SP; Information Theory; math.IT; Cryptography and Security; eess.IV; Machine Learning; Distributed, Parallel, and Cluster Computing; Graphics; Human-Computer Interaction

Catalog footprint

What is connected

12 works

10 topics

4 close collaborators


preprint, 2026, arXiv

OphEdit: Training-Free Text-Guided Editing of Ophthalmic Surgical Videos

Although high-fidelity surgical video generation can greatly improve medical training and the development of AI, adapting these generative models for precise video editing remains a formidable challenge. Modifying surgical attributes, such as instrument-tissue interactions or procedural phases, is difficult because of the strict anatomical and temporal constraints. In this paper, we propose OphEdit, a novel training-free framework for the text-guided editing of ophthalmic surgical videos. Our approach leverages a deterministic second-order ODE inversion pipeline to capture Attention Value (V) tensors from the original video. By selectively injecting these stored tensors into the conditional Classifier-Free Guidance (CFG) branch during the denoising phase, OphEdit rigorously preserves the intricate anatomical geometry of the eye while mapping text-driven semantic modifications onto the video stream. Clinical evaluations demonstrate that OphEdit effectively handles complex surgical transformations, such as instrument swaps and procedural variations, with superior structural fidelity and temporal consistency compared to natural-domain video editors. Our work represents the first application of training-free video editing in the ophthalmic surgical domain, offering a scalable solution for generating diverse, annotated medical datasets without the need for exhaustive manual recording or costly model fine-tuning. The code and prompts can be accessed at https://github.com/ophedit/OphEdit

preprint, 2024, arXiv

Jointly Optimal RIS Placement and Power Allocation for Underlay D2D Communications: An Outage Probability Minimization Approach

In this paper, we study underlay device-to-device (D2D) communication systems empowered by a reconfigurable intelligent surface (RIS) for cognitive cellular networks. Considering Rayleigh fading channels and the general case where both the direct and the RIS-enabled D2D channels exist, we derive the outage probability (OP) of the D2D communication link in closed form. Next, for the considered RIS-empowered underlaid D2D system, we frame an OP minimization problem. We target the joint optimization of the transmit power at the D2D source and the RIS placement, under constraints on the transmit power at the D2D source and on the limited interference imposed on the cellular user, for two RIS deployment topologies. Because the optimization variables are coupled, the formulated problem is intractable in its original form. We propose an equivalent transformation that we are able to solve analytically. In the transformed problem, an expression for the average value of the signal-to-interference-noise ratio (SINR) at the D2D receiver is derived in closed form. Our theoretical derivations are corroborated through simulation results, and various system design insights are deduced. The results indicate that the proposed RIS-empowered underlaid D2D system design outperforms the benchmark semi-adaptive optimal power and optimal distance schemes, offering 44% and 20% performance improvement, respectively.
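As a point of reference for the outage-probability metric in this abstract, the following minimal sketch covers only the textbook case of a single direct link under Rayleigh fading, where the instantaneous SNR is exponentially distributed and the outage probability is 1 - exp(-threshold / mean SNR). It is not the paper's closed-form expression, which also accounts for the RIS-enabled channel and the interference constraint. The function names and the operating point are illustrative assumptions.

```python
import math
import random


def outage_closed_form(mean_snr: float, threshold: float) -> float:
    """Outage probability P(SNR < threshold) for one Rayleigh-faded link.

    Under Rayleigh fading the instantaneous SNR is exponentially
    distributed with mean `mean_snr` (linear scale, not dB).
    """
    return 1.0 - math.exp(-threshold / mean_snr)


def outage_monte_carlo(mean_snr: float, threshold: float,
                       trials: int = 200_000, seed: int = 0) -> float:
    """Estimate the same probability by sampling exponential SNR values."""
    rng = random.Random(seed)
    outages = sum(
        1 for _ in range(trials)
        # expovariate takes the rate, i.e. 1 / mean
        if rng.expovariate(1.0 / mean_snr) < threshold
    )
    return outages / trials


if __name__ == "__main__":
    # Hypothetical operating point: mean SNR 10 (linear), threshold 2.
    print(outage_closed_form(10.0, 2.0))
    print(outage_monte_carlo(10.0, 2.0))
```

The Monte Carlo estimate should agree with the analytical value up to sampling noise, which mirrors the way the abstract describes corroborating derivations through simulation.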

preprint, 2022, arXiv

GITz: Graphene-assisted IRS Design for THz Communication

Graphene-based intelligent reflecting surface (GIRS) has been shown to provide a promising propagation environment that enhances the quality of high-frequency terahertz (THz) wireless communication. In this paper, we characterize GIRS for THz communication (GITz) using material-specific parameters of graphene to tune the reflection of the incident wave at the IRS. In particular, we propose a GITz design model that considers the incident signal frequency and material-level parameters such as conductivity, Fermi level, and patch width to control the reflection amplitude (RA) at the communication receiver. We obtain a closed-form expression of the RA for accurate design and characterization of GIRS, which existing research leaves incomplete by considering only the phase shift. The numerical simulation results demonstrate the effectiveness of the proposed characterization by providing key insights.

preprint, 2022, arXiv

Privacy Preserving Release of Mobile Sensor Data

Sensors embedded in mobile smart devices can monitor users' activity with high accuracy to provide a variety of services to end-users, ranging from precise geolocation and health monitoring to handwritten word recognition. However, this carries the risk that apps access and potentially disclose individuals' sensitive information, which may lead to privacy breaches. In this paper, we aim to minimize privacy leakages that may lead to user identification on mobile devices through user tracking and distinguishability, while preserving the functionality of apps and services. We propose a privacy-preserving mechanism that effectively handles sensor data fluctuations (e.g., inconsistent sensor readings while walking, sitting, and running at different times) by formulating the problem as time-series modeling and forecasting. The proposed mechanism also uses the notion of a correlated noise series to counter noise-filtering attacks, in which an adversary aims to filter out the noise from the perturbed data to re-identify the original data. Unlike existing solutions, our mechanism runs in isolation, without interaction with a user or a service provider. We perform rigorous experiments on benchmark datasets and show that our proposed mechanism limits user tracking and distinguishability threats to a significant extent compared to the original data while maintaining a reasonable level of utility of functionalities. In general, we show that our obfuscation mechanism reduces the user trackability threat by 60% across all the datasets while keeping the utility loss below 0.5 Mean Absolute Error (MAE). We also observe that our mechanism is more effective on large datasets. For example, with the Swipes dataset, the distinguishability risk is reduced by 60% on average while the utility loss is below 0.5 MAE.

preprint, 2021, arXiv

Domain Adaptive Egocentric Person Re-identification

Person re-identification (re-ID) in first-person (egocentric) vision is a fairly new and unexplored problem. With the increase of wearable video recording devices, egocentric data is becoming readily available, and person re-identification has the potential to benefit greatly from this. However, there is a significant lack of large-scale structured egocentric datasets for person re-identification, due to the poor video quality and lack of individuals in most of the recorded content. Although a lot of research has been done on person re-identification based on fixed surveillance cameras, it does not directly benefit egocentric re-ID. Machine learning models trained on the publicly available large-scale re-ID datasets cannot be applied to egocentric re-ID due to the dataset bias problem. The proposed algorithm makes use of neural style transfer (NST) that incorporates a variant of Convolutional Neural Network (CNN) to utilize the benefits of both fixed-camera vision and first-person vision. NST generates images having features from both egocentric datasets and fixed-camera datasets, which are fed through a VGG-16 network trained on a fixed-camera dataset for feature extraction. These extracted features are then used to re-identify individuals. The fixed-camera dataset Market-1501 and the first-person dataset EGO Re-ID are used in this work, and the results are on par with the present re-identification models in the egocentric domain.

preprint, 2020, arXiv

A Robust Pose Transformational GAN for Pose Guided Person Image Synthesis

Generating photorealistic images of human subjects in any unseen pose has crucial applications in generating a complete appearance model of the subject. However, from a computer vision perspective, this task becomes significantly challenging due to the difficulty of modelling the data distribution conditioned on pose. Existing works use a complicated pose transformation model with various additional features, such as foreground segmentation and human body parsing, to achieve robustness, which leads to computational overhead. In this work, we propose a simple yet effective pose transformation GAN that utilizes the Residual Learning method, without any additional feature learning, to generate a given human image in any arbitrary pose. Using effective data augmentation techniques and careful tuning of the model, we achieve robustness in terms of illumination, occlusion, distortion and scale. We present a detailed study, both qualitative and quantitative, to demonstrate the superiority of our model over the existing methods on two large datasets.

preprint, 2020, arXiv

Effect of The Latent Structure on Clustering with GANs

Generative adversarial networks (GANs) have shown remarkable success in generating data from natural data manifolds such as images. In several scenarios, it is desirable that generated data is well-clustered, especially when there is severe class imbalance. In this paper, we focus on the problem of clustering in the generated space of GANs and uncover its relationship with the characteristics of the latent space. We derive from first principles the necessary and sufficient conditions needed to achieve faithful clustering in the GAN framework: (i) presence of a multimodal latent space with adjustable priors, (ii) existence of a latent space inversion mechanism and (iii) imposition of the desired cluster priors on the latent space. We also identify the GAN models in the literature that partially satisfy these conditions and demonstrate the importance of all the required components through ablative studies on multiple real-world image datasets. Additionally, we describe a procedure to construct a multimodal latent space which facilitates learning of cluster priors with sparse supervision.

preprint, 2020, arXiv

QoS-aware Stochastic Spatial PLS Model for Analysing Secrecy Performance under Eavesdropping and Jamming

Securing wireless communication, which is inherently vulnerable to eavesdropping and jamming attacks, becomes more challenging in resource-constrained networks like the Internet-of-Things. To this end, physical layer security (PLS) has gained significant attention due to its low complexity. In this paper, we address the issue of random inter-node distances in secrecy analysis and develop a comprehensive quality-of-service (QoS) aware PLS framework for the analysis of both the eavesdropping and the jamming capabilities of an attacker. The proposed solution covers spatially stochastic deployment of the legitimate nodes and the attacker. We characterise the secrecy outage performance against both attacks using inter-node distance based probabilistic distribution functions. The model takes into account the practical limits arising out of underlying QoS requirements, which include the maximum distance between legitimate users driven by transmit power and receiver sensitivity. The novel concept of an eavesdropping zone is introduced, and the relative impact of jamming power is investigated. Closed-form expressions for the asymptotic secrecy outage probability are derived, offering insights into the design of optimal system parameters for a desired security level against the attacker's capability for both attacks. The analytical framework, validated by numerical results, establishes that the proposed solution offers potentially accurate characterisation of the PLS performance and a key design perspective from the point of view of both the legitimate user and the attacker.

preprint, 2020, arXiv

Spatio-Temporal Coverage Enhancement in Drive-By Sensing Through Utility-Aware Mobile Agent Selection

In recent years, the drive-by sensing paradigm has become increasingly popular for cost-effective monitoring of urban areas. Drive-by sensing is a form of crowdsensing wherein sensor-equipped vehicles (also known as mobile agents) are the primary data gathering agents. Enhancing the efficacy of drive-by sensing poses many challenges, an important one of which is to select the non-dedicated mobile agents on which a limited number of sensors are to be mounted. This problem, which we refer to as the mobile-agent selection problem, has a significant impact on the spatio-temporal coverage of the drive-by sensing platforms and the resultant datasets. The challenge here is to achieve maximum spatio-temporal coverage while taking the relative importance levels of geographical areas into account. In this paper, we address this problem in the context of the SCOUTS project, the goal of which is to map and analyze the urban heat island phenomenon accurately. Our work makes several major technical contributions. First, we delineate a model for representing the mobile-agent selection problem. This model takes into account the trajectories of the vehicles (public transportation buses in our case) and the relative importance of the urban regions, and formulates the selection as an optimization problem. Second, we provide two algorithms that are based upon the utility (coverage) values of mobile agents, namely, a hotspot-based algorithm that limits the search space to important sub-regions and a utility-aware genetic algorithm that enables the latter algorithm to make unbiased selections. Third, we design a highly efficient coverage redundancy minimization algorithm that, at each step, chooses the mobile agent that provides maximal improvement to the spatio-temporal coverage. This paper reports a series of experiments on a real-world dataset from Athens, GA, USA, to demonstrate the effectiveness of the proposed approaches.
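The step-wise selection rule described in this abstract (at each step, pick the agent with the largest improvement in importance-weighted coverage) can be illustrated with a standard greedy weighted-coverage heuristic. This is a minimal sketch under stated assumptions, not the paper's algorithm: the cell representation as (region, time-slot) pairs, the default weight of 1.0, and all names and data are hypothetical.

```python
from typing import Dict, Hashable, List, Optional, Set


def greedy_select(
    coverage: Dict[str, Set[Hashable]],
    weights: Dict[Hashable, float],
    budget: int,
) -> List[str]:
    """Pick up to `budget` agents, each time taking the largest marginal gain.

    coverage: agent id -> set of (region, time-slot) cells it visits.
    weights:  cell -> importance; cells missing from the map count as 1.0.
    """
    covered: Set[Hashable] = set()
    chosen: List[str] = []
    for _ in range(budget):
        best_agent: Optional[str] = None
        best_gain = 0.0
        for agent in sorted(coverage):  # sorted for deterministic tie-breaks
            if agent in chosen:
                continue
            # Only cells not yet covered contribute, which avoids redundancy.
            gain = sum(weights.get(c, 1.0) for c in coverage[agent] - covered)
            if gain > best_gain:
                best_agent, best_gain = agent, gain
        if best_agent is None:  # no remaining agent adds new coverage
            break
        chosen.append(best_agent)
        covered |= coverage[best_agent]
    return chosen


if __name__ == "__main__":
    # Hypothetical bus routes over (region, hour) cells.
    routes = {
        "bus_a": {("downtown", 8), ("downtown", 9), ("campus", 9)},
        "bus_b": {("downtown", 8), ("downtown", 9)},
        "bus_c": {("suburb", 8), ("suburb", 9)},
    }
    importance = {("downtown", 8): 3.0, ("downtown", 9): 3.0}
    print(greedy_select(routes, importance, budget=2))
```

In this toy input, the second pick skips the agent whose cells are already covered, even though its raw coverage value is higher, which is the redundancy-avoiding behaviour the greedy rule is meant to show.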

preprint, 2020, arXiv

Target-Independent Domain Adaptation for WBC Classification using Generative Latent Search

Automating the classification of camera-obtained microscopic images of White Blood Cells (WBCs) and related cell subtypes has assumed importance since it aids the laborious manual process of review and diagnosis. Several State-Of-The-Art (SOTA) methods developed using Deep Convolutional Neural Networks suffer from the problem of domain shift: severe performance degradation when they are tested on data (target) obtained in a setting different from that of the training data (source). The change in the target data might be caused by factors such as differences in camera or microscope types, lenses, and lighting conditions. This problem can potentially be solved using Unsupervised Domain Adaptation (UDA) techniques, although standard algorithms presuppose the existence of a sufficient amount of unlabelled target data, which is not always the case with medical images. In this paper, we propose a method for UDA that does not need target data. Given a test image from the target data, we obtain its 'closest-clone' from the source data, which is used as a proxy in the classifier. We prove the existence of such a clone given that an infinite number of data points can be sampled from the source distribution. We propose a method in which a latent-variable generative model based on variational inference is used to simultaneously sample and find the 'closest-clone' from the source distribution through an optimization procedure in the latent space. We demonstrate the efficacy of the proposed method over several SOTA UDA methods for WBC classification on datasets captured using different imaging modalities under multiple settings.
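The idea of searching a generator's latent space for the output closest to a given test sample can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not the paper's method: a fixed linear map stands in for the trained variational generative model, the distance is plain squared error, and the search is ordinary gradient descent with finite-difference gradients. All names and values are hypothetical.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def linear_decoder(z: Sequence[float]) -> Vector:
    """Hypothetical stand-in for a trained generator: 2-D latent to 3-D output."""
    rows = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    return [sum(w * zi for w, zi in zip(row, z)) for row in rows]


def squared_error(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of squared differences between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def latent_search(
    decoder: Callable[[Sequence[float]], Vector],
    target: Sequence[float],
    z0: Sequence[float],
    steps: int = 500,
    lr: float = 0.05,
    eps: float = 1e-5,
) -> Vector:
    """Gradient descent on z to minimise ||decoder(z) - target||^2.

    Gradients come from central finite differences, so the sketch works
    with any decoder callable without an autodiff library.
    """
    z = list(z0)
    for _ in range(steps):
        grad = []
        for i in range(len(z)):
            z_plus = list(z)
            z_minus = list(z)
            z_plus[i] += eps
            z_minus[i] -= eps
            loss_plus = squared_error(decoder(z_plus), target)
            loss_minus = squared_error(decoder(z_minus), target)
            grad.append((loss_plus - loss_minus) / (2.0 * eps))
        z = [zi - lr * gi for zi, gi in zip(z, grad)]
    return z


if __name__ == "__main__":
    # Target produced by a known latent code, so the search has an exact answer.
    target = linear_decoder([0.5, -1.0])
    z_star = latent_search(linear_decoder, target, z0=[0.0, 0.0])
    print(z_star, linear_decoder(z_star))
```

In the setting the abstract describes, the decoded output at the optimised latent code would play the role of the 'closest-clone' passed to the source-trained classifier; here it simply reconstructs the toy target.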

preprint, 2020, arXiv

Wireless Powered Protocol Exploiting Energy Harvesting During Cognitive Communications

In this letter, a novel wireless powered protocol is proposed to maximize the system throughput of an energy harvesting (EH) based cognitive radio network, while satisfying a minimum primary user rate requirement. For EH, we exploit both dedicated wireless power transfer from the primary base station and ambient signals available due to wireless information transfer among primary and secondary users. Specifically, we prove convexity of the optimization problem and obtain a semi-closed-form expression for the globally optimal solution. Numerical results validate the analysis and show an average performance improvement of 70% over the benchmark scheme for various system parameters.

preprint, 2014, arXiv

Comparative Study of Geometric and Image Based Modelling and Rendering Techniques

This is a comparative study of the traditional 3D computer graphics technique of geometric modelling and image-based rendering techniques, both of which were surveyed and implemented. We discuss the classifications and representative methods of both techniques. The study shows that there is a strong continuum between the two techniques and that a hybrid of the two is most suitable for further implementations. This hybridisation study is underway to create models of real-life situations and provide disaster management training.
