Source author record

Mohammad Rostami

Mohammad Rostami appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Topics: Computer Vision; Machine Learning; Artificial Intelligence; Computation and Language; Distributed, Parallel, and Cluster Computing; eess.IV; math.OC; Mathematical Software; Numerical Analysis

Catalog footprint

8 works, 9 topics, 4 close collaborators

preprint, 2026, arXiv

SHRED: Retain-Set-Free Unlearning via Self-Distillation with Logit Demotion

Machine unlearning for large language models (LLMs) aims to selectively remove memorized content such as private data, copyrighted text, or hazardous knowledge, without costly full retraining. Most existing methods require a retain set of curated examples to prevent catastrophic degradation of general model utility, creating an extra data dependency that complicates deployment. We propose SHRED (Self-distillation via High-surprisal-only Retain-set-free Entropy Demotion), a retain-set-free unlearning method built on a key insight: not all tokens within a forget set instance carry memorized information equally. High-information tokens concentrate the model's memorized knowledge, while low-information tokens reflect general language competence. SHRED operates in two stages. (1) Selection: We perform a forward pass on a forget set instance, collect per-token autoregressive probabilities, and select the bottom (lowest probability, highest Shannon information) as forget positions; the remaining positions are retained as benign anchors. (2) Training: We construct modified KL targets that demote the memorized token's logit at forget positions while preserving the original distribution at benign positions. The model is then trained via a single top KL self-distillation objective that simultaneously drives forgetting and utility preservation. We evaluate SHRED across four standard unlearning benchmarks and demonstrate that it establishes a new Pareto-optimal trade-off between forget efficacy and model utility, outperforming retain-set-dependent methods. Our analysis shows that SHRED is robust against relearning attacks and membership-inference attacks, and it maintains stable utility even after many sequential unlearning runs.
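The two stages described in this abstract can be sketched roughly as follows. This is an illustrative NumPy sketch, not the authors' implementation: the selection fraction (`fraction`), the constant logit penalty (`penalty`), the KL direction, and all function names are assumptions, since the abstract does not specify them.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def select_forget_positions(token_probs, fraction):
    """Stage 1 (assumed form): pick the lowest-probability positions."""
    k = max(1, int(round(fraction * len(token_probs))))
    return np.sort(np.argsort(token_probs, kind="stable")[:k])

def demoted_targets(logits, token_ids, forget_positions, penalty):
    """Stage 2 (assumed form): lower the observed token's logit at forget
    positions; leave the distribution untouched at benign positions."""
    modified = logits.copy()
    modified[forget_positions, token_ids[forget_positions]] -= penalty
    return softmax(modified)

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) per position."""
    return np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)

# Toy stand-in for a forward pass: 6 positions, vocabulary of 5 tokens.
rng = np.random.default_rng(0)
logits = rng.standard_normal((6, 5))
token_ids = rng.integers(0, 5, size=6)

token_probs = softmax(logits)[np.arange(6), token_ids]
forget = select_forget_positions(token_probs, fraction=1 / 3)
targets = demoted_targets(logits, token_ids, forget, penalty=5.0)

# Self-distillation loss of the (untrained) student against the targets.
loss = kl_divergence(targets, softmax(logits)).mean()
```

Before any training step the loss is nonzero only at the forget positions, because the targets at benign positions equal the model's own distribution.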

preprint, 2024, arXiv

Online Continual Domain Adaptation for Semantic Image Segmentation Using Internal Representations

Semantic segmentation models trained on annotated data fail to generalize well when the input data distribution changes over an extended time period, so they require retraining to maintain performance. Classic unsupervised domain adaptation (UDA) addresses a similar problem, in which the target domain has no annotated data points, by transferring knowledge from a source domain with annotated data. We develop an online UDA algorithm for semantic segmentation of images that improves model generalization on unannotated domains in scenarios where source data access is restricted during adaptation. We perform model adaptation by minimizing the distributional distance between the source latent features and the target features in a shared embedding space. Our solution promotes a shared domain-agnostic latent feature space between the two domains, which allows for classifier generalization on the target dataset. To alleviate the need for access to source samples during adaptation, we approximate the source latent feature distribution via an appropriate surrogate distribution, in this case a Gaussian mixture model (GMM). We evaluate our approach on well-established semantic segmentation datasets and demonstrate that it compares favorably against state-of-the-art (SOTA) UDA semantic segmentation methods.
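The idea of aligning target features with a GMM surrogate of the source features can be illustrated with a small NumPy sketch. The abstract does not name the distance, so the sliced Wasserstein distance used here is an assumption, as are the hard-coded GMM parameters and the function names; in practice the GMM would be fitted to source features before adaptation.

```python
import numpy as np

def sample_gmm(weights, means, stds, n, rng):
    """Draw n samples from a GMM with diagonal covariances."""
    comps = rng.choice(len(weights), size=n, p=weights)
    noise = rng.standard_normal((n, means.shape[1]))
    return means[comps] + stds[comps] * noise

def sliced_wasserstein(x, y, n_projections, rng):
    """Monte Carlo estimate of the squared sliced 2-Wasserstein distance
    between two equal-sized sample sets."""
    dirs = rng.standard_normal((n_projections, x.shape[1]))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    px = np.sort(x @ dirs.T, axis=0)  # sorted 1-D projections of x
    py = np.sort(y @ dirs.T, axis=0)  # sorted 1-D projections of y
    return float(np.mean((px - py) ** 2))

rng = np.random.default_rng(0)
weights = np.array([0.5, 0.5])
means = np.array([[-2.0, 0.0], [2.0, 0.0]])
stds = np.full((2, 2), 0.5)

# Samples from the surrogate stand in for the unavailable source features.
surrogate = sample_gmm(weights, means, stds, 500, rng)

# Hypothetical target features: one set shifted, one already aligned.
target_shifted = sample_gmm(weights, means, stds, 500, rng) + np.array([3.0, 1.0])
target_aligned = sample_gmm(weights, means, stds, 500, rng)

d_shifted = sliced_wasserstein(surrogate, target_shifted, 64, rng)
d_aligned = sliced_wasserstein(surrogate, target_aligned, 64, rng)
```

An adaptation loop would use a distance of this kind as a loss on the encoder, so that `d_shifted` moves toward `d_aligned` without touching source samples.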

preprint, 2022, arXiv

Automating Detection of Papilledema in Pediatric Fundus Images with Explainable Machine Learning

Papilledema is an ophthalmic neurologic disorder in which increased intracranial pressure leads to swelling of the optic nerves. Undiagnosed papilledema in children may lead to blindness and may be a sign of life-threatening conditions, such as brain tumors. Robust and accurate clinical diagnosis of this syndrome can be facilitated by automated analysis of fundus images using deep learning, especially in the presence of challenges posed by pseudopapilledema, which has a similar fundus appearance but distinct clinical implications. We present a deep learning-based algorithm for the automatic detection of pediatric papilledema. Our approach is based on optic disc localization and detection of explainable papilledema indicators through data augmentation. Experiments on real-world clinical data demonstrate that our proposed method is effective, with a diagnostic accuracy comparable to that of expert ophthalmologists.

preprint, 2022, arXiv

Learn Continually, Generalize Rapidly: Lifelong Knowledge Accumulation for Few-shot Learning

The ability to continuously expand knowledge over time and utilize it to rapidly generalize to new tasks is a key feature of human linguistic intelligence. Existing models that pursue rapid generalization to new tasks (e.g., few-shot learning methods), however, are mostly trained in a single shot on fixed datasets and are unable to dynamically expand their knowledge, while continual learning algorithms are not specifically designed for rapid generalization. We present a new learning setup, Continual Learning of Few-Shot Learners (CLIF), to address the challenges of both learning settings in a unified setup. CLIF assumes a model learns from a sequence of diverse NLP tasks arriving sequentially, accumulating knowledge for improved generalization to new tasks, while also retaining performance on the tasks learned earlier. We examine how the generalization ability is affected in the continual learning setup, evaluate a number of continual learning algorithms, and propose a novel regularized adapter generation approach. We find that catastrophic forgetting affects generalization ability to a lesser degree than performance on seen tasks, while continual learning algorithms can still bring considerable benefit to the generalization ability.

preprint, 2021, arXiv

Unsupervised Model Adaptation for Continual Semantic Segmentation

We develop an algorithm for adapting a semantic segmentation model that is trained using a labeled source domain to generalize well in an unlabeled target domain. A similar problem has been studied extensively in the unsupervised domain adaptation (UDA) literature, but existing UDA algorithms require access to both the source domain labeled data and the target domain unlabeled data for training a domain-agnostic semantic segmentation model. Relaxing this constraint enables a user to adapt pretrained models to generalize in a target domain without requiring access to source data. To this end, we learn a prototypical distribution for the source domain in an intermediate embedding space. This distribution encodes the abstract knowledge that is learned from the source domain. We then use this distribution for aligning the target domain distribution with the source domain distribution in the embedding space. We provide theoretical analysis and explain conditions under which our algorithm is effective. Experiments on a benchmark adaptation task demonstrate that our method achieves competitive performance even compared with joint UDA approaches.

preprint, 2016, arXiv

Image Super-Resolution Based on Sparsity Prior via Smoothed $l_0$ Norm

In this paper, we aim to tackle the problem of reconstructing a high-resolution image from a single low-resolution input image, known as single image super-resolution. In the literature, sparse representation has been used to address this problem, where it is assumed that both low-resolution and high-resolution images share the same sparse representation over a pair of coupled, jointly trained dictionaries. This assumption enables us to use compressed sensing theory to find the jointly sparse representation via the low-resolution image and then use it to recover the high-resolution image. However, sparse representation of a signal over a known dictionary is an ill-posed, combinatorial optimization problem. Here we propose an algorithm that adopts the smoothed $l_0$-norm (SL0) approach to find the jointly sparse representation. Improved quality of the reconstructed image is obtained for most images in terms of both peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) measures.
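For readers unfamiliar with the smoothed $l_0$ approach mentioned in this abstract, the following is a minimal NumPy sketch of the generic SL0 solver for an underdetermined system. It is not the paper's super-resolution pipeline: the random dictionary `A`, the parameter defaults, and the toy signal are assumptions for illustration, and the coupled dictionaries are not modeled.

```python
import numpy as np

def sl0(A, x, sigma_min=1e-4, sigma_decay=0.5, mu=2.0, inner_iters=3):
    """Approximate the sparsest s with A @ s = x by smoothing the l0 norm
    with Gaussians of decreasing width sigma."""
    A_pinv = np.linalg.pinv(A)
    s = A_pinv @ x                      # start from the minimum-l2 solution
    sigma = 2.0 * np.max(np.abs(s))
    while sigma > sigma_min:
        for _ in range(inner_iters):
            # Gradient step that shrinks entries small relative to sigma.
            delta = s * np.exp(-s**2 / (2.0 * sigma**2))
            s = s - mu * delta
            # Project back onto the constraint set {s : A @ s = x}.
            s = s - A_pinv @ (A @ s - x)
        sigma *= sigma_decay
    return s

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 50))       # hypothetical dictionary
s_true = np.zeros(50)
s_true[[3, 17, 41]] = [1.5, -2.0, 1.0]  # 3-sparse toy signal
x = A @ s_true

s_hat = sl0(A, x)
support = np.sort(np.argsort(np.abs(s_hat))[-3:])
```

Because every inner iteration ends with a projection, `s_hat` always satisfies the measurement constraint; whether it also matches the sparse signal depends on the sparsity level and the dictionary.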

preprint, 2016, arXiv

Testing fine-grained parallelism for the ADMM on a factor-graph

There is an ongoing effort to develop tools that apply distributed computational resources to tackle large problems or reduce the time to solve them. In this context, the Alternating Direction Method of Multipliers (ADMM) arises as a method that can exploit distributed resources like the dual ascent method and has the robustness and improved convergence of the augmented Lagrangian method. Traditional approaches to accelerate the ADMM using multiple cores are problem-specific and often require multi-core programming. By contrast, we propose a problem-independent scheme of accelerating the ADMM that does not require the user to write any parallel code. We show that this scheme, an interpretation of the ADMM as a message-passing algorithm on a factor-graph, can automatically exploit fine-grained parallelism both in GPUs and shared-memory multi-core computers and achieves significant speedup in such diverse application domains as combinatorial optimization, machine learning, and optimal control. Specifically, we obtain 10-18x speedup using a GPU, and 5-9x using multiple CPU cores, over a serial, optimized C-version of the ADMM, which is similar to the typical speedup reported for existing GPU-accelerated libraries, including cuFFT (19x), cuBLAS (17x), and cuRAND (8x).
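To make the source of parallelism concrete, here is a minimal NumPy sketch of generic consensus ADMM in scaled form, applied to a toy problem (minimizing the sum of 0.5 * (z - a_i)^2, whose solution is the mean of the a_i). It is not the paper's message-passing scheme or its software: the problem, the function name, and the parameters are assumptions. The point is that the per-factor x-updates are mutually independent, which is the kind of structure a parallel implementation could exploit; the vectorized line stands in for parallel workers.

```python
import numpy as np

def consensus_admm(a, rho=1.0, iters=100):
    """Scaled-form consensus ADMM for min_z sum_i 0.5 * (z - a_i)**2."""
    a = np.asarray(a, dtype=float)
    x = np.zeros_like(a)  # one local copy of the variable per factor
    u = np.zeros_like(a)  # one scaled dual variable per factor
    z = 0.0               # shared consensus variable
    for _ in range(iters):
        # Local proximal updates: each entry depends only on its own
        # a_i and u_i, so all of them could run concurrently.
        x = (a + rho * (z - u)) / (1.0 + rho)
        # Consensus step: average the local estimates.
        z = float(np.mean(x + u))
        # Dual update, again independent per factor.
        u = u + x - z
    return z

a = np.array([1.0, 4.0, 7.0, 10.0])
z_star = consensus_admm(a)
```

For this toy problem the consensus variable converges to the mean of `a`.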

preprint, 2011, arXiv

Image Deblurring Using Derivative Compressed Sensing for Optical Imaging Application

Reconstruction of multidimensional signals from the samples of their partial derivatives is known to be a standard problem in inverse theory. Such and similar problems routinely arise in numerous areas of applied sciences, including optical imaging, laser interferometry, computer vision, remote sensing and control. Though ill-posed in nature, the above problem can be solved in a unique and stable manner, given proper regularization and relevant boundary conditions. In this paper, however, a more challenging setup is addressed, in which one has to recover an image of interest from its noisy and blurry version, while the only information available about the imaging system at hand is the amplitude of the generalized pupil function (GPF) along with partial observations of the gradient of the GPF's phase. In this case, the phase-related information is collected using a simplified version of the Shack-Hartmann interferometer, followed by recovering the entire phase by means of derivative compressed sensing. Subsequently, the estimated phase can be combined with the amplitude of the GPF to produce an estimate of the point spread function (PSF), whose knowledge is essential for subsequent image deconvolution. In summary, the principal contribution of this work is twofold. First, we demonstrate how to simplify the construction of the Shack-Hartmann interferometer so as to make it less expensive and hence more accessible. Second, it is shown by means of numerical experiments that the above simplification and its associated solution scheme produce image reconstructions of quality comparable to those obtained using dense sampling of the GPF phase.
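The step of combining the GPF amplitude with an estimated phase to obtain the PSF can be sketched as below, using the standard Fourier-optics relation that the PSF is the squared magnitude of the Fourier transform of the generalized pupil function. The circular aperture, the grid size, and the quadratic phase are hypothetical values chosen for illustration; the phase recovery itself (derivative compressed sensing) is not shown.

```python
import numpy as np

def psf_from_gpf(amplitude, phase):
    """PSF as |FFT(amplitude * exp(i * phase))|^2, normalized to unit sum."""
    gpf = amplitude * np.exp(1j * phase)
    # ifftshift/fftshift keep the optical axis at the array center.
    field = np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(gpf)))
    psf = np.abs(field) ** 2
    return psf / psf.sum()

n = 64
yy, xx = np.mgrid[-n // 2 : n // 2, -n // 2 : n // 2]
radius = np.hypot(xx, yy)
amplitude = (radius <= n // 4).astype(float)   # hypothetical circular pupil

# Aberration-free system: zero phase across the pupil.
flat_psf = psf_from_gpf(amplitude, np.zeros((n, n)))

# Hypothetical estimated phase (radians): a defocus-like quadratic term.
estimated_phase = 2.0 * (radius / (n // 4)) ** 2 * amplitude
aberrated_psf = psf_from_gpf(amplitude, estimated_phase)
```

The resulting `aberrated_psf` is the kind of kernel a deconvolution routine would then take as input; its peak is lower than that of `flat_psf` because the phase aberration spreads energy away from the center.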
