Source author record

Mohammad Hossein Rohban

Mohammad Hossein Rohban appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Computer Vision Machine Learning Networking and Internet Architecture

Catalog footprint

What is connected

7works

3topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

OOD Augmentation May Be at Odds with Open-Set Recognition

Despite advances in image classification methods, detecting the samples not belonging to the training classes is still a challenging problem. There has been a burst of interest in this subject recently, which is called Open-Set Recognition (OSR). In OSR, the goal is to achieve both the classification and detecting out-of-distribution (OOD) samples. Several ideas have been proposed to push the empirical result further through complicated techniques. We believe that such complication is indeed not necessary. To this end, we have shown that Maximum Softmax Probability (MSP), as the simplest baseline for OSR, applied on Vision Transformers (ViTs) as the base classifier that is trained with non-OOD augmentations can surprisingly outperform many recent methods. Non-OOD augmentations are the ones that do not alter the data distribution by much. Our results outperform state-of-the-art in CIFAR-10 datasets, and is also better than most of the current methods in SVHN and MNIST. We show that training augmentation has a significant effect on the performance of ViTs in the OSR tasks, and while they should produce significant diversity in the augmented samples, the generated sample OOD-ness must remain limited.

preprint2022arXiv

PIAT: Physics Informed Adversarial Training for Solving Partial Differential Equations

In this paper, we propose the physics informed adversarial training (PIAT) of neural networks for solving nonlinear differential equations (NDE). It is well-known that the standard training of neural networks results in non-smooth functions. Adversarial training (AT) is an established defense mechanism against adversarial attacks, which could also help in making the solution smooth. AT include augmenting the training mini-batch with a perturbation that makes the network output mismatch the desired output adversarially. Unlike formal AT, which relies only on the training data, here we encode the governing physical laws in the form of nonlinear differential equations using automatic differentiation in the adversarial network architecture. We compare PIAT with PINN to indicate the effectiveness of our method in solving NDEs for up to 10 dimensions. Moreover, we propose weight decay and Gaussian smoothing to demonstrate the PIAT advantages. The code repository is available at https://github.com/rohban-lab/PIAT.

preprint2022arXiv

Puzzle-AE: Novelty Detection in Images through Solving Puzzles

Autoencoder, as an essential part of many anomaly detection methods, is lacking flexibility on normal data in complex datasets. U-Net is proved to be effective for this purpose but overfits on the training data if trained by just using reconstruction error similar to other AE-based frameworks. Puzzle-solving, as a pretext task of self-supervised learning (SSL) methods, has earlier proved its ability in learning semantically meaningful features. We show that training U-Nets based on this task is an effective remedy that prevents overfitting and facilitates learning beyond pixel-level features. Shortcut solutions, however, are a big challenge in SSL tasks, including jigsaw puzzles. We propose adversarial robust training as an effective automatic shortcut removal. We achieve competitive or superior results compared to the State of the Art (SOTA) anomaly detection methods on various toy and real-world datasets. Unlike many competitors, the proposed framework is stable, fast, data-efficient, and does not require unprincipled early stopping.

preprint2022arXiv

SwinCheX: Multi-label classification on chest X-ray images with transformers

According to the considerable growth in the avail of chest X-ray images in diagnosing various diseases, as well as gathering extensive datasets, having an automated diagnosis procedure using deep neural networks has occupied the minds of experts. Most of the available methods in computer vision use a CNN backbone to acquire high accuracy on the classification problems. Nevertheless, recent researches show that transformers, established as the de facto method in NLP, can also outperform many CNN-based models in vision. This paper proposes a multi-label classification deep model based on the Swin Transformer as the backbone to achieve state-of-the-art diagnosis classification. It leverages Multi-Layer Perceptron, also known as MLP, for the head architecture. We evaluate our model on one of the most widely-used and largest x-ray datasets called "Chest X-ray14," which comprises more than 100,000 frontal/back-view images from over 30,000 patients with 14 famous chest diseases. Our model has been tested with several number of MLP layers for the head setting, each achieves a competitive AUC score on all classes. Comprehensive experiments on Chest X-ray14 have shown that a 3-layer head attains state-of-the-art performance with an average AUC score of 0.810, compared to the former SOTA average AUC of 0.799. We propose an experimental setup for the fair benchmarking of existing methods, which could be used as a basis for the future studies. Finally, we followed up our results by confirming that the proposed method attends to the pathologically relevant areas of the chest.

preprint2020arXiv

Towards Deep Learning Models Resistant to Large Perturbations

Adversarial robustness has proven to be a required property of machine learning algorithms. A key and often overlooked aspect of this problem is to try to make the adversarial noise magnitude as large as possible to enhance the benefits of the model robustness. We show that the well-established algorithm called "adversarial training" fails to train a deep neural network given a large, but reasonable, perturbation magnitude. In this paper, we propose a simple yet effective initialization of the network weights that makes learning on higher levels of noise possible. We next evaluate this idea rigorously on MNIST ($ε$ up to $\approx 0.40$) and CIFAR10 ($ε$ up to $\approx 32/255$) datasets assuming the $\ell_{\infty}$ attack model. Additionally, in order to establish the limits of $ε$ in which the learning is feasible, we study the optimal robust classifier assuming full access to the joint data and label distribution. Then, we provide some theoretical results on the adversarial accuracy for a simple multi-dimensional Bernoulli distribution, which yields some insights on the range of feasible perturbations for the MNIST dataset.

preprint2013arXiv

An Impossibility Result for High Dimensional Supervised Learning

We study high-dimensional asymptotic performance limits of binary supervised classification problems where the class conditional densities are Gaussian with unknown means and covariances and the number of signal dimensions scales faster than the number of labeled training samples. We show that the Bayes error, namely the minimum attainable error probability with complete distributional knowledge and equally likely classes, can be arbitrarily close to zero and yet the limiting minimax error probability of every supervised learning algorithm is no better than a random coin toss. In contrast to related studies where the classification difficulty (Bayes error) is made to vanish, we hold it constant when taking high-dimensional limits. In contrast to VC-dimension based minimax lower bounds that consider the worst case error probability over all distributions that have a fixed Bayes error, our worst case is over the family of Gaussian distributions with constant Bayes error. We also show that a nontrivial asymptotic minimax error probability can only be attained for parametric subsets of zero measure (in a suitable measure space). These results expose the fundamental importance of prior knowledge and suggest that unless we impose strong structural constraints, such as sparsity, on the parametric space, supervised learning may be ineffective in high dimensional small sample settings.

preprint2013arXiv

Incorporating Betweenness Centrality in Compressive Sensing for Congestion Detection

This paper presents a new Compressive Sensing (CS) scheme for detecting network congested links. We focus on decreasing the required number of measurements to detect all congested links in the context of network tomography. We have expanded the LASSO objective function by adding a new term corresponding to the prior knowledge based on the relationship between the congested links and the corresponding link Betweenness Centrality (BC). The accuracy of the proposed model is verified by simulations on two real datasets. The results demonstrate that our model outperformed the state-of-the-art CS based method with significant improvements in terms of F-Score.

Mohammad Hossein Rohban

What is connected

Connect this record

See the researcher in context

Building this map preview

7 published item(s)

OOD Augmentation May Be at Odds with Open-Set Recognition

PIAT: Physics Informed Adversarial Training for Solving Partial Differential Equations

Puzzle-AE: Novelty Detection in Images through Solving Puzzles

SwinCheX: Multi-label classification on chest X-ray images with transformers

Towards Deep Learning Models Resistant to Large Perturbations

An Impossibility Result for High Dimensional Supervised Learning

Incorporating Betweenness Centrality in Compressive Sensing for Congestion Detection