Researcher profile

Hojjat Salehinejad

Hojjat Salehinejad contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
11works
0followers
7topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

11 published item(s)

preprint2022arXiv

EDropout: Energy-Based Dropout and Pruning of Deep Neural Networks

Dropout is a well-known regularization method by sampling a sub-network from a larger deep neural network and training different sub-networks on different subsets of the data. Inspired by the dropout concept, we propose EDropout as an energy-based framework for pruning neural networks in classification tasks. In this approach, a set of binary pruning state vectors (population) represents a set of corresponding sub-networks from an arbitrary provided original neural network. An energy loss function assigns a scalar energy loss value to each pruning state. The energy-based model stochastically evolves the population to find states with lower energy loss. The best pruning state is then selected and applied to the original network. Similar to dropout, the kept weights are updated using backpropagation in a probabilistic model. The energy-based model again searches for better pruning states and the cycle continuous. Indeed, this procedure is in fact switching between the energy model, which manages the pruning states, and the probabilistic model, which updates the temporarily unpruned weights, in each iteration. The population can dynamically converge to a pruning state. This can be interpreted as dropout leading to pruning the network. From an implementation perspective, EDropout can prune typical neural networks without modification of the network architecture. We evaluated the proposed method on different flavours of ResNets, AlexNet, and SqueezeNet on the Kuzushiji, Fashion, CIFAR-10, CIFAR-100, and Flowers datasets, and compared the pruning rate and classification performance of the models. On average the networks trained with EDropout achieved a pruning rate of more than $50\%$ of the trainable parameters with approximately $<5\%$ and $<1\%$ drop of Top-1 and Top-5 classification accuracy, respectively.

preprint2021arXiv

A Framework For Pruning Deep Neural Networks Using Energy-Based Models

A typical deep neural network (DNN) has a large number of trainable parameters. Choosing a network with proper capacity is challenging and generally a larger network with excessive capacity is trained. Pruning is an established approach to reducing the number of parameters in a DNN. In this paper, we propose a framework for pruning DNNs based on a population-based global optimization method. This framework can use any pruning objective function. As a case study, we propose a simple but efficient objective function based on the concept of energy-based models. Our experiments on ResNets, AlexNet, and SqueezeNet for the CIFAR-10 and CIFAR-100 datasets show a pruning rate of more than $50\%$ of the trainable parameters with approximately $<5\%$ and $<1\%$ drop of Top-1 and Top-5 classification accuracy, respectively.

preprint2021arXiv

Deep Sequential Learning for Cervical Spine Fracture Detection on Computed Tomography Imaging

Fractures of the cervical spine are a medical emergency and may lead to permanent paralysis and even death. Accurate diagnosis in patients with suspected fractures by computed tomography (CT) is critical to patient management. In this paper, we propose a deep convolutional neural network (DCNN) with a bidirectional long-short term memory (BLSTM) layer for the automated detection of cervical spine fractures in CT axial images. We used an annotated dataset of 3,666 CT scans (729 positive and 2,937 negative cases) to train and validate the model. The validation results show a classification accuracy of 70.92% and 79.18% on the balanced (104 positive and 104 negative cases) and imbalanced (104 positive and 419 negative cases) test datasets, respectively.

preprint2021arXiv

Pruning of Convolutional Neural Networks Using Ising Energy Model

Pruning is one of the major methods to compress deep neural networks. In this paper, we propose an Ising energy model within an optimization framework for pruning convolutional kernels and hidden units. This model is designed to reduce redundancy between weight kernels and detect inactive kernels/hidden units. Our experiments using ResNets, AlexNet, and SqueezeNet on CIFAR-10 and CIFAR-100 datasets show that the proposed method on average can achieve a pruning rate of more than $50\%$ of the trainable parameters with approximately $<10\%$ and $<5\%$ drop of Top-1 and Top-5 classification accuracy, respectively.

preprint2019arXiv

Survey of Dropout Methods for Deep Neural Networks

Dropout methods are a family of stochastic techniques used in neural network training or inference that have generated significant research interest and are widely used in practice. They have been successfully applied in neural network regularization, model compression, and in measuring the uncertainty of neural network outputs. While original formulated for dense neural network layers, recent advances have made dropout methods also applicable to convolutional and recurrent neural network layers. This paper summarizes the history of dropout methods, their various applications, and current areas of research interest. Important proposed methods are described in additional detail.

preprint2016arXiv

Deep Recurrent Neural Networks for Sequential Phenotype Prediction in Genomics

In analyzing of modern biological data, we are often dealing with ill-posed problems and missing data, mostly due to high dimensionality and multicollinearity of the dataset. In this paper, we have proposed a system based on matrix factorization (MF) and deep recurrent neural networks (DRNNs) for genotype imputation and phenotype sequences prediction. In order to model the long-term dependencies of phenotype data, the new Recurrent Linear Units (ReLU) learning strategy is utilized for the first time. The proposed model is implemented for parallel processing on central processing units (CPUs) and graphic processing units (GPUs). Performance of the proposed model is compared with other training algorithms for learning long-term dependencies as well as the sparse partial least square (SPLS) method on a set of genotype and phenotype data with 604 samples, 1980 single-nucleotide polymorphisms (SNPs), and two traits. The results demonstrate performance of the ReLU training algorithm in learning long-term dependencies in RNNs.

preprint2016arXiv

Diversity Enhancement for Micro-Differential Evolution

The differential evolution (DE) algorithm suffers from high computational time due to slow nature of evaluation. In contrast, micro-DE (MDE) algorithms employ a very small population size, which can converge faster to a reasonable solution. However, these algorithms are vulnerable to a premature convergence as well as to high risk of stagnation. In this paper, MDE algorithm with vectorized random mutation factor (MDEVM) is proposed, which utilizes the small size population benefit while empowers the exploration ability of mutation factor through randomizing it in the decision variable level. The idea is supported by analyzing mutation factor using Monte-Carlo based simulations. To facilitate the usage of MDE algorithms with very-small population sizes, new mutation schemes for population sizes less than four are also proposed. Furthermore, comprehensive comparative simulations and analysis on performance of the MDE algorithms over various mutation schemes, population sizes, problem types (i.e. uni-modal, multi-modal, and composite), problem dimensionalities, and mutation factor ranges are conducted by considering population diversity analysis for stagnation and trapping in local optimum situations. The studies are conducted on 28 benchmark functions provided for the IEEE CEC-2013 competition. Experimental results demonstrate high performance and convergence speed of the proposed MDEVM algorithm.

preprint2016arXiv

Learning Over Long Time Lags

The advantage of recurrent neural networks (RNNs) in learning dependencies between time-series data has distinguished RNNs from other deep learning models. Recently, many advances are proposed in this emerging field. However, there is a lack of comprehensive review on memory models in RNNs in the literature. This paper provides a fundamental review on RNNs and long short term memory (LSTM) model. Then, provides a surveys of recent advances in different memory enhancements and learning techniques for capturing long term dependencies in RNNs.

preprint2015arXiv

Combined A*-Ants Algorithm: A New Multi-Parameter Vehicle Navigation Scheme

In this paper a multi-parameter A*(A- star)-ants based algorithm is proposed in order to find the best optimized multi-parameter path between two desired points in regions. This algorithm recognizes paths, according to user desired parameters using electronic maps. The proposed algorithm is a combination of A* and ants algorithm in which the proposed A* algorithm is the prologue to the suggested ant based algorithm .In fact, this A* algorithm invigorates some paths pheromones in ants algorithm. As one of implementations of this method, this algorithm was applied on a part of Kerman city, Iran as a multi-parameter vehicle navigator. It finds the best optimized multi-parameter direction between two desired junctions based on city traveler parameters. Comparison results between the proposed method and ants algorithm demonstrates efficiency and lower cost function results of the proposed method versus ants algorithm.

preprint2015arXiv

Internet of Things for Residential Areas: Toward Personalized Energy Management Using Big Data

Intelligent management of machines, particularly in a residence area, has been of interest for many years. However, such system design has always been limited to simple control of machines from a local area or remotely from the Internet. In this report, for the first time, an intelligent system is proposed, where not only provides intelligent control ability of machines to user, but also utilizes big data and optimization techniques to provide promotional offers to the user to optimize energy consumption of machines. Since a high traffic communication is involved among the machines and the optimization-big data core of system, the communication core of the proposed system is designed based on cloud, where many challenging issues such as spectrum assignment and resource management are involved. To deal with that, the communication network in the home area network (HAN) is designed based on the cognitive radio system, where a new spectrum assignment method based on the ant colony optimization (ACO) algorithm is proposed to perform spectrum assignment to the machines in the HAN. Performance evaluation of the proposed spectrum assignment method shows its performance in fair spectrum assignment among machines.

preprint2015arXiv

Toward Smart Power Grids: Communication Network Design for Power Grids Synchronization

In smart power grids, keeping the synchronicity of generators and the corresponding controls is of great importance. To do so, a simple model is employed in terms of swing equation to represent the interactions among dynamics of generators and feedback control. In case of having a communication network available, the control can be done based on the transmitted measurements by the communication network. The stability of system is denoted by the largest eigenvalue of the weighted sum of the Laplacian matrices of the communication infrastructure and power network. In this work, we use graph theory to model the communication network as a graph problem. Then, Ant Colony System (ACS) is employed for optimum design of above graph for synchronization of power grids. Performance evaluation of the proposed method for the 39-bus New England power system versus methods such as exhaustive search and Rayleigh quotient approximation indicates feasibility and effectiveness of our method for even large scale smart power grids.