Researcher profile

Shahrokh Valaee

Shahrokh Valaee contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
10 works
0 followers
7 topics
4 close collaborators

Published work

10 published items

preprint | 2026 | arXiv

Extending Kernel Trick to Influence Functions

In this paper, we present a dual representation of the influence functions, whose computational complexity scales with dataset size rather than model size. Both analytically and experimentally, we show that this representation can be an efficient alternative to the original influence functions for estimating changes in parameters, model outputs and loss due to data point removal, when model size is large relative to dataset size, or when evaluating the original influence functions in parameter space is infeasible. The dual representation, however, is limited to linearizable models, which are models whose behavior can be approximated by their linearizations throughout training, and requires materializing a matrix, whose size grows with the product of model output dimension and dataset size.
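The trade-off between model size and dataset size described above has a familiar analogue in ridge regression. The identity below is only that standard analogue, stated under assumed notation (a data matrix $X$ with $n$ rows and $p$ columns, and a regularizer $\lambda > 0$); it is not the derivation from the paper.

```latex
% Assumed notation: X \in \mathbb{R}^{n \times p}, \lambda > 0.
% Push-through identity behind the classical kernel trick:
\[
  \left(X^\top X + \lambda I_p\right)^{-1} X^\top
  \;=\;
  X^\top \left(X X^\top + \lambda I_n\right)^{-1}
\]
% It follows from X^\top (X X^\top + \lambda I_n) = (X^\top X + \lambda I_p) X^\top.
% The left side inverts a p x p matrix (parameter space);
% the right side inverts an n x n matrix (data space).
```

When $p \gg n$, the right-hand form is the cheaper one to evaluate, which mirrors the regime the abstract identifies as favorable for a dual representation.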

preprint | 2022 | arXiv

EDropout: Energy-Based Dropout and Pruning of Deep Neural Networks

Dropout is a well-known regularization method that samples a sub-network from a larger deep neural network and trains different sub-networks on different subsets of the data. Inspired by the dropout concept, we propose EDropout as an energy-based framework for pruning neural networks in classification tasks. In this approach, a set of binary pruning state vectors (population) represents a set of corresponding sub-networks from an arbitrary given original neural network. An energy loss function assigns a scalar energy loss value to each pruning state. The energy-based model stochastically evolves the population to find states with lower energy loss. The best pruning state is then selected and applied to the original network. Similar to dropout, the kept weights are updated using backpropagation in a probabilistic model. The energy-based model again searches for better pruning states and the cycle continues. In effect, this procedure switches between the energy model, which manages the pruning states, and the probabilistic model, which updates the temporarily unpruned weights, in each iteration. The population can dynamically converge to a pruning state. This can be interpreted as dropout leading to pruning the network. From an implementation perspective, EDropout can prune typical neural networks without modification of the network architecture. We evaluated the proposed method on different flavours of ResNets, AlexNet, and SqueezeNet on the Kuzushiji, Fashion, CIFAR-10, CIFAR-100, and Flowers datasets, and compared the pruning rate and classification performance of the models. On average the networks trained with EDropout achieved a pruning rate of more than $50\%$ of the trainable parameters with approximately $<5\%$ and $<1\%$ drop of Top-1 and Top-5 classification accuracy, respectively.

preprint | 2021 | arXiv

A Framework For Pruning Deep Neural Networks Using Energy-Based Models

A typical deep neural network (DNN) has a large number of trainable parameters. Choosing a network with proper capacity is challenging and generally a larger network with excessive capacity is trained. Pruning is an established approach to reducing the number of parameters in a DNN. In this paper, we propose a framework for pruning DNNs based on a population-based global optimization method. This framework can use any pruning objective function. As a case study, we propose a simple but efficient objective function based on the concept of energy-based models. Our experiments on ResNets, AlexNet, and SqueezeNet for the CIFAR-10 and CIFAR-100 datasets show a pruning rate of more than $50\%$ of the trainable parameters with approximately $<5\%$ and $<1\%$ drop of Top-1 and Top-5 classification accuracy, respectively.

preprint | 2021 | arXiv

Pruning of Convolutional Neural Networks Using Ising Energy Model

Pruning is one of the major methods to compress deep neural networks. In this paper, we propose an Ising energy model within an optimization framework for pruning convolutional kernels and hidden units. This model is designed to reduce redundancy between weight kernels and detect inactive kernels/hidden units. Our experiments using ResNets, AlexNet, and SqueezeNet on CIFAR-10 and CIFAR-100 datasets show that the proposed method on average can achieve a pruning rate of more than $50\%$ of the trainable parameters with approximately $<10\%$ and $<5\%$ drop of Top-1 and Top-5 classification accuracy, respectively.

preprint | 2019 | arXiv

Survey of Dropout Methods for Deep Neural Networks

Dropout methods are a family of stochastic techniques used in neural network training or inference that have generated significant research interest and are widely used in practice. They have been successfully applied in neural network regularization, model compression, and in measuring the uncertainty of neural network outputs. While originally formulated for dense neural network layers, recent advances have made dropout methods also applicable to convolutional and recurrent neural network layers. This paper summarizes the history of dropout methods, their various applications, and current areas of research interest. Important proposed methods are described in additional detail.

preprint | 2013 | arXiv

Coding Opportunity Densification Strategies for Instantly Decodable Network Coding

In this paper, we aim to identify the strategies that can maximize and monotonically increase the density of the coding opportunities in instantly decodable network coding (IDNC). Using the well-known graph representation of IDNC, we first derive an expression for the exact evolution of the edge set size after the transmission of any arbitrary coded packet. From the derived expressions, we show that sending commonly wanted packets for all the receivers can maximize the number of coding opportunities. Since guaranteeing such a property in IDNC is usually impossible, this strategy does not guarantee the achievement of our target. Consequently, we further investigate the problem by deriving the expectation of the edge set size evolution after ignoring the identities of the packets requested by the different receivers and considering only their numbers. We then employ this expected expression to show that serving the maximum number of receivers having the largest numbers of missing packets and erasure probabilities tends to both maximize and monotonically increase the expected density of coding opportunities. Simulation results justify our theoretical findings. Finally, we validate the importance of our work through two case studies showing that our identified strategy outperforms the step-by-step service maximization solution in optimizing both the IDNC completion delay and receiver goodput.

preprint | 2013 | arXiv

Joint Indoor Localization and Radio Map Construction with Limited Deployment Load

One major bottleneck in the practical implementation of received signal strength (RSS) based indoor localization systems is the extensive deployment effort required to construct the radio maps through fingerprinting. In this paper, we aim to design an indoor localization scheme that can be directly employed without building a full fingerprinted radio map of the indoor environment. By accumulating the information of localized RSSs, this scheme can also simultaneously construct the radio map with limited calibration. To design this scheme, we employ a source data set that possesses the same spatial correlation of the RSSs in the indoor environment under study. The knowledge of this data set is then transferred to a limited number of calibration fingerprints and one or several RSS observations with unknown locations, in order to perform direct localization of these observations using manifold alignment. We test two different source data sets, namely a simulated radio propagation map and the environment's plan coordinates. For moving users, we exploit the correlation of their observations to improve the localization accuracy. The online testing in two indoor environments shows that the plan coordinates achieve better results than the simulated radio maps, and a negligible degradation with 70-85% reduction in calibration load.

preprint | 2013 | arXiv

Mobility Diversity in Mobile Wireless Networks

We introduce the novel concept of mobility diversity for mobile sensor or communication networks as the diversity introduced by transmitting data over different topologies of the network. We show how node mobility can provide diversity by changing the topology of the network. More specifically, we consider a mobile network of a sensor node and a number of sink nodes which are all moving randomly according to different Wiener process mobility models. Assuming that the network topology evolves with time and assuming that the connectivity of the sensor node to at least one sink node is needed for successful communication, we calculate three performance measures for this network: i) the expected number of time instants where the sensor node is connected to at least one sink node, ii) the probability of outage, being the probability that no sink node is in the vicinity of the sensor node during the observation interval, and finally, iii) the maximum number of consecutive failures in the communication. Our theoretical and numerical analyses show that increasing the mobility parameter of the sensor node increases the average number of successful transmissions, decreases the probability of outage, and reduces the maximum delay in the sensor-sink communication.
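The first two performance measures above can be estimated with a small Monte Carlo experiment. The following is a minimal sketch only, under assumptions that are not taken from the paper: motion in two dimensions, unit time steps, a disc-shaped connectivity region of radius `radius`, and uniformly random starting positions. The function name `simulate_mobility` and all parameter names are hypothetical.

```python
import math
import random


def simulate_mobility(sigma_sensor, sigma_sink, n_sinks=3, steps=50,
                      radius=1.0, area=5.0, trials=500, seed=0):
    """Estimate (mean connected time instants, outage probability).

    Assumption: each node follows a discretized 2-D Wiener process, so every
    unit time step adds an independent Gaussian increment per coordinate.
    """
    rng = random.Random(seed)
    total_connected = 0
    outages = 0
    for _ in range(trials):
        # Random initial placement inside a square of half-width `area`.
        sensor = [rng.uniform(-area, area), rng.uniform(-area, area)]
        sinks = [[rng.uniform(-area, area), rng.uniform(-area, area)]
                 for _ in range(n_sinks)]
        connected = 0
        for _ in range(steps):
            # Gaussian increments approximate Wiener-process motion.
            sensor[0] += rng.gauss(0.0, sigma_sensor)
            sensor[1] += rng.gauss(0.0, sigma_sensor)
            for s in sinks:
                s[0] += rng.gauss(0.0, sigma_sink)
                s[1] += rng.gauss(0.0, sigma_sink)
            # The sensor is connected if any sink lies within `radius`.
            if any(math.hypot(sensor[0] - s[0], sensor[1] - s[1]) <= radius
                   for s in sinks):
                connected += 1
        total_connected += connected
        if connected == 0:
            # Outage: no sink was in range during the whole interval.
            outages += 1
    return total_connected / trials, outages / trials
```

Calling `simulate_mobility(0.5, 0.2)` and `simulate_mobility(2.0, 0.2)` with the same seed gives a rough way to compare two sensor mobility levels; the numbers depend entirely on the assumed geometry and should not be read as reproducing the paper's results.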

preprint | 2013 | arXiv

Partially Blind Instantly Decodable Network Codes for Lossy Feedback Environment

In this paper, we study the multicast completion and decoding delay minimization problems of instantly decodable network coding (IDNC) in the case of lossy feedback. In such environments, the sender becomes uncertain about packet reception at the different receivers, which forces it to perform partially blind selections of packet combinations in subsequent transmissions. To determine efficient partially blind policies that handle the completion and decoding delays of IDNC in such an environment, we first extend the perfect feedback formulation in [2], [3] to the lossy feedback environment, by incorporating the uncertainties resulting from unheard feedback events in these formulations. For the completion delay problem, we use this formulation to identify the maximum likelihood state of the network in events of unheard feedback, and employ it to design a partially blind graph update extension to the multicast IDNC algorithm in [3]. For the decoding delay problem, we derive an expression for the expected decoding delay increment for any arbitrary transmission. This expression is then used to derive the optimal policy to reduce the decoding delay in such a lossy feedback environment. Results show that our proposed solution both outperforms other approaches and achieves a tolerable degradation even at relatively high feedback loss rates.

preprint | 2012 | arXiv

Completion Delay Minimization for Instantly Decodable Network Codes

In this paper, we consider the problem of minimizing the completion delay for instantly decodable network coding (IDNC), in wireless multicast and broadcast scenarios. We are interested in this class of network coding due to its numerous benefits, such as low decoding delay, low coding and decoding complexities and simple receiver requirements. We first extend the IDNC graph, which represents all feasible IDNC coding opportunities, to efficiently operate in both multicast and broadcast scenarios. We then formulate the minimum completion delay problem for IDNC as a stochastic shortest path (SSP) problem. Although finding the optimal policy using SSP is intractable, we use this formulation to draw the theoretical guidelines for the policies that can efficiently reduce the completion delay in IDNC. Based on these guidelines, we design a maximum weight clique selection algorithm, which can efficiently reduce the IDNC completion delay in polynomial time. We also design a quadratic time heuristic clique selection algorithm, which can operate in real-time applications. Simulation results show that our proposed algorithms efficiently reduce the IDNC completion delay compared to the random and maximum-rate algorithms, and almost achieve the global optimal completion delay performance over all network codes in broadcast scenarios.
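The IDNC graph and clique selection mentioned above can be illustrated on a toy instance. This is a minimal sketch assuming the commonly used IDNC graph construction: one vertex per (receiver, wanted packet) pair, with an edge when two vertices either request the same packet or each receiver already holds the packet the other one wants. The greedy degree-based selection below is a simple stand-in heuristic, not the paper's maximum weight or quadratic time algorithm, and the function names are hypothetical.

```python
def build_idnc_graph(has, n_packets):
    """Build the IDNC graph; has[i] is the set of packets receiver i holds."""
    vertices = [(i, j) for i, held in enumerate(has)
                for j in range(n_packets) if j not in held]
    adj = {v: set() for v in vertices}
    for (i, j) in vertices:
        for (k, l) in vertices:
            if i == k:
                continue  # one receiver can decode only one packet at a time
            # Edge: same packet, or each receiver holds the other's packet.
            if j == l or (j in has[k] and l in has[i]):
                adj[(i, j)].add((k, l))
    return vertices, adj


def greedy_clique(vertices, adj):
    """Greedy heuristic: repeatedly add the best-connected compatible vertex."""
    clique = []
    candidates = set(vertices)
    while candidates:
        # Sorting makes tie-breaking deterministic.
        best = max(sorted(candidates),
                   key=lambda v: len(adj[v] & candidates))
        clique.append(best)
        # Keep only vertices adjacent to everything chosen so far.
        candidates &= adj[best]
    return clique


# Toy instance: each of three receivers misses exactly one distinct packet.
has = [{1, 2}, {0, 2}, {0, 1}]
vertices, adj = build_idnc_graph(has, n_packets=3)
clique = greedy_clique(vertices, adj)
coded_packet = {j for (_, j) in clique}  # packets to XOR together
```

In this toy instance the selected clique covers all three receivers, so XORing the packets in `coded_packet` lets each of them decode its missing packet from a single transmission.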