Researcher profile

Shohreh Kasaei

Shohreh Kasaei contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
6 works
0 followers
7 topics
4 close collaborators


Published work

6 published items

Preprint, 2024, arXiv

Rethinking RAFT for Efficient Optical Flow

Despite significant progress in deep learning-based optical flow methods, accurately estimating large displacements and repetitive patterns remains a challenge. The limitations of local features and similarity search patterns used in these algorithms contribute to this issue. Additionally, some existing methods suffer from slow runtime and excessive graphics memory consumption. To address these problems, this paper proposes a novel approach based on the RAFT framework. The proposed Attention-based Feature Localization (AFL) approach incorporates the attention mechanism to handle global feature extraction and address repetitive patterns. It introduces an operator for matching pixels with corresponding counterparts in the second frame and assigning accurate flow values. Furthermore, an Amorphous Lookup Operator (ALO) is proposed to enhance convergence speed and improve RAFT's ability to handle large displacements by reducing data redundancy in its search operator and expanding the search space for similarity extraction. The proposed method, Efficient RAFT (Ef-RAFT), achieves significant improvements of 10% on the Sintel dataset and 5% on the KITTI dataset over RAFT. Remarkably, these enhancements are attained with a modest 33% reduction in speed and a mere 13% increase in memory usage. The code is available at: https://github.com/n3slami/Ef-RAFT

Preprint, 2022, arXiv

LPF-Defense: 3D Adversarial Defense based on Frequency Analysis

Although 3D point cloud classification has recently been widely deployed in different application scenarios, it is still very vulnerable to adversarial attacks. This increases the importance of robust training of 3D models in the face of adversarial attacks. Based on our analysis of the performance of existing adversarial attacks, more adversarial perturbations are found in the mid- and high-frequency components of input data. Therefore, by suppressing the high-frequency content in the training phase, the model's robustness against adversarial examples is improved. Experiments showed that the proposed defense method decreases the success rate of six attacks on the PointNet, PointNet++, and DGCNN models. In particular, classification accuracy improves by an average of 3.8% on the drop100 attack and 4.26% on the drop200 attack compared to state-of-the-art methods. The method also improves the models' accuracy on the original dataset compared to other available methods.

Preprint, 2021, arXiv

Deep Learning for Visual Tracking: A Comprehensive Survey

Visual target tracking is one of the most sought-after yet challenging research topics in computer vision. Given the ill-posed nature of the problem and its popularity in a broad range of real-world scenarios, a number of large-scale benchmark datasets have been established, on which a considerable number of methods have been developed and have demonstrated significant progress in recent years, predominantly deep learning (DL)-based methods. This survey aims to systematically investigate the current DL-based visual tracking methods, benchmark datasets, and evaluation metrics. It also extensively evaluates and analyzes the leading visual tracking methods. First, the fundamental characteristics, primary motivations, and contributions of DL-based methods are summarized from nine key aspects: network architecture, network exploitation, network training for visual tracking, network objective, network output, exploitation of correlation filter advantages, aerial-view tracking, long-term tracking, and online tracking. Second, popular visual tracking benchmarks and their respective properties are compared, and their evaluation metrics are summarized. Third, the state-of-the-art DL-based methods are comprehensively examined on a set of well-established benchmarks: OTB2013, OTB2015, VOT2018, LaSOT, UAV123, UAVDT, and VisDrone2019. Finally, by conducting critical quantitative and qualitative analyses of these state-of-the-art trackers, their pros and cons under various common scenarios are investigated. The survey may serve as a gentle use guide for practitioners to weigh when and under what conditions to choose which method(s). It also facilitates a discussion on ongoing issues and sheds light on promising research directions.

Preprint, 2020, arXiv

COMET: Context-Aware IoU-Guided Network for Small Object Tracking

We consider the problem of tracking an unknown small target in aerial videos taken from medium to high altitudes. This is a challenging problem, which is even more pronounced in unavoidable scenarios of drastic camera motion and high density. To address this problem, we introduce a context-aware IoU-guided tracker (COMET) that exploits a multitask two-stream network and an offline reference proposal generation strategy. The proposed network fully exploits target-related information by multi-scale feature learning and attention modules. The proposed strategy introduces an efficient sampling strategy to generalize the network on the target and its parts without imposing extra computational complexity during online tracking. These strategies contribute considerably to handling significant occlusions and viewpoint changes. Empirically, COMET outperforms the state of the art on a range of aerial-view datasets that focus on tracking small objects. Specifically, COMET outperforms the celebrated ATOM tracker by an average margin of 6.2% (and 7%) in precision (and success) score on the challenging benchmarks UAVDT, VisDrone-2019, and Small-90.
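For readers unfamiliar with the term, the "IoU" in "IoU-guided" refers to intersection over union, the standard overlap measure between two bounding boxes. The sketch below shows only that textbook definition for axis-aligned boxes; it is not COMET's network or training procedure, and the function name `iou` and the `(x1, y1, x2, y2)` box layout are assumptions made for illustration.

```python
from typing import Tuple

# Hypothetical box layout for this sketch: (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes (standard definition)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b

    # Width and height of the overlap region, clamped at zero when the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    # Two degenerate (zero-area) boxes have no meaningful overlap; return 0 by convention.
    return inter / union if union > 0 else 0.0
```

For example, `iou((0, 0, 2, 2), (1, 1, 3, 3))` gives an overlap of 1 over a union of 7. Small targets make this score very sensitive to a shift of a few pixels, which is one reason small-object tracking is hard.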

Preprint, 2020, arXiv

Do Compressed Representations Generalize Better?

One of the most studied problems in machine learning is finding reasonable constraints that guarantee the generalization of a learning algorithm. These constraints are usually expressed as some simplicity assumptions on the target. For instance, in the Vapnik-Chervonenkis (VC) theory the space of possible hypotheses is considered to have a limited VC dimension. In this paper, the constraint on the entropy $H(X)$ of the input variable $X$ is studied as a simplicity assumption. It is proven that the sample complexity to achieve an $\epsilon$-$\delta$ Probably Approximately Correct (PAC) hypothesis is bounded by $\frac{2^{6H(X)/\epsilon}+\log\frac{1}{\delta}}{\epsilon^2}$, which is sharp up to the $\frac{1}{\epsilon^2}$ factor. Moreover, it is shown that if a feature learning process is employed to learn the compressed representation from the dataset, this bound no longer exists. These findings have important implications for the Information Bottleneck (IB) theory, which has been used to explain the generalization power of Deep Neural Networks (DNNs), but whose applicability for this purpose is currently under debate by researchers. In particular, this is a rigorous proof of the previous heuristic that compressed representations are exponentially easier to learn. However, our analysis pinpoints two factors preventing the IB, in its current form, from being applicable in studying neural networks. Firstly, the sample complexity depends exponentially on $\frac{1}{\epsilon}$, which can lead to a dramatic effect on the bounds in practical applications when $\epsilon$ is small. Secondly, our analysis reveals that arguments based on input compression are inherently insufficient to explain generalization of methods like DNNs in which the features are also learned using available data.
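To make the exponential dependence on $1/\epsilon$ concrete, the following sketch simply evaluates the expression quoted in the abstract. It rests on assumptions not stated there: the entropy is taken in bits, the logarithm is taken as natural, and any constant factors are ignored. The function name `pac_sample_bound` is hypothetical.

```python
import math


def pac_sample_bound(entropy_bits: float, epsilon: float, delta: float) -> float:
    """Evaluate (2^(6 H(X) / eps) + log(1 / delta)) / eps^2.

    Assumptions for this sketch: H(X) is measured in bits, the log is natural,
    and constant factors are not modeled.
    """
    if entropy_bits < 0:
        raise ValueError("entropy must be non-negative")
    if not (0 < epsilon < 1 and 0 < delta < 1):
        raise ValueError("epsilon and delta must lie in (0, 1)")
    exponential_term = 2.0 ** (6.0 * entropy_bits / epsilon)
    confidence_term = math.log(1.0 / delta)
    return (exponential_term + confidence_term) / epsilon ** 2


if __name__ == "__main__":
    # Halving epsilon doubles the exponent, so the expression grows very quickly.
    for eps in (0.5, 0.25, 0.125):
        print(eps, pac_sample_bound(entropy_bits=1.0, epsilon=eps, delta=0.05))
```

With one bit of entropy, moving from $\epsilon = 0.5$ to $\epsilon = 0.25$ raises the exponential term from $2^{12}$ to $2^{24}$, which illustrates the first limitation the abstract names.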

Preprint, 2020, arXiv

Lightweight Residual Densely Connected Convolutional Neural Network

Extremely efficient convolutional neural network architectures are one of the most important requirements for limited-resource devices (such as embedded and mobile devices). The computing power and memory size are two important constraints of these devices. Recently, some architectures have been proposed to overcome these limitations by considering specific hardware-software equipment. In this paper, lightweight residual densely connected blocks are proposed to guarantee the deep supervision, efficient gradient flow, and feature reuse abilities of the convolutional neural network. The proposed method decreases the cost of training and inference processes without using any special hardware-software equipment by just reducing the number of parameters and computational operations while achieving a feasible accuracy. Extensive experimental results demonstrate that the proposed architecture is more efficient than AlexNet and VGGNet in terms of model size, required parameters, and even accuracy. The proposed model has been evaluated on ImageNet, MNIST, Fashion MNIST, SVHN, CIFAR-10, and CIFAR-100. It achieves state-of-the-art results on the Fashion MNIST dataset and reasonable results on the others. The obtained results show the superiority of the proposed method to efficient models such as SqueezeNet. It is also comparable with state-of-the-art efficient models such as CondenseNet and ShuffleNet.