Source author record

Hassam Ullah Sheikh

Hassam Ullah Sheikh appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Artificial Intelligence, Machine Learning, Multiagent Systems, Cryptography and Security, Software Engineering

Catalog footprint

What is connected

4 works

5 topics

4 close collaborators

Preprint, 2022, arXiv

Preventing Value Function Collapse in Ensemble Q-Learning by Maximizing Representation Diversity

The classic DQN algorithm is limited by the overestimation bias of the learned Q-function. Subsequent algorithms have proposed techniques to reduce this problem, without fully eliminating it. Recently, the Maxmin and Ensemble Q-learning algorithms have used the different estimates provided by ensembles of learners to reduce the overestimation bias. Unfortunately, these learners can converge to the same point in the parametric or representation space, falling back to the classic single neural network DQN. In this paper, we describe a regularization technique to maximize ensemble diversity in these algorithms. We propose and compare five regularization functions inspired by economics theory and consensus optimization. We show that the regularized approach significantly outperforms the Maxmin and Ensemble Q-learning algorithms as well as non-ensemble baselines.
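As a rough illustration of the ensemble idea in this abstract, the sketch below computes a Maxmin-style bootstrap target: take the minimum Q-estimate across learners for each action, then the maximum over actions. The function name, array layout and default discount are assumptions made for this example and are not taken from the paper. The regularization functions the paper proposes are not shown.

```python
import numpy as np

def maxmin_target(q_next, reward, gamma=0.99, done=False):
    """Maxmin-style TD target (illustrative sketch, not the paper's code).

    q_next: array-like of shape (n_learners, n_actions) holding each
            learner's Q-value estimates for the next state (assumed layout).
    """
    q_next = np.asarray(q_next, dtype=float)
    # Minimum across learners for each action dampens overestimation.
    q_min = q_next.min(axis=0)
    # Greedy bootstrap over the pessimistic estimates; zero at episode end.
    bootstrap = 0.0 if done else float(q_min.max())
    return reward + gamma * bootstrap

# Diverse learners: the minimum differs from any single learner's estimate.
diverse = [[1.0, 2.0], [3.0, 0.5]]
# Collapsed learners: every row is identical, so the minimum changes nothing
# and the target equals that of a single network.
collapsed = [[1.0, 2.0], [1.0, 2.0]]

print(maxmin_target(diverse, reward=1.0, gamma=0.5))
print(maxmin_target(collapsed, reward=1.0, gamma=0.5))
```

When the rows are identical, the target is the same as taking the maximum over a single learner's estimates, which is the collapse the abstract describes.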

Preprint, 2020, arXiv

Automatic Feature Extraction, Categorization and Detection of Malicious Code in Android Applications

Android has recently become a popular software platform for mobile devices, which now offer almost the same functionality as personal computers, and malware has become a major concern. Because the number of new Android applications is expected to grow rapidly in the near future, malware needs to be detected automatically, quickly and efficiently. In this paper, we define a simple static analysis approach that first extracts the features of an Android application based on its intents and categorizes the application into a known major category. It then maps the category to the permissions requested by the application and compares them with the most obvious intents of that category. The result shows which apps use features that they are not supposed to use or do not need.
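The final comparison step described in this abstract can be pictured as a set difference between what an app requests and what its category would normally need. The sketch below is only an illustration under that assumption: the category profiles, the function name and the sample app are hypothetical, and the paper's actual feature extraction and categorization are not reproduced here.

```python
# Hypothetical category profiles: permissions an app of that category is
# assumed to need. These profiles are invented for illustration.
EXPECTED_PERMISSIONS = {
    "flashlight": {"android.permission.CAMERA"},
    "messaging": {
        "android.permission.SEND_SMS",
        "android.permission.READ_CONTACTS",
        "android.permission.INTERNET",
    },
}

def unexpected_permissions(category, requested):
    """Return requested permissions that the category profile does not cover."""
    expected = EXPECTED_PERMISSIONS.get(category, set())
    return sorted(set(requested) - expected)

# A hypothetical flashlight app that also asks to send SMS and read contacts.
flagged = unexpected_permissions(
    "flashlight",
    [
        "android.permission.CAMERA",
        "android.permission.SEND_SMS",
        "android.permission.READ_CONTACTS",
    ],
)
print(flagged)
```

A non-empty result marks the app for closer inspection. It is not proof of malicious behavior.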

Preprint, 2020, arXiv

Emergence of Scenario-Appropriate Collaborative Behaviors for Teams of Robotic Bodyguards

We consider the problem of controlling a team of robotic bodyguards protecting a VIP from physical assault in the presence of neutral and/or adversarial bystanders. This task is part of a much larger class of problems involving coordinated robot behavior in the presence of humans. The problem is challenging due to the large number of active entities with different agendas, the need for cooperation between the robots, and the requirement to consider criteria such as social norms and unobtrusiveness in addition to the main goal of VIP safety. Furthermore, different settings such as a street, a public space or a red carpet require very different behavior from the robots. We describe how a multi-agent reinforcement learning approach can evolve behavior policies for teams of robot bodyguards that compare well with hand-engineered approaches. Furthermore, we show that an algorithm inspired by universal value function approximators can learn policies that exhibit appropriate, distinct behavior in environments with different requirements.

Preprint, 2020, arXiv

Multi-Agent Reinforcement Learning for Problems with Combined Individual and Team Reward

Many cooperative multi-agent problems require agents to learn individual tasks while contributing to the collective success of the group. This is a challenging task for current state-of-the-art multi-agent reinforcement learning algorithms, which are designed to maximize either the global reward of the team or the individual local rewards. The problem is exacerbated when either of the rewards is sparse, leading to unstable learning. To address this problem, we present Decomposed Multi-Agent Deep Deterministic Policy Gradient (DE-MADDPG): a novel cooperative multi-agent reinforcement learning framework that simultaneously learns to maximize the global and local rewards. We evaluate our solution on the challenging defensive escort team problem and show that it achieves significantly better and more stable performance than a direct adaptation of the MADDPG algorithm.
