Researcher profile

Hassam Ullah Sheikh

Hassam Ullah Sheikh contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 17 - Unverified
Verification L1
Unclaimed author
4 works
0 followers
5 topics
4 close collaborators


Published work

4 published item(s)

Preprint, 2022, arXiv

Preventing Value Function Collapse in Ensemble Q-Learning by Maximizing Representation Diversity

The classic DQN algorithm is limited by the overestimation bias of the learned Q-function. Subsequent algorithms have proposed techniques to reduce this problem, without fully eliminating it. Recently, the Maxmin and Ensemble Q-learning algorithms have used the different estimates provided by ensembles of learners to reduce the overestimation bias. Unfortunately, these learners can converge to the same point in the parametric or representation space, so the ensemble falls back to the classic single neural network DQN. In this paper, we describe a regularization technique to maximize ensemble diversity in these algorithms. We propose and compare five regularization functions inspired by economics theory and consensus optimization. We show that the regularized approach significantly outperforms the Maxmin and Ensemble Q-learning algorithms as well as non-ensemble baselines.
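
The collapse described in this abstract can be illustrated with the bootstrap target used by Maxmin Q-learning, which takes the minimum over learners for each action and then the maximum over actions. The sketch below is a minimal illustration in plain Python, not the paper's implementation; the function names, the toy Q-values, and the discount factor are assumptions, and the paper's five regularization functions are not reproduced here because the abstract does not specify them.

```python
def maxmin_target(reward, next_q, gamma=0.99, done=False):
    """Maxmin Q-learning target.

    next_q is a list with one entry per learner; each entry lists that
    learner's estimates Q_i(s', a) for every action a (toy values here).
    """
    if done:
        return reward
    n_actions = len(next_q[0])
    # Pessimistic estimate per action: the minimum across the ensemble.
    min_per_action = [min(q[a] for q in next_q) for a in range(n_actions)]
    # Greedy bootstrap over the pessimistic estimates.
    return reward + gamma * max(min_per_action)


def dqn_target(reward, next_q_single, gamma=0.99, done=False):
    """Classic single-network DQN target, for comparison."""
    if done:
        return reward
    return reward + gamma * max(next_q_single)


# Hypothetical next-state Q-values for two learners and two actions.
diverse = [[1.0, 3.0], [2.0, 1.5]]    # learners disagree
collapsed = [[1.0, 3.0], [1.0, 3.0]]  # learners have converged to the same estimates

print(maxmin_target(0.0, diverse, gamma=1.0))    # 1.5, lower than either learner's own max
print(maxmin_target(0.0, collapsed, gamma=1.0))  # 3.0, identical to the single-network target
print(dqn_target(0.0, collapsed[0], gamma=1.0))  # 3.0
```

When the learners agree exactly, the minimum over the ensemble changes nothing, so the target is the same as the single-network one. This is the situation that a diversity regularizer is meant to prevent.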

Preprint, 2020, arXiv

Automatic Feature Extraction, Categorization and Detection of Malicious Code in Android Applications

Android has recently become a popular software platform for mobile devices, which now offer almost the same functionality as personal computers, and malware has become a major concern. As the number of new Android applications is expected to increase rapidly in the near future, there is a need for quick and efficient automatic malware detection. In this paper, we define a simple static analysis approach that first extracts the features of an Android application based on its intents and categorizes the application into a known major category, and then maps this category to the permissions requested by the application and compares it with the most obvious intents of the category. As a result, we can identify which apps use features that they are not supposed to use or do not need.
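
The two-step check described in this abstract (categorize by intents, then compare requested permissions against what the category would normally need) might look roughly like the sketch below. It is a toy illustration under stated assumptions, not the paper's method: the `CATEGORY_PROFILES` table, the overlap-based categorization rule, and the function names are all hypothetical, and a real tool would first extract intents and permissions from the application's manifest.

```python
# Hypothetical category profiles: typical intents and expected permissions.
CATEGORY_PROFILES = {
    "camera": {
        "intents": {
            "android.media.action.IMAGE_CAPTURE",
            "android.media.action.VIDEO_CAPTURE",
        },
        "permissions": {
            "android.permission.CAMERA",
            "android.permission.WRITE_EXTERNAL_STORAGE",
        },
    },
    "messaging": {
        "intents": {
            "android.intent.action.SEND",
            "android.intent.action.SENDTO",
        },
        "permissions": {
            "android.permission.SEND_SMS",
            "android.permission.READ_CONTACTS",
        },
    },
}


def categorize(app_intents):
    """Pick the category whose typical intents overlap most with the app's intents."""
    intents = set(app_intents)
    return max(
        CATEGORY_PROFILES,
        key=lambda name: len(CATEGORY_PROFILES[name]["intents"] & intents),
    )


def unexpected_permissions(app_intents, app_permissions):
    """Return the inferred category and the permissions it does not explain."""
    category = categorize(app_intents)
    expected = CATEGORY_PROFILES[category]["permissions"]
    return category, sorted(set(app_permissions) - expected)


# A camera-like app that also asks to send SMS messages.
category, extras = unexpected_permissions(
    ["android.media.action.IMAGE_CAPTURE"],
    ["android.permission.CAMERA", "android.permission.SEND_SMS"],
)
print(category, extras)  # camera ['android.permission.SEND_SMS']
```

A permission that falls outside the inferred category's profile is only a signal for closer inspection; the threshold for treating it as malicious is a separate design decision.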

Preprint, 2020, arXiv

Emergence of Scenario-Appropriate Collaborative Behaviors for Teams of Robotic Bodyguards

We consider the problem of controlling a team of robotic bodyguards protecting a VIP from physical assault in the presence of neutral and/or adversarial bystanders. This task is part of a much larger class of problems involving coordinated robot behavior in the presence of humans. The problem is challenging due to the large number of active entities with different agendas, the need for cooperation between the robots, and the requirement to take into consideration criteria such as social norms and unobtrusiveness in addition to the main goal of VIP safety. Furthermore, different settings such as a street, a public space or a red carpet require very different behavior from the robots. We describe how a multi-agent reinforcement learning approach can evolve behavior policies for teams of robot bodyguards that compare well with hand-engineered approaches. Furthermore, we show that an algorithm inspired by universal value function approximators can learn policies that exhibit appropriate, distinct behavior in environments with different requirements.

Preprint, 2020, arXiv

Multi-Agent Reinforcement Learning for Problems with Combined Individual and Team Reward

Many cooperative multi-agent problems require agents to learn individual tasks while contributing to the collective success of the group. This is a challenging task for current state-of-the-art multi-agent reinforcement learning algorithms, which are designed to maximize either the global reward of the team or the individual local rewards. The problem is exacerbated when either of the rewards is sparse, leading to unstable learning. To address this problem, we present Decomposed Multi-Agent Deep Deterministic Policy Gradient (DE-MADDPG): a novel cooperative multi-agent reinforcement learning framework that simultaneously learns to maximize the global and local rewards. We evaluate our solution on the challenging defensive escort team problem and show that our solution achieves significantly better and more stable performance than the direct adaptation of the MADDPG algorithm.
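
One way to picture learning from the global and local rewards at the same time is to keep two separate value estimates, a team-level one trained on the global reward and a per-agent one trained on each local reward, and to let each agent's policy improve on their sum. The sketch below shows only that bookkeeping in plain Python. It is an assumption-laden illustration, not the DE-MADDPG implementation: the function names, the scalar stand-ins for critic outputs, and the additive actor objective are hypothetical simplifications.

```python
def td_target(reward, next_value, gamma=0.95, done=False):
    """One-step temporal-difference target for a single critic."""
    return reward if done else reward + gamma * next_value


def decomposed_targets(global_reward, local_rewards,
                       next_global_q, next_local_qs, gamma=0.95):
    """Compute separate targets for the team critic and each agent's local critic.

    next_global_q and next_local_qs stand in for critic outputs at the
    next step (hypothetical scalars here instead of neural networks).
    """
    global_target = td_target(global_reward, next_global_q, gamma)
    local_targets = [
        td_target(r, q, gamma) for r, q in zip(local_rewards, next_local_qs)
    ]
    return global_target, local_targets


def actor_objective(global_q, local_q):
    """Assumed additive objective: an agent's policy ascends both estimates."""
    return global_q + local_q


# Two agents, one shared team reward, two individual rewards.
g_target, l_targets = decomposed_targets(
    global_reward=1.0,
    local_rewards=[0.0, 2.0],
    next_global_q=10.0,
    next_local_qs=[4.0, 6.0],
    gamma=0.5,
)
print(g_target, l_targets)  # 6.0 [2.0, 5.0]
```

Keeping the two targets separate means a sparse global reward does not drown out a dense local one, or the reverse, which is the instability the abstract attributes to single-objective approaches.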