Researcher profile

Man-on Pun

Man-on Pun contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 13 - UnverifiedVerification L1Unclaimed author
2works
0followers
4topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

2 published item(s)

preprint2022arXiv

Fairness-Oriented User Scheduling for Bursty Downlink Transmission Using Multi-Agent Reinforcement Learning

In this work, we develop practical user scheduling algorithms for downlink bursty traffic with emphasis on user fairness. In contrast to the conventional scheduling algorithms that either equally divides the transmission time slots among users or maximizing some ratios without physcial meanings, we propose to use the 5%-tile user data rate (5TUDR) as the metric to evaluate user fairness. Since it is difficult to directly optimize 5TUDR, we first cast the problem into the stochastic game framework and subsequently propose a Multi-Agent Reinforcement Learning (MARL)-based algorithm to perform distributed optimization on the resource block group (RBG) allocation. Furthermore, each MARL agent is designed to take information measured by network counters from multiple network layers (e.g. Channel Quality Indicator, Buffer size) as the input states while the RBG allocation as action with a proposed reward function designed to maximize 5TUDR. Extensive simulation is performed to show that the proposed MARL-based scheduler can achieve fair scheduling while maintaining good average network throughput as compared to conventional schedulers.

preprint2022arXiv

Rényi State Entropy for Exploration Acceleration in Reinforcement Learning

One of the most critical challenges in deep reinforcement learning is to maintain the long-term exploration capability of the agent. To tackle this problem, it has been recently proposed to provide intrinsic rewards for the agent to encourage exploration. However, most existing intrinsic reward-based methods proposed in the literature fail to provide sustainable exploration incentives, a problem known as vanishing rewards. In addition, these conventional methods incur complex models and additional memory in their learning procedures, resulting in high computational complexity and low robustness. In this work, a novel intrinsic reward module based on the Rényi entropy is proposed to provide high-quality intrinsic rewards. It is shown that the proposed method actually generalizes the existing state entropy maximization methods. In particular, a $k$-nearest neighbor estimator is introduced for entropy estimation while a $k$-value search method is designed to guarantee the estimation accuracy. Extensive simulation results demonstrate that the proposed Rényi entropy-based method can achieve higher performance as compared to existing schemes.