Researcher profile

Anders Jonsson

Anders Jonsson contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
9works
0followers
8topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

9 published item(s)

preprint2022arXiv

Computing Programs for Generalized Planning as Heuristic Search

Although heuristic search is one of the most successful approaches to classical planning, this planning paradigm does not apply straightforwardly to Generalized Planning (GP). This paper adapts the planning as heuristic search paradigm to the particularities of GP, and presents the first native heuristic search approach to GP. First, the paper defines a program-based solution space for GP that is independent of the number of planning instances in a GP problem, and the size of these instances. Second, the paper defines the BFGP algorithm for GP, that implements a best-first search in our program-based solution space, and that is guided by different evaluation and heuristic functions.

preprint2022arXiv

Performance and Coexistence Evaluation of IEEE 802.11be Multi-link Operation

Wi-Fi 7 is already in the making, and Multi-Link Operation (MLO) is one of the main features proposed in its correspondent IEEE 802.11be amendment. MLO will allow devices to coordinate multiple radio interfaces to access separate channels through a single association, aiming for improved throughput, network delay, and overall spectrum reuse efficiency. In this work, we study three reference scenarios to evaluate the performance of the two main MLO implementations -- Multi-Link Multi-Radio (MLMR) and Multi-Link Single-Radio (MLSR) -- , the interplay between multiple nodes employing them, and their coexistence with legacy Single-Link devices. Importantly, our results reveal that the potential of MLMR is mainly unleashed in isolated deployments or under unloaded network conditions. Instead, in medium- to high-load scenarios, MLSR may prove more effective in reducing the latency while guaranteeing fairness with contending Single-Link nodes.

preprint2022arXiv

Scaling-up Generalized Planning as Heuristic Search with Landmarks

Landmarks are one of the most effective search heuristics for classical planning, but largely ignored in generalized planning. Generalized planning (GP) is usually addressed as a combinatorial search in a given space of algorithmic solutions, where candidate solutions are evaluated w.r.t.~the instances they solve. This type of solution evaluation ignores any sub-goal information that is not explicit in the representation of the planning instances, causing plateaus in the space of candidate generalized plans. Furthermore, node expansion in GP is a run-time bottleneck since it requires evaluating every child node over the entire batch of classical planning instances in a GP problem. In this paper we define a landmark counting heuristic for GP (that considers sub-goal information that is not explicitly represented in the planning instances), and a novel heuristic search algorithm for GP (that we call PGP) and that progressively processes subsets of the planning instances of a GP problem. Our two orthogonal contributions are analyzed in an ablation study, showing that both improve the state-of-the-art in GP as heuristic search, and that both benefit from each other when used in combination.

preprint2022arXiv

State Representation Learning for Goal-Conditioned Reinforcement Learning

This paper presents a novel state representation for reward-free Markov decision processes. The idea is to learn, in a self-supervised manner, an embedding space where distances between pairs of embedded states correspond to the minimum number of actions needed to transition between them. Compared to previous methods, our approach does not require any domain knowledge, learning from offline and unlabeled data. We show how this representation can be leveraged to learn goal-conditioned policies, providing a notion of similarity between states and goals and a useful heuristic distance to guide planning and reinforcement learning algorithms. Finally, we empirically validate our method in classic control domains and multi-goal environments, demonstrating that our method can successfully learn representations in large and/or continuous domains.

preprint2021arXiv

Hierarchical Width-Based Planning and Learning

Width-based search methods have demonstrated state-of-the-art performance in a wide range of testbeds, from classical planning problems to image-based simulators such as Atari games. These methods scale independently of the size of the state-space, but exponentially in the problem width. In practice, running the algorithm with a width larger than 1 is computationally intractable, prohibiting IW from solving higher width problems. In this paper, we present a hierarchical algorithm that plans at two levels of abstraction. A high-level planner uses abstract features that are incrementally discovered from low-level pruning decisions. We illustrate this algorithm in classical planning PDDL domains as well as in pixel-based simulator domains. In classical planning, we show how IW(1) at two levels of abstraction can solve problems of width 2. For pixel-based domains, we show how in combination with a learned policy and a learned value function, the proposed hierarchical IW can outperform current flat IW-based planners in Atari games with sparse rewards.

preprint2021arXiv

Usage of Network Simulators in Machine-Learning-Assisted 5G/6G Networks

Without any doubt, Machine Learning (ML) will be an important driver of future communications due to its foreseen performance when applied to complex problems. However, the application of ML to networking systems raises concerns among network operators and other stakeholders, especially regarding trustworthiness and reliability. In this paper, we devise the role of network simulators for bridging the gap between ML and communications systems. In particular, we present an architectural integration of simulators in ML-aware networks for training, testing, and validating ML models before being applied to the operative network. Moreover, we provide insights on the main challenges resulting from this integration, and then give hints discussing how they can be overcome. Finally, we illustrate the integration of network simulators into ML-assisted communications through a proof-of-concept testbed implementation of a residential Wi-Fi network.

preprint2020arXiv

A Flexible Machine Learning-Aware Architecture for Future WLANs

Lots of hopes have been placed on Machine Learning (ML) as a key enabler of future wireless networks. By taking advantage of large volumes of data, ML is expected to deal with the ever-increasing complexity of networking problems. Unfortunately, current networks are not yet prepared to support the ensuing requirements of ML-based applications in terms of data collection, processing, and output distribution. This article points out the architectural requirements that are needed to pervasively include ML as part of future wireless networks operation. Specifically, we look into Wireless Local Area Networks (WLANs), which, due to their nature can be found in multiple forms, ranging from cloud-based to edge-computing-like deployments. In particular, we propose to adopt the International Telecommunications Union (ITU) unified architecture for 5G and beyond. Based on ITU's architecture, we provide insights on the main requirements and the major challenges of introducing ML to the multiple modalities of WLANs. Finally, we showcase the superiority of the architecture through an ML-enabled use case for future networks.

preprint2020arXiv

Lifelong Control of Off-grid Microgrid with Model Based Reinforcement Learning

The lifelong control problem of an off-grid microgrid is composed of two tasks, namely estimation of the condition of the microgrid devices and operational planning accounting for the uncertainties by forecasting the future consumption and the renewable production. The main challenge for the effective control arises from the various changes that take place over time. In this paper, we present an open-source reinforcement framework for the modeling of an off-grid microgrid for rural electrification. The lifelong control problem of an isolated microgrid is formulated as a Markov Decision Process (MDP). We categorize the set of changes that can occur in progressive and abrupt changes. We propose a novel model based reinforcement learning algorithm that is able to address both types of changes. In particular the proposed algorithm demonstrates generalisation properties, transfer capabilities and better robustness in case of fast-changing system dynamics. The proposed algorithm is compared against a rule-based policy and a model predictive controller with look-ahead. The results show that the trained agent is able to outperform both benchmarks in the lifelong setting where the system dynamics are changing over time.

preprint2020arXiv

Planning in Markov Decision Processes with Gap-Dependent Sample Complexity

We propose MDP-GapE, a new trajectory-based Monte-Carlo Tree Search algorithm for planning in a Markov Decision Process in which transitions have a finite support. We prove an upper bound on the number of calls to the generative models needed for MDP-GapE to identify a near-optimal action with high probability. This problem-dependent sample complexity result is expressed in terms of the sub-optimality gaps of the state-action pairs that are visited during exploration. Our experiments reveal that MDP-GapE is also effective in practice, in contrast with other algorithms with sample complexity guarantees in the fixed-confidence setting, that are mostly theoretical.