Researcher profile

Alvaro Valcarce

Alvaro Valcarce contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 17 - UnverifiedVerification L1Unclaimed author
4works
0followers
8topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

4 published item(s)

preprint2026arXiv

LLM-Based Emulation of the Radio Resource Control Layer: Towards AI-Native RAN Protocols

Integrating Large AI Models (LAMs) into 6G mobile networks is a key enabler of the AI-Native Air Interface (AI-AI), where protocol intelligence must scale beyond handcrafted logic. This paper presents, to our knowledge, the first standards-compliant emulation of the Radio Resource Control (RRC) layer using a decoder-only LAM (LLAMA-class) fine-tuned with Low-Rank Adaptation (LoRA) on a multi-vendor corpus of real-world traces spanning both 5G and 4G systems. We treat RRC as a domain-specific language and construct a segmentation-safe, question-answer (Question-and-Answer (QA)) dataset that preserves Abstract Syntax Notation (ASN.1) structure through linearization prior to Byte Pair Encoding (BPE) tokenization. The proposed approach combines parameter-efficient adaptation with schema-bounded prompting to ensure syntactic and procedural fidelity. Evaluation introduces a standards-aware triad -- ASN.1 conformance, field-level coverage analysis, and uplink-to-downlink state-machine checks -- alongside semantic similarity and latency profiling across 120 configurations. On 30k 5G request-response pairs plus an additional 4.8k QA turns from 4G sessions, our 8B model achieves a median cosine similarity of 0.97, a 61% relative gain over a zero-shot baseline, while sustaining high conformance rates. These results demonstrate that LAMs, when augmented with protocol-aware reasoning, can directly orchestrate control-plane procedures, laying the foundation for the future Artificial Intelligence (AI)-native Radio Access Network (RAN).

preprint2022arXiv

Fairness Based Energy-Efficient 3D Path Planning of a Portable Access Point: A Deep Reinforcement Learning Approach

In this work, we optimize the 3D trajectory of an unmanned aerial vehicle (UAV)-based portable access point (PAP) that provides wireless services to a set of ground nodes (GNs). Moreover, as per the Peukert effect, we consider pragmatic non-linear battery discharge for the battery of the UAV. Thus, we formulate the problem in a novel manner that represents the maximization of a fairness-based energy efficiency metric and is named fair energy efficiency (FEE). The FEE metric defines a system that lays importance on both the per-user service fairness and the energy efficiency of the PAP. The formulated problem takes the form of a non-convex problem with non-tractable constraints. To obtain a solution, we represent the problem as a Markov Decision Process (MDP) with continuous state and action spaces. Considering the complexity of the solution space, we use the twin delayed deep deterministic policy gradient (TD3) actor-critic deep reinforcement learning (DRL) framework to learn a policy that maximizes the FEE of the system. We perform two types of RL training to exhibit the effectiveness of our approach: the first (offline) approach keeps the positions of the GNs the same throughout the training phase; the second approach generalizes the learned policy to any arrangement of GNs by changing the positions of GNs after each training episode. Numerical evaluations show that neglecting the Peukert effect overestimates the air-time of the PAP and can be addressed by optimally selecting the PAP's flying speed. Moreover, the user fairness, energy efficiency, and hence the FEE value of the system can be improved by efficiently moving the PAP above the GNs. As such, we notice massive FEE improvements over baseline scenarios of up to 88.31%, 272.34%, and 318.13% for suburban, urban, and dense urban environments, respectively.

preprint2022arXiv

Scalable Joint Learning of Wireless Multiple-Access Policies and their Signaling

In this paper, we apply an multi-agent reinforcement learning (MARL) framework allowing the base station (BS) and the user equipments (UEs) to jointly learn a channel access policy and its signaling in a wireless multiple access scenario. In this framework, the BS and UEs are reinforcement learning (RL) agents that need to cooperate in order to deliver data. The comparison with a contention-free and a contention-based baselines shows that our framework achieves a superior performance in terms of goodput even in high traffic situations while maintaining a low collision rate. The scalability of the proposed method is studied, since it is a major problem in MARL and this paper provides the first results in order to address it.

preprint2021arXiv

The Emergence of Wireless MAC Protocols with Multi-Agent Reinforcement Learning

In this paper, we propose a new framework, exploiting the multi-agent deep deterministic policy gradient (MADDPG) algorithm, to enable a base station (BS) and user equipment (UE) to come up with a medium access control (MAC) protocol in a multiple access scenario. In this framework, the BS and UEs are reinforcement learning (RL) agents that need to learn to cooperate in order to deliver data. The network nodes can exchange control messages to collaborate and deliver data across the network, but without any prior agreement on the meaning of the control messages. In such a framework, the agents have to learn not only the channel access policy, but also the signaling policy. The collaboration between agents is shown to be important, by comparing the proposed algorithm to ablated versions where either the communication between agents or the central critic is removed. The comparison with a contention-free baseline shows that our framework achieves a superior performance in terms of goodput and can effectively be used to learn a new protocol.