Researcher profile

Javier García

Javier García contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 19 - Unverified | Verification L1 | Unclaimed author
5 works
0 followers
6 topics
4 close collaborators

Published work

5 published item(s)

Preprint | 2021 | arXiv

A Taxonomy of Similarity Metrics for Markov Decision Processes

Although the notion of task similarity is potentially interesting in a wide range of areas such as curriculum learning or automated planning, it has mostly been tied to transfer learning. Transfer is based on the idea of reusing the knowledge acquired while learning a set of source tasks in a new learning process on a target task, assuming that the target and source tasks are close enough. In recent years, transfer learning has succeeded in making Reinforcement Learning (RL) algorithms more efficient (e.g., by reducing the number of samples needed to achieve (near-)optimal performance). Transfer in RL is based on the core concept of similarity: whenever the tasks are similar, the transferred knowledge can be reused to solve the target task and significantly improve the learning performance. Therefore, the selection of good metrics to measure these similarities is a critical aspect when building transfer RL algorithms, especially when this knowledge is transferred from simulation to the real world. In the literature, there are many metrics to measure the similarity between MDPs, and hence many definitions of similarity, or of its complement, distance, have been considered. In this paper, we propose a categorization of these metrics and analyze the definitions of similarity proposed so far, taking into account such categorization. We also follow this taxonomy to survey the existing literature and to suggest future directions for the construction of new metrics.

Preprint | 2021 | arXiv

Channel Estimation and Data Equalization in Frequency-Selective MIMO Systems with One-Bit Quantization

This paper addresses channel estimation and data equalization in frequency-selective 1-bit quantized Multiple-Input Multiple-Output (MIMO) systems. No joint processing or Channel State Information is assumed at the transmitter, and therefore our findings are also applicable to the uplink of Multi-User MIMO systems. System models for both Orthogonal Frequency Division Multiplexing (OFDM) and single-carrier schemes are developed. A Cramér-Rao Lower Bound for the estimation problems is derived. The two nonlinear algorithms Expectation Maximization (EM) and Generalized Approximate Message Passing (GAMP) are adapted to the problems, and a linear method based on the Bussgang theorem is proposed. In the OFDM case, the linear method enables subcarrier-wise estimation, greatly reducing computational complexity. Simulations are carried out to compare the algorithms with different settings. The results turn out to be close to the Cramér-Rao bound in the low Signal to Noise Ratio (SNR) region. The OFDM setting proves more suitable for the nonlinear algorithms, and the linear methods incur a performance loss with respect to the nonlinear approaches. In the relevant low and medium SNR regions, the loss amounts to 2-3 dB and might well be justified in exchange for the reduced computational effort, especially in Massive MIMO settings.

Preprint | 2021 | arXiv

DevOps Team Structures: Characterization and Implications

Context: DevOps can be defined as a cultural movement to improve and accelerate the delivery of business value by making the collaboration between development and operations effective. Objective: This paper aims to help practitioners and researchers better understand the organizational structure and characteristics of teams adopting DevOps. Method: We conducted an exploratory study by leveraging in-depth, semi-structured interviews with relevant stakeholders of 31 multinational software-intensive companies, together with industrial workshops and observations at organizations' facilities that supported triangulation. We used Grounded Theory as the qualitative research method to explore the structure and characteristics of teams, and statistical analysis to discover their implications for software delivery performance. Results: We describe a taxonomy of team structure patterns that shows emerging, stable and consolidated product teams that are classified according to six variables, such as collaboration frequency, product ownership sharing, and autonomy, among others, as well as their implications for software delivery performance. These teams are often supported by horizontal teams (DevOps platform teams, Centers of Excellence, and chapters) that provide them with platform technical capability, mentoring and evangelization, and even temporarily facilitate human resources. Conclusion: This study aims to strengthen evidence and support practitioners in making better-informed decisions about organizational team structures by analyzing their main characteristics and implications for software delivery performance.

Preprint | 2021 | arXiv

Disturbing Reinforcement Learning Agents with Corrupted Rewards

Reinforcement Learning (RL) algorithms have led to recent successes in solving complex games, such as Atari or Starcraft, and to a huge impact in real-world applications, such as cybersecurity or autonomous driving. On the downside, recent works have shown how the performance of RL algorithms decreases under the influence of soft changes in the reward function. However, little work has been done on how sensitive the learner is to these disturbances depending on the aggressiveness of the attack and the learning exploration strategy. In this paper, we propose to fill this gap in the literature by analyzing the effects of different attack strategies based on reward perturbations, and by studying the effect on the learner depending on its exploration strategy. In order to explain all the behaviors, we choose a sub-class of MDPs: episodic, stochastic goal-only-rewards MDPs, and in particular, an intelligible grid domain as a benchmark. In this domain, we demonstrate that smoothly crafted adversarial rewards are able to mislead the learner, and that, when using low exploration probability values, the policy learned is more robust to corrupted rewards. Finally, in the proposed learning scenario, a counterintuitive result arises: attacking at each learning episode is the lowest-cost attack strategy.

Preprint | 2020 | arXiv

Relativistic reflection and reverberation in GX 339-4 with NICER and NuSTAR

We analyze seven NICER and NuSTAR epochs of the black hole X-ray binary GX 339-4 in the hard state during its two most recent hard-only outbursts in 2017 and 2019. These observations cover 1-100 keV unabsorbed luminosities between 0.3% and 2.1% of the Eddington limit. With NICER's negligible pile-up, high count rate and unprecedented time resolution, we perform a spectral-timing analysis and spectral modeling using relativistic and distant reflection models. Our spectral fitting shows that as the inner disk radius moves inwards, the thermal disk emission increases in flux and temperature, the disk becomes more highly ionized and the reflection fraction increases. This coincides with the inner disk increasing its radiative efficiency around ~1% Eddington. We see a hint of a hysteresis effect at ~0.3% of Eddington: the inner radius is significantly truncated during the rise ($>49\,R_g$), while only a mild truncation ($\sim 5\,R_g$) is found during the decay. At higher frequencies (2-7 Hz) in the highest luminosity epoch, a soft lag is present, whose energy dependence reveals a thermal reverberation lag, with an amplitude similar to previous findings for this source. We also discuss the plausibility of the hysteresis effect and the debate over the disk truncation problem in the hard state.