Source author record

Julián Urbano

Julián Urbano appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Information Retrieval · Applications · Computation and Language · Methodology

Catalog footprint

What is connected

6 works

4 topics

4 close collaborators

preprint · 2026 · arXiv

Stop Using the Wilcoxon Test: Myth, Misconception and Misuse in IR Research

In benchmarking of Information Retrieval systems, the Wilcoxon signed-rank test is often treated as a safer alternative to the t-test. This belief is fueled by textbooks and recommendations that portray Wilcoxon as the proper non-parametric alternative because metric scores are not normally distributed. We argue that this narrative is misleading and harmful. A careful review of Statistics textbooks reveals inconsistencies and omissions in how the assumptions underlying these tests are presented, fostering confusion that has propagated into IR research. As a result, Wilcoxon has been routinely misapplied for decades, creating a false sense of safety against a threat that was never there to begin with, while introducing another one so severe that it virtually guarantees the test will break down and mislead researchers. Through a combination of systematic literature review, analysis and empirical demonstrations with TREC data, we show how and why the Wilcoxon test easily loses control of its Type I error rate in IR settings. We conclude that the continued use of Wilcoxon in IR evaluation is unjustified and that abandoning it would improve the methodological soundness of our field.
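The abstract does not say which assumption is being violated, so the following is only a minimal sketch of one well-known way a signed-rank test can lose control of its Type I error rate. It assumes that the test is read as a test of "mean difference is zero" and that per-topic score differences are skewed. The signed-rank test presumes differences symmetric about zero under the null, so skewed differences with a true mean of zero can still be rejected more often than the nominal level. Everything here is hypothetical and not taken from the paper: the helper names `wilcoxon_p` and `rejection_rate`, the sample size, and the two simulated distributions. Synthetic numbers stand in for TREC data.

```python
import math
import random


def wilcoxon_p(diffs):
    """Two-sided signed-rank p-value via the normal approximation.

    Simplified on purpose: no tie or zero handling, no continuity correction.
    """
    n = len(diffs)
    ranked = sorted(diffs, key=abs)  # rank differences by absolute value
    w_plus = sum(rank for rank, d in enumerate(ranked, start=1) if d > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    return math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))


def rejection_rate(draw, n=50, trials=2000, alpha=0.05, seed=0):
    """Fraction of simulated experiments in which the test rejects at alpha."""
    rng = random.Random(seed)
    rejections = sum(
        wilcoxon_p([draw(rng) for _ in range(n)]) < alpha
        for _ in range(trials)
    )
    return rejections / trials


# Both distributions have a true mean difference of exactly zero.
# Assumption: symmetric differences, here standard normal.
symmetric = rejection_rate(lambda rng: rng.gauss(0.0, 1.0))
# Assumption: right-skewed differences, here Exponential(1) shifted to mean 0.
skewed = rejection_rate(lambda rng: rng.expovariate(1.0) - 1.0)

print(f"rejection rate, symmetric differences: {symmetric:.3f}")
print(f"rejection rate, skewed differences:    {skewed:.3f}")
```

In this toy setup the symmetric case should stay near the nominal 5%, while the skewed case rejects far more often even though the mean difference is zero. How closely this mirrors the behaviour reported for real IR metric scores is a question for the paper itself.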

preprint · 2021 · arXiv

Leave No User Behind: Towards Improving the Utility of Recommender Systems for Non-mainstream Users

In a collaborative-filtering recommendation scenario, biases in the data will likely propagate in the learned recommendations. In this paper we focus on the so-called mainstream bias: the tendency of a recommender system to provide better recommendations to users who have a mainstream taste, as opposed to non-mainstream users. We propose NAECF, a conceptually simple but effective idea to address this bias. The idea consists of adding an autoencoder (AE) layer when learning user and item representations with text-based Convolutional Neural Networks. The AEs, one for the users and one for the items, serve as adversaries to the process of minimizing the rating prediction error when learning how to recommend. They enforce that the specific unique properties of all users and items are sufficiently well incorporated and preserved in the learned representations. These representations, extracted as the bottlenecks of the corresponding AEs, are expected to be less biased towards mainstream users, and to provide more balanced recommendation utility across all users. Our experimental results confirm these expectations, significantly improving the recommendations for non-mainstream users while maintaining the recommendation quality for mainstream users. Our results emphasize the importance of deploying extensive content-based features, such as online reviews, in order to better represent users and items to maximize the de-biasing effect.

preprint · 2013 · arXiv

Overview of EIREX 2012: Social Media

The third Information Retrieval Education through EXperimentation track (EIREX 2012) was run at the University Carlos III of Madrid, during the 2012 spring semester. EIREX 2012 is the third in a series of experiments designed to foster new Information Retrieval (IR) education methodologies and resources, with the specific goal of teaching undergraduate IR courses from an experimental perspective. For an introduction to the motivation behind the EIREX experiments, see the first sections of [Urbano et al., 2011a]. For information on other editions of EIREX and related data, see the website at http://ir.kr.inf.uc3m.es/eirex/. The EIREX series has the following goals: a) to help students get a view of the Information Retrieval process as they would find it in a real-world scenario, either industrial or academic; b) to make students realize the importance of laboratory experiments in Computer Science and introduce them to their execution and analysis; c) to create a public repository of resources to teach Information Retrieval courses; d) to seek the collaboration and active participation of other Universities in this endeavor. This overview paper summarizes the results of the EIREX 2012 track, focusing on the creation of the test collection and the analysis to assess its reliability.

preprint · 2012 · arXiv

Information Retrieval Systems Adapted to the Biomedical Domain

The terminology used in Biomedicine shows lexical peculiarities that have required the elaboration of terminological resources and information retrieval systems with specific functionalities. The main characteristics are the high rates of synonymy and homonymy, due to phenomena such as the proliferation of polysemic acronyms and their interaction with common language. Information retrieval systems in the biomedical domain use techniques oriented to the treatment of these lexical peculiarities. In this paper we review some of the techniques used in this domain, such as the application of Natural Language Processing (BioNLP), the incorporation of lexical-semantic resources, and the application of Named Entity Recognition (BioNER). Finally, we present the evaluation methods adopted to assess the suitability of these techniques for retrieving biomedical resources.

preprint · 2012 · arXiv

Overview of EIREX 2011: Crowdsourcing

The second Information Retrieval Education through EXperimentation track (EIREX 2011) was run at the University Carlos III of Madrid, during the 2011 spring semester. EIREX 2011 is the second in a series of experiments designed to foster new Information Retrieval (IR) education methodologies and resources, with the specific goal of teaching undergraduate IR courses from an experimental perspective. For an introduction to the motivation behind the EIREX experiments, see the first sections of [Urbano et al., 2011a]. For information on other editions of EIREX and related data, see the website at http://ir.kr.inf.uc3m.es/eirex/. The EIREX series has the following goals: a) to help students get a view of the Information Retrieval process as they would find it in a real-world scenario, either industrial or academic; b) to make students realize the importance of laboratory experiments in Computer Science and introduce them to their execution and analysis; c) to create a public repository of resources to teach Information Retrieval courses; d) to seek the collaboration and active participation of other Universities in this endeavor. This overview paper summarizes the results of the EIREX 2011 track, focusing on the creation of the test collection and the analysis to assess its reliability.

preprint · 2011 · arXiv

Overview of EIREX 2010: Computing

The first Information Retrieval Education through Experimentation track (EIREX 2010) was run at the University Carlos III of Madrid, during the 2010 spring semester. EIREX 2010 is the first in a series of experiments designed to foster new Information Retrieval (IR) education methodologies and resources, with the specific goal of teaching undergraduate IR courses from an experimental perspective. For an introduction to the motivation behind the EIREX experiments, see the first sections of [Urbano et al., 2011]. For information on other editions of EIREX and related data, see the website at http://ir.kr.inf.uc3m.es/eirex/. The EIREX series has the following goals: a) to help students get a view of the Information Retrieval process as they would find it in a real-world scenario, either industrial or academic; b) to make students realize the importance of laboratory experiments in Computer Science and introduce them to their execution and analysis; c) to create a public repository of resources to teach Information Retrieval courses; d) to seek the collaboration and active participation of other Universities in this endeavor. This overview paper summarizes the results of the EIREX 2010 track, focusing on the creation of the test collection and the analysis to assess its reliability.