Source author record

Philipp Schaer

Philipp Schaer appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Information Retrieval, Digital Libraries, Artificial Intelligence, Computation and Language, cs.CY, Human-Computer Interaction, physics.soc-ph, Social and Information Networks

Catalog footprint

What is connected

27 works

8 topics

4 close collaborators

preprint 2026 arXiv

Auditing Search Query Suggestion Bias Through Recursive Algorithm Interrogation

Despite their important role in online information search, search query suggestions have not been researched as much as most other aspects of search engines. Although the reasons for this are multi-faceted, the sparseness of context and the limited data basis of up to ten suggestions per search query pose the most significant problem in identifying bias in search query suggestions. The most proven method so far to reduce sparseness and improve the validity of bias identification is to consider suggestions from subsequent searches over time for the same query. This work presents a new, alternative approach to search query bias identification that includes lower-level suggestions to deepen the data basis of bias analyses. We employ recursive algorithm interrogation techniques and create suggestion trees that enable access to more subliminal search query suggestions. Based on these suggestions, we investigate topical group bias in person-related searches in the political domain.
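The idea of a suggestion tree can be illustrated with a minimal sketch: each suggestion returned for a query is submitted again as a query of its own, down to a fixed depth. This is not the authors' implementation. The function names, the `max_depth` parameter and the canned data are assumptions made for illustration, and the suggester is a local stub rather than a call to any real search engine.

```python
from collections import deque
from typing import Callable, Dict, List


def build_suggestion_tree(
    root_query: str,
    get_suggestions: Callable[[str], List[str]],
    max_depth: int = 2,
) -> Dict[str, List[str]]:
    """Breadth-first expansion: every suggestion is re-submitted as a new query."""
    tree: Dict[str, List[str]] = {}
    queue = deque([(root_query, 0)])
    while queue:
        query, depth = queue.popleft()
        # Skip queries that were already expanded or that lie beyond the depth limit.
        if query in tree or depth >= max_depth:
            continue
        children = get_suggestions(query)
        tree[query] = children
        for child in children:
            queue.append((child, depth + 1))
    return tree


# Hypothetical canned data standing in for a live suggestion service.
FAKE_SUGGESTIONS = {
    "jane doe": ["jane doe age", "jane doe party"],
    "jane doe age": ["jane doe age 2020"],
    "jane doe party": [],
}


def fake_suggester(query: str) -> List[str]:
    """Stub suggester: looks the query up in the canned data."""
    return FAKE_SUGGESTIONS.get(query, [])


tree = build_suggestion_tree("jane doe", fake_suggester, max_depth=2)
for parent, children in tree.items():
    print(parent, "->", children)
```

With `max_depth=2`, the root query and its direct suggestions are expanded, so second-level suggestions appear as leaves. A real audit would also need rate limiting and logging of collection time and location, which this sketch leaves out.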

preprint 2026 arXiv

Dynamics in Search Engine Query Suggestions for European Politicians

Search engines are commonly used for online political information seeking. Yet, it remains unclear how search query suggestions for political searches that reflect the latent interest of internet users vary across countries and over time. We provide a systematic analysis of Google search engine query suggestions for European and national politicians. Using an original dataset of search query suggestions for European politicians collected in ten countries, we find that query suggestions are less stable over time in politicians' countries of origin, when the politicians hold a supranational role, and for female politicians. Moreover, query suggestions for political leaders and male politicians are more similar across countries. We conclude by discussing possible future directions for studying information search about European politicians in online search.

preprint 2026 arXiv

LISP -- A Rich Interaction Dataset and Loggable Interactive Search Platform

We present a reusable dataset and accompanying infrastructure for studying human search behavior in Interactive Information Retrieval (IIR). The dataset combines detailed interaction logs from 61 participants (122 sessions) with user characteristics, including perceptual speed, topic-specific interest, search expertise, and demographic information. To facilitate reproducibility and reuse, we provide a fully documented study setup, a web-based perceptual speed test, and a framework for conducting similar user studies. Our work allows researchers to investigate individual and contextual factors affecting search behavior, and to develop or validate user simulators that account for such variability. We demonstrate the dataset's potential through an illustrative analysis and release all resources as open-access, supporting reproducible research and resource sharing in the IIR community.

preprint 2026 arXiv

Perception-Aware Bias Detection for Query Suggestions

Bias in web search has been in the spotlight of bias detection research for quite a while, yet little attention has been paid to query suggestions in this regard. Awareness of the problem of biased query suggestions has been raised, and there is a rising need for automatic bias detection approaches. This paper extends the pipeline for bias detection in query suggestions of person-related search developed by Bonart et al. (2019). The sparseness and lack of contextual metadata of query suggestions make them a difficult subject for bias detection. Furthermore, query suggestions are perceived very briefly and subliminally. To overcome these issues, perception-aware metrics are introduced. Consequently, the enhanced pipeline is able to better detect systematic topical bias in search engine query suggestions for person-related searches. The results of an analysis performed with the developed pipeline confirm this assumption. Due to the perception-aware bias detection metrics, findings produced by the pipeline can be assumed to reflect bias that users would discern.

preprint 2026 arXiv

Sim4IA-Bench: A User Simulation Benchmark Suite for Next Query and Utterance Prediction

Validating user simulation is a difficult task due to the lack of established measures and benchmarks, which makes it challenging to assess whether a simulator accurately reflects real user behavior. As part of the Sim4IA Micro-Shared Task at the Sim4IA Workshop, SIGIR 2025, we present Sim4IA-Bench, a simulation benchmark suite for the prediction of the next queries and utterances, the first of its kind in the IR community. Our dataset as part of the suite comprises 160 real-world search sessions from the CORE search engine. For 70 of these sessions, up to 62 simulator runs are available, divided into Task A and Task B, in which different approaches predicted users' next search queries or utterances. Sim4IA-Bench provides a basis for evaluating and comparing user simulation approaches and for developing new measures of simulator validity. Although modest in size, the suite represents the first publicly available benchmark that links real search sessions with simulated next-query predictions. In addition to serving as a testbed for next query prediction, it also enables exploratory studies on query reformulation behavior, intent drift, and interaction-aware retrieval evaluation. We also introduce a new measure for evaluating next-query predictions in this task. By making the suite publicly available, we aim to promote reproducible research and stimulate further work on realistic and explainable user simulation for information access: https://github.com/irgroup/Sim4IA-Bench.

preprint 2026 arXiv

Validating Search Query Simulations: A Taxonomy of Measures

Assessing the validity of user simulators when used for the evaluation of information retrieval systems remains an open question, constraining their effective use and the reliability of simulation-based results. To address this issue, we conduct a comprehensive literature review with a particular focus on methods for the validation of simulated user queries with regard to real queries. Based on the review, we develop a taxonomy that structures the current landscape of available measures. We empirically corroborate the taxonomy by analyzing the relationships between the different measures applied to four different datasets representing diverse search scenarios. Finally, we provide concrete recommendations on which measures or combinations of measures should be considered when validating user simulation in different contexts. Furthermore, we release a dedicated library with the most commonly used measures to facilitate future research.

preprint 2022 arXiv

Evaluating Elements of Web-based Data Enrichment for Pseudo-Relevance Feedback Retrieval

In this work, we analyze a pseudo-relevance retrieval method based on the results of web search engines. By enriching topics with text data from web search engine result pages and linked contents, we train topic-specific and cost-efficient classifiers that can be used to search test collections for relevant documents. Building upon attempts initially made at TREC Common Core 2018 by Grossman and Cormack, we address questions of system performance over time considering different search engines, queries, and test collections. Our experimental results show how and to what extent the considered components affect the retrieval performance. Overall, the analyzed method is robust in terms of average retrieval performance and a promising way to use web content for the data enrichment of relevance feedback methods.

preprint 2022 arXiv

ir_metadata: An Extensible Metadata Schema for IR Experiments

The information retrieval (IR) community has a strong tradition of making the computational artifacts and resources available for future reuse, allowing the validation of experimental results. Besides the actual test collections, the underlying run files are often hosted in data archives as part of conferences like TREC, CLEF, or NTCIR. Unfortunately, the run data itself does not provide much information about the underlying experiment. For instance, the single run file is not of much use without the context of the shared task's website or the run data archive. In other domains, like the social sciences, it is good practice to annotate research data with metadata. In this work, we introduce ir_metadata - an extensible metadata schema for TREC run files based on the PRIMAD model. We propose to align the metadata annotations to PRIMAD, which considers components of computational experiments that can affect reproducibility. Furthermore, we outline important components and information that should be reported in the metadata and give evidence from the literature. To demonstrate the usefulness of these metadata annotations, we implement new features in repro_eval that support the outlined metadata schema for the use case of reproducibility studies. Additionally, we curate a dataset with run files derived from experiments with different instantiations of PRIMAD components and annotate these with the corresponding metadata. In the experiments, we cover reproducibility experiments that are identified by the metadata and classified by PRIMAD. With this work, we enable IR researchers to annotate TREC run files and improve the reuse value of experimental artifacts even further.

preprint 2022 arXiv

Overview of LiLAS 2021 -- Living Labs for Academic Search

The Living Labs for Academic Search (LiLAS) lab aims to strengthen the concept of user-centric living labs for academic search. The methodological gap between real-world and lab-based evaluation should be bridged by allowing lab participants to evaluate their retrieval approaches in two real-world academic search systems from life sciences and social sciences. This overview paper outlines the two academic search systems LIVIVO and GESIS Search, and their corresponding tasks within LiLAS, which are ad-hoc retrieval and dataset recommendation. The lab is based on a new evaluation infrastructure named STELLA that allows participants to submit results corresponding to their experimental systems in the form of pre-computed runs and Docker containers that can be integrated into production systems and generate experimental results in real-time. Both submission types are interleaved with the results provided by the productive systems allowing for a seamless presentation and evaluation. The evaluation of results and a meta-analysis of the different tasks and submission types complement this overview.

preprint 2022 arXiv

repro_eval: A Python Interface to Reproducibility Measures of System-oriented IR Experiments

In this work we introduce repro_eval - a tool for reactive reproducibility studies of system-oriented information retrieval (IR) experiments. The corresponding Python package provides IR researchers with measures for different levels of reproduction when evaluating their systems' outputs. By offering an easily extensible interface, we hope to stimulate common practices when conducting a reproducibility study of system-oriented IR experiments.

preprint 2022 arXiv

Validating Simulations of User Query Variants

System-oriented IR evaluations are limited to rather abstract understandings of real user behavior. As a solution, simulating user interactions provides a cost-efficient way to support system-oriented experiments with more realistic directives when no interaction logs are available. While there are several user models for simulated clicks or result list interactions, very few attempts have been made towards query simulations, and it has not been investigated if these can reproduce properties of real queries. In this work, we validate simulated user query variants with the help of TREC test collections in reference to real user queries that were made for the corresponding topics. Besides, we introduce a simple yet effective method that gives better reproductions of real queries than the established methods. Our evaluation framework validates the simulations regarding the retrieval performance, reproducibility of topic score distributions, shared task utility, effort and effect, and query term similarity when compared with real user query variants. While the retrieval effectiveness and statistical properties of the topic score distributions as well as economic aspects are close to that of real queries, it is still challenging to simulate exact term matches and later query reformulations.

preprint 2020 arXiv

Computational Methods in Professional Communication

The digitization of the world has also led to a digitization of communication processes. Traditional research methods fall short in understanding communication in digital worlds as the scope has become too large in volume, variety, and velocity to be studied using traditional approaches. In this paper, we present computational methods and their use in public and mass communication research and how those could be adapted to professional communication research. The paper is a proposal for a panel in which the panelists, each an expert in their field, will present their current work using computational methods and will discuss transferability of these methods to professional communication.

preprint 2016 arXiv

A System for Probabilistic Linking of Thesauri and Classification Systems

This paper presents a system which creates and visualizes probabilistic semantic links between concepts in a thesaurus and classes in a classification system. For creating the links, we build on the Polylingual Labeled Topic Model (PLL-TM). PLL-TM identifies probable thesaurus descriptors for each class in the classification system by using information from the natural language text of documents, their assigned thesaurus descriptors and their designated classes. The links are then presented to users of the system in an interactive visualization, providing them with an automatically generated overview of the relations between the thesaurus and the classification system.

preprint 2014 arXiv

Editorial for the Bibliometric-enhanced Information Retrieval Workshop at ECIR 2014

This first "Bibliometric-enhanced Information Retrieval" (BIR 2014) workshop aims to engage with the IR community about possible links to bibliometrics and scholarly communication. Bibliometric techniques are not yet widely used to enhance retrieval processes in digital libraries, although they offer value-added effects for users. In this workshop we will explore how statistical modelling of scholarship, such as Bradfordizing or network analysis of co-authorship networks, can improve retrieval services for specific communities, as well as for large, cross-domain collections. This workshop aims to raise awareness of the missing link between information retrieval (IR) and bibliometrics / scientometrics and to create a common ground for the incorporation of bibliometric-enhanced services into retrieval at the digital library interface. Our interests include information retrieval, information seeking, science modelling, network analysis, and digital libraries. The goal is to apply insights from bibliometrics, scientometrics, and informetrics to concrete practical problems of information retrieval and browsing.

preprint 2013 arXiv

An OAI-PMH-based Web Service for the Generation of Co-Author Networks

We will present a new component of our technical framework that was built to provide a broad range of reusable web services for the enhancement of typical scientific retrieval processes. The proposed component computes betweenness of authors in co-authorship networks extracted from publicly available metadata that was harvested using OAI-PMH.
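As a rough illustration of the computation this abstract names, the sketch below builds a co-authorship graph from author lists and computes unnormalized betweenness with Brandes' algorithm. It is not the described web service: the OAI-PMH harvesting step is omitted, and the author names, function names and the choice of an unweighted graph are assumptions made for illustration.

```python
from collections import defaultdict, deque
from itertools import combinations


def coauthor_graph(papers):
    """Build an undirected adjacency map: two authors are linked if they share a paper."""
    graph = defaultdict(set)
    for authors in papers:
        for author in authors:
            graph[author]  # touching the key keeps single-author papers as isolated nodes
        for a, b in combinations(sorted(set(authors)), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def betweenness(graph):
    """Brandes' algorithm for unweighted, undirected graphs (unnormalized scores)."""
    score = {node: 0.0 for node in graph}
    for source in graph:
        stack = []
        preds = {node: [] for node in graph}
        sigma = {node: 0 for node in graph}   # number of shortest paths from source
        dist = {node: -1 for node in graph}   # -1 marks unvisited nodes
        sigma[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies in order of decreasing distance from the source.
        delta = {node: 0.0 for node in graph}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Every undirected pair was counted once from each endpoint.
    return {node: value / 2 for node, value in score.items()}


# Hypothetical author lists, one list per paper.
papers = [["Ada", "Ben"], ["Ben", "Cy"], ["Cy", "Dee"]]
scores = betweenness(coauthor_graph(papers))
print(scores)
```

In this toy chain of collaborations, the two middle authors lie on shortest paths between the others and therefore receive the non-zero scores.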

preprint 2013 arXiv

Bibliometric-enhanced Information Retrieval

Bibliometric techniques are not yet widely used to enhance retrieval processes in digital libraries, although they offer value-added effects for users. In this workshop we will explore how statistical modelling of scholarship, such as Bradfordizing or network analysis of co-authorship networks, can improve retrieval services for specific communities, as well as for large, cross-domain collections. This workshop aims to raise awareness of the missing link between information retrieval (IR) and bibliometrics/scientometrics and to create a common ground for the incorporation of bibliometric-enhanced services into retrieval at the digital library interface.

preprint 2013 arXiv

Performing Informetric Analysis on Information Retrieval Test Collections: Preliminary Experiments in the Physics Domain

The combination of informetric analysis and information retrieval allows a twofold application. (1) While informetric analysis is primarily used to gain insights into a scientific domain, it can be used to build recommendation or alternative ranking services. They are usually based on methods like co-occurrence or citation analyses. (2) Information retrieval and its decades-long tradition of rigorous evaluation using standard document corpora, predefined topics and relevance judgements can be used as a test bed for informetric analyses. We show a preliminary experiment on how both domains can be connected using the iSearch test collection, a standard information retrieval test collection derived from the open access arXiv.org preprint server. In this paper the aim is to draw a conclusion about the appropriateness of iSearch as a test bed for the evaluation of a retrieval or recommendation system that applies informetric methods to improve retrieval results for the user. Based on an interview study with physicists, bibliographic coupling and author-co-citation analysis, important authors for ten different research questions are identified. The results show that the analysed corpus includes these authors and their corresponding documents. This study is a first step towards a combination of retrieval evaluations and the evaluation of informetric analysis methods.

preprint 2012 arXiv

Building Custom Term Suggestion Web Services with OAI-Harvested Open Data

The problem that the same information need can be expressed in a variety of ways is especially true for scientific literature. Each scientific discipline has its own domain-specific language and vocabulary. This language is coded into documentary tools like thesauri or classifications that are used to document and describe scientific documents. When we think of information retrieval as "fundamentally a linguistic process" (Blair, 2003), users have to be aware of the most relevant search terms, which are the controlled thesaurus terms the documents are described with. This can be achieved with so-called search-term-recommenders (STR) that map free search terms of a lay user to controlled vocabulary terms which can then be used as a term suggestion or to do an automatic query expansion (Hienert, Schaer, Schaible, & Mayr, 2011). State-of-the-art repository software systems like DSpace or EPrints already offer some kind of term suggestion features in search or input forms, but these implementations only work as simple auto-completion mechanisms that don't incorporate any kind of semantic mapping. Such software systems would gain a lot in terms of usability and data consistency if tools like the proposed domain-specific STRs were freely available. We aim to implement a rich toolbox of web services (like the mentioned domain-specific STRs) to support users and providers of online Digital Library (DL) or repository systems.

preprint 2012 arXiv

Dealing with Sparse Document and Topic Representations: Lab Report for CHiC 2012

We report on the participation of GESIS in the first CHiC workshop (Cultural Heritage in CLEF). Because the workshop was held for the first time, no prior experience existed with the new data set, a document dump of Europeana with ca. 23 million documents. The most prominent issues that arose from pretests with this test collection were the very unspecific topics and sparse document representations. Only half of the topics (26/50) contained a description, and the titles were usually short, with just around two words. Therefore, we focused on three different term suggestion and query expansion mechanisms to overcome the sparse topical description. We used two methods that build on concept extraction from Wikipedia and one method that applies co-occurrence statistics to the available Europeana corpus. In this paper we present the approaches and preliminary results from their assessments.

preprint 2012 arXiv

Extending Term Suggestion with Author Names

Term suggestion or recommendation modules can help users to formulate their queries by mapping their personal vocabularies onto the specialized vocabulary of a digital library. When we examined actual user queries of the social sciences digital library Sowiport, we could see that nearly one third of the users were explicitly looking for author names rather than terms. Common term recommenders neglect this fact. By picking up the idea of polyrepresentation, we could show that in a standardized IR evaluation setting we can significantly increase the retrieval performance by adding topically related author names to the query. This positive effect only appears when the query is additionally expanded with thesaurus terms. By just adding the author names to a query, we often observe a query drift which leads to worse results.

preprint 2012 arXiv

Improving Retrieval Results with discipline-specific Query Expansion

Choosing the right terms to describe an information need is becoming more difficult as the amount of available information increases. Search-Term-Recommendation (STR) systems can help to overcome these problems. This paper evaluates the benefits that may be gained from the use of STRs in Query Expansion (QE). We create 17 STRs, 16 based on specific disciplines and one giving general recommendations, and compare the retrieval performance of these STRs. The main findings are: (1) QE with specific STRs leads to significantly better results than QE with a general STR, (2) QE with specific STRs selected by a heuristic mechanism of topic classification leads to better results than the general STR, however (3) selecting the best matching specific STR in an automatic way is a major challenge of this process.

preprint 2012 arXiv

Integrating Interactive Visualizations in the Search Process of Digital Libraries and IR Systems

Interactive visualizations for exploration and retrieval have not yet become an integral part of digital libraries and information retrieval systems. We have integrated a set of interactive graphics in a real-world social science digital library. These visualizations support the exploration of search queries, results and authors, can filter search results, show trends in the database and can support the creation of new search queries. The use of weighted brushing supports the identification of related metadata for search facets. We discuss some use cases of the combination of IR systems and interactive graphics. In a user study we verify that users can gain insights from statistical graphics intuitively and can adopt interaction techniques.

preprint 2011 arXiv

A Science Model Driven Retrieval Prototype

This paper is about a better understanding of the structure and dynamics of science and the use of these insights to compensate for the typical problems that arise in metadata-driven Digital Libraries. Three science model driven retrieval services are presented: co-word analysis based query expansion, re-ranking via Bradfordizing and author centrality. The services are evaluated with relevance assessments from which two important implications emerge: (1) precision values of the retrieval service are the same or better than the tf-idf retrieval baseline and (2) each service retrieved a disjoint set of documents. Each service favors quite different - but still relevant - documents compared with pure term-frequency based rankings. The proposed models and derived retrieval services therefore open up new viewpoints on the scientific knowledge space and provide an alternative framework to structure scholarly information systems.
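Re-ranking via Bradfordizing can be sketched in a few lines, under the simplifying assumption that it means sorting a result set so that documents from the journals contributing the most hits to that result set come first. The record layout `(doc_id, journal)` and the function name are hypothetical, and the prototype described above may differ in detail.

```python
from collections import Counter


def bradfordize(results):
    """Re-rank (doc_id, journal) pairs so documents from the most productive journals lead."""
    journal_counts = Counter(journal for _, journal in results)
    # sorted() is stable: the original order is kept within a journal and among ties.
    return sorted(results, key=lambda item: -journal_counts[item[1]])


# Hypothetical result list in its original (for example term-based) ranking.
results = [
    ("d1", "Journal A"),
    ("d2", "Journal B"),
    ("d3", "Journal B"),
    ("d4", "Journal C"),
    ("d5", "Journal B"),
    ("d6", "Journal A"),
]
reranked = bradfordize(results)
print([doc_id for doc_id, _ in reranked])
```

Because the sort is stable, the original ranking still decides the order among documents from equally productive journals, which keeps the re-ranking deterministic.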

preprint 2011 arXiv

Applying Science Models for Search

The paper proposes three different kinds of science models as value-added services that are integrated in the retrieval process to enhance retrieval quality. The paper discusses the approaches Search Term Recommendation, Bradfordizing and Author Centrality on a general level and addresses implementation issues of the models within a real-life retrieval environment.

preprint 2011 arXiv

Using Lotkaian Informetrics for Ranking in Digital Libraries

The purpose of this paper is to propose the use of models, theories and laws in bibliometrics and scientometrics to enhance information retrieval processes, especially ranking. A common pattern in many man-made data sets is Lotka's Law which follows the well-known power-law distributions. These informetric distributions can be used to give an alternative order to large and scattered result sets and can be applied as a new ranking mechanism. The polyrepresentation of information in Digital Library systems is used to enhance the retrieval quality, to overcome the drawbacks of the typical term-based ranking approaches and to enable users to explore retrieved document sets from a different perspective.

preprint 2011 arXiv

Web-Based Multi-View Visualizations for Aggregated Statistics

With the rise of the open data movement a lot of statistical data has been made publicly available by governments, statistical offices and other organizations. First efforts to visualize these data are made by the data providers themselves. Data aggregators go a step beyond: they collect data from different open data repositories and make them comparable by providing data sets from different providers and showing different statistics in the same chart. Another approach is to visualize two different indicators in a scatter plot or on a map. The integration of several data sets in one graph can have several drawbacks: different scales and units are mixed, the graph gets visually cluttered and one cannot easily distinguish between different indicators. Our approach combines (1) the integration of live data from different data sources, (2) the presentation of different indicators in coordinated visualizations and (3) the option to add user visualizations that enrich official statistics with personal data. Each indicator gets its own visualization, which fits the individual indicator best in terms of visualization type, scale, unit etc. The different visualizations are linked, so that related items can easily be identified by using mouse-over effects on data items.

preprint 2010 arXiv

Implications of Inter-Rater Agreement on a Student Information Retrieval Evaluation

This paper is about an information retrieval evaluation of three different retrieval-supporting services. All three services were designed to compensate for typical problems that arise in metadata-driven Digital Libraries, which are not adequately handled by a simple tf-idf based retrieval. The services are: (1) a co-word analysis based query expansion mechanism and re-ranking via (2) Bradfordizing and (3) author centrality. The services are evaluated with relevance assessments conducted by 73 information science students. Since the students are neither information professionals nor domain experts, the question of inter-rater agreement is taken into consideration. Two important implications emerge: (1) the inter-rater agreement rates were mainly fair to moderate and (2) after a data-cleaning step which erased the assessments with poor agreement rates, the evaluation data shows that the three retrieval services returned disjoint but still relevant result sets.
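To make "fair to moderate" agreement more concrete, here is a minimal sketch of Cohen's kappa for two assessors judging the same documents. The abstract does not say which agreement statistic the study used, so kappa is an assumption here, and the relevance labels are invented for illustration.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters over the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must judge the same non-empty list of items")
    n = len(rater_a)
    # Share of items on which both raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Agreement expected by chance from each rater's label frequencies.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical binary relevance judgements (1 = relevant, 0 = not relevant).
assessor_1 = [1, 1, 1, 0, 0, 0, 1, 0]
assessor_2 = [1, 1, 0, 0, 0, 1, 1, 0]
kappa = cohens_kappa(assessor_1, assessor_2)
print(round(kappa, 2))
```

On the commonly cited Landis and Koch scale, values from 0.21 to 0.40 are labelled fair and values from 0.41 to 0.60 moderate, so the toy example above would count as moderate agreement.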
