Source author record

Vasilis Efthymiou

Vasilis Efthymiou appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Databases, Information Retrieval

Catalog footprint

What is connected

4 works

2 topics

4 close collaborators

Preprint, 2026, arXiv

Fairness-Aware Retrieval Optimization for Retrieval-Augmented Generation

Retrieval-Augmented Generation (RAG) improves reliability of large language models by incorporating external knowledge, but the retrieval process can introduce bias that propagates to generated outputs. This issue is particularly challenging in top-k settings, where multiple documents jointly influence generation. We propose a fairness-aware retrieval framework that models and controls this bias. Our approach combines controlled bias injection via reranking, a position-aware model of bias propagation, and an optimization formulation that balances relevance and fairness. We further introduce a scalable solution based on Quadratic Fairness via Dual Hyperplane Approximation (FARO), which enables efficient optimization through problem decomposition. Experimental results show that our method effectively mitigates generation bias while preserving relevance. This work provides a principled approach for fairness-aware retrieval in RAG systems.
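The trade-off between relevance and fairness in a top-k list can be illustrated with a small greedy reranker. This is only a sketch under stated assumptions, not the FARO method or the paper's bias-propagation model: the function `fair_rerank`, the per-document `bias` score in [-1, 1], the logarithmic position discount, and the weight `lam` are all hypothetical choices made for illustration.

```python
import math


def fair_rerank(candidates, k, lam=1.0):
    """Greedily pick k documents, trading relevance against accumulated bias.

    candidates: list of (doc_id, relevance, bias), bias assumed in [-1, 1].
    lam: hypothetical weight on the fairness penalty (0 = relevance only).
    """
    remaining = list(candidates)
    ranking, exposure_bias = [], 0.0
    for position in range(min(k, len(remaining))):
        # Assumed position discount: earlier slots influence generation more.
        weight = 1.0 / math.log2(position + 2)
        best = max(
            remaining,
            key=lambda c: c[1] - lam * abs(exposure_bias + weight * c[2]),
        )
        remaining.remove(best)
        ranking.append(best[0])
        exposure_bias += weight * best[2]
    return ranking, exposure_bias


# Toy candidates: (id, relevance, bias). Values are made up for illustration.
candidates = [
    ("d1", 0.90, 1.0),
    ("d2", 0.85, 1.0),
    ("d3", 0.80, -1.0),
    ("d4", 0.50, 0.0),
]
plain_ranking, plain_bias = fair_rerank(candidates, k=3, lam=0.0)
fair_ranking, fair_bias = fair_rerank(candidates, k=3, lam=1.0)
```

With `lam=0.0` the function reduces to ranking by relevance alone; raising `lam` pushes the selection toward lists whose position-weighted bias stays close to zero, at some cost in relevance.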

Preprint, 2022, arXiv

Bipartite Graph Matching Algorithms for Clean-Clean Entity Resolution: An Empirical Evaluation

Entity Resolution (ER) is the task of finding records that refer to the same real-world entities. A common scenario is when entities across two clean sources need to be resolved, which we refer to as Clean-Clean ER. In this paper, we perform an extensive empirical evaluation of 8 bipartite graph matching algorithms that take in as input a bipartite similarity graph and provide as output a set of matched entities. We consider a wide range of matching algorithms, including algorithms that have not previously been applied to ER, or have been evaluated only in other ER settings. We assess the relative performance of the algorithms with respect to accuracy and time efficiency over 10 established, real datasets, from which we extract >700 different similarity graphs. Our results provide insights into the relative performance of these algorithms and guidelines for choosing the best one, depending on the data at hand.
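To make the input and output of such algorithms concrete, here is a minimal sketch of one simple greedy strategy over a bipartite similarity graph. It is not claimed to be any specific algorithm from the evaluation; the function name, the `threshold` parameter, and the toy edge weights are assumptions for illustration. It relies on the assumption that each source is duplicate-free, so a record can match at most one record from the other source.

```python
def greedy_one_to_one_matching(edges, threshold=0.5):
    """Greedy one-to-one matching over a bipartite similarity graph.

    edges: iterable of (left_id, right_id, similarity).
    threshold: hypothetical minimum similarity for accepting a match.
    """
    matched_left, matched_right, matches = set(), set(), []
    # Visit edges from most to least similar.
    for left, right, sim in sorted(edges, key=lambda e: e[2], reverse=True):
        if sim < threshold:
            break  # every remaining edge is weaker still
        if left not in matched_left and right not in matched_right:
            matches.append((left, right))
            matched_left.add(left)
            matched_right.add(right)
    return matches


# Toy similarity graph between two sources; weights are made up.
edges = [
    ("l1", "r1", 0.90),
    ("l1", "r2", 0.80),
    ("l2", "r1", 0.85),
    ("l2", "r2", 0.60),
    ("l3", "r3", 0.30),
]
matches = greedy_one_to_one_matching(edges)
```

The greedy pass is fast but not guaranteed to maximize total similarity; that kind of accuracy versus run-time trade-off is what an empirical comparison of matching algorithms is meant to expose.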

Preprint, 2020, arXiv

Benchmarking Blocking Algorithms for Web Entities

An increasing number of entities are described by interlinked data rather than documents on the Web. Entity Resolution (ER) aims to identify descriptions of the same real-world entity within one or across knowledge bases in the Web of data. To reduce the required number of pairwise comparisons among descriptions, ER methods typically perform a pre-processing step, called blocking, which places similar entity descriptions into blocks so that only descriptions within the same block are compared. We experimentally evaluate several blocking methods proposed for the Web of data using real datasets, whose characteristics significantly impact their effectiveness and efficiency. The proposed experimental evaluation framework allows us to better understand the characteristics of the missed matching entity descriptions and contrast them with ground truth obtained from different kinds of relatedness links.
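As an illustration of how blocking cuts down comparisons, the following sketch places descriptions that share a token into the same block and compares only within blocks. It is a simplified, schema-agnostic variant written for illustration, not necessarily one of the methods benchmarked; the function names and the toy descriptions are assumptions.

```python
from collections import defaultdict
from itertools import combinations


def token_blocking(descriptions):
    """Group entity ids by the tokens appearing in their attribute values."""
    blocks = defaultdict(set)
    for entity_id, attributes in descriptions.items():
        for value in attributes.values():
            for token in str(value).lower().split():
                blocks[token].add(entity_id)
    # A block with a single entity yields no comparisons, so drop it.
    return {token: ids for token, ids in blocks.items() if len(ids) > 1}


def candidate_pairs(blocks):
    """Collect the distinct pairs that share at least one block."""
    pairs = set()
    for ids in blocks.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


# Toy descriptions with differing attribute names; values are made up.
descriptions = {
    "a1": {"name": "Alan Turing", "born": "London"},
    "a2": {"label": "Turing Alan"},
    "a3": {"name": "Grace Hopper", "born": "New York"},
    "a4": {"label": "Hopper Grace"},
}
blocks = token_blocking(descriptions)
pairs = candidate_pairs(blocks)
```

On this toy input, two candidate pairs remain out of the six possible pairs among four descriptions. Matching pairs that share no token would be missed, which is the kind of loss an evaluation of blocking methods needs to measure.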

Preprint, 2020, arXiv

End-to-End Entity Resolution for Big Data: A Survey

One of the most important tasks for improving data quality and the reliability of data analytics results is Entity Resolution (ER). ER aims to identify different descriptions that refer to the same real-world entity, and remains a challenging problem. While previous works have studied specific aspects of ER (and mostly in traditional settings), in this survey, we provide for the first time an end-to-end view of modern ER workflows, and of the novel aspects of entity indexing and matching methods in order to cope with more than one of the Big Data characteristics simultaneously. We present the basic concepts, processing steps and execution strategies that have been proposed by different communities, i.e., database, semantic Web and machine learning, in order to cope with the loose structuredness, extreme diversity, high speed and large scale of entity descriptions used by real-world applications. Finally, we provide a synthetic discussion of the existing approaches, and conclude with a detailed presentation of open research directions.
