Researcher profile

Erhard Rahm

Erhard Rahm contributes to research discovery and scholarly infrastructure.

Researcher, Affiliation not imported, Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging, Verification L1, Unclaimed author
9 works
0 followers
4 topics
4 close collaborators

Published work

9 published items

preprint, 2026, arXiv

Combining Time-Series and Graph Data: A Survey of Existing Systems and Approaches

We provide a comprehensive overview of current approaches and systems for combining graphs and time series data. We categorize existing systems into four architectural categories and analyze how these systems meet different requirements and exhibit distinct implementation characteristics to support both data types in a unified manner. Our overview aims to help readers understand and evaluate current options and trade-offs, such as the degree of cross-model integration, maturity, and openness.

preprint, 2021, arXiv

EAGER: Embedding-Assisted Entity Resolution for Knowledge Graphs

Entity Resolution (ER) is a constituent part of integrating different knowledge graphs in order to identify entities referring to the same real-world object. A promising approach is the use of graph embeddings for ER in order to determine the similarity of entities based on the similarity of their graph neighborhood. The similarity computation for such embeddings translates to calculating the distance between them in the embedding space, which is comparatively simple. However, previous work has shown that the use of graph embeddings alone is not sufficient to achieve high ER quality. We therefore propose a more comprehensive ER approach for knowledge graphs called EAGER (Embedding-Assisted Knowledge Graph Entity Resolution) to flexibly utilize both the similarity of graph embeddings and attribute values within a supervised machine learning approach. We evaluate our approach on 23 benchmark datasets with differently sized and structured knowledge graphs and use hypothesis tests to ensure statistical significance of our results. Furthermore, we compare our approach with state-of-the-art ER solutions, where our approach yields competitive results for table-oriented ER problems and shallow knowledge graphs but much better results for deeper knowledge graphs.

preprint, 2012, arXiv

How do Ontology Mappings Change in the Life Sciences?

Mappings between related ontologies are increasingly used to support data integration and analysis tasks. Changes in the ontologies also require the adaptation of ontology mappings. So far, the evolution of ontology mappings has received little attention, although ontologies change continuously, especially in the life sciences. We therefore analyze how mappings between popular life science ontologies evolve for different match algorithms. We also evaluate which semantic ontology changes primarily affect the mappings. We further investigate alternatives to predict or estimate the degree of future mapping changes based on previous ontology and mapping transitions.

preprint, 2011, arXiv

Rule-based Construction of Matching Processes

Mapping complex metadata structures is crucial in a number of domains such as data integration, ontology alignment or model management. To speed up that process, automatic matching systems were developed to compute mapping suggestions that can be corrected by a user. However, constructing and tuning match strategies still requires a high manual effort by matching experts as well as correct mappings to evaluate generated mappings. We therefore propose a self-configuring schema matching system that is able to automatically adapt to the mapping problem at hand. Our approach is based on analyzing the input schemas as well as intermediate matching results. A variety of matching rules use the analysis results to automatically construct and adapt an underlying matching process for a given match task. We comprehensively evaluate our approach on different mapping problems from the schema, ontology and model management domains. The evaluation shows that our system is able to robustly return good quality mappings across different mapping problems and domains.

preprint, 2010, arXiv

Data Partitioning for Parallel Entity Matching

Entity matching is an important and difficult step for integrating web data. To reduce the typically high execution time for matching, we investigate how we can perform entity matching in parallel on a distributed infrastructure. We propose different strategies to partition the input data and generate multiple match tasks that can be independently executed. One of our strategies supports both blocking to reduce the search space for matching and parallel matching to improve efficiency. Special attention is given to the number and size of data partitions as they impact the overall communication overhead and memory requirements of individual match tasks. We have developed a service-based distributed infrastructure for the parallel execution of match workflows. We evaluate our approach in detail for different match strategies for matching real-world product data of different web shops. We also consider caching of input entities and affinity-based scheduling of match tasks.

preprint, 2010, arXiv

Evaluation of Query Generators for Entity Search Engines

Dynamic web applications such as mashups need efficient access to web data that is only accessible via entity search engines (e.g. product or publication search engines). However, most current mashup systems and applications only support simple keyword searches for retrieving data from search engines. We propose the use of more powerful search strategies building on so-called query generators. For a given set of entities, query generators are able to automatically determine a set of search queries to retrieve these entities from an entity search engine. We demonstrate the usefulness of query generators for on-demand web data integration and evaluate the effectiveness and efficiency of query generators for a challenging real-world integration scenario.

preprint, 2010, arXiv

Parallel Sorted Neighborhood Blocking with MapReduce

Cloud infrastructures enable the efficient parallel execution of data-intensive tasks such as entity resolution on large datasets. We investigate challenges and possible solutions of using the MapReduce programming model for parallel entity resolution. In particular, we propose and evaluate two MapReduce-based implementations for Sorted Neighborhood blocking that either use multiple MapReduce jobs or apply a tailored data replication.
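
For readers unfamiliar with the technique, the following is a minimal single-machine sketch of Sorted Neighborhood blocking. It is not one of the MapReduce-based implementations from the paper: records are sorted by a blocking key and only records that fall inside a sliding window are paired for comparison. The function name, the blocking key and the sample product titles are assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple


def sorted_neighborhood_pairs(
    records: Sequence[str],
    key: Callable[[str], str],
    window: int,
) -> List[Tuple[str, str]]:
    """Return candidate pairs of records that lie within `window` positions
    of each other after sorting by the blocking key (hypothetical helper)."""
    if window < 2:
        raise ValueError("window must be at least 2")
    ordered = sorted(records, key=key)  # stable sort by blocking key
    pairs = []
    for i, left in enumerate(ordered):
        # Pair each record with the next (window - 1) records only.
        for right in ordered[i + 1 : i + window]:
            pairs.append((left, right))
    return pairs


# Illustrative data, not taken from the paper.
titles = [
    "canon eos 5d",
    "canon eos 5d mark ii",
    "nikon d90",
    "nikon d 90",
    "sony a7",
]
# Assumed blocking key: first four characters with spaces removed.
candidates = sorted_neighborhood_pairs(
    titles, key=lambda t: t.replace(" ", "")[:4], window=2
)
```

With a window of size w over n records, the sketch produces on the order of n * (w - 1) candidate pairs rather than all n * (n - 1) / 2 pairs. A parallel version has to handle records near partition boundaries, which is the issue the multi-job and data-replication variants in the abstract address.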

preprint, 2010, arXiv

Rule-based Generation of Diff Evolution Mappings between Ontology Versions

Ontologies such as taxonomies, product catalogs or web directories are heavily used and hence evolve frequently to meet new requirements or to better reflect the current instance data of a domain. To effectively manage the evolution of ontologies it is essential to identify the difference (Diff) between two ontology versions. We propose a novel approach to determine an expressive and invertible diff evolution mapping between given versions of an ontology. Our approach utilizes the result of a match operation to determine an evolution mapping consisting of a set of basic change operations (insert/update/delete). To semantically enrich the evolution mapping we adopt a rule-based approach to transform the basic change operations into a smaller set of more complex change operations, such as merge, split, or changes of entire subgraphs. The proposed algorithm is customizable in different ways to meet the requirements of diverse ontologies and application scenarios. We evaluate the proposed approach by determining and analyzing evolution mappings for real-world life science ontologies and web directories.
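
As a rough illustration of the basic change operations mentioned above, the sketch below derives insert, update and delete operations between two versions of a tiny ontology. It assumes that concepts are matched simply by identical identifiers, whereas the paper uses the result of a match operation, and it does not attempt the rule-based enrichment into complex operations such as merge or split. All names and sample concepts are hypothetical.

```python
from typing import Dict, List, Tuple


def basic_diff(old: Dict[str, str], new: Dict[str, str]) -> List[Tuple[str, str]]:
    """Compute basic change operations between two ontology versions.

    Each version maps a concept ID to its label. Assumption: equal IDs
    denote the same concept (a stand-in for a real match result).
    """
    ops = []
    for cid in sorted(old.keys() - new.keys()):
        ops.append(("delete", cid))  # concept only in the old version
    for cid in sorted(new.keys() - old.keys()):
        ops.append(("insert", cid))  # concept only in the new version
    for cid in sorted(old.keys() & new.keys()):
        if old[cid] != new[cid]:
            ops.append(("update", cid))  # same concept, changed label
    return ops


# Illustrative versions, not taken from the paper.
version_1 = {"C1": "Heart", "C2": "Lung", "C3": "Liver"}
version_2 = {"C1": "Heart", "C2": "Lungs", "C4": "Kidney"}
changes = basic_diff(version_1, version_2)
```

A rule-based step as described in the abstract would then look for patterns in such a list, for example a delete and an insert that together indicate a merge, and replace them with a single, more expressive operation.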

preprint, 2010, arXiv

Target-driven merging of Taxonomies

The proliferation of ontologies and taxonomies in many domains increasingly demands the integration of multiple such ontologies. The goal of ontology integration is to merge two or more given ontologies in order to provide a unified view on the input ontologies while maintaining all information coming from them. We propose a new taxonomy merging algorithm that, given as input two taxonomies and an equivalence matching between them, can generate an integrated taxonomy in a fully automatic manner. The approach is target-driven, i.e. we merge a source taxonomy into the target taxonomy and preserve the structure of the target ontology as much as possible. We also discuss how to extend the merge algorithm by providing auxiliary information, like additional relationships between source and target concepts, in order to semantically improve the final result. The algorithm was implemented in a working prototype and evaluated using synthetic and real-world scenarios.