Source author record

Michael Krauthammer

Michael Krauthammer appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Information Retrieval; Digital Libraries; Quantitative Methods; cs.CY; Machine Learning; Applications; Artificial Intelligence; Computation and Language; Computational Engineering, Finance, and Science

Catalog footprint

What is connected

9 works

9 topics

4 close collaborators

Preprint, 2022, arXiv

Drug prescription clusters in the UK Biobank: An assessment of drug-drug interactions and patient outcomes in a large patient cohort

In recent decades, there has been an increase in polypharmacy, the concurrent administration of multiple drugs per patient. Studies have shown that polypharmacy is linked to adverse patient outcomes and there is interest in elucidating the exact causes behind this observation. In this paper, we are studying the relationship between drug prescriptions, drug-drug interactions (DDIs) and patient mortality. Our focus is not so much on the number of prescribed drugs, the typical metric in polypharmacy research, but rather on the specific combinations of drugs leading to a DDI. To learn the space of real-world drug combinations, we first assessed the drug prescription landscape of the UK Biobank, a large patient data registry. We observed distinct drug constellation patterns driven by the UK Biobank participants' disease status. We show that these drug prescription clusters matter in terms of the number and types of expected DDIs, and may possibly explain observed differences in health outcomes.
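The idea of looking at specific drug combinations rather than the raw drug count can be illustrated with a minimal sketch. It assumes a hypothetical lookup table of interacting drug pairs (`KNOWN_DDIS`) and a hypothetical helper `find_ddis`; neither the table nor the function comes from the paper, and the two pairs listed are placeholders for illustration only.

```python
from itertools import combinations

# Hypothetical, illustrative interaction table (not taken from the paper).
# Each entry is an unordered pair of drug names.
KNOWN_DDIS = {
    frozenset({"warfarin", "aspirin"}),
    frozenset({"simvastatin", "clarithromycin"}),
}


def find_ddis(prescriptions, known=KNOWN_DDIS):
    """Return the drug pairs in one patient's prescription list that
    appear in the interaction table, as alphabetically sorted tuples."""
    # Normalise names and drop duplicates so each pair is checked once.
    drugs = sorted({name.lower() for name in prescriptions})
    return [pair for pair in combinations(drugs, 2) if frozenset(pair) in known]


if __name__ == "__main__":
    patient = ["Warfarin", "Aspirin", "Metformin"]
    print(len(patient), "drugs prescribed; interacting pairs:", find_ddis(patient))
```

Two patients with the same number of prescribed drugs can therefore differ in the number of flagged pairs, which is the distinction the abstract draws between drug count and drug combination.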

Preprint, 2020, arXiv

AutoDiscern: Rating the Quality of Online Health Information with Hierarchical Encoder Attention-based Neural Networks

Patients increasingly turn to search engines and online content before, or in place of, talking with a health professional. Low quality health information, which is common on the internet, presents risks to the patient in the form of misinformation and a possibly poorer relationship with their physician. To address this, the DISCERN criteria (developed at University of Oxford) are used to evaluate the quality of online health information. However, patients are unlikely to take the time to apply these criteria to the health websites they visit. We built an automated implementation of the DISCERN instrument (Brief version) using machine learning models. We compared the performance of a traditional model (Random Forest) with that of a hierarchical encoder attention-based neural network (HEA) model using two language embeddings, BERT and BioBERT. The HEA BERT and BioBERT models achieved average F1-macro scores across all criteria of 0.75 and 0.74, respectively, outperforming the Random Forest model (average F1-macro = 0.69). Overall, the neural network based models achieved 81% and 86% average accuracy at 100% and 80% coverage, respectively, compared to 94% manual rating accuracy. The attention mechanism implemented in the HEA architectures not only provided 'model explainability' by identifying reasonable supporting sentences for the documents fulfilling the Brief DISCERN criteria, but also boosted F1 performance by 0.05 compared to the same architecture without an attention mechanism. Our research suggests that it is feasible to automate online health information quality assessment, which is an important step towards empowering patients to become informed partners in the healthcare process.
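The two metrics reported above, macro-averaged F1 and accuracy at a given coverage, can be sketched as follows. This is a generic illustration, not the authors' evaluation code: the function names are hypothetical, and the coverage rule used here (keep the most confident fraction of predictions) is an assumption about how coverage is defined.

```python
def f1_macro(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def accuracy_at_coverage(y_true, y_pred, confidence, coverage):
    """Accuracy over the most confident `coverage` fraction of predictions
    (assumed definition of coverage, for illustration)."""
    n = len(y_true)
    k = max(1, round(coverage * n))
    # Indices ordered from most to least confident.
    order = sorted(range(n), key=lambda i: confidence[i], reverse=True)
    kept = order[:k]
    return sum(1 for i in kept if y_true[i] == y_pred[i]) / k


if __name__ == "__main__":
    y_true = [1, 1, 0, 0, 1]
    y_pred = [1, 0, 0, 0, 0]
    conf = [0.9, 0.4, 0.8, 0.7, 0.6]
    print(f1_macro(y_true, y_pred))
    print(accuracy_at_coverage(y_true, y_pred, conf, 1.0))
    print(accuracy_at_coverage(y_true, y_pred, conf, 0.8))
```

Under this definition, lowering coverage discards the least confident predictions, so accuracy on the remaining items can rise, consistent with the higher accuracy the abstract reports at 80% coverage.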

Preprint, 2020, arXiv

Patient Similarity Analysis with Longitudinal Health Data

Healthcare professionals have long envisioned using the enormous processing powers of computers to discover new facts and medical knowledge locked inside electronic health records. These vast medical archives contain time-resolved information about medical visits, tests and procedures, as well as outcomes, which together form individual patient journeys. By assessing the similarities among these journeys, it is possible to uncover clusters of common disease trajectories with shared health outcomes. The assignment of patient journeys to specific clusters may in turn serve as the basis for personalized outcome prediction and treatment selection. This procedure is a non-trivial computational problem, as it requires the comparison of patient data with multi-dimensional and multi-modal features that are captured at different times and resolutions. In this review, we provide a comprehensive overview of the tools and methods that are used in patient similarity analysis with longitudinal data and discuss its potential for improving clinical decision making.

Preprint, 2015, arXiv

Publishing without Publishers: a Decentralized Approach to Dissemination, Retrieval, and Archiving of Data

Making available and archiving scientific results is for the most part still considered the task of classical publishing companies, despite the fact that classical forms of publishing centered around printed narrative articles no longer seem well-suited in the digital age. In particular, there exist currently no efficient, reliable, and agreed-upon methods for publishing scientific datasets, which have become increasingly important for science. Here we propose to design scientific data publishing as a Web-based bottom-up process, without top-down control of central authorities such as publishing companies. Based on a novel combination of existing concepts and technologies, we present a server network to decentrally store and archive data in the form of nanopublications, an RDF-based format to represent scientific data. We show how this approach allows researchers to publish, retrieve, verify, and recombine datasets of nanopublications in a reliable and trustworthy manner, and we argue that this architecture could be used for the Semantic Web in general. Evaluation of the current small network shows that this system is efficient and reliable.

Preprint, 2014, arXiv

Mining Images in Biomedical Publications: Detection and Analysis of Gel Diagrams

Authors of biomedical publications use gel images to report experimental results such as protein-protein interactions or protein expressions under different conditions. Gel images offer a concise way to communicate such findings, not all of which need to be explicitly discussed in the article text. This fact together with the abundance of gel images and their shared common patterns makes them prime candidates for automated image mining and parsing. We introduce an approach for the detection of gel images, and present a workflow to analyze them. We are able to detect gel segments and panels at high accuracy, and present preliminary results for the identification of gene names in these images. While we cannot provide a complete solution at this point, we present evidence that this kind of image mining is feasible.

Preprint, 2013, arXiv

Broadening the Scope of Nanopublications

In this paper, we present an approach for extending the existing concept of nanopublications (tiny entities of scientific results in RDF representation) to broaden their application range. The proposed extension uses English sentences to represent informal and underspecified scientific claims. These sentences follow a syntactic and semantic scheme that we call AIDA (Atomic, Independent, Declarative, Absolute), which provides a uniform and succinct representation of scientific assertions. Such AIDA nanopublications are compatible with the existing nanopublication concept and enjoy most of its advantages such as information sharing, interlinking of scientific findings, and detailed attribution, while being more flexible and applicable to a much wider range of scientific results. We show that users are able to create AIDA sentences for given scientific results quickly and at high quality, and that it is feasible to automatically extract and interlink AIDA nanopublications from existing unstructured data sources. To demonstrate our approach, a web-based interface is introduced, which also exemplifies the use of nanopublications for non-scientific content, including meta-nanopublications that describe other nanopublications.

Preprint, 2012, arXiv

Image Mining from Gel Diagrams in Biomedical Publications

Authors of biomedical publications often use gel images to report experimental results such as protein-protein interactions or protein expressions under different conditions. Gel images offer a way to concisely communicate such findings, not all of which need to be explicitly discussed in the article text. This fact together with the abundance of gel images and their shared common patterns makes them prime candidates for image mining endeavors. We introduce an approach for the detection of gel images, and present an automatic workflow to analyze them. We are able to detect gel segments and panels at high accuracy, and present first results for the identification of gene names in these images. While we cannot provide a complete solution at this point, we present evidence that this kind of image mining is feasible.

Preprint, 2012, arXiv

Underspecified Scientific Claims in Nanopublications

The application range of nanopublications (small entities of scientific results in RDF representation) could be greatly extended if complete formal representations are not mandatory. To that aim, we present an approach to represent and interlink scientific claims in an underspecified way, based on independent English sentences.

Preprint, 2010, arXiv

Analysis Of Cancer Omics Data In A Semantic Web Framework

Our work concerns the elucidation of the cancer (epi)genome, transcriptome and proteome to better understand the complex interplay between a cancer cell's molecular state and its response to anti-cancer therapy. To study the problem, we have previously focused on data warehousing technologies and statistical data integration. In this paper, we present recent work on extending our analytical capabilities using Semantic Web technology. A key new component presented here is a SPARQL endpoint to our existing data warehouse. This endpoint allows the merging of observed quantitative data with existing data from semantic knowledge sources such as Gene Ontology (GO). We show how such variegated quantitative and functional data can be integrated and accessed in a universal manner using Semantic Web tools. We also demonstrate how Description Logic (DL) reasoning can be used to infer previously unstated conclusions from existing knowledge bases. As proof of concept, we illustrate the ability of our setup to answer complex queries on resistance of cancer cells to Decitabine, a demethylating agent.