Researcher profile

Lutz Bornmann

Lutz Bornmann contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
14 works
0 followers
5 topics
4 close collaborators

Published work

14 published item(s)

preprint, 2025, arXiv

Institutional cooperations in Austrian research: An analysis of shared researchers

Multiple organisational affiliations are an increasingly common feature of research systems, yet their implications for organisational performance have received limited systematic attention. We developed a scalable, network-based analytical framework that represents simultaneous researcher affiliations as relational links between organisations and applied it to bibliometric data from Austria. Using harmonised publication and affiliation metadata, we constructed two complementary co-affiliation networks: a complete network capturing all simultaneous affiliations and a temporally filtered network retaining only organisational pairs that recurred over time. Network regression analyses showed that geographical proximity remained an important determinant of co-affiliation formation, with spatial distance consistently reducing shared appointments. Clear sectoral differences emerged beyond geography. Universities formed a dense and persistent core of co-affiliations, whereas ties involving medical institutions, government, non-profit and private-sector organisations were often short-lived and attenuated under temporal filtering. Among cross-sector links, co-affiliations between universities and research institutes were notably resilient, indicating a more structurally embedded form of organisational integration. We assessed the effect of concurrent affiliations on organisational citation impact across organisational types using field- and year-normalised indicators. Research institutes and universities consistently exhibited higher citation impact than organisations from other sectors, and persistent co-affiliations were associated with greater and more stable scientific visibility.

preprint, 2022, arXiv

Empirical analysis of recent temporal dynamics of research fields: Annual publications in chemistry and related areas as an example

Changes in the number of publications in a certain field might reflect the dynamics of scientific progress in this field, since an increase in the number of publications can be interpreted as an increase in the field-specific knowledge. In this paper, we present a methodological approach to analyse the dynamics of science on lower aggregation levels, i.e., the level of research fields. Our trend analysis approach is able to uncover very recent trends, and the methods used to study the trends are simple to understand for the possible recipients of the results. In order to demonstrate the trend analysis approach, we focused in this study on the annual number of publications (and patents) in chemistry (and related areas) between 2014 and 2020, identifying those fields in chemistry with the highest dynamics (largest rates of change in publication counts). The study is based on the mono-disciplinary literature database CAplus. Our results reveal that the number of publications in the CAplus database has been increasing for many years. Research regarding optical phenomena and electrochemical technologies was found to be among the emerging topics in recent years.

preprint, 2022, arXiv

Reference Publication Year Spectroscopy (RPYS) in practice: A software tutorial

In the course of organizing Workshop III, entitled "Cited References Analysis Using CRExplorer", at the International Conference of the International Society for Scientometrics and Informetrics (ISSI2021), we prepared three reference publication year spectroscopy (RPYS) analyses: (i) papers published in Journal of Informetrics; (ii) papers regarding the topic altmetrics; and (iii) papers published by Ludo Waltman (we selected this researcher since he received the Derek de Solla Price Memorial Medal during the ISSI2021 conference). The first RPYS analysis was presented live at the workshop, and the second and third RPYS analyses were left to the participants to undertake after the workshop. Here, we present the results for all three RPYS analyses. The three analyses have shown quite different seminal papers with a few overlaps. Many of the foundational papers in the field of scientometrics (e.g., distributions of publications and citations, citation network and co-citation analyses, and citation analysis with the aim of impact measurement and research evaluation) were retrieved as seminal papers of the papers published in Journal of Informetrics. Mainly papers with discussions of the deficiencies of citation-based impact measurements and comparisons between altmetrics and citations were retrieved as seminal papers of the topic altmetrics. The RPYS analysis of the paper set published by Ludo Waltman mainly retrieved papers about network analyses, citation relations, and citation impact measurement.

preprint, 2020, arXiv

A Decade of In-text Citation Analysis based on Natural Language Processing and Machine Learning Techniques: An overview of empirical studies

Citation analysis is one of the most frequently used methods in research evaluation. We are seeing significant growth in citation analysis through bibliometric metadata, primarily due to the availability of citation databases such as the Web of Science, Scopus, Google Scholar, Microsoft Academic, and Dimensions. Due to better access to full-text publication corpora in recent years, information scientists have gone far beyond traditional bibliometrics by tapping into advancements in full-text data processing techniques to measure the impact of scientific publications in contextual terms. This has led to technical developments in citation context and content analysis, citation classifications, citation sentiment analysis, citation summarisation, and citation-based recommendation. This article aims to narratively review the studies on these developments. Its primary focus is on publications that have used natural language processing and machine learning techniques to analyse citations.

preprint, 2020, arXiv

An Evaluation of Percentile Measures of Citation Impact, and a Proposal for Making Them Better

Percentiles are statistics pointing to the standing of a paper's citation impact relative to other papers in a given citation distribution. Percentile Ranks (PRs) often play an important role in evaluating the impact of scholars, institutions, and lines of study. Because PRs are so important for the assessment of scholarly impact, and because citation practices differ greatly across time and fields, various percentile approaches have been proposed to time- and field-normalize citations. Unfortunately, current popular methods often face significant problems in time- and field-normalization, including when papers are assigned to multiple fields or have been published by more than one unit (e.g., researchers or countries). They also face problems for estimating citation counts (CCs) for pre-defined PRs (e.g., the 90th PR). We offer a series of guidelines and procedures that, we argue, address these problems and others and provide a superior means to make the use of percentile methods more accurate and informative. In particular, we introduce two approaches, CP-IN and CP-EX, that should be preferred in bibliometric studies because they consider the complete citation distribution. Both approaches are based on cumulative frequencies in percentages (CPs). The paper further shows how bar graphs and beamplots can present PRs in a more meaningful and accurate manner.
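The idea of percentiles "based on cumulative frequencies in percentages" can be illustrated with a small sketch. It computes, for each distinct citation count, the cumulative percentage of papers including that count and the cumulative percentage excluding it. Reading these two variants as CP-IN and CP-EX is an assumption based on the names alone; the abstract does not define them, and the function name `cumulative_percentages` and the sample data are hypothetical.

```python
from collections import Counter


def cumulative_percentages(citation_counts):
    """Map each distinct citation count to (inclusive, exclusive) cumulative %.

    inclusive: share of papers with this many citations or fewer.
    exclusive: share of papers with strictly fewer citations.
    Assumption: these correspond to the CP-IN / CP-EX idea named in the abstract.
    """
    n = len(citation_counts)
    freq = Counter(citation_counts)
    result = {}
    papers_below = 0  # papers with strictly fewer citations than `value`
    for value in sorted(freq):
        exclusive = 100.0 * papers_below / n
        papers_below += freq[value]
        inclusive = 100.0 * papers_below / n
        result[value] = (inclusive, exclusive)
    return result


# Hypothetical citation counts for ten papers in one field and year.
sample = [0, 0, 0, 1, 1, 2, 3, 5, 8, 20]
for count, (cp_in, cp_ex) in cumulative_percentages(sample).items():
    print(count, cp_in, cp_ex)
```

The two variants differ most for tied values: the three uncited papers in the sample get 30 under the inclusive reading and 0 under the exclusive one.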

preprint, 2020, arXiv

Are papers addressing certain diseases perceived where these diseases are prevalent? The proposal to use Twitter data as social-spatial sensors

We propose to use Twitter data as social-spatial sensors. This study addresses the question of whether research papers on certain diseases are perceived by people in regions (worldwide) that are especially affected by the diseases. Since (some) Twitter data contain location information, it is possible to spatially map the activity of Twitter users referring to certain papers (e.g., dealing with tuberculosis). The resulting maps reveal whether heavy activity on Twitter is correlated with large numbers of people having certain diseases. In this study, we focus on tuberculosis, human immunodeficiency virus (HIV), and malaria, since the World Health Organization ranks these diseases as the top three causes of death worldwide by a single infectious agent. The results of the social-spatial Twitter maps (and additionally performed regression models) reveal the usefulness of the proposed sensor approach. One receives an impression of how research papers on the diseases have been perceived by people in regions that are especially affected by the diseases. Our study demonstrates a promising approach for using Twitter data for research evaluation purposes beyond simple counting of tweets.

preprint, 2020, arXiv

Bibliometrics-based heuristics: What is their definition and how can they be studied?

When scientists study the phenomena they are interested in, they apply sound methods and base their work on theoretical considerations. In contrast, when the fruits of their research are being evaluated, basic scientific standards do not seem to matter. Instead, simplistic bibliometric indicators (i.e., publications and citation counts) are, paradoxically, both widely used and criticized without any methodological and theoretical framework that would serve to ground both use and critique. Yet, Bornmann and Marewski [1] proposed such a framework recently. They developed bibliometrics-based heuristics (BBHs) based on the fast-and-frugal heuristics approach [2] to decision making, in order to conceptually understand and empirically investigate the quantitative evaluation of research as well as to effectively train end-users of bibliometrics (e.g., science managers, scientists). Heuristics are decision strategies that use part of the available information and ignore the rest. By exploiting the statistical structure of task environments, they can help to make accurate, fast, effortless, and cost-efficient decisions without incurring trade-offs. Because of their simplicity, heuristics are easy to understand and communicate, enhancing the transparency of decision processes. In this commentary, we explain several BBHs and discuss how such heuristics can be employed in practice (using the evaluation of applicants for funding programs as one example). Furthermore, we outline why heuristics can perform well, and how they and their fit to task environments can be studied. In pointing to the potential of research on BBHs and to the risks that come with an under-researched, mindless usage of bibliometrics, this commentary contributes to making research evaluation more scientific.

preprint, 2020, arXiv

Convergent validity of several indicators measuring disruptiveness with milestone assignments to physics papers by experts

This study focuses on a recently introduced type of indicator measuring disruptiveness in science. Disruptive research diverges from current lines of research by opening up new lines. In the current study, we included the initially proposed indicator of this new type (DI1; Wu, Wang, & Evans, 2019) and several variants: DI5, DI1n, DI5n, and DEP. Since indicators should measure what they propose to measure, we investigated the convergent validity of the indicators. We used a list of milestone papers, selected and published by editors of Physical Review Letters, and investigated whether this human (expert)-based list is related to values of the several disruption indicator variants and, if so, which variants show the highest correlation with expert judgements. We used bivariate statistics, multiple regression models, and (coarsened) exact matching (CEM) to investigate the convergent validity of the indicators. The results show that the indicators correlate differently with the milestone paper assignments by the editors. It is not the initially proposed disruption index (DI1) that performed best, but the variant DI5, which was introduced by Bornmann, Devarakonda, Tekles, and Chacko (2019). In the CEM analysis of this study, the DEP variant, introduced by Bu, Waltman, and Huang (2019), also showed favorable results.

preprint, 2020, arXiv

Should citations be field-normalized in evaluative bibliometrics? An empirical analysis based on propensity score matching

Field-normalization of citations is a bibliometric standard. Despite the observed differences in citation counts between fields, the question remains how strongly fields influence citation rates beyond the effect of attributes or factors possibly influencing citations (FICs). We considered several FICs such as number of pages and number of co-authors in this study. We wondered whether there is a separate field-effect besides other effects (e.g., from numbers of pages and co-authors). To answer this question, we applied inverse-probability of treatment weighting (IPW). Using Web of Science data (a sample of 308,231 articles), we investigated whether mean differences among subject categories in citation rates still remain, even if the subject categories are made comparable in the field-related attributes (e.g., comparable numbers of co-authors, comparable numbers of pages) by IPW. In a diagnostic step of our statistical analyses, we considered propensity scores as covariates in regression analyses to examine whether the differences between the fields in FICs vanish. The results revealed that the differences did not completely vanish but were strongly reduced. We obtained similar results when we calculated mean value differences of the fields after IPW representing the causal or unconfounded field effects on citations. However, field differences in citation rates remain. The results indicate that field-normalization seems to be a prerequisite for citation analysis and cannot be replaced by the consideration of any set of FICs in citation analyses.

preprint, 2020, arXiv

Which papers cited which tweets? An empirical analysis based on Scopus data

Many altmetric studies analyze which papers were mentioned how often in specific altmetrics sources. In order to study the potential policy relevance of tweets from another perspective, we investigate which tweets were cited in papers. If many tweets were cited in publications, this might demonstrate that tweets have substantial and useful content. Overall, a rather low number of tweets (n=5506) were cited by fewer than 3000 papers. Most tweets do not seem to be cited because of any cognitive influence they might have had on studies; rather, they were study objects. Most of the papers citing tweets are from the subject areas Social Sciences, Arts and Humanities, and Computer Sciences. Most of the papers cited only one tweet. Up to 55 tweets cited in a single paper were found. This research-in-progress does not support a high policy-relevance of tweets. However, a content analysis of the tweets and/or papers might lead to a more detailed conclusion.

preprint, 2019, arXiv

Citation concept analysis (CCA) - A new form of citation analysis revealing the usefulness of concepts for other researchers illustrated by two exemplary case studies including classic books by Thomas S. Kuhn and Karl R. Popper

In recent years, the full texts of papers have become increasingly available electronically, which opens up the possibility of quantitatively investigating citation contexts in more detail. In this study, we introduce a new form of citation analysis, which we call citation concept analysis (CCA). CCA is intended to reveal the cognitive impact certain concepts -- published in a document -- have on the citing authors. It counts the number of times the concepts are mentioned (cited) in the citation context of citing publications. We demonstrate the method using three classical examples: (1) The structure of scientific revolutions by Thomas S. Kuhn, (2) The logic of scientific discovery - Logik der Forschung: Zur Erkenntnistheorie der modernen Naturwissenschaft in German -, and (3) Conjectures and refutations: the growth of scientific knowledge by Karl R. Popper. It is not surprising -- as our results show -- that Kuhn's "paradigm" concept has had a significant impact. What is surprising is that it has had such a disproportionately larger impact than Kuhn's other concepts, e.g., "scientific revolution". The paradigm concept accounts for over 80% of the concept-related citations to Kuhn's work, and its impact is resilient across all disciplines and over time. With respect to Popper, "falsification" is the most used concept derived from his books. Falsification, after all, is the cornerstone of Popper's critical rationalism.
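The counting step described above (how often each concept is mentioned in the citation contexts of citing publications) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `count_concept_mentions`, the prefix-matching rule, and the example sentences are all assumptions made for this sketch.

```python
import re
from collections import Counter


def count_concept_mentions(citation_contexts, concepts):
    """Count mentions of each concept across a list of citation-context strings.

    Matching is case-insensitive and anchored at a word start, so "paradigm"
    also matches "paradigms" (an assumption of this sketch, not of the paper).
    """
    counts = Counter({concept: 0 for concept in concepts})
    for context in citation_contexts:
        text = context.lower()
        for concept in concepts:
            pattern = r"\b" + re.escape(concept.lower())
            counts[concept] += len(re.findall(pattern, text))
    return counts


# Hypothetical citation contexts surrounding references to Kuhn's book.
contexts = [
    "Following Kuhn, we treat this as a paradigm shift in the field.",
    "A scientific revolution replaces one paradigm with another.",
    "This result is consistent with earlier work.",
]
mentions = count_concept_mentions(contexts, ["paradigm", "scientific revolution"])
print(mentions)
```

Dividing each count by the total number of concept mentions would give shares of the kind reported in the abstract, such as the share attributable to "paradigm".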

preprint, 2019, arXiv

Do disruption index indicators measure what they propose to measure? The comparison of several indicator variants with assessments by peers

Recently, Wu, Wang, and Evans (2019) and Bu, Waltman, and Huang (2019) proposed a new family of indicators, which measure whether a scientific publication is disruptive to a field or tradition of research. Such disruptive influences are characterized by citations to a focal paper, but not its cited references. In this study, we are interested in the question of convergent validity, i.e., whether these indicators of disruption are able to measure what they propose to measure ('disruptiveness'). We used external criteria of newness to examine convergent validity: in the post-publication peer review system of F1000Prime, experts assess whether the research reported in a paper fulfills these criteria (e.g., reports new findings). This study is based on 120,179 papers from F1000Prime published between 2000 and 2016. In the first part of the study we discuss the indicators. Based on the insights from the discussion, we propose alternate variants of disruption indicators. In the second part, we investigate the convergent validity of the indicators and the (possibly) improved variants. Although the results of a factor analysis show that the different variants measure similar dimensions, the results of regression analyses reveal that one variant (DI5) performs slightly better than the others.
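As a rough illustration of "citations to a focal paper, but not its cited references", the sketch below computes a disruption score in the commonly stated form (N_i - N_j) / (N_i + N_j + N_k). Here N_i counts papers citing the focal paper but none of its references, N_j counts papers citing both, and N_k counts papers citing only the references. The abstract does not spell out this formula, so treat it as an assumption. The `min_shared` threshold is also an assumption, meant to mimic a DI5-style variant in which a paper counts toward N_j only if it cites at least that many of the focal paper's references. The function name and the toy data are hypothetical.

```python
def disruption_index(focal, focal_refs, references_by_paper, min_shared=1):
    """Disruption score for `focal`, given each paper's set of cited references.

    references_by_paper: dict mapping a paper id to the set of ids it cites.
    min_shared: minimum number of the focal paper's references a citing paper
        must also cite to count toward N_j (1 gives the basic form; 5 is an
        assumed DI5-style variant).
    """
    n_i = n_j = n_k = 0
    for paper, refs in references_by_paper.items():
        if paper == focal:
            continue
        shared = len(refs & focal_refs)
        cites_focal = focal in refs
        if cites_focal and shared == 0:
            n_i += 1  # cites the focal paper only
        elif cites_focal and shared >= min_shared:
            n_j += 1  # cites the focal paper and enough of its references
        elif not cites_focal and shared >= 1:
            n_k += 1  # cites only the focal paper's references
    total = n_i + n_j + n_k
    # With no relevant citing papers the score is undefined; 0.0 is a convention here.
    return (n_i - n_j) / total if total else 0.0


# Toy citation data: F is the focal paper, R1 and R2 are its references.
papers = {
    "A": {"F"},
    "B": {"F", "R1"},
    "C": {"R1"},
    "D": {"F"},
}
print(disruption_index("F", {"R1", "R2"}, papers))
```

Values near 1 indicate that later work cites the focal paper without its predecessors, and values near -1 indicate that it is mostly cited alongside them.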

preprint, 2017, arXiv

t factor: A metric for measuring impact on Twitter

Based on the definition of the well-known h index we propose a t factor for measuring the impact of publications (and other entities) on Twitter. The new index combines tweet and retweet data in a balanced way whereby retweets are seen as data reflecting the impact of initial tweets. The t factor is defined as follows: A unit (single publication, journal, researcher, research group etc.) has factor t if t of its Nt tweets have at least t retweets each and the other (Nt-t) tweets have <=t retweets each.
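The definition above mirrors the h index, so it can be computed the same way: sort the retweet counts in descending order and find the largest rank t at which the t-th tweet still has at least t retweets. The following is a minimal sketch of that computation; the function name `t_factor` and the sample counts are illustrative assumptions, not taken from the paper.

```python
def t_factor(retweet_counts):
    """Return the largest t such that t tweets have at least t retweets each."""
    ranked = sorted(retweet_counts, reverse=True)
    t = 0
    for rank, retweets in enumerate(ranked, start=1):
        if retweets >= rank:
            t = rank
        else:
            break  # counts only decrease from here, so no larger t is possible
    return t


# Hypothetical retweet counts for the five tweets of one publication.
print(t_factor([10, 5, 3, 1, 0]))  # 3: three tweets have at least 3 retweets each
```

Because the list is sorted in descending order, the remaining (Nt-t) tweets automatically have at most t retweets each, as the definition requires.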