Researcher profile

Serghei Mangul

Serghei Mangul contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
7 works
0 followers
8 topics
4 close collaborators

Published work

7 published items

preprint, 2023, arXiv

Perceptual and technical barriers in sharing and formatting metadata accompanying omics studies

Metadata, often termed "data about data," is crucial for organizing, understanding, and managing vast omics datasets. It aids in efficient data discovery, integration, and interpretation, enabling users to access, comprehend, and utilize data effectively. Its significance spans the domains of scientific research, facilitating data reproducibility, reusability, and secondary analysis. However, numerous perceptual and technical barriers hinder the sharing of metadata among researchers. These barriers compromise the reliability of research results and hinder integrative meta-analyses of omics studies. This study highlights the key barriers to metadata sharing, including the lack of uniform standards, privacy and legal concerns, limitations in study design, limited incentives, inadequate infrastructure, and the dearth of well-trained personnel for metadata management and reuse. Proposed solutions emphasize standardization, educational efforts, the role of journals and funding agencies, incentives and rewards, and improved infrastructure. More accurate, reliable, and impactful research outcomes are achievable if the scientific community addresses these barriers.

preprint, 2022, arXiv

Packaging, containerization, and virtualization of computational omics methods: Advances, challenges, and opportunities

Omics software tools have reshaped the landscape of modern biology and become an essential component of biomedical research. The increasing dependence of biomedical scientists on these powerful tools creates a need for easier installation and greater usability. Packaging, virtualization, and containerization are different approaches to satisfy this need by wrapping omics tools in additional software that makes the omics tools easier to install and use. Here, we systematically review practices across prominent packaging, virtualization, and containerization platforms. We outline the challenges, advantages, and limitations of each approach and some of the most widely used platforms from the perspectives of users, software developers, and system administrators. We also propose principles to make packaging, virtualization, and containerization of omics software more sustainable and robust to increase the reproducibility of biomedical and life science research.

preprint, 2021, arXiv

Diversity in immunogenomics: the value and the challenge

With the advent of high-throughput sequencing technologies, the fields of immunogenomics and adaptive immune receptor repertoire research are facing both opportunities and challenges. Adaptive immune receptor repertoire sequencing (AIRR-seq) has become an increasingly important tool to characterize T and B cell responses in settings of interest. However, the majority of AIRR-seq studies conducted so far were performed in individuals of European ancestry, restricting the ability to identify variation in human adaptive immune responses across populations and limiting their applications. As AIRR-seq studies depend on the ability to assign VDJ sequence reads to the correct germline gene segments, efforts to characterize the genomic loci that encode adaptive immune receptor genes in different populations are urgently needed. The availability of comprehensive germline gene databases and further applications of AIRR-seq studies to individuals of non-European ancestry will substantially enhance our understanding of human adaptive immune responses, promote the development of effective diagnostics and treatments, and eventually advance precision medicine.

preprint, 2021, arXiv

RNA-seq data science: From raw data to effective interpretation

RNA-sequencing (RNA-seq) has become an exemplar technology in modern biology and clinical applications over the past decade. It has gained immense popularity in recent years, driven by continuous efforts of the bioinformatics community to develop accurate and scalable computational tools. RNA-seq is a method of analyzing the RNA content of a sample using modern sequencing platforms. It generates enormous amounts of transcriptomic data in the form of nucleotide sequences, known as reads. RNA-seq analysis enables the probing of genes and their corresponding transcripts, which is essential for answering important biological questions, such as detecting novel exons and transcripts, quantifying gene expression, and studying alternative splicing structure. However, obtaining meaningful biological signals from raw data using computational methods is challenging due to the limitations of modern sequencing technologies. The need to overcome these technological challenges has driven the rapid development of many novel computational tools, which have evolved and diversified in accordance with technological advancements, leading to the current myriad of RNA-seq tools. Our review provides a systematic overview of RNA-seq technology and 235 available RNA-seq tools across various domains published from 2008 to 2020, discussing the interdisciplinary nature of bioinformatics involved in RNA sequencing, analysis, and software development.

preprint, 2020, arXiv

Metagenomics for clinical diagnostics: technologies and informatics

The human-associated microbiome is closely tied to human health and is of substantial clinical interest. Metagenomics-based tools are emerging for clinical diagnostics, tracking the spread of diseases, and surveillance of potential pathogens. In some cases, these tools are overcoming limitations of traditional clinical approaches. However, metagenomics has limitations that currently bar these tools from clinical validation. Once these hurdles are overcome, clinical metagenomics will inform doctors of the best targeted treatment for their patients and provide early detection of disease. Here we present an overview of metagenomics methods with a discussion of computational challenges and limitations.

preprint, 2020, arXiv

Recommendations to enhance rigor and reproducibility in biomedical research

Computational methods have reshaped the landscape of modern biology. While the biomedical community is increasingly dependent on computational tools, the mechanisms ensuring open data, open software, and reproducibility are variably enforced by academic institutions, funders, and publishers. Publications may present academic software for which essential materials, such as source code and documentation, are or become unavailable. Publications that lack such information compromise the role of peer review in evaluating technical strength and scientific contribution. Incomplete ancillary information for an academic software package may bias or limit any subsequent work produced with the tool. We provide eight recommendations across four different domains to improve reproducibility, transparency, and rigor in computational biology, precisely the main values that should be emphasized in life science curricula. Our recommendations for improving software availability, usability, and archival stability aim to foster a sustainable data science ecosystem in biomedicine and life science research.