Source author record

Sheeba Samuel

Sheeba Samuel appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Artificial Intelligence Machine Learning Computational Engineering, Finance, and Science cs.CY

Catalog footprint

What is connected

4works

4topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Learning to be Reproducible: Custom Loss Design for Robust Neural Networks

To enhance the reproducibility and reliability of deep learning models, we address a critical gap in current training methodologies: the lack of mechanisms that ensure consistent and robust performance across runs. Our empirical analysis reveals that even under controlled initialization and training conditions, the accuracy of the model can exhibit significant variability. To address this issue, we propose a Custom Loss Function (CLF) that reduces the sensitivity of training outcomes to stochastic factors such as weight initialization and data shuffling. By fine-tuning its parameters, CLF explicitly balances predictive accuracy with training stability, leading to more consistent and reliable model performance. Extensive experiments across diverse architectures for both image classification and time series forecasting demonstrate that our approach significantly improves training robustness without sacrificing predictive performance. These results establish CLF as an effective and efficient strategy for developing more stable, reliable and trustworthy neural networks.

preprint2022arXiv

Computational reproducibility of Jupyter notebooks from biomedical publications

Jupyter notebooks allow to bundle executable code with its documentation and output in one interactive environment, and they represent a popular mechanism to document and share computational workflows, including for research publications. Here, we analyze the computational reproducibility of 9625 Jupyter notebooks from 1117 GitHub repositories associated with 1419 publications indexed in the biomedical literature repository PubMed Central. 8160 of these were written in Python, including 4169 that had their dependencies declared in standard requirement files and that we attempted to re-run automatically. For 2684 of these, all declared dependencies could be installed successfully, and we re-ran them to assess reproducibility. Of these, 396 notebooks ran through without any errors, including 245 that produced results identical to those reported in the original. Running the other notebooks resulted in exceptions. We zoom in on common problems and practices, highlight trends and discuss potential improvements to Jupyter-related workflows associated with biomedical publications.

preprint2020arXiv

Machine Learning Pipelines: Provenance, Reproducibility and FAIR Data Principles

Machine learning (ML) is an increasingly important scientific tool supporting decision making and knowledge generation in numerous fields. With this, it also becomes more and more important that the results of ML experiments are reproducible. Unfortunately, that often is not the case. Rather, ML, similar to many other disciplines, faces a reproducibility crisis. In this paper, we describe our goals and initial steps in supporting the end-to-end reproducibility of ML pipelines. We investigate which factors beyond the availability of source code and datasets influence reproducibility of ML experiments. We propose ways to apply FAIR data practices to ML workflows. We present our preliminary results on the role of our tool, ProvBook, in capturing and comparing provenance of ML experiments and their reproducibility using Jupyter Notebooks.

preprint2020arXiv

ReproduceMeGit: A Visualization Tool for Analyzing Reproducibility of Jupyter Notebooks

Computational notebooks have gained widespread adoption among researchers from academia and industry as they support reproducible science. These notebooks allow users to combine code, text, and visualizations for easy sharing of experiments and results. They are widely shared in GitHub, which currently has more than 100 million repositories making it the largest host of source code in the world. Recent reproducibility studies have indicated that there exist good and bad practices in writing these notebooks which can affect their overall reproducibility. We present ReproduceMeGit, a visualization tool for analyzing the reproducibility of Jupyter Notebooks. This will help repository users and owners to reproduce and directly analyze and assess the reproducibility of any GitHub repository containing Jupyter Notebooks. The tool provides information on the number of notebooks that were successfully reproducible, those that resulted in exceptions, those with different results from the original notebooks, etc. Each notebook in the repository along with the provenance information of its execution can also be exported in RDF with the integration of the ProvBook tool.

Sheeba Samuel

What is connected

Connect this record

See the researcher in context

Building this map preview

4 published item(s)

Learning to be Reproducible: Custom Loss Design for Robust Neural Networks

Computational reproducibility of Jupyter notebooks from biomedical publications

Machine Learning Pipelines: Provenance, Reproducibility and FAIR Data Principles

ReproduceMeGit: A Visualization Tool for Analyzing Reproducibility of Jupyter Notebooks