Source author record

Neil A. Ernst

Neil A. Ernst appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Software Engineering

Catalog footprint

What is connected

4works

1topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

Error Identification Strategies for Python Jupyter Notebooks

Computational notebooks -- such as Jupyter or Colab -- combine text and data analysis code. They have become ubiquitous in the world of data science and exploratory data analysis. Since these notebooks present a different programming paradigm than conventional IDE-driven programming, it is plausible that debugging in computational notebooks might also be different. More specifically, since creating notebooks blends domain knowledge, statistical analysis, and programming, the ways in which notebook users find and fix errors in these different forms might be different. In this paper, we present an exploratory, observational study on how Python Jupyter notebook users find and understand potential errors in notebooks. Through a conceptual replication of study design investigating the error identification strategies of R notebook users, we presented users with Python Jupyter notebooks pre-populated with common notebook errors -- errors rooted in either the statistical data analysis, the knowledge of domain concepts, or in the programming. We then analyzed the strategies our study participants used to find these errors and determined how successful each strategy was at identifying errors. Our findings indicate that while the notebook programming environment is different from the environments used for traditional programming, debugging strategies remain quite similar. It is our hope that the insights presented in this paper will help both notebook tool designers and educators make changes to improve how data scientists discover errors more easily in the notebooks they write.

preprint2020arXiv

Cross-Dataset Design Discussion Mining

Being able to identify software discussions that are primarily about design, which we call design mining, can improve documentation and maintenance of software systems. Existing design mining approaches have good classification performance using natural language processing (NLP) techniques, but the conclusion stability of these approaches is generally poor. A classifier trained on a given dataset of software projects has so far not worked well on different artifacts or different datasets. In this study, we replicate and synthesize these earlier results in a meta-analysis. We then apply recent work in transfer learning for NLP to the problem of design mining. However, for our datasets, these deep transfer learning classifiers perform no better than less complex classifiers. We conclude by discussing some reasons behind the transfer learning approach to design mining.

preprint2011arXiv

Mixed-Variable Requirements Roadmaps and their Role in the Requirements Engineering of Adaptive Systems

The requirements roadmap concept is introduced as a solution to the problem of the requirements engineering of adaptive systems. The concept requires a new general definition of the requirements problem which allows for quantitative (numeric) variables, together with qualitative (binary boolean) propositional variables, and distinguishes monitored from controlled variables for use in control loops. We study the consequences of these changes, and argue that the requirements roadmap concept bridges the gap between current general definitions of the requirements problem and its notion of solution, and the research into the relaxation of requirements, the evaluation of their partial satisfaction, and the monitoring and control of requirements, all topics of particular interest in the engineering of requirements for adaptive systems [Cheng et al. 2009]. From the theoretical perspective, we show clearly and formally the fundamental differences between more traditional conception of requirements engineering (e.g., Zave & Jackson [1997]) and the requirements engineering of adaptive systems (from Fickas & Feather [1995], over Letier & van Lamsweerde [2004], and up to Whittle et al. [2010] and the most recent research). From the engineering perspective, we define a proto-framework for early requirements engineering of adaptive systems, which illustrates the features needed in future requirements frameworks for this class of systems.

preprint2010arXiv

Code forking in open-source software: a requirements perspective

To fork a project is to copy the existing code base and move in a direction different than that of the erstwhile project leadership. Forking provides a rapid way to address new requirements by adapting an existing solution. However, it can also create a plethora of similar tools, and fragment the developer community. Hence, it is not always clear whether forking is the right strategy. In this paper, we describe a mixed-methods exploratory case study that investigated the process of forking a project. The study concerned the forking of an open-source tool for managing software projects, Trac. Trac was forked to address differing requirements in an academic setting. The paper makes two contributions to our understanding of code forking. First, our exploratory study generated several theories about code forking in open source projects, for further research. Second, we investigated one of these theories in depth, via a quantitative study. We conjectured that the features of the OSS forking process would allow new requirements to be addressed. We show that the forking process in this case was successful at fulfilling the new projects requirements.