Researcher profile

Neil Ernst

Neil Ernst contributes to research discovery and scholarly infrastructure.

Published work

6 published items

Preprint, 2022, arXiv

AI-driven Development Is Here: Should You Worry?

AI-Driven Development Environments (AIDEs) integrate the power of modern AI into IDEs like Visual Studio Code and JetBrains IntelliJ. By leveraging massive language models and the plethora of openly available source code, AIDEs promise to automate many of the obvious, routine tasks in programming. At the same time, AIDEs come with new challenges to think about, such as bias, legal compliance, security vulnerabilities, and their impact on learning programming.

Preprint, 2021, arXiv

Empirical Standards for Software Engineering Research

Empirical Standards are natural-language models of a scientific community's expectations for a specific kind of study (e.g. a questionnaire survey). The ACM SIGSOFT Paper and Peer Review Quality Initiative generated empirical standards for research methods commonly used in software engineering. These living documents, which should be continuously revised to reflect evolving consensus around research best practices, will improve research quality and make peer review more effective, reliable, transparent and fair.

Preprint, 2021, arXiv

VConstruct: Filling Gaps in Chl-a Data Using a Variational Autoencoder

Remote sensing of Chlorophyll-a is vital in monitoring climate change. Chlorophyll-a measurements give us an idea of the algae concentrations in the ocean, which lets us monitor ocean health. However, the satellites used to gather the data are commonly obstructed by clouds and other artifacts. This means that time series data from satellites can suffer from spatial data loss. There are a number of algorithms that are able to reconstruct the missing parts of these images to varying degrees of accuracy, with Data INterpolating Empirical Orthogonal Functions (DINEOF) being the current standard. However, DINEOF is slow, suffers from accuracy loss in temporally homogeneous waters, is reliant on temporal data, and is only able to generate a single potential reconstruction. We propose a machine learning approach to reconstruction of Chlorophyll-a data using a Variational Autoencoder (VAE). Our accuracy results to date are competitive with but slightly less accurate than DINEOF. We show the benefits of our method, including vastly decreased computation time and the ability to generate multiple potential reconstructions. Lastly, we outline our planned improvements and future work.

Preprint, 2020, arXiv

Code Duplication and Reuse in Jupyter Notebooks

Duplicating one's own code makes it faster to write software. This expediency is particularly valuable for users of computational notebooks. Duplication allows notebook users to quickly test hypotheses and iterate over data. In this paper, we explore how much, how and from where code duplication occurs in computational notebooks, and identify potential barriers to code reuse. Previous work in the area of computational notebooks describes developers' motivations for reuse and duplication but does not show how much reuse occurs or which barriers they face when reusing code. To address this gap, we first analyzed GitHub repositories for code duplicates contained in a repository's Jupyter notebooks, and then conducted an observational user study of code reuse, where participants solved specific tasks using notebooks. Our findings reveal that repositories in our sample have a mean self-duplication rate of 7.6%. However, in our user study, few participants duplicated their own code, preferring to reuse code from online sources.

Preprint, 2020, arXiv

GDPR Compliance in the Context of Continuous Integration

The enactment of the General Data Protection Regulation (GDPR) in 2018 forced any organization that collects and/or processes EU-based personal data to comply with stringent privacy regulations. Software organizations have struggled to achieve GDPR compliance both before and after the GDPR deadline. While some studies have relied on surveys or interviews to find general implications of the GDPR, there is a lack of in-depth studies that investigate compliance practices and compliance challenges of software organizations. In particular, there is no information on small and medium enterprises (SMEs), which represent the majority of organizations in the EU, nor on organizations that practice continuous integration. Using design science methodology, we conducted an in-depth study over the span of 20 months regarding GDPR compliance practices and challenges in collaboration with a small, startup organization. We first identified our collaborator's business problems and then iteratively developed two artifacts to address those problems: a set of operationalized GDPR principles, and an automated GDPR tool that tests those GDPR-derived privacy requirements. This design science approach resulted in four implications for research and for practice. For example, our research reveals that GDPR regulations can be partially operationalized and tested through automated means, which improves compliance practices, but more research is needed to create more efficient and effective means to disseminate and manage GDPR knowledge among software developers.

Preprint, 2020, arXiv

The Lack of Shared Understanding of Non-Functional Requirements in Continuous Software Engineering: Accidental or Essential?

Building shared understanding of requirements is key to ensuring downstream software activities are efficient and effective. However, in continuous software engineering (CSE) some lack of shared understanding is an expected, and essential, part of a rapid feedback learning cycle. At the same time, there is a key trade-off with avoidable costs, such as rework, that come from accidental gaps in shared understanding. This trade-off is even more challenging for non-functional requirements (NFRs), which have significant implications for product success. Comprehending and managing NFRs is especially difficult in small, agile organizations. How such organizations manage shared understanding of NFRs in CSE is understudied. We conducted a case study of three small organizations scaling up CSE to further understand and identify factors that contribute to lack of shared understanding of NFRs, and its relationship to rework. Our in-depth analysis identified 41 NFR-related software tasks as rework due to a lack of shared understanding of NFRs. Of these 41 tasks, 78% were due to avoidable (accidental) lack of shared understanding of NFRs. Using a mixed-methods approach, we identify factors that contribute to lack of shared understanding of NFRs, such as the lack of domain knowledge, rapid pace of change, and cross-organizational communication problems. We also identify recommended strategies to mitigate lack of shared understanding through more effective management of requirements knowledge in such organizations. We conclude by discussing the complex relationship between shared understanding of requirements, rework, and CSE.