Source author record

Miroslaw Staron

Miroslaw Staron appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Software Engineering, General Literature, Machine Learning

Catalog footprint

What is connected

4 works

3 topics

4 close collaborators

preprint · 2026 · arXiv

Agentic Pipelines in Embedded Software Engineering: Emerging Practices and Challenges

A new transformation is underway in software engineering, driven by the rapid adoption of generative AI in development workflows. Similar to how version control systems once automated manual coordination, AI tools are now beginning to automate many aspects of programming. For embedded software engineering organizations, however, this marks their first experience integrating AI into safety-critical and resource-constrained environments. The strict demands for determinism, reliability, and traceability pose unique challenges for adopting generative technologies. In this paper, we present findings from a qualitative study with ten senior experts from four companies who are evaluating generative AI-augmented development for embedded software. Through semi-structured focus group interviews and structured brainstorming sessions, we identified eleven emerging practices and fourteen challenges related to the orchestration, responsible governance, and sustainable adoption of generative AI tools. Our results show how embedded software engineering teams are rethinking workflows, roles, and toolchains to enable a sustainable transition toward agentic pipelines and generative AI-augmented development.

preprint · 2022 · arXiv

Trusting Machine Learning Results from Medical Procedures in the Operating Room

Machine learning can be used to analyse physiological data for several purposes. Detection of cerebral ischemia is an achievement that would have a high impact on patient care. We attempted to study whether collection of continuous physiological data from non-invasive monitors, combined with machine learning analysis, could detect cerebral ischemia in two different settings: during surgery for carotid endarterectomy (CEA) and during endovascular thrombectomy in acute stroke. We compare the results from the two groups, and from one patient in each group in detail. While results from CEA patients are consistent, those from thrombectomy patients are not and frequently contain extreme values such as an accuracy of 1.0. We conclude that this is a result of the short duration of the procedure and an abundance of poor-quality data, resulting in small data sets. These results therefore cannot be trusted.

preprint · 2021 · arXiv

Empirical Standards for Software Engineering Research

Empirical Standards are natural-language models of a scientific community's expectations for a specific kind of study (e.g. a questionnaire survey). The ACM SIGSOFT Paper and Peer Review Quality Initiative generated empirical standards for research methods commonly used in software engineering. These living documents, which should be continuously revised to reflect evolving consensus around research best practices, will improve research quality and make peer review more effective, reliable, transparent and fair.

preprint · 2020 · arXiv

PHANTOM: Curating GitHub for engineered software projects using time-series clustering

Context: Within the field of Mining Software Repositories, there are numerous methods employed to filter datasets in order to avoid analysing low-quality projects. Unfortunately, the existing filtering methods have not kept up with the growth of existing data sources, such as GitHub, and researchers often rely on quick and dirty techniques to curate datasets. Objective: The objective of this study is to develop a method capable of filtering large quantities of software projects in a resource-efficient way. Method: This study follows the Design Science Research (DSR) methodology. The proposed method, PHANTOM, extracts five measures from Git logs. Each measure is transformed into a time-series, which is represented as a feature vector for clustering using the k-means algorithm. Results: Using the ground truth from a previous study, PHANTOM was shown to be able to rediscover the ground truth on the training dataset, and was able to identify "engineered" projects with up to 0.87 Precision and 0.94 Recall on the validation dataset. PHANTOM downloaded and processed the metadata of 1,786,601 GitHub repositories in 21.5 days using a single personal computer, which is over 33% faster than the previous study which used a computer cluster of 200 nodes. The possibility of applying the method outside of the open-source community was investigated by curating 100 repositories owned by two companies. Conclusions: It is possible to use an unsupervised approach to identify engineered projects. PHANTOM was shown to be competitive compared to the existing supervised approaches while reducing the hardware requirements by two orders of magnitude.
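
The pipeline the abstract describes (Git-log measure, time series, feature vector, k-means) can be illustrated with a minimal sketch. This is not PHANTOM's implementation: the abstract does not name its five measures or its features, so the single measure used here (commits per week), the two features (`active_fraction`, `peak_share`), the function names, and the deterministic farthest-point seeding are all assumptions chosen for illustration.

```python
from collections import Counter


def weekly_series(commit_days, n_weeks):
    """Turn commit day offsets (days since first commit) into commits per week.

    Hypothetical measure: the abstract does not list PHANTOM's five measures.
    """
    counts = Counter(day // 7 for day in commit_days if day // 7 < n_weeks)
    return [counts.get(week, 0) for week in range(n_weeks)]


def feature_vector(series):
    """Summarise a time series as two illustrative features, both in [0, 1]."""
    total = sum(series)
    active_fraction = sum(1 for v in series if v > 0) / len(series)
    peak_share = max(series) / total if total else 0.0
    return [active_fraction, peak_share]


def dist2(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=50):
    """Plain k-means (Lloyd's algorithm) with deterministic seeding.

    Assumption: centroids start at the first point, then at the point
    farthest from the centroids chosen so far, instead of random sampling.
    """
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(dist2(p, c) for c in centroids))
        centroids.append(list(farthest))

    labels = []
    for _ in range(iterations):
        new_labels = [
            min(range(k), key=lambda c: dist2(p, centroids[c])) for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


if __name__ == "__main__":
    # Made-up commit histories: two with sustained activity, two one-off bursts.
    repos = {
        "sustained_a": list(range(0, 84, 3)),
        "sustained_b": list(range(1, 84, 4)),
        "burst_a": [0, 0, 1, 1, 2, 2, 3],
        "burst_b": [0, 1, 2, 3, 4, 20],
    }
    vectors = [feature_vector(weekly_series(days, 12)) for days in repos.values()]
    for name, label in zip(repos, kmeans(vectors, k=2)):
        print(name, "-> cluster", label)
```

The cluster numbers carry no meaning on their own. Deciding which cluster corresponds to "engineered" projects needs a labelled reference set, which is the role the ground truth from the previous study plays in the abstract.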