Researcher profile

Rebecca Moussa

Rebecca Moussa contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 17 - UnverifiedVerification L1Unclaimed author
4works
0followers
2topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

4 published item(s)

preprint2026arXiv

FairRF: Multi-Objective Search for Single and Intersectional Software Fairness

Background: The wide adoption of AI- and ML-based systems in sensitive domains raises severe concerns about their fairness. Many methods have been proposed in the literature to enhance software fairness. However, the majority behave as a black-box, not allowing stakeholders to prioritise fairness or effectiveness (i.e., prediction correctness) based on their needs. Aims: In this paper, we introduce FairRF, a novel approach based on multi-objective evolutionary search to optimise fairness and effectiveness in classification tasks. FairRF uses a Random Forest (RF) model as a base classifier and searches for the best hyperparameter configurations and data mutation to maximise fairness and effectiveness. Eventually, it returns a set of Pareto optimal solutions, allowing the final stakeholders to choose the best one based on their needs. Method: We conduct an extensive empirical evaluation of FairRF against 26 different baselines in 11 different scenarios using five effectiveness and three fairness metrics. Additionally, we also include two variations of the fairness metrics for intersectional bias for a total of six definitions analysed. Result: Our results show that FairRF can significantly improve the fairness of base classifiers, while maintaining consistent prediction effectiveness. Additionally, FairRF provides a more consistent optimisation under all fairness definitions compared to state-of-the-art bias mitigation methods and overcomes the existing state-of-the-art approach for intersectional bias mitigation. Conclusions: FairRF is an effective approach for bias mitigation also allowing stakeholders to adapt the development of fair software systems based on their specific needs.

preprint2022arXiv

A Versatile Dataset of Agile Open Source Software Projects

Agile software development is nowadays a widely adopted practise in both open-source and industrial software projects. Agile teams typically heavily rely on issue management tools to document new issues and keep track of outstanding ones, in addition to storing their technical details, effort estimates, assignment to developers, and more. Previous work utilised the historical information stored in issue management systems for various purposes; however, when researchers make their empirical data public, it is usually relevant solely to the study's objective. In this paper, we present a more holistic and versatile dataset containing a wealth of information on more than 500,000 issues from 44 open-source Agile software, making it well-suited to several research avenues, and cross-analyses therein, including effort estimation, issue prioritization, issue assignment and many more. We make this data publicly available on GitHub to facilitate ease of use, maintenance, and extensibility.

preprint2022arXiv

Agile Effort Estimation: Have We Solved the Problem Yet? Insights From A Second Replication Study (GPT2SP Replication Report)

Fu and Tantithamthavorn have recently proposed GPT2SP, a Transformer-based deep learning model for SP estimation of user stories. They empirically evaluated the performance of GPT2SP on a dataset shared by Choetkiertikul et al including 16 projects with a total of 23,313 issues. They benchmarked GPT2SP against two baselines (namely the naive Mean and Median estimators) and the method previously proposed by Choetkiertikul et al. (which we will refer to as DL2SP from now on) for both within- and cross-project estimation scenarios, and evaluated the extent to which each components of GPT2SP contribute towards the accuracy of the SP estimates. Their results show that GPT2SP outperforms DL2SP with a 6%-47% improvement over MAE for the within-project scenario and a 3%-46% improvement for the cross-project scenarios. However, when we attempted to use the GPT2SP source code made available by Fu and Tantithamthavorn to reproduce their experiments, we found a bug in the computation of the Mean Absolute Error (MAE), which may have inflated the GPT2SP's accuracy reported in their work. Therefore, we had issued a pull request to fix such a bug, which has been accepted and merged into their repository at https://github.com/awsm-research/gpt2sp/pull/2. In this report, we describe the results we achieved by using the fixed version of GPT2SP to replicate the experiments conducted in the original paper for RQ1 and RQ2. Following the original study, we analyse the results considering the Medan Absolute Error (MAE) of the estimation methods over all issues in each project, but we also report the Median Absolute Error (MdAE) and the Standard accuracy (SA) for completeness.

preprint2022arXiv

Do Not Take It for Granted: Comparing Open-Source Libraries for Software Development Effort Estimation

In the past two decades, several Machine Learning (ML) libraries have become freely available. Many studies have used such libraries to carry out empirical investigations on predictive Software Engineering (SE) tasks. However, the differences stemming from using one library over another have been overlooked, implicitly assuming that using any of these libraries would provide the user with the same or very similar results. This paper aims at raising awareness of the differences incurred when using different ML libraries for software development effort estimation (SEE), one of most widely studied SE prediction tasks. To this end, we investigate 4 deterministic machine learners as provided by 3 of the most popular ML open-source libraries written in different languages (namely, Scikit-Learn, Caret and Weka). We carry out a thorough empirical study comparing the performance of the machine learners on 5 SEE datasets in the two most common SEE scenarios (i.e., out-of-the-box-ml and tuned-ml) as well as an in-depth analysis of the documentation and code of their APIs. The results of our study reveal that the predictions provided by the 3 libraries differ in 95% of the cases on average across a total of 105 cases studied. These differences are significantly large in most cases and yield misestimations of up to approx. 3,000 hours per project. Moreover, our API analysis reveals that these libraries provide the user with different levels of control on the parameters one can manipulate, and a lack of clarity and consistency, overall, which might mislead users. Our findings highlight that the ML library is an important design choice for SEE studies, which can lead to a difference in performance. However, such a difference is under-documented. We conclude by highlighting open-challenges with suggestions for the developers of libraries as well as for the researchers and practitioners using them.