Source author record

Hadis Anahideh

Hadis Anahideh appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Machine Learning cs.CY math.OC

Catalog footprint

What is connected

5 works

3 topics

4 close collaborators

Preprint, 2022, arXiv

An Empirical Review of Model-based Adaptive Sampling for Global Optimization of Expensive Black-box Functions

This paper reviews state-of-the-art model-based adaptive sampling approaches for single-objective black-box optimization (BBO). While the BBO literature includes various promising sampling techniques, comprehensive investigations of the existing research across the vast scope of BBO problems are still lacking. We first classify BBO problems into two categories, engineering design and algorithm design optimization, and discuss their challenges. We then critically discuss and analyze adaptive model-based sampling techniques, focusing on key acquisition functions. We elaborate on the shortcomings of variance-based sampling techniques for engineering design problems. Moreover, we provide in-depth insights into the impact of discretization schemes on the performance of acquisition functions. We emphasize the importance of dynamic discretization for distance-based exploration and introduce EEPA+, an improved variant of a previously proposed Pareto-based sampling technique. Our empirical analyses reveal the effectiveness of variance-based techniques for algorithm design and of distance-based methods for engineering design optimization problems.
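As a rough illustration of what distance-based exploration can mean, the sketch below scores each candidate point by its distance to the nearest already-evaluated point and selects the farthest one. The function names, the random candidate set, and the unit-box domain are assumptions made for illustration; this is not EEPA+ or any specific acquisition function from the paper.

```python
import math
import random

def min_distance(candidate, evaluated):
    """Distance from a candidate to its nearest already-evaluated point."""
    return min(math.dist(candidate, x) for x in evaluated)

def pick_most_exploratory(candidates, evaluated):
    """Return the candidate that is farthest from every evaluated point."""
    return max(candidates, key=lambda c: min_distance(c, evaluated))

def sample_candidates(n, dim, rng):
    """Draw a fresh random candidate set in the unit box (hypothetical discretization)."""
    return [tuple(rng.random() for _ in range(dim)) for _ in range(n)]

rng = random.Random(0)
evaluated = [(0.1, 0.1), (0.2, 0.15)]

# Redrawing the candidates at every iteration is one simple reading of a
# "dynamic" discretization; a fixed grid would reuse the same set each time.
for _ in range(3):
    candidates = sample_candidates(50, 2, rng)
    evaluated.append(pick_most_exploratory(candidates, evaluated))
```

A pure exploration rule like this ignores the surrogate model's predictions entirely; in practice it would be combined with an exploitation criterion.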

Preprint, 2022, arXiv

Auditing the Imputation Effect on Fairness of Predictive Analytics in Higher Education

Colleges and universities use predictive analytics in a variety of ways to increase student success rates. Despite the potential of predictive analytics, two major barriers exist to their adoption in higher education: (a) the lack of democratization in deployment, and (b) the potential to exacerbate inequalities. Education researchers and policymakers encounter numerous challenges in deploying predictive modeling in practice. These challenges arise at different steps of modeling, including data preparation, model development, and evaluation. Each of these steps can introduce additional bias into the system if not performed appropriately. Most large-scale and nationally representative education data sets suffer from a significant number of incomplete responses from research participants. While many education-related studies have addressed the challenges of missing data, little is known about the impact of handling missing values on the fairness of predictive outcomes in practice. In this paper, we set out to first assess the disparities in predictive modeling outcomes for college-student success, then investigate the impact of imputation techniques on model performance and fairness using a set of commonly used metrics. We conduct a prospective evaluation to provide a less biased estimate of future performance and fairness than an evaluation on historical data. Our comprehensive analysis of a real large-scale education dataset reveals key insights into modeling disparities and how imputation techniques affect the fairness of student-success predictions under different testing scenarios. Our results indicate that imputation introduces bias if the testing set follows the historical distribution. However, if the injustice in society is addressed and consequently the upcoming batch of observations is equalized, the model would be less biased.

Preprint, 2022, arXiv

Explainable Predictive Modeling for Limited Spectral Data

Feature selection for high-dimensional labeled data with limited observations is critical for making powerful predictive modeling accessible, scalable, and interpretable for domain experts. Spectroscopy data, which records the interaction between matter and electromagnetic radiation, holds a particularly large amount of information in a single sample. Since acquiring such high-dimensional data is a complex task, it is crucial to exploit the best analytical tools to extract the necessary information. In this paper, we investigate the most commonly used feature selection techniques and apply recent explainable AI techniques to interpret the prediction outcomes of high-dimensional and limited spectral data. Interpretation of the prediction outcome is beneficial for domain experts, as it ensures the transparency and faithfulness of the ML models to the domain knowledge. Given instrument resolution limitations, pinpointing important regions of the spectroscopy data creates a pathway to optimize the data collection process through the miniaturization of the spectrometer device. Reducing the device size and power, and therefore cost, is a requirement for the real-world deployment of such a sensor-to-prediction system as a whole. We specifically design three different scenarios to ensure that the evaluation of ML models is robust for the real-time practice of the developed methodologies and to uncover the hidden effect of noise sources on the final outcome.

Preprint, 2022, arXiv

Finding Representative Group Fairness Metrics Using Correlation Estimations

It is of critical importance to be aware of the historical discrimination embedded in the data and to consider a fairness measure to reduce bias throughout the predictive modeling pipeline. Given various notions of fairness defined in the literature, investigating the correlation and interaction among metrics is vital for addressing unfairness. Practitioners and data scientists should be able to comprehend each metric and examine their impact on one another given the context, use case, and regulations. Exploring the combinatorial space of different metrics for such examination is burdensome. To alleviate the burden of selecting fairness notions for consideration, we propose a framework that estimates the correlation among fairness notions. Our framework consequently identifies a set of diverse and semantically distinct metrics as representative for a given context. We propose a Monte-Carlo sampling technique for computing the correlations between fairness metrics by indirect and efficient perturbation in the model space. Using the estimated correlations, we then find a subset of representative metrics. The paper proposes a generic method that can be generalized to any arbitrary set of fairness metrics. We showcase the validity of the proposal using comprehensive experiments on real-world benchmark datasets.
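To make the idea of estimating correlation between fairness metrics concrete, the sketch below samples many simple threshold classifiers on synthetic data, computes two group-fairness gaps for each, and reports their Pearson correlation. Everything here is an assumption for illustration: the synthetic scores, the random-threshold sampling (a crude stand-in for the paper's perturbation in the model space), and the choice of the two metrics. It is not the framework proposed in the paper.

```python
import math
import random

def rate(values):
    """Mean of a list of 0/1 values or floats."""
    return sum(values) / len(values)

def parity_gap(preds, groups):
    """Absolute difference in positive prediction rates between groups 0 and 1."""
    a = [p for p, g in zip(preds, groups) if g == 0]
    b = [p for p, g in zip(preds, groups) if g == 1]
    return abs(rate(a) - rate(b))

def opportunity_gap(preds, labels, groups):
    """Absolute difference in true positive rates between groups 0 and 1."""
    a = [p for p, y, g in zip(preds, labels, groups) if g == 0 and y == 1]
    b = [p for p, y, g in zip(preds, labels, groups) if g == 1 and y == 1]
    return abs(rate(a) - rate(b))

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = rate(xs), rate(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

rng = random.Random(0)
n = 400
groups = [i % 2 for i in range(n)]
# Synthetic scores: group 1 is shifted upward; labels are a noisy function of the score.
scores = [rng.random() + 0.2 * g for g in groups]
labels = [1 if s + rng.gauss(0, 0.2) > 0.6 else 0 for s in scores]

dp_gaps, eo_gaps = [], []
for _ in range(200):
    # Each random threshold plays the role of one sampled model (hypothetical).
    threshold = rng.uniform(0.3, 0.9)
    preds = [1 if s > threshold else 0 for s in scores]
    dp_gaps.append(parity_gap(preds, groups))
    eo_gaps.append(opportunity_gap(preds, labels, groups))

r = pearson(dp_gaps, eo_gaps)
print(f"estimated correlation between the two gaps: {r:.3f}")
```

With a full correlation matrix over many metrics, highly correlated metrics could then be grouped and one representative kept per group.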

Preprint, 2020, arXiv

Fair Active Learning

Machine learning (ML) is increasingly being used in high-stakes applications impacting society. Therefore, it is of critical importance that ML models do not propagate discrimination. Collecting accurate labeled data in societal applications is challenging and costly. Active learning is a promising approach to build an accurate classifier by interactively querying an oracle within a labeling budget. We design algorithms for fair active learning that carefully select data points to be labeled so as to balance model accuracy and fairness. Specifically, we focus on demographic parity, a widely used measure of fairness. Extensive experiments over benchmark datasets demonstrate the effectiveness of our proposed approach.
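For readers unfamiliar with demographic parity, the sketch below computes the gap in positive prediction rates between groups, which is zero when parity holds exactly. The function names and the toy predictions are illustrative assumptions, not code or data from the paper.

```python
def positive_rate(predictions, groups, group):
    """Fraction of members of `group` that received a positive prediction."""
    selected = [p for p, g in zip(predictions, groups) if g == group]
    return sum(selected) / len(selected)

def demographic_parity_difference(predictions, groups):
    """Largest gap in positive prediction rate across all groups present."""
    rates = [positive_rate(predictions, groups, g) for g in set(groups)]
    return max(rates) - min(rates)

# Toy example: group "a" receives 3 of 4 positives, group "b" receives 1 of 4.
preds = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
gap = demographic_parity_difference(preds, groups)
print(gap)  # 0.5
```

A fair active learning loop could use a quantity like this alongside an accuracy or uncertainty score when deciding which point to label next.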
