Source author record

Christoph Heitz

Christoph Heitz appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

cs.CY Machine Learning Artificial Intelligence

Catalog footprint

What is connected

6works

3topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Fairness vs Performance: Characterizing the Pareto Frontier of Algorithmic Decision Systems

Designing fair algorithmic decision systems requires balancing model performance with fairness toward affected individuals: More fairness might require sacrificing some performance and vice versa, yet the space of possible trade-offs is still poorly understood. We investigate fairness in binary prediction-based decision problems by conceptualizing decision making as a multi-objective optimization problem that simultaneously considers decision-maker utility and group fairness. We investigate the set of Pareto-optimal decision rules for arbitrary utility functions for decision maker, arbitrary population distributions, and a wide range of group fairness metrics. We find that the Pareto frontier consists of deterministic, group-specific threshold rules applied to individuals' success probability. This complements existing optimality theorems from literature which, for specific fairness constraints, posit lower-bound threshold rules only. However we also show that, depending on the used fairness metric, the Pareto frontier may include upper-bound threshold rules, thus preferring individuals with lower success probabilities. We show that the location of the Pareto frontier depends only on population characteristics, utility functions and fairness score, but not on the technical design of the algorithm - our findings hold for pre-, in-, and post-processing approaches alike. Our results generalize existing optimality theorems for fairness-constrained classification and extend them to generalized fairness metrics and fairness principles, and to partial fairness regimes. This paper connects formal fairness research with legal and ethical requirements to search for less discriminatory alternatives, offering a principled foundation for evaluating and comparing algorithmic decision systems.

preprint2022arXiv

Enforcing Group Fairness in Algorithmic Decision Making: Utility Maximization Under Sufficiency

Binary decision making classifiers are not fair by default. Fairness requirements are an additional element to the decision making rationale, which is typically driven by maximizing some utility function. In that sense, algorithmic fairness can be formulated as a constrained optimization problem. This paper contributes to the discussion on how to implement fairness, focusing on the fairness concepts of positive predictive value (PPV) parity, false omission rate (FOR) parity, and sufficiency (which combines the former two). We show that group-specific threshold rules are optimal for PPV parity and FOR parity, similar to well-known results for other group fairness criteria. However, depending on the underlying population distributions and the utility function, we find that sometimes an upper-bound threshold rule for one group is optimal: utility maximization under PPV parity (or FOR parity) might thus lead to selecting the individuals with the smallest utility for one group, instead of selecting the most promising individuals. This result is counter-intuitive and in contrast to the analogous solutions for statistical parity and equality of opportunity. We also provide a solution for the optimal decision rules satisfying the fairness constraint sufficiency. We show that more complex decision rules are required and that this leads to within-group unfairness for all but one of the groups. We illustrate our findings based on simulated and real data.

preprint2022arXiv

Is calibration a fairness requirement? An argument from the point of view of moral philosophy and decision theory

In this paper, we provide a moral analysis of two criteria of statistical fairness debated in the machine learning literature: 1) calibration between groups and 2) equality of false positive and false negative rates between groups. In our paper, we focus on moral arguments in support of either measure. The conflict between group calibration vs. false positive and false negative rate equality is one of the core issues in the debate about group fairness definitions among practitioners. For any thorough moral analysis, the meaning of the term fairness has to be made explicit and defined properly. For our paper, we equate fairness with (non-)discrimination, which is a legitimate understanding in the discussion about group fairness. More specifically, we equate it with prima facie wrongful discrimination in the sense this is used in Prof. Lippert-Rasmussen's treatment of this definition. In this paper, we argue that a violation of group calibration may be unfair in some cases, but not unfair in others. This is in line with claims already advanced in the literature, that algorithmic fairness should be defined in a way that is sensitive to context. The most important practical implication is that arguments based on examples in which fairness requires between-group calibration, or equality in the false-positive/false-negative rates, do no generalize. For it may be that group calibration is a fairness requirement in one case, but not in another.

preprint2022arXiv

People are not coins. Morally distinct types of predictions necessitate different fairness constraints

A recent paper (Hedden 2021) has argued that most of the group fairness constraints discussed in the machine learning literature are not necessary conditions for the fairness of predictions, and hence that there are no genuine fairness metrics. This is proven by discussing a special case of a fair prediction. In our paper, we show that Hedden 's argument does not hold for the most common kind of predictions used in data science, which are about people and based on data from similar people; we call these human-group-based practices. We argue that there is a morally salient distinction between human-group-based practices and those that are based on data of only one person, which we call human-individual-based practices. Thus, what may be a necessary condition for the fairness of human-group-based practices may not be a necessary condition for the fairness of human-individual-based practices, on which Hedden 's argument is based. Accordingly, the group fairness metrics discussed in the machine learning literature may still be relevant for most applications of prediction-based decision making.

preprint2021arXiv

A Systematic Approach to Group Fairness in Automated Decision Making

While the field of algorithmic fairness has brought forth many ways to measure and improve the fairness of machine learning models, these findings are still not widely used in practice. We suspect that one reason for this is that the field of algorithmic fairness came up with a lot of definitions of fairness, which are difficult to navigate. The goal of this paper is to provide data scientists with an accessible introduction to group fairness metrics and to give some insight into the philosophical reasoning for caring about these metrics. We will do this by considering in which sense socio-demographic groups are compared for making a statement on fairness.

preprint2021arXiv

On the Moral Justification of Statistical Parity

A crucial but often neglected aspect of algorithmic fairness is the question of how we justify enforcing a certain fairness metric from a moral perspective. When fairness metrics are proposed, they are typically argued for by highlighting their mathematical properties. Rarely are the moral assumptions beneath the metric explained. Our aim in this paper is to consider the moral aspects associated with the statistical fairness criterion of independence (statistical parity). To this end, we consider previous work, which discusses the two worldviews "What You See Is What You Get" (WYSIWYG) and "We're All Equal" (WAE) and by doing so provides some guidance for clarifying the possible assumptions in the design of algorithms. We present an extension of this work, which centers on morality. The most natural moral extension is that independence needs to be fulfilled if and only if differences in predictive features (e.g. high school grades and standardized test scores are predictive of performance at university) between socio-demographic groups are caused by unjust social disparities or measurement errors. Through two counterexamples, we demonstrate that this extension is not universally true. This means that the question of whether independence should be used or not cannot be satisfactorily answered by only considering the justness of differences in the predictive features.

Christoph Heitz

What is connected

Connect this record

See the researcher in context

Building this map preview

6 published item(s)

Fairness vs Performance: Characterizing the Pareto Frontier of Algorithmic Decision Systems

Enforcing Group Fairness in Algorithmic Decision Making: Utility Maximization Under Sufficiency

Is calibration a fairness requirement? An argument from the point of view of moral philosophy and decision theory

People are not coins. Morally distinct types of predictions necessitate different fairness constraints

A Systematic Approach to Group Fairness in Automated Decision Making

On the Moral Justification of Statistical Parity