Researcher profile

David García-Soriano

David García-Soriano contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 13 - UnverifiedVerification L1Unclaimed author
2works
0followers
2topics
3close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

2 published item(s)

preprint2020arXiv

Fair-by-design matching

Matching algorithms are used routinely to match donors to recipients for solid organs transplantation, for the assignment of medical residents to hospitals, record linkage in databases, scheduling jobs on machines, network switching, online advertising, and image recognition, among others. Although many optimal solutions may exist to a given matching problem, when the elements that shall or not be included in a solution correspond to individuals, it becomes of paramount importance that the solution be selected fairly. In this paper we study individual fairness in matching problems. Given that many maximum matchings may exist, each one satisfying a different set of individuals, the only way to guarantee fairness is through randomization. Hence we introduce the distributional maxmin fairness framework which provides, for any given input instance, the strongest guarantee possible simultaneously for all individuals in terms of satisfaction probability (the probability of being matched in the solution). Specifically, a probability distribution over feasible solutions is maxmin-fair if it is not possible to improve the satisfaction probability of any individual without decreasing it for some other individual which is no better off. In the special case of matchings in bipartite graphs, our framework is equivalent to the egalitarian mechanism of Bogomolnaia and Mouline. Our main contribution is a polynomial-time algorithm for fair matching building on techniques from minimum cuts, and edge-coloring algorithms for regular bipartite graphs, and transversal theory. For bipartite graphs, our algorithm runs in $O((|V|^2 + |E||V|^{2/3}) \cdot (\log |V|)^2)$ expected time and scales to graphs with tens of millions of vertices and hundreds of millions of edges. To the best of our knowledge, this provides the first large-scale implementation of the egalitarian mechanism.

preprint2020arXiv

Query-Efficient Correlation Clustering

Correlation clustering is arguably the most natural formulation of clustering. Given n objects and a pairwise similarity measure, the goal is to cluster the objects so that, to the best possible extent, similar objects are put in the same cluster and dissimilar objects are put in different clusters. A main drawback of correlation clustering is that it requires as input the $Θ(n^2)$ pairwise similarities. This is often infeasible to compute or even just to store. In this paper we study \emph{query-efficient} algorithms for correlation clustering. Specifically, we devise a correlation clustering algorithm that, given a budget of $Q$ queries, attains a solution whose expected number of disagreements is at most $3\cdot OPT + O(\frac{n^3}{Q})$, where $OPT$ is the optimal cost for the instance. Its running time is $O(Q)$, and can be easily made non-adaptive (meaning it can specify all its queries at the outset and make them in parallel) with the same guarantees. Up to constant factors, our algorithm yields a provably optimal trade-off between the number of queries $Q$ and the worst-case error attained, even for adaptive algorithms. Finally, we perform an experimental study of our proposed method on both synthetic and real data, showing the scalability and the accuracy of our algorithm.