Source author record

Martine De Cock

Martine De Cock appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning Cryptography and Security Artificial Intelligence Logic in Computer Science Programming Languages Computer Vision eess.AS Social and Information Networks Sound

Catalog footprint

What is connected

14works

9topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

PrivFair: a Library for Privacy-Preserving Fairness Auditing

Machine learning (ML) has become prominent in applications that directly affect people's quality of life, including in healthcare, justice, and finance. ML models have been found to exhibit discrimination based on sensitive attributes such as gender, race, or disability. Assessing if an ML model is free of bias remains challenging to date, and by definition has to be done with sensitive user characteristics that are subject of anti-discrimination and data protection law. Existing libraries for fairness auditing of ML models offer no mechanism to protect the privacy of the audit data. We present PrivFair, a library for privacy-preserving fairness audits of ML models. Through the use of Secure Multiparty Computation (MPC), PrivFair protects the confidentiality of the model under audit and the sensitive data used for the audit, hence it supports scenarios in which a proprietary classifier owned by a company is audited using sensitive audit data from an external investigator. We demonstrate the use of PrivFair for group fairness auditing with tabular data or image data, without requiring the investigator to disclose their data to anyone in an unencrypted manner, or the model owner to reveal their model parameters to anyone in plaintext.

preprint2022arXiv

PrivFairFL: Privacy-Preserving Group Fairness in Federated Learning

Group fairness ensures that the outcome of machine learning (ML) based decision making systems are not biased towards a certain group of people defined by a sensitive attribute such as gender or ethnicity. Achieving group fairness in Federated Learning (FL) is challenging because mitigating bias inherently requires using the sensitive attribute values of all clients, while FL is aimed precisely at protecting privacy by not giving access to the clients' data. As we show in this paper, this conflict between fairness and privacy in FL can be resolved by combining FL with Secure Multiparty Computation (MPC) and Differential Privacy (DP). In doing so, we propose a method for training group-fair ML models in cross-device FL under complete and formal privacy guarantees, without requiring the clients to disclose their sensitive attribute values.

preprint2021arXiv

Privacy-Preserving Feature Selection with Secure Multiparty Computation

Existing work on privacy-preserving machine learning with Secure Multiparty Computation (MPC) is almost exclusively focused on model training and on inference with trained models, thereby overlooking the important data pre-processing stage. In this work, we propose the first MPC based protocol for private feature selection based on the filter method, which is independent of model training, and can be used in combination with any MPC protocol to rank features. We propose an efficient feature scoring protocol based on Gini impurity to this end. To demonstrate the feasibility of our approach for practical data science, we perform experiments with the proposed MPC protocols for feature selection in a commonly used machine-learning-as-a-service configuration where computations are outsourced to multiple servers, with semi-honest and with malicious adversaries. Regarding effectiveness, we show that secure feature selection with the proposed protocols improves the accuracy of classifiers on a variety of real-world data sets, without leaking information about the feature values or even which features were selected. Regarding efficiency, we document runtimes ranging from several seconds to an hour for our protocols to finish, depending on the size of the data set and the security settings.

preprint2021arXiv

Privacy-Preserving Video Classification with Convolutional Neural Networks

Many video classification applications require access to personal data, thereby posing an invasive security risk to the users' privacy. We propose a privacy-preserving implementation of single-frame method based video classification with convolutional neural networks that allows a party to infer a label from a video without necessitating the video owner to disclose their video to other entities in an unencrypted manner. Similarly, our approach removes the requirement of the classifier owner from revealing their model parameters to outside entities in plaintext. To this end, we combine existing Secure Multi-Party Computation (MPC) protocols for private image classification with our novel MPC protocols for oblivious single-frame selection and secure label aggregation across frames. The result is an end-to-end privacy-preserving video classification pipeline. We evaluate our proposed solution in an application for private human emotion recognition. Our results across a variety of security settings, spanning honest and dishonest majority configurations of the computing parties, and for both passive and active adversaries, demonstrate that videos can be classified with state-of-the-art accuracy, and without leaking sensitive user information.

preprint2021arXiv

Private Speech Classification with Secure Multiparty Computation

Deep learning in audio signal processing, such as human voice audio signal classification, is a rich application area of machine learning. Legitimate use cases include voice authentication, gunfire detection, and emotion recognition. While there are clear advantages to automated human speech classification, application developers can gain knowledge beyond the professed scope from unprotected audio signal processing. In this paper we propose the first privacy-preserving solution for deep learning-based audio classification that is provably secure. Our approach, which is based on Secure Multiparty Computation, allows to classify a speech signal of one party (Alice) with a deep neural network of another party (Bob) without Bob ever seeing Alice's speech signal in an unencrypted manner. As threat models, we consider both passive security, i.e. with semi-honest parties who follow the instructions of the cryptographic protocols, as well as active security, i.e. with malicious parties who deviate from the protocols. We evaluate the efficiency-security-accuracy trade-off of the proposed solution in a use case for privacy-preserving emotion detection from speech with a convolutional neural network. In the semi-honest case we can classify a speech signal in under 0.3 sec; in the malicious case it takes $\sim$1.6 sec. In both cases there is no leakage of information, and we achieve classification accuracies that are the same as when computations are done on unencrypted data.

preprint2020arXiv

High Performance Logistic Regression for Privacy-Preserving Genome Analysis

In this paper, we present a secure logistic regression training protocol and its implementation, with a new subprotocol to securely compute the activation function. To the best of our knowledge, we present the fastest existing secure Multi-Party Computation implementation for training logistic regression models on high dimensional genome data distributed across a local area network.

preprint2020arXiv

Inline Detection of DGA Domains Using Side Information

Malware applications typically use a command and control (C&C) server to manage bots to perform malicious activities. Domain Generation Algorithms (DGAs) are popular methods for generating pseudo-random domain names that can be used to establish a communication between an infected bot and the C&C server. In recent years, machine learning based systems have been widely used to detect DGAs. There are several well known state-of-the-art classifiers in the literature that can detect DGA domain names in real-time applications with high predictive performance. However, these DGA classifiers are highly vulnerable to adversarial attacks in which adversaries purposely craft domain names to evade DGA detection classifiers. In our work, we focus on hardening DGA classifiers against adversarial attacks. To this end, we train and evaluate state-of-the-art deep learning and random forest (RF) classifiers for DGA detection using side information that is harder for adversaries to manipulate than the domain name itself. Additionally, the side information features are selected such that they are easily obtainable in practice to perform inline DGA detection. The performance and robustness of these models is assessed by exposing them to one day of real-traffic data as well as domains generated by adversarial attack algorithms. We found that the DGA classifiers that rely on both the domain name and side information have high performance and are more robust against adversaries.

preprint2020arXiv

User Profiling Using Hinge-loss Markov Random Fields

A variety of approaches have been proposed to automatically infer the profiles of users from their digital footprint in social media. Most of the proposed approaches focus on mining a single type of information, while ignoring other sources of available user-generated content (UGC). In this paper, we propose a mechanism to infer a variety of user characteristics, such as, age, gender and personality traits, which can then be compiled into a user profile. To this end, we model social media users by incorporating and reasoning over multiple sources of UGC as well as social relations. Our model is based on a statistical relational learning framework using Hinge-loss Markov Random Fields (HL-MRFs), a class of probabilistic graphical models that can be defined using a set of first-order logical rules. We validate our approach on data from Facebook with more than 5k users and almost 725k relations. We show how HL-MRFs can be used to develop a generic and extensible user profiling framework by leveraging textual, visual, and relational content in the form of status updates, profile pictures and Facebook page likes. Our experimental results demonstrate that our proposed model successfully incorporates multiple sources of information and outperforms competing methods that use only one source of information or an ensemble method across the different sources for modeling of users in social media.

preprint2015arXiv

Solving stable matching problems using answer set programming

Since the introduction of the stable marriage problem (SMP) by Gale and Shapley (1962), several variants and extensions have been investigated. While this variety is useful to widen the application potential, each variant requires a new algorithm for finding the stable matchings. To address this issue, we propose an encoding of the SMP using answer set programming (ASP), which can straightforwardly be adapted and extended to suit the needs of specific applications. The use of ASP also means that we can take advantage of highly efficient off-the-shelf solvers. To illustrate the flexibility of our approach, we show how our ASP encoding naturally allows us to select optimal stable matchings, i.e. matchings that are optimal according to some user-specified criterion. To the best of our knowledge, our encoding offers the first exact implementation to find sex-equal, minimum regret, egalitarian or maximum cardinality stable matchings for SMP instances in which individuals may designate unacceptable partners and ties between preferences are allowed. This paper is under consideration in Theory and Practice of Logic Programming (TPLP).

preprint2013arXiv

Characterizing and Extending Answer Set Semantics using Possibility Theory

Answer Set Programming (ASP) is a popular framework for modeling combinatorial problems. However, ASP cannot easily be used for reasoning about uncertain information. Possibilistic ASP (PASP) is an extension of ASP that combines possibilistic logic and ASP. In PASP a weight is associated with each rule, where this weight is interpreted as the certainty with which the conclusion can be established when the body is known to hold. As such, it allows us to model and reason about uncertain information in an intuitive way. In this paper we present new semantics for PASP, in which rules are interpreted as constraints on possibility distributions. Special models of these constraints are then identified as possibilistic answer sets. In addition, since ASP is a special case of PASP in which all the rules are entirely certain, we obtain a new characterization of ASP in terms of constraints on possibility distributions. This allows us to uncover a new form of disjunction, called weak disjunction, that has not been previously considered in the literature. In addition to introducing and motivating the semantics of weak disjunction, we also pinpoint its computational complexity. In particular, while the complexity of most reasoning tasks coincides with standard disjunctive ASP, we find that brave reasoning for programs with weak disjunctions is easier.

preprint2013arXiv

Modeling Stable Matching Problems with Answer Set Programming

The Stable Marriage Problem (SMP) is a well-known matching problem first introduced and solved by Gale and Shapley (1962). Several variants and extensions to this problem have since been investigated to cover a wider set of applications. Each time a new variant is considered, however, a new algorithm needs to be developed and implemented. As an alternative, in this paper we propose an encoding of the SMP using Answer Set Programming (ASP). Our encoding can easily be extended and adapted to the needs of specific applications. As an illustration we show how stable matchings can be found when individuals may designate unacceptable partners and ties between preferences are allowed. Subsequently, we show how our ASP based encoding naturally allows us to select specific stable matchings which are optimal according to a given criterion. Each time, we can rely on generic and efficient off-the-shelf answer set solvers to find (optimal) stable matchings.

preprint2012arXiv

Possibilistic Answer Set Programming Revisited

Possibilistic answer set programming (PASP) extends answer set programming (ASP) by attaching to each rule a degree of certainty. While such an extension is important from an application point of view, existing semantics are not well-motivated, and do not always yield intuitive results. To develop a more suitable semantics, we first introduce a characterization of answer sets of classical ASP programs in terms of possibilistic logic where an ASP program specifies a set of constraints on possibility distributions. This characterization is then naturally generalized to define answer sets of PASP programs. We furthermore provide a syntactic counterpart, leading to a possibilistic generalization of the well-known Gelfond-Lifschitz reduct, and we show how our framework can readily be implemented using standard ASP solvers.

preprint2011arXiv

Expressiveness of Communication in Answer Set Programming

Answer set programming (ASP) is a form of declarative programming that allows to succinctly formulate and efficiently solve complex problems. An intuitive extension of this formalism is communicating ASP, in which multiple ASP programs collaborate to solve the problem at hand. However, the expressiveness of communicating ASP has not been thoroughly studied. In this paper, we present a systematic study of the additional expressiveness offered by allowing ASP programs to communicate. First, we consider a simple form of communication where programs are only allowed to ask questions to each other. For the most part, we deliberately only consider simple programs, i.e. programs for which computing the answer sets is in P. We find that the problem of deciding whether a literal is in some answer set of a communicating ASP program using simple communication is NP-hard. In other words: we move up a step in the polynomial hierarchy due to the ability of these simple ASP programs to communicate and collaborate. Second, we modify the communication mechanism to also allow us to focus on a sequence of communicating programs, where each program in the sequence may successively remove some of the remaining models. This mimics a network of leaders, where the first leader has the first say and may remove models that he or she finds unsatisfactory. Using this particular communication mechanism allows us to capture the entire polynomial hierarchy. This means, in particular, that communicating ASP could be used to solve problems that are above the second level of the polynomial hierarchy, such as some forms of abductive reasoning as well as PSPACE-complete problems such as STRIPS planning.

preprint2011arXiv

Reducing Fuzzy Answer Set Programming to Model Finding in Fuzzy Logics

In recent years answer set programming has been extended to deal with multi-valued predicates. The resulting formalisms allows for the modeling of continuous problems as elegantly as ASP allows for the modeling of discrete problems, by combining the stable model semantics underlying ASP with fuzzy logics. However, contrary to the case of classical ASP where many efficient solvers have been constructed, to date there is no efficient fuzzy answer set programming solver. A well-known technique for classical ASP consists of translating an ASP program $P$ to a propositional theory whose models exactly correspond to the answer sets of $P$. In this paper, we show how this idea can be extended to fuzzy ASP, paving the way to implement efficient fuzzy ASP solvers that can take advantage of existing fuzzy logic reasoners. To appear in Theory and Practice of Logic Programming (TPLP).

Martine De Cock

What is connected

Connect this record

See the researcher in context

Building this map preview

14 published item(s)

PrivFair: a Library for Privacy-Preserving Fairness Auditing

PrivFairFL: Privacy-Preserving Group Fairness in Federated Learning

Privacy-Preserving Feature Selection with Secure Multiparty Computation

Privacy-Preserving Video Classification with Convolutional Neural Networks

Private Speech Classification with Secure Multiparty Computation

High Performance Logistic Regression for Privacy-Preserving Genome Analysis

Inline Detection of DGA Domains Using Side Information

User Profiling Using Hinge-loss Markov Random Fields

Solving stable matching problems using answer set programming

Characterizing and Extending Answer Set Semantics using Possibility Theory

Modeling Stable Matching Problems with Answer Set Programming

Possibilistic Answer Set Programming Revisited

Expressiveness of Communication in Answer Set Programming

Reducing Fuzzy Answer Set Programming to Model Finding in Fuzzy Logics