Source author record

Gillian Dobbie

Gillian Dobbie appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning Cryptography and Security Computation and Language Methodology Software Engineering

Catalog footprint

What is connected

5works

5topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2024arXiv

A Survey of Methods, Challenges and Perspectives in Causality

Deep Learning models have shown success in a large variety of tasks by extracting correlation patterns from high-dimensional data but still struggle when generalizing out of their initial distribution. As causal engines aim to learn mechanisms independent from a data distribution, combining Deep Learning with Causality can have a great impact on the two fields. In this paper, we further motivate this assumption. We perform an extensive overview of the theories and methods for Causality from different perspectives, with an emphasis on Deep Learning and the challenges met by the two domains. We show early attempts to bring the fields together and the possible perspectives for the future. We finish by providing a large variety of applications for techniques from Causality.

preprint2024arXiv

Large Language Models Are Not Strong Abstract Reasoners

Large Language Models have shown tremendous performance on a large variety of natural language processing tasks, ranging from text comprehension to common sense reasoning. However, the mechanisms responsible for this success remain opaque, and it is unclear whether LLMs can achieve human-like cognitive capabilities or whether these models are still fundamentally circumscribed. Abstract reasoning is a fundamental task for cognition, consisting of finding and applying a general pattern from few data. Evaluating deep neural architectures on this task could give insight into their potential limitations regarding reasoning and their broad generalisation abilities, yet this is currently an under-explored area. In this paper, we introduce a new benchmark for evaluating language models beyond memorization on abstract reasoning tasks. We perform extensive evaluations of state-of-the-art LLMs, showing that they currently achieve very limited performance in contrast with other natural language tasks, even when applying techniques that have been shown to improve performance on other NLP tasks. We argue that guiding LLM generation to follow causal paths could help improve the generalisation and reasoning abilities of LLMs.

preprint2022arXiv

Membership Inference Attacks on Machine Learning: A Survey

Machine learning (ML) models have been widely applied to various applications, including image classification, text generation, audio recognition, and graph data analysis. However, recent studies have shown that ML models are vulnerable to membership inference attacks (MIAs), which aim to infer whether a data record was used to train a target model or not. MIAs on ML models can directly lead to a privacy breach. For example, via identifying the fact that a clinical record that has been used to train a model associated with a certain disease, an attacker can infer that the owner of the clinical record has the disease with a high chance. In recent years, MIAs have been shown to be effective on various ML models, e.g., classification models and generative models. Meanwhile, many defense methods have been proposed to mitigate MIAs. Although MIAs on ML models form a newly emerging and rapidly growing research area, there has been no systematic survey on this topic yet. In this paper, we conduct the first comprehensive survey on membership inference attacks and defenses. We provide the taxonomies for both attacks and defenses, based on their characterizations, and discuss their pros and cons. Based on the limitations and gaps identified in this survey, we point out several promising future research directions to inspire the researchers who wish to follow this area. This survey not only serves as a reference for the research community but also provides a clear description for researchers outside this research domain. To further help the researchers, we have created an online resource repository, which we will keep updated with future relevant work. Interested readers can find the repository at https://github.com/HongshengHu/membership-inference-machine-learning-literature.

preprint2022arXiv

Membership Inference via Backdooring

Recently issued data privacy regulations like GDPR (General Data Protection Regulation) grant individuals the right to be forgotten. In the context of machine learning, this requires a model to forget about a training data sample if requested by the data owner (i.e., machine unlearning). As an essential step prior to machine unlearning, it is still a challenge for a data owner to tell whether or not her data have been used by an unauthorized party to train a machine learning model. Membership inference is a recently emerging technique to identify whether a data sample was used to train a target model, and seems to be a promising solution to this challenge. However, straightforward adoption of existing membership inference approaches fails to address the challenge effectively due to being originally designed for attacking membership privacy and suffering from several severe limitations such as low inference accuracy on well-generalized models. In this paper, we propose a novel membership inference approach inspired by the backdoor technology to address the said challenge. Specifically, our approach of Membership Inference via Backdooring (MIB) leverages the key observation that a backdoored model behaves very differently from a clean model when predicting on deliberately marked samples created by a data owner. Appealingly, MIB requires data owners' marking a small number of samples for membership inference and only black-box access to the target model, with theoretical guarantees for inference results. We perform extensive experiments on various datasets and deep neural network architectures, and the results validate the efficacy of our approach, e.g., marking only 0.1% of the training dataset is practically sufficient for effective membership inference.

preprint2020arXiv

Measuring the Quality of B Abstract Machines with ISO/IEC 25010

The B method has facilitated the development of software by specifying the design of software as abstract machines and formally verifying the correctness of the abstract machines. The quality of B abstract machines can significantly impact the quality of final software products. In this paper, we propose a set of criteria for measuring the quality of B abstract machines based on ISO/IEC 25010, which is one of the latest international standards for evaluating software quality in software engineering. These criteria evaluate abstract machines using a number of general-purpose and domain-independent equations and model checking techniques, so that the quality of abstract machines can be quantified as vectors. The proposed criteria are implemented as a B model quality evaluator, and they are explained and justified using a number of examples.