Source author record

Alun Preece

Alun Preece appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher | Unclaimed source record

Artificial Intelligence, Human-Computer Interaction, Computation and Language, Cryptography and Security, eess.IV, Machine Learning, Multiagent Systems, Software Engineering

Catalog footprint

What is connected

10 works

8 topics

4 close collaborators


preprint, 2023, arXiv

Guiding Generative Language Models for Data Augmentation in Few-Shot Text Classification

Data augmentation techniques are widely used for enhancing the performance of machine learning models by tackling class imbalance issues and data sparsity. State-of-the-art generative language models have been shown to provide significant gains across different NLP tasks. However, their applicability to data augmentation for text classification tasks in few-shot settings has not been fully explored, especially for specialised domains. In this paper, we leverage GPT-2 (Radford et al., 2019) for generating artificial training instances in order to improve classification performance. Our aim is to analyse the impact that the selection process of seed training examples has on the quality of GPT-generated samples and consequently on classifier performance. We perform experiments with several seed selection strategies that, among others, exploit class hierarchical structures and domain expert selection. Our results show that fine-tuning GPT-2 on a handful of label instances leads to consistent classification improvements and outperforms competitive baselines. Finally, we show that guiding this process through domain expert selection can lead to further improvements, which opens up interesting research avenues for combining generative models and active learning.

preprint, 2021, arXiv

A framework for fostering transparency in shared artificial intelligence models by increasing visibility of contributions

Increased adoption of artificial intelligence (AI) systems into scientific workflows will result in an increasing technical debt as the distance between the data scientists and engineers who develop AI system components and scientists, researchers and other users grows. This could quickly become problematic, particularly where guidance or regulations change and once-acceptable best practice becomes outdated, or where data sources are later discredited as biased or inaccurate. This paper presents a novel method for deriving a quantifiable metric capable of ranking the overall transparency of the process pipelines used to generate AI systems, such that users, auditors and other stakeholders can gain confidence that they will be able to validate and trust the data sources and contributors in the AI systems that they rely on. The methodology for calculating the metric, and the type of criteria that could be used to make judgements on the visibility of contributions to systems, are evaluated through models published at ModelHub and PyTorch Hub, popular archives for sharing science resources. They are found to be helpful in driving consideration of the contributions made to generating AI systems, and of approaches towards effective documentation and improved transparency in machine learning assets shared within scientific communities.

preprint, 2020, arXiv

Certifying Provenance of Scientific Datasets with Self-sovereign Identity and Verifiable Credentials

In order to increase the value of scientific datasets and improve research outcomes, it is important that only trustworthy data is used. This paper presents mechanisms by which scientists and the organisations they represent can certify the authenticity of characteristics and provenance of any datasets they publish so that secondary users can inspect and gain confidence in the qualities of data they source. By drawing on data models and protocols used to provide self-sovereign ownership of identity and personal data to individuals, we conclude that providing self-sovereignty to data assets offers a viable approach for institutions to certify qualities of their datasets in a cryptographically secure manner, and enables secondary data users to efficiently verify the authenticity of such certifications. By building upon emerging standards for decentralized identification and cryptographically verifiable credentials, we envisage an infrastructure of tools being developed to foster adoption of metadata certification schemes and to improve the quality of information provided in support of shared data assets.

preprint, 2020, arXiv

Explaining Motion Relevance for Activity Recognition in Video Deep Learning Models

A small subset of explainability techniques developed initially for image recognition models has recently been applied for interpretability of 3D Convolutional Neural Network models in activity recognition tasks. Much like the models themselves, the techniques require little or no modification to be compatible with 3D inputs. However, these explanation techniques regard spatial and temporal information jointly. Therefore, using such explanation techniques, a user cannot explicitly distinguish the role of motion in a 3D model's decision. In fact, it has been shown that these models do not appropriately factor motion information into their decision. We propose a selective relevance method for adapting the 2D explanation techniques to provide motion-specific explanations, better aligning them with the human understanding of motion as conceptually separate from static spatial features. We demonstrate the utility of our method in conjunction with several widely-used 2D explanation methods, and show that it improves explanation selectivity for motion. Our results show that the selective relevance method can not only provide insight on the role played by motion in the model's decision -- in effect, revealing and quantifying the model's spatial bias -- but the method also simplifies the resulting explanations for human consumption.

preprint, 2020, arXiv

Increasing negotiation performance at the edge of the network

Automated negotiation has been used in a variety of distributed settings, such as privacy in Internet of Things (IoT) devices and power distribution in Smart Grids. The most common protocol under which these agents negotiate is the Alternating Offers Protocol (AOP). Under this protocol, agents cannot express any additional information to each other besides a counter offer. This can lead to unnecessarily long negotiations when, for example, no agreement is possible, which risks wasting bandwidth, a precious resource at the edge of the network. While alternative protocols exist which alleviate this problem, these solutions are too complex for low power devices, such as IoT sensors operating at the edge of the network. To alleviate this bottleneck, we introduce an extension to AOP called Alternating Constrained Offers Protocol (ACOP), in which agents can also express constraints to each other. This allows agents to both search the possibility space more efficiently and recognise impossible situations sooner. We empirically show that agents using ACOP can significantly reduce the number of messages a negotiation takes, independently of the strategy agents choose. In particular, we show our method significantly reduces the number of messages when an agreement is not possible. Furthermore, when an agreement is possible, it reaches this agreement sooner with no negative effect on the utility.

preprint, 2020, arXiv

The current state of automated negotiation theory: a literature review

Automated negotiation can be an efficient method for resolving conflict and redistributing resources in a coalition setting. Automated negotiation has already seen increased usage in fields such as e-commerce and power distribution in smart grids, and recent advancements in opponent modelling have proven to deliver better outcomes. However, significant barriers to more widespread adoption remain, such as a lack of predictable outcomes over time and a lack of user trust. Additionally, there have been many recent advancements in the field of reasoning about uncertainty, which could help alleviate both those problems. As there is no recent survey on these two fields, and specifically not on their possible intersection, we aim to provide such a survey here.

preprint, 2020, arXiv

Towards a Modelling Framework for Self-Sovereign Identity Systems

Self-sovereign Identity promises to give users control of their own data, and has the potential to foster advancements in terms of personal data privacy. Self-sovereign concepts can also be applied to other entities, such as datasets and devices. Systems adopting this paradigm will be decentralised, with messages passing between multiple actors, both human actors and those representing other entities, in order to issue and request credentials necessary to meet individual and collective goals. Such systems are complex, and build upon social and technical interactions and behaviours. Modelling self-sovereign identity systems seeks to provide stakeholders and software architects with tools that enable them to communicate effectively and that lead to effective and well-regarded system designs and implementations. This paper draws upon research from Actor-based Modelling to guide a way forward in modelling self-sovereign systems, and reports early success in utilising the iStar 2.0 framework to provide a representation of a birth registration case study.

preprint, 2014, arXiv

Conversational Sensing

Recent developments in sensing technologies, mobile devices and context-aware user interfaces have made it possible to represent information fusion and situational awareness as a conversational process among actors - human and machine agents - at or near the tactical edges of a network. Motivated by use cases in the domain of security, policing and emergency response, this paper presents an approach to information collection, fusion and sense-making based on the use of natural language (NL) and controlled natural language (CNL) to support richer forms of human-machine interaction. The approach uses a conversational protocol to facilitate a flow of collaborative messages from NL to CNL and back again in support of interactions such as: turning eyewitness reports from human observers into actionable information (from both trained and untrained sources); fusing information from humans and physical sensors (with associated quality metadata); and assisting human analysts to make the best use of available sensing assets in an area of interest (governed by management and security policies). CNL is used as a common formal knowledge representation for both machine and human agents to support reasoning, semantic information fusion and generation of rationale for inferences, in ways that remain transparent to human users. Examples are provided of various alternative styles for user feedback, including NL, CNL and graphical feedback. A pilot experiment with human subjects shows that a prototype conversational agent is able to gather usable CNL information from untrained human subjects.

preprint, 2011, arXiv

Proceedings of the Doctoral Consortium and Poster Session of the 5th International Symposium on Rules (RuleML 2011@IJCAI)

This volume contains the papers presented at the first edition of the Doctoral Consortium of the 5th International Symposium on Rules (RuleML 2011@IJCAI) held on July 19th, 2011 in Barcelona, as well as the poster session papers of the RuleML 2011@IJCAI main conference.

preprint, 2011, arXiv

Rule-Based Semantic Sensing

Rule-Based Systems have been in use for decades to solve a variety of problems but not in the sensor informatics domain. Rules aid the aggregation of low-level sensor readings to form a more complete picture of the real world and help to address 10 identified challenges for sensor network middleware. This paper presents the reader with an overview of a system architecture and a pilot application to demonstrate the usefulness of a system integrating rules with sensor middleware.
