Source author record

Jim Smith

Jim Smith appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Artificial Intelligence, Cryptography and Security, cs.CY, econ.GN, Information Retrieval, math.ST, Methodology, q-fin.EC, Software Engineering, Statistics Theory

Catalog footprint

What is connected

4 works

10 topics

4 close collaborators

preprint · 2022 · arXiv

Machine Learning Models Disclosure from Trusted Research Environments (TRE), Challenges and Opportunities

Artificial intelligence (AI) applications in healthcare and medicine have increased in recent years. To enable access to personal data, Trusted Research Environments (TREs) provide safe and secure environments in which researchers can access sensitive personal data and develop Artificial Intelligence (AI) and Machine Learning models. However, few TREs currently support the use of automated AI-based modelling using Machine Learning. Early attempts have been made in the literature to present and introduce privacy preserving machine learning from the design point of view [1]. However, there is a gap in the practical decision-making guidance for TREs in handling model disclosure. Specifically, the use of machine learning creates a need to disclose new types of outputs from TREs, such as trained machine learning models. Although TREs have clear policies for the disclosure of statistical outputs, the extent to which trained models can leak personal training data once released is not well understood, and guidelines do not exist within TREs for the safe disclosure of these models. In this paper we introduce the challenge of disclosing trained machine learning models from TREs. We first give an overview of machine learning models in general and describe some of their applications in healthcare and medicine. We define the main vulnerabilities of trained machine learning models in general. We also describe the main factors affecting the vulnerabilities of disclosing machine learning models. This paper also provides insights and analyses methods that could be introduced within TREs to mitigate the risk of privacy breaches when disclosing trained models.

preprint · 2019 · arXiv

Confidentiality and linked data

Data providers such as government statistical agencies perform a balancing act: maximising information published to inform decision-making and research, while simultaneously protecting privacy. The emergence of identified administrative datasets with the potential for sharing (and thus linking) offers huge potential benefits but significant additional risks. This article introduces the principles and methods of linking data across different sources and points in time, focusing on potential areas of risk. We then consider confidentiality risk, focusing in particular on the "intruder" problem central to the area, and looking at both risks from data producer outputs and from the release of micro-data for further analysis. Finally, we briefly consider potential solutions to micro-data release, both the statistical solutions considered in other contributed articles and non-statistical solutions.
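
One common way to illustrate the "intruder" risk in released micro-data is to count how many records are unique on a combination of quasi-identifiers, since a unique record is easier to match against an external source. The sketch below is only an illustration under that assumption; the field names, the toy records and the function `unique_record_share` are hypothetical and are not taken from the article.

```python
from collections import Counter


def unique_record_share(records, quasi_identifiers):
    """Return the fraction of records that are unique on the quasi-identifiers.

    `records` is a list of dicts; `quasi_identifiers` is a list of keys.
    Both are illustrative assumptions, not a format defined by the article.
    """
    if not records:
        return 0.0
    # Build one key per record from the chosen quasi-identifier values.
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    counts = Counter(keys)
    # A record is "unique" when no other record shares its key.
    unique = sum(1 for k in keys if counts[k] == 1)
    return unique / len(records)


# Toy micro-data: two records share a key, two are unique.
records = [
    {"age_band": "30-39", "region": "North", "sex": "F"},
    {"age_band": "30-39", "region": "North", "sex": "F"},
    {"age_band": "40-49", "region": "South", "sex": "M"},
    {"age_band": "50-59", "region": "North", "sex": "M"},
]
share = unique_record_share(records, ["age_band", "region", "sex"])
```

Linking a second source typically adds quasi-identifiers, which can only keep or raise this share; that is one way to read the "additional risks" of linked data mentioned above.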

preprint · 2015 · arXiv

Searching Multiregression Dynamic Models of Resting-State fMRI Networks Using Integer Programming

A Multiregression Dynamic Model (MDM) is a class of multivariate time series that represents various dynamic causal processes in a graphical way. One of the advantages of this class is that, in contrast to many other Dynamic Bayesian Networks, the hypothesised relationships accommodate conditional conjugate inference. We demonstrate for the first time how straightforward it is to search over all possible connectivity networks with dynamically changing intensity of transmission to find the Maximum a Posteriori Probability (MAP) model within this class. This search method is made feasible by using a novel application of an Integer Programming algorithm. The efficacy of applying this particular class of dynamic models to this domain is shown and more specifically the computational efficiency of a corresponding search of 11-node Directed Acyclic Graph (DAG) model space. We proceed to show how diagnostic methods, analogous to those defined for static Bayesian Networks, can be used to suggest embellishment of the model class to extend the process of model selection. All methods are illustrated using simulated and real resting-state functional Magnetic Resonance Imaging (fMRI) data.
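
To make the idea of a MAP search over DAGs concrete, the sketch below assumes the model score decomposes into a sum of local scores, one per node and parent set, and finds the best DAG by brute force over node orderings. This is not the Integer Programming algorithm used in the paper, and it only scales to a handful of nodes; the function `best_dag` and the toy score table are hypothetical stand-ins for real MDM log-scores.

```python
from itertools import combinations, permutations


def best_dag(nodes, local_score):
    """Exhaustively find the DAG maximising the sum of local scores.

    For each ordering of the nodes, every node picks its best parent set
    among the nodes that precede it, which guarantees acyclicity.
    """
    best_total, best_parents = float("-inf"), None
    for order in permutations(nodes):
        total, parents = 0.0, {}
        for i, node in enumerate(order):
            preds = order[:i]
            # All subsets of the predecessors are admissible parent sets.
            candidates = [
                frozenset(c)
                for r in range(len(preds) + 1)
                for c in combinations(preds, r)
            ]
            choice = max(candidates, key=lambda ps: local_score(node, ps))
            parents[node] = choice
            total += local_score(node, choice)
        if total > best_total:
            best_total, best_parents = total, parents
    return best_total, best_parents


# Hypothetical local scores; unlisted parent sets are penalised by size.
TOY_SCORES = {
    ("B", frozenset({"A"})): 2.0,
    ("C", frozenset({"B"})): 1.5,
    ("A", frozenset({"C"})): 1.75,
}


def toy_score(node, parent_set):
    return TOY_SCORES.get((node, parent_set), -0.5 * len(parent_set))


total, parents = best_dag(["A", "B", "C"], toy_score)
```

The three rewarded edges form a cycle, so the search has to drop one of them; it keeps C to A and A to B. The number of orderings grows factorially, which is why an 11-node search needs something like the Integer Programming formulation rather than enumeration.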

preprint · 2014 · arXiv

Interactive Ant Colony Optimisation (iACO) for Early Lifecycle Software Design

Software design is crucial to successful software development, yet is a demanding multi-objective problem for software engineers. In an attempt to assist the software designer, interactive (i.e. human-in-the-loop) meta-heuristic search techniques such as evolutionary computing have been applied and show promising results. Recent investigations have also shown that Ant Colony Optimization (ACO) can outperform evolutionary computing as a potential search engine for interactive software design. With a limited computational budget, ACO produces superior candidate design solutions in a smaller number of iterations. Building on these findings, we propose a novel interactive ACO (iACO) approach to assist the designer in early lifecycle software design, in which the search is steered jointly by subjective designer evaluation as well as machine fitness functions relating to the structural integrity and surrogate elegance of software designs. Results show that iACO is speedy, responsive and highly effective in enabling interactive, dynamic multi-objective search in early lifecycle software design. Study participants rate the iACO search experience as compelling. Results of machine learning of fitness measure weightings indicate that software design elegance does indeed play a significant role in designer evaluation of candidate software design. We conclude that the evenness of the number of attributes and methods among classes (NAC) is a significant surrogate elegance measure, which in turn suggests that this evenness of distribution, when combined with structural integrity, is an implicit but crucial component of effective early lifecycle software design.
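
The abstract does not give a formula for the NAC measure. As a rough illustration of what "evenness of the number of attributes and methods among classes" could mean, the sketch below assumes it is measured as the population standard deviation of per-class member counts, where lower means more even. The function `nac_spread` and the two toy designs are hypothetical.

```python
from statistics import pstdev


def nac_spread(design):
    """Spread of (attributes + methods) per class; 0.0 means perfectly even.

    `design` maps a class name to a (num_attributes, num_methods) pair.
    Using the standard deviation here is an assumption for illustration.
    """
    sizes = [attrs + methods for attrs, methods in design.values()]
    return pstdev(sizes) if sizes else 0.0


# Two toy candidate designs with the same total of 12 members.
even_design = {"Order": (2, 2), "Customer": (2, 2), "Invoice": (1, 3)}
uneven_design = {"Manager": (4, 6), "Order": (1, 0), "Customer": (0, 1)}

even_spread = nac_spread(even_design)
uneven_spread = nac_spread(uneven_design)
```

Under this assumed definition the evenly distributed design scores 0.0 and the design dominated by one large class scores higher, so a search that minimises the spread would prefer the first.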