Researcher profile

Thomas Davies

Thomas Davies contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 17 - UnverifiedVerification L1Unclaimed author
4works
0followers
6topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

4 published item(s)

preprint2022arXiv

A Review of Topological Data Analysis for Cybersecurity

In cybersecurity it is often the case that malicious or anomalous activity can only be detected by combining many weak indicators of compromise, any one of which may not raise suspicion when taken alone. The path that such indicators take can also be critical. This makes the problem of analysing cybersecurity data particularly well suited to Topological Data Analysis (TDA), a field that studies the high level structure of data using techniques from algebraic topology, both for exploratory analysis and as part of a machine learning workflow. By introducing TDA and reviewing the work done on its application to cybersecurity, we hope to highlight to researchers a promising new area with strong potential to improve cybersecurity data science.

preprint2022arXiv

Topological Data Analysis for Anomaly Detection in Host-Based Logs

Topological Data Analysis (TDA) gives practioners the ability to analyse the global structure of cybersecurity data. We use TDA for anomaly detection in host-based logs collected with the open-source Logging Made Easy (LME) project. We present an approach that builds a filtration of simplicial complexes directly from Windows logs, enabling analysis of their intrinsic structure using topological tools. We compare the efficacy of persistent homology and the spectrum of graph and hypergraph Laplacians as feature vectors against a standard log embedding that counts events, and find that topological and spectral embeddings of computer logs contain discriminative information for classifying anomalous logs that is complementary to standard embeddings. We end by discussing the potential for our methods to be used as part of an explainable framework for anomaly detection.

preprint2021arXiv

Fuzzy c-Means Clustering for Persistence Diagrams

Persistence diagrams concisely represent the topology of a point cloud whilst having strong theoretical guarantees, but the question of how to best integrate this information into machine learning workflows remains open. In this paper we extend the ubiquitous Fuzzy c-Means (FCM) clustering algorithm to the space of persistence diagrams, enabling unsupervised learning that automatically captures the topological structure of data without the topological prior knowledge or additional processing of persistence diagrams that many other techniques require. We give theoretical convergence guarantees that correspond to the Euclidean case, and empirically demonstrate the capability of our algorithm to capture topological information via the fuzzy RAND index. We end with experiments on two datasets that utilise both the topological and fuzzy nature of our algorithm: pre-trained model selection in machine learning and lattices structures from materials science. As pre-trained models can perform well on multiple tasks, selecting the best model is a naturally fuzzy problem; we show that fuzzy clustering persistence diagrams allows for model selection using the topology of decision boundaries. In materials science, we classify transformed lattice structure datasets for the first time, whilst the probabilistic membership values let us rank candidate lattices in a scenario where further investigation requires expensive laboratory time and expertise.

preprint2021arXiv

On the Effectiveness of Weight-Encoded Neural Implicit 3D Shapes

A neural implicit outputs a number indicating whether the given query point in space is inside, outside, or on a surface. Many prior works have focused on _latent-encoded_ neural implicits, where a latent vector encoding of a specific shape is also fed as input. While affording latent-space interpolation, this comes at the cost of reconstruction accuracy for any _single_ shape. Training a specific network for each 3D shape, a _weight-encoded_ neural implicit may forgo the latent vector and focus reconstruction accuracy on the details of a single shape. While previously considered as an intermediary representation for 3D scanning tasks or as a toy-problem leading up to latent-encoding tasks, weight-encoded neural implicits have not yet been taken seriously as a 3D shape representation. In this paper, we establish that weight-encoded neural implicits meet the criteria of a first-class 3D shape representation. We introduce a suite of technical contributions to improve reconstruction accuracy, convergence, and robustness when learning the signed distance field induced by a polygonal mesh -- the _de facto_ standard representation. Viewed as a lossy compression, our conversion outperforms standard techniques from geometry processing. Compared to previous latent- and weight-encoded neural implicits we demonstrate superior robustness, scalability, and performance.