Source author record

Toby Davies

Toby Davies appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Computation and Language cs.CY Cryptography and Security Human-Computer Interaction Machine Learning

Catalog footprint

What is connected

4works

5topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

Bridging Information Security and Environmental Criminology Research to Better Mitigate Cybercrime

Cybercrime is a complex phenomenon that spans both technical and human aspects. As such, two disjoint areas have been studying the problem from separate angles: the information security community and the environmental criminology one. Despite the large body of work produced by these communities in the past years, the two research efforts have largely remained disjoint, with researchers on one side not benefitting from the advancements proposed by the other. In this paper, we argue that it would be beneficial for the information security community to look at the theories and systematic frameworks developed in environmental criminology to develop better mitigations against cybercrime. To this end, we provide an overview of the research from environmental criminology and how it has been applied to cybercrime. We then survey some of the research proposed in the information security domain, drawing explicit parallels between the proposed mitigations and environmental criminology theories, and presenting some examples of new mitigations against cybercrime. Finally, we discuss the concept of cyberplaces and propose a framework in order to define them. We discuss this as a potential research direction, taking into account both fields of research, in the hope of broadening interdisciplinary efforts in cybercrime research.

preprint2022arXiv

Self-Supervised Losses for One-Class Textual Anomaly Detection

Current deep learning methods for anomaly detection in text rely on supervisory signals in inliers that may be unobtainable or bespoke architectures that are difficult to tune. We study a simpler alternative: fine-tuning Transformers on the inlier data with self-supervised objectives and using the losses as an anomaly score. Overall, the self-supervision approach outperforms other methods under various anomaly detection scenarios, improving the AUROC score on semantic anomalies by 11.6% and on syntactic anomalies by 22.8% on average. Additionally, the optimal objective and resultant learnt representation depend on the type of downstream anomaly. The separability of anomalies and inliers signals that a representation is more effective for detecting semantic anomalies, whilst the presence of narrow feature directions signals a representation that is effective for detecting syntactic anomalies.

preprint2022arXiv

Textwash -- automated open-source text anonymisation

The increased use of text data in social science research has benefited from easy-to-access data (e.g., Twitter). That trend comes at the cost of research requiring sensitive but hard-to-share data (e.g., interview data, police reports, electronic health records). We introduce a solution to that stalemate with the open-source text anonymisation software_Textwash_. This paper presents the empirical evaluation of the tool using the TILD criteria: a technical evaluation (how accurate is the tool?), an information loss evaluation (how much information is lost in the anonymisation process?) and a de-anonymisation test (can humans identify individuals from anonymised text data?). The findings suggest that Textwash performs similar to state-of-the-art entity recognition models and introduces a negligible information loss of 0.84%. For the de-anonymisation test, we tasked humans to identify individuals by name from a dataset of crowdsourced person descriptions of very famous, semi-famous and non-existing individuals. The de-anonymisation rate ranged from 1.01-2.01% for the realistic use cases of the tool. We replicated the findings in a second study and concluded that Textwash succeeds in removing potentially sensitive information that renders detailed person descriptions practically anonymous.

preprint2016arXiv

Probabilistic map-matching using particle filters

Increasing availability of vehicle GPS data has created potentially transformative opportunities for traffic management, route planning and other location-based services. Critical to the utility of the data is their accuracy. Map-matching is the process of improving the accuracy by aligning GPS data with the road network. In this paper, we propose a purely probabilistic approach to map-matching based on a sequential Monte Carlo algorithm known as particle filters. The approach performs map-matching by producing a range of candidate solutions, each with an associated probability score. We outline implementation details and thoroughly validate the technique on GPS data of varied quality.

Toby Davies

What is connected

Connect this record

See the researcher in context

Building this map preview

4 published item(s)

Bridging Information Security and Environmental Criminology Research to Better Mitigate Cybercrime

Self-Supervised Losses for One-Class Textual Anomaly Detection

Textwash -- automated open-source text anonymisation

Probabilistic map-matching using particle filters