Researcher profile

Claudio Angione

Claudio Angione contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 13 - UnverifiedVerification L1Unclaimed author
2works
0followers
4topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

2 published item(s)

preprint2022arXiv

A pipeline and comparative study of 12 machine learning models for text classification

Text-based communication is highly favoured as a communication method, especially in business environments. As a result, it is often abused by sending malicious messages, e.g., spam emails, to deceive users into relaying personal information, including online accounts credentials or banking details. For this reason, many machine learning methods for text classification have been proposed and incorporated into the services of most email providers. However, optimising text classification algorithms and finding the right tradeoff on their aggressiveness is still a major research problem. We present an updated survey of 12 machine learning text classifiers applied to a public spam corpus. A new pipeline is proposed to optimise hyperparameter selection and improve the models' performance by applying specific methods (based on natural language processing) in the preprocessing stage. Our study aims to provide a new methodology to investigate and optimise the effect of different feature sizes and hyperparameters in machine learning classifiers that are widely used in text classification problems. The classifiers are tested and evaluated on different metrics including F-score (accuracy), precision, recall, and run time. By analysing all these aspects, we show how the proposed pipeline can be used to achieve a good accuracy towards spam filtering on the Enron dataset, a widely used public email corpus. Statistical tests and explainability techniques are applied to provide a robust analysis of the proposed pipeline and interpret the classification outcomes of the 12 machine learning models, also identifying words that drive the classification results. Our analysis shows that it is possible to identify an effective machine learning model to classify the Enron dataset with an F-score of 94%.

preprint2020arXiv

Situating Agent-Based Modelling in Population Health Research

Today's most troublesome population health challenges are often driven by social and environmental determinants, which are difficult to model using traditional epidemiological methods. We agree with those who have argued for the wider adoption of agent-based modelling (ABM) in taking on these challenges. However, while ABM has been used occasionally in population health, we argue that for ABM to be most effective in the field it should be used as a means for answering questions normally inaccessible to the traditional epidemiological toolkit. In an effort to clearly illustrate the utility of ABM for population health research, and to clear up persistent misunderstandings regarding the method's conceptual underpinnings, we offer a detailed presentation of the core concepts of complex systems theory, and summarise why simulations are essential to the study of complex systems. We then examine the current state of the art in ABM for population health, and propose they are well-suited for the study of the `wicked' problems in population health, and could make significant contributions to theory and intervention development in these areas.