Source author record

Massimo Stella

Massimo Stella appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher | Unclaimed source record

physics.soc-ph; Social and Information Networks; Computation and Language; cs.CY; Artificial Intelligence; Human-Computer Interaction; Machine Learning; cond-mat.stat-mech; physics.ed-ph; Biological Physics; Neurons and Cognition; Populations and Evolution; Quantitative Methods

Catalog footprint

What is connected

14 works

13 topics

4 close collaborators

preprint, 2026, arXiv

How to predict creativity ratings from written narratives: A comparison of co-occurrence and textual forma mentis networks

This tutorial paper provides a step-by-step workflow for building and analysing semantic networks from short creative texts. We introduce and compare two widely used text-to-network approaches: word co-occurrence networks and textual forma mentis networks (TFMNs). We also demonstrate how they can be used in machine learning to predict human creativity ratings. Using a corpus of 1029 short stories, we guide readers through text preprocessing, network construction, feature extraction (structural measures, spreading-activation indices, and emotion scores), and application of regression models. We evaluate how network-construction choices influence both network topology and predictive performance. Across all modelling settings, TFMNs consistently outperformed co-occurrence networks through lower prediction errors (best MAE = 0.581 for TFMN, vs 0.592 for co-occurrence with window size 3). Network-structural features dominated predictive performance (MAE = 0.591 for TFMN), whereas emotion features performed worse (MAE = 0.711 for TFMN) and spreading-activation measures contributed little (MAE = 0.788 for TFMN). This paper offers practical guidance for researchers interested in applying network-based methods for cognitive fields like creativity research. We show when syntactic networks are preferable to surface co-occurrence models, and provide an open, reproducible workflow accessible to newcomers in the field, while also offering deeper methodological insight for experienced researchers.
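
As a rough illustration of the co-occurrence approach and the MAE metric mentioned in this abstract, the sketch below builds a word co-occurrence edge list with a sliding window and scores predictions with mean absolute error. It is not the paper's workflow: the function names, the toy sentence, and the reading of "window size 3" as "tokens at most two positions apart" are all assumptions made for illustration.

```python
from collections import Counter


def cooccurrence_edges(tokens, window=3):
    """Count undirected co-occurrences of distinct tokens inside a sliding window.

    Assumption: a window of size 3 links each token to the next two tokens.
    """
    edges = Counter()
    for i, source in enumerate(tokens):
        # Pair the token with the next (window - 1) tokens to its right.
        for target in tokens[i + 1 : i + window]:
            if source != target:
                edges[tuple(sorted((source, target)))] += 1
    return edges


def mean_absolute_error(y_true, y_pred):
    """Average absolute gap between human ratings and model predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy text, not taken from the paper's corpus.
tokens = "the old fox found the hidden door".split()
edges = cooccurrence_edges(tokens, window=3)

print(edges[("fox", "the")])  # 2
print(mean_absolute_error([1, 2, 3], [1, 3, 5]))  # 1.0
```

In a fuller pipeline the edge counts would feed network measures (degree, clustering, and so on) that serve as regression features; that step is omitted here.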

preprint, 2026, arXiv

Mapping how LLMs debate societal issues when shadowing human personality traits, sociodemographics and social media behavior

Large Language Models (LLMs) can strongly shape social discourse, yet datasets investigating how LLM outputs vary across controlled social and contextual prompting remain sparse. Cognitive Digital Shadows (CDS) is a 190,000-record synthetic corpus supporting analyses of LLM-generated discourse. Each CDS record is generated by one of 19 LLMs, prompted to shadow either a human persona or an AI-assistant role. CDS contains LLM responses on 4 controversial societal topics: vaccines/healthcare, social media disinformation, the gender gap in science, and STEM stereotypes. Persona-conditioned records encode 17 sociodemographic and psychological attributes, providing data linking LLMs' prompts, language, stances and reasoning. Texts are validated for topic anchoring and can support emotional analyses via interpretable NLP (e.g. textual forma mentis networks). CDS is enriched by a pooling platform with user-friendly dashboards, enabling easy, interactive group-level comparisons of emotional and semantic framing across personas, topics and models. The CDS prompting framework supports future audits of LLMs' bias, social sensitivity and alignment.

preprint, 2026, arXiv

Math Education Digital Shadows for facilitating learning with LLMs: Math performance, anxiety and confidence in simulated students and AIs

To enhance LLMs' impact on math education, we need data on their mathematical prowess and biases across prompts. To fill this gap, we introduce MEDS (Math Education Digital Shadows) as a dataset mapping how large language models reason about and report mathematics across human- and AI-like conditions. MEDS involves 28,000 personas from 14 LLMs (from families like Mistral, Qwen, DeepSeek, Granite, Phi and Grok) shadowing either humans or AI assistants. Each record/shadow includes a set of prompts along with psychological/sociodemographic persona metadata and four types of math tasks: (i) open math interview, (ii) three psychometric tests about math perceptions with explanations, (iii) cognitive networks capturing math attitudes, and (iv) 18 high-school math test questions together with their reasoning and confidence scores. MEDS differs from traditional score-only math benchmarks because it integrates concepts of self-efficacy, math anxiety, and cognitive network science besides math proficiency scores. Data validation shows that the sampled LLMs exhibit schema integrity and consistent personas, together with family-specific peculiarities like human-like negative math attitudes, logical fallacies, and math overconfidence. MEDS will benefit learning analytics experts, cognitive scientists, and developers of safer AI tutors in mathematics.

preprint, 2026, arXiv

The TEA Nets framework combines AI and cognitive network science to model targets, events and actors in text

We introduce Target-Event-Agent Networks (TEA Nets) as a computational framework to extract subjects ("Agents"), verbs ("Events"), and objects ("Targets") from texts. Grounded in cognitive network science and artificial intelligence, TEA Nets are implemented as an open-source Python library. We test TEA Nets in three case studies, demonstrating the framework's ability to perform interpretable emotion detection, semantic frame analyses, and linguistic inquiries across conspiracy texts and textual responses generated by LLMs. In the LOCO conspiracy corpus, TEA Nets revealed that highly conspiratorial narratives (4,227 texts) linked personal pronouns ("I", "you", "we") with the same actions twice as frequently as low-similarity conspiracy narratives. High-conspiracy narratives connected person-focused elements ("you", "people") through actions eliciting anger above the random baseline ($z = 2.63, p < .05$), a trend absent in low-similarity conspiracy narratives, which emphasized scientific actors ("researcher", "scientist"). In the HOPE and CounseLLMe datasets of 212 (human) and 200 (LLM-based) psychotherapy transcripts, respectively, TEA Nets highlighted emotional differences. When expressing feelings, Claude 3 Haiku, GPT-3.5, and humans used sad words with higher frequency than random expectations but Haiku expressed sadness with lower emotional intensity than humans ($U = 1243.5, p = .036$). We discuss these differences in the context of psychotherapy training on LLM-simulated patients. Our results show that Target-Event-Agent Networks can extract relevant emotional, syntactic, and semantic insights from narratives, opening new avenues for text analysis with cognitive network science.

preprint, 2022, arXiv

Feature-rich multiplex lexical networks reveal mental strategies of early language learning

Knowledge in the human mind exhibits a dualistic vector/network nature. Modelling words as vectors is key to natural language processing, whereas networks of word associations can map the nature of semantic memory. We reconcile these paradigms - fragmented across linguistics, psychology and computer science - by introducing FEature-Rich MUltiplex LEXical (FERMULEX) networks. This novel framework merges structural similarities in networks and vector features of words, which can be combined or explored independently. Similarities model heterogeneous word associations across semantic/syntactic/phonological aspects of knowledge. Words are enriched with multi-dimensional feature embeddings including frequency, age of acquisition, length and polysemy. These aspects enable unprecedented explorations of cognitive knowledge. Through CHILDES data, we use FERMULEX networks to model normative language acquisition by 1000 toddlers between 18 and 30 months. Similarities and embeddings capture word homophily via conformity, which measures assortative mixing via distance and features. Conformity unearths a language kernel of frequent/polysemous/short nouns and verbs key for basic sentence production, supporting recent evidence of children's syntactic constructs emerging at 30 months. This kernel is invisible to network core-detection and feature-only clustering: it emerges from the dual vector/network nature of words. Our quantitative analysis reveals two key strategies in early word learning. Modelling word acquisition as random walks on FERMULEX topology, we highlight non-uniform filling of communicative developmental inventories (CDIs). Conformity-based walkers lead to accurate (75%), precise (55%) and partially well-recalled (34%) predictions of early word learning in CDIs, providing quantitative support to previous empirical findings and developmental theories.

preprint, 2021, arXiv

Cognitive network science for understanding online social cognitions: A brief review

Social media are digitalising massive amounts of users' cognitions in terms of timelines and emotional content. Such Big Data opens unprecedented opportunities for investigating cognitive phenomena like perception, personality and information diffusion but requires suitable interpretable frameworks. Since social media data come from users' minds, worthy candidates for this challenge are cognitive networks, models of cognition giving structure to mental conceptual associations. This work outlines how cognitive network science can open new, quantitative ways for understanding cognition through online media, like: (i) reconstructing how users semantically and emotionally frame events with contextual knowledge unavailable to machine learning; (ii) investigating conceptual salience/prominence through knowledge structure in social discourse; (iii) studying users' personality traits like openness-to-experience, curiosity, and creativity through language in posts; (iv) bridging cognitive/emotional content and social dynamics via multilayer networks comparing the mindsets of influencers and followers. These advancements combine cognitive-, network- and computer science to understand cognitive mechanisms in both digital and real-world settings but come with limitations concerning representativeness, individual variability and data integration. Such aspects are discussed along with the ethical implications of manipulating socio-cognitive data. In the future, reading cognitions through networks and social media can expose cognitive biases amplified by online platforms and relevantly inform policy making, education and markets about massive, complex cognitive trends.

preprint, 2020, arXiv

#lockdown: network-enhanced emotional profiling at the times of COVID-19

The COVID-19 pandemic forced countries all over the world to take unprecedented measures like nationwide lockdowns. To adequately understand the emotional and social repercussions, a large-scale reconstruction of how people perceived these unexpected events is necessary but currently missing. We address this gap through social media by introducing MERCURIAL (Multi-layer Co-occurrence Networks for Emotional Profiling), a framework which exploits linguistic networks of words and hashtags to reconstruct social discourse describing real-world events. We use MERCURIAL to analyse 101,767 tweets from Italy, the first country to react to the COVID-19 threat with a nationwide lockdown. The data were collected between 11th and 17th March, immediately after the announcement of the Italian lockdown and the WHO declaring COVID-19 a pandemic. Our analysis provides unique insights into the psychological burden of this crisis, focussing on: (i) the Italian official campaign for self-quarantine (#iorestoacasa), (ii) national lockdown (#italylockdown), and (iii) social denunciation (#sciacalli). Our exploration unveils evidence for the emergence of complex emotional profiles, where anger and fear (towards political debates and socio-economic repercussions) coexisted with trust, solidarity, and hope (related to the institutions and local communities). We discuss our findings in relation to mental well-being issues and coping mechanisms, like instigation to violence, grieving, and solidarity. We argue that our framework represents an innovative thermometer of emotional status, a powerful tool for policy makers to quickly gauge feelings in massive audiences and devise appropriate responses based on cognitive data.

preprint, 2020, arXiv

Forma mentis networks reconstruct how Italian high schoolers and international STEM experts perceive teachers, students, scientists, and school

This study investigates how students and researchers shape their knowledge and perception of educational topics. The mindset or forma mentis of 159 Italian high school students and of 59 international researchers in STEM are reconstructed through forma mentis networks, i.e., cognitive networks of concepts connected by free associations and enriched with sentiment labels. The layout of conceptual associations between positively/negatively/neutrally perceived concepts is informative on how people build their own mental constructs or beliefs about specific topics. Researchers displayed mixed positive/neutral mental representations of "teacher", "student", and "scientist". Students' conceptual associations of "scientist" were highly positive and largely non-stereotypical, although links about the "mad scientist" stereotype persisted. Students perceived "teacher" as a complex figure, associated with positive aspects like mentoring/knowledge transmission but also with negative sides revolving around testing and grading. "School" elicited stronger differences between the two groups. In the students' mindset, "school" was surrounded by a negative emotional aura or set of associations, indicating an anxious perception of the school setting, mixing scholastic concepts, anxiety-eliciting words, STEM disciplines like maths and physics, and exam-related notions. Researchers' positive stance of "school" included concepts of fun, friendship, and personal growth instead. Along the perspective of Education Research, the above results are discussed as quantitative evidence for test- and STEM anxiety co-occurring in the way Italian students perceive education places and their actors. Detecting these patterns in student populations through forma mentis networks offers new, simple to gather yet detailed knowledge for future data-informed intervention policies and action research.

preprint, 2020, arXiv

Mapping computational thinking mindsets between educational levels with cognitive network science

Computational thinking is a way of reasoning about the world in terms of data. This mindset channels number crunching toward an ambition to discover knowledge through logic, models and simulations. Here we show how computational cognitive science can be used to reconstruct and analyse the structure of computational thinking mindsets (forma mentis in Latin) through complex networks. As a case study, we investigate cognitive networks tied to key concepts of computational thinking provided by: (i) 159 high school students enrolled in a science curriculum and (ii) 59 researchers in complex systems and simulations. Researchers' reconstructed forma mentis highlighted a positive mindset about scientific modelling, semantically framing data and simulations as ways of discovering nature. Students correctly identified different aspects of logic reasoning but perceived "computation" as a distressing, anxiety-eliciting task, framed with math jargon and lacking links to real-world discovery. Students' mindsets around "data", "model" and "simulations" critically revealed no awareness of numerical modelling as a way for understanding the world. Our findings provide evidence of a crippled computational thinking mindset in students, who acquire mathematical skills that are not channelled toward real-world discovery through coding. This unlinked knowledge ends up being perceived as distressing number-crunching expertise with no relevant outcome. The virtuous mindset of researchers reported here indicates that computational thinking can be restored by training students specifically in coding, modelling and simulations in relation to discovering nature. Our approach opens innovative ways for quantifying computational thinking and enhancing its development through mindset reconstruction.

preprint, 2020, arXiv

Multiplex networks quantify robustness of the mental lexicon to catastrophic concept failures, aphasic degradation and ageing

Concepts and their mental associations influence how language is processed and used. Networks represent powerful models for exploring such a cognitive system, known as the mental lexicon. This study investigates lexicon robustness to progressive word failure with multiplex network attacks. The average lexicon of an adult English speaker is built by considering 16000 words connected through semantic free associations and phonological sound similarities. Progressive structural degradation is modelled as random and targeted attacks. Words with higher psycholinguistic features (e.g. frequency, length, age of acquisition, polysemy) or network centrality (e.g. closeness, PageRank, betweenness and degree) are targeted first. Aphasia-inspired attacks are introduced here and target first words named correctly, more or less frequently, by patients with anomic aphasia, a pathology disrupting word finding. Robustness is measured as connectedness, fundamental for activation spreading and lexical retrieval, and viability, a multi-layer connectivity identifying language kernels. The lexicon is resilient to random, aphasia-inspired and psycholinguistic attacks. Catastrophic phase transitions happen when phonological and semantic degrees are combined, making the lexicon fragile to multidegree attacks. The viable kernel is fragile to multi-PageRank and to aphasia-inspired attacks. Consequently, connectedness in the lexicon is mediated by hubs, whereas viability enables a lexical semantic/phonological interplay and corresponds to a facilitative naming effect in aphasia. These effects persist also through ageing, in different network representations of younger and older lexicons. This study indicates the need to prevent failure of high multidegree and viable words in the mental lexicon when pursuing the design of effective language restoration strategies against cognitive impairment.
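
The contrast between random and targeted attacks described in this abstract can be sketched on a toy graph. This is a simplified single-layer illustration, not the paper's multiplex analysis: the graph, the node names, and the use of largest-component size as a stand-in for "connectedness" are assumptions.

```python
import random


def largest_component_size(adjacency, removed):
    """Size of the largest connected component once `removed` nodes are deleted."""
    seen, best = set(removed), 0
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:  # depth-first search over surviving nodes
            node = stack.pop()
            size += 1
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        best = max(best, size)
    return best


def attack(adjacency, order, n_removed):
    """Remove the first `n_removed` nodes of `order`; return what stays connected."""
    return largest_component_size(adjacency, order[:n_removed])


# Hypothetical toy network: one hub plus a short chain.
edge_list = [("hub", "a"), ("hub", "b"), ("hub", "c"),
             ("hub", "d"), ("d", "e"), ("e", "f")]
adjacency = {}
for u, v in edge_list:
    adjacency.setdefault(u, set()).add(v)
    adjacency.setdefault(v, set()).add(u)

# Targeted attack: highest-degree nodes first (ties broken alphabetically).
targeted_order = sorted(adjacency, key=lambda n: (-len(adjacency[n]), n))
# Random attack: a seeded shuffle of the same nodes.
random_order = sorted(adjacency)
random.Random(0).shuffle(random_order)

print(attack(adjacency, targeted_order, 0))  # 7
print(attack(adjacency, targeted_order, 1))  # 3
```

Removing the single hub shrinks the largest component from 7 to 3 nodes, whereas no other single removal in this toy graph does as much damage. A multidegree attack in the abstract's sense would rank nodes by degrees combined across layers, which this sketch does not attempt.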

preprint, 2016, arXiv

Mental Lexicon Growth Modelling Reveals the Multiplexity of the English Language

In this work we extend previous analyses of linguistic networks by adopting a multi-layer network framework for modelling the human mental lexicon, i.e. an abstract mental repository where words and concepts are stored together with their linguistic patterns. Across a three-layer linguistic multiplex, we model English words as nodes and connect them according to (i) phonological similarities, (ii) synonym relationships and (iii) free word associations. Our main aim is to exploit this multi-layered structure to explore the influence of phonological and semantic relationships on lexicon assembly over time. We propose a model of lexicon growth which is driven by the phonological layer: words are suggested according to different orderings of insertion (e.g. shorter word length, highest frequency, semantic multiplex features) and accepted or rejected subject to constraints. We then measure times of network assembly and compare these to empirical data about the age of acquisition of words. In agreement with empirical studies in psycholinguistics, our results provide quantitative evidence for the hypothesis that word acquisition is driven by features at multiple levels of organisation within language.

preprint, 2016, arXiv

Parasite Spreading in Spatial Ecological Multiplex Networks

Network ecology is a rising field of quantitative biology representing ecosystems as complex networks. A suitable example is parasite spreading: several parasites may be transmitted among their hosts through different mechanisms, each one giving rise to a network of interactions. Modelling these networked, ecological interactions at the same time is still an open challenge. We present a novel spatially-embedded multiplex network framework for modelling multi-host infection spreading through multiple routes of transmission. Our model is inspired by T. cruzi, a parasite transmitted by trophic and vectorial mechanisms. Our ecological network model is represented by a multiplex in which nodes represent species populations interacting through a food web and a parasite contaminative layer at the same time. We modelled an SI dynamics in two different scenarios: a simple theoretical food web and an empirical one. Our simulations in both scenarios show that the infection is more widespread when both the trophic and the contaminative interactions are considered with equal rates. This indicates that trophic and contaminative transmission may have additive effects in real ecosystems. We also find that the ratio of vectors-to-host in the community (i) crucially influences the infection spread, (ii) regulates a percolating phase transition in the rate of parasite transmission and (iii) increases the infection rate in hosts. By immunising the same fractions of predator and prey populations, we show that the multiplex topology is fundamental in outlining the role that each host species plays in parasite transmission in a given ecosystem. We also show that the multiplex models provide a richer phenomenology in terms of parasite spreading dynamics compared to mono-layer models. Our work opens new challenges and provides new quantitative tools for modelling multi-channel spreading in networked systems.
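
A minimal sketch of susceptible-infected (SI) spreading over two layers at once may help make the multiplex idea concrete. It assumes a synchronous update, symmetric links, and one transmission probability per layer; the node names and layer contents are hypothetical placeholders and are not the food webs studied in the paper.

```python
import random


def si_step(infected, layers, rates, rng):
    """One synchronous SI update across all layers.

    Each infected node tries to infect every susceptible neighbour on each
    layer, succeeding with that layer's transmission probability.
    """
    newly_infected = set()
    for layer, rate in zip(layers, rates):
        for node in infected:
            for neighbour in layer.get(node, ()):
                if neighbour not in infected and rng.random() < rate:
                    newly_infected.add(neighbour)
    return infected | newly_infected


def simulate(layers, rates, seed_node, steps, rng_seed=0):
    """Run `steps` SI updates from a single infected node."""
    rng = random.Random(rng_seed)
    infected = {seed_node}
    for _ in range(steps):
        infected = si_step(infected, layers, rates, rng)
    return infected


# Hypothetical two-layer multiplex over the same set of populations.
trophic_layer = {
    "vector": {"host_a"},
    "host_a": {"vector", "host_b"},
    "host_b": {"host_a"},
}
contaminative_layer = {
    "vector": {"host_c"},
    "host_c": {"vector"},
}
layers = [trophic_layer, contaminative_layer]

both = simulate(layers, rates=(1.0, 1.0), seed_node="vector", steps=2)
trophic_only = simulate(layers, rates=(1.0, 0.0), seed_node="vector", steps=5)
print(len(both), len(trophic_only))  # 4 3
```

With both layers active the infection reaches all four populations, while switching off the contaminative layer leaves `host_c` unreachable. The spatial embedding and vector-to-host ratios discussed in the abstract are outside the scope of this sketch.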

preprint, 2015, arXiv

Patterns in the English Language: Phonological Networks, Percolation and Assembly Models

In this paper we provide a quantitative framework for the study of phonological networks (PNs) for the English language by carrying out principled comparisons to null models, either based on site percolation, randomization techniques, or network growth models. In contrast to previous work, we mainly focus on null models that reproduce lower order characteristics of the empirical data. We find that artificial networks matching connectivity properties of the English PN are exceedingly rare: this leads to the hypothesis that the word repertoire might have been assembled over time by preferentially introducing new words which are small modifications of old words. Our null models are able to explain the "power-law-like" part of the degree distributions and generally retrieve qualitative features of the PN such as high clustering, high assortativity coefficient, and small-world characteristics. However, the detailed comparison to expectations from null models also points out significant differences, suggesting the presence of additional constraints in word assembly. Key constraints we identify are the avoidance of large degrees, the avoidance of triadic closure, and the avoidance of large non-percolating clusters.

preprint, 2014, arXiv

A k-deformed Model of Growing Complex Networks with Fitness

The Barabási-Bianconi (BB) fitness model can be solved by a mapping from the original network growth model to an idealized bosonic gas. The well-known transition to Bose-Einstein condensation in the latter then corresponds to the emergence of "super-hubs" in the network model. Motivated by the preservation of the scale-free property, thermodynamic stability and self-duality, we generalize the original extensive mapping of the BB fitness model by using the nonextensive Kaniadakis k-distribution. Through numerical simulation and mean-field calculations we show that deviations from extensivity do not compromise qualitative features of the phase transition. Analysis of the critical temperature yields a monotonically decreasing dependence on the nonextensive parameter k.
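
For orientation, the standard ingredients behind this abstract can be written out. The fitness-to-energy mapping and the Kaniadakis deformed exponential below are the textbook definitions; the final line, which swaps the ordinary exponential for the deformed one in the Bose occupation number, is only an assumed reading of "generalizing the mapping" and may differ from the paper's exact construction. The symbol $\kappa$ corresponds to the parameter written as k in the abstract.

```latex
% Fitness-to-energy mapping of the fitness model (inverse temperature \beta):
\varepsilon_i = -\frac{1}{\beta}\,\ln \eta_i

% Bose-Einstein occupation number of an energy level (chemical potential \mu):
n(\varepsilon) = \frac{1}{e^{\beta(\varepsilon - \mu)} - 1}

% Kaniadakis deformed exponential, which recovers e^{x} as \kappa \to 0:
\exp_\kappa(x) = \left(\sqrt{1 + \kappa^2 x^2} + \kappa x\right)^{1/\kappa}

% Assumed deformed occupation number (illustrative reading only):
n_\kappa(\varepsilon) = \frac{1}{\exp_\kappa\!\big(\beta(\varepsilon - \mu)\big) - 1}
```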
