Researcher profile

Francesca Chiaromonte

Francesca Chiaromonte contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
8works
0followers
10topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

8 published item(s)

preprint2026arXiv

GravityGraphSAGE: Link Prediction in Directed Attributed Graphs

Link prediction (inferring missing or future connections between nodes in a graph) is a fundamental problem in network science with widespread applications in, e.g., biological systems, recommender systems, finance and cybersecurity. The ability to accurately predict links has significant real-world applications, such as detecting fraudulent financial transactions or identifying drug-target interactions in biomedicine. Despite a rich literature, link prediction is still challenging, especially for graphs enriched with information on edges (direction) and nodes (attributes). In fact, research on link prediction, especially the one based on Graph Deep Learning (GDL), has mostly focused on undirected graphs, without fully leveraging node attributes. Here, we fill this gap by proposing Gravity-GraphSAGE (GG-SAGE), a modified version of GraphSAGE, a GDL model for node embeddings, composed of a gravity-inspired decoder. This implementation is the first example in the literature of a GraphSAGE backbone adopted for directed link prediction. Using the benchmark datasets Cora, Citeseer, PubMed and 16 real-world graphs from the online Netzschleuder repository, we show that our proposed model outperforms state-of-the-art GDL link prediction techniques. Using further experimental evidence, we relate the quality of the output of our model with various characteristics of the graph, suggesting that our framework scales well when applied to data of increasing complexity.

preprint2026arXiv

RAwR: Role-Aware Rewiring via Approximate Equitable Partition

While Graph Neural Networks (GNNs) have demonstrated significant efficacy in node classification tasks, where predictions rely on local neighborhood information, the performance of GNNs often drops when prediction tasks depend on long-range interactions. These limitations are attributed to phenomena such as oversquashing, where structural bottlenecks restrict signal propagation across the network topology. To address this challenge, we introduce RAwR, a computationally efficient rewiring framework that augments the input graph with a quotient graph derived from equitable partitions. This approach facilitates accelerated communication between nodes that share identical structural roles, as identified by the Weisfeiler-Leman graph coloring, and thereby reduces the total effective resistance of the system. Furthermore, by employing an approximate definition of the equitable partition, RAwR enables a controllable reduction of the quotient graph, which, in its most condensed state, recovers the conventional Master Node rewiring technique. Empirical evaluations across a diverse suite of benchmarks -- including homophilic, heterophilic, and synthetic long-range datasets -- demonstrate that RAwR achieves state-of-the-art results. Our contribution is further supported by an analytical investigation using a teacher-student model of linear GNNs, which elucidates the theoretical foundations of role-based rewiring. This analysis leads to the formulation of Spectral Role Lift (SRL), a metric designed to identify the optimal approximate equitable partition for maximizing predictive performance.

preprint2023arXiv

Contrasting pre-vaccine COVID-19 waves in Italy through Functional Data Analysis

We use data from 107 Italian provinces to characterize and compare mortality patterns in the first two COVID-19 epidemic waves, which occurred prior to the introduction of vaccines. We also associate these patterns with mobility, timing of government restrictions, and socio-demographic, infrastructural, and environmental covariates. Notwithstanding limitations in the accuracy and reliability of publicly available data, we are able to exploit information in curves and shapes through Functional Data Analysis techniques. Specifically, we document differences in magnitude and variability between the two waves; while both were characterized by a co-occurrence of 'exponential' and 'mild' mortality patterns, the second spread much more broadly and asynchronously through the country. Moreover, we find evidence of a significant positive association between local mobility and mortality in both epidemic waves and corroborate the effectiveness of timely restrictions in curbing mortality. The techniques we describe could capture additional signals of interest if applied, for instance, to data on cases and positivity rates. However, we show that the quality of such data, at least in the case of Italian provinces, was too poor to support meaningful analyses.

preprint2022arXiv

Venture Capital investments through the lens of Network and Functional Data Analysis

In this paper we characterize the performance of venture capital-backed firms based on their ability to attract investment. The aim of the study is to identify relevant predictors of success built from the network structure of firms' and investors' relations. Focusing on deal-level data for the health sector, we first create a bipartite network among firms and investors, and then apply functional data analysis (FDA) to derive progressively more refined indicators of success captured by a binary, a scalar and a functional outcome. More specifically, we use different network centrality measures to capture the role of early investments for the success of the firm. Our results, which are robust to different specifications, suggest that success has a strong positive association with centrality measures of the firm and of its large investors, and a weaker but still detectable association with centrality measures of small investors and features describing firms as knowledge bridges. Finally, based on our analyses, success is not associated with firms' and investors' spreading power (harmonic centrality), nor with the tightness of investors' community (clustering coefficient) and spreading ability (VoteRank).

preprint2021arXiv

Can you always reap what you sow? Network and functional data analysis of VC investments in health-tech companies

"Success" of firms in venture capital markets is hard to define, and its determinants are still poorly understood. We build a bipartite network of investors and firms in the healthcare sector, describing its structure and its communities. Then, we characterize "success" introducing progressively more refined definitions, and we find a positive association between such definitions and the centrality of a company. In particular, we are able to cluster funding trajectories of firms into two groups capturing different "success" regimes and to link the probability of belonging to one or the other to their network features (in particular their centrality and the one of their investors). We further investigate this positive association by introducing scalar as well as functional "success" outcomes, confirming our findings and their robustness.

preprint2020arXiv

An Efficient Semi-smooth Newton Augmented Lagrangian Method for Elastic Net

Feature selection is an important and active research area in statistics and machine learning. The Elastic Net is often used to perform selection when the features present non-negligible collinearity or practitioners wish to incorporate additional known structure. In this article, we propose a new Semi-smooth Newton Augmented Lagrangian Method to efficiently solve the Elastic Net in ultra-high dimensional settings. Our new algorithm exploits both the sparsity induced by the Elastic Net penalty and the sparsity due to the second order information of the augmented Lagrangian. This greatly reduces the computational cost of the problem. Using simulations on both synthetic and real datasets, we demonstrate that our approach outperforms its best competitors by at least an order of magnitude in terms of CPU time. We also apply our approach to a Genome Wide Association Study on childhood obesity.

preprint2020arXiv

Give more data, awareness and control to individual citizens, and they will help COVID-19 containment

The rapid dynamics of COVID-19 calls for quick and effective tracking of virus transmission chains and early detection of outbreaks, especially in the phase 2 of the pandemic, when lockdown and other restriction measures are progressively withdrawn, in order to avoid or minimize contagion resurgence. For this purpose, contact-tracing apps are being proposed for large scale adoption by many countries. A centralized approach, where data sensed by the app are all sent to a nation-wide server, raises concerns about citizens' privacy and needlessly strong digital surveillance, thus alerting us to the need to minimize personal data collection and avoiding location tracking. We advocate the conceptual advantage of a decentralized approach, where both contact and location data are collected exclusively in individual citizens' "personal data stores", to be shared separately and selectively, voluntarily, only when the citizen has tested positive for COVID-19, and with a privacy preserving level of granularity. This approach better protects the personal sphere of citizens and affords multiple benefits: it allows for detailed information gathering for infected people in a privacy-preserving fashion; and, in turn this enables both contact tracing, and, the early detection of outbreak hotspots on more finely-granulated geographic scale. Our recommendation is two-fold. First to extend existing decentralized architectures with a light touch, in order to manage the collection of location data locally on the device, and allow the user to share spatio-temporal aggregates - if and when they want, for specific aims - with health authorities, for instance. Second, we favour a longer-term pursuit of realizing a Personal Data Store vision, giving users the opportunity to contribute to collective good in the measure they want, enhancing self-awareness, and cultivating collective efforts for rebuilding society.

preprint2019arXiv

On the bias of H-scores for comparing biclusters, and how to correct it

In the last two decades several biclustering methods have been developed as new unsupervised learning techniques to simultaneously cluster rows and columns of a data matrix. These algorithms play a central role in contemporary machine learning and in many applications, e.g. to computational biology and bioinformatics. The H-score is the evaluation score underlying the seminal biclustering algorithm by Cheng and Church, as well as many other subsequent biclustering methods. In this paper, we characterize a potentially troublesome bias in this score, that can distort biclustering results. We prove, both analytically and by simulation, that the average H-score increases with the number of rows/columns in a bicluster. This makes the H-score, and hence all algorithms based on it, biased towards small clusters. Based on our analytical proof, we are able to provide a straightforward way to correct this bias, allowing users to accurately compare biclusters.