Researcher profile

Pedro Silva

Pedro Silva contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust: 17 (Unverified)
Verification: L1
Unclaimed author
4 works
0 followers
6 topics
4 close collaborators

Published work

4 published items

Preprint, 2025, arXiv

Deep Learning for School Dropout Detection: A Comparison of Tabular and Graph-Based Models for Predicting At-Risk Students

Student dropout is a significant challenge in educational systems worldwide, leading to substantial social and economic costs. Predicting students at risk of dropout allows for timely interventions. While traditional Machine Learning (ML) models operating on tabular data have shown promise, Graph Neural Networks (GNNs) offer a potential advantage by capturing complex relationships inherent in student data if structured as graphs. This paper investigates whether transforming tabular student data into graph structures, primarily using clustering techniques, enhances dropout prediction accuracy. We compare the performance of GNNs (a custom Graph Convolutional Network (GCN) and GraphSAGE) on these generated graphs against established tabular models (Random Forest (RF), XGBoost, and TabNet) using a real-world student dataset. Our experiments explore various graph construction strategies based on different clustering algorithms (K-Means, HDBSCAN) and dimensionality reduction techniques (Principal Component Analysis (PCA), Uniform Manifold Approximation and Projection (UMAP)). Our findings demonstrate that a specific GNN configuration, GraphSAGE on a graph derived from PCA-KMeans clustering, achieved superior performance, notably improving the macro F1-score by approximately 7 percentage points and accuracy by nearly 2 percentage points over the strongest tabular baseline (XGBoost). However, other GNN configurations and graph construction methods did not consistently surpass tabular models, emphasizing the critical role of the graph generation strategy and GNN architecture selection. This highlights both the potential of GNNs and the challenges in optimally transforming tabular data for graph-based learning in this domain.

Preprint, 2022, arXiv

A Decidability-Based Loss Function

Deep learning is now the standard approach for a wide range of problems, including biometric tasks such as face recognition and speech recognition. Biometric problems often use deep learning models to extract features from images, also known as embeddings. Moreover, the loss function used during training strongly influences the quality of the generated embeddings. In this work, a loss function based on the decidability index is proposed to improve the quality of embeddings for the verification routine. Our proposal, the D-loss, avoids some disadvantages of Triplet-based losses, such as the use of hard samples and tricky parameter tuning, which can lead to slow convergence. The proposed approach is compared against the Softmax (cross-entropy), Triplets Soft-Hard, and the Multi Similarity losses on four different benchmarks: MNIST, Fashion-MNIST, CIFAR10 and CASIA-IrisV4. The achieved results show the efficacy of the proposal when compared to other popular metrics in the literature. The D-loss computation, besides being simple, non-parametric and easy to implement, favors both the inter-class and intra-class scenarios.
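For readers unfamiliar with the decidability index mentioned in the abstract above, the following is a minimal sketch of how that index is commonly defined in the biometrics literature: the absolute difference between the mean genuine and mean impostor scores, divided by the square root of the average of their variances. It is not the paper's D-loss implementation. The function name `decidability`, the use of population variance, and the sample scores are assumptions made for illustration.

```python
from math import sqrt
from statistics import fmean, pvariance


def decidability(genuine, impostor):
    """Decidability index d' between two score distributions.

    genuine:  scores for same-identity pairs (hypothetical input)
    impostor: scores for different-identity pairs (hypothetical input)
    Population variance is an assumption; a sample variance could be used instead.
    """
    mu_g, mu_i = fmean(genuine), fmean(impostor)
    var_g, var_i = pvariance(genuine), pvariance(impostor)
    # Larger values mean the two distributions are better separated.
    return abs(mu_g - mu_i) / sqrt((var_g + var_i) / 2)


# Made-up distance scores, for illustration only.
genuine_scores = [0.1, 0.2, 0.3]
impostor_scores = [0.7, 0.8, 0.9]
print(decidability(genuine_scores, impostor_scores))
```

The function raises a ZeroDivisionError if both score lists have zero variance, so a real training loop would likely need a small stabilizing term in the denominator.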

Preprint, 2020, arXiv

ARCHI: pipeline for light curve extraction of CHEOPS background star

High precision time series photometry from space is being used for a number of scientific cases. In this context, the recently launched CHEOPS (ESA) mission promises to bring 20 ppm precision over an exposure time of 6 hours, when targeting nearby bright stars, having in mind the detailed characterization of exoplanetary systems through transit measurements. However, the official CHEOPS (ESA) mission pipeline only provides photometry for the main target (the central star in the field). In order to explore the potential of CHEOPS photometry for all stars in the field, in this paper we present archi, an additional open-source pipeline module to analyse the background stars present in the image. As archi uses the official Data Reduction Pipeline data as input, it is not meant to be used as an independent tool to process raw CHEOPS data but, instead, to be used as an add-on to the official pipeline. We test archi using CHEOPS simulated images, and show that photometry of background stars in CHEOPS images is only slightly degraded (by a factor of 2 to 3) with respect to the main target. This opens a potential for the use of CHEOPS to produce photometric time series of several close-by targets at once, as well as to use different stars in the image to calibrate systematic errors. We also show one clear scientific application where the study of the companion light curve can be important for the understanding of the contamination on the main target.

Preprint, 2017, arXiv

SpArcFiRe: morphological selection effects due to reduced visibility of tightly winding arms in distant spiral galaxies

The Galaxy Zoo has provided morphological data on many galaxies. Several biases have been identified in the Galaxy Zoo data. Here we report on a newly discovered selection effect: astronomers interested in studying spiral galaxies may select a set of spiral galaxies based upon a threshold in spirality (the fraction of Galaxy Zoo humans who report seeing spiral structure). SpArcFiRe is an automated tool that decomposes a spiral galaxy into its constituent spiral arms, providing objective, quantitative data on their structure. SpArcFiRe measures the pitch angle of spiral arms. We have observed that when selecting a set of spiral galaxies based on a threshold on spirality, the pitch angle of spiral arms appears to increase with redshift. We hypothesize that this is a selection effect: tightly-wound spiral arms become less visible as images degrade with increasing redshift, leading to fewer such galaxies being included in the sample at higher redshifts. We corroborate this hypothesis by artificially degrading images of nearby galaxies, then using a machine learning algorithm trained on Galaxy Zoo data to provide a spirality for each artificially degraded image. It correctly predicts that spirality decreases as image quality degrades. Thus, the mean pitch angle of those galaxies remaining above the spirality threshold is higher than that of those eliminated by the selection effect. This demonstrates that users who select samples of galaxies using a threshold of Galaxy Zoo votes must carefully consider the possibility of selection effects on morphological measures, even if the measure itself is believed to be objective and unbiased. Finally, we also perform an empirical sensitivity analysis to demonstrate that SpArcFiRe's output changes in a smooth and predictable fashion in response to changes in its internal algorithmic parameters.