Unsupervised learning for variability detection with Gaia DR3 photometry. The main sequence-white dwarf valley
The unprecedented volume and quality of data from space- and ground-based telescopes present an opportunity for machine learning to identify new classes of variable stars and peculiar systems that may have been overlooked by traditional methods. Extending prior methodological work, this study investigates whether an unsupervised learning approach can scale effectively to larger stellar populations, including objects in crowded fields, without the need for pre-selected catalogues. We focus on 13 405 Gaia DR3 sources lying in the main sequence-white dwarf valley of the colour-magnitude diagram (CMD). Our methodology relies on unsupervised clustering techniques based primarily on statistical features extracted from Gaia DR3 epoch photometry. We used the t-distributed stochastic neighbour embedding (t-SNE) algorithm to identify variability classes, their subtypes, and spurious variability induced by instrumental effects. The clustering results revealed distinct groups, including hot subdwarfs, cataclysmic variables (CVs), eclipsing binaries, and objects in crowded fields, such as those in the Andromeda (M31) field. Several potential stellar subtypes also emerged within these clusters. Notably, objects previously labelled as RR Lyrae were found in an unexpected region of the CMD, potentially due to either unreliable astrometric measurements (e.g., due to binarity) or alternative evolutionary pathways. These results highlight the robustness of the proposed method in finding variable objects, including variable hot subdwarfs and CVs, across a large region of the Gaia CMD, as well as its efficiency in detecting variability in extended stellar populations. The framework scales to large datasets and yields promising results in identifying stellar subclasses.
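As an illustration of what "statistical features extracted from epoch photometry" can look like, the following minimal sketch summarises a light curve with a handful of common variability statistics. It is not the feature set used in this study, which the abstract does not list. The function name `variability_features`, the chosen statistics, and the synthetic, regularly sampled light curves are all assumptions made for the example; real Gaia epoch photometry is sparse and irregularly sampled.

```python
import math
import random
import statistics


def variability_features(mags):
    """Summarise one light curve (time-ordered magnitudes) as a feature dict.

    Hypothetical helper: the statistics below are illustrative choices,
    not the features used in the study.
    """
    mean = statistics.fmean(mags)
    var = statistics.pvariance(mags, mu=mean)  # population variance
    std = math.sqrt(var)
    median = statistics.median(mags)
    # Median absolute deviation: a scatter estimate robust to outliers.
    mad = statistics.median([abs(m - median) for m in mags])
    # Robust amplitude: 95th minus 5th percentile.
    cuts = statistics.quantiles(mags, n=20)
    amplitude = cuts[-1] - cuts[0]
    # Skewness: asymmetry of the magnitude distribution (e.g. eclipses).
    skew = statistics.fmean([(m - mean) ** 3 for m in mags]) / std ** 3
    # von Neumann ratio: about 2 for uncorrelated noise, much smaller
    # when consecutive points follow a smooth variation.
    sq_diffs = [(b - a) ** 2 for a, b in zip(mags, mags[1:])]
    eta = statistics.fmean(sq_diffs) / var
    return {
        "median": median,
        "std": std,
        "mad": mad,
        "amplitude": amplitude,
        "skew": skew,
        "eta": eta,
    }


# Synthetic light curves (assumed values): 60 epochs, 0.01 mag noise.
rng = random.Random(42)
n_epochs = 60

# A constant star: pure Gaussian noise around 15 mag.
constant = [15.0 + rng.gauss(0.0, 0.01) for _ in range(n_epochs)]

# A sinusoidal variable: 0.3 mag semi-amplitude, 3-day period, 0.1-day cadence.
variable = [
    15.0 + 0.3 * math.sin(2.0 * math.pi * (0.1 * i) / 3.0) + rng.gauss(0.0, 0.01)
    for i in range(n_epochs)
]

const_features = variability_features(constant)
var_features = variability_features(variable)

print("constant:", const_features)
print("variable:", var_features)
```

Stacking one such feature vector per source gives a matrix that could then be standardised and passed to a t-SNE implementation, for example scikit-learn's `sklearn.manifold.TSNE`, to obtain a two-dimensional embedding in which sources with similar variability properties tend to lie close together.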