Source author record

Luca Biggio

Luca Biggio appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Machine Learning, Artificial Intelligence, astro-ph.CO

Catalog footprint

What is connected

3 works

3 topics

4 close collaborators

Preprint, 2022, arXiv

Dynaformer: A Deep Learning Model for Ageing-aware Battery Discharge Prediction

Electrochemical batteries are ubiquitous devices in our society. When they are employed in mission-critical applications, the ability to precisely predict the end of discharge under highly variable environmental and operating conditions is of paramount importance in order to support operational decision-making. While there are accurate predictive models of the processes underlying the charge and discharge phases of batteries, the modelling of ageing and its effect on performance remains poorly understood. Such a lack of understanding often leads to inaccurate models or the need for time-consuming calibration procedures whenever the battery ages or its conditions change significantly. This represents a major obstacle to the real-world deployment of efficient and robust battery management systems. In this paper, we propose for the first time an approach that can predict the voltage discharge curve for batteries of any degradation level without the need for calibration. In particular, we introduce Dynaformer, a novel Transformer-based deep learning architecture which is able to simultaneously infer the ageing state from a limited number of voltage/current samples and predict the full voltage discharge curve for real batteries with high precision. Our experiments show that the trained model is effective for input current profiles of different complexities and is robust to a wide range of degradation levels. In addition to evaluating the performance of the proposed framework on simulated data, we demonstrate that a minimal amount of fine-tuning allows the model to bridge the gap between simulations and real data collected from a set of batteries. The proposed methodology enables the utilization of battery-powered systems until the end of discharge in a controlled and predictable way, thereby significantly prolonging the operating cycles and reducing costs.

Preprint, 2022, arXiv

Fast emulation of two-point angular statistics for photometric galaxy surveys

We develop a set of machine-learning-based cosmological emulators to obtain fast model predictions for the $C(\ell)$ angular power spectrum coefficients characterising tomographic observations of galaxy clustering and weak gravitational lensing from multi-band photometric surveys (and their cross-correlation). A set of neural networks is trained to map cosmological parameters into the coefficients, achieving a speed-up of $\mathcal{O}(10^3)$ in computing the required statistics for a given set of cosmological parameters, with respect to standard Boltzmann solvers, with an accuracy better than $0.175\%$ ($<0.1\%$ for the weak lensing case). This corresponds to $\sim 2\%$ or less of the statistical error bars expected from a typical Stage IV photometric survey. This overall improvement in speed and accuracy is obtained through ($\textit{i}$) a specific pre-processing optimisation, ahead of the training phase, and ($\textit{ii}$) a more effective neural network architecture, compared to previous implementations.

Preprint, 2022, arXiv

Signal Propagation in Transformers: Theoretical Perspectives and the Role of Rank Collapse

Transformers have achieved remarkable success in several domains, ranging from natural language processing to computer vision. Nevertheless, it has been recently shown that stacking self-attention layers - the distinctive architectural component of Transformers - can result in rank collapse of the tokens' representations at initialization. The question of if and how rank collapse affects training is still largely unanswered, and its investigation is necessary for a more comprehensive understanding of this architecture. In this work, we shed new light on the causes and the effects of this phenomenon. First, we show that rank collapse of the tokens' representations hinders training by causing the gradients of the queries and keys to vanish at initialization. Furthermore, we provide a thorough description of the origin of rank collapse and discuss how to prevent it via an appropriate depth-dependent scaling of the residual branches. Finally, our analysis unveils that specific architectural hyperparameters affect the gradients of queries and values differently, leading to disproportionate gradient norms. This suggests an explanation for the widespread use of adaptive methods for Transformers' optimization.
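
The rank collapse described in this abstract can be illustrated with a toy experiment. The sketch below is not the paper's code: it assumes single-head softmax attention with random Gaussian weights, no layer normalisation and no MLP blocks, and every name in it (`attention`, `rank_residual`, `run_stack`) is hypothetical. The residual scale of 1/sqrt(depth) is one illustrative depth-dependent choice, not necessarily the scaling the paper derives.

```python
import numpy as np

def softmax(x):
    # Row-wise softmax, shifted by the row maximum for numerical stability.
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def attention(X, rng):
    # One single-head self-attention layer with freshly drawn random weights
    # (a stand-in for "at initialization"; an assumption of this sketch).
    n, d = X.shape
    Wq, Wk, Wv = (rng.normal(0.0, 1.0 / np.sqrt(d), size=(d, d)) for _ in range(3))
    scores = (X @ Wq) @ (X @ Wk).T / np.sqrt(d)
    return softmax(scores) @ (X @ Wv)

def rank_residual(X):
    # Relative distance of X from the matrix whose rows all equal the mean
    # token. A value near 0 means all tokens are nearly identical, so X is
    # close to rank one.
    mean_row = X.mean(axis=0, keepdims=True)
    return np.linalg.norm(X - mean_row) / np.linalg.norm(X)

def run_stack(depth, residual_scale=None, n_tokens=16, d_model=32, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_tokens, d_model))
    for _ in range(depth):
        out = attention(X, rng)
        # Without a skip connection the layer output replaces X entirely;
        # with one, the attention branch is added after being scaled down.
        X = out if residual_scale is None else X + residual_scale * out
    return rank_residual(X)

depth = 12
collapsed = run_stack(depth)
scaled = run_stack(depth, residual_scale=1.0 / np.sqrt(depth))
print(f"no residual:     {collapsed:.2e}")
print(f"scaled residual: {scaled:.2e}")
```

In this toy setting, the stack without skip connections drives all tokens towards a common vector, while the scaled residual branch keeps the token representations distinct. It does not reproduce the gradient analysis for queries, keys and values summarised in the abstract.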
