Source author record

Alberto Testolin

Alberto Testolin appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning Artificial Intelligence Networking and Internet Architecture Biological Physics Information Theory math.IT nlin.AO Robotics

Catalog footprint

What is connected

7works

8topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Sequential Enumeration in Large Language Models

Reliably counting and generating sequences of items remain a significant challenge for neural networks, including Large Language Models (LLMs). Indeed, although this capability is readily handled by rule-based symbolic systems based on serial computation, learning to systematically deploy counting procedures is difficult for neural models, which should acquire these skills through learning. Previous research has demonstrated that recurrent architectures can only approximately track and enumerate sequences of events, and it remains unclear whether modern deep learning systems, including LLMs, can deploy systematic counting procedures over sequences of discrete symbols. This paper aims to fill this gap by investigating the sequential enumeration abilities of five state-of-the-art LLMs, including proprietary, open-source, and reasoning models. We probe LLMs in sequential naming and production tasks involving lists of letters and words, adopting a variety of prompting instructions to explore the role of chain-of-thought in the spontaneous emerging of counting strategies. We also evaluate open-source models with the same architecture but increasing size to see whether the mastering of counting principles follows scaling laws, and we analyze the embedding dynamics during sequential enumeration to investigate the emergent encoding of numerosity. We find that some LLMs are indeed capable of deploying counting procedures when explicitly prompted to do so, but none of them spontaneously engage in counting when simply asked to enumerate the number of items in a sequence. Our results suggest that, despite their impressive emergent abilities, LLMs cannot yet robustly and systematically deploy counting procedures, highlighting a persistent gap between neural and symbolic approaches to compositional generalization.

preprint2023arXiv

Learning to solve arithmetic problems with a virtual abacus

Acquiring mathematical skills is considered a key challenge for modern Artificial Intelligence systems. Inspired by the way humans discover numerical knowledge, here we introduce a deep reinforcement learning framework that allows to simulate how cognitive agents could gradually learn to solve arithmetic problems by interacting with a virtual abacus. The proposed model successfully learn to perform multi-digit additions and subtractions, achieving an error rate below 1% even when operands are much longer than those observed during training. We also compare the performance of learning agents receiving a different amount of explicit supervision, and we analyze the most common error patterns to better understand the limitations and biases resulting from our design choices.

preprint2022arXiv

Transformers discover an elementary calculation system exploiting local attention and grid-like problem representation

Mathematical reasoning is one of the most impressive achievements of human intellect but remains a formidable challenge for artificial intelligence systems. In this work we explore whether modern deep learning architectures can learn to solve a symbolic addition task by discovering effective arithmetic procedures. Although the problem might seem trivial at first glance, generalizing arithmetic knowledge to operations involving a higher number of terms, possibly composed by longer sequences of digits, has proven extremely challenging for neural networks. Here we show that universal transformers equipped with local attention and adaptive halting mechanisms can learn to exploit an external, grid-like memory to carry out multi-digit addition. The proposed model achieves remarkable accuracy even when tested with problems requiring extrapolation outside the training distribution; most notably, it does so by discovering human-like calculation strategies such as place value alignment.

preprint2021arXiv

Distributed Reinforcement Learning for Flexible and Efficient UAV Swarm Control

Over the past few years, the use of swarms of Unmanned Aerial Vehicles (UAVs) in monitoring and remote area surveillance applications has become widespread thanks to the price reduction and the increased capabilities of drones. The drones in the swarm need to cooperatively explore an unknown area, in order to identify and monitor interesting targets, while minimizing their movements. In this work, we propose a distributed Reinforcement Learning (RL) approach that scales to larger swarms without modifications. The proposed framework relies on the possibility for the UAVs to exchange some information through a communication channel, in order to achieve context-awareness and implicitly coordinate the swarm's actions. Our experiments show that the proposed method can yield effective strategies, which are robust to communication channel impairments, and that can easily deal with non-uniform distributions of targets and obstacles. Moreover, when agents are trained in a specific scenario, they can adapt to a new one with minimal additional training. We also show that our approach achieves better performance compared to a computationally intensive look-ahead heuristic.

preprint2021arXiv

Enabling Simulation-Based Optimization Through Machine Learning: A Case Study on Antenna Design

Complex phenomena are generally modeled with sophisticated simulators that, depending on their accuracy, can be very demanding in terms of computational resources and simulation time. Their time-consuming nature, together with a typically vast parameter space to be explored, make simulation-based optimization often infeasible. In this work, we present a method that enables the optimization of complex systems through Machine Learning (ML) techniques. We show how well-known learning algorithms are able to reliably emulate a complex simulator with a modest dataset obtained from it. The trained emulator is then able to yield values close to the simulated ones in virtually no time. Therefore, it is possible to perform a global numerical optimization over the vast multi-dimensional parameter space, in a fraction of the time that would be required by a simple brute-force search. As a testbed for the proposed methodology, we used a network simulator for next-generation mmWave cellular systems. After simulating several antenna configurations and collecting the resulting network-level statistics, we feed it into our framework. Results show that, even with few data points, extrapolating a continuous model makes it possible to estimate the global optimum configuration almost instantaneously. The very same tool can then be used to achieve any further optimization goal on the same input parameters in negligible time.

preprint2021arXiv

Machine Learning-aided Design of Thinned Antenna Arrays for Optimized Network Level Performance

With the advent of millimeter wave (mmWave) communications, the combination of a detailed 5G network simulator with an accurate antenna radiation model is required to analyze the realistic performance of complex cellular scenarios. However, due to the complexity of both electromagnetic and network models, the design and optimization of antenna arrays is generally infeasible due to the required computational resources and simulation time. In this paper, we propose a Machine Learning framework that enables a simulation-based optimization of the antenna design. We show how learning methods are able to emulate a complex simulator with a modest dataset obtained from it, enabling a global numerical optimization over a vast multi-dimensional parameter space in a reasonable amount of time. Overall, our results show that the proposed methodology can be successfully applied to the optimization of thinned antenna arrays.

preprint2019arXiv

Emergence of Network Motifs in Deep Neural Networks

Network science can offer fundamental insights into the structural and functional properties of complex systems. For example, it is widely known that neuronal circuits tend to organize into basic functional topological modules, called "network motifs". In this article we show that network science tools can be successfully applied also to the study of artificial neural networks operating according to self-organizing (learning) principles. In particular, we study the emergence of network motifs in multi-layer perceptrons, whose initial connectivity is defined as a stack of fully-connected, bipartite graphs. Our simulations show that the final network topology is primarily shaped by learning dynamics, but can be strongly biased by choosing appropriate weight initialization schemes. Overall, our results suggest that non-trivial initialization strategies can make learning more effective by promoting the development of useful network motifs, which are often surprisingly consistent with those observed in general transduction networks.

Alberto Testolin

What is connected

Connect this record

See the researcher in context

Building this map preview

7 published item(s)

Sequential Enumeration in Large Language Models

Learning to solve arithmetic problems with a virtual abacus

Transformers discover an elementary calculation system exploiting local attention and grid-like problem representation

Distributed Reinforcement Learning for Flexible and Efficient UAV Swarm Control

Enabling Simulation-Based Optimization Through Machine Learning: A Case Study on Antenna Design

Machine Learning-aided Design of Thinned Antenna Arrays for Optimized Network Level Performance

Emergence of Network Motifs in Deep Neural Networks