Researcher profile

Michele Magno

Michele Magno contributes to research discovery and scholarly infrastructure.

Researcher, affiliation not imported, open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
11 works
0 followers
14 topics
4 close collaborators

Published work

11 published item(s)

preprint, 2026, arXiv

Cracks in the Foundation: A Civil Infrastructure Dataset to Challenge Vision Foundation Models

Automated structural health monitoring is essential to prevent catastrophic infrastructure failures. Precise, pixel-level defect segmentation is needed to accurately assess structural integrity, but progress in defect segmentation for civil infrastructures has been held back by an extreme scarcity of data, which requires costly expert annotation. The need for data is accentuated by algorithmic hurdles intrinsic to the problem, including center-bias and the need to rely more on shape when inspecting nearly textureless building materials. To remove the bottleneck, we introduce Cracks in the Foundation (CiF), the largest and most detailed civil infrastructure (instance) segmentation dataset to date, comprising $\approx$150,000 high-resolution images meticulously curated over five years in collaboration with civil engineering experts. With the help of this unprecedented data source, we expose a blind spot of current visual AI: despite the advent of promptable Foundation Models (FMs) and Vision Language Models (VLMs), and despite the impressive abilities of today's specialised segmentation models, it turns out that dense image understanding in the built environment is nowhere near solved. Our evaluations indicate that even the most recent zero-shot FMs face significant challenges when deployed on real-world infrastructure and even the performance of specialised models with domain-specific supervision plateaus at $\approx$25% mAP. CiF establishes inspection of civil infrastructure, an elementary and seemingly easy perceptual task, as an open challenge that reveals fundamental weaknesses of present-day models trained predominantly on internet images, literally and figuratively highlighting cracks in the current foundation model paradigm.

preprint, 2026, arXiv

Eco-WakeLoc: An Energy-Neutral and Cooperative UWB Real-Time Locating System

Indoor localization systems face a fundamental trade-off between efficiency and responsiveness, which is especially important for emerging use cases such as mobile robots operating in GPS-denied environments. Traditional RTLS either require continuously powered infrastructure, limiting their scalability, or are limited in their responsiveness. This work presents Eco-WakeLoc, designed to achieve centimeter-level UWB localization while remaining energy-neutral by combining ultra-low power wake-up radios (WuRs) with solar energy harvesting. By activating anchor nodes only on demand, the proposed system eliminates constant energy consumption while achieving centimeter-level positioning accuracy. To reduce coordination overhead and improve scalability, Eco-WakeLoc employs cooperative localization where active tags initiate ranging exchanges (trilateration), while passive tags opportunistically reuse these messages for TDOA positioning. An additive-increase/multiplicative-decrease (AIMD)-based energy-aware scheduler adapts localization rates according to the harvested energy, thereby maximizing the overall performance of the sensor network while ensuring long-term energy neutrality. The measured energy consumption is only 3.22 mJ per localization for active tags, 951 µJ for passive tags, and 353 µJ for anchors. Real-world deployment on a quadruped robot with nine anchors confirms the practical feasibility, achieving an average accuracy of 43 cm in dynamic indoor environments. Year-long simulations show that tags achieve an average of 2031 localizations per day, retaining over 7% battery capacity after one year, demonstrating that the RTLS achieves sustained energy-neutral operation. Eco-WakeLoc demonstrates that high-accuracy indoor localization can be achieved at scale without continuous infrastructure operation, combining energy neutrality, cooperative positioning, and adaptive scheduling.
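
The AIMD idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's scheduler: the function name `aimd_rate`, the step size, the decrease factor, the rate limits, and the per-period energy figures are all hypothetical values chosen for illustration. The only assumption carried over from the abstract is the rule itself: raise the localization rate additively while harvested energy covers consumption, and cut it multiplicatively otherwise.

```python
def aimd_rate(rate_hz, harvested_j, consumed_j,
              add_step_hz=0.05, decrease_factor=0.5,
              min_rate_hz=0.01, max_rate_hz=5.0):
    """Return the next localization rate for one scheduling period.

    All parameter defaults are illustrative assumptions, not values
    taken from the Eco-WakeLoc paper.
    """
    if harvested_j >= consumed_j:
        # Energy surplus: probe for a slightly higher rate (additive increase).
        next_rate = rate_hz + add_step_hz
    else:
        # Energy deficit: back off sharply (multiplicative decrease).
        next_rate = rate_hz * decrease_factor
    # Keep the rate inside the allowed range.
    return min(max_rate_hz, max(min_rate_hz, next_rate))


# Three surplus periods followed by one deficit period (made-up energies in joules).
rate = 1.0
history = []
for harvested, consumed in [(5e-3, 3e-3), (5e-3, 3e-3), (5e-3, 3e-3), (1e-3, 3e-3)]:
    rate = aimd_rate(rate, harvested, consumed)
    history.append(round(rate, 3))
print(history)
```

The slow ramp and fast back-off mean a node spends most of its time slightly below the sustainable rate, which is one plausible reason such a rule suits energy-neutral operation.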

preprint, 2023, arXiv

An Accurate EEGNet-based Motor-Imagery Brain-Computer Interface for Low-Power Edge Computing

This paper presents an accurate and robust embedded motor-imagery brain-computer interface (MI-BCI). The proposed novel model, based on EEGNet, matches the requirements of memory footprint and computational resources of low-power microcontroller units (MCUs), such as the ARM Cortex-M family. Furthermore, the paper presents a set of methods, including temporal downsampling, channel selection, and narrowing of the classification window, to further scale down the model to relax memory requirements with negligible accuracy degradation. Experimental results on the Physionet EEG Motor Movement/Imagery Dataset show that standard EEGNet achieves 82.43%, 75.07%, and 65.07% classification accuracy on 2-, 3-, and 4-class MI tasks in global validation, outperforming the state-of-the-art (SoA) convolutional neural network (CNN) by 2.05%, 5.25%, and 5.48%. Our novel method further scales down the standard EEGNet at a negligible accuracy loss of 0.31% with 7.6x memory footprint reduction and a small accuracy loss of 2.51% with 15x reduction. The scaled models are deployed on a commercial Cortex-M4F MCU taking 101 ms and consuming 4.28 mJ per inference for operating the smallest model, and on a Cortex-M7 with 44 ms and 18.1 mJ per inference for the medium-sized model, enabling a fully autonomous, wearable, and accurate low-power BCI.
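
The three input-reduction steps named in this abstract (temporal downsampling, channel selection, and a narrower classification window) can be sketched on a toy array. This is an illustration only: `shrink_eeg_window` and its parameters are hypothetical, the data is synthetic, and the plain decimation used here omits the anti-aliasing filter that a real EEG pipeline would normally apply before downsampling.

```python
def shrink_eeg_window(samples, keep_channels, downsample_factor, window_len):
    """Reduce an EEG window given as a list of channels (lists of samples).

    Hypothetical helper: selects channels, truncates the window, then keeps
    every `downsample_factor`-th sample (naive decimation, no filtering).
    """
    reduced = []
    for ch in keep_channels:
        windowed = samples[ch][:window_len]           # narrower classification window
        reduced.append(windowed[::downsample_factor])  # temporal downsampling
    return reduced


# Synthetic input: 4 channels x 8 time samples.
eeg = [[c * 10 + t for t in range(8)] for c in range(4)]
small = shrink_eeg_window(eeg, keep_channels=[0, 2], downsample_factor=2, window_len=6)

values_before = sum(len(ch) for ch in eeg)
values_after = sum(len(ch) for ch in small)
print(small, values_before, values_after)
```

Because the network's input buffers and early feature maps scale with the number of channels and time samples, shrinking the input in this way is one route to the lower memory requirements the abstract describes.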

preprint, 2023, arXiv

Exploring Automatic Gym Workouts Recognition Locally On Wearable Resource-Constrained Devices

Automatic gym activity recognition on energy- and resource-constrained wearable devices removes the need for human interaction, such as soft-touch tapping and swiping, during intense gym sessions. This work presents a tiny and highly accurate residual convolutional neural network that runs on milliwatt-class microcontrollers for automatic workout classification. We evaluated the inference performance of the deep model with quantization on three resource-constrained devices: two microcontrollers with ARM Cortex-M4 and Cortex-M7 cores from STMicroelectronics, and a GAP8 system on chip, which is an open-source, multi-core RISC-V computing platform from GreenWaves Technologies. Experimental results show an accuracy of up to 90.4% for the recognition of eleven workouts with full-precision inference. The paper also presents the performance trade-offs of the resource-constrained system. While retaining a recognition accuracy of 88.1% with minimal loss, each inference takes only 3.2 ms on GAP8, benefiting from the 8 RISC-V cluster cores. We measured an execution time that is 18.9x and 6.5x faster than on the Cortex-M4 and Cortex-M7 cores, respectively, showing the feasibility of real-time on-board workout recognition based on the described data set with a 20 Hz sampling rate. The energy consumed for each inference on GAP8 is 0.41 mJ, compared to 5.17 mJ on Cortex-M4 and 8.07 mJ on Cortex-M7 at the maximum clock frequency. This can lead to longer battery life when the system is battery-operated. We also introduce a publicly available open data set composed of fifty sessions of eleven gym workouts collected from ten subjects.
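
The figures quoted in this abstract can be combined with a few lines of arithmetic. The sketch below derives values that the abstract does not state directly: the implied Cortex-M latencies assume that the 18.9x and 6.5x speedups are measured against the same 3.2 ms GAP8 inference, and the energy ratios are simple quotients of the quoted per-inference energies. The variable names are illustrative.

```python
# Figures quoted in the abstract.
gap8_latency_ms = 3.2
speedup_vs = {"Cortex-M4": 18.9, "Cortex-M7": 6.5}
energy_mj = {"GAP8": 0.41, "Cortex-M4": 5.17, "Cortex-M7": 8.07}

# Derived values (assumption: speedups refer to the same 3.2 ms inference).
implied_latency_ms = {core: gap8_latency_ms * s for core, s in speedup_vs.items()}
energy_ratio = {core: energy_mj[core] / energy_mj["GAP8"] for core in speedup_vs}

for core in speedup_vs:
    print(f"{core}: ~{implied_latency_ms[core]:.1f} ms per inference, "
          f"{energy_ratio[core]:.1f}x the GAP8 energy")
```

Under these assumptions, the Cortex-M7 is faster than the Cortex-M4 but costs more energy per inference, while GAP8 is ahead on both axes.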

preprint, 2023, arXiv

HR-SAR-Net: A Deep Neural Network for Urban Scene Segmentation from High-Resolution SAR Data

Synthetic aperture radar (SAR) data is becoming increasingly available to a wide range of users through commercial service providers with resolutions reaching 0.5 m/px. Segmenting SAR data still requires skilled personnel, limiting the potential for large-scale use. We show that it is possible to automatically and reliably perform urban scene segmentation from next-gen resolution SAR data (0.15 m/px) using deep neural networks (DNNs), achieving a pixel accuracy of 95.19% and a mean IoU of 74.67% with data collected over a region of merely 2.2 km$^2$. The presented DNN is not only effective, but is very small with only 63k parameters and computationally simple enough to achieve a throughput of around 500 Mpx/s using a single GPU. We further identify that additional SAR receive antennas and data from multiple flights massively improve the segmentation accuracy. We describe a procedure for generating a high-quality segmentation ground truth from multiple inaccurate building and road annotations, which has been crucial to achieving these segmentation results.
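
The two metrics reported in this abstract, pixel accuracy and mean IoU, are commonly computed from a per-class confusion matrix. The sketch below shows the standard definitions on a made-up two-class matrix; it is not the paper's evaluation code, and the function names and the example counts are assumptions for illustration.

```python
def pixel_accuracy(conf):
    """Fraction of pixels on the diagonal of conf[true_class][predicted_class]."""
    total = sum(sum(row) for row in conf)
    correct = sum(conf[i][i] for i in range(len(conf)))
    return correct / total


def mean_iou(conf):
    """Average of per-class intersection-over-union, skipping absent classes."""
    n = len(conf)
    ious = []
    for c in range(n):
        tp = conf[c][c]
        fn = sum(conf[c]) - tp                        # true class c, predicted otherwise
        fp = sum(conf[r][c] for r in range(n)) - tp   # predicted c, true class differs
        union = tp + fp + fn
        if union > 0:
            ious.append(tp / union)
    return sum(ious) / len(ious)


# Made-up confusion matrix for two classes (rows: truth, columns: prediction).
conf = [[50, 10],
        [5, 35]]
print(pixel_accuracy(conf), mean_iou(conf))
```

Mean IoU penalizes errors on small classes more than pixel accuracy does, which is consistent with the gap between the two figures quoted in the abstract.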

preprint, 2023, arXiv

InfiniWolf: Energy Efficient Smart Bracelet for Edge Computing with Dual Source Energy Harvesting

This work presents InfiniWolf, a novel multi-sensor smartwatch that can achieve self-sustainability by exploiting thermal and solar energy harvesting while performing computationally demanding tasks. The smartwatch embeds both a System-on-Chip (SoC) with an ARM Cortex-M processor and Bluetooth Low Energy (BLE) and Mr. Wolf, an open-hardware RISC-V based parallel ultra-low-power processor that boosts the on-board processing capabilities by more than one order of magnitude, while also increasing energy efficiency. We demonstrate its functionality based on a sample application scenario performing stress detection with multi-layer artificial neural networks on a wearable multi-sensor bracelet. Experimental results show the benefits in terms of energy efficiency and latency of Mr. Wolf over an ARM Cortex-M4F microcontroller and the possibility, under specific assumptions, to be self-sustainable using thermal and solar energy harvesting while performing up to 24 stress classifications per minute in indoor conditions.

preprint, 2022, arXiv

Aerosense: A Self-Sustainable And Long-Range Bluetooth Wireless Sensor Node for Aerodynamic and Aeroacoustic Monitoring on Wind Turbines

This paper presents a low-power, self-sustainable, and modular wireless sensor node for aerodynamic and acoustic measurements on wind turbines and other industrial structures. It includes 40 high-accuracy barometers, 10 microphones, 5 differential pressure sensors, and implements a lossy and a lossless on-board data compression algorithm to decrease the transmission energy cost. The wireless transmitter is based on Bluetooth Low Energy 5.1 tuned for long-range and high throughput while maintaining adequate per-bit energy efficiency (80 nJ). Moreover, we field-assessed the node capability to collect precise and accurate aerodynamic data. Outdoor experimental tests revealed that the system can acquire and sustain a data rate of 850 kbps over 438 m. The power consumption while collecting and streaming all measured data is 120 mW, enabling self-sustainability and long-term in-situ monitoring with a 111 cm^2 photovoltaic panel.

preprint, 2022, arXiv

FANN-on-MCU: An Open-Source Toolkit for Energy-Efficient Neural Network Inference at the Edge of the Internet of Things

The growing number of low-power smart devices in the Internet of Things is coupled with the concept of "Edge Computing", that is, moving some of the intelligence, especially machine learning, towards the edge of the network. Enabling machine learning algorithms to run on resource-constrained hardware, typically on low-power smart devices, is challenging in terms of hardware (optimized and energy-efficient integrated circuits), algorithmic and firmware implementations. This paper presents FANN-on-MCU, an open-source toolkit built upon the Fast Artificial Neural Network (FANN) library to run lightweight and energy-efficient neural networks on microcontrollers based on both the ARM Cortex-M series and the novel RISC-V-based Parallel Ultra-Low-Power (PULP) platform. The toolkit takes multi-layer perceptrons trained with FANN and generates code targeted at execution on low-power microcontrollers either with a floating-point unit (i.e., ARM Cortex-M4F and M7F) or without (i.e., ARM Cortex M0-M3 or PULP-based processors). This paper also provides an architectural performance evaluation of neural networks on the most popular ARM Cortex-M family and the parallel RISC-V processor called Mr. Wolf. The evaluation includes experimental results for three different applications using a self-sustainable wearable multi-sensor bracelet. Experimental results show a measured latency in the order of only a few microseconds and a power consumption of a few milliwatts while keeping the memory requirements below the limitations of the targeted microcontrollers. In particular, the parallel implementation on the octa-core RISC-V platform reaches a speedup of 22x and a 69% reduction in energy consumption with respect to a single-core implementation on Cortex-M4 for continuous real-time classification.

preprint, 2022, arXiv

Leveraging Tactile Sensors for Low Latency Embedded Smart Hands for Prosthetic and Robotic Applications

Tactile sensing is a crucial perception mode for robots and human amputees in need of controlling a prosthetic device. Today robotic and prosthetic systems are still missing the important feature of accurate tactile sensing. This lack is mainly due to the fact that the existing tactile technologies have limited spatial and temporal resolution and are either expensive or not scalable. In this paper, we present the design and the implementation of a hardware-software embedded system called SmartHand. It is specifically designed to enable the acquisition and the real-time processing of high-resolution tactile information from a hand-shaped multi-sensor array for prosthetic and robotic applications. During data collection, our system can deliver a high throughput of 100 frames per second, which is 13.7x higher than previous related work. We collected a new tactile dataset while interacting with daily-life objects during five different sessions. We propose a compact yet accurate convolutional neural network that requires one order of magnitude less memory and 15.6x fewer computations compared to related work without degrading classification accuracy. The top-1 and top-3 cross-validation accuracies are respectively 98.86% and 99.83%. We further analyze the inter-session variability and obtain the best top-3 leave-one-out-validation accuracy of 77.84%. We deploy the trained model on a high-performance ARM Cortex-M7 microcontroller achieving an inference time of only 100 ms minimizing the response latency. The overall measured power consumption is 505 mW. Finally, we fabricate a new control sensor and perform additional experiments to provide analyses on sensor degradation and slip detection. This work is a step forward in giving robotic and prosthetic devices a sense of touch and demonstrates the practicality of a smart embedded system empowered by tiny machine learning.

preprint, 2022, arXiv

Sub-mW Keyword Spotting on an MCU: Analog Binary Feature Extraction and Binary Neural Networks

Keyword spotting (KWS) is a crucial function enabling the interaction with the many ubiquitous smart devices in our surroundings, either activating them through wake-word or directly as a human-computer interface. For many applications, KWS is the entry point for our interactions with the device and, thus, an always-on workload. Many smart devices are mobile and their battery lifetime is heavily impacted by continuously running services. KWS and similar always-on services are thus the focus when optimizing the overall power consumption. This work addresses KWS energy-efficiency on low-cost microcontroller units (MCUs). We combine analog binary feature extraction with binary neural networks. By replacing the digital preprocessing with the proposed analog front-end, we show that the energy required for data acquisition and preprocessing can be reduced by 29x, cutting its share from a dominating 85% to a mere 16% of the overall energy consumption for our reference KWS application. Experimental evaluations on the Speech Commands Dataset show that the proposed system outperforms state-of-the-art accuracy and energy efficiency, respectively, by 1% and 4.3x on a 10-class dataset while providing a compelling accuracy-energy trade-off including a 2% accuracy drop for a 71x energy reduction.
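
The energy shares quoted in this abstract can be cross-checked with a short calculation. The sketch assumes that only the acquisition and preprocessing energy changes (reduced by 29x) while the remaining energy stays constant; the function name is hypothetical, and the overall gain it prints is a derived value under that assumption, not a figure reported in the abstract.

```python
def share_after_reduction(share_before, reduction_factor):
    """Return (new share of the reduced part, overall energy reduction factor).

    Assumes the rest of the system's energy is unchanged.
    """
    part = share_before / reduction_factor  # reduced part, relative to the old total
    rest = 1.0 - share_before               # untouched part, relative to the old total
    new_total = part + rest
    return part / new_total, 1.0 / new_total


share, overall_gain = share_after_reduction(0.85, 29)
print(f"new preprocessing share: {share:.1%}, overall reduction: {overall_gain:.2f}x")
```

The resulting share of roughly 16% matches the abstract, which suggests the 85% and 16% figures refer to the same reference application before and after replacing the digital front-end.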

preprint, 2021, arXiv

Sound Event Detection with Binary Neural Networks on Tightly Power-Constrained IoT Devices

Sound event detection (SED) is a hot topic in consumer and smart city applications. Existing approaches based on Deep Neural Networks are very effective, but highly demanding in terms of memory, power, and throughput when targeting ultra-low power always-on devices. Latency, availability, cost, and privacy requirements are pushing recent IoT systems to process the data on the node, close to the sensor, with a very limited energy supply and tight constraints on the memory size and processing capabilities, which preclude running state-of-the-art DNNs. In this paper, we explore the combination of extreme quantization to a small-footprint binary neural network (BNN) with the highly energy-efficient, RISC-V-based (8+1)-core GAP8 microcontroller. Starting from an existing CNN for SED whose footprint (815 kB) exceeds the 512 kB of memory available on our platform, we retrain the network using binary filters and activations to match these memory constraints. (Fully) binary neural networks come with a natural drop in accuracy of 12-18% on the challenging ImageNet object recognition challenge compared to their equivalent full-precision baselines. This BNN reaches a 77.9% accuracy, just 7% lower than the full-precision version, with 58 kB (7.2 times less) for the weights and 262 kB (2.4 times less) memory in total. With our BNN implementation, we reach a peak throughput of 4.6 GMAC/s and 1.5 GMAC/s over the full network, including preprocessing with Mel bins, which corresponds to an efficiency of 67.1 GMAC/s/W and 31.3 GMAC/s/W, respectively. Compared to the performance of an ARM Cortex-M4 implementation, our system has a 10.3 times faster execution time and a 51.1 times higher energy-efficiency.
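
With binary filters and binary activations, a multiply-accumulate over values in {-1, +1} can be replaced by bitwise operations. The sketch below shows the widely used XNOR-and-popcount formulation; the abstract does not say which kernel the authors implemented, so this is an illustration of the general technique, and the helper names `pack_bits` and `binary_dot` are hypothetical.

```python
def pack_bits(values):
    """Pack a list of +1/-1 values into an int (bit i set means values[i] == +1)."""
    word = 0
    for i, v in enumerate(values):
        if v == 1:
            word |= 1 << i
    return word


def binary_dot(a_word, b_word, n):
    """Dot product of two packed {-1, +1} vectors of length n."""
    mask = (1 << n) - 1
    agree = ~(a_word ^ b_word) & mask  # XNOR: 1 where the two signs match
    matches = bin(agree).count("1")    # popcount
    # Each match contributes +1 and each mismatch -1.
    return 2 * matches - n


a = [1, -1, 1, 1]
b = [1, 1, -1, 1]
packed = binary_dot(pack_bits(a), pack_bits(b), len(a))
reference = sum(x * y for x, y in zip(a, b))
print(packed, reference)
```

Packing many weights into one machine word is also what shrinks the weight storage, since each binary weight needs a single bit.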