Source author record

Alessandro Betti

Alessandro Betti appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Machine Learning, Computer Vision, cond-mat.mes-hall, Artificial Intelligence, cond-mat.soft, Information Theory, math.AP, math.IT, math.OC, Neural and Evolutionary Computing, Neurons and Cognition, physics.soc-ph

Catalog footprint

20 works, 12 topics, 4 close collaborators

preprint · 2022 · arXiv

A free boundary singular transport equation as a formal limit of a discrete dynamical system

We study the continuous version of a hyperbolic rescaling of a discrete game, called open mancala. The resulting PDE turns out to be a singular transport equation, with a forcing term taking values in $\{0,1\}$, and discontinuous in the solution itself. We prove existence and uniqueness of a certain formulation of the problem, based on a nonlocal equation satisfied by the free boundary dividing the region where the forcing is one (active region) and the region where there is no forcing (tail region). Several examples, most notably the Riemann problem, are provided, related to singularity formation. Interestingly, the solution can be obtained by a suitable vertical rearrangement of a multi-function. Furthermore, the PDE admits a Lyapunov functional.

preprint · 2022 · arXiv

A lightweight and accurate YOLO-like network for small target detection in Aerial Imagery

Despite the breakthrough deep learning performances achieved for automatic object detection, small target detection is still a challenging problem, especially when looking at fast and accurate solutions suitable for mobile or edge applications. In this work we present YOLO-S, a simple, fast and efficient network for small target detection. The architecture exploits a small feature extractor based on Darknet20, as well as skip connections, via both bypass and concatenation, and a reshape-passthrough layer to alleviate the vanishing gradient problem, promote feature reuse across the network and combine low-level positional information with more meaningful high-level information. To verify the performances of YOLO-S, we build "AIRES", a novel dataset for cAr detectIon fRom hElicopter imageS acquired in Europe, and set up experiments on both AIRES and VEDAI datasets, benchmarking this architecture with four baseline detectors. Furthermore, in order to handle efficiently the issue of data insufficiency and domain gap when dealing with a transfer learning strategy, we introduce a transitional learning task over a combined dataset based on DOTAv2 and VEDAI and demonstrate that it can enhance the overall accuracy with respect to more general features transferred from COCO data. YOLO-S is from 25% to 50% faster than YOLOv3 and only 15-25% slower than Tiny-YOLOv3, while also outperforming YOLOv3 in terms of accuracy in a wide range of experiments. Further simulations performed on the SARD dataset also demonstrate its applicability to different scenarios such as search and rescue operations. Besides, YOLO-S has an 87% decrease in parameter size and almost one half of the FLOPs of YOLOv3, making deployment practical for low-power industrial applications.

preprint · 2022 · arXiv

A Multi-Stage model based on YOLOv3 for defect detection in PV panels based on IR and Visible Imaging by Unmanned Aerial Vehicle

As solar capacity installed worldwide continues to grow, there is an increasing awareness that advanced inspection systems are becoming of utmost importance to schedule smart interventions and minimize downtime likelihood. In this work we propose a novel automatic multi-stage model to detect panel defects on aerial images captured by an unmanned aerial vehicle by using the YOLOv3 network and Computer Vision techniques. The model combines detections of panels and defects to refine its accuracy and exhibits an average inference time per image of 0.98 s. The main novelties are its versatility in processing either thermographic or visible images and detecting a large variety of defects, its ability to prescribe recommended actions to the O&M crew for a more efficient data-driven maintenance strategy, and its portability to both rooftop and ground-mounted PV systems and different panel types. The proposed model has been validated on two big PV plants in the south of Italy with an outstanding AP@0.5 exceeding 98% for panel detection, a remarkable AP@0.4 (AP@0.5) of roughly 88.3% (66.9%) for hotspots by means of infrared thermography and a mAP@0.5 of almost 70% in the visible spectrum for detection of anomalies including panel shading induced by soiling and bird dropping, delamination, presence of puddles and raised rooftop panels. The model also predicts the severity of hotspot areas based on the estimated temperature gradients, and computes the soiling coverage based on visual images. Finally, the influence of the different YOLOv3 output scales on the detection is discussed.

preprint · 2022 · arXiv

Deep Learning to See: Towards New Foundations of Computer Vision

The remarkable progress in computer vision over the last few years is, by and large, attributed to deep learning, fueled by the availability of huge sets of labeled data, and paired with the explosive growth of the GPU paradigm. While subscribing to this view, this book criticizes the supposed scientific progress in the field and proposes the investigation of vision within the framework of information-based laws of nature. Specifically, the present work poses fundamental questions about vision that remain far from understood, leading the reader on a journey populated by novel challenges resonating with the foundations of machine learning. The central thesis is that for a deeper understanding of visual computational processes, it is necessary to look beyond the applications of general purpose machine learning algorithms and focus instead on appropriate learning theories that take into account the spatiotemporal nature of the visual signal.

preprint · 2022 · arXiv

Forward Approximate Solution for Linear Quadratic Tracking

In this paper, we discuss an approximation strategy for solving the Linear Quadratic Tracking that is both forward and local in time. We exploit the known form of the value function along with a time reversal transformation that nicely addresses the boundary condition consistency. We provide the results of an experimental investigation with the aim of showing how the proposed solution performs with respect to the optimal solution. Finally, we also show that the proposed solution turns out to be a valid alternative to model predictive control strategies, whose computational burden is dramatically reduced.

preprint · 2022 · arXiv

Stochastic Coherence Over Attention Trajectory For Continuous Learning In Video Streams

Devising intelligent agents able to live in an environment and learn by observing the surroundings is a longstanding goal of Artificial Intelligence. From a bare Machine Learning perspective, challenges arise when the agent is prevented from leveraging large fully-annotated datasets, and the interactions with supervisory signals are instead sparsely distributed over space and time. This paper proposes a novel neural-network-based approach to progressively and autonomously develop pixel-wise representations in a video stream. The proposed method is based on a human-like attention mechanism that allows the agent to learn by observing what is moving in the attended locations. Spatio-temporal stochastic coherence along the attention trajectory, paired with a contrastive term, leads to an unsupervised learning criterion that naturally copes with the considered setting. Differently from most existing works, the learned representations are used in open-set class-incremental classification of each frame pixel, relying on few supervisions. Our experiments leverage 3D virtual environments and they show that the proposed agents can learn to distinguish objects just by observing the video stream. Inheriting features from state-of-the-art models is not as powerful as one might expect.

preprint · 2021 · arXiv

An Optimal Control Approach to Learning in SIDARTHE Epidemic model

The COVID-19 outbreak has stimulated interest in novel epidemiological models to predict the course of the epidemic so as to help plan effective control strategies. In particular, in order to properly interpret the available data, it has become clear that one must go beyond most classic epidemiological models and consider models that, like the recently proposed SIDARTHE, offer a richer description of the stages of infection. The problem of learning the parameters of these models is of crucial importance especially when assuming that they are time-variant, which further enriches their effectiveness. In this paper we propose a general approach for learning time-variant parameters of dynamic compartmental models from epidemic data. We formulate the problem in terms of a functional risk that depends on the learning variables through the solutions of a dynamic system. The resulting variational problem is then solved by using a gradient flow on a suitable, regularized functional. We forecast the epidemic evolution in Italy and France. Results indicate that the model provides reliable and challenging predictions over all available data as well as the fundamental role of the chosen strategy on the time-variant parameters.

preprint · 2020 · arXiv

A Machine Learning Model for Long-Term Power Generation Forecasting at Bidding Zone Level

The increasing penetration level of energy generation from renewable sources demands more accurate and reliable forecasting tools to support classic power grid operations (e.g., unit commitment, electricity market clearing or maintenance planning). For this purpose, many physical models have been employed, and more recently many statistical or machine learning algorithms, and data-driven methods in general, have become the subject of intense research. While generally the power research community focuses on power forecasting at the level of single plants, over a short future time horizon, in this work we are interested in aggregated macro-area power generation (i.e., in a territory of size greater than 100000 km^2) with a future horizon of interest up to 15 days ahead. Real data are used to validate the proposed forecasting methodology on a test set of several months.

preprint · 2020 · arXiv

Backprop Diffusion is Biologically Plausible

The Backpropagation algorithm relies on the abstraction of using a neural model that gets rid of the notion of time, since the input is mapped instantaneously to the output. In this paper, we claim that this abstraction of ignoring time, along with the abrupt input changes that occur when feeding the training set, are in fact the reasons why, in some papers, the biological plausibility of Backprop is regarded as an arguable issue. We show that as soon as a deep feedforward network operates with neurons with time-delayed response, the backprop weight update turns out to be the basic equation of a biologically plausible diffusion process based on forward-backward waves. We also show that such a process very well approximates the gradient for inputs that are not too fast with respect to the depth of the network. These remarks somewhat disclose the diffusion process behind the backprop equation and lead us to interpret the corresponding algorithm as a degeneration of a more general diffusion process that takes place also in neural networks with cyclic connections.

preprint · 2020 · arXiv

Developing Constrained Neural Units Over Time

In this paper we present a foundational study on a constrained method that defines learning problems with Neural Networks in the context of the principle of least cognitive action, which very much resembles the principle of least action in mechanics. Starting from a general approach to enforce constraints into the dynamical laws of learning, this work focuses on an alternative way of defining Neural Networks, that is different from the majority of existing approaches. In particular, the structure of the neural architecture is defined by means of a special class of constraints that are extended also to the interaction with data, leading to "architectural" and "input-related" constraints, respectively. The proposed theory is cast into the time domain, in which data are presented to the network in an ordered manner, that makes this study an important step toward alternative ways of processing continuous streams of data with Neural Networks. The connection with the classic Backpropagation-based update rule of the weights of networks is discussed, showing that there are conditions under which our approach degenerates to Backpropagation. Moreover, the theory is experimentally evaluated on a simple problem that allows us to deeply study several aspects of the theory itself and to show the soundness of the model.

preprint · 2020 · arXiv

Focus of Attention Improves Information Transfer in Visual Features

Unsupervised learning from continuous visual streams is a challenging problem that cannot be naturally and efficiently managed in the classic batch-mode setting of computation. The information stream must be carefully processed according to an appropriate spatio-temporal distribution of the visual data, while most approaches of learning commonly assume uniform probability density. In this paper we focus on unsupervised learning for transferring visual information in a truly online setting by using a computational model that is inspired by the principle of least action in physics. The maximization of the mutual information is carried out by a temporal process which yields online estimation of the entropy terms. The model, which is based on second-order differential equations, maximizes the information transfer from the input to a discrete space of symbols related to the visual features of the input, whose computation is supported by hidden neurons. In order to better structure the input probability distribution, we use a human-like focus of attention model that, coherently with the information maximization model, is also based on second-order differential equations. We provide experimental results to support the theory by showing that the spatio-temporal filtering induced by the focus of attention allows the system to globally transfer more information from the input stream over the focused areas and, in some contexts, over the whole frames with respect to the unfiltered case that yields uniform probability distributions.

preprint · 2020 · arXiv

Learning Visual Features Under Motion Invariance

Humans are continuously exposed to a stream of visual data with a natural temporal structure. However, most successful computer vision algorithms work at image level, completely discarding the precious information carried by motion. In this paper, we claim that processing visual streams naturally leads to the formulation of the motion invariance principle, which enables the construction of a new theory of learning that originates from variational principles, just like in physics. Such a principled approach is well suited for a discussion on a number of interesting questions that arise in vision, and it offers a well-posed computational scheme for the discovery of convolutional filters over the retina. Differently from traditional convolutional networks, which need massive supervision, the proposed theory offers a truly new scenario for the unsupervised processing of video signals, where features are extracted in a multi-layer architecture with motion invariance. While the theory enables the implementation of novel computer vision systems, it also sheds light on the role of information-based principles to drive possible biological solutions.

preprint · 2020 · arXiv

Local Propagation in Constraint-based Neural Network

In this paper we study a constraint-based representation of neural network architectures. We cast the learning problem in the Lagrangian framework and we investigate a simple optimization procedure that is well suited to fulfil the so-called architectural constraints, learning from the available supervisions. The computational structure of the proposed Local Propagation (LP) algorithm is based on the search for saddle points in the adjoint space composed of weights, neural outputs, and Lagrange multipliers. All the updates of the model variables are locally performed, so that LP is fully parallelizable over the neural units, circumventing the classic problem of gradient vanishing in deep networks. The implementation of popular neural models is described in the context of LP, together with those conditions that trace a natural connection with Backpropagation. We also investigate the setting in which we tolerate bounded violations of the architectural constraints, and we provide experimental evidence that LP is a feasible approach to train shallow and deep networks, opening the road to further investigations on more complex architectures, easily describable by constraints.

preprint · 2020 · arXiv

Real-Time target detection in maritime scenarios based on YOLOv3 model

In this work a novel ship dataset is proposed, consisting of more than 56k images of marine vessels collected by means of web-scraping and including 12 ship categories. A YOLOv3 single-stage detector based on the Keras API is built on top of this dataset. Current results on four categories (cargo ship, naval ship, oil ship and tug ship) show Average Precision up to 96% for Intersection over Union (IoU) of 0.5 and satisfactory detection performances up to IoU of 0.8. A Data Analytics GUI service based on the QT framework and the Darknet-53 engine is also implemented in order to simplify the deployment process and analyse massive amounts of images, even for people without Data Science expertise.

preprint · 2011 · arXiv

Atomistic investigation of low-field mobility in graphene nanoribbons

We have investigated the main scattering mechanisms affecting mobility in graphene nanoribbons using detailed atomistic simulations. We have considered carrier scattering due to acoustic and optical phonons, edge roughness, single defects, and ionized impurities, and we have defined a methodology based on simulations of statistically meaningful ensembles of nanoribbon segments. Edge disorder heavily affects mobility at room temperature in narrower nanoribbons, whereas charged impurities and phonons are hardly the limiting factors. Results are favorably compared to the few experiments available in the literature.

preprint · 2011 · arXiv

Drift velocity peak and negative differential mobility in high field transport in graphene nanoribbons explained by numerical simulations

We present numerical simulations of high field transport in both suspended armchair graphene nanoribbons (A-GNRs) and A-GNRs deposited on an HfO2 substrate. Drift velocity in suspended GNRs does not saturate at high electric field (F), but rather decreases, showing a maximum for F=10 kV/cm. Deposition on HfO2 strongly degrades the drift velocity by up to a factor of 10 with respect to suspended GNRs in the low-field regime, whereas at high fields drift velocity approaches the intrinsic value expected in suspended GNRs. Even under the assumption of perfect edges, the obtained mobility is far below what is expected in two-dimensional graphene, and is further reduced by surface optical phonons.

preprint · 2011 · arXiv

Strong mobility degradation in ideal graphene nanoribbons due to phonon scattering

We investigate the low-field phonon-limited mobility in armchair graphene nanoribbons (GNRs) using full-band electron and phonon dispersion relations. We show that lateral confinement suppresses the intrinsic mobility of GNRs to values typical of common bulk semiconductors, and very far from the impressive experiments on 2D graphene. Suspended GNRs with a width of 1 nm exhibit a mobility close to 500 cm^2/Vs at room temperature, whereas if the same GNRs are deposited on HfO2 mobility is further reduced to about 60 cm^2/Vs due to surface phonons. We also show the occurrence of polaron formation, leading to band gap renormalization of ~118 meV for 1 nm-wide armchair GNRs.

preprint · 2010 · arXiv

Enhanced shot noise in carbon nanotube field-effect transistors

We predict shot noise enhancement in defect-free carbon nanotube field-effect transistors through a numerical investigation based on the self-consistent solution of the Poisson and Schrodinger equations within the non-equilibrium Green functions formalism, and on a Monte Carlo approach to reproduce injection statistics. Noise enhancement is due to the correlation between trapping of holes from the drain into quasi-bound states in the channel and thermionic injection of electrons from the source, and can lead to an appreciable Fano factor of 1.22 at room temperature.

preprint · 2010 · arXiv

Shot noise suppression in quasi one-dimensional Field Effect Transistors

We present a novel method for the evaluation of shot noise in quasi one-dimensional field-effect transistors, such as those based on carbon nanotubes and silicon nanowires. The method is derived by using a statistical approach within the second quantization formalism and makes it possible to include the effects of both Pauli exclusion and Coulomb repulsion among charge carriers. In this way it extends the Landauer-Buttiker approach by explicitly including the effect of Coulomb repulsion on noise. We implement the method through the self-consistent solution of the 3D Poisson and transport equations within the NEGF framework and a Monte Carlo procedure for populating injected electron states. We show that the combined effect of Pauli and Coulomb interactions reduces shot noise in strong inversion down to 23% of the full shot noise for a gate overdrive of 0.4 V, and that neglecting the effect of Coulomb repulsion would lead to an overestimation of noise up to 180%.

preprint · 2010 · arXiv

Statistical theory of shot noise in quasi-1D Field Effect Transistors in the presence of electron-electron interaction

We present an expression for the shot noise power spectral density in quasi-one-dimensional conductors electrostatically controlled by a gate electrode, which includes the effects of Coulomb interaction and of Pauli exclusion among charge carriers. In this sense, our expression extends the well known Landauer-Buttiker noise formula to include the effect of Coulomb interaction through induced fluctuations in the device potential. Our approach is based on the introduction of statistical properties of the scattering matrix and on a second-quantization many-body description. From a quantitative point of view, statistical properties are obtained by means of Monte Carlo simulations on an ensemble of different configurations of injected states, requiring the solution of the Poisson-Schrodinger equation on a three-dimensional grid, with the non-equilibrium Green functions formalism. In a series of examples, we show that failure to consider the effects of Coulomb interaction on noise leads to a gross overestimation of the noise spectrum of quasi-one-dimensional devices.
