Source author record

Alireza Sadeghian

Alireza Sadeghian appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

physics.data-an Biomolecules Artificial Intelligence Machine Learning Computer Vision Biological Physics Molecular Networks Neural and Evolutionary Computing Computational Engineering, Finance, and Science cs.CY Distributed, Parallel, and Cluster Computing Multiagent Systems physics.med-ph

Catalog footprint

What is connected

14works

13topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2016arXiv

Data-driven detrending of nonstationary fractal time series with echo state networks

In this paper, we propose a novel data-driven approach for removing trends (detrending) from nonstationary, fractal and multifractal time series. We consider real-valued time series relative to measurements of an underlying dynamical system that evolves through time. We assume that such a dynamical process is predictable to a certain degree by means of a class of recurrent networks called Echo State Network (ESN), which are capable to model a generic dynamical process. In order to isolate the superimposed (multi)fractal component of interest, we define a data-driven filter by leveraging on the ESN prediction capability to identify the trend component of a given input time series. Specifically, the (estimated) trend is removed from the original time series and the residual signal is analyzed with the multifractal detrended fluctuation analysis procedure to verify the correctness of the detrending procedure. In order to demonstrate the effectiveness of the proposed technique, we consider several synthetic time series consisting of different types of trends and fractal noise components with known characteristics. We also process a real-world dataset, the sunspot time series, which is well-known for its multifractal features and has recently gained attention in the complex systems field. Results demonstrate the validity and generality of the proposed detrending method based on ESNs.

preprint2015arXiv

A generative model for protein contact networks

In this paper we present a generative model for protein contact networks. The soundness of the proposed model is investigated by focusing primarily on mesoscopic properties elaborated from the spectra of the graph Laplacian. To complement the analysis, we study also classical topological descriptors, such as statistics of the shortest paths and the important feature of modularity. Our experiments show that the proposed model results in a considerable improvement with respect to two suitably chosen generative mechanisms, mimicking with better approximation real protein contact networks in terms of diffusion properties elaborated from the Laplacian spectra. However, as well as the other considered models, it does not reproduce with sufficient accuracy the shortest paths structure. To compensate this drawback, we designed a second step involving a targeted edge reconfiguration process. The ensemble of reconfigured networks denotes improvements that are statistically significant. As a byproduct of our study, we demonstrate that modularity, a well-known property of proteins, does not entirely explain the actual network architecture characterizing protein contact networks. In fact, we conclude that modularity, intended as a quantification of an underlying community structure, should be considered as an emergent property of the structural organization of proteins. Interestingly, such a property is suitably optimized in protein contact networks together with the feature of path efficiency.

preprint2015arXiv

Analysis of heat kernel highlights the strongly modular and heat-preserving structure of proteins

In this paper, we study the structure and dynamical properties of protein contact networks with respect to other biological networks, together with simulated archetypal models acting as probes. We consider both classical topological descriptors, such as the modularity and statistics of the shortest paths, and different interpretations in terms of diffusion provided by the discrete heat kernel, which is elaborated from the normalized graph Laplacians. A principal component analysis shows high discrimination among the network types, either by considering the topological and heat kernel based vector characterizations. Furthermore, a canonical correlation analysis demonstrates the strong agreement among those two characterizations, providing thus an important justification in terms of interpretability for the heat kernel. Finally, and most importantly, the focused analysis of the heat kernel provides a way to yield insights on the fact that proteins have to satisfy specific structural design constraints that the other considered networks do not need to obey. Notably, the heat trace decay of an ensemble of varying-size proteins denotes subdiffusion, a peculiar property of proteins.

preprint2015arXiv

Characterization of graphs for protein structure modeling and recognition of solubility

This paper deals with the relations among structural, topological, and chemical properties of the E.Coli proteome from the vantage point of the solubility/aggregation propensity of proteins. Each E.Coli protein is initially represented according to its known folded 3D shape. This step consists in representing the available E.Coli proteins in terms of graphs. We first analyze those graphs by considering pure topological characterizations, i.e., by analyzing the mass fractal dimension and the distribution underlying both shortest paths and vertex degrees. Results confirm the general architectural principles of proteins. Successively, we focus on the statistical properties of a representation of such graphs in terms of vectors composed of several numerical features, which we extracted from their structural representation. We found that protein size is the main discriminator for the solubility, while however there are other factors that help explaining the solubility degree. We finally analyze such data through a novel one-class classifier, with the aim of discriminating among very and poorly soluble proteins. Results are encouraging and consolidate the potential of pattern recognition techniques when employed to describe complex biological systems.

preprint2015arXiv

Classifying sequences by the optimized dissimilarity space embedding approach: a case study on the solubility analysis of the E. coli proteome

We evaluate a version of the recently-proposed classification system named Optimized Dissimilarity Space Embedding (ODSE) that operates in the input space of sequences of generic objects. The ODSE system has been originally presented as a classification system for patterns represented as labeled graphs. However, since ODSE is founded on the dissimilarity space representation of the input data, the classifier can be easily adapted to any input domain where it is possible to define a meaningful dissimilarity measure. Here we demonstrate the effectiveness of the ODSE classifier for sequences by considering an application dealing with the recognition of the solubility degree of the Escherichia coli proteome. Solubility, or analogously aggregation propensity, is an important property of protein molecules, which is intimately related to the mechanisms underlying the chemico-physical process of folding. Each protein of our dataset is initially associated with a solubility degree and it is represented as a sequence of symbols, denoting the 20 amino acid residues. The herein obtained computational results, which we stress that have been achieved with no context-dependent tuning of the ODSE system, confirm the validity and generality of the ODSE-based approach for structured data classification.

preprint2015arXiv

Data granulation by the principles of uncertainty

Researches in granular modeling produced a variety of mathematical models, such as intervals, (higher-order) fuzzy sets, rough sets, and shadowed sets, which are all suitable to characterize the so-called information granules. Modeling of the input data uncertainty is recognized as a crucial aspect in information granulation. Moreover, the uncertainty is a well-studied concept in many mathematical settings, such as those of probability theory, fuzzy set theory, and possibility theory. This fact suggests that an appropriate quantification of the uncertainty expressed by the information granule model could be used to define an invariant property, to be exploited in practical situations of information granulation. In this perspective, a procedure of information granulation is effective if the uncertainty conveyed by the synthesized information granule is in a monotonically increasing relation with the uncertainty of the input data. In this paper, we present a data granulation framework that elaborates over the principles of uncertainty introduced by Klir. Being the uncertainty a mesoscopic descriptor of systems and data, it is possible to apply such principles regardless of the input data type and the specific mathematical setting adopted for the information granules. The proposed framework is conceived (i) to offer a guideline for the synthesis of information granules and (ii) to build a groundwork to compare and quantitatively judge over different data granulation procedures. To provide a suitable case study, we introduce a new data granulation technique based on the minimum sum of distances, which is designed to generate type-2 fuzzy sets. We analyze the procedure by performing different experiments on two distinct data types: feature vectors and labeled graphs. Results show that the uncertainty of the input data is suitably conveyed by the generated type-2 fuzzy set models.

preprint2015arXiv

Discrimination and characterization of Parkinsonian rest tremors by analyzing long-term correlations and multifractal signatures

In this paper, we analyze 48 signals of rest tremor velocity related to 12 distinct subjects affected by Parkinson's disease. The subjects belong to two different groups, formed by four and eight subjects with, respectively, high- and low-amplitude rest tremors. Each subject is tested in four settings, given by combining the use of deep brain stimulation and L-DOPA medication. We develop two main feature-based representations of such signals, which are obtained by considering (i) the long-term correlations and multifractal properties, and (ii) the power spectra. The feature-based representations are initially utilized for the purpose of characterizing the subjects under different settings. In agreement with previous studies, we show that deep brain stimulation does not significantly characterize neither of the two groups, regardless of the adopted representation. On the other hand, the medication effect yields statistically significant differences in both high- and low-amplitude tremor groups. We successively test several different instances of the two feature-based representations of the signals in the setting of supervised classification and (nonlinear) feature transformation. We consider three different classification problems, involving the recognition of (i) the presence of medication, (ii) the use of deep brain stimulation, and (iii) the membership to the high- and low-amplitude tremor groups. Classification results show that the use of medication can be discriminated with higher accuracy, considering many of the feature-based representations. Notably, we show that the best results are obtained with a parsimonious, two-dimensional representation encoding the long-term correlations and multifractal character of the signals.

preprint2015arXiv

Entropic one-class classifiers

The one-class classification problem is a well-known research endeavor in pattern recognition. The problem is also known under different names, such as outlier and novelty/anomaly detection. The core of the problem consists in modeling and recognizing patterns belonging only to a so-called target class. All other patterns are termed non-target, and therefore they should be recognized as such. In this paper, we propose a novel one-class classification system that is based on an interplay of different techniques. Primarily, we follow a dissimilarity representation based approach; we embed the input data into the dissimilarity space by means of an appropriate parametric dissimilarity measure. This step allows us to process virtually any type of data. The dissimilarity vectors are then represented through a weighted Euclidean graphs, which we use to (i) determine the entropy of the data distribution in the dissimilarity space, and at the same time (ii) derive effective decision regions that are modeled as clusters of vertices. Since the dissimilarity measure for the input data is parametric, we optimize its parameters by means of a global optimization scheme, which considers both mesoscopic and structural characteristics of the data represented through the graphs. The proposed one-class classifier is designed to provide both hard (Boolean) and soft decisions about the recognition of test patterns, allowing an accurate description of the classification process. We evaluate the performance of the system on different benchmarking datasets, containing either feature-based or structured patterns. Experimental results demonstrate the effectiveness of the proposed technique.

preprint2015arXiv

On the impact of topological properties of smart grids in power losses optimization problems

Power losses reduction is one of the main targets for any electrical energy distribution company. In this paper, we face the problem of joint optimization of both topology and network parameters in a real smart grid. We consider a portion of the Italian electric distribution network managed by the ACEA Distribuzione S.p.A. located in Rome. We perform both the power factor correction (PFC) for tuning the generators and the distributed feeder reconfiguration (DFR) to set the state of the breakers. This joint optimization problem is faced considering a suitable objective function and by adopting genetic algorithms as global optimization strategy. We analyze admissible network configurations, showing that some of these violate constraints on current and voltage at branches and nodes. Such violations depend only on pure topological properties of the configurations. We perform tests by feeding the simulation environment with real data concerning hourly samples of dissipated and generated active and reactive power values of the ACEA smart grid. Results show that removing the configurations violating the electrical constraints from the solution space leads to interesting improvements in terms of power loss reduction. To conclude, we provide also an electrical interpretation of the phenomenon using graph-based pattern analysis techniques.

preprint2015arXiv

On the long-term correlations and multifractal properties of electric arc furnace time series

In this paper, we study long-term correlations and multifractal properties elaborated from time series of three-phase current signals coming from an industrial electric arc furnace plant. Implicit sinusoidal trends are suitably detected by considering the scaling of the fluctuation functions. Time series are then filtered via a Fourier-based analysis, removing hence such strong periodicities. In the filtered time series we detected long-term, positive correlations. The presence of positive correlations is in agreement with the typical V--I characteristic (hysteresis) of the electric arc furnace, providing thus a sound physical justification for the memory effects found in the current time series. The multifractal signature is strong enough in the filtered time series to be effectively classified as multifractal.

preprint2015arXiv

Position paper: a general framework for applying machine learning techniques in operating room

In this position paper we describe a general framework for applying machine learning and pattern recognition techniques in healthcare. In particular, we are interested in providing an automated tool for monitoring and incrementing the level of awareness in the operating room and for identifying human errors which occur during the laparoscopy surgical operation. The framework that we present is divided in three different layers: each layer implements algorithms which have an increasing level of complexity and which perform functionality with an higher degree of abstraction. In the first layer, raw data collected from sensors in the operating room during surgical operation, they are pre-processed and aggregated. The results of this initial phase are transferred to a second layer, which implements pattern recognition techniques and extract relevant features from the data. Finally, in the last layer, expert systems are employed to take high level decisions, which represent the final output of the system.

preprint2014arXiv

An Agent-Based Algorithm exploiting Multiple Local Dissimilarities for Clusters Mining and Knowledge Discovery

We propose a multi-agent algorithm able to automatically discover relevant regularities in a given dataset, determining at the same time the set of configurations of the adopted parametric dissimilarity measure yielding compact and separated clusters. Each agent operates independently by performing a Markovian random walk on a suitable weighted graph representation of the input dataset. Such a weighted graph representation is induced by the specific parameter configuration of the dissimilarity measure adopted by the agent, which searches and takes decisions autonomously for one cluster at a time. Results show that the algorithm is able to discover parameter configurations that yield a consistent and interpretable collection of clusters. Moreover, we demonstrate that our algorithm shows comparable performances with other similar state-of-the-art algorithms when facing specific clustering problems.

preprint2014arXiv

Modeling and Recognition of Smart Grid Faults by a Combined Approach of Dissimilarity Learning and One-Class Classification

Detecting faults in electrical power grids is of paramount importance, either from the electricity operator and consumer viewpoints. Modern electric power grids (smart grids) are equipped with smart sensors that allow to gather real-time information regarding the physical status of all the component elements belonging to the whole infrastructure (e.g., cables and related insulation, transformers, breakers and so on). In real-world smart grid systems, usually, additional information that are related to the operational status of the grid itself are collected such as meteorological information. Designing a suitable recognition (discrimination) model of faults in a real-world smart grid system is hence a challenging task. This follows from the heterogeneity of the information that actually determine a typical fault condition. The second point is that, for synthesizing a recognition model, in practice only the conditions of observed faults are usually meaningful. Therefore, a suitable recognition model should be synthesized by making use of the observed fault conditions only. In this paper, we deal with the problem of modeling and recognizing faults in a real-world smart grid system, which supplies the entire city of Rome, Italy. Recognition of faults is addressed by following a combined approach of multiple dissimilarity measures customization and one-class classification techniques. We provide here an in-depth study related to the available data and to the models synthesized by the proposed one-class classifier. We offer also a comprehensive analysis of the fault recognition results by exploiting a fuzzy set based reliability decision rule.

preprint2014arXiv

Multifractal Characterization of Protein Contact Networks

The multifractal detrended fluctuation analysis of time series is able to reveal the presence of long-range correlations and, at the same time, to characterize the self-similarity of the series. The rich information derivable from the characteristic exponents and the multifractal spectrum can be further analyzed to discover important insights about the underlying dynamical process. In this paper, we employ multifractal analysis techniques in the study of protein contact networks. To this end, initially a network is mapped to three different time series, each of which is generated by a stationary unbiased random walk. To capture the peculiarities of the networks at different levels, we accordingly consider three observables at each vertex: the degree, the clustering coefficient, and the closeness centrality. To compare the results with suitable references, we consider also instances of three well-known network models and two typical time series with pure monofractal and multifractal properties. The first result of notable interest is that time series associated to proteins contact networks exhibit long-range correlations (strong persistence), which are consistent with signals in-between the typical monofractal and multifractal behavior. Successively, a suitable embedding of the multifractal spectra allows to focus on ensemble properties, which in turn gives us the possibility to make further observations regarding the considered networks. In particular, we highlight the different role that small and large fluctuations of the considered observables play in the characterization of the network topology.

Alireza Sadeghian

What is connected

Connect this record

See the researcher in context

Building this map preview

14 published item(s)

Data-driven detrending of nonstationary fractal time series with echo state networks

A generative model for protein contact networks

Analysis of heat kernel highlights the strongly modular and heat-preserving structure of proteins

Characterization of graphs for protein structure modeling and recognition of solubility

Classifying sequences by the optimized dissimilarity space embedding approach: a case study on the solubility analysis of the E. coli proteome

Data granulation by the principles of uncertainty

Discrimination and characterization of Parkinsonian rest tremors by analyzing long-term correlations and multifractal signatures

Entropic one-class classifiers

On the impact of topological properties of smart grids in power losses optimization problems

On the long-term correlations and multifractal properties of electric arc furnace time series

Position paper: a general framework for applying machine learning techniques in operating room

An Agent-Based Algorithm exploiting Multiple Local Dissimilarities for Clusters Mining and Knowledge Discovery

Modeling and Recognition of Smart Grid Faults by a Combined Approach of Dissimilarity Learning and One-Class Classification

Multifractal Characterization of Protein Contact Networks