Researcher profile

Marcos Faundez-Zanuy

Marcos Faundez-Zanuy contributes to research discovery and scholarly infrastructure.

Researcher - Affiliation not imported - Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
74 works
0 followers
14 topics
4 close collaborators

Published work

74 published item(s)

preprint - 2023 - arXiv

Exploration of Various Fractional Order Derivatives in Parkinson's Disease Dysgraphia Analysis

Parkinson's disease (PD) is a common neurodegenerative disorder with a prevalence rate estimated at 2.0% for people aged over 65 years. Cardinal motor symptoms of PD such as rigidity and bradykinesia affect the muscles involved in the handwriting process, resulting in handwriting abnormalities called PD dysgraphia. Nowadays, the online handwritten signal (a signal with temporal information) acquired by digitizing tablets is the most advanced approach to the analysis of graphomotor difficulties. Although the basic kinematic features have been proven to effectively quantify the symptoms of PD dysgraphia, recent research identified that the theory of fractional calculus can be used to improve the analysis of graphomotor difficulties. Therefore, in this study, we follow up on our previous research, and we aim to explore the utilization of various approaches of fractional order derivative (FD) in the analysis of PD dysgraphia. For this purpose, we used the repetitive loops task from the Parkinson's disease handwriting database (PaHaW). Handwritten signals were parametrized by the kinematic features employing three FD approximations: Grünwald-Letnikov's, Riemann-Liouville's, and Caputo's. Results of the correlation analysis revealed a significant relationship between the clinical state and the handwriting features based on the velocity. The features extracted by Caputo's FD approximation outperformed the rest of the analyzed FD approaches. This was also confirmed by the results of the classification analysis, where the best model trained on Caputo's handwriting features resulted in a balanced accuracy of 79.73% with a sensitivity of 83.78% and a specificity of 75.68%.
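
As a rough illustration of one of the three approximations named in this abstract, the sketch below applies the Grünwald-Letnikov definition to a uniformly sampled signal. It is not the authors' implementation: the function names, the sample values and the unit sampling step are assumptions made for the example. With order 1 it reduces to an ordinary backward difference, which is a convenient sanity check.

```python
def gl_weights(alpha, n):
    """First n Grünwald-Letnikov weights w_k = (-1)^k * binom(alpha, k)."""
    weights = [1.0]
    for k in range(1, n):
        # Standard recursion: w_k = w_{k-1} * (1 - (alpha + 1) / k)
        weights.append(weights[-1] * (1.0 - (alpha + 1.0) / k))
    return weights


def gl_derivative(signal, alpha, h=1.0):
    """Approximate the order-alpha derivative of uniformly sampled data.

    h is the sampling step (assumed constant). Each output sample uses
    the full signal history up to that point.
    """
    weights = gl_weights(alpha, len(signal))
    result = []
    for i in range(len(signal)):
        acc = sum(weights[k] * signal[i - k] for k in range(i + 1))
        result.append(acc / h ** alpha)
    return result


# Hypothetical pen-position samples, one per time step (illustrative only)
position = [0.0, 1.0, 4.0, 9.0]
velocity = gl_derivative(position, alpha=1.0)    # ordinary backward difference
half_order = gl_derivative(position, alpha=0.5)  # fractional-order variant
```

Kinematic features such as velocity would then be computed from the fractional-order output rather than from the plain first difference; the choice of order is a free parameter of such an analysis.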

preprint - 2023 - arXiv

Prodromal Diagnosis of Lewy Body Diseases Based on the Assessment of Graphomotor and Handwriting Difficulties

To date, studies focusing on the prodromal diagnosis of Lewy body diseases (LBDs) based on quantitative analysis of graphomotor and handwriting difficulties are missing. In this work, we enrolled 18 subjects diagnosed with possible or probable mild cognitive impairment with Lewy bodies (MCI-LB), 7 subjects having more than 50% probability of developing Parkinson's disease (PD), 21 subjects with both possible/probable MCI-LB and probability of PD > 50%, and 37 age- and gender-matched healthy controls (HC). Each participant performed three tasks: Archimedean spiral drawing (to quantify graphomotor difficulties), sentence writing task (to quantify handwriting difficulties), and pentagon copying test (to quantify cognitive decline). Next, we parameterized the acquired data by various temporal, kinematic, dynamic, spatial, and task-specific features. Finally, we trained classification models for each task separately as well as a model for their combination to estimate the predictive power of the features for the identification of LBDs. Using this approach, we were able to identify prodromal LBDs with 74% accuracy and showed the promising potential of computerized, objective and non-invasive diagnosis of LBDs based on the assessment of graphomotor and handwriting difficulties.

preprint - 2022 - arXiv

A combination between VQ and covariance matrices for speaker recognition

This paper presents a new algorithm for speaker recognition based on the combination between the classical Vector Quantization (VQ) and Covariance Matrix (CM) methods. The combined VQ-CM method improves the identification rates of each method alone, with comparable computational burden. It offers a straightforward procedure to obtain a model similar to GMM with full covariance matrices. Experimental results also show that it is more robust against noise than VQ or CM alone.
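
For readers unfamiliar with the VQ half of this combination, the following is a minimal sketch of classical VQ-based speaker identification only: each speaker is represented by a codebook, and a test utterance is assigned to the speaker whose codebook gives the lowest average quantization distortion. The covariance-matrix part and the combined VQ-CM method of the paper are not reproduced here; the function names, speaker labels and the tiny two-dimensional codebooks are hypothetical.

```python
def quantization_distortion(vectors, codebook):
    """Mean squared Euclidean distance from each vector to its nearest codeword."""
    total = 0.0
    for v in vectors:
        total += min(sum((a - b) ** 2 for a, b in zip(v, c)) for c in codebook)
    return total / len(vectors)


def identify_speaker(vectors, codebooks):
    """Return the speaker whose codebook quantizes the test vectors best."""
    return min(codebooks, key=lambda name: quantization_distortion(vectors, codebooks[name]))


# Hypothetical 2-D feature vectors and codebooks (illustrative only)
codebooks = {
    "speaker_a": [(0.0, 0.0), (1.0, 1.0)],
    "speaker_b": [(5.0, 5.0), (6.0, 6.0)],
}
test_vectors = [(0.1, 0.0), (0.9, 1.2)]
best = identify_speaker(test_vectors, codebooks)
```

In practice the codebooks would be trained from each speaker's feature vectors (for example with a clustering algorithm), a step omitted from this sketch.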

preprint - 2022 - arXiv

A comparative study between linear and nonlinear speech prediction

This paper focuses on nonlinear prediction coding, which consists of the prediction of a speech sample based on a nonlinear combination of previous samples. It is known that in the generation of the glottal pulse, the wave equation does not behave linearly [2], [10], and we model these effects by means of a nonlinear prediction of speech based on a parametric neural network model. This work is centred on the quantization of the neural net weights and on the compression gain.

preprint - 2022 - arXiv

A comparative study of in-air trajectories at short and long distances in online handwriting

Introduction: Existing literature about online handwriting analysis to support pathology diagnosis has taken advantage of in-air trajectories. A similar situation occurred in biometric security applications where the goal is to identify or verify an individual using his signature or handwriting. These studies do not consider the distance of the pen tip to the writing surface. This is due to the fact that current acquisition devices do not provide height information. However, it is quite straightforward to differentiate movements at two different heights: a) short distance: height lower than or equal to 1 cm above the surface of the digitizer, where the digitizer provides x and y coordinates; b) long distance: height exceeding 1 cm, where the only information available is a time stamp that indicates the time that a specific stroke has spent at long distance. Although short distance has been used in several papers, long distances have been ignored and will be investigated in this paper. Methods: In this paper, we will analyze a large set of databases (BIOSECURID, EMOTHAW, PaHaW, Oxygen-Therapy and SALT), which contain a total of 663 users and 17951 files. We have specifically studied: a) the percentage of time spent on-surface, in-air at short distance, and in-air at long distance for different user profiles (pathological and healthy users) and different tasks; b) the potential use of these signals to improve classification rates. Results and conclusions: Our experimental results reveal that long-distance movements represent a very small portion of the total execution time (0.5% in the case of signatures and 10.4% for uppercase words of BIOSECURID, which is the largest database). In addition, significant differences have been found in the comparison of pathological versus control group for letter l in the PaHaW database (p=0.0157) and crossed pentagons in the SALT database (p=0.0122).

preprint - 2022 - arXiv

A comparative study of several parameterizations for speaker recognition

This paper presents an exhaustive study about the robustness of several parameterizations, in speaker verification and identification tasks. We have studied several mismatch conditions: different recording sessions, microphones, and different languages (it has been obtained from a bilingual set of speakers). This study reveals that the combination of several parameterizations can improve the robustness in all the scenarios for both tasks, identification and verification. In addition, two different methods have been evaluated: vector quantization, and covariance matrices with an arithmetic-harmonic sphericity measure.

preprint - 2022 - arXiv

A modification of the conjugate direction method for motion estimation

A comparative study of different block matching alternatives for motion estimation is presented. The study is focused on computational burden and objective measures on the accuracy of prediction. Together with existing algorithms several new variations have been tested. An interesting modification of the conjugate direction method previously related in literature is reported. This new algorithm shows a good trade-off between computational complexity and accuracy of motion vector estimation. Computational complexity is evaluated using a sequence of artificial images designed to incorporate a great variety of motion vectors. The performance of block matching methods has been measured in terms of the entropy in the error signal between the motion compensated and the original frames.
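
The last sentence of this abstract evaluates block matching by the entropy of the error signal. A minimal sketch of such a measure, assuming a first-order (histogram-based) Shannon entropy over integer pixel residuals, is given below. The abstract does not specify the exact estimator, so this choice and the function name are assumptions.

```python
import math
from collections import Counter


def error_entropy(original, predicted):
    """First-order Shannon entropy (bits per sample) of the prediction residual."""
    residual = [o - p for o, p in zip(original, predicted)]
    counts = Counter(residual)
    total = len(residual)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical flattened pixel values of a frame and its motion-compensated prediction
frame = [1, 2, 1, 2]
prediction = [1, 1, 1, 1]
bits = error_entropy(frame, prediction)  # residual takes two values equally often
```

A lower entropy means the residual is cheaper to encode, so a better motion estimate shows up as a smaller value.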

preprint - 2022 - arXiv

A multimodal approach for Parkinson disease analysis

Parkinson's disease (PD) is the second most frequent neurodegenerative disease, with prevalence among the general population reaching 0.1-1%, and an annual incidence between 1.3-2.0/10000 inhabitants. The mean age at diagnosis of PD is 55 and most patients are between 50 and 80 years old. The most obvious symptoms are movement-related; these include tremor, rigidity, slowness of movement and walking difficulties. Frequently these are the symptoms that lead to the PD diagnosis. Later, thinking and behavioral problems may arise, and other symptoms include cognitive impairment and sensory, sleep and emotional problems. In this paper we will present an ongoing project that will evaluate whether voice and handwriting analysis can be reliable predictors/indicators of swallowing and balance impairments in PD. An important advantage of voice and handwriting analysis is its low intrusiveness and easy implementation in clinical practice. Thus, a significant correlation between these simple analyses and the gold standard video-fluoroscopic analysis would imply a simpler and less stressful diagnostic test for the patients, as well as the use of cheaper analysis systems.

preprint - 2022 - arXiv

A Naturalistic Database of Thermal Emotional Facial Expressions and Effects of Induced Emotions on Memory

This work defines a procedure for collecting naturally induced emotional facial expressions through the viewing of movie excerpts with high emotional content and reports experimental data ascertaining the effects of emotions on memory word recognition tasks. The induced emotional states include the four basic emotions of sadness, disgust, happiness, and surprise, as well as the neutral emotional state. The resulting database contains both thermal and visible emotional facial expressions, portrayed by forty Italian subjects and simultaneously acquired by appropriately synchronizing a thermal and a standard visible camera. Each subject's recording session lasted 45 minutes, allowing for each mode (thermal or visible) to collect a minimum of 2000 facial expressions from which a minimum of 400 were selected as highly expressive of each emotion category. The database is available to the scientific community and can be obtained by contacting one of the authors. For this pilot study, it was found that emotions and/or emotion categories do not affect individual performance on memory word recognition tasks and temperature changes in the face or in some regions of it do not discriminate among emotional states.

preprint - 2022 - arXiv

A new face database simultaneously acquired in visible, near infrared and thermal spectrum

In this paper we present a new database acquired with three different sensors (visible, near infrared and thermal) under different illumination conditions. This database consists of 41 people acquired in four different acquisition sessions, five images per session and three different illumination conditions. The total number of pictures is 7,380. Experimental results are obtained through single sensor experiments as well as the combination of two and three sensors under different illumination conditions (natural, infrared and artificial illumination). We have found that the three spectral bands studied contribute in a nearly equal proportion to a combined system. Experimental results show a significant improvement combining the three spectrums, even when using a simple classifier and feature extractor. In six of the nine scenarios studied we obtained identification rates higher than or equal to 98% when using a trained combination rule, and in two of the nine cases when using a fixed rule.

preprint - 2022 - arXiv

A New Nonlinear speaker parameterization algorithm for speaker identification

In this paper we propose a new parameterization algorithm based on nonlinear prediction, which is an extension of the classical LPC parameters. The performance of the parameters is estimated by two different methods: the Arithmetic-Harmonic Sphericity (AHS) and the Auto-Regressive Vector Model (ARVM). Two different methods are proposed for the parameterization based on the Neural Predictive Coding (NPC): classical neural networks initialization and linear initialization. We applied these two parameters to speaker identification. The first parameters obtained smaller rates. We show for the first parameters how they can be combined with the classical parameters (LPCC, MFCC, etc.) in order to improve the results of only one classical parameterization (MFCC provides 97.55% and MFCC+NPC 98.78%). For the linear initialization, we obtain 100%, which is a great improvement. This study opens a new way towards different parameterization schemes that offer better accuracy on speaker recognition tasks.

preprint - 2022 - arXiv

A Preliminary Study on Aging Examining Online Handwriting

In order to develop infocommunications devices so that the capabilities of the human brain may interact with the capabilities of any artificially cognitive system, a deeper knowledge of aging is necessary. This is especially so if society does not want to exclude elderly people and wants to develop automatic systems able to help and improve the quality of life of this group of the population, healthy individuals as well as those with cognitive decline or other pathologies. This paper tries to establish the variations in handwriting tasks with the goal of obtaining a better knowledge about aging. We present the correlation results between several parameters extracted from online handwriting and the age of the writers. It is based on the BIOSECURID database, which consists of 400 people that provided several biometric traits, including online handwriting. The main idea is to identify those parameters that are more stable and those that are more age dependent. One challenging topic for disease diagnosis is the differentiation between healthy and pathological aging. For this purpose, it is necessary to be aware of handwriting parameters that are, in general, not affected by aging and those that experience changes, increasing or decreasing their values, because of it. This paper contributes to this research line by analyzing a selected set of online handwriting parameters provided by a healthy group of the population aged from 18 to 70 years. Preliminary results show that these parameters are not affected by aging and therefore, changes in their values can only be attributed to motor or cognitive disorders.

preprint - 2022 - arXiv

Adaptive hybrid speech coding with a MLP LPC structure

In recent years there has been a growing interest in nonlinear speech models. Several works have been published revealing the better performance of nonlinear techniques, but little attention has been dedicated to the implementation of the nonlinear model in real applications. This work is focused on the study of the behaviour of a combined linear/nonlinear predictive model based on linear predictive coding (LPC-10) and neural nets, in a speech waveform coder. Our novel scheme obtains an improvement in SEGSNR between 1 and 2.5 dB for an adaptive quantization ranging from 2 to 5 bits.
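
SEGSNR (segmental signal-to-noise ratio), the quality measure quoted in this and several later abstracts, is commonly computed as the average of per-frame SNR values in dB. The sketch below follows that common definition; the frame length, the handling of silent or perfectly reconstructed frames (skipped here) and the function name are assumptions, not details taken from the papers.

```python
import math


def segsnr(original, decoded, frame_len):
    """Average per-frame SNR in dB over non-overlapping frames."""
    values = []
    for start in range(0, len(original) - frame_len + 1, frame_len):
        x = original[start:start + frame_len]
        y = decoded[start:start + frame_len]
        signal_energy = sum(a * a for a in x)
        noise_energy = sum((a - b) ** 2 for a, b in zip(x, y))
        # Assumption: frames with zero signal or zero error are skipped
        if signal_energy > 0 and noise_energy > 0:
            values.append(10.0 * math.log10(signal_energy / noise_energy))
    if not values:
        raise ValueError("no usable frames")
    return sum(values) / len(values)


# Hypothetical samples: each frame has energy 200 and error energy 2, i.e. 20 dB
original = [10.0, 10.0, 10.0, 10.0]
decoded = [9.0, 9.0, 11.0, 11.0]
quality_db = segsnr(original, decoded, frame_len=2)
```

Because each frame contributes equally regardless of its loudness, SEGSNR weights quiet speech segments more heavily than a single global SNR would.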

preprint - 2022 - arXiv

ADPCM with nonlinear prediction

Many speech coders are based on linear prediction coding (LPC); nevertheless, with LPC it is not possible to model the nonlinearities present in the speech signal. Because of this there is a growing interest in nonlinear techniques. In this paper we discuss ADPCM schemes with a nonlinear predictor based on neural nets, which yields an increase of 1-2.5 dB in the SEGSNR over classical methods. This paper will discuss the block-adaptive and sample-adaptive predictions.

preprint - 2022 - arXiv

Applying multi-angled parallelism to Spanish topographical maps

Multi-Angled Parallelism (MAP) is a method to recognize lines in binary images. It is suitable for implementation in parallel processing and image processing hardware. The binary image is transformed into directional planes, upon which directional operators of erosion-dilation are iteratively applied. From a set of basic operators, more complex ones are created, which allow the several types of lines to be extracted. Each type is extracted with a different set of operations and so the lines are identified when extracted. In this paper, an overview of MAP is given, and it is adapted to line recognition in Spanish topographical maps, with the double purpose of testing the method in a real case and studying the process of adapting it to a custom application.

preprint - 2022 - arXiv

Automatic analysis of Categorical Verbal Fluency for Mild Cognitive Impairment detection: a non-linear language independent approach

Alzheimer's disease (AD) is one of the main causes of dementia in the world, and patients develop severe disability and sometimes full dependence. In previous stages, Mild Cognitive Impairment (MCI) produces cognitive loss, but not severe enough to interfere with daily life. This work, on the selection of biomarkers from speech for the detection of AD, is part of a wide-ranging cross study for the diagnosis of Alzheimer's. Specifically, in this work a task for the detection of MCI has been used. The task analyzes Categorical Verbal Fluency. The automatic classification is carried out by SVM over classical linear features, Castiglioni fractal dimension and Permutation Entropy. Finally, the most relevant features are selected by ANOVA test. The promising results are over 50% for MCI.
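
Permutation Entropy, one of the non-linear features named in this abstract, can be sketched as follows using the usual ordinal-pattern definition: count how often each rank ordering occurs in sliding windows and take the Shannon entropy of those frequencies. The default order and delay, the normalization to the range 0 to 1, and the tie-breaking by position are assumptions for illustration, not settings reported by the authors.

```python
import math
from collections import Counter


def permutation_entropy(series, order=3, delay=1):
    """Normalized permutation entropy (0 = fully regular, 1 = all patterns equally likely)."""
    patterns = Counter()
    span = (order - 1) * delay
    for i in range(len(series) - span):
        window = [series[i + j * delay] for j in range(order)]
        # Ordinal pattern: indices sorted by value (ties keep positional order)
        pattern = tuple(sorted(range(order), key=lambda j: window[j]))
        patterns[pattern] += 1
    total = sum(patterns.values())
    if total == 0:
        raise ValueError("series too short for the chosen order and delay")
    entropy = -sum((c / total) * math.log2(c / total) for c in patterns.values())
    return entropy / math.log2(math.factorial(order))


# Hypothetical series: a monotonic ramp and a strictly alternating signal
ramp = permutation_entropy([1, 2, 3, 4, 5], order=3)
alternating = permutation_entropy([1, 2, 1, 2, 1], order=2)
```

The resulting value would be one entry of the feature vector passed to the classifier, alongside the linear features and the fractal dimension.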

preprint - 2022 - arXiv

Biometric security technology

This paper presents an overview of the main topics related to biometric security technology, with the main purpose to provide a primer on this subject. Biometrics can offer greater security and convenience than traditional methods for people recognition. Even if we do not want to replace a classic method (password or handheld token) by a biometric one, for sure, we are potential users of these systems, which will even be mandatory for new passport models. For this reason, to be familiarized with the possibilities of biometric security technology is useful.

preprint - 2022 - arXiv

Biometric verification of humans by means of hand geometry

This paper describes a hand geometry biometric identification system. We have acquired a database of 22 people, 10 acquisitions per person, using a conventional document scanner. We propose a feature extraction method and a classifier. The experimental results reveal a maximum identification rate equal to 93.64%, and a minimum value of the detection cost function equal to 2.92% using a multi layer perceptron classifier.

preprint - 2022 - arXiv

Contribution of Different Handwriting Modalities to Differential Diagnosis of Parkinson's Disease

In this paper, we evaluate the contribution of different handwriting modalities to the diagnosis of Parkinson's disease. We analyse on-surface movement, in-air movement and pressure exerted on the tablet surface. In-air movement and pressure-based features in particular have rarely been taken into account in previous studies. We show that pressure and in-air movement also possess information that is relevant for the diagnosis of Parkinson's Disease (PD) from handwriting. In addition to the conventional kinematic and spatio-temporal features, we present a group of novel features based on entropy and empirical mode decomposition of the handwriting signal. The presented results indicate that handwriting can be used as a biomarker for PD, providing classification performance around 89% area under the ROC curve (AUC) for PD classification.

preprint - 2022 - arXiv

Contributions to interframe coding

Advanced motion models (4 or 6 parameters) are needed for a good representation of the motion experienced by the different objects contained in a sequence of images. If the image is split into very small blocks, then an accurate description of complex movements can be achieved with only 2 parameters. This alternative implies a large set of vectors per image. We propose a new approach to reduce the number of vectors, using different block sizes as a function of the local characteristics of the image, without increasing the error accepted with the smallest blocks. A second algorithm is proposed for an inter/intraframe coder.

preprint - 2022 - arXiv

Digital Speech Algorithms for Speaker De-Identification

The present work is based on the COST Action IC1206 for De-identification in multimedia content. It was performed to test four algorithms of voice modification on a speech gender recognizer to find the degree of modification of pitch at which the speech recognizer has a probability of success equal to the probability of failure. The purpose of this analysis is to assess the intensity of the speech tone modification, the quality, and the reversibility and non-reversibility of the changes made.

preprint - 2022 - arXiv

EMOTHAW: A novel database for emotional state recognition from handwriting

The detection of negative emotions through daily activities such as handwriting is useful for promoting well-being. The spread of human-machine interfaces such as tablets makes the collection of handwriting samples easier. In this context, we present a first publicly available handwriting database which relates emotional states to handwriting, which we call EMOTHAW. This database includes samples of 129 participants whose emotional states, namely anxiety, depression and stress, are assessed by the Depression Anxiety Stress Scales (DASS) questionnaire. Seven tasks are recorded through a digitizing tablet: pentagons and house drawing, words copied in handprint, circles and clock drawing, and one sentence copied in cursive writing. Records consist of pen positions, on-paper and in-air, time stamp, pressure, pen azimuth and altitude. We report our analysis on this database. From collected data, we first compute measurements related to timing and ductus. We compute separate measurements according to the position of the writing device: on paper or in-air. We analyse and classify this set of measurements (referred to as features) using a random forest approach. The latter is a machine learning method [2], based on an ensemble of decision trees, which includes a feature ranking process. We use this ranking process to identify the features which best reveal a targeted emotional state. We then build random forest classifiers associated with each emotional state. Our results, obtained from cross-validation experiments, show that the targeted emotional states can be identified with accuracies ranging from 60% to 71%.

preprint - 2022 - arXiv

Face identification by means of a neural net classifier

This paper describes a novel face identification method that combines the eigenfaces theory with neural nets. We use the eigenfaces methodology in order to reduce the dimensionality of the input image, and a neural net classifier that performs the identification process. The method presented recognizes faces in the presence of variations in facial expression, facial details and lighting conditions. A recognition rate of more than 87% has been achieved, while the classical method of Turk and Pentland achieves 75.5%.

preprint - 2022 - arXiv

Face recognition with small and large size databases

This paper presents experimental results using the ORL (40 people) and FERET (994 people) databases. The ORL database can be useful for securing applications where few users attempting to access are expected. This is the case, for instance, of a PDA or PC where the password is the face of the user. On the other hand, the FERET database is useful for studying those situations where the number of authorized users is around a thousand people.

preprint - 2022 - arXiv

Face segmentation: A comparison between visible and thermal images

Face segmentation is a first step for face biometric systems. In this paper we present a face segmentation algorithm for thermographic images. This algorithm is compared with the classic Viola and Jones algorithm used for visible images. Experimental results reveal that, when segmenting a multispectral (visible and thermal) face database, the proposed algorithm is more than 10 times faster, while the accuracy of face segmentation in thermal images is higher than in the case of Viola-Jones.

preprint - 2022 - arXiv

Gender classification by means of online uppercase handwriting: A text-dependent allographic approach

This paper presents a gender classification schema based on online handwriting. Using samples acquired with a digital tablet that captures the dynamics of the writing, it classifies the writer as a male or a female. The method proposed is allographic, regarding strokes as the structural units of handwriting. Strokes performed while the writing device is not exerting any pressure on the writing surface, pen-up (in-air) strokes, are also taken into account. The method is also text-dependent, meaning that training and testing are done with exactly the same text. Text-dependency allows classification to be performed with very small amounts of text. Experimentation, performed with samples from the BiosecurID database, yields results that fall in the range of the classification averages expected from human judges. With only four repetitions of a single uppercase word, the average rate of well classified writers is 68%; with sixteen words, the rate rises to an average 72.6%. Statistical analysis reveals that the aforementioned rates are highly significant. In order to explore the classification potential of the pen-up strokes, these are also considered. Although in this case results are not conclusive, an outstanding average of 74% of well classified writers is obtained when information from pen-up strokes is combined with information from pen-down ones.

preprint - 2022 - arXiv

HAIDA: Biometric technological therapy tools for neurorehabilitation of Cognitive Impairment

Dementia, and especially Alzheimer's disease (AD) and Mild Cognitive Impairment (MCI), are among the most important diseases suffered by the elderly population. Music therapy is one of the most widely used non-pharmacological treatments in the field of cognitive impairments, given that music influences patients' mood and behavior, decreases anxiety, and facilitates reminiscence, emotional expression and movement. In this work we present HAIDA, a multi-platform support system for Musical Therapy oriented to cognitive impairment, which includes not only therapy tools but also non-invasive biometric analysis, speech, activity and hand activity. At this moment the system is in use and recording the first sets of data.

preprint - 2022 - arXiv

Hand Geometry Based Recognition with a MLP Classifier

This paper presents a biometric recognition system based on hand geometry. We describe a database specially collected for research purposes, which consists of 50 people and 10 different acquisitions of the right hand. This database can be freely downloaded. In addition, we describe a feature extraction procedure and we obtain experimental results using different classification strategies based on Multi Layer Perceptrons (MLP). We have evaluated identification rates and Detection Cost Function (DCF) values for verification applications. Experimental results reveal up to 100% identification and 0% DCF.

preprint - 2022 - arXiv

Handwriting Biometrics: Applications and Future Trends in e-Security and e-Health

Background- This paper summarizes the state-of-the-art and applications based on online handwriting signals with special emphasis on e-security and e-health fields. Methods- In particular, we focus on the main achievements and challenges that should be addressed by the scientific community, providing a guide document for future research. Conclusions- Among all the points discussed in this article, we remark the importance of considering security, health, and metadata from a joint perspective. This is especially critical due to the dual-use possibilities of these behavioral signals.

preprint - 2022 - arXiv

Identification of Hypokinetic Dysarthria Using Acoustic Analysis of Poem Recitation

Up to 90% of patients with Parkinson's disease (PD) suffer from hypokinetic dysarthria (HD). In this work, we analysed the power of conventional speech features quantifying imprecise articulation, dysprosody, speech dysfluency and speech quality deterioration extracted from a specialized poem recitation task to discriminate dysarthric and healthy speech. For this purpose, 152 speakers (53 healthy speakers, 99 PD patients) were examined. Only a mildly strong correlation between speech features and clinical status of the speakers was observed. In the case of univariate classification analysis, sensitivity of 62.63% (imprecise articulation), 61.62% (dysprosody), 71.72% (speech dysfluency) and 59.60% (speech quality deterioration) was achieved. Multivariate classification analysis improved the classification performance. Sensitivity of 83.42% using only two features describing imprecise articulation and speech quality deterioration in HD was achieved. We showed the promising potential of the selected speech features and especially the use of the poem recitation task to quantify and identify HD in PD.

preprint - 2022 - arXiv

Multi-class versus One-class classifier in spontaneous speech analysis oriented to Alzheimer Disease diagnosis

Most medical developments require the ability to identify samples that are anomalous with respect to a target group or control group, in the sense that they could belong to a new, previously unseen class or are not class data. When there are not enough data to train a two-class classifier, one-class classification appears as an available solution. On the other hand, non-linear approaches could give very useful information. The aim of our project is to contribute to earlier diagnosis of AD and better estimates of its severity by using automatic analysis performed through new biomarkers extracted from the speech signal. The methods selected in this case are speech biomarkers oriented to Spontaneous Speech and Emotional Response Analysis. In this approach, one-class classifiers and two-class classifiers are analyzed. The use of information about outliers and Fractal Dimension features improves the system performance.

preprint - 2022 - arXiv

Multi-focus thermal image fusion

This paper proposes a novel algorithm for multi-focus thermal image fusion. The algorithm is based on local activity analysis and advanced pre-selection of images into the fusion process. The algorithm improves the object temperature measurement error by up to 5 degrees Celsius. The proposed algorithm is evaluated by half total error rate, root mean squared error, cross correlation and visual inspection. To the best of our knowledge, this is the first work devoted to multi-focus thermal image fusion. For testing the proposed algorithm, we acquired six thermal image sets with objects at different focal depths.

preprint - 2022 - arXiv

Non-linear predictive vector quantization of speech

In this paper we propose a Non-Linear Predictive Vector quantizer (PVQ) for speech coding, based on Multi-Layer Perceptrons. We also propose a method to evaluate if a quantizer is well designed, and if it exploits the correlation between consecutive outputs. Although the results of the Non-linear PVQ do not improve the results of the non-linear scalar predictor, we check that there is some room for the PVQ improvement.

preprint - 2022 - arXiv

Non-Linear Speech coding with MLP, RBF and Elman based prediction

In this paper we propose a nonlinear scalar predictor based on a combination of Multi-Layer Perceptron, Radial Basis Function and Elman networks. This system is applied to speech coding in an ADPCM backward scheme. The combination of these predictors improves on the results of any one predictor alone. A comparative study of these three neural networks for speech prediction is also presented.

preprint2022arXiv

Nonlinear predictive models computation in ADPCM schemes

Recently several papers have been published on nonlinear prediction applied to speech coding. At ICASSP98 we presented a system based on an ADPCM scheme with a nonlinear predictor based on a neural net. The most critical parameter was the training procedure in order to achieve good generalization capability and robustness against mismatch between training and testing conditions. In this paper, we propose several new approaches that improve the performance of the original system by up to 1.2 dB of SEGSNR (using Bayesian regularization). The variance of the SEGSNR between frames is also minimized, so the new scheme produces a more stable quality of the output.
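
The SEGSNR figures quoted above are segmental signal-to-noise ratios. As a rough illustration only, the sketch below computes a per-frame SNR in dB and then its mean and variance across frames. The function names, the frame length of 160 samples and the rule for skipping silent frames are assumptions made for this example and are not taken from the paper.

```python
import math
import statistics


def frame_snrs_db(reference, decoded, frame_len=160, eps=1e-12):
    """Return the SNR in dB of each complete, non-silent frame.

    frame_len and eps are illustrative assumptions, not values from the paper.
    """
    if len(reference) != len(decoded):
        raise ValueError("reference and decoded must have the same length")
    snrs = []
    for start in range(0, len(reference) - frame_len + 1, frame_len):
        ref = reference[start:start + frame_len]
        dec = decoded[start:start + frame_len]
        signal_energy = sum(x * x for x in ref)
        noise_energy = sum((x - y) ** 2 for x, y in zip(ref, dec))
        if signal_energy <= eps:
            continue  # skip frames where the reference is silent
        snrs.append(10.0 * math.log10(signal_energy / max(noise_energy, eps)))
    return snrs


def segsnr_summary(reference, decoded, frame_len=160):
    """Return (mean, population variance) of the per-frame SNR values in dB."""
    snrs = frame_snrs_db(reference, decoded, frame_len)
    if not snrs:
        raise ValueError("no usable frames")
    return statistics.mean(snrs), statistics.pvariance(snrs)
```

A lower variance across frames corresponds to the "more stable quality" the abstract refers to; the mean is the SEGSNR itself.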

preprint2022arXiv

On handwriting pressure normalization for interoperability of different acquisition stylus

In this paper, we present a pressure characterization and normalization procedure for online handwritten acquisition. The normalization process has been tested in biometric recognition experiments (identification and verification) using the MCYT online signature database, which consists of the signatures from 330 users. The goal is to analyze real mismatch scenarios where users are enrolled with one stylus and then, later on, produce some testing samples using a different stylus model with a different pressure response. Experimental results show: 1) a saturation behavior in the pressure signal; 2) different dynamic ranges in the different styluses studied; 3) improved biometric recognition accuracy by means of pressure signal normalization, as well as a performance degradation in mismatched conditions; 4) interoperability between different styluses can be obtained by means of pressure normalization. Normalization produces an improvement in signature identification rates higher than 7% (absolute value) when compared with mismatched scenarios.

preprint2022arXiv

On the focusing of thermal images

In this paper we present a new thermographic image database suitable for the analysis of automatic focus measures. This database consists of 8 different sets of scenes, where each scene contains one image for 96 different focus positions. Using this database we evaluate the usefulness of six focus measures with the goal to determine the optimal focus position. Experimental results reveal that an accurate automatic detection of optimal focus position is possible, even with a low computational burden. We also present an acquisition tool able to help the acquisition of thermal images. To the best of our knowledge, this is the first study about automatic focus of thermal images.
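
To make the idea of an automatic focus measure concrete, here is a minimal sketch that scores each image in a focus stack and picks the position with the highest score. The abstract does not name the six measures that were evaluated, so the gradient-energy measure used here is an assumption chosen for illustration, as are the function names and the representation of images as nested lists.

```python
def gradient_energy(image):
    """Sum of squared horizontal and vertical pixel differences.

    `image` is a list of rows of numeric pixel values. Sharper images have
    stronger local differences and therefore a higher score.
    """
    rows, cols = len(image), len(image[0])
    total = 0.0
    for r in range(rows - 1):
        for c in range(cols - 1):
            dx = image[r][c + 1] - image[r][c]
            dy = image[r + 1][c] - image[r][c]
            total += dx * dx + dy * dy
    return total


def best_focus_position(stack, measure=gradient_energy):
    """Return (index of the highest-scoring image, list of all scores)."""
    scores = [measure(image) for image in stack]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores
```

With a stack of 96 images per scene, as in the database described above, `best_focus_position(stack)` would return the index of the focus position that this particular measure considers sharpest.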

preprint2022arXiv

On the Handwriting Tasks' Analysis to Detect Fatigue

Practical determination of physical recovery after intense exercise is a challenging topic that must include mechanical aspects as well as cognitive ones, because most physical sport activities, as well as professional activities (including brain computer interface-operated systems), require good shape in both of them. This paper presents a new online handwritten database of 20 healthy subjects. The main goal was to study the influence of several physical exercise stimuli on different handwritten tasks and to evaluate the recovery after strenuous exercise. To this end, the subjects performed different handwritten tasks before and after physical exercise, together with other measurements such as metabolic and mechanical fatigue assessment. Experimental results showed that although a fast mechanical recovery happens and can be measured by lactate concentrations and mechanical fatigue, this is not the case when cognitive effort is required. Handwriting analysis revealed that statistical differences exist in handwriting performance even after lactate concentration and mechanical assessment recovery. Conclusions: This points to a need for more recovery time in sport and professional activities than that measured in classic ways.

preprint2022arXiv

On the relevance of bandwidth extension for speaker identification

In this paper we discuss the relevance of bandwidth extension for speaker identification tasks. Mainly we want to study whether it is possible to recognize voices that have been bandwidth extended. For this purpose, we created two different databases (microphonic and ISDN) of speech signals that were bandwidth extended from telephone bandwidth ([300, 3400] Hz) to full bandwidth ([100, 8000] Hz). We have evaluated different parameterizations, and we have found that the MELCEPST parameterization can take advantage of the bandwidth extension algorithms in several situations.

preprint2022arXiv

On the Relevance of Bandwidth Extension for Speaker Verification

In this paper, we consider the effect of a bandwidth extension of narrow-band speech signals (0.3-3.4 kHz) to 0.3-8 kHz on speaker verification. Using covariance matrix based verification systems together with detection error trade-off curves, we compare the performance between systems operating on narrow-band, wide-band (0-8 kHz), and bandwidth-extended speech. The experiments were conducted using different short-time spectral parameterizations derived from microphone and ISDN speech databases. The studied bandwidth-extension algorithm did not introduce artifacts that affected the speaker verification task, and we achieved improvements between 1 and 10 percent (depending on the model order) over the verification system designed for narrow-band speech when mel-frequency cepstral coefficients for the short-time spectral parameterization were used.

preprint2022arXiv

On-line signature verification system with failure to enroll managing

In this paper we simulate a real biometric verification system based on on-line signatures. For this purpose we have split the MCYT signature database into three subsets: one for classifier training, another for system adjustment and a third one for system testing, simulating enrollment and verification. This context corresponds to a real operation, where a new user tries to enroll in an existing system and must be automatically guided by the system in order to detect failure to enroll situations. The main contribution of this work is the management of failure to enroll situations by means of a new proposal, called intelligent enrollment, which consists of consistency checking in order to automatically reject low quality samples. This strategy improves the verification errors by up to 22% when leaving out 8% of the users. In this situation 8% of the people cannot be enrolled in the system and must be verified by other biometrics or by human abilities. These people are identified with intelligent enrollment and the situation can thus be managed. In addition we also propose a DCT-based feature extractor with threshold coding and discriminability criteria.

preprint2022arXiv

Online handwriting, signature and touch dynamics: tasks and potential applications in the field of security and health

Background: An advantageous property of behavioural signals, e.g. handwriting, in contrast to morphological ones, such as iris, fingerprint, hand geometry, etc., is the possibility to ask a user for a very rich variety of different tasks. Methods: This article summarises recent findings and applications of different handwriting and drawing tasks in the field of security and health. More specifically, it is focused on on-line handwriting and hand-based interaction, i.e. signals that utilise a digitizing device (a dedicated or general-purpose tablet/smartphone) during the realization of the tasks. Such devices permit the acquisition of on-surface dynamics as well as in-air movements in time, thus providing complex and richer information when compared to the conventional pen and paper method. Conclusions: Although the scientific literature reports a wide range of tasks and applications, in this paper, we summarize only those providing competitive results (e.g. in terms of discrimination power) and having a significant impact in the field.

preprint2022arXiv

Perceptual Features as Markers of Parkinson's Disease: The Issue of Clinical Interpretability

Up to 90% of patients with Parkinson's disease (PD) suffer from hypokinetic dysarthria (HD), which is also manifested in the field of phonation. Clinical signs of HD like monoloudness, monopitch or hoarse voice are usually quantified by conventional clinically interpretable features (jitter, shimmer, harmonic-to-noise ratio, etc.). This paper provides large and robust insight into perceptual analysis of 5 Czech vowels of 84 PD patients and proves that, despite the clinical inexplicability, the perceptual features outperform the conventional ones, especially in terms of discrimination power (classification accuracy ACC = 92 %, sensitivity SEN = 93 %, specificity SPE = 92 %) and partial correlation with clinical scores like UPDRS (Unified Parkinson's disease rating scale), MMSE (Mini-mental state examination) or FOG (Freezing of gait questionnaire), where p < 0.0001.

preprint2022arXiv

Preliminary experiments on thermal emissivity adjustment for face images

In this paper we summarize several applications based on thermal imaging. We emphasize the importance of emissivity adjustment for a proper temperature measurement. A new set of face images acquired at different emissivity values with steps of 0.01 is also presented and will be distributed for free for research purposes. Among the utilities, we can mention: a) the possibility to apply corrections once an image is acquired with a wrong emissivity value and it is not possible to acquire a new one; b) privacy protection in thermal images, which can be obtained with a low emissivity factor, which is still suitable for several applications, but hides the identity of a user; c) image processing for improving temperature detection in scenes containing objects of different emissivity.

preprint2022arXiv

Privacy issues on biometric systems

In the 21st century there is a strong interest in privacy issues. Technology permits obtaining personal information without individuals' consent, computers make it feasible to share and process this information, and this can bring about damaging implications. In some sense, biometric information is personal information, so it is important to be conscious about what is true and what is false when some people claim that biometrics is an attack on individuals' privacy. In this paper, key points related to this matter are dealt with.

preprint2022arXiv

Reliability and Validity of the Polar V800 Sports Watch for Estimating Vertical Jump Height

This study aimed to assess the reliability and validity of the Polar V800 to measure vertical jump height. Twenty-two physically active healthy men (age: 22.89 ± 4.23 years; body mass: 70.74 ± 8.04 kg; height: 1.74 ± 0.76 m) were recruited for the study. The reliability was evaluated by comparing measurements acquired by the Polar V800 in two identical testing sessions one week apart. Validity was assessed by comparing measurements simultaneously obtained using a force platform (gold standard), high-speed camera and the Polar V800 during squat jump (SJ) and countermovement jump (CMJ) tests. In the test-retest reliability, high intraclass correlation coefficients (ICCs) were observed (mean: 0.90, SJ and CMJ) in the Polar V800. There was no significant systematic bias ± random errors (p > 0.05) between test-retest. Low coefficients of variation (<5%) were detected in both jumps in the Polar V800. In the validity assessment, similar jump height was detected among devices (p > 0.05). There was almost perfect agreement between the Polar V800 compared to a force platform for the SJ and CMJ tests (Mean ICCs = 0.95; no systematic bias ± random errors in SJ mean: -0.38 ± 2.10 cm, p > 0.05). Mean ICC between the Polar V800 versus high-speed camera was 0.91 for the SJ and CMJ tests, however, a significant systematic bias ± random error (0.97 ± 2.60 cm; p = 0.01) was detected in CMJ test. The Polar V800 offers valid, compared to force platform, and reliable information about vertical jump height performance in physically active healthy young men.

preprint2022arXiv

Robust and Complex Approach of Pathological Speech Signal Analysis

This paper presents a study of the approaches in the state-of-the-art in the field of pathological speech signal analysis with a special focus on parametrization techniques. It provides a description of 92 speech features where some of them are already widely used in this field of science and some of them have not been tried yet (they come from different areas of speech signal processing like speech recognition or coding). As an original contribution, this work introduces 36 completely new pathological voice measures based on modulation spectra, inferior colliculus coefficients, bicepstrum, sample and approximate entropy and empirical mode decomposition. The significance of these features was tested on 3 (English, Spanish and Czech) pathological voice databases with respect to classification accuracy, sensitivity and specificity.

preprint2022arXiv

Selection of entropy based features for the analysis of the Archimedes' spiral applied to essential tremor

Biomedical systems are regulated by interacting mechanisms that operate across multiple spatial and temporal scales and produce biosignals with linear and non-linear information inside. In this sense entropy could provide a useful measure of disorder in the system, lack of information in time-series and/or irregularity of the signals. Essential tremor (ET) is the most common movement disorder, being 20 times more common than Parkinson's disease, and 50-70% of cases of this disease are estimated to be genetic in origin. Archimedes spiral drawing is one of the most used standard tests for clinical diagnosis. This work, on selection of nonlinear biomarkers from drawings and handwriting, is part of a wide-ranging cross study for the diagnosis of essential tremor in BioDonostia Health Institute. Several entropy algorithms are used to generate nonlinear features. The automatic analysis system consists of several Machine Learning paradigms.
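
As a simple illustration of entropy as a measure of signal irregularity, the sketch below computes the Shannon entropy of a histogram of sample values. The abstract does not say which entropy algorithms were used, so this choice, the function name and the default of 8 bins are assumptions made for the example only.

```python
import math
from collections import Counter


def shannon_entropy_bits(samples, n_bins=8):
    """Shannon entropy (in bits) of an equal-width histogram of `samples`.

    Returns 0.0 for a constant signal; the maximum is log2(n_bins), reached
    when the samples are spread evenly over all bins.
    """
    lo, hi = min(samples), max(samples)
    if hi == lo:
        return 0.0
    width = (hi - lo) / n_bins
    # Clamp the largest value into the last bin.
    counts = Counter(min(int((s - lo) / width), n_bins - 1) for s in samples)
    n = len(samples)
    return -sum((k / n) * math.log2(k / n) for k in counts.values())
```

Applied to, for example, the pen velocity of a spiral drawing, a more irregular signal would spread over more bins and give a higher value.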

preprint2022arXiv

Speaker Identification Experiments Under Gender De-Identification

The present work is based on the COST Action IC1206 for De-identification in multimedia content. It was performed to test four voice modification algorithms on a speech gender recognizer, to find the degree of pitch modification at which the speech recognizer has a probability of success equal to the probability of failure. The purpose of this analysis is to assess the intensity of the speech tone modification, the quality, and the reversibility or non-reversibility of the changes made. Keywords: De-identification; Speech Algorithms

preprint2022arXiv

Speaker recognition by means of a combination of linear and nonlinear predictive models

This paper deals with the combination of nonlinear predictive models with classical LPCC parameterization for speaker recognition. It is shown that the combination of both a measure defined over LPCC coefficients and a measure defined over the predictive analysis residual signal gives rise to an improvement over the classical method that considers only the LPCC coefficients. If the residual signal is obtained from a linear prediction analysis, the improvement is 2.63% (error rate drops from 6.31% to 3.68%) and if it is computed through a nonlinear predictive model based on neural nets, the improvement is 3.68%. An efficient algorithm for reducing the computational burden is also proposed.

preprint2022arXiv

Speaker recognition improvement using blind inversion of distortions

In this paper we propose the inversion of nonlinear distortions in order to improve the recognition rates of a speaker recognizer system. We study the effect of saturations on the test signals, trying to take into account real situations where the training material has been recorded in a controlled situation but the testing signals present some mismatch with the input signal level (saturations). The experimental results show that a combination of data fusion with and without nonlinear distortion compensation can improve the recognition rates with saturated test sentences from 80% to 88.57%, while the result with clean speech (without saturation) is 87.76% for one microphone.

preprint2022arXiv

Speaker recognition using residual signal of linear and nonlinear prediction models

This paper discusses the usefulness of the residual signal for speaker recognition. It is shown that the combination of both a measure defined over LPCC coefficients and a measure defined over the energy of the residual signal gives rise to an improvement over the classical method which considers only the LPCC coefficients. If the residual signal is obtained from a linear prediction analysis, the improvement is 2.63% (error rate drops from 6.31% to 3.68%) and if it is computed through a nonlinear predictive model based on neural nets, the improvement is 3.68%.

preprint2022arXiv

Speaker verification in mismatch training and testing conditions

This paper presents an exhaustive study about the robustness of several parameterizations, with a new database specially acquired for the purpose of a speaker recognition application. This database includes the following variations: different recording sessions (including telephonic and microphonic recordings), recording rooms, and languages (it has been obtained from a bilingual set of speakers). This study has been performed with covariance matrices in a text independent speaker verification application. It reveals that the combination of several parameterizations can improve the robustness in all the scenarios.

preprint2022arXiv

Speech segmentation using multilevel hybrid filters

A novel approach for speech segmentation is proposed, based on Multilevel Hybrid (mean/min) Filters (MHF) with the following features: an accurate transition location, and good performance in noisy environments (Gaussian and impulsive noise). The proposed method is based on spectral changes, with the goal of segmenting the voice into homogeneous acoustic segments. This algorithm is being used for a phonetically segmented speech coder, with successful results.

preprint2022arXiv

Speech watermarking: an approach for the forensic analysis of digital telephonic recordings

In this article, the authors discuss the problem of forensic authentication of digital audio recordings. Although forensic audio has been addressed in several articles, the existing approaches are focused on analog magnetic recordings, which are less prevalent because of the large amount of digital recorders available on the market (optical, solid state, hard disks, etc.). An approach based on digital signal processing that consists of spread spectrum techniques for speech watermarking is presented. This approach presents the advantage that the authentication is based on the signal itself rather than the recording format. Thus, it is valid for usual recording devices in police-controlled telephone intercepts. In addition, our proposal allows for the introduction of relevant information such as the recording date and time and all the relevant data (this is not always possible with classical systems). Our experimental results reveal that the speech watermarking procedure does not interfere in a significant way with the posterior forensic speaker identification.

preprint2022arXiv

State-of-the-art in speaker recognition

Recent advances in speech technologies have produced new tools that can be used to improve the performance and flexibility of speaker recognition. While there are few degrees of freedom or alternative methods when using fingerprint or iris identification techniques, speech offers much more flexibility and different levels for performing recognition: the system can force the user to speak in a particular manner, different for each attempt to enter. Also, with voice input the system has other degrees of freedom, such as the use of knowledge/codes that only the user knows, or dialectical/semantical traits that are difficult to forge. This paper offers an overview of the state of the art in speaker recognition, with special emphasis on the pros and cons, and the current research lines. The current research lines include improved classification systems, and the use of high level information by means of probabilistic grammars. In conclusion, speaker recognition is far away from being a technology where all the possibilities have already been explored.

preprint2022arXiv

Study of a committee of neural networks for biometric hand-geometry recognition

This paper studies different committees of neural networks for biometric pattern recognition. We use the neural nets as classifiers for identification and verification purposes. We show that a committee of nets can improve the recognition rates when compared with a multi-start initialization algorithm that just picks up the neural net which offers the best performance. On the other hand, we found that there is no strong correlation between identification and verification applications using the same classifier.

preprint2022arXiv

Technological evaluation of two AFIS systems

This paper provides a technological evaluation of two Automatic Fingerprint Identification Systems (AFIS) used in forensic applications. Both of them are installed and working in Spanish police premises. The first one is a Printrak AFIS 2000 system with a database of more than 450,000 fingerprints, while the second one is a NEC AFIS 21 SAID NT-LEXS Release 2.4.4 with a database of more than 15 million fingerprints. Our experiments reveal that although both systems can manage inkless fingerprints, the latter offers better experimental results.

preprint2022arXiv

Testing report of a fingerprint-based door-opening system

This paper describes the operational evaluation of a door-opening system based on a low-cost inkless fingerprint sensor. This system has been developed and installed for access control to one of our laboratories. Experimental results reveal that the system is working fine and that neither special cleaning nor component replacement is needed. It can support more than 50 users, and an average of 74.5 access attempts per day in a 14-hour, 5-day-per-week working schedule. Emphasis is also placed on some important facts to be taken into consideration when comparing and evaluating different products from different vendors.

preprint2022arXiv

The effect of fatigue on the performance of online writer recognition

Background: The performance of biometric modalities based on things done by the subject, like signature and text-based recognition, may be affected by the subject state. Fatigue is one of the conditions that can significantly affect the outcome of handwriting tasks. Recent research has already shown that physical fatigue produces measurable differences in some features extracted from common writing and drawing tasks. It is important to establish to which extent physical fatigue contributes to the intra-person variability observed in these biometric modalities and also to know whether the performance of recognition methods is affected by fatigue. Goal: In this paper we assess the impact of fatigue on intra-user variability and on the performance of signature-based and text-based writer recognition approaches encompassing both identification and verification. Methods: Several signature and text recognition methods are considered and applied to samples gathered after different levels of induced fatigue, measured by metabolic and mechanical assessment and, also by subjective perception. The recognition methods are Dynamic Time Warping and Multi Section Vector Quantization, for signatures, and Allographic Text-Dependent Recognition for text in capital letters. For each fatigue level, the identification and verification performance of these methods is measured. Results: Signature shows no statistically significant intra-user impact, but text does. On the other hand, performance of signature-based recognition approaches is negatively impacted by fatigue whereas the impact is not noticeable in text-based recognition, provided long enough sequences are considered.

preprint2022arXiv

Thermal hand image segmentation for biometric recognition

In this paper we present a method to identify people by means of thermal (TH) and visible (VIS) hand images acquired simultaneously with a TESTO 882-3 camera. In addition, we also present a new database specially acquired for this work. The real challenge when dealing with TH images is the cold finger areas, which can be confused with the acquisition surface. This problem is solved by taking advantage of the VIS information. We have performed different tests to show how TH and VIS images work in identification problems. Experimental results reveal that TH hand images are as suitable for biometric recognition systems as VIS hand images, and better results are obtained when combining this information. A Biometric Dispersion Matcher has been used as a feature vector dimensionality reduction technique as well as a classifier. Its selection criteria help to reduce the length of the vectors used to perform identification to as few as a hundred measurements. Identification rates reach a maximum value of 98.3% under these conditions, when using a database of 104 people.

preprint2022arXiv

Wide band sub-band speech coding using nonlinear prediction

We compare a wide band sub-band speech coder using ADPCM schemes with linear prediction against the same scheme with nonlinear prediction based on multi-layer perceptrons. Exhaustive results are presented in each band, and the full signal. Our proposed scheme with non-linear neural net prediction outperforms the linear scheme up to 2 dB in SEGSNR. In addition, we propose a simple method based on a non-linearity in order to obtain a synthetic wide band signal from a narrow band signal.