Source author record

Archi Banerjee

Archi Banerjee appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Sound, eess.AS, nlin.CD, Computation and Language, Machine Learning, Multimedia, Neurons and Cognition, physics.class-ph, physics.data-an

Catalog footprint

What is connected

10 works

9 topics

4 close collaborators


preprint, 2021, arXiv

A Fractal Approach to Characterize Emotions in Audio and Visual Domain: A Study on Cross-Modal Interaction

It is already known that both auditory and visual stimuli can convey emotions to the human mind to different extents. The strength or intensity of the emotional arousal varies depending on the type of stimulus chosen. In this study, we investigate emotional arousal in a cross-modal scenario involving both auditory and visual stimuli while studying their source characteristics. A robust fractal analytic technique called Detrended Fluctuation Analysis (DFA) and its 2D analogue have been used to characterize three (3) standardized audio and video signals, quantifying their scaling exponents corresponding to positive and negative valence. A significant difference was found between the scaling exponents corresponding to the two modalities. Detrended Cross Correlation Analysis (DCCA) has also been applied to determine the degree of cross-correlation among the individual audio and visual stimuli. This is the first study of its kind to propose a novel algorithm with which emotional arousal can be classified in a cross-modal scenario using only the source audio and visual signals, while also attempting a correlation between them.
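As a rough illustration of how a degree of cross-correlation between two signals can be quantified after detrending, the sketch below computes the normalized detrended cross-correlation coefficient at a single window size. This is a swapped-in, commonly used variant rather than necessarily the quantity the authors computed; the non-overlapping windows, the linear detrending, and the names `dcca_coefficient`, `audio` and `video` are all assumptions made for the example.

```python
import math
import random

def profile(series):
    """Cumulative sum of the mean-removed series."""
    mean = sum(series) / len(series)
    out, total = [], 0.0
    for value in series:
        total += value - mean
        out.append(total)
    return out

def detrend(window):
    """Residuals around a least-squares straight line fitted to one window."""
    n = len(window)
    x_mean = (n - 1) / 2
    y_mean = sum(window) / n
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    sxy = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(window))
    slope = sxy / sxx
    return [y - y_mean - slope * (i - x_mean) for i, y in enumerate(window)]

def dcca_coefficient(x, y, scale):
    """Detrended cross-correlation coefficient in [-1, 1] at one window size."""
    if len(x) != len(y):
        raise ValueError("series must have equal length")
    n_windows = len(x) // scale
    if scale < 3 or n_windows < 1:
        raise ValueError("scale must be >= 3 and no longer than the series")
    px, py = profile(x), profile(y)
    f_xy = f_xx = f_yy = 0.0
    for k in range(n_windows):
        rx = detrend(px[k * scale:(k + 1) * scale])
        ry = detrend(py[k * scale:(k + 1) * scale])
        f_xy += sum(a * b for a, b in zip(rx, ry))  # detrended covariance
        f_xx += sum(a * a for a in rx)              # detrended variance of x
        f_yy += sum(b * b for b in ry)              # detrended variance of y
    return f_xy / math.sqrt(f_xx * f_yy)

# Hypothetical stand-ins for an audio series and a video-derived series.
rng = random.Random(0)
audio = [rng.gauss(0.0, 1.0) for _ in range(512)]
video = [-value for value in audio]

same = dcca_coefficient(audio, audio, scale=16)
opposite = dcca_coefficient(audio, video, scale=16)
```

A series compared with itself gives a coefficient of 1 and a sign-flipped copy gives -1; real audio and video series would fall somewhere in between, and the value generally depends on the window size chosen.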

preprint, 2021, arXiv

Language Independent Emotion Quantification using Non linear Modelling of Speech

At present, emotion extraction from speech is a very important issue due to its diverse applications. Hence, it becomes absolutely necessary to obtain models that take into consideration the speaking style of a person, vocal tract information, timbral qualities and other congenital information regarding the speaker's voice. Our speech production system is a nonlinear system, like most other real-world systems. Hence the need arises for modelling our speech information using nonlinear techniques. In this work we have modelled our articulation system using nonlinear multifractal analysis. The multifractal spectral width and scaling exponents essentially reveal the complexity associated with the speech signals taken. The multifractal spectra are well distinguishable in the low fluctuation region for different emotions. The source characteristics have been quantified with the help of different nonlinear models such as Multifractal Detrended Fluctuation Analysis and Wavelet Transform Modulus Maxima. The results obtained from this study give very good emotion clustering.
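To make the terms "scaling exponents" and "multifractal spectral width" more concrete, here is a minimal sketch of Multifractal Detrended Fluctuation Analysis in plain Python. It assumes linear detrending, non-overlapping windows, and a small hypothetical set of moment orders `q_values` and window sizes; none of these settings come from the paper, and the width is estimated with simple finite differences.

```python
import math
import random

def fit_slope(xs, ys):
    """Least-squares slope of ys against xs."""
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx

def window_variances(signal, scale):
    """Mean squared residual around a fitted line in each non-overlapping window."""
    mean = sum(signal) / len(signal)
    profile, total = [], 0.0
    for value in signal:
        total += value - mean
        profile.append(total)
    xs = list(range(scale))
    variances = []
    for k in range(len(profile) // scale):
        window = profile[k * scale:(k + 1) * scale]
        slope = fit_slope(xs, window)
        intercept = sum(window) / scale - slope * (scale - 1) / 2
        variances.append(sum((y - intercept - slope * x) ** 2
                             for x, y in zip(xs, window)) / scale)
    return variances

def generalized_hurst(signal, q_values, scales):
    """h(q): slope of log F_q(s) against log s for each moment order q."""
    per_scale = [window_variances(signal, s) for s in scales]
    log_scale = [math.log(s) for s in scales]
    h = []
    for q in q_values:
        log_fq = []
        for variances in per_scale:
            if q == 0:  # the q -> 0 limit uses a logarithmic average
                log_fq.append(0.5 * sum(math.log(v) for v in variances) / len(variances))
            else:
                moment = sum(v ** (q / 2) for v in variances) / len(variances)
                log_fq.append(math.log(moment) / q)
        h.append(fit_slope(log_scale, log_fq))
    return h

def spectral_width(q_values, h):
    """Width of the singularity spectrum, max(alpha) - min(alpha)."""
    tau = [q * hq - 1 for q, hq in zip(q_values, h)]
    alpha = [(tau[i + 1] - tau[i - 1]) / (q_values[i + 1] - q_values[i - 1])
             for i in range(1, len(tau) - 1)]
    return max(alpha) - min(alpha)

# White noise stands in for a speech signal; it should give h(2) near 0.5.
rng = random.Random(1)
noise = [rng.gauss(0.0, 1.0) for _ in range(4096)]
q_values = [-3, -2, -1, 0, 1, 2, 3]
h = generalized_hurst(noise, q_values, scales=(16, 32, 64, 128))
width = spectral_width(q_values, h)
```

Negative `q` emphasises windows with small fluctuations and positive `q` those with large fluctuations, which is one way to read the abstract's remark about the low fluctuation region. A wider spectrum is usually read as a more complex signal.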

preprint, 2021, arXiv

Neural Network architectures to classify emotions in Indian Classical Music

Music is often considered the language of emotions. It has long been known to elicit emotions in human beings, and thus categorizing music based on the type of emotions it induces is a very intriguing topic of research. When the task is to classify emotions elicited by Indian Classical Music (ICM), it becomes much more challenging because of the inherent ambiguity associated with ICM. The fact that a single musical performance can evoke a variety of emotional responses in the audience is implicit in the nature of ICM renditions. With the rapid advancements in the field of Deep Learning, this Music Emotion Recognition (MER) task is becoming more and more relevant and robust, and hence can be applied to one of the most challenging test cases, i.e. classifying emotions elicited by ICM. In this paper we present a new dataset called JUMusEmoDB, which presently has 400 audio clips (30 seconds each), where 200 clips correspond to happy emotions and the remaining 200 clips correspond to sad emotions. For supervised classification purposes, we have used 4 existing deep Convolutional Neural Network (CNN) based architectures (resnet18, mobilenet v2.0, squeezenet v1.0 and vgg16) on the corresponding music spectrograms of the 2000 sub-clips (where every clip was segmented into 5 sub-clips of about 5 seconds each), which contain both time and frequency domain information. The initial results are quite inspiring, and we look forward to setting the baseline values for the dataset using these architectures. This type of CNN based classification algorithm using a rich corpus of Indian Classical Music is unique even in the global perspective and can be replicated in other modalities of music as well. This dataset is still under development and we plan to include more data containing other emotional features. We plan to make the dataset publicly available soon.
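The sub-clip count follows from simple arithmetic: 400 clips times 5 sub-clips each gives 2000. The sketch below shows one hypothetical way to cut five equal-length sub-clips from a 30-second clip. Five sub-clips of about 5 seconds do not fill the whole clip, so the even spacing used here, the function name `split_clip`, and the tiny sample rate are assumptions for illustration, not the authors' procedure.

```python
def split_clip(samples, sample_rate, n_sub=5, sub_seconds=5.0):
    """Cut n_sub equal-length, evenly spaced, non-overlapping sub-clips from a clip."""
    sub_len = int(round(sub_seconds * sample_rate))
    if n_sub < 1 or n_sub * sub_len > len(samples):
        raise ValueError("clip is too short for the requested sub-clips")
    # Spread the start points so the last sub-clip ends at (or just before) the clip end.
    stride = (len(samples) - sub_len) // (n_sub - 1) if n_sub > 1 else 0
    return [samples[i * stride:i * stride + sub_len] for i in range(n_sub)]

sample_rate = 100                        # deliberately tiny so the example runs instantly
clip = list(range(30 * sample_rate))     # stand-in for one 30-second clip
sub_clips = split_clip(clip, sample_rate)

n_clips = 400                            # clip count stated in the abstract
total_sub_clips = n_clips * len(sub_clips)
```

Each sub-clip would then be converted to a spectrogram image before being passed to a CNN; that step is omitted here because it depends on the audio library used.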

preprint, 2020, arXiv

Acoustical classification of different speech acts using nonlinear methods

A recitation is a way of combining words together so that they have a sense of rhythm, and thus an emotional content is imbibed within. In this study we set out to answer these questions in a scientific manner, taking into consideration 5 (five) well-known Bengali recitations by different poets conveying a variety of moods ranging from joy to sorrow. The clips were recited as well as read (in the form of flat speech without any rhythm) by the same person to avoid any perceptual difference arising out of timbre variation. Next, the emotional content of the 5 recitations was standardized with the help of a listening test conducted on a pool of 50 participants. The recitations as well as the speech were analyzed with the help of a recent nonlinear technique called Detrended Fluctuation Analysis (DFA), which gives a scaling exponent α that is essentially a measure of the long-range correlations present in the signal. Similar pieces (the parts which have exactly the same lyrical content in speech and in the recital) were extracted from the complete signal and analyzed with the help of the DFA technique. Our analysis shows that the scaling exponents for all parts of the recitations were in general much higher than those of their counterparts in speech. We have also established a critical value from our analysis, above which mere speech may become a recitation. The case may be similar to a conventional phase transition, wherein the value of the external condition (generally temperature) at which the transformation occurs is called the transition point. Further, we have also categorized the 5 recitations on the basis of their emotional content with the help of the same DFA technique. Analysis with a greater variety of recitations is being carried out to yield more interesting results.
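For readers unfamiliar with DFA, the following minimal sketch estimates the scaling exponent α of a one-dimensional signal. It assumes linear detrending, non-overlapping windows, and a hypothetical set of window sizes; the paper's actual settings are not given in the abstract. An α near 0.5 indicates no long-range correlation, and larger values indicate stronger persistence.

```python
import math
import random

def fit_slope(xs, ys):
    """Least-squares slope of ys against xs."""
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx

def dfa_exponent(signal, scales=(16, 32, 64, 128, 256)):
    """Estimate the DFA scaling exponent alpha with linear detrending."""
    # Step 1: integrate the mean-removed signal to obtain the profile.
    mean = sum(signal) / len(signal)
    profile, total = [], 0.0
    for value in signal:
        total += value - mean
        profile.append(total)

    log_scale, log_fluct = [], []
    for s in scales:
        xs = list(range(s))
        squared = []
        # Step 2: remove a straight-line trend in each window of length s.
        for k in range(len(profile) // s):
            window = profile[k * s:(k + 1) * s]
            slope = fit_slope(xs, window)
            intercept = sum(window) / s - slope * (s - 1) / 2
            squared.append(sum((y - (intercept + slope * x)) ** 2
                               for x, y in zip(xs, window)) / s)
        # Step 3: root-mean-square fluctuation F(s), stored on a log scale.
        log_scale.append(math.log(s))
        log_fluct.append(0.5 * math.log(sum(squared) / len(squared)))
    # Step 4: alpha is the slope of log F(s) against log s.
    return fit_slope(log_scale, log_fluct)

# Synthetic check signals (not speech data): white noise and its running sum.
rng = random.Random(0)
white_noise = [rng.gauss(0.0, 1.0) for _ in range(4096)]
random_walk, level = [], 0.0
for step in white_noise:
    level += step
    random_walk.append(level)

alpha_noise = dfa_exponent(white_noise)   # expected to be close to 0.5
alpha_walk = dfa_exponent(random_walk)    # expected to be close to 1.5
```

Comparing α for a recited passage with α for the same passage read as flat speech would follow the same pattern, with the audio samples replacing the synthetic series.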

preprint, 2020, arXiv

Speaker Recognition in Bengali Language from Nonlinear Features

At present, automatic speaker recognition is a very important issue due to its diverse applications. Hence, it becomes absolutely necessary to obtain models that take into consideration the speaking style of a person, vocal tract information, timbral qualities of the voice and other congenital information regarding the speaker's voice. Studies of Bengali speech recognition and speaker identification are scarce in the literature. Hence the need arises for involving Bengali subjects in modelling our speaker identification engine. In this work, we have extracted some acoustic features of speech using nonlinear multifractal analysis. Multifractal Detrended Fluctuation Analysis (MFDFA) essentially reveals the complexity associated with the speech signals taken. The source characteristics have been quantified with the help of different techniques such as the correlation matrix and the skewness of the MFDFA spectrum. The results obtained from this study give a good recognition rate for Bengali speakers.

preprint, 2016, arXiv

A Non Linear Approach towards Automated Emotion Analysis in Hindustani Music

In North Indian Classical Music, the raga forms the basic structure over which individual improvisations are performed by an artist based on his/her creativity. The Alap is the opening section of a typical Hindustani Music (HM) performance, where the raga is introduced and the paths of its development are revealed using all the notes of that particular raga and the allowed transitions between them, with proper distribution over time. In India, several emotional flavors are listed for each raga, namely erotic love, pathetic, devotional, comic, horrific, repugnant, heroic, fantastic, furious and peaceful. The detection of emotional cues from Hindustani Classical music is a demanding task due to the inherent ambiguity present in the different ragas, which makes it difficult to identify any particular emotion from a certain raga. In this study we took the help of a high-resolution mathematical microscope (MFDFA, or Multifractal Detrended Fluctuation Analysis) to procure information about the inherent complexities and time series fluctuations that constitute an acoustic signal. With the help of this technique, the 3 min alap portions of six conventional ragas of Hindustani classical music, namely Darbari Kanada, Yaman, Mian ki Malhar, Durga, Jay Jayanti and Hamsadhwani, played on three different musical instruments, were analyzed. The results are discussed in detail.

preprint, 2016, arXiv

A Non Linear Multifractal Study to Illustrate the Evolution of Tagore Songs Over a Century

The works of Rabindranath Tagore have been sung by various artistes over generations spanning almost 100 years. There are a few songs which were popular in the early years and have been able to retain their popularity over the years, while some others have faded away. In this study we look for cues in the singing style of these songs which have kept them alive for all these years. For this we took 3 min clips of four Tagore songs which have been sung by five generations of artistes over 100 years and analyzed them with the help of a recent nonlinear technique, Multifractal Detrended Fluctuation Analysis (MFDFA). The multifractal spectral width is a manifestation of the inherent complexity of the signal and may prove to be an important parameter for identifying the singing style of a particular generation of singers and how this style varies over different generations. The results are discussed in detail.

preprint, 2016, arXiv

Categorization of Stringed Instruments with Multifractal Detrended Fluctuation Analysis

Categorization is crucial for content description in the archiving of music signals. On many occasions, the human brain fails to classify instruments properly just by listening to their sounds, which is evident from the human response data collected during our experiment. Some previous attempts to categorize several musical instruments using various linear analysis methods required a number of parameters to be determined. In this work, we attempted to categorize a number of string instruments according to their mode of playing using state-of-the-art robust nonlinear methods. For this, 30 second sound signals of 26 different string instruments from all over the world were analyzed with the help of the nonlinear multifractal analysis (MFDFA) technique. The spectral width obtained from the MFDFA method gives an estimate of the complexity of the signal. From the variation of spectral width, we observed distinct clustering among the string instruments according to their mode of playing. There is also an indication that similarity in the structural configuration of the instruments plays a major role in the clustering of their spectral widths. The observations and implications are discussed in detail.

preprint, 2016, arXiv

Categorization of Tablas by Wavelet Analysis

Tabla, a percussion instrument, is used mainly to accompany vocalists, instrumentalists and dancers in every style of music in India, from classical to light, chiefly for keeping rhythm. This percussion instrument consists of two drums played by two hands; the drums are structurally different and produce different harmonic sounds. Earlier work labeled tabla strokes from real-time performances by testing neural networks and tree-based classification methods. The current work extends previous work by C. V. Raman and S. Kumar in 1920 on spectrum modeling of tabla strokes. In this paper we have studied the spectral characteristics of nine strokes from each of five tablas using the wavelet transform (wavelet analysis by the sub-band coding method and using the Torrence wavelet tool). Wavelet analysis is now a common tool for analyzing localized variations of power within a time series and for finding the frequency distribution in time-frequency space. Statistically, we look into the patterns depicted by the harmonics of different sub-bands and the tablas. The distribution of dominant frequencies in different sub-bands of the stroke signals, the distribution of power, and the behavior of harmonics are the important features that lead to the categorization of tablas.
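The idea of sub-band coding can be sketched with the simplest wavelet filter pair. The example below repeatedly splits a signal into a low-pass (approximation) half and a high-pass (detail) half using the Haar wavelet and reports the energy in each sub-band. The Haar filter, the function name `haar_subband_energies`, and the synthetic decaying tone standing in for a tabla stroke are all assumptions made for illustration; the paper does not state which filters were used.

```python
import math

def haar_subband_energies(signal, levels):
    """Energy of each detail sub-band (highest band first), then the final approximation."""
    if levels < 1 or len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    approx = list(signal)
    energies = []
    for _ in range(levels):
        pairs = list(zip(approx[0::2], approx[1::2]))
        # High-pass branch: scaled differences of neighbouring samples.
        detail = [(a - b) / math.sqrt(2) for a, b in pairs]
        # Low-pass branch: scaled sums, passed on to the next level.
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
        energies.append(sum(d * d for d in detail))
    energies.append(sum(a * a for a in approx))
    return energies

# Hypothetical stand-in for a stroke: an exponentially decaying sinusoid.
stroke = [math.exp(-n / 200.0) * math.sin(2 * math.pi * 0.05 * n) for n in range(1024)]
energies = haar_subband_energies(stroke, levels=4)
total_energy = sum(x * x for x in stroke)
```

Because the Haar transform is orthonormal, the sub-band energies add up to the energy of the original signal, so their relative sizes show how power is distributed across frequency bands.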

preprint, 2015, arXiv

Harmonic and Timbre Analysis of Tabla Strokes

The Indian twin drums, bayan and dayan (together the tabla), are the most important percussion instruments in India, popularly used for keeping rhythm. The tabla is a twin percussion/drum instrument of which the right-hand drum is called dayan and the left-hand drum is called bayan. Tabla strokes, commonly called 'bol', constitute a series of syllables. In this study we have examined the timbre characteristics of nine strokes from each of five different tablas. Timbre parameters were calculated from the long-term average spectrum (LTAS) of each stroke signal. The study of timbre characteristics is one of the most important deterministic approaches for analyzing the tabla and its stroke characteristics. Statistical correlations among timbre parameters were measured, and factor analysis was used to identify which timbre parameters are closely related. Tabla strokes have unique harmonic and timbral characteristics in the mid frequency range and have no uniqueness in the low frequency ranges.
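A long-term average spectrum is simply the power spectrum averaged over many short frames of a signal. The sketch below computes one with a naive discrete Fourier transform and derives a single timbre parameter from it. The spectral centroid is used only as an assumed example, since the abstract does not list the parameters; the frame length, hop size, Hann window and function names are likewise hypothetical.

```python
import cmath
import math

def ltas(signal, frame_len, hop, sample_rate):
    """Average power spectrum over Hann-windowed frames, using a naive DFT."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len) for n in range(frame_len)]
    n_bins = frame_len // 2 + 1
    power = [0.0] * n_bins
    n_frames = 0
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        for k in range(n_bins):
            coeff = sum(x * cmath.exp(-2j * math.pi * k * n / frame_len)
                        for n, x in enumerate(frame))
            power[k] += abs(coeff) ** 2
        n_frames += 1
    if n_frames == 0:
        raise ValueError("signal is shorter than one frame")
    freqs = [k * sample_rate / frame_len for k in range(n_bins)]
    return freqs, [p / n_frames for p in power]

def spectral_centroid(freqs, spectrum):
    """Power-weighted mean frequency of a spectrum."""
    return sum(f * p for f, p in zip(freqs, spectrum)) / sum(spectrum)

# Synthetic 8 Hz tone sampled at 64 Hz, standing in for a stroke recording.
sample_rate = 64
tone = [math.sin(2 * math.pi * 8 * n / sample_rate) for n in range(256)]
freqs, spectrum = ltas(tone, frame_len=64, hop=32, sample_rate=sample_rate)
peak_freq = freqs[spectrum.index(max(spectrum))]
centroid = spectral_centroid(freqs, spectrum)
```

For real recordings a fast Fourier transform from a numerical library would replace the naive DFT, which is kept here only so that the example has no dependencies.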

