Source author record

Kia Dashtipour

Kia Dashtipour appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

eess.AS Sound cs.CY Machine Learning Computation and Language Cryptography and Security eess.IV eess.SP physics.med-ph

Catalog footprint

What is connected

7works

9topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

A Novel Chaos-based Light-weight Image Encryption Scheme for Multi-modal Hearing Aids

Multimodal hearing aids (HAs) aim to deliver more intelligible audio in noisy environments by contextually sensing and processing data in the form of not only audio but also visual information (e.g. lip reading). Machine learning techniques can play a pivotal role for the contextually processing of multimodal data. However, since the computational power of HA devices is low, therefore this data must be processed either on the edge or cloud which, in turn, poses privacy concerns for sensitive user data. Existing literature proposes several techniques for data encryption but their computational complexity is a major bottleneck to meet strict latency requirements for development of future multi-modal hearing aids. To overcome this problem, this paper proposes a novel real-time audio/visual data encryption scheme based on chaos-based encryption using the Tangent-Delay Ellipse Reflecting Cavity-Map System (TD-ERCS) map and Non-linear Chaotic (NCA) Algorithm. The results achieved against different security parameters, including Correlation Coefficient, Unified Averaged Changed Intensity (UACI), Key Sensitivity Analysis, Number of Changing Pixel Rate (NPCR), Mean-Square Error (MSE), Peak Signal to Noise Ratio (PSNR), Entropy test, and Chi-test, indicate that the newly proposed scheme is more lightweight due to its lower execution time as compared to existing schemes and more secure due to increased key-space against modern brute-force attacks.

preprint2022arXiv

A Novel Speech Intelligibility Enhancement Model based on CanonicalCorrelation and Deep Learning

Current deep learning (DL) based approaches to speech intelligibility enhancement in noisy environments are often trained to minimise the feature distance between noise-free speech and enhanced speech signals. Despite improving the speech quality, such approaches do not deliver required levels of speech intelligibility in everyday noisy environments . Intelligibility-oriented (I-O) loss functions have recently been developed to train DL approaches for robust speech enhancement. Here, we formulate, for the first time, a novel canonical correlation based I-O loss function to more effectively train DL algorithms. Specifically, we present a canonical-correlation based short-time objective intelligibility (CC-STOI) cost function to train a fully convolutional neural network (FCN) model. We carry out comparative simulation experiments to show that our CC-STOI based speech enhancement framework outperforms state-of-the-art DL models trained with conventional distance-based and STOI-based loss functions, using objective and subjective evaluation measures for case of both unseen speakers and noises. Ongoing future work is evaluating the proposed approach for design of robust hearing-assistive technology.

preprint2022arXiv

A Novel Temporal Attentive-Pooling based Convolutional Recurrent Architecture for Acoustic Signal Enhancement

In acoustic signal processing, the target signals usually carry semantic information, which is encoded in a hierarchal structure of short and long-term contexts. However, the background noise distorts these structures in a nonuniform way. The existing deep acoustic signal enhancement (ASE) architectures ignore this kind of local and global effect. To address this problem, we propose to integrate a novel temporal attentive-pooling (TAP) mechanism into a conventional convolutional recurrent neural network, termed as TAP-CRNN. The proposed approach considers both global and local attention for ASE tasks. Specifically, we first utilize a convolutional layer to extract local information of the acoustic signals and then a recurrent neural network (RNN) architecture is used to characterize temporal contextual information. Second, we exploit a novelattention mechanism to contextually process salient regions of the noisy signals. The proposed ASE system is evaluated using a benchmark infant cry dataset and compared with several well-known methods. It is shown that the TAPCRNN can more effectively reduce noise components from infant cry signals in unseen background noises at challenging signal-to-noise levels.

preprint2022arXiv

A Speech Intelligibility Enhancement Model based on Canonical Correlation and Deep Learning for Hearing-Assistive Technologies

Current deep learning (DL) based approaches to speech intelligibility enhancement in noisy environments are generally trained to minimise the distance between clean and enhanced speech features. These often result in improved speech quality however they suffer from a lack of generalisation and may not deliver the required speech intelligibility in everyday noisy situations. In an attempt to address these challenges, researchers have explored intelligibility-oriented (I-O) loss functions to train DL approaches for robust speech enhancement (SE). In this paper, we formulate a novel canonical correlation-based I-O loss function to more effectively train DL algorithms. Specifically, we present a fully convolutional SE model that uses a modified canonical-correlation based short-time objective intelligibility (CC-STOI) metric as a training cost function. To the best of our knowledge, this is the first work that exploits the integration of canonical correlation in an I-O based loss function for SE. Comparative experimental results demonstrate that our proposed CC-STOI based SE framework outperforms DL models trained with conventional STOI and distance-based loss functions, in terms of both standard objective and subjective evaluation measures when dealing with unseen speakers and noises.

preprint2021arXiv

A Novel Context-Aware Multimodal Framework for Persian Sentiment Analysis

Most recent works on sentiment analysis have exploited the text modality. However, millions of hours of video recordings posted on social media platforms everyday hold vital unstructured information that can be exploited to more effectively gauge public perception. Multimodal sentiment analysis offers an innovative solution to computationally understand and harvest sentiments from videos by contextually exploiting audio, visual and textual cues. In this paper, we, firstly, present a first of its kind Persian multimodal dataset comprising more than 800 utterances, as a benchmark resource for researchers to evaluate multimodal sentiment analysis approaches in Persian language. Secondly, we present a novel context-aware multimodal sentiment analysis framework, that simultaneously exploits acoustic, visual and textual cues to more accurately determine the expressed sentiment. We employ both decision-level (late) and feature-level (early) fusion methods to integrate affective cross-modal information. Experimental results demonstrate that the contextual integration of multimodal features such as textual, acoustic and visual features deliver better performance (91.39%) compared to unimodal features (89.24%).

preprint2020arXiv

A Review on the State of the Art in Non Contact Sensing for COVID-19

COVID-19 disease, caused by SARS-CoV-2, has resulted in a global pandemic recently. With no approved vaccination or treatment, governments around the world have issued guidance to their citizens to remain at home in efforts to control the spread of the disease. The goal of controlling the spread of the virus is to prevent strain on hospital. In this paper, we have focus on how non-invasive methods are being used to detect the COVID-19 and assist healthcare workers in caring for COVID-19 patients. Early detection of the COVID-19 virus can allow for early isolation to prevent further spread. This study outlines the advantages and disadvantages and a breakdown of the methods applied in the current state-of-the-art approaches. In addition, the paper highlights some future research directions, which are required to be explored further to come up with innovative technologies to control this pandemic.

preprint2020arXiv

An Intelligent Non-Invasive Real Time Human Activity Recognition System for Next-Generation Healthcare

Human motion detection is getting considerable attention in the field of Artificial Intelligence (AI) driven healthcare systems. Human motion can be used to provide remote healthcare solutions for vulnerable people by identifying particular movements such as falls, gait and breathing disorders. This can allow people to live more independent lifestyles and still have the safety of being monitored if more direct care is needed. At present wearable devices can provide real time monitoring by deploying equipment on a person's body. However, putting devices on a person's body all the time make it uncomfortable and the elderly tends to forget it to wear as well in addition to the insecurity of being tracked all the time. This paper demonstrates how human motions can be detected in quasi-real-time scenario using a non-invasive method. Patterns in the wireless signals presents particular human body motions as each movement induces a unique change in the wireless medium. These changes can be used to identify particular body motions. This work produces a dataset that contains patterns of radio wave signals obtained using software defined radios (SDRs) to establish if a subject is standing up or sitting down as a test case. The dataset was used to create a machine learning model, which was used in a developed application to provide a quasi-real-time classification of standing or sitting state. The machine learning model was able to achieve 96.70 % accuracy using the Random Forest algorithm using 10 fold cross validation. A benchmark dataset of wearable devices was compared to the proposed dataset and results showed the proposed dataset to have similar accuracy of nearly 90 %. The machine learning models developed in this paper are tested for two activities but the developed system is designed and applicable for detecting and differentiating x number of activities.

Kia Dashtipour

What is connected

Connect this record

See the researcher in context

Building this map preview

7 published item(s)

A Novel Chaos-based Light-weight Image Encryption Scheme for Multi-modal Hearing Aids

A Novel Speech Intelligibility Enhancement Model based on CanonicalCorrelation and Deep Learning

A Novel Temporal Attentive-Pooling based Convolutional Recurrent Architecture for Acoustic Signal Enhancement

A Speech Intelligibility Enhancement Model based on Canonical Correlation and Deep Learning for Hearing-Assistive Technologies

A Novel Context-Aware Multimodal Framework for Persian Sentiment Analysis

A Review on the State of the Art in Non Contact Sensing for COVID-19

An Intelligent Non-Invasive Real Time Human Activity Recognition System for Next-Generation Healthcare