Search for: [Keywords = "speech signal"]

Search results

Number of results: 3

Speech and Music – Nonlinear Acoustical Decoding in Neurocognitive Scenario

Susmita Bhaduri, Dipak Ghosh

Archives of Acoustics | 2018 | vol. 43 | No 4 | 593–602 | DOI: 10.24425/aoa.2018.125153

Keywords: speech signal; multifractality; Visibility Graph; Fractal Darwinism; neurocognitive disorders

Abstract

Speech and music signals are multifractal phenomena. The time displacement profiles of speech and music signals show strikingly different scaling behaviour. However, a full complexity analysis of their frequency and amplitude has not been made so far. We propose a novel complex-network-based approach (Visibility Graph) to study the scaling behaviour of the frequency-wise amplitude variation of speech and music signals over time and then extract their PSVG (Power of Scale-freeness of Visibility Graph). From this analysis it emerges that the scaling behaviour of the amplitude profile of music varies considerably from frequency to frequency, whereas it is almost consistent for the speech signal. Our left auditory cortical areas are proposed to be neurocognitively specialised in speech perception and the right ones in music. Hence we can conclude that the human brain might have adapted to the distinctly different scaling behaviour of speech and music signals and developed different decoding mechanisms, as if following the so-called Fractal Darwinism. Using this method, we can capture all non-stationary aspects of the acoustic properties of the source signal to the deepest level, which has huge neurocognitive significance. Further, we propose a novel non-invasive application to detect neurological illness (here autism spectrum disorder, ASD), using the quantitative parameters deduced from the variation of scaling behaviour for speech and music.
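The Visibility Graph idea mentioned in this abstract can be illustrated with a minimal sketch. It assumes the standard natural visibility criterion (two samples are linked when the straight line between them passes above every sample in between) and estimates a power-law exponent of the degree distribution with a plain least-squares fit on log-log axes. The function names `visibility_degrees` and `power_law_exponent` are hypothetical, and the article's actual PSVG procedure (applied to frequency-wise amplitude profiles) may differ in its details.

```python
import math
from collections import Counter


def visibility_degrees(series):
    """Return the node degrees of the natural visibility graph of a series.

    Samples i and j are linked when every sample k between them lies
    strictly below the straight line joining (i, y_i) and (j, y_j).
    This is a simple O(n^2) sketch, not an optimised implementation.
    """
    n = len(series)
    degrees = [0] * n
    for i in range(n - 1):
        for j in range(i + 1, n):
            visible = all(
                series[k]
                < series[j] + (series[i] - series[j]) * (j - k) / (j - i)
                for k in range(i + 1, j)
            )
            if visible:
                degrees[i] += 1
                degrees[j] += 1
    return degrees


def power_law_exponent(degrees):
    """Estimate lambda in P(k) ~ k^(-lambda) by a least-squares log-log fit.

    Assumption: an ordinary linear fit over all observed degrees; a real
    analysis would usually restrict the fit range and check its quality.
    """
    counts = Counter(k for k in degrees if k > 0)
    if len(counts) < 2:
        raise ValueError("need at least two distinct degree values")
    total = sum(counts.values())
    xs = [math.log(k) for k in counts]
    ys = [math.log(c / total) for c in counts.values()]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return -slope
```

For example, `visibility_degrees([1, 3, 2, 4, 1])` returns `[1, 3, 2, 3, 1]`: the peaks at positions 1 and 3 see each other over the dip between them, while the first and last samples see only their neighbours.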

Authors and Affiliations

Susmita Bhaduri

Dipak Ghosh

Acoustic Methods in Identifying Symptoms of Emotional States

Zuzanna Piątek, Maciej Kłaczyński

Archives of Acoustics | 2021 | vol. 46 | No 2 | 259–269 | DOI: 10.24425/aoa.2021.136580

Keywords: emotion recognition; speech signal processing; clustering analysis; Sammon mapping

Abstract

The study investigates the use of the speech signal to recognise speakers' emotional states. The introduction includes the definition and categorization of emotions, including facial expressions, speech and physiological signals. For the purpose of this work, a proprietary resource of emotionally-marked speech recordings was created. The collected recordings come from the media, including live journalistic broadcasts, which show spontaneous emotional reactions to real-time stimuli. For the purpose of speech signal analysis, a dedicated script was written in Python. Its algorithm includes the parameterization of speech recordings and the determination of features correlated with emotional content in speech. After the parameterization process, data clustering was performed to allow the grouping of feature vectors for speakers into larger collections which imitate specific emotional states. Using Student's t-test for dependent samples, some descriptors were distinguished which showed significant differences in the values of features between emotional states. Some potential applications for this research were proposed, as well as other development directions for future studies of the topic.
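The t-test for dependent samples mentioned in this abstract can be sketched as follows. The function computes the paired t statistic, t = mean(d) / (s_d / sqrt(n)), where d holds the per-speaker differences of one feature between two emotional states. The function name `paired_t_statistic` and the sample values are hypothetical; the abstract does not state which features or significance level the authors' script used.

```python
import math
import statistics


def paired_t_statistic(state_a, state_b):
    """Return (t, degrees_of_freedom) for a paired (dependent-samples) t-test.

    state_a[i] and state_b[i] are values of the same feature for the same
    speaker in two emotional states (illustrative assumption).
    """
    if len(state_a) != len(state_b):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(state_a, state_b)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two pairs")
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = statistics.mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical feature values for four speakers in two emotional states.
t_value, dof = paired_t_statistic([1, 2, 3, 4], [2, 4, 5, 4])
```

The resulting statistic would then be compared against the critical value of Student's t distribution for `dof` degrees of freedom at the chosen significance level.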

Authors and Affiliations

Zuzanna Piątek

Maciej Kłaczyński

AGH University of Science and Technology, Faculty of Mechanical Engineering and Robotics, Department of Mechanics and Vibroacoustics, Cracow, Poland

Speech Emotion Recognition Based on Voice Fundamental Frequency

Teodora Dimitrova-Grekow, Aneta Klis, Magdalena Igras-Cybulska

Archives of Acoustics | 2019 | vol. 44 | No 2 | 277–286 | DOI: 10.24425/aoa.2019.128491

Keywords: emotion recognition; speech signal analysis; voice analysis; fundamental frequency; speech corpora

Abstract

The human voice is one of the basic means of communication, through which one can also easily convey an emotional state. This paper presents experiments on emotion recognition in human speech based on the fundamental frequency. The AGH Emotional Speech Corpus was used. This database consists of audio samples of seven emotions acted by 12 different speakers (6 female and 6 male). We explored phrases of all the emotions, both all together and in various combinations. Fast Fourier Transform and magnitude spectrum analysis were applied to extract the fundamental tone from the speech audio samples. After extracting several statistical features of the fundamental frequency, we studied whether they carry information on the emotional state of the speaker by applying different AI methods. The outcome data were analysed with the following classifiers from the WEKA collection of data mining algorithms: K-Nearest Neighbours with local induction, Random Forest, Bagging, JRip, and Random Subspace Method. The results show that the fundamental frequency is a promising choice for further experiments.
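The extraction step described in this abstract (a Fourier transform followed by magnitude spectrum analysis) can be illustrated with a minimal sketch. It assumes that the fundamental is taken as the strongest spectral peak inside a plausible voice range; the abstract does not give the authors' exact procedure, and the names `magnitude_spectrum` and `estimate_f0` as well as the 50 to 500 Hz search range are assumptions. A naive DFT is used instead of an FFT so that the example needs only the standard library.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Magnitude of the DFT for bins 0..n/2 (naive O(n^2) transform)."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def estimate_f0(frame, sample_rate, fmin=50.0, fmax=500.0):
    """Estimate the fundamental as the strongest peak between fmin and fmax.

    Assumption: the strongest in-range peak is the fundamental. In real
    speech a harmonic can be stronger, so this is only a rough sketch.
    """
    n = len(frame)
    spectrum = magnitude_spectrum(frame)
    lo = max(1, math.ceil(fmin * n / sample_rate))
    hi = min(len(spectrum) - 1, math.floor(fmax * n / sample_rate))
    if lo > hi:
        raise ValueError("frame too short for the requested frequency range")
    peak_bin = max(range(lo, hi + 1), key=lambda k: spectrum[k])
    return peak_bin * sample_rate / n


# Synthetic test tone: 200 Hz fundamental plus a weaker 400 Hz harmonic.
SAMPLE_RATE = 8000
frame = [
    math.sin(2 * math.pi * 200 * t / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 400 * t / SAMPLE_RATE)
    for t in range(400)
]
f0 = estimate_f0(frame, SAMPLE_RATE)
```

Statistical features such as the mean, range, or standard deviation of the per-frame estimates could then be passed to a classifier; which features the authors used is not stated in the abstract.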

Authors and Affiliations

Teodora Dimitrova-Grekow

Aneta Klis

Magdalena Igras-Cybulska
