Abstract
Machine learning techniques are now among the most effective approaches for Voice and Emotion Recognition (VER), and automatic recognition of voice and emotions is essential for smooth psychosocial interaction between humans and machines. VER research has made great strides by combining spectrogram features with deep learning. However, although single Machine Learning (ML) methods deliver acceptable results, they do not yet meet the required standards. This motivates strategies that combine multiple ML techniques and target different aspects of voice recognition. This article proposes an ensemble classifier model that combines the outputs of two base classifiers, a Capsule Network (CapsNet) and a Recurrent Neural Network (RNN), for VER. The CapsNet model captures the spatial correlations of vital speech information in spectrograms through dynamic routing rather than pooling, while the RNN excels at processing time-series data; both are well known for their classification performance. Stacked generalization is used to construct the ensemble classifier that integrates the predictions of the CapsNet and RNN classifiers. This ensemble approach achieves an overall accuracy of 96.05%, outperforming either the CapsNet or the RNN individually. A notable benefit of the proposed classifier is its effective detection of the emotional class 'FEAR', with a recognition rate of 96.68% among the eight emotion classes.
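The stacked-generalization scheme described above can be sketched as follows. This is an illustrative sketch only, not the paper's pipeline: two off-the-shelf scikit-learn classifiers stand in for the CapsNet and RNN base learners, the synthetic eight-class dataset stands in for spectrogram features, and a logistic-regression meta-learner combines the base learners' out-of-fold class probabilities.

```python
# Sketch of stacked generalization with two base classifiers and a meta-learner.
# The MLP and random forest are hypothetical stand-ins for CapsNet and RNN.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic 8-class data standing in for spectrogram-derived features.
X, y = make_classification(n_samples=600, n_features=40, n_informative=20,
                           n_classes=8, n_clusters_per_class=1, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Base learners (stand-ins for the CapsNet and RNN classifiers).
base_learners = [
    ("caps_stub", MLPClassifier(hidden_layer_sizes=(64,), max_iter=500,
                                random_state=0)),
    ("rnn_stub", RandomForestClassifier(n_estimators=100, random_state=0)),
]

# Meta-learner trained on the base learners' cross-validated probabilities.
stack = StackingClassifier(estimators=base_learners,
                           final_estimator=LogisticRegression(max_iter=1000),
                           stack_method="predict_proba", cv=5)
stack.fit(X_tr, y_tr)
acc = stack.score(X_te, y_te)
print(f"stacked ensemble accuracy: {acc:.3f}")
```

The key design choice is that the meta-learner sees only out-of-fold predictions from the base models (controlled by `cv`), which limits the leakage that would occur if base models predicted on their own training data.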