Abstract
• A new rhythm metric, Optimized Pairwise Variability Index (O-PVI), is proposed.
• The O-PVI provides a generalization of conventional PVI rhythm metrics.
• Particle Swarm Optimization (PSO) is used to select the best O-PVI parameters.
• The combined PSO/O-PVI approach achieves best classification of Arabic native/non-native speakers.
• Experiments compare interval- and PVI-based rhythm metrics.
This paper presents a technique that applies the pairwise variability index (PVI), a rhythm metric that quantifies variability in speech rhythm, to the classification of speech varieties. The technique combines the Particle Swarm Optimization (PSO) algorithm with a generalization of several rhythm metrics that are based on the PVI. The performance of this optimization-oriented classification is compared with classification that uses conventional (both PVI-based and interval-based) rhythm metrics. The technique is applied to the classification of native and non-native Arabic speech using data from the West Point Arabic Speech Corpus; experiments are based on segmental durations and use Support Vector Machine (SVM) classification. Results show that the optimization-oriented classification provides better discrimination between native and non-native speech varieties than classification based on the conventional rhythm metrics. When added to different combinations of these conventional metrics, the optimization-oriented procedure consistently improves classification rates.
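For orientation, the conventional PVI metrics that the proposed approach generalizes can be sketched as below. This is a minimal illustration of the commonly used raw and normalized PVI definitions computed over successive segmental durations; it is not the O-PVI itself, whose parameterization is not given in this abstract. The function names `raw_pvi` and `normalized_pvi` and the example durations are assumptions introduced here for illustration.

```python
from typing import Sequence


def raw_pvi(durations: Sequence[float]) -> float:
    """Raw PVI: mean absolute difference between successive durations."""
    if len(durations) < 2:
        raise ValueError("at least two intervals are required")
    diffs = [abs(a - b) for a, b in zip(durations, durations[1:])]
    return sum(diffs) / len(diffs)


def normalized_pvi(durations: Sequence[float]) -> float:
    """Normalized PVI: each successive difference is divided by the mean
    of the pair, and the average is scaled by 100.

    Assumes strictly positive durations (hypothetical input constraint),
    so the pair mean in the denominator is never zero.
    """
    if len(durations) < 2:
        raise ValueError("at least two intervals are required")
    terms = [
        abs(a - b) / ((a + b) / 2.0)
        for a, b in zip(durations, durations[1:])
    ]
    return 100.0 * sum(terms) / len(terms)


if __name__ == "__main__":
    # Hypothetical vocalic interval durations in milliseconds.
    example = [100.0, 200.0, 100.0]
    print(raw_pvi(example), normalized_pvi(example))
```

Normalizing each difference by the local pair mean reduces sensitivity to speaking rate, which is one reason the normalized variant is often preferred for vocalic intervals.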