Abstract
A small-scale database of Pashtu speakers, covering multiple accents and dialects, has been developed for use in Pashtu Speaker Identification Systems (SIS) and in accent and dialect identification systems. Pashtu is a major spoken language of Pakistan and Afghanistan, and it has gained worldwide prominence because of its regional importance. The regions of Pakistan and Afghanistan where Pashtu is spoken are mostly occupied by extremists who communicate in Pashtu. To support the design of Pashtu voice-based systems for security and other applications, voice data was collected from 32 native Pashtu speakers from different regions of Pakistan and Afghanistan. Using a subset of this data, a Multi-Layer Perceptron (MLP)-based SIS was designed. The system achieved an overall identification accuracy of 87.5% and outperformed recently proposed i-vector and GMM-based accent identification systems, with relative improvements of 3.8% and 12.0%, respectively.