SPeech ACoustic (SPAC): A novel tool for speech feature extraction and classification     
Authors (2)
Prof. Dr. Turgut ÖZSEVEN Tokat Gaziosmanpaşa Üniversitesi, Türkiye
Muharrem Düğenci Karabük Üniversitesi, Türkiye
Article Type Original Article
Article Subtype Full article published in SSCI, AHCI, SCI, SCI-Exp journals
Journal Name Applied Acoustics
Journal ISSN 0003-682X WoS Journal Scopus Journal
Journal Indexed In SCI-Expanded
Journal Group Q4
Article Language English
Publication Date 02-2018
Volume 136
Issue 1
Pages 1 / 8
DOI 10.1016/j.apacoust.2018.02.009
Article Link http://linkinghub.elsevier.com/retrieve/pii/S0003682X18300070
Abstract
Background and objective: Acoustic analysis, an objective evaluation method, is used to determine the descriptive attributes of voices. Although many tools for acoustic analysis are available in the literature, they differ in ease of use, visual interface, and acoustic parameter library. In this work, we developed a new toolbox named SPAC for extracting and simulating attributes from speech files. Methods: SPAC has a modular structure and a user-friendly interface, which make up for the shortcomings of existing tools. In addition, its modules can be used independently of each other. With SPAC, about 723 attributes in 9 categories can be extracted from each voice file. A validation test was applied to verify the attributes derived by the toolbox. In this test, the attributes obtained with Praat and OpenSMILE were treated as the reference, the attributes obtained with SPAC as the test data, and the overall differences between the two were evaluated with mean square error and mean percentage error. As a second verification method, classification performance was tested using the SPAC-derived attributes. Results: According to the validation test, SPAC attributes differ from those of the other toolboxes by 0.2% to 9.7%. According to the classification test, the SPAC attribute sets can identify each class, and classification success differs by 1% to 3% from that obtained with attributes from the other toolboxes. As a result, the attributes obtained with SPAC accurately describe the voice data. Conclusions: SPAC's advantages over existing toolboxes are that it has an easy-to-use interface, is modular, allows graphical representation of results, includes a classification module, and can work with SPAC data or with data obtained from other toolboxes. In addition, operations performed with other tools can be performed more easily with SPAC.
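The validation step described in the abstract compares reference attributes with SPAC attributes using mean square error and mean percentage error. The abstract does not give the exact formulas, so the following is only a minimal sketch under common textbook definitions; the function names and the sample values are hypothetical and are not taken from the paper or from SPAC.

```python
from typing import Sequence


def _check(reference: Sequence[float], test: Sequence[float]) -> None:
    # Both inputs must be paired, non-empty value lists.
    if len(reference) != len(test) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")


def mean_square_error(reference: Sequence[float], test: Sequence[float]) -> float:
    """Average of the squared differences between paired values."""
    _check(reference, test)
    return sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)


def mean_percentage_error(reference: Sequence[float], test: Sequence[float]) -> float:
    """Average signed difference, in percent, relative to the reference value.

    Assumption: the error is normalised by the reference value, which
    therefore must be non-zero.
    """
    _check(reference, test)
    if any(r == 0 for r in reference):
        raise ValueError("reference values must be non-zero")
    return 100.0 * sum((r - t) / r for r, t in zip(reference, test)) / len(reference)


if __name__ == "__main__":
    # Hypothetical values of one attribute for three files, illustrative only.
    reference_values = [100.0, 200.0, 50.0]   # e.g. from a reference toolbox
    test_values = [98.0, 202.0, 50.0]         # e.g. from the toolbox under test
    print("MSE:", mean_square_error(reference_values, test_values))
    print("MPE (%):", mean_percentage_error(reference_values, test_values))
```

Because the percentage error here is signed, positive and negative deviations partly cancel; an absolute variant would report the magnitude of disagreement instead. Which variant the authors used is not stated in the abstract.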
Keywords
Speech classification | Speech feature extraction | Speech processing toolbox | Speech toolbox