Perceptual Error Analysis of Human and Synthesized Voices

Author Englert, Marina (UNIFESP)
Madazio, Glaucya
Gielow, Ingrid
Lucero, Jorge
Behlau, Mara (UNIFESP)
Abstract Objective/Hypothesis. To assess the quality of synthesized voices through listeners' skills in discriminating human and synthesized voices. Study Design. Prospective study. Methods. Eighteen human voices with different types and degrees of deviation (roughness, breathiness, and strain, with three degrees of deviation: mild, moderate, and severe) were selected by three voice specialists. Synthesized samples with the same deviations as the human voices were produced by the VoiceSim system. The manipulated parameters were vocal frequency perturbation (roughness), additive noise (breathiness), and increased tension and subglottal pressure together with decreased vocal fold separation (strain). Two hundred sixty-nine listeners were divided into three groups: voice specialist speech-language pathologists (V-SLPs), general clinician SLPs (G-SLPs), and naive listeners (NLs). The SLP listeners also indicated the type and degree of deviation. Results. The listeners misclassified 39.3% of the voices, both synthesized (42.3%) and human (36.4%) samples (P = 0.001). V-SLPs presented the lowest error rate in identifying voice nature (34.6%); G-SLPs and NLs identified almost half of the synthesized samples as human (46.9% and 45.6%, respectively). The male voices were more susceptible to misidentification. The synthesized breathy samples generated greater perceptual confusion. The samples with severe deviation seemed to be more susceptible to errors. The synthesized female deviations were correctly classified. The male breathiness and strain were identified as roughness. Conclusion. VoiceSim produced stimuli very similar to the voices of patients with dysphonia. V-SLPs had a better ability to classify human and synthesized voices. VoiceSim is better at simulating vocal breathiness and female deviations; the male samples need adjustment.
Keywords Voice
Voice disorders
Auditory perception
Speech acoustics
Coverage New York
Language English
Date 2017
Published in Journal Of Voice. New York, v. 31, n. 4, p. -, 2017.
ISSN 0892-1997
Publisher Mosby-Elsevier
Extent -
Type Article
Web of Science ID WOS:000406147000054
