Can listeners hear how many singers are singing? The effect of listener's experience, vibrato, onset, and formant frequency on the perception of number of simultaneous singers

Mary Erickson, Christopher S. Gaskill

Research output: Contribution to journal › Article

1 Citation (Scopus)

Abstract

Objective/Hypothesis: This study investigated whether listener's experience, presence/absence of vibrato, formant frequency difference, or onset delay affect the ability of experienced and inexperienced listeners to segregate complex vocal stimuli.
Study Design: Repeated measures factorial design.
Methods: Two sets of stimuli were constructed: one with no vibrato and another with vibrato. For each set, each stimulus was synthesized at four pitches: A3, E4, B4, and F5. Stimuli were synthesized using formant patterns appropriate for the vowel ||. Frequencies for formants one through four were systematically varied from lower to higher in an attempt to simulate the acoustic results of corresponding changes in vocal tract length. Four formant patterns were synthesized (patterns A-D). Three pairs were created at each pitch, pairing the formants AB (mezzo-soprano/mezzo-soprano), CD (soprano/soprano), and AD (mezzo-soprano/soprano). Each of these three pairs was constructed in three separate conditions: simultaneous onset; the first voice in the pair with an onset delay of 100 milliseconds; and the second voice in the pair with an onset delay of 100 milliseconds. Using a scroll bar, listeners rated how difficult it was for them to hear each stimulus pair as two separate voices.
Results: The most difficult combinations to segregate were produced with no vibrato and used simultaneous onset. The easiest conditions to segregate were combinations including a "soprano-like" formant pattern (D) in the vibrato condition. Overall, listener's experience did not affect the perceived difficulty of segregation; however, in the presence of vibrato cues, inexperienced listeners did not use delay cues as an aid in segregation in the same manner as did experienced listeners. Once vibrato was removed from the experimental context, inexperienced listeners were able to use delay to aid in segregation in a similar manner to experienced listeners.
Conclusion: Presence/absence of vibrato, formant pattern difference, and onset delay interact in a complex manner to affect the perceived difficulty of voice segregation.

Original language: English (US)
Pages (from-to): 817.e1-817.e13
Journal: Journal of Voice
Volume: 26
Issue number: 6
DOI: 10.1016/j.jvoice.2012.04.011
State: Published - Jan 1 2012

Fingerprint

Singing
Cues
Aptitude
Acoustics

All Science Journal Classification (ASJC) codes

  • Otorhinolaryngology
  • Speech and Hearing
  • LPN and LVN

Cite this

@article{145608beeda5493fb18ff1e99af6d43b,
title = "Can listeners hear how many singers are singing? The effect of listener's experience, vibrato, onset, and formant frequency on the perception of number of simultaneous singers",
abstract = "Objective/Hypothesis: This study investigated whether listener's experience, presence/absence of vibrato, formant frequency difference, or onset delay affect the ability of experienced and inexperienced listeners to segregate complex vocal stimuli. Study Design: Repeated measures factorial design. Methods: Two sets of stimuli were constructed: one with no vibrato and another with vibrato. For each set, each stimulus was synthesized at four pitches: A3, E4, B4, and F5. Stimuli were synthesized using formant patterns appropriate for the vowel ||. Frequencies for formants one through four were systematically varied from lower to higher in an attempt to simulate the acoustic results of corresponding changes in vocal tract length. Four formant patterns were synthesized (patterns A-D). Three pairs were created at each pitch, pairing the formants AB (mezzo-soprano/mezzo-soprano), CD (soprano/soprano), and AD (mezzo-soprano/soprano). Each of these three pairs was constructed in three separate conditions: simultaneous onset; the first voice in the pair with an onset delay of 100 milliseconds; and the second voice in the pair with an onset delay of 100 milliseconds. Using a scroll bar, listeners rated how difficult it was for them to hear each stimulus pair as two separate voices. Results: The most difficult combinations to segregate were produced with no vibrato and used simultaneous onset. The easiest conditions to segregate were combinations including a {"}soprano-like{"} formant pattern (D) in the vibrato condition. Overall, listener's experience did not affect the perceived difficulty of segregation; however, in the presence of vibrato cues, inexperienced listeners did not use delay cues as an aid in segregation in the same manner as did experienced listeners. Once vibrato was removed from the experimental context, inexperienced listeners were able to use delay to aid in segregation in a similar manner to experienced listeners. 
Conclusion: Presence/absence of vibrato, formant pattern difference, and onset delay interact in a complex manner to affect the perceived difficulty of voice segregation.",
author = "Erickson, Mary and Gaskill, {Christopher S.}",
year = "2012",
month = "1",
day = "1",
doi = "10.1016/j.jvoice.2012.04.011",
language = "English (US)",
volume = "26",
pages = "817.e1--817.e13",
journal = "Journal of Voice",
issn = "0892-1997",
publisher = "Mosby Inc.",
number = "6",
}

TY - JOUR

T1 - Can listeners hear how many singers are singing? The effect of listener's experience, vibrato, onset, and formant frequency on the perception of number of simultaneous singers

AU - Erickson, Mary

AU - Gaskill, Christopher S.

PY - 2012/1/1

Y1 - 2012/1/1

AB - Objective/Hypothesis: This study investigated whether listener's experience, presence/absence of vibrato, formant frequency difference, or onset delay affect the ability of experienced and inexperienced listeners to segregate complex vocal stimuli. Study Design: Repeated measures factorial design. Methods: Two sets of stimuli were constructed: one with no vibrato and another with vibrato. For each set, each stimulus was synthesized at four pitches: A3, E4, B4, and F5. Stimuli were synthesized using formant patterns appropriate for the vowel ||. Frequencies for formants one through four were systematically varied from lower to higher in an attempt to simulate the acoustic results of corresponding changes in vocal tract length. Four formant patterns were synthesized (patterns A-D). Three pairs were created at each pitch, pairing the formants AB (mezzo-soprano/mezzo-soprano), CD (soprano/soprano), and AD (mezzo-soprano/soprano). Each of these three pairs was constructed in three separate conditions: simultaneous onset; the first voice in the pair with an onset delay of 100 milliseconds; and the second voice in the pair with an onset delay of 100 milliseconds. Using a scroll bar, listeners rated how difficult it was for them to hear each stimulus pair as two separate voices. Results: The most difficult combinations to segregate were produced with no vibrato and used simultaneous onset. The easiest conditions to segregate were combinations including a "soprano-like" formant pattern (D) in the vibrato condition. Overall, listener's experience did not affect the perceived difficulty of segregation; however, in the presence of vibrato cues, inexperienced listeners did not use delay cues as an aid in segregation in the same manner as did experienced listeners. Once vibrato was removed from the experimental context, inexperienced listeners were able to use delay to aid in segregation in a similar manner to experienced listeners. 
Conclusion: Presence/absence of vibrato, formant pattern difference, and onset delay interact in a complex manner to affect the perceived difficulty of voice segregation.

UR - http://www.scopus.com/inward/record.url?scp=84870067189&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84870067189&partnerID=8YFLogxK

U2 - 10.1016/j.jvoice.2012.04.011

DO - 10.1016/j.jvoice.2012.04.011

M3 - Article

VL - 26

SP - 817.e1

EP - 817.e13

JO - Journal of Voice

JF - Journal of Voice

SN - 0892-1997

IS - 6

ER -