Reading List & References 2002
S702: Acoustic Research
in Speech, Language and Hearing Sciences
April 3, 2002
1. Introduction and Review: Acoustics and Phonetics
Monsen, R.B. and Engebretson, A.M. (1983). The accuracy of formant frequency measurements: A comparison of spectrographic analysis and Linear Prediction. JSHR 26, 89-97.
Peterson, G. (1952). Parameter relationships in the portrayal of signals with sound spectrograph techniques. JSHD, 17, 427-432.
2. Acoustic Properties of Vowels
Ferguson, S. H. and Kewley-Port, D. (2002) Vowel intelligibility in clear and conversational speech for normal-hearing and hearing-impaired listeners. Submitted.
Hillenbrand, J., Getty, L.J., Clark, M.J., & Wheeler, K. (1995). Acoustic characteristics of American English vowels. J. Acoust. Soc. Am., 97, 3099-3111.
Hillenbrand, J., Clark, M.J., & Nearey, T. (2001). Effects of consonant environments on vowel formant patterns. J. Acoust. Soc. Am., 109, 748-763.
Jenkins, J.J., Strange, W., & Miranda, S. (1994). Vowel identification in mixed-speaker silent-center syllables. J. Acoust. Soc. Am., 95, 1030-1043.
Jenkins, J.J., Strange, W., & Trent, S. (1999). Context-independent dynamic information for the perception of coarticulated vowels. J. Acoust. Soc. Am., 106, 438-448.
Kewley-Port, D. & Zheng, Y. (1999). Vowel formant discrimination: Towards more ordinary listening conditions. J. Acoust. Soc. Am. 106, 2945-2958.
Miller, J.D. (1989). Auditory-perceptual interpretation of the vowel. J. Acoust. Soc. Am., 85, 2114-2134.
Nearey, T. and Assmann, P. (1986). "Modeling the role of inherent spectral change in vowel identification," Journal of the Acoustical Society of America, 80, 1297-1308.
Peterson, G.E., & Barney, H.L. (1952). Control methods used in a study of the vowels. J. Acoust. Soc. Am., 24, 175-184.
Strange, W. and Bohn, O. (1998). Dynamic specification of coarticulated German vowels: Perception and acoustical studies. J. Acoust. Soc. Am., 104, 488-504.
and Stop Consonants
Kewley-Port, D. (1982). "Measurement of formant transitions in naturally produced stop consonant-vowel syllables." J. Acoust. Soc. Am., 72, 379-389.
Kewley-Port, D. (1983). Time-varying features as correlates of place of articulation in stop consonants. J. Acoust. Soc. Am., 73, 322-335.
Kewley-Port, D., D.B. Pisoni and M. Studdert-Kennedy (1983). "Perception of static and dynamic acoustic cues to place of articulation in initial stop consonants." J. Acoust. Soc. Am. 73, 1779-1793.
Smits, R., ten Bosch, L., and Collier, R. (1996a). Evaluation of various sets of acoustic cues for the perception of prevocalic stop consonants. I. Perception experiment. The Journal of the Acoustical Society of America 100, 3852-3864.
Smits, R., ten Bosch, L., and Collier, R. (1996b). Evaluation of various sets of acoustic cues for the perception of prevocalic stop consonants. II. Modeling and evaluation. The Journal of the Acoustical Society of America 100, 3865-3881.
Sussman, H., Hoemeke, K. and Ahmed, F. (1993). A cross-linguistic investigation of locus equations as a phonetic descriptor for place of articulation. J. Acoust. Soc. Am., 94, 1256-1268.
3. Speech Perception
Frieda, E., Walley, A., Flege, J. and Sloane, M. (1999). Adults' perception of native and nonnative vowels: Implications for the perceptual magnet effect. Perception & Psychophysics 61, 561-577.
Kewley-Port, D., C.S. Watson & D.C. Foyle (1988). Auditory temporal acuity in relation to category boundaries; speech and nonspeech stimuli. J. Acoust. Soc. Am., 83, 1122-1145.
Kluender, K. and Lotto, A. (1999). Virtues and perils of an empiricist approach to speech perception. J. Acoust. Soc. Am., 105, 503-511.
Kuhl, P. (1991). Human adults and human infants show a "perceptual magnet effect" for the prototypes of speech categories, monkeys do not. Perception & Psychophysics 50, 93-107.
Iverson, P. and Kuhl, P. (2000). Perceptual magnet and phoneme boundary effects in speech perception: Do they arise from a common mechanism? Perception & Psychophysics 62, 874-886.
Liberman, A.M. and Mattingly, I.G. (1989). A specialization for speech perception. Science, 243, 489-494.
Liberman, A.M., & Mattingly, I.G. (1985). The motor theory of speech perception revised. Cognition, 21, 1-36.
Liberman, A.M. (1997). When theories of speech meet the real world. J. Psycholinguistic Res. 27, 111-122.
Lindblom, B.E.F. (1996). Role of articulation in speech perception: Clues from production. JASA 99, 1683-1692.
Lotto, A., Kluender, K. and Holt, L. (1998). Depolarizing the perceptual magnet effect. JASA, 103, 3648-3655.
Sussman, J. and Lauckner-Morano, V. (1995). Further tests of the "perceptual magnet effect" in the perception of [i]: Identification and change/no-change discrimination. JASA 97, 539-552.
Thyer, N. and Hickson, L. (2000). The perceptual magnet effect in Australian English vowels. Perception & Psychophysics 62, 1-20.
Utman, J. (1998). Effects of local speaking rate context on the perception of voice-onset time in initial stop consonants. J. Acoust. Soc. Am., 103, 1640-1653.
Volaitis, L.E., & Miller, J.L. (1992). Phonetic prototypes: Influence of place of articulation and speaking rate on the internal structure of voicing categories. J. Acoust. Soc. Am., 92, 723-735.
4. Speech Synthesis and Analysis
AT&T Natural Voices: Research News
Kawahara, H. (1999). "Restructuring speech representations using a pitch-adaptive time-frequency smoothing ..", Speech Communication 27, 187-207.
DECtalk Software: Text-to-Speech Technology by Hallahan
Klatt, D.H. (1980). Software for cascade/parallel formant synthesizer. J. Acoust. Soc. Am., 67, 971-995.
Klatt, D.H. (1987). Review of text-to-speech conversion for English. J. Acoust. Soc. Am., 82, 737-793.
Markel, J.D. (1972). Digital inverse filtering - A new tool for formant trajectory estimation. IEEE Transactions on Audio and Electroacoustics, June, 129-137.
Smits, R. (1994). Accuracy of quasistationary analysis of highly dynamic speech signals. Journal of the Acoustical Society of America, 96, 3401-3415.
5. Infant and Child Production
Assmann, P. and Katz, W. (2000). Time-varying spectral change in the vowels of children and adults. J. Acoust. Soc. Am., 108, 1856-1866.
Eilers, R. and Oller, K. (1994). Infant vocalizations and the early diagnosis of severe hearing impairment. J. Pediatrics 124, 199-203.
Howell, P., Au-Yeung, J. and Pilgrim, L. (1999) Utterance rate and linguistic properties as determinants of lexical dysfluencies in children who stutter. J. Acoust. Soc. Am., 105, 481-489.
Kewley-Port, D. and Preston, M.S. (1974). Early apical stop production: A voice onset time analysis. Journal of Phonetics, 2, 195-210.
Lee, S., Potamianos, A. and Narayanan, S. (1999). Acoustics of children's speech: Developmental changes of temporal and spectral parameters. J. Acoust. Soc. Am., 105, 1455-1468.
Oller, D.K. and Eilers, R.E. (1988). The role of audition in infant babbling. Child Development 59, 441-449.
Stark, R. (1986). Prespeech segmental feature development. In Fletcher, P. and Garman, M. (Eds.) Language Acquisition. Cambridge Univ. Press, Cambridge, England (1st Edition).
Studdert-Kennedy, M. (1998). The particulate origins of language generativity: from syllable to gesture. In Hurford, Studdert-Kennedy and Knight, Approaches to the Evolution of Language, Cambridge Univ. Press: Cambridge UK, 202-221.
6. Infant and Child Speech Perception
Bradlow, A., Kraus, N., Nicol, T., McGee, T., Cunningham, J., Zecker, S. and Carrell, T. (1999). Effects of lengthened formant transition duration on discrimination and neural representation of synthetic CV syllables by normal and learning-disabled children. J. Acoust. Soc. Am., 106, 2086-2096.
Cacace, A.T., & McFarland, D.J. (1998). Central auditory processing disorder in school-aged children: a critical review. JSHR 41, 355-373.
Kraus, N., McGee, T., Carrell, T., Zecker, S., Nicol, T., Koch, D. (1996) Auditory neurophysiologic responses and discrimination deficits in children with learning problems. Science 273, 971-973.
Kuhl, P.K. (1993). Early linguistic experience and phonetic perception: implications for theories of developmental speech perception. Journal of Phonetics, 21, 125-?.
Leonard, L., McGregor, K. and Allen, G. (1992). Grammatical morphology and speech perception in children with specific language impairment. JSHR 35, 1076-1085.
Mody, M., Studdert-Kennedy, M. and Brady, S. (1997). Speech perception deficits in poor readers: auditory processing or phonological coding? J. Exp. Child Psychol., 64, 1-33.
Pegg, J. and Werker, J. (1997). Adult and infant perception of two English phones. Journal of the Acoustical Society of America, 102, 3742-3753.
Polka, L. & Bohn, O. (1996). A cross-language comparison of vowel perception in English-learning and German-learning infants. Journal of the Acoustical Society of America, 100, 577-592.
Polka, L. & Werker, J. (1994). Developmental changes in perception of nonnative vowel contrasts. J. of Exp. Psych: Hum. Percept. & Perform. 20, 421-435.
Rosen, S. and Manganari, E. (2001). Is there a relationship between speech and nonspeech auditory processing in children with dyslexia? JSHR, 44, 720-736.
Studdert-Kennedy, M. & Mody, M. (1995). Auditory temporal perception deficits in the reading-impaired: A critical review of the evidence. Psychonomic Bulletin & Rev. 2, 508-514.
Watson, C.S. and Kidd, G. (2002). On the lack of association between basic auditory abilities, speech processing and other cognitive skills. To appear in Seminars in Hearing, Vol. 23.
Werker, J. (1989). Becoming a native listener. American Scientist 77, 54-59.
Werker, J.F. (1993). The contribution of the relation between vocal production and perception to a developing phonological system. Journal of Phonetics, 21, 177-?.
Werker, J.F. & Polka, L. (1993). Developmental changes in speech perception: new challenges and new directions. Journal of Phonetics, 21, 83-101.
7. Voice and Prosody
Childers, D. and Lee, C. (1991). Vocal quality factors: Analysis, synthesis, and perception. Journal of the Acoustical Society of America, 90, 2394-2410.
Harris, M. and Umeda, N. (1987). Difference limens for fundamental frequency contours in sentences. Journal of the Acoustical Society of America, 81, 1139-1145.
Hess, W. (1983). Pitch Determination of Speech Signals. Springer-Verlag: Berlin.
Hermes, D. (1998a) Auditory and visual similarity of pitch contours. JSHR 41, 63-72.
Hermes, D. (1998b) Measuring the perceptual similarity of pitch contours. JSHR 41, 73-82.
Kishon-Rabin, L., Boothroyd, A., and Hannin, L. (1996). Speechreading enhancement: A comparison of spatial-tactile display of voice fundamental frequency (F0) with auditory F0. Journal of the Acoustical Society of America, 100, 593-602.
Klatt, D.H. and Klatt, L.C. (1990) Analysis, synthesis, and perception of voice quality variations among female and male talkers. J. Acoust. Soc. Am., 87, 820-857.
Stevens, K.N., Nickerson, R.S. and Rollins, A.M. (1983). Suprasegmental and postural aspects of speech production and their effect on articulatory skills and intelligibility. In Hochberg, I., Levitt, H. and Osberger, M.J. (Eds.) Speech of the Hearing Impaired - Research, Training, and Personnel Preparation. Univ. Park Press, Baltimore, MD, 35-51.
Sundberg, J. (2000). Level and center frequency of the singer's formant. (pre-print).
Strong, W. and Plitnik, G. Chap. 30, The Singing Voice, "Singers Formant", in Music, Speech and Audio.
Titze, I., Horii, Y. and Scherer, R. (1987). Some technical considerations in voice perturbation measurements, JSHR 30, 252-260.
Titze, I. and Liang, H. (1993). Comparison of F0 extraction methods for high-precision voice perturbation measurements. JSHR 36, 1120-1133.
8. Hearing Impaired Production
Bakkum, J.J., Plomp, R. and Pols, L.C.W. (1995) Objective analysis versus subjective assessment of vowels pronounced by deaf and normal-hearing children. J. Acoust. Soc. Am., 98, 745-762.
Goldhor, R. (1995). The perceptual and acoustic assessment of the speech of hearing-impaired persons. In Syrdal, A., Bennett, R. & Greenspan, S. (Eds.), Applied Speech Technology, CRC Press, Ann Arbor, 521-546.
Fourakis, M., Geers, A. and Tobey, E. (1993). An acoustic metric for assessing the change in vowel production by profoundly hearing-impaired children. Journal of the Acoustical Society of America, 94, 2544-2552.
Monsen, R.B. (1976). Normal and reduced phonological space: The production of English vowels by deaf adolescents. Journal of Phonetics, 4, 189-198.
Osberger, M.J. (1987). Training effects on vowel production by two profoundly hearing-impaired speakers. J. Sp. Hear. Res., 30, 241-251.
Ryalls, J., Baum, S., and Larouche, A. (1991). "Spectral characteristics for place of articulation in the speech of young normal, moderately and profoundly hearing-impaired French Canadians", Clinical Linguistics & Phonetics 5, 165-179.
Tye-Murray, N. (1992). "Articulatory organizational strategies and roles of audition", The Volta Review 94, 243-259.
9. Hearing Impaired Perception
Coughlin, M., Kewley-Port, D., & Humes, L. (1998). The relation between identification and discrimination of vowels in normal-hearing and elderly hearing-impaired listeners. J. Acoust. Soc. Am. 104, 3597-3607.
Dubno, J., Dirks, D. and Ellison, D. (1989). Stop-consonant recognition for normal-hearing listeners and listeners with high-frequency hearing loss. I: The contribution of selected frequency regions. J. Acoust. Soc. Am. 85, 347-354.
Lindholm, J., Dorman, M., Taylor, and Hannley, M. (1988). Stimulus factors influencing the identification of voiced stop consonants by normal-hearing and hearing-impaired adults. Journal of the Acoustical Society of America, 83, 1608-1614.
Nabelek, A.K., Czyzewski, Z., & Krishnan, L.A. (1992). The influence of talker differences on vowel identification by normal-hearing and hearing-impaired listeners. Journal of the Acoustical Society of America, 92, 1228-1246.
Ochs, M.T., Humes, L.E., Ohde, R.N. and Grantham, D.W. (1989). "Frequency discrimination ability and stop-consonant identification in normally hearing and hearing-impaired subjects", J. Speech Hear. Res. 32, 133-142.
Revoile, S., Pickett, J., Holden-Pitt, L.D., Talkin, D. and Brandt, F. (1987). "Burst and transition cues to voicing perception for spoken initial stops by impaired- and normal-hearing listeners", J. Speech Hear. Res. 30, 3-12.
Stelmachowicz, P., Kopun, J., Mace, A., Lewis, D. and Nittrouer, S. (1995). The perception of amplified speech by listeners with hearing loss: Acoustic correlates. Journal of the Acoustical Society of America 98, 1388-1399.
Stelmachowicz, P., Pittman, A., Hoover, B. and Lewis, D. (2001). Effect of stimulus bandwidth on the perception of /s/ in normal- and hearing-impaired children and adults. Journal of the Acoustical Society of America, 110, 2183-2190.
Summers, W.V. & Leek, M.R. (1992). The role of spectral and temporal cues in vowel identification by listeners with impaired hearing, J. Speech Hear. Res., 35, 1189-1199.
10. Acoustics and Cochlear Implants
Dorman, M.F., Loizou, P.C. and Rainey, D. (1997). Speech intelligibility as a function of the number of channels of stimulation for signal processors using sine-wave and noise-band outputs. Journal of the Acoustical Society of America 102, 2403-2410.
Dorman, M.F., and Loizou, P.C. (1997) Mechanisms of vowel recognition for Ineraid patients fit with continuous interleaved sampling processors. Journal of the Acoustical Society of America 102, 581-587.
Fishman, K., Shannon, R.V. and Slattery, W. (1997). Speech recognition as a function of the number of electrodes used in the SPEAK cochlear implant speech processor. J. Speech Hearing Res. 40, 1201-1215.
Friesen, L.M., Shannon, R.V., Baskent, D., and Wang, X. (2001). Speech recognition in noise as a function of the number of spectral channels: Comparison of acoustic hearing and cochlear implants. Journal of the Acoustical Society of America 110, 1150-1163.
Friesen, L.M., Shannon, R.V. and Slattery, W.H. (2000). Effects of electrode location on speech recognition with the Nucleus-22 cochlear implant. J. Am. Acad. Audiol. 11, 418-428.
Loizou, P.C., Poray, O. and Dorman, M. (2000) The effect of parametric variations of cochlear implant processors on speech understanding. Journal of the Acoustical Society of America 108, 790-802.
Loizou, P.C., Dorman, M.F. and Powell, V. (1997). The recognition of vowels produced by men, women, boys, and girls by cochlear implant patients using a six-channel CIS processor. Journal of the Acoustical Society of America 103, 1141-1149.
Staller, S., Beiter, A. and Brimacombe, J. (1994). Use of the Nucleus 22 channel cochlear implant system with children. The Volta Rev. 96, 15-39.
Vick, J.C., Lane, H., Perkell, J.S., Matthies, M.L., Gould, J., Zandipour, M. (2001) Covariation of cochlear implant users' perception and production of vowel contrasts and their identification by listeners with normal hearing. J. Speech Hearing Res. 44,
11. Second Language Acquisition
Best, C.T., McRoberts, G.W. and Goodell, E. (2001). Discrimination of non-native consonant contrasts varying in perceptual assimilation to the listener's native phonological system. J. Acoust. Soc. Am., 109, 775-794.
Bohn, O.-S., and Flege, J. (1997). Perception and production of a new vowel category by adult second language learners. In James, A. and Leather, J. (eds), Second-language speech: structure and process. Berlin: Mouton de Gruyter, p. 53-73.
Bradlow, A., Pisoni, D., Akahane-Yamada, R., and Tohkura, Y. (1997). Training Japanese listeners to identify English /r/ and /l/: IV. Some effects of perceptual learning on speech production. Journal of the Acoustical Society of America 101, 2299-2310.
Bradlow, A., and Pisoni, D. (1999). Recognition of spoken words by native and non-native listeners: Talker-, listener-, and item-related factors. Journal of the Acoustical Society of America 106, 2074-2085.
Flege, J., MacKay, I. and Meador, D. (1999). Native Italian speakers' perception and production of English vowels. J. Acoust. Soc. Am., 106, 2973-2977.
Flege, J.E. and Fletcher, K.L. (1992). "Talker and listener effects on degree of perceived foreign accent", J. Acoust. Soc. Am. 91, 370-389.
Guion, S., Flege, J., Akahane-Yamada, R., and Pruitt, J. (2000). An investigation of current models of second language speech perception: The case of Japanese adults' perception of English consonants. J. Acoust. Soc. Am., 107, 2711-2724.
Piske, T., MacKay, I., and Flege, J. (2001). Factors affecting degree of foreign accent in an L2: a review. Journal of Phonetics, 29, 191-215.
Mayo, L., Florentine, M. and Buus, S. (1997). Age of second-language acquisition and perception of speech in noise. J. Speech Hear. Res. 40, 686-693.
Weismer, G. and Martin, R. (1992). Acoustic and perceptual approaches to the study of intelligibility. In Kent, R. (ed.), Intelligibility in Speech Disorders, (John Benjamins: Philadelphia), 67-118.
12. Computer-Based Speech and Language Training
Akahane-Yamada, R., McDermott, Adachi, T., Kawahara, H., and Pruitt, J. (1998). Computer-based second language production training by using spectrographic representations and HMM-based speech recognition scores. ATR HIP Res. Labs, Vol. 2, 1747-1750.
Anderson, S. and Kewley-Port, D. (1995). Evaluation of speech recognizers for speech training applications. IEEE SAP 3, 229-241.
Dalby, J. and Kewley-Port, D. (1999). Explicit pronunciation training using automatic speech recognition technology. Calico 16, (Special edition of the Computer-Assisted Language Instruction Consortium Journal, Holland, M. (Ed.)) 425-445.
Dagenais, P., Critz-Crosby, P., Fletcher, S. and McCutcheon, M. (1994). Comparing abilities of children with profound hearing impairments to learn consonants using electropalatography or traditional aural-oral techniques. JSHR 37, 687-699.
Kewley-Port, D. and Watson, C. (1995). Computer assisted speech training: Practical considerations. In Syrdal, A., Bennett, R. and Greenspan, S. (Eds.), Applied Speech Technology, Boca Raton: CRC Press, pp. 565-582.
Maki, J., Gustafson, J., Conklin, J. and Humphrey-Whitehead, B. (1981). The speech spectrographic display: Interpretation of visual patterns by hearing-impaired adults. JSHD 46, 379-387.
Neumeyer, L., Franco, H., Digalakis, V. and Weintraub, M. (2000). Automatic scoring of pronunciation quality. Speech Communication 30, 83-93.
Russell, M., Series, R., Wallace, J., Brown, C. and Skilling, A. (2000). The STAR system: an interactive pronunciation tutor for young children. Computer Speech and Language 14, 161-175.
Yamada, Y., Javkin, H. and Youdelman, K. (2000). Assistive speech technology for persons with speech impairments. Speech Communication 30, 179-187.