Learning variation in a second language: A cross-language study of rate normalization.

Nagao, K. (Departments of Linguistics and Speech and Hearing Science, Indiana University), de Jong, K.J. (Departments of Linguistics and Cognitive Science, Indiana University), and Lim, B.J. (Department of East Asian Languages and Cultures, University of Wisconsin)

Speech categories are rendered with an enormous range of variability, both within and across speakers. Speaking rate, for example, pervasively affects the acoustic properties of linguistic categories. Nevertheless, previous literature shows that listeners still identify those categories systematically. One important question facing second language research is how learners deal with such variability when identifying speech categories. This paper examines the relationship between variability induced by speech rate and the perception of English obstruent voicing by Japanese and Korean learners of English. While most previous perception studies used computer-generated stimuli, this study examines naturally produced rate-varied speech. Four native speakers of American English repeated /bi/ and /pi/ at increasing rates in time with a metronome. Three-syllable stimuli were spliced from the repetitive speech and presented to four groups of ESL listeners, and to English controls. Analyses of identification patterns show similar rate-normalizing perceptual functions for all listener groups; listeners' category boundaries on a voice onset time (VOT) continuum shift to a lower value when syllable duration decreases. However, identification functions for ESL listeners were systematically shifted in the direction of their native language, to longer VOTs for Korean, and shorter VOTs for Japanese. This cross-language difference strongly suggests that rate normalization is not a general auditory mechanism, but is based on the distribution of consonants that the listeners have experienced. Further analyses of individual stimuli were performed to determine where in the distribution of productions the English controls and the ESL listeners diverged. These analyses find pervasive response differences, both to tokens in the centers of the production distributions and to tokens at the extreme edges. Preliminary analyses indicate that these divergences are compatible with a model in which listeners abstract rate-normalized functions and modify such functions in the context of a new language. [Work supported by the NIH and NSF.]