Date(s) - 22/11/2012
2:00 pm - 3:00 pm
This talk looks at the implications of different spoken language processing tasks and machine learning strategies for representing prosody in a computational model. We consider local prosodic events related to prominence and segmentation, as well as prosodic patterns that provide cues to high-level phenomena related to topic dynamics and social interaction. At the local level, the case for a symbolic intermediate representation of prosody is presented, with a discussion of the challenge of making prosodic cues truly complementary to lexical cues in language processing. For higher-level language analysis, we suggest treating prosodic correlates (e.g. F0 and energy contours) as signals and applying frequency analysis techniques for feature extraction, analogous to recent work in speech recognition aimed at moving beyond frame-based analysis. A cross-cutting theme is the issue of normalization to factor out the different types of information that prosodic correlates carry.
Mari Ostendorf is Professor of Electrical Engineering at the University of Washington and Adjunct in Computer Science and Linguistics. Her research interests are in dynamic and linguistically motivated statistical models for speech and language processing that consider the interaction of topic, genre and register. She is a Fellow of the IEEE and ISCA. Prof. Ostendorf is a leading researcher in spoken language processing and has made significant (and long-lasting) contributions in speech recognition, speech synthesis, prosodic analysis, and computational linguistics. Her current research interests include low-resource language modelling; computational modelling of prosody for spoken document processing; the use of parsing in speech recognition; language technology for education applications; and extracting social roles and relation information from spoken and written discussions.