SICSA DVF: Prof. Mari Ostendorf - Rich Speech Transcription for Spoken Document Processing - SICSA - The Scottish Informatics & Computer Science Alliance

Date/Time
Date(s) - 27/11/2012
2:00 pm - 3:00 pm

Location
The Royal College of Surgeons, King Khalid Building Symposium Hall

Abstract:
As storage costs drop and bandwidth increases, there has been rapid growth in the spoken information available via the web or in online archives (including radio and TV broadcasts, oral histories, legislative proceedings, and call center recordings), raising problems of document retrieval, information extraction, summarization and translation for spoken language. While there is a long tradition of research in these technologies for text, new challenges arise when moving from written to spoken language. In this talk, we look at differences between speech and text, and how we can leverage the information in the speech signal beyond the words to provide a rich, automatically generated transcript that better serves language processing applications. In particular, we look at how prosodic cues can be used to recognize segmentation, emphasis and intent in spoken language, and how this information can impact tasks such as topic detection, information extraction, translation, and social group analysis.

Biography:
Mari Ostendorf is Professor of Electrical Engineering at the University of Washington and Adjunct in Computer Science and Linguistics. Her research interests are in dynamic and linguistically motivated statistical models for speech and language processing that consider the interaction of topic, genre and register. She is a Fellow of the IEEE and ISCA. Prof. Ostendorf is a leading researcher in spoken language processing and has made significant, long-lasting contributions to speech recognition, speech synthesis, prosodic analysis, and computational linguistics. Her current research interests include low-resource language modelling; computational modelling of prosody for spoken document processing; the use of parsing in speech recognition; language technology for education applications; and extracting social roles and relation information from spoken and written discussions.