Deriving and utilizing rich representations in spoken language translation
Current statistical speech translation approaches rely predominantly on text transcripts alone and do not adequately utilize the rich contextual information, such as prosody and discourse function, that is conveyed beyond words and syntax. In this talk I will introduce a novel framework for enriching speech translation with prosodic prominence and dialog acts. Our approach of incorporating rich information in speech translation is motivated by the fact that it is important to capture and convey not only what is being communicated (the words) but also how it is being communicated (the context). First, I will present various techniques that we have developed for automatically detecting prosody and dialog acts from speech and text, and will survey some of our most important results. I will then describe techniques for integrating these rich representations into spoken language translation.
Speaker: Vivek Kumar Rangarajan Sridhar
Vivek Kumar Rangarajan Sridhar received the B.E. (honors) degree in electrical and electronics engineering from the Birla Institute of Technology and Science, Pilani, India, in 2002, and the M.S. degree in electrical engineering from the University of Southern California, Los Angeles, in 2004, where he is currently pursuing the Ph.D. degree in electrical engineering. He is a member of the Speech Analysis and Interpretation Laboratory led by Prof. Shrikanth Narayanan. His general research interests include speech recognition, spoken language understanding, spontaneous speech processing, disfluency detection, speech-to-speech translation, text-to-speech synthesis and articulatory modeling. Vivek is a recipient of the Best Teaching Assistant award from the USC Electrical Engineering Department (2003-2004). More information on his research and interests can be found at http://sail.usc.edu/~vrangara
Google Tech Talks
July 7, 2008