

Let's talk about how to wreck a nice beach.
Well, actually, if I were presenting this chapter verbally, you would have little difficulty understanding the preceding sentence as "Let's talk about how to recognize speech." Of course, I wouldn't have enunciated the g in "recognize", but then we routinely leave out and otherwise slur at least a quarter of the sounds that are "supposed" to be there, a phenomenon speech scientists call coarticulation. On the other hand, had this been an article on a rowdy headbangers' beach convention (a topic we assume HAL knew little about), the interpretation at the beginning of the chapter would have been reasonable. On yet another hand, if you were a researcher in speech recognition and heard me read the first sentence of this chapter, you would immediately pick up the beach-wrecking interpretation, because this sentence is a famous example of acoustic ambiguity and is frequently cited by speech researchers.

The point is that we understand speech in context. Spoken language is filled with ambiguities. Only our understanding of the situation, subject matter, and person (or entity) speaking -- as well as our familiarity with the speaker -- lets us infer what words are actually spoken.

Perhaps the most basic ambiguity in spoken language is the phenomenon of homonyms, words that sound absolutely identical but are actually different words with different meanings. When Frank asks, "Listen, HAL, there's never been any instance at all of a computer error occurring in a 9000 series, has there?", HAL has little difficulty interpreting the last word as "there" and not "their". Context is the only source of knowledge that can resolve such ambiguities. HAL understands that the word "their" is an adjective and would have to be followed by the noun it modifies. Because it is the last word in the sentence, "there" is the only reasonable interpretation. Today's speech-recognition systems would also have little difficulty with this word and would resolve it the same way HAL does.
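The positional rule just described (a word like "their" must be followed by the noun it modifies, so it cannot end a sentence) can be sketched in a few lines of Python. This is a toy illustration only: the word lists and the function names `candidates_for` and `resolve_by_position` are assumptions made for this example, not part of any real recognizer, and real systems rely on far richer statistical context.

```python
# Toy illustration only: the lexicon and the single rule are assumptions.
HOMOPHONE_SETS = [{"there", "their"}]

# Words that cannot end a sentence because they must precede the noun they modify.
REQUIRES_FOLLOWING_NOUN = {"their"}


def candidates_for(word):
    """Return every spelling that sounds like `word`, including itself."""
    for group in HOMOPHONE_SETS:
        if word in group:
            return sorted(group)
    return [word]


def resolve_by_position(words, index):
    """Keep only the spellings of words[index] that are allowed at that position."""
    is_last = index == len(words) - 1
    return [
        w
        for w in candidates_for(words[index])
        if not (is_last and w in REQUIRES_FOLLOWING_NOUN)
    ]


# The final word of Frank's question: only one spelling survives the rule.
print(resolve_by_position(["has", "there"], 1))
```

When the ambiguous word is not in final position, this rule alone leaves both spellings in play, which is why further context is needed in general.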
A more difficult task in interpreting Frank's statement is the word "all". Is "all" a place -- such as IBM headquarters -- where a computer error may take place, as in "there's never been any instance of a computer error at IBM ... "? HAL resolves this ambiguity the same way viewers of the movie do. We know that "all" is not the name of a place or organization where an error may take place. This leaves "at all", an expression of emphasis reinforcing the meaning of "never", as the only likely interpretation.

In fact, we try to understand what is being said before the words are even spoken, through a process called hypothesis and test. Next time you order coffee in a restaurant and a waiter asks how you want it, try saying "I'd like some dream and sugar please." It would take a rather attentive person to hear that you are talking about sweet dreams and not white coffee. When we listen to other people talking -- and people frequently do not really listen, a fault that HAL does not seem to share with the rest of us -- we constantly anticipate what they are going to say ... next. Consider Dave's reply to HAL's questions about the crew psychology report:

Dave: Well, I don't know. That's rather a difficult question to ...

When Dave finally says "answer", HAL tests his hypothesis by matching the word he heard against the word he had hypothesized Dave would say. In watching the movie, we all do the same thing. Any reasonable match would tend to confirm our expectation.
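A minimal sketch of hypothesis and test, under loud assumptions: the table `EXPECTED_NEXT` is a hypothetical stand-in for a listener's expectations, and spelling similarity from Python's standard `difflib` module stands in for acoustic similarity, which a real recognizer would measure on the sound signal instead. The threshold of 0.6 is likewise an arbitrary choice for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical expectations: preceding words -> words the listener anticipates next.
EXPECTED_NEXT = {
    ("difficult", "question", "to"): ["answer"],
    ("some",): ["cream"],
}


def similarity(a, b):
    """Spelling similarity in [0, 1]; a crude stand-in for acoustic similarity."""
    return SequenceMatcher(None, a, b).ratio()


def hypothesize_and_test(context, heard, threshold=0.6):
    """Return the anticipated word if `heard` is a reasonable match, else `heard`."""
    for hypothesis in EXPECTED_NEXT.get(tuple(context), []):
        if similarity(hypothesis, heard) >= threshold:
            return hypothesis  # the expectation is confirmed
    return heard  # no expectation matched; take the word at face value


# Dave's sentence: the heard word matches the hypothesis exactly.
print(hypothesize_and_test(["difficult", "question", "to"], "answer"))

# The coffee order: "dream" is close enough to the expected "cream" to be heard as it.
print(hypothesize_and_test(["some"], "dream"))
```

The second call shows why the waiter is likely to miss the joke: a near match to a strong expectation is accepted as confirmation rather than examined on its own.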