Is It Time to Talk to Our Computers?
One of the major announcements from Google last week was adding voice search to their flagship product.
Apple added a voice-activated “personal assistant” called Siri to the iPhone two years ago at the introduction of the iPhone 4S.
After seeing the introduction of Google Now voice search and having used Siri since its introduction, I am wondering – are we ready to talk with our computers?
I don’t carry an Android phone, so I have no personal experience with the Google answer to (and some would say copy of) Apple’s Siri. But I have come to depend heavily on Siri for certain tasks – setting alarms and reminders, receiving and sending text messages (especially to stay safe while driving), and launching apps on my iPhone. As long as I stay within the bounds of the activities that Apple has foreseen (and thus the ones that Siri is programmed to do well), I find Siri to be spectacularly easy to use and very dependable. Friends who use Android phones tell me that they feel the same way about voice control on their phones.
Migrating voice control to computers, as demonstrated at Google I/O last week, might open the door to a much richer experience, because computers have vastly more powerful processors than smartphones. Think of HAL 9000 from 2001: A Space Odyssey or the computer on the Enterprise in the original Star Trek TV series.
I once had a client that was in the midst of cutting-edge work on handwriting recognition back when it was first being attempted seriously (remember the Newton?). That company is still around, and is quite successful at using artificial intelligence to create use-specific alternative ways to interact with computers. I called my old friend there to ask about the state of the art in voice recognition.
According to this expert in the field, there are two very different forms of voice recognition – which he called bounded and unbounded. Bounded voice recognition uses a set, limited vocabulary and is similar to selecting items from a menu. The computer merely needs to differentiate between the small number of items on the menu to determine what the user wants to do. Unbounded voice recognition tries to understand general speech. It is much, much harder to do well but is the “holy grail” in the recognition world.
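To make the menu analogy concrete, here is a minimal sketch in Python of what the bounded approach boils down to. It assumes the speech has already been turned into text, and the command list and the match_command function are hypothetical names I made up for illustration. This is not how Siri or Google actually work, and the fuzzy string matching from Python's standard difflib module is only a stand-in for real acoustic matching.

```python
import difflib

# Hypothetical "menu" of phrases the system knows; illustrative only.
COMMANDS = ["set alarm", "set reminder", "send text message", "launch app"]


def match_command(heard, commands=COMMANDS, cutoff=0.6):
    """Return the menu item closest to the heard phrase, or None.

    Bounded recognition only has to tell a few known items apart,
    so anything that is not close to a menu item is simply rejected.
    """
    normalized = heard.lower().strip()
    matches = difflib.get_close_matches(normalized, commands, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    print(match_command("Set Alarm"))  # exact menu item after lowercasing
    print(match_command("qqqq"))       # nothing on the menu is close: None
```

The point of the sketch is the shape of the problem: the program never has to understand what was said, only to pick the nearest item from a short list or give up. Unbounded recognition has no such list to fall back on.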
Both Apple and Google use bounded voice recognition in their systems, but both keep adding keywords to those systems in hopes that they begin to seem more and more like unbounded recognition. According to my expert friend, this approach will never achieve the real goal: users being able to ask computers to do things with natural conversation. That goal requires a completely different technological approach.
We don’t, of course, know what Apple, Google, and other companies are working on behind closed doors. We might see a spectacular unbounded voice recognition system introduced any day. But in the meantime, there is a saying that we all need to remember: Don’t let the perfect be an enemy of the good. In other words, we can still appreciate a good bounded voice recognition system even while recognizing that we would rather have an unbounded one.
So I am looking forward to interacting with my computer via voice – not always, but certainly in some situations. And having grown up when Star Trek was on TV, I hope I can configure my system to wake up when I say the word “computer” and respond to me with “working.”