Digital assistants are no longer anything new: whether Siri, Cortana and Co., we already use them on our smartphones, on our TVs or in the car. Even if we still laugh at some of a digital assistant's answers, speech recognition represents a revolution that has radically changed the way we interact with computers. We have moved far beyond typed commands; Siri and Co. already answer our questions and carry out actions even when we speak freely with them. But that doesn't always work straight away. Everyone has experienced it: you talk and are simply not understood. We cannot always assume that everyone who listens to us understands us. Perhaps you simply don't speak the same language, perhaps it is too loud, or perhaps you speak too unclearly. Dictation over the phone or spelling out a family name often fails. Even writing "sie" (she/they) and "Sie" (polite you) correctly is an unattainable art for some people.
As humans we are aware of these problems and ask: "What did you say? I didn't understand you, it's too loud." A computer, however, has only one automatic response: "I didn't understand you. Could you please repeat what you said?" At this point, the computer doesn't even know what it hasn't understood!
And now we want to speak with computers? It seems so easy in films: the protagonist speaks and the computer does – what, exactly? For speech recognition and virtual assistants to become genuinely useful in everyday life, we need to be aware of what the computer should do for us, how it picks up words and commands, and where the limits of a technical device lie. Computers work on the principle of input, processing and output. The input is our speech, which reaches the computer as an audio signal. That is all the computer perceives in this moment: an audio signal. Even if the audio signal comes from a jackhammer, the computer will try to recognize speech in it. It listens, but it doesn't understand. There are two fundamental ways the signal can be processed. An audio signal can simply be recorded, burned onto a CD and used, for example, as an audio book in the car. But we want more: the audio signal should be processed further so that, for example, text appears on a screen. The word "deoxyribonucleic acid" only appears on the screen if it has already been programmed into the computer. Otherwise something like "De oxy rib nucleic acid" appears instead. The computer merely listens and writes words rather than understanding what has been said.
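The vocabulary limit described above can be illustrated with a deliberately simplified sketch. The vocabulary, the function name and the fragment fallback are all invented for illustration; a real recognizer matches acoustic models against a language model, not strings against a list.

```python
# Toy sketch: a "recognizer" that can only write words it already knows.
KNOWN_WORDS = {"deoxyribonucleic", "acid", "hello"}

def transcribe(heard_word: str) -> str:
    """Return the word if it is in the vocabulary; otherwise
    write what was heard as a best-guess split into fragments."""
    if heard_word.lower() in KNOWN_WORDS:
        return heard_word
    # Unknown word: the computer writes pieces of what it hears.
    return " ".join(heard_word[i:i + 4] for i in range(0, len(heard_word), 4))

print(transcribe("acid"))          # known word, written correctly
print(transcribe("ribonuclease"))  # unknown word, comes out as fragments
```

A word outside the vocabulary does not fail silently; it is written down as noise-like fragments, which is exactly why "deoxyribonucleic acid" must already be in the system before it can appear correctly on screen.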
The key to digital assistants that don't just pick up what is said, but correctly recognize it and turn it into useful actions, is to let the computer share in human knowledge. This is done by entering words, both written and spoken, into the system so it can learn them. This personal recognition is then saved in a speech profile. Once a networked device has access to this speech profile, you only rarely hear "Sorry, I didn't understand that" from your computer.
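The idea of enrolling words into a personal speech profile can be sketched as follows. The class, its method names and its behaviour are assumptions made for illustration only, not the API of any actual assistant.

```python
# Hypothetical sketch of a personal speech profile: words the user
# enrolls are learned once and recognized afterwards.
class SpeechProfile:
    def __init__(self, owner: str):
        self.owner = owner
        self.learned_words: set = set()

    def enroll(self, word: str) -> None:
        """Teach the profile a new word (written or spoken form)."""
        self.learned_words.add(word.lower())

    def recognize(self, heard: str) -> str:
        if heard.lower() in self.learned_words:
            return heard
        return "Sorry, I didn't understand that."

profile = SpeechProfile("Alice")
profile.enroll("deoxyribonucleic")
print(profile.recognize("deoxyribonucleic"))  # now understood
print(profile.recognize("ribosome"))          # not yet enrolled
```

Any networked device with access to the profile would use the same learned vocabulary, which is why enrollment only has to happen once.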
If in the future we work together with our devices, speech recognition combined with artificial intelligence will change how data and information are processed in ways that were not possible until now. Voice control radically simplifies the complex operation of technical devices in the Internet of Things (air conditioning, TVs, household appliances) and makes technical assistants even more useful to us. We focused on exactly this in this year's communication for our client Nuance Communications. We presented the new solutions to editors and showed that the initial difficulties have been overcome technically; now the task is to eliminate the human hurdles and concerns. We offered a look behind the scenes and invited journalists to Aachen, to the speech laboratory and the largest Nuance research centre in Europe. The motto here was: see, hear and test. We are excited about the upcoming developments and innovations in these areas. Who knows, maybe we'll even be surprised at CES 2016!