What is Google Duplex?
Duplex is a new Google tool that uses complex artificial intelligence (AI) not only to dial a phone number and conduct a voice conversation but also to sound convincingly human in the process. It’s designed to carry out specific tasks, such as booking appointments and checking opening hours. But although it undoubtedly marks the next step towards effective human-computer interaction, it is already proving controversial.
What’s the idea behind it?
Google noticed that many businesses have yet to introduce an online booking service. This means we still have to make phone calls to book appointments with dentists, doctors, opticians, hairdressers, restaurants and the like. In an effort to bring these phone-only services more in line with those that let you book online, Google is offering to conduct the necessary conversation on our behalf.
How does it work?
Duplex is to be integrated with Google Assistant, and you simply have to ask it to perform a certain task. You could ask when a local computer shop is open, for example, or make an appointment with your nearest hairdresser. In either case, details of the request are passed to Duplex, which then makes a call, does the talking for you and reports back with an answer or confirmation. Magically, the person on the other end should have no idea that they’ve just been speaking to a computer.
How does it achieve that?
For several years, Google has been training a recurrent neural network using anonymized phone-conversation data. Such networks work to some extent like a human brain, using memory to process input sequences while taking into account their own output, context, and history. As you can imagine, this is hugely complex. Duplex needs to work with the output from Google’s automatic speech recognition software while taking into account features of the audio and the conversation’s history. It also needs to operate within the parameters of the conversation it is tasked to carry out, knowing what to ask and the kind of responses it should give.
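The recurrent idea described above, where each step of a sequence is processed while carrying forward a memory of everything seen so far, can be illustrated with a minimal sketch. This is not Google’s model; the layer sizes, random weights and inputs here are purely hypothetical, chosen to show how the hidden state acts as the network’s running memory.

```python
import numpy as np

# Hypothetical sizes and random weights, for illustration only.
rng = np.random.default_rng(0)
input_size, hidden_size = 4, 8
W_in = rng.normal(size=(hidden_size, input_size)) * 0.1  # input-to-hidden weights
W_h = rng.normal(size=(hidden_size, hidden_size)) * 0.1  # hidden-to-hidden (the "memory") weights
b = np.zeros(hidden_size)

def rnn_step(x, h):
    """One time step: the new state depends on the current input AND the previous state."""
    return np.tanh(W_in @ x + W_h @ h + b)

# Feed a short sequence of (made-up) inputs; the final hidden state
# summarizes the whole history, not just the last input.
h = np.zeros(hidden_size)
for x in rng.normal(size=(5, input_size)):
    h = rnn_step(x, h)
```

Because `h` is fed back into every step, earlier inputs continue to influence later outputs, which is what lets such a network take a conversation’s history into account.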
Can I just chat merrily away to it?
No, not yet. Duplex will not be phoning businesses for general chit-chat and neither could it handle one if it did: flawless open-ended conversation is difficult to achieve because it relies on context and multiple layers. For this reason, Duplex can only respond in limited ways, so discussing the weather and what you’re having for your tea isn’t on the agenda right now.
But does it sound convincingly human?
Yes. Based on the examples that Google demonstrated at its annual I/O developer conference, Duplex can start a conversation with a person who picks up the phone, make the correct request, listen for a response, answer questions and deal with nuances of speech. As a follow-up blog showed, Duplex is also able to handle interruptions and elaborations, with latency taken into account (“hellos” are responded to immediately, while responses that would ordinarily require a little bit of thought are slightly delayed). It even “ums” and “aahs” like a human, lending extra authenticity to its speech, so the person on the other end of the phone thinks they are speaking to a real person.
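The latency behaviour described above, replying to greetings at once while pausing briefly before answers that would normally require thought, amounts to a simple timing rule. The sketch below is a hypothetical illustration of such a rule, not Google’s actual logic; the word list and delay values are invented.

```python
# Hypothetical greeting list and delay values, for illustration only.
GREETINGS = {"hello", "hi", "hey"}

def response_delay(utterance: str) -> float:
    """Return how long (in seconds) to wait before replying: greetings get an
    immediate response, while anything else gets a short, human-like pause."""
    first_word = utterance.lower().strip(" ?!.").split()[0]
    return 0.0 if first_word in GREETINGS else 0.6
```

For example, `response_delay("Hello?")` would return `0.0`, while a question such as “Do you have a table for four?” would be answered after a short artificial pause.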
Will people be told they’re talking to a robot?
In the audio examples given by Google, Duplex does not identify itself as a machine. This has caused controversy because many people worry that it is deceptive and creepy. But Google insists it is experimenting to find the right approach and, since the demonstration, it has said Duplex will indeed identify itself (as Google Assistant) at the start of a conversation.
Can it recognize any speech?
Apparently so, although Google has not yet provided any examples of a conversation in which Duplex struggles to be heard over clattering plates or snipping scissors, nor has it given evidence of Duplex coping with fast talkers or strong accents. However, voice-recognition software, neural networks, and natural language processing are advancing at a rate of knots, so we wouldn’t be surprised.
Will it replace infuriating automated phone systems?
Sadly, that’s not the current intention. Instead, Google Duplex is flipping computer-human conversations around, with the business on the receiving end rather than the customer. But Duplex makes the process of talking to a computer much more natural. Google says humans do not have to adjust to the system; the system adjusts to us.
Isn’t this still a slippery slope?
Maybe. Some people believe a robot should always sound like a robot, so there are no misunderstandings. There’s certainly a fear that human-sounding robot voices could be used to manipulate people. We’ve already seen AI that can mimic any voice: a startup called Lyrebird (lyrebird.ai) creates digital voices that sound like you by sampling a minute of your speech. But even something as small as the “umms” and “aahs” of Duplex could be seen as a step too far.
How all this affects the caller is another matter entirely. It’s even possible that by becoming accustomed to talking to robots, we’ll change the way we converse with each other.
Will Google record the conversations?
Yes, but only “in certain jurisdictions”. We take that to mean it will record conversations where it is legal to do so and that it will seek permission in countries that require people to know a recording is being made. Google says recordings could be shared with users so they have a record of how they responded. The information is also likely to aid Duplex’s learning process.
Why can’t we just pick up the phone ourselves?
Well, we can, of course, and we envisage that many people will still prefer to call directly. Going via Duplex does have benefits, though. It can save time and allows Google Assistant to make a note of the appointment in your calendar and set reminders. The system also means businesses don’t have to spend a fortune on online booking systems. It could even get us across language barriers: you could request a booking for a Parisian restaurant in English using Google Assistant and have the request made in perfect French.