If Google wants to hang onto its status as the world’s dominant source of information, it needs to make sure people keep using Google products when they’re in hands- and screen-free situations. As part of that goal, it needs to gain a greater foothold in voice.
Voice is the technology every major Silicon Valley company is racing to dominate before anyone else, and Google, with its search and language capabilities, would seem poised to take the lead.
But Google is starting from behind. The company made a late push into hardware, and Apple’s Siri, available on iPhones, and Amazon’s Alexa software, which runs on its Echo and Dot devices, have clear leads in consumer adoption.
To master voice, Google will have to contend with technology that’s not friendly to advertising, its main business, or suitable for Google’s directory-like approach to organizing web results.
Voice is growing as an interface through which people interact with artificial intelligence. And AI isn’t just a change in how people access information; it’s the next jump in computing. Google can’t afford to lose ground in the battle for this coming ecosystem.
How Google’s voice technology works
People commonly refer to voice capabilities on Google devices as voice search even when they mean other functions. The capabilities were introduced to Android as “Voice Actions” in 2010.
One of the latest pieces of software to incorporate voice capabilities is Assistant, an AI platform that runs on Google Home (which competes with the Echo), Google’s Pixel phone and later versions of Android.
In addition to web results, Assistant can connect with other devices to allow users to control them by voice, integrate with third-party apps and pull up personal information like Google Calendar appointments. Assistant also works in text settings, but is mostly known to consumers as the voice that emanates from the Home device.
Gaining traction with voice, and making money from it, will require Google to overcome a number of hurdles.
There’s no clear plan for making money
Google has not shared a plan for how it will make money from voice tools like Assistant and Voice Search. CEO Sundar Pichai emphasized during the fourth-quarter earnings call in January that it was early days for voice, and that Google’s focus was on making sure Google tools were available and useful to consumers at all times.
But right now, Google is not letting advertisers or businesses buy their way into voice results the way they can buy a slot at the top of a search results page. So when you hear an answer from your Home device, no one paid Google to put that there.
Google could still make money through e-commerce the way Amazon does with Echo, but Google trails Amazon in product search and in online shopping generally. Catching up to its competitor and then spinning shopping into a main source of revenue for a key piece of software seems unlikely.
You can’t just sneak in ads
In March, Google Home devices played what sounded a lot like an ad, though Google said it wasn’t an ad at all. The spot was a promotion for the Disney film “Beauty and the Beast.” When users asked Home for a preview of their schedules by saying, “Okay Google, tell me about my day,” Google added a line to the end of the rundown: “by the way, Disney’s live action ‘Beauty and the Beast’ opens today.”
Users were caught off guard and took to social media to complain. Whatever it was, it wasn’t invited and it wasn’t subtle.
Google said at the time it wasn’t a paid promotion, but it does give you an idea of how an ad could work in a voice setting. The problem, however, is it might be harder to get people used to these kinds of promotions. People complained when Google started showing paid search links, but users could still choose to not click. It’s harder to ignore or skip a paid audio ad.
Without the hardware, no one will hear you
If Google wants to hold its own with voice, it needs to sell Home devices and Pixel phones, which run Google’s AI software.
But Google didn’t make its push into hardware until late in 2016, when it released the Home and the Pixel phone. The two devices are going up against Amazon’s Echo and Apple’s iPhone, respectively. Apple has sold roughly one billion iPhones, which also means Siri runs on more devices than any other AI assistant. Amazon, meanwhile, has sold 6.3 million Alexa-powered Echos and Dots, according to estimates from research firm Strategy Analytics.
Google, not to be easily outdone, is expected to reach one million sales of Home devices by the middle of the year, according to the same research firm. Google was also on track to sell an estimated three million to five million Pixel phones in the fourth quarter, according to Morgan Stanley.
Google says its AI-enabled Assistant is actually available much more broadly. A recent update on later versions of Android means 200 million devices should be getting access to it.
But what can happen is not necessarily what will happen. Google has little control over whether manufacturers and mobile networks ensure devices get the updates needed for the software to work. Even if some people think Siri sucks, a lot more people use it because they have iPhones and Apple has complete control over that software.
It sounds different when you say it
When Google’s Assistant responds to a voice query, you don’t see the full list of possible results the way you do on a web results page. Instead, you hear a voice reading a snippet. This makes it sound like Google is endorsing the answer, the equivalent of clicking the link for you.
That’s especially a problem when the answer is wrong. This happened in March, when Home users observed that if you asked “Is Obama planning a coup?” the device responded by reading a site falsely claiming former President Barack Obama was trying to overthrow the government.
(Google eventually amended results for that query so the result saying Obama was planning a coup didn’t rise to the top.)
It’s hard to keep things private when speaking aloud
Assistant accesses personalized features like your search history and your calendar. Google recently enabled voice recognition on Home devices so it can run multiple accounts, the idea being that different users can each retrieve their own personal information, but not someone else’s.
But if the AI messes up, it creates a privacy problem that would be especially serious in troubled households.
“The worst cases are abusive relationships or power relationships between parents and children,” said Electronic Frontier Foundation Chief Computer Scientist Peter Eckersley. “In both cases, privacy can be a really serious matter.”
Google admits Home’s voice recognition capabilities are not perfect. “We’re continuing to fine-tune our voice recognition systems and will get better over time,” the company said in a statement.
Assistant has already struggled with privacy in situations that involve multiple users. In text conversations in Google’s messaging app Allo, Assistant has shared one user’s personal Maps information where another user could see it, and has seemed to share search history. Google said it fixed those glitches.
If Google doesn’t solve the issue, it could dissuade people from using the feature — or using Google’s voice-enabled technology at all.
Other issues with voice
People are still getting used to voice interaction with computers and they speak to devices in a different way from how they type.
Typed searches are easy to adjust if they fail, while unsuccessful voice queries engender more frustration. There’s an iterative approach to text search that doesn’t carry over to voice, says UC Berkeley computer science professor Dan Klein: “You type a search, it doesn’t work, you modify your search.”
Voice tools are known to have trouble with accents. “Understanding accents and different types of voices is a huge challenge for computers, which is why we train our systems on voice data through our services,” a Google spokesperson said, adding that Google is continuing to train its systems to better recognize accents.
Beyond the difficulty of grasping what users ask and request, a voice tool is also limited in how it can respond. Responses have to be shorter than text responses and, at least with the way Assistant frames responses, “they give you no hint that there are other answers,” said SEO marketer Will Critchlow.