Beyond Comprehension – Where Voice Search Could Be Headed
Voice search has long been a hot topic. In response, all of the big players have developed their own voice-enabled products: mobile search (Siri, Google Voice Search, Bixby), home assistants (Alexa, Google Assistant), PCs (Cortana), even remote controls (Xfinity Voice Remote).

This saturation has increased awareness, interest, and overall use of voice. Google recently reported that about 20 percent of its mobile queries come in via voice, and that number is expected to climb in the next few years.

Fast Tracking Mass Adoption

The fact that voice search, and devices with the capacity for it, are so prevalent makes trial easy enough. The question is: does trial ensure long-term adoption? And does adoption mean consistent, complex interactions, or merely the occasional use of voice for navigational or hands-free purposes? There's a big difference between the two.

As it relates to adoption, the biggest barrier has always been comprehension. The eventual goal is for speech to be understood in any environment, and for the system to respond with accurate answers.

Currently, most voice-recognition software isn't polished enough to accomplish this. Experts have stated that accuracy needs to reach at least a 95 percent threshold for people to feel confident in the technology and, in turn, accelerate mass adoption. Hitting that 95 percent level, which Google's voice recognition accomplished last year, was a huge milestone, but continued improvement beyond it will go a long way toward the public trusting voice.

Moving Beyond the Basics of Comprehension

As voice-accuracy levels continue to improve, there is another obstacle to overcome. Voice will need to go beyond comprehension. It will need to understand complex, conversational interactions: conversations that go beyond a specific string of words or a set of manufacturer-approved questions. Voice will need to adapt in real time as it interacts with humans, and it will need to learn its user well enough to anticipate questions before they arise.

Google demoed this type of forward-thinking technology at its I/O conference this past May. Its voice functionality successfully placed a phone call to a hair salon and made an appointment with a human. Google CEO Sundar Pichai noted the technology, called Google Duplex, was still being refined behind the scenes. “The amazing thing is that Assistant can actually understand the nuances of conversation,” noted Pichai.

And that’s the key. The true potential of voice. Enabling the technology to go beyond simply responding to prompts. Propelling it to interact in a conversational – and meaningful – manner. The possibilities around technology like that are limitless. 

The Current Role of Voice Search

2020 is a big year for voice. It's when comScore predicts 50 percent of all searches will be voice-related, the point at which voice is destined to become the new face of search.

But voice search – even the futuristic utopian version – still has its limitations. It’s unlikely to be the silver bullet that puts an end to the traditional keyword-based model. Its strength isn’t in producing pages of results. With voice, it isn’t about options, it’s about specificity. One prompt, one response.  

Additionally, there are certain interactions that require more than auditory cues. Asking a digital assistant for the best pizza place in your area might yield multiple results that inherently require more research to decide. Do they deliver? How much for a large pizza? What toppings can I get? Are there any coupons? What’s the wait time? Sometimes we don’t realize how multi-layered even the tiniest decisions can be. As of now, these are typically interactions best conducted by traditional search.  

However, there is a scenario where voice can excel, and already has: when someone knows exactly what they want and exactly where to get it. That is the interaction the current iteration of voice is built for. It's what the biggest names in technology have whittled its purpose down to: serving as a direct conduit to known entities.

Profit Shouldn’t Be The Main Goal

The previously mentioned “I know what I want and where I want it from” scenario is why it makes so much sense for Google (with its Walmart and Target partnerships) and Amazon to dive headfirst into this type of technology. Voice and digital assistants represent a direct line of communication to the place every retailer has always wanted to be: the home. Amazon tried a similar tactic with its Dash buttons, but the idea never fully resonated. No one wants dozens of branded buttons scattered throughout their home that need to be located every time something runs out. They want simplicity. And voice, designed for short, direct questions with short, direct answers, provides that simplicity:

“Alexa, order more Bounty paper towels.”

Done. Alexa knows the quantity you ordered last time, your address, your payment info: all of the details that previously bogged down the traditional keyboard-based process. It's a genius move on Amazon's part. The device is marketed as a household helper or assistant, there to serve you and answer the mundane questions you might not want to look up. But it's also a well-disguised checkout line that Amazon has managed to place directly on your kitchen counter.
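To make the mechanics concrete, here is a minimal sketch of that reorder flow. Everything in it is hypothetical: the handler name, the profile fields, and the matching logic are illustrative stand-ins, not the real Alexa API. The point is simply that a short voice command resolves against stored account data, so the user never restates quantity, address, or payment.

```python
import re

# Hypothetical stored profile: what a retailer already knows about the account.
PROFILE = {
    "address": "123 Main St",
    "payment": "card-on-file",
    "order_history": {
        "bounty paper towels": {"quantity": 2, "price_usd": 12.99},
    },
}

def handle_reorder(utterance: str, profile: dict) -> dict:
    """Turn 'order more X' into a complete order using the stored profile."""
    match = re.match(r"(?:alexa,\s*)?order more (.+)", utterance.strip(), re.I)
    if not match:
        return {"status": "unrecognized"}
    product = match.group(1).rstrip(".").lower()
    last = profile["order_history"].get(product)
    if last is None:
        # Never ordered before: the assistant would have to ask follow-ups.
        return {"status": "needs_clarification", "product": product}
    # Fill in everything the user did not say out loud.
    return {
        "status": "confirmed",
        "product": product,
        "quantity": last["quantity"],
        "ship_to": profile["address"],
        "payment": profile["payment"],
    }

order = handle_reorder("Alexa, order more Bounty paper towels.", PROFILE)
print(order["status"], order["quantity"])  # prints: confirmed 2
```

The design choice worth noting is that all of the friction (quantity, shipping, payment) is absorbed by data the retailer already holds, which is exactly why the home device is such a valuable checkout line.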

And nothing against this current focus. It’s making lives easier, which is the goal of most technological advances, while allowing companies to profit. It’s a win-win based on the current limitations of the technology. But we also need to ensure we’re not dumbing down the technology in the name of profit.  

There are endless possibilities within voice, but the best ones are those least associated with a revenue stream.