To Commercialize, Voice Tech Must First Solve Its ‘Cocktail Party Problem’

Men and women say around 15,000 words a day on average. We phone our friends and family, log into Zoom for meetings with our coworkers, talk about our days with our loved ones, or, if you are like me, debate with the official over a terrible playoff call.

The hospitality, travel, IoT, and auto industries are all on the verge of bringing voice assistant adoption and monetization into the mainstream. According to Meticulous Research, the worldwide voice and speech recognition market will grow at a CAGR of 17.2 percent from 2019 to 2025, reaching $26.8 billion. Companies such as Amazon and Apple will accelerate this expansion by leveraging ambient computing capabilities, which will continue to push voice forward as a key interface.

As speech technologies grow more common, businesses are focusing on the value of the data stored in these new channels. Microsoft’s recent acquisition of Nuance is not only about improving NLP or voice assistant technology; it is also about the wealth of healthcare data the conversational AI has amassed.

Our speech technologies are not designed to deal with the chaos of real life or the clamor of our daily lives. Google has monetized every click of your mouse, and the same is now happening with speech. Advertisers have discovered that conversion rates for speak-through conversations are higher than those for click-throughs. To reach customers, brands must start building voice strategies or risk being left behind.

Voice technology use was already on the rise, but now that the globe is in lockdown due to the COVID-19 pandemic, adoption is expected to surge. According to Insider Intelligence, about 40 percent of internet users in the United States will use smart speakers at least once a month by 2020. However, a number of basic technological limitations are preventing us from realizing the technology’s full potential.

By the end of 2020, global shipments of wearable devices had risen 27.2 percent from a year earlier, to 153.5 million. Yet despite all of the advances achieved in speech technologies and their integration in a myriad of end-user devices, they are still primarily restricted to simple activities. This is starting to shift as customers want more from these encounters and speech becomes a more important interface.

In 2018, in-car shoppers spent $230 billion on food, coffee, groceries, and other products that they might pick up at a store. The automotive sector was one of the first to adopt speech AI, but to realize its full potential, voice technology must become a seamless, fully hands-free experience. The signal is still muddled enough by ambient vehicle noise to keep consumers glued to their phones.