It is becoming clear that the next battle in the tech world will be around voice-driven digital assistants, such as Apple Siri, Amazon Alexa, Microsoft Cortana, Google Assistant/Now and Samsung Bixby/Viv. While the attention has moved from smartphone assistants to home hubs, the real excitement will begin as the underlying AI (artificial intelligence) and machine learning begin to deliver detailed, contextual, and highly personalized responses that will make a consumer’s life easier.
Digital assistants will be at the heart of a user’s daily activities, whether in an increasingly smart home (using home hub devices like the Amazon Echo and Dot, Google Home, Apple HomePod), a connected car, at work or walking down the street. Advancements will increase consumer usage, improving the accuracy of responses and increasing revenue opportunities for businesses. Eventually, digital assistants will process enough insights to transform from being a reactive listening tool to being a ‘digital concierge’, providing proactive recommendations for all parts of a consumer’s life. (See comparison of home hub devices here.)
While today’s digital assistants are far from being able to provide proactive recommendations or integrate all of the elements of a consumer’s day (news, calendar, financial relationships, personal preferences, groceries, ecommerce, etc.), many firms are working to provide this ‘digital concierge’ experience, either as part of a home hub device or as part of individual devices. As a result, brands need to take a proactive approach and create voice-first skills in order to connect with customers at home.
“Following the introduction of standalone voice devices such as the Amazon Echo and the Google Home, what we’re seeing now is a rush by a wide range of device manufacturers to integrate voice into their end devices, from washing machines to smoke alarms to alarm clocks,” Robert Thompson, i.MX ecosystem manager at NXP, told EETimes.
The Emergence of Voice-First Interactions
The growth of voice-capable hardware, from smartphones and tablets to new smart speaker devices for the home and voice-first capabilities in new cars, sets the foundation for voice payments. According to Pew Research, 77% of U.S. consumers now own a smartphone and 51% own a tablet, with 75% of iPhone owners having used Siri and 63% of Android phone owners having used a virtual assistant on their device. Similarly, BI Intelligence projects the number of smart home devices to rise from 39 million this year to 73 million in five years.
There are plenty of reasons consumers are using voice more, with saving time and the simplicity of the process being mentioned most. Not surprisingly, as with most digital technologies, security of data and personal information is a primary concern holding people back from voice-first commerce. In addition, there is a need to get the best hardware and software in the hands of users.
The importance of voice-first interactions can be seen in the competition for voice-first grocery and consumer product ordering. Amazon has a large lead in this category, with voice ordering available on any Amazon Alexa device. Not to be left behind, Google and Walmart have announced a partnership to combine the product selection of the retail giant with the voice and advanced learning skills of Google. Over time, both Amazon and Google/Walmart hope to be able to use AI to learn consumption habits and provide product recommendations.
“One of the primary use cases for voice shopping will be the ability to build a basket of previously purchased everyday essentials,” stated Walmart. “That’s why we decided to deeply integrate our Easy Reorder feature into Google Express. This will enable us to deliver highly personalized shopping recommendations based on customers’ previous purchases, including those made in Walmart’s 4,700 stores and on Walmart.com. To take advantage of this personalization, customers only need to link their Walmart account to Google Express.”
The Exciting Potential of Voice Payments
A research report from BI Intelligence found that 18 million US consumers have made a voice payment, with that figure expected to grow at a 31% compound annual growth rate (CAGR), reaching 31% of US adults by 2022. “This growth is expected since voice interfaces make transactions faster, easier, and possible when users can’t turn to their hands,” stated the research. “Additionally, our data confirms the voice payments revolution has already begun — users are trying the feature, today, in higher-than-expected numbers.”
There will definitely be a first-mover advantage for both voice-initiated payments and voice banking, with many organizations playing catch-up in voice, AI and Internet of Things (IoT) innovations. The BII report discusses, in depth, how Amazon, Google, PayPal, Bank of America, Capital One, Santander and others are well positioned for growth in both voice payments and banking.
Some of the other findings in the report include:
- Voice payments are catching on — 8% of US respondents to a 2017 BI Intelligence survey said they used voice commands to buy something, send money to a friend, or pay a bill.
- Adoption is set to grow — Use of voice to buy something, send money to a friend, or pay a bill will increase to 31% of US adults by 2022. Increased penetration of voice-enabled devices, generational gains in AI, and a strong consumer value proposition for voice payments will fuel this growth.
- Large payment players are committed — Amazon, Apple, Google, and PayPal are at the forefront of making these next-generation payments possible.
- Major banks are betting on AI — Bank of America, Capital One, USAA, and others have introduced conversational interfaces to their customers.
- Voice payments will evolve — As with all new technologies, voice-first digital assistant payments will soon be natural and easy to use.
- First-movers will benefit — With aggressive growth expectations, early providers of voice payments and voice banking experiences will be in a position to win market share in an increasingly digital marketplace.
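As a rough sanity check on the adoption figures above, the short Python sketch below computes the compound annual growth rate implied by moving from 8% to 31% of US adults. It assumes a five-year window (2017 to 2022), and the function names `cagr` and `project` are illustrative only. The projected user count is a back-of-the-envelope extrapolation from the quoted 18 million and 31% figures, not a number taken from the report.

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate implied by growing from start to end."""
    return (end / start) ** (1 / years) - 1


def project(start: float, rate: float, years: int) -> float:
    """Value reached after compounding start at rate for the given years."""
    return start * (1 + rate) ** years


# Figures quoted in the findings above.
share_2017 = 0.08        # share of US respondents who made a voice payment
share_2022 = 0.31        # projected share of US adults
users_2017_millions = 18
quoted_cagr = 0.31

years = 5  # assumption: 2017 through 2022

implied_rate = cagr(share_2017, share_2022, years)
users_2022_millions = project(users_2017_millions, quoted_cagr, years)

print(f"Implied CAGR from 8% to 31% over {years} years: {implied_rate:.1%}")
print(f"18 million users compounded at 31%: {users_2022_millions:.1f} million")
```

Under these assumptions, the implied rate comes out very close to the quoted 31% CAGR, which suggests the two figures in the findings are consistent with each other.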
The Voice Payments Ecosystem
In the battle for the voice banking and voice payments customer, AI companies Nuance and Personetics are the leaders for banks looking to launch voice assistants. Nuance powers voice-first assistants that reside in the mobile banking apps at USAA, Santander Bank, and others, while Personetics powers the virtual assistant, Ally Assist, in Ally’s mobile banking app. Because these digital assistants are contained within organizations’ mobile banking apps (as opposed to being separate apps), they alleviate many of the trust issues with external providers.
Early entrants into Siri P2P payments include PayPal, Venmo, and Square Cash. Some banks are beginning to follow the lead of these firms: UK challenger bank Monzo, German direct bank N26, and the Royal Bank of Canada (RBC) all offer P2P payments with Siri.
Best Practices to Take Voice Payments Mainstream
Right now, the vast majority of voice-driven payments are relatively small-ticket eCommerce transactions, including the use of the Amazon Dash, Amazon Wand, etc. for household consumable items (e.g., laundry detergent, paper towels) or for digital entertainment purchases such as music and movies.
The key to success with voice-enabled payments will be for consumers to feel comfortable making more involved (and expensive) purchases by voice. This will occur as the machine learning components of voice-first technology improve, as visual interfaces such as the Amazon Echo Show are integrated, and as security integration (biometrics) comes to the forefront.
Some of the recommendations to prepare for increased use of voice banking and voice payments include:
- Reinforce security — Deal with the ‘elephant in the room’ by proactively testing and integrating alternative forms of authentication, including biometric options like facial recognition, fingerprint biometrics, voice identifiers, etc.
- Improve recognition of voice commands — There has been significant improvement in recognition of commands since the initial voice banking tests by USAA. The ability to recognize informal as well as more formal commands is improving. The key will be to test various language or request patterns so that alternative channels aren’t needed.
- Multi-channel support — As mentioned often by Brian Roemmele, the best voice systems will augment visual and tactile interfaces, not replace them entirely. Amazon’s newest Echo device, the Echo Show, contains a screen to show users visual content, and potentially to provide added security authentication.
The power of voice-first eCommerce, payments and voice banking should not be underestimated. Evolving quickly, voice payments will move from the infancy of Amazon Dash and simple, low-value transactions to AI-driven recommendation engines that will prompt the consumer for bill payments, purchases and P2P transactions based on a consumer’s lifestyle and previous interactions.
Beyond phone-based voice interactions, or even IoT home hubs, a growing number of appliances and everyday hardware (including cars) will include the option of voice-first interaction. More importantly, these devices and objects will learn from each other on the consumer’s behalf.
Companies that improve these interactions and experiences will benefit – with some having the potential to become the hub of a consumer’s entire lifestyle. Other firms, including many financial services firms, may be relegated to being a supplier to another firm’s relationship with the consumer.