In the next three years, the voice assistant market in the US is projected to grow from $2.8 billion to $11.2 billion, with sectors like ecommerce, healthcare, manufacturing, and connected home leading market growth. And voice experiences are about so much more than assistants alone: after all, 90% of human communication is nonverbal. Thus, the next generation of intelligent voice assistants like Alexa or Siri will require better listening and paralanguage skills, as well as better speaking skills.
More delightful experiences where voice can positively impact speed, complexity, safety, and accessibility — what we call VoiceCases — will minimize stress and lead to higher customer satisfaction and brand loyalty. Through our multimodal approach, WillowTree empowers users to interact with technology in mixed voice-and-visual ways that offer improved personalization, efficiency, and real-time error correction.
The message is loud and clear: voice is the future of human-centered digital experiences.
Our in-house Voice Innovation Team combines talent and deep expertise from our award-winning design, strategy, research, and engineering teams.
Together, we’re reimagining mission-critical voice experiences like these:
And there are even more ways we’re giving voice to our human-centric values. When a WillowTree teammate’s loved one was temporarily paralyzed and unable to speak, we built Vocable — a free, open-source augmented and alternative communication (AAC) mobile platform that allows users to communicate via eye and/or head movement.
Want to hear more about voice?
Here's the critical design flaw: attempting to create voice-only conversations divorced from screens and other digital experiences. For many brands, voice is still being treated as a standalone, self-contained system.
We call these mixed voice-and-visual interactions multimodal experiences. Multimodal design will soon transform how we interact with technology, ushering in a new era.
We speak three times as fast as we type…
…but we read twice as fast as we listen.
The core idea is simple: humans SPEAK 3x faster than we TYPE. But we READ 2x faster than we LISTEN. The future of digital is the voice command married with screen response.
The question is, how do we get there?
Don't take our word for it. Check out this side-by-side comparison for a fictional pizza ordering app (turn on your sound)…
Expert Tip – Surveying the voice landscape and building a solution takes time, effort, and expertise. Want help from our Voice Innovation Team? Let’s talk.
Voice capabilities will soon become integrated into every facet of human-facing technology. Does that mean voice will replace screens? Certainly not.
What we’re seeing instead is the emergence of VoiceCases — moments when the easiest way of getting something done is via voice. VoiceCases are present in our daily lives: hands-free driving, connected home security, telehealth, and online banking, to name a few.
We know what the future looks like. What are the business implications of voice today?
Voice User Interfaces (VUIs) promise easier, more efficient experiences. However, we are often met with affordance challenges that result in frustration and unmet expectations.
We’re invested in this problem space at WillowTree. We believe these challenges can be mitigated through thoughtful user experience design, disciplined engineering, and an embrace of a multimodal approach. To set a Voice User Interface up for success, we’ve outlined a series of steps in our free, open-source Figma file.
Expert Tip – Consider how multimodal voice applies to your business. (Do you employ lots of field service technicians with aging hardware? Are you a B2C retailer looking to ease customer pain points? Need help answering these questions? Get in touch.)
As with any design or technology initiative, you need to start with the user and the current state of their experience. What technology is available to create voice experiences? What are your customers’ expectations of those experiences, and how can you meet or exceed those expectations with a digital solution?
Voice technology has been evolving rapidly, with new platforms, new devices, new industries, and new integrations cropping up regularly.
WillowTree conducted a nationwide survey of 824 people to get a pulse on which voice use cases were resonating with users. We asked people to rate 15 VoiceCases on two criteria: usefulness and efficiency. Here’s what we found.
Based on this research and on what we’re seeing in the market, voice is poised to become the preferred interaction model for at least three unignorable use cases that currently happen primarily via screens: Specific Search, Composition & Logging, and Coaching & Instruction.
Regardless of the VoiceCases you’re focusing on, or even the platforms you intend to start with, position your company or product to move quickly as the voice space continues to develop.
The form factor, capabilities, and market share of voice-enabled devices will continue to shift rapidly over the coming years. Design a backend for your voice ecosystem that can support voice applications wherever your users are: on Google Home, in a car, or on a device yet to be invented. You should also expose APIs that make your VoiceCases easy to access from any of those surfaces.
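One common way to keep a voice backend device-agnostic is to translate each platform's native payload into a shared intent format, so core business logic never depends on any one surface. The sketch below illustrates that pattern; the `VoiceIntent` type, the `order_pizza` intent, and the adapter payload shape are all hypothetical, not taken from any specific platform SDK.

```python
from dataclasses import dataclass

@dataclass
class VoiceIntent:
    """Platform-neutral representation of a user request."""
    name: str     # e.g. "order_pizza"
    slots: dict   # parsed parameters, e.g. {"size": "large"}
    surface: str  # originating device: "google_home", "car", ...

def handle_intent(intent: VoiceIntent) -> dict:
    """Single backend entry point shared by every voice surface."""
    if intent.name == "order_pizza":
        size = intent.slots.get("size", "medium")
        return {
            "speech": f"Okay, one {size} pizza on the way.",
            "display": {"title": "Order confirmed", "detail": f"{size} pizza"},
        }
    return {"speech": "Sorry, I can't help with that yet.", "display": None}

# Thin per-device adapters normalize native payloads into VoiceIntent,
# so adding a new surface never touches the core business logic.
def from_google_home(payload: dict) -> VoiceIntent:
    return VoiceIntent(
        name=payload["intent"]["name"],
        slots=payload["intent"].get("params", {}),
        surface="google_home",
    )
```

A multimodal response pairs a spoken reply (`speech`) with optional screen content (`display`), so the same backend can serve a smart speaker and a phone screen alike.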
Voice engineering activities are a little more advanced than those required for conventional software applications. In addition to standard UI and back-end technologies, voice requires two additional layers: Automated Speech Recognition (ASR) and Machine Learning (ML).
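Those two extra layers can be pictured as a simple pipeline: audio goes through ASR to produce a transcript, an ML model maps the transcript to an intent, and conventional app logic takes over from there. The sketch below uses canned stand-ins for the ASR service and the intent classifier (in practice these would be real speech-to-text and NLU models); all function names are illustrative.

```python
def asr_transcribe(audio_bytes: bytes) -> str:
    """Stand-in for a real ASR service (e.g. a cloud speech-to-text API)."""
    return "order a large pepperoni pizza"  # canned transcript for illustration

def classify_intent(transcript: str) -> tuple:
    """Stand-in for an ML/NLU model mapping a transcript to an intent + slots."""
    words = transcript.lower().split()
    if "pizza" in words:
        size = next((w for w in words if w in {"small", "medium", "large"}), "medium")
        return "order_pizza", {"size": size}
    return "fallback", {}

def handle_utterance(audio_bytes: bytes) -> str:
    transcript = asr_transcribe(audio_bytes)     # ASR layer: speech -> text
    intent, slots = classify_intent(transcript)  # ML layer: text -> intent
    if intent == "order_pizza":                  # conventional app logic
        return f"Okay, one {slots['size']} pizza coming up."
    return "Sorry, I didn't catch that."
```

The point of the separation is that each layer can be swapped independently: a better ASR engine or a retrained intent model slots in without touching the application code beneath it.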
The content required to “train” your AI can come from many sources; some of the most common are in-person user interviews, customer support transcripts, customer-facing knowledge bases, and blog content.
By the same token, you should audit all customer-facing content for useful solutions to user problems. If you’ve surfaced VoiceCases in your discovery process that you don’t yet have a solution for, you’ll need to develop new content to address them.
So you’ve got some VoiceCases in mind and you’ve got a backend that can support the conversations you need to have with your users. It’s time to launch something into the world!
Here are a few thought starters:
You need to rethink all of your customers’ interactions with your products and services now that there is an additional tool — voice — at your disposal.
When the steam engine was invented, the winners rethought their entire approach to transportation rather than simply hooking a steam engine to a carriage. Similarly, you can’t just create apps for your company on Google Home and Alexa and think you’ve checked the “voice” box. You have to think about how and where voice will be the most efficient, most delightful way for your users to complete a task, even if what they’re doing via voice is just one part of a flow that happens across multiple devices or channels.
Expert Tip – The voice technology landscape changes daily. Subscribe to our newsletter to stay on top of new developments in voice, mobile, and emerging technology.
While voice may seem like a “UX enhancement” in today’s conventional applications, it will be a mandatory input method in the next generation of software.
In the next hardware generation (let’s call it the next 2-10 years), we will see the emergence of virtual displays that displace physical monitors and touchscreens. Without a physical keyboard or touchscreen, additional input systems, including voice, will be required for user/software interaction.
Voice is the future of UX, and multimodal applications can unleash the power of voice today. WillowTree is proud to be an industry leader in the research and development of voice-based technology.