Designing Voice User Interfaces :: UXmatters

Digital Assistants, Personality, and Avatars

While Pearl’s book describes voice interactions, the real core of the voice interface is conversation. A speech-recognition system is simply a user interface that mediates the conversation. Thus, there has been considerable discussion of how an assistive user interface should work with the user.

When designing a voice user interface, you must predict users’ questions and design the answers to them. This is where the art of crafting an interview, something with which UX professionals are familiar, comes into play. Thinking of spoken words and phrasings as a user interface draws on our skills for developing well-scoped phrases that prompt the desired action from users.

In a voice user interface—or, more accurately, a conversation-based interface—understanding timing, call, and response is even more critical than in a typical visual user interface. The development of a workflow that really demonstrates an understanding of a user’s task is critical because there is often no visual reference.

When an information system presents itself as an assistant or some other entity that actively responds to our queries, it is natural for humans to assign a personality or even a gender to that system. Voice user interfaces encourage this phenomenon.


One thing that impressed me when the Amazon Echo was introduced was the feedback I had seen from people with diminished motor skills. The Echo’s voice user interface—when paired with a smart-home hub—could give a quadriplegic person the ability to control their home environment simply by speaking. Pearl’s book includes accessibility concepts and best practices and provides relevant examples.

Don’t Forget About the GUI

While voice user interfaces often get the most attention from consumers, most if not all of our modern user interfaces are multimodal. Pearl advocates for a holistic approach to designing the experience. It’s important to design and test the graphical user interface (GUI) in concert with the voice user interface. Frequently, the graphical user interface delivers the output for a query. For example, if the user asks Siri for directions to a particular landmark, the directions appear in Apple Maps.

Survey of Voice User Interfaces

Designing Voice User Interfaces provides a great overview of the components you should be aware of when designing voice user interfaces. One key step in learning about a new technology or system is simply learning the vocabulary that describes its key components. The book describes the voice technologies that are currently available, starting with an overview of automatic speech recognition (ASR) systems. But technology is just one part of the VUI puzzle.

It is also essential to understand the conventions for different types of interactions. While modern smartphones introduced swipes, pinches/zooms, and other gestures, concepts such as barge-in, end-of-speech, and no-speech timeouts are at play in voice systems. Knowing what they are and how and when to apply them is critical to delivering successful experiences.
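To make these three conventions concrete, here is a minimal sketch in Python of how a dialog turn might decide what to do next. It is an illustration only, not code from Pearl’s book or from any speech platform: the names `TurnTimingConfig` and `next_action`, the action strings, and the timeout values are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TurnTimingConfig:
    """Hypothetical timing settings for one dialog turn (values are illustrative)."""
    barge_in_enabled: bool = True          # may the user interrupt a playing prompt?
    no_speech_timeout_s: float = 5.0       # how long to wait for the user to start talking
    end_of_speech_silence_s: float = 1.5   # silence that marks the end of an utterance


def next_action(cfg: TurnTimingConfig, *, prompt_playing: bool,
                user_speaking: bool, user_has_spoken: bool,
                silence_s: float) -> str:
    """Return the next step for the dialog, given the current audio state."""
    if prompt_playing:
        # Barge-in: the user talks over the prompt, so stop talking and listen.
        if user_speaking and cfg.barge_in_enabled:
            return "stop_prompt_and_listen"
        return "keep_playing_prompt"
    if user_speaking:
        return "keep_listening"
    if user_has_spoken:
        # End-of-speech timeout: enough silence after speech means the turn is over.
        if silence_s >= cfg.end_of_speech_silence_s:
            return "process_utterance"
        return "keep_listening"
    # No-speech timeout: the user never started talking, so prompt again.
    if silence_s >= cfg.no_speech_timeout_s:
        return "reprompt"
    return "keep_listening"
```

In this sketch, a short pause after the user has spoken ends the turn, while the same pause before any speech does not trigger a re-prompt. Real systems tune these values per prompt, since a question that needs thought deserves a longer wait than a yes-or-no confirmation.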

Precise understanding of users is always vital, but perhaps even more so for voice interfaces. Pearl provides an example: When designing a voice interface that asks the user to provide an account number, the designer must have the insight and empathy to realize the user might not have that information readily available. Therefore, you must consider timeouts and provide alternative paths to identify the user.
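The account-number scenario can be sketched as a small dialog flow in Python. This is a hedged illustration rather than Pearl’s design: `identify_user`, the `listen` callback, the prompt wording, the phone-number fallback, and the handoff to an agent are all hypothetical. The sketch assumes `listen(prompt)` plays a prompt and returns the recognized reply, or `None` when the no-speech timeout expires.

```python
from typing import Callable, Optional, Tuple

# Hypothetical phrases a user might say when the number is not at hand.
NO_NUMBER_PHRASES = ("i don't have it", "i don't know", "no")


def digits_in(reply: str) -> str:
    """Keep only the digits from a recognized reply."""
    return "".join(ch for ch in reply if ch.isdigit())


def identify_user(listen: Callable[[str], Optional[str]],
                  max_attempts: int = 2) -> Tuple[str, str]:
    """Ask for an account number, then fall back to other ways to identify the user."""
    for attempt in range(max_attempts):
        if attempt == 0:
            prompt = "What's your account number?"
        else:
            # After a timeout or an unusable reply, give time and offer a way out.
            prompt = "Take your time. Say your account number, or say 'I don't have it'."
        reply = listen(prompt)
        if reply is None:
            continue  # no-speech timeout: try again with the gentler prompt
        if reply.strip().lower() in NO_NUMBER_PHRASES:
            break  # the user cannot provide the number: switch paths
        number = digits_in(reply)
        if number:
            return ("account_number", number)
    # Alternative path: identify the user by something they are likely to know.
    reply = listen("No problem. What's the phone number on the account?")
    phone = digits_in(reply) if reply is not None else ""
    if phone:
        return ("phone_number", phone)
    return ("agent", "")  # last resort: hand off to a person
```

The point of the sketch is that a timeout is treated as a normal event, not an error: the user who has gone to look for a statement gets a second chance, and the user who cannot find one still has a path forward.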

Cognitive Load

It is vital to realize that voice user interfaces, whether in the form of interactive voice response (IVR) systems or modern assistants, can overwhelm people’s working memory. How often have you stopped paying attention to an IVR when it listed more than three options? Pearl provides best practices and recommendations on how to design navigation systems so users do not become frustrated by the interface.
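One common way to keep a spoken menu within working memory is to offer only a few options at a time. The following Python sketch shows the idea; it is an assumption-laden example, not a recommendation taken from Pearl’s book. The function names, the prompt wording, and the default limit of three options per prompt are all hypothetical.

```python
from typing import List


def join_spoken(items: List[str]) -> str:
    """Join options the way a person would say them aloud."""
    if len(items) == 1:
        return items[0]
    if len(items) == 2:
        return f"{items[0]} or {items[1]}"
    return ", ".join(items[:-1]) + f", or {items[-1]}"


def build_menu_prompts(options: List[str], max_per_prompt: int = 3) -> List[str]:
    """Split a long menu into short prompts of at most max_per_prompt options."""
    if max_per_prompt < 1:
        raise ValueError("max_per_prompt must be at least 1")
    prompts = []
    for start in range(0, len(options), max_per_prompt):
        chunk = options[start:start + max_per_prompt]
        prompt = f"You can say {join_spoken(chunk)}."
        # Offer a way to hear the rest only when more options remain.
        if start + max_per_prompt < len(options):
            prompt += " Or say more options."
        prompts.append(prompt)
    return prompts


if __name__ == "__main__":
    for line in build_menu_prompts(
            ["billing", "orders", "returns", "store hours", "an agent"]):
        print(line)
```

Ordering matters as much as chunk size here: putting the most frequently requested options in the first prompt means most users never need to hear the second one.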


I’ve noticed people’s tendency to assume that a new technology means well-established principles no longer apply. I saw this when the iPhone was introduced and again when the iPad launched. One of the things I appreciate most about Pearl’s book is that it clearly demonstrates that, while the underlying technologies for our tools may vary, the actual process for delivering good user experiences does not. The skills and processes that we use as UX professionals are independent of technology or fads.

One fine example of Pearl’s advice is to ensure that you test voice user interfaces with users. As I read her description of how to do this, I was reminded that paper prototyping and think-aloud protocols are still applicable—even in our newest technology products.

Designing Voice User Interfaces provides a good introduction to designing voice interactions. Plus, I definitely learned a lot about the history of VUIs, the technologies that support them, and the design tenets for these interfaces. If you are already familiar with UX design and research skills and competencies, this book should provide some easily transferable knowledge. 
