Paying by Ear: Auditory Judgment and Decision Making in Voice Commerce

Kurt Munz


Chair: Vicki Morwitz
Committee: Tom Meyvis, Adam Alter, Yaacov Trope
Voice commerce involves shopping by voice using a computerized assistant. Assistants such as “Alexa” and “Google Assistant” inhabit smart speakers and smartphones and can facilitate purchases. Some analysts predict that voice commerce transactions will grow rapidly to $45 billion by 2022 and point to the large potential influence of recommendations in this channel. Others are skeptical, reporting that only 2% of smart speaker owners have ever shopped by voice, with few repeat voice shoppers. Understanding auditory judgment and decision making in this context is therefore critical for predicting which of these outlooks may prevail.
Essay 1 – Not-so Easy Listening: Roots and Repercussions of Auditory Choice Difficulty
The first essay of my dissertation explores the impact of recommendations made by voice. Voice can make transactions seem more like human interactions, which may lead to a feeling of social pressure to comply with a recommendation. On the other hand, processing choice options by voice may be more difficult than processing them in writing. Like social pressure, difficulty choosing can make accepting a recommendation more likely, but it can also lead consumers to abandon choosing altogether. Auditory choice may be more difficult because comparisons are harder to make. Auditory comparisons require holding information in memory while attempting to process new information, which can increase cognitive load. Relatedly, auditory options must be presented sequentially (versus simultaneously), which can also make comparison more difficult. Voice options are typically presented by speaking all of the information about a single option before moving on to the next. This structure could also interfere with simplifying strategies that would otherwise be used (e.g. a lexicographic strategy).
Consistent with the idea that auditory choices are more difficult, six experiments demonstrate that consumers are not able to evaluate auditory options as effectively as visual options, and thus are more likely to choose items recommended by a digital assistant. However, they are also more likely to abandon choosing altogether and select none of the available options. These outcomes are related specifically to the difficulty of comparing alternatives when they are presented only by voice. Making options easier to compare can cause these outcomes to effectively disappear for voice presentation. Conversely, making options more difficult to compare can create similar effects for visual choices, eliminating differences that naturally exist between visual and auditory presentation. When evaluations of products presented alone are compared with evaluations made in the context of a second option, a “joint versus separate preference reversal” occurs for visual stimuli, but evaluations of auditory stimuli do not vary between these modes, suggesting that auditory consumers can less easily learn from context. Importantly, however, difficulty making comparisons can negatively affect the evaluation of a single item presented by voice when a consumer compares it against a salient reference held in memory, a common scenario in the actual marketplace.
Essay 2 – Oral Aura of Truth: Auditory Fluency of Spoken Words Impacts Truth Judgments
In ongoing research, the second essay of my dissertation explores how the fluency of spoken words impacts truth judgments. Past researchers have observed that certain linguistic techniques such as rhyming can make statements feel more fluent, enhancing their believability. We predict that these techniques will be more effective when the statements are actually spoken, compared to presentation in writing. However, there may be some limitations. Past research has found benefits to presenting information by voice, but only when the speaker accurately conveys normal human paralinguistic cues such as intonation, pausing, or correct placement of accents. The authors interpreted their results in terms of the humanizing nature of voice. Alternatively, paralinguistic cues inconsistent with expectations may make speech less fluent, with negative consequences. Understanding the drivers and detractors of auditory fluency can help inform advertising copywriting strategies (e.g. rhyming on the radio) and other related managerial decisions (e.g. to employ a voice actor versus the voice of a digital assistant).
Essay 3 – Amplified Frames: Spoken Words Lead to Larger Framing Effects Than Text
In ongoing research, the third essay of my dissertation explores how voice presentation impacts framing effects, such as describing beef as 75% lean versus 25% fat. We predict framing effects will be more pronounced when claims are presented by voice compared to text. This may be because consumers engage in shallower processing of claims made by voice, either because they are harder to process (Essay 1) or because they are more fluent (Essay 2). With shallower processing, we expect consumers to be less likely to transform the given information (e.g. to recognize that 75% lean implies 25% fat).

  2018 Center for Global Economy and Business Grant