THE VOICE CATCHERS
Published with assistance from the foundation established in memory of James Wesley Cooper of the Class of 1865, Yale College.
Copyright 2021 by Joseph Turow.
All rights reserved.
This book may not be reproduced, in whole or in part, including illustrations, in any form (beyond that copying permitted by Sections 107 and 108 of the U.S. Copyright Law and except by reviewers for the public press), without written permission from the publishers.
Yale University Press books may be purchased in quantity for educational, business, or promotional use. For information, please e-mail (U.K. office).
Set in Meridien and Futura types by IDS Infotech Ltd.
Printed in the United States of America.
Library of Congress Control Number: 2020947840
ISBN 978-0-300-24803-6 (hardcover : alk. paper)
A catalogue record for this book is available from the British Library.
This paper meets the requirements of ANSI/NISO Z39.48-1992 (Permanence of Paper).
10 9 8 7 6 5 4 3 2 1
For Judy
INTRODUCTION
Here Comes the Voice Intelligence Industry
Your voice is unique. No one else has it. And because your voice belongs to no one else, it's extraordinarily valuable, not only to you, but also to a new sector of society that is designed to exploit it: the voice intelligence industry. Built by marketers to collect information from the ways individuals talk and sound, this industry is deploying immense resources and breakthrough technologies based on the idea that voice is biometric: a part of your body that those in the industry believe can be used to identify and evaluate you instantly and permanently. Companies are working to analyze your vocal-cord sounds and speech patterns for information about your emotions, sentiments, and personality characteristics, all so that they can better persuade you, often in real time. Soon they may be able to draw conclusions about your weight, height, age, ethnicity, and more, all characteristics that scientists believe are revealed by your voice. Marketers will be able to score you as more or less valuable, show you different products based on that valuation, give you discounts that are better or worse than the ones they give other people, and treat you better or worse than others when you want help. In other words, marketers are using voice data to model ways to discriminate between you and others in unprecedentedly powerful ways. And all of this is happening without adequate regulations and safeguards to help American consumers understand the risks. The aim of this book is to describe this developing domain, explain how it's already influencing our lives, and show what about it needs to be stopped. Now is the time to promote perspectives and policies to derail the voice-based world of marketing biometrics, while the industry is still being built, and before socially corrosive processes linked to it become too entrenched to change.
The emerging voice intelligence industry involves such tools as smart speakers, car information systems, customer service calls, and connected-home devices like thermostats and alarms. When you talk, their intelligent assistants can draw inferences about you using analytical formulas generated by artificial intelligence. In the United States and European Union, the best-known assistants tasked with performing such activities are Amazon's Alexa, Google Assistant, and Apple's Siri. In China, Baidu is doing it with its DuerOS voice assistant, and Alibaba with Tmall Genie. Each carries out its work through tens of millions of smart speakers (Wi-Fi-linked audio devices), smartphones, and car audio systems. Other companies are creating voice initiatives propelled by artificial intelligence in customer contact centers.
Public attention to the voice industry has centered primarily on smart speakers. Dubbed "voice first" devices by marketers, these are cylinders (or more recently other shapes) that sometimes come with screens. Ask a question or make a request, and the devices can access a huge number of information sources, including app-like add-ons contributed by thousands of companies, nonprofits, and even individuals. Owners most typically use the devices to check the weather, set timers, learn recipes, listen to music, play games, ask for facts, and buy things. In the United States the explosion of smart-speaker sales began around 2014 with the introduction of Amazon's Echo and its assistant, Alexa. The Google Home came out almost exactly two years later, and then smart speakers from other firms came tumbling out. Apple and Samsung used preexisting assistants (Siri for Apple, Bixby for Samsung), and companies like Sonos built speakers that link to Alexa or Google Assistant, or both. Press attention during this period has see-sawed between the latest capabilities built into these devices and the new social dangers they represent. Many stories center on the smart speakers' ability to listen and then answer. The gizmo starts recording whenever it hears the wake word ("Alexa," "Hey Google," "Siri"), and it tracks sound for up to sixty seconds each time. Ask "Alexa, what's the temperature in Chicago?" and a (so far) immutable woman's voice will provide a direct response. Try "Hey Google, how many plays did Shakespeare write?" and a female voice (which in this case you can change to male) will give a concise (and correct) answer (thirty-seven), along with two sentences that elaborate.
The real difficulties with the smart speakers and the voice intelligence industry, however, have yet to emerge. The unwanted incidents will come not from bugs, hacks, or glitches, but from features of technology that work properly. That's because the system is evolving into a blueprint for marketers to use your body's signals for gain. Consider:
Another patent has Alexa listening through a smart speaker for keywords such as "enjoyed" or "love." When it hears a trigger word, it captures adjacent audio that can be analyzed on the device or remotely, to figure out what the person enjoyed or loves; the individual might say "I enjoy traveling to San Francisco" or "I love hip-hop" or "I love Judy." Tracking the keywords would allow Amazon to add information to people's profiles so it can sell them items related to what they like and not what they dislike, and sell advertisers the ability to reach people with messages that reflect those sentiments. Amazon and its advertisers may also avoid making offers to people who say they love or enjoy what the advertisers disapprove of, or who for personal or cultural reasons don't use those specific words to express happiness.
Two Amazon representatives who wanted anonymity told me it is company policy not to comment about patents. Both also said patents take a long time to bear fruit. That should not prevent our discussing them. Amazon, Google, and other voice intelligence firms are in business for the long term, and our society will likely continue to be influenced by their innovations for generations to come. In fact, as if to underscore the utility of those patents, Amazon announced during fall 2020 that its just-released Halo health and wellness band is able to analyze the tone of its owner's voice for "qualities ... like energy and positivity." This analysis, too, is explicitly not for use by third parties. Yet in the face of all the developments you'll see in this book, it is hard not to understand the Halo's professed capability as a proof of concept. The entire voice profiling idea demonstrated here can, as the patents suggest, easily be ported to the marketing realm and beyond.
Moreover, the building blocks for the patent scenarios are already in place. Voice discrimination already goes beyond what Amazon and Google do. The customer phone service (or contact center) business was first out of the gate in profiting from individuals' unique voices. Contact center firms such as Nuance and Verint already evaluate a caller's sounds and linguistic patterns for emotion, sentiment, and personality. Linking those biometrics with the caller's name, the firms regularly tell reps to give discounts to tense-sounding customers who are big spenders in order to mollify them. Contact-center software also routinely shunts customers pegged as talkative to reps with a track record of getting along with such people and of getting them to spend extra money ("upselling" them).