Amazon is investigating how an Echo recorded a conversation and sent it to another person, after a Portland couple claimed the smart speaker eavesdropped on their discussion. The incident, which Amazon has described as “an extremely rare occurrence,” is likely to reignite concerns around privacy, the Internet of Things, and increasingly connected devices.
The smart speaker owner says that she received a call from one of her husband’s employees in Seattle, who told them that he’d received audio files of recordings from inside the house. “We unplugged all of them and he proceeded to tell us that he had received audio files of recordings from inside our house,” she told KIRO 7. “At first, my husband was, like, ‘no you didn’t!’ And the (recipient of the message) said ‘You sat there talking about hardwood floors.’ And we said, ‘oh gosh, you really did hear us.’”
The person who received the files was one of the existing contacts Alexa can use to place voice calls via an Echo speaker. Those calls are optional, and the communications feature must first be enabled through the Alexa app. Amazon also offers a Drop-In feature, which optionally allows certain approved contacts to make an automatic connection – either solely voice or, if to an Echo with a screen and camera, video too – and begin chatting immediately.
When the Echo owner contacted Amazon, she says, an engineer apparently confirmed the issue from their account logs. However, he did not explain specifically how the situation had come about. “He told us that the device just guessed what we were saying,” the owner claims, presumably referring to Alexa’s ability to send a voice recording message to another user.
The company, meanwhile, confirmed the incident, though it has yet to give any further details. “Amazon takes privacy very seriously,” a spokesperson said in a statement. “We investigated what happened and determined this was an extremely rare occurrence. We are taking steps to avoid this from happening in the future.”
Echo Voice Messages were added to the speaker a little over a year ago, as a subset of the calling feature. The system is meant to be triggered by saying “Alexa, message Dad,” or whichever contact name is wanted, after which point the speaker prompts the user to first confirm the contact and then say their message. Voice messages can be sent to other Echo speakers, or to the Alexa smartphone app.
The dangers of bringing an always-on microphone into the home have been a common concern since smart speakers like Amazon Echo first debuted, and something Amazon and others have pushed back against repeatedly. Although the microphones might constantly be listening out for the wake word – by default “Alexa,” though that is customizable – the retailer insists that the devices aren’t constantly streaming to the cloud.
“Amazon Echo, Echo Plus, and Echo Dot use on-device keyword spotting to detect the wake word,” Amazon explains. “When these devices detect the wake word, they stream audio to the Cloud, including a fraction of a second of audio before the wake word.” That extra snippet of audio is used to further understand the context of the command being issued.
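To make the idea concrete, here is a minimal Python sketch of how such a gate could work in principle: audio stays in a short rolling buffer on the device, and nothing is uploaded until a local detector fires. This is not Amazon's implementation; the `WakeWordGate` class, the `detector` callback, the `preroll_frames` size, and the `uploaded` list standing in for the cloud connection are all hypothetical names chosen for illustration.

```python
from collections import deque


class WakeWordGate:
    """Hypothetical sketch: buffer audio locally, upload only after the wake word."""

    def __init__(self, detector, preroll_frames=3):
        self.detector = detector  # assumed on-device callable: frame -> bool
        self.preroll = deque(maxlen=preroll_frames)  # short rolling buffer
        self.streaming = False
        self.uploaded = []  # stands in for the connection to the cloud

    def feed(self, frame):
        """Process one audio frame from the microphone."""
        if self.streaming:
            # A request is in progress, so every frame is uploaded.
            self.uploaded.append(frame)
            return
        # Not streaming: keep only the most recent frames, on the device.
        self.preroll.append(frame)
        if self.detector(frame):
            # Wake word heard: send the buffered pre-roll, then keep streaming.
            self.uploaded.extend(self.preroll)
            self.preroll.clear()
            self.streaming = True

    def end_of_request(self):
        """Stop uploading once the request has been handled."""
        self.streaming = False


# Frames are plain strings here purely for readability.
gate = WakeWordGate(detector=lambda frame: frame == "alexa", preroll_frames=3)
for frame in ["a", "b", "c", "d", "alexa", "message", "dad"]:
    gate.feed(frame)
```

In this sketch, the frames older than the rolling buffer ("a" and "b") never leave the device, while the two frames just before the wake word are uploaded along with it. A false positive from the detector would open the gate in exactly the same way as a genuine wake word.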
Those commands, and the audio related to them, are logged in the Alexa app. Indeed, Amazon allows users to submit feedback on the accuracy of the assistant’s understanding: if the AI doesn’t recognize what you actually said, you can submit the recording as part of a report, which Amazon says will help better tune the system over time. It’s also possible to delete those recording logs.
Update: In a statement, Amazon has blamed an unusual confluence of errors – including Alexa mistakenly hearing her wake word – for the issue:
“Echo woke up due to a word in background conversation sounding like ‘Alexa.’ Then, the subsequent conversation was heard as a ‘send message’ request. At which point, Alexa said out loud ‘To whom?’ At which point, the background conversation was interpreted as a name in the customer’s contact list. Alexa then asked out loud, ‘[contact name], right?’ Alexa then interpreted background conversation as ‘right.’ As unlikely as this string of events is, we are evaluating options to make this case even less likely.”
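The sequence Amazon describes can be read as a small dialog state machine in which every step must be satisfied before a message is sent. The Python sketch below models that reading only. It is an illustration under stated assumptions, not Alexa's actual logic: the `State` names, the `step` and `run` functions, and the exact-match rules on already-transcribed text are all hypothetical.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    AWAIT_COMMAND = auto()
    AWAIT_RECIPIENT = auto()
    AWAIT_CONFIRM = auto()
    SENDING = auto()


def step(state, heard, contacts):
    """Advance the dialog by one recognized utterance; return (new_state, prompt)."""
    text = heard.strip().lower()
    if state is State.IDLE:
        # Only the wake word moves the device out of the idle state.
        return (State.AWAIT_COMMAND, None) if text == "alexa" else (State.IDLE, None)
    if state is State.AWAIT_COMMAND:
        if text == "send message":
            return State.AWAIT_RECIPIENT, "To whom?"
        return State.IDLE, None
    if state is State.AWAIT_RECIPIENT:
        if text in contacts:
            return State.AWAIT_CONFIRM, f"{text}, right?"
        return State.IDLE, None
    if state is State.AWAIT_CONFIRM:
        if text == "right":
            return State.SENDING, None
        return State.IDLE, None
    return state, None  # SENDING: nothing further to decide in this sketch


def run(utterances, contacts):
    """Feed a sequence of recognized utterances through the dialog."""
    state, prompts = State.IDLE, []
    for heard in utterances:
        state, prompt = step(state, heard, contacts)
        if prompt:
            prompts.append(prompt)
    return state, prompts


# What the recognizer would have to (mis)hear, in order, for a message to go out.
sent_state, sent_prompts = run(["alexa", "send message", "sam", "right"], {"sam"})
# A single miss at any step drops the dialog back to idle.
idle_state, idle_prompts = run(["alexa", "send message", "hardwood floors"], {"sam"})
```

The second call shows why the chain is unlikely: one utterance that fails to match resets the dialog. It also shows why it is not impossible, since the state machine cannot tell background conversation from a deliberate reply once the recognizer has produced a matching phrase.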