Smart assistants like Siri were once the butt of jokes for mishearing their activation hotwords. Users eventually realized, however, how much those mistakes could cost them, especially when it came to potential privacy problems and embarrassing situations. The companies behind these smart assistants are always trying to improve hotword detection, and Google is now trying to practically crowdsource the data Google Assistant needs to learn from, without going through the potentially messy and controversial route of the cloud.
Google Assistant, like many of Google’s services, improves its algorithms through data it harvests from users. With Assistant, however, Google has taken great care not to send unnecessary voice data to its cloud, especially after the big scandals that rocked the AI assistant market. Technically, Google says it starts recording voice snippets only after it has been triggered by the “Hey, Google” hotword.
Unfortunately, it’s exactly that hotword that is tripping up Google Assistant, and Google needs more data on those mistaken triggers in order to improve its recognition capabilities. For that purpose, Google is applying a federated learning strategy that uses the phrases uttered by Google Assistant users in a more privacy-respecting way.
In a nutshell, raw voice data isn’t sent to Google’s servers and is instead processed locally. Only the resulting anonymized model data is sent to Google, where it is combined with data from other users to improve the AI’s algorithms and processes. Voice recordings are encrypted, stored on the device, and deleted after about two months, when they are no longer needed.
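To make the idea concrete, here is a minimal sketch of one common way federated learning is described, usually called federated averaging. It is not Google's implementation: the function names (`local_update`, `federated_average`, `run_round`) are hypothetical, and a toy linear model stands in for a real hotword detector. What it shows is the split described above, where each device trains on its own private samples and only the resulting model weights are combined on the server.

```python
def local_update(weights, samples, lr=0.1):
    """Train on one device's private samples; the samples never leave this function.

    Toy model (an assumption for illustration): y = w * x + b.
    """
    w, b = weights
    for x, y in samples:
        err = (w * x + b) - y   # prediction error on one private sample
        w -= lr * err * x       # gradient step for the slope
        b -= lr * err           # gradient step for the intercept
    return (w, b)


def federated_average(updates, counts):
    """Server side: combine device weights, weighted by each device's sample count."""
    total = sum(counts)
    w = sum(u[0] * n for u, n in zip(updates, counts)) / total
    b = sum(u[1] * n for u, n in zip(updates, counts)) / total
    return (w, b)


def run_round(global_weights, devices):
    """One round: every device trains locally, then the server averages the results."""
    updates = [local_update(global_weights, samples) for samples in devices]
    counts = [len(samples) for samples in devices]
    return federated_average(updates, counts)


if __name__ == "__main__":
    # Two hypothetical devices, each holding private samples of y = 2x + 1.
    devices = [
        [(0.0, 1.0), (1.0, 3.0)],
        [(0.5, 2.0), (1.0, 3.0), (0.25, 1.5)],
    ]
    weights = (0.0, 0.0)
    for _ in range(500):
        weights = run_round(weights, devices)
    print(weights)
```

In this sketch the server only ever sees the `(w, b)` pairs returned by each device, never the samples themselves. Real deployments typically layer further protections on top, which this toy example does not attempt to model.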
To be clear, the feature is disabled by default, and you’ll have to opt in to the program if you want to help improve Google Assistant’s performance. Google is applying similar federated learning strategies across its other apps and services. It has also started crowdsourcing certain data from Android apps in order to improve the Google Play Store’s download and installation speeds.