HTC’s announcement of its U Ultra and U Play phones, and with it the reveal of its Sense Companion, may have you cringing at yet another AI-powered, disembodied virtual assistant. But HTC’s spiel has a nugget of sense. AI agents today are so bright they can tell jokes or even hold a conversation, whether with humans or with each other. But when it comes to actually helping us humans get things done, they’re not that smart yet and need more than a little nudging to work. If these assistants are to be the way we interact with our devices, our homes, and our digital selves, they need to be able to do a bit more, and soon. Here are some things these virtual secretaries and butlers need to be able to do in order to really make a significant impact on our modern lives. Not all of them even require substantial advancements in artificial intelligence.
The current roster of virtual assistants, different as they may be, all have one thing in common. They all need to be woken up first. That usually means pressing a button or saying a magical phrase. Of course, there are times when you’ll want to explicitly ask something. But a secretary who only did what they were told, only when they were told, and nothing else would probably be out of a job really quickly.
Virtual assistants, likewise, need to be a bit more proactive. Sure, on a technical level, these agents will still be triggered by external factors, like a clock, weather update, event reminder, etc. The point is that you shouldn’t have to ask them, and only when you happen to remember that you might have an appointment in less than 30 minutes. They should be smart enough to volunteer relevant pieces of information before they’re due. GTD author David Allen says our brains have a habit of reminding us of things when we don’t need to be reminded and not reminding us when we do need them. AI assistants should be better than our brains.
Put 2 and 2 together
A corollary to this is that AI-powered assistants should also be smart enough to stitch pieces of information together. It’s not enough for them to know that you have a 9 AM meeting or what the weather will be like today. They should be able to advise you, hours in advance, to leave a lot earlier because traffic might be hell from the upcoming downpour. Oh, and bring an umbrella, of course.
This would require AI agents to not only have access to those pieces of information, which they already have anyway, but also to see them not as disparate pieces but as parts of a whole. Admittedly, this does require a bit more intelligence on their part, but probably not much. If the forecast is heavy rain at 8 AM and the calendar has something scheduled at 9 AM, it won’t take much calculation to see how those two can be related.
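To show how little calculation that really is, here is a minimal sketch in Python. Everything in it is made up for illustration: the `Event` and `Forecast` types, the `advise` function, and the two-hour window are assumptions, not how any real assistant does it.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Event:
    title: str
    start: datetime


@dataclass
class Forecast:
    time: datetime
    condition: str  # hypothetical labels such as "heavy rain" or "clear"


def advise(events, forecasts, window=timedelta(hours=2)):
    """Pair each event with heavy rain forecast up to `window` before it starts."""
    tips = []
    for event in events:
        for forecast in forecasts:
            lead = event.start - forecast.time
            if forecast.condition == "heavy rain" and timedelta(0) <= lead <= window:
                tips.append(
                    f"Heavy rain expected at {forecast.time:%H:%M}, "
                    f"before '{event.title}' at {event.start:%H:%M}: "
                    "leave early and bring an umbrella."
                )
    return tips
```

Feed it a 9 AM meeting and a heavy rain forecast for 8 AM on the same day and it returns one tip. Swap the forecast for a clear morning, or move the rain to after the meeting, and it stays quiet.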
Not be obnoxious
While AIs should be able to take initiative and volunteer information when or, more importantly, before you need it, they also should know when not to volunteer information and stay silent. Meetings and bedtimes are no-brainers, but some people also have set periods when they do not want to be disturbed, either because they’re busy or because they’re not in the mood. The smart assistant should be able to discern that and hold non-critical information for later. Of course, it should also be able to tell when something is urgent and break the rules if needed.
Part of this would require access to calendars and schedules, but part of it might also need some biometric data. Heart rate, stress levels, and others could give an indication that the user is just a wee bit stressed out and would probably appreciate relaxing music more than nagging.
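The rules described here are simple enough to sketch in a few lines of Python. This is only an illustration under stated assumptions: the `Notice` type, the `should_deliver` function, and especially the 100 bpm heart rate cutoff are invented for the example and are not a real measure of stress.

```python
from dataclasses import dataclass
from datetime import time


@dataclass
class Notice:
    text: str
    urgent: bool = False


def in_quiet_period(now, quiet_periods):
    """True if `now` falls in any (start, end) period, including ones that cross midnight."""
    for start, end in quiet_periods:
        if start <= end:
            if start <= now < end:
                return True
        elif now >= start or now < end:  # period wraps past midnight
            return True
    return False


def should_deliver(notice, now, quiet_periods, heart_rate=None, stress_bpm=100):
    """Decide whether to speak up now or hold the notice for later."""
    if notice.urgent:
        return True  # urgent items break the rules
    if in_quiet_period(now, quiet_periods):
        return False  # meetings, bedtime, do-not-disturb blocks
    if heart_rate is not None and heart_rate >= stress_bpm:
        return False  # assumed stress signal: skip the nagging for now
    return True
```

With a quiet period from 10 PM to 7 AM, a routine reminder at 11:30 PM is held back while an urgent one still goes through. At noon the same routine reminder is delivered, unless the heart rate reading is above the assumed cutoff.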
Learn (explicitly) from you
Much of the above relies on the AI assistant knowing the user and her habits, likes, and idiosyncrasies. Indeed, much of the AI functionality these days promises to learn more about you over time. But sometimes, it takes too long. And sometimes, you need to actually step in.
AI might be smart and, in some areas, smarter than us, but we know ourselves better than any AI. Most, though not all, such assistants start with a completely blank slate and then rely on constant usage to build up their image of the user. It would be better, however, if they asked you, maybe at the beginning or at sporadic intervals, to make some preferences more explicit. It doesn’t have to be a first-use wizard or questionnaire. Just like a real friend wouldn’t dump all the questions in one sitting (unless you were playing a game), the assistant could ask, once in a while, about the human reasoning behind your decision not to choose that recommended restaurant.
Know your voice
Part of knowing you, the user, is knowing what you sound like. Some assistants already use this kind of “biometric voice” identification, but only for securely identifying you before responding. Some, sadly, don’t even have that security. In addition to being able to identify you, however, they should also be able to gauge your current state from your voice.
Admittedly, voice recognition, let alone speech recognition, is harder to get right but the idea isn’t new. Mark Zuckerberg is also playing with the idea that his Jarvis AI butler should be able to respond differently to his wife than it does to him. After all, even if married, they don’t exactly share everything in common.
Truth be told, these AI assistants are as flat as their voices. Sure, they can tell jokes and respond with sarcasm, but they deliver those lines exactly in the same way to everyone at all times. While they may be entertaining at times and utilitarian at best, people don’t usually form attachments to them because they’re not exactly relatable. They could do with a bit more personality.
Developers can, perhaps, take some lessons from game designers who have long been injecting some semblance of humanity into their NPCs and player companions. Coincidentally, Cortana is actually a fictional AI character from the hit game franchise Halo. It needn’t be a hugely complex thing where the assistant becomes obstinate and uncooperative. At the very least, let users choose the voice and gender, or whether they’d prefer to keep their assistants as robotic and artificial as they truly are.
Almost without fail, AI assistants require an Internet connection to use, whether that’s because they’re actually fetching data from an online source or because they’re tapping into some natural language processing system that’s only available online, for one reason or another. But while we’re living in an increasingly connected world, we aren’t connected all the time.
Not every piece of information we need has to be fetched from the Internet. Our calendars, contacts, music, files, and many more are comfortably sitting on our local storage. But sometimes they can’t be reached by our assistant because the assistant itself can’t connect to the Internet.
A side effect of keeping Internet interactions to a minimum is minimizing the opportunities for unauthorized people to snoop in on the data transfer between the assistant and its mothership. Not to mention it minimizes the data that you yourself send to companies.
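A local-first approach could look something like the Python sketch below. The `answer` function, its `local_data` dictionary, and the `fetch_online` callback are all hypothetical stand-ins for illustration, not any real assistant’s API.

```python
def answer(query, local_data, fetch_online=None):
    """Answer from on-device data first and go online only when there is no local answer."""
    if query in local_data:
        return local_data[query], "local"
    if fetch_online is not None:
        try:
            return fetch_online(query), "online"
        except ConnectionError:
            pass  # offline: fall through instead of failing outright
    return None, "unavailable"
```

The second value in the returned pair says where the answer came from. A query that the device can answer on its own never leaves the device, which is also where the privacy benefit comes in.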
Hold silent (typed) conversations
Almost all these AI assistants are voice-activated. Some, however, can only be talked to by voice. This is great for times when your hands are otherwise too busy or for devices that don’t even have any input method other than voice. Speaking commands out loud, however, isn’t always the best way to communicate with them. Whether it’s because you’re out and about, because your partner is fast asleep, or because you’re planning a surprise gift, AI assistants should be able to handle anything you throw at them through any medium.
Granted, software engineers and computer scientists are probably more excited about voice technology than typed text, but, for a final consumer product, the written, or typed, message is just as important as a spoken one, and actually less ambiguous.
Smart personal assistants have been with us for years and yet, even today, they are still mostly viewed as eccentric features, not for everyone and not for everyday use. But if they are to really become as commonplace as our smartphones and computers, they need to feel more natural to use. They need to be as capable, talented, and proactive as a human personal assistant, but one that just happens to live inside a phone or speaker.