Voicera EVA AI mimics human attention to highlight important points

The popularity of voice-controlled smart assistants has substantially pushed forward the fields of speech recognition and natural language processing (NLP). This, in turn, has improved other products that make use of such technologies, like the transcription engines used in meetings and interviews. But it's one thing to accurately transcribe spoken, sometimes barely intelligible words into text and another to extract and highlight the important points in a meeting involving multiple people. Voicera's in-meeting assistant EVA can do both, and it's no surprise that its secret sauce is artificial intelligence: specifically, the Progressive Attention AI that the company is now making available to the public.

Turning spoken word into text is relatively easy these days. Simply feed a transcription engine an audio file and watch it spew out letters, with machine learning and AI steadily improving the accuracy of the result. But to pinpoint the parts of a transcription that are more important than others, you need to pay attention to what's being said and how it's being said. And for that, you need a dual-system AI.

The human brain, it turns out, doesn't have a one-track mind. Its attention actually operates on two levels. The first level is like a high-speed radar that's always scanning the environment without prejudice. When it picks up something important, it switches to the second level, where attention and brain functionality are more focused on the input, disregarding other stimuli. The brain seamlessly and effortlessly switches between these two levels at a moment's notice, something that AI hasn't exactly been able to do. At least not until now.

AI systems often have to choose between those two levels: fast but less accurate processing, or accurate but time-consuming processing. Voicera's solution is to use both. The Progressive Attention AI in EVA is made up of two systems that mimic how the human mind works. One is always on and always listening, on the lookout for changes in environment and sound. The other is deeper, more focused, and more accurate, but it only kicks in when attention is needed, such as during important parts of a conversation.
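To make the two-level idea concrete, here is a minimal sketch of a cheap always-on pass that decides when to wake a slower, more careful pass. It is an illustration only, not Voicera's implementation: the function names, the thresholds, and the toy keyword-based scorers are all assumptions invented for this example.

```python
from typing import Callable, List, Tuple

def progressive_attention(
    segments: List[str],
    fast_score: Callable[[str], float],
    deep_score: Callable[[str], float],
    wake_threshold: float = 0.5,       # hypothetical value
    highlight_threshold: float = 0.8,  # hypothetical value
) -> List[Tuple[int, str]]:
    """Return (index, segment) pairs that the deep pass marks as highlights."""
    highlights = []
    for index, segment in enumerate(segments):
        # Level 1: the cheap scorer runs on every segment.
        if fast_score(segment) < wake_threshold:
            continue
        # Level 2: the expensive scorer runs only on flagged segments.
        if deep_score(segment) >= highlight_threshold:
            highlights.append((index, segment))
    return highlights

# Toy stand-ins for the two models (purely illustrative).
CUE_WORDS = {"deadline", "decide", "action", "budget"}

def _clean_words(segment: str) -> List[str]:
    return [word.strip(".,!?") for word in segment.lower().split()]

def toy_fast_score(segment: str) -> float:
    # Fires as soon as any cue word appears.
    return 1.0 if any(w in CUE_WORDS for w in _clean_words(segment)) else 0.0

def toy_deep_score(segment: str) -> float:
    # Stricter: needs at least two cue words to reach a score of 1.0.
    hits = sum(1 for w in _clean_words(segment) if w in CUE_WORDS)
    return min(1.0, hits / 2)
```

With these toy scorers, a segment such as "We need to decide the budget by Friday." wakes the deep pass and is highlighted, while small talk never reaches the deep pass at all. That skipped work is where the savings come from.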

If this dual-system Progressive Attention AI isn't enough AI for you, EVA actually has more. When Voicera sets EVA, which is short for Enterprise Voice AI, by the way, to work on a transcription, it runs three trained engines on the same audio file. Each engine specializes in a particular scenario: one is trained on meetings with lots of background noise, another on meetings with multiple speakers, and so on. This is called Ensemble Learning, and it is designed to increase the accuracy of the output. When the engines disagree on which word a particular sound translates to, a machine learning layer acts as an arbiter, either by preferring the engine that has more expertise in the given situation or by simply picking the output that two out of three engines agree upon.
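The arbitration step can be sketched in a few lines. This is a simplified illustration, not Voicera's code: it replaces the machine learning arbiter with a fixed rule (take the two-out-of-three majority, otherwise fall back to a caller-chosen engine), it assumes the engines' outputs are already aligned word by word, and the engine names are made up.

```python
from collections import Counter
from typing import Dict, List

def arbitrate_word(candidates: Dict[str, str], preferred_engine: str) -> str:
    """Pick one word from per-engine candidates for the same sound."""
    word, votes = Counter(candidates.values()).most_common(1)[0]
    if votes >= 2:
        # At least two engines agree, so take the majority answer.
        return word
    # No agreement: defer to the engine assumed to fit this situation best.
    return candidates[preferred_engine]

def arbitrate_transcript(
    engine_outputs: Dict[str, List[str]], preferred_engine: str
) -> List[str]:
    """Merge word-aligned transcripts from several engines into one."""
    names = list(engine_outputs)
    merged = []
    for words in zip(*(engine_outputs[name] for name in names)):
        merged.append(arbitrate_word(dict(zip(names, words)), preferred_engine))
    return merged

# Hypothetical engines and outputs for the same three-word stretch of audio.
outputs = {
    "noisy_room": ["ship", "the", "release"],
    "multi_speaker": ["ship", "a", "release"],
    "general": ["chip", "uh", "release"],
}
merged = arbitrate_transcript(outputs, preferred_engine="multi_speaker")
```

Here the first word is settled by majority vote, the second by deferring to the preferred engine because all three disagree, and the third is unanimous. A learned arbiter would presumably choose the preferred engine per situation rather than take it as a fixed argument.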

This multi-faceted AI system not only allows EVA to accurately highlight portions of a transcript that you can share later on, but it also acts as a sort of redundancy system. You can record a meeting or conversation, confident that even if the Internet connection drops, you will still end up with a useful transcript. Voicera will reduce the bitrate and quality of the audio that gets streamed to its instant transcription engine while keeping a high-quality version for later processing when you get a better connection.
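One common way to build this kind of fallback is a bitrate ladder: stream at the highest rate the current connection can sustain and keep the full-quality recording locally for later. The sketch below shows only the selection step. The ladder values, the headroom factor, and the function name are assumptions for illustration and do not describe Voicera's actual settings.

```python
from typing import Sequence

def pick_stream_bitrate(
    measured_kbps: float,
    ladder_kbps: Sequence[int] = (16, 32, 64, 128),  # hypothetical rungs
    headroom: float = 0.8,  # hypothetical: use at most 80% of the link
) -> int:
    """Return the highest ladder bitrate that fits the measured bandwidth."""
    budget = measured_kbps * headroom
    fitting = [rate for rate in ladder_kbps if rate <= budget]
    # If nothing fits, fall back to the lowest rung rather than stop streaming.
    return max(fitting) if fitting else min(ladder_kbps)
```

On a link measured at 100 kbps this picks the 64 kbps rung, and on a very poor link it drops to the lowest rung, so the live transcript keeps flowing while the high-quality copy waits for a better connection.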

While EVA naturally relies on machine learning to improve its accuracy, Voicera is also looking to users to give it some aid. Comments, annotations, and edits made by users after the transcription has been made can go a long way in improving the AI's models. That said, that editing ability is still absent from its mobile apps, something Voicera is working to change.

Given today's often fast-paced meetings, the plethora of mobile devices that distract you, and humans' naturally short attention spans, keeping on top of all the conversation that happens in a meeting or interview can be difficult. With Voicera and EVA, you can rely on good ol' artificial intelligence to keep notes for you while you put down your pen and focus on what's really important: being present and paying attention to the people with you.