Last week at BUILD 2016, Microsoft revealed Seeing AI, a project that combines decades of research in computer vision, neural networks, and artificial intelligence to help the blind identify their surroundings using a single app. Not to be outdone, Facebook is announcing a similar effort aimed at people with severe visual impairments. Dubbed Automatic Alternate Text, the feature uses those same technologies to identify the contents of a photo in a Facebook post, allowing screen readers to describe the photo as if it were text and giving visually impaired people a better mental picture of the image.
In theory, every image in a web page should come with alternate text, the “alt” attribute of the HTML image tag, that describes the image. This information is used by both screen readers and plain text web browsers (yes, those do exist and people do use them) as a placeholder for the image. But with dynamic content like that on Facebook, it is nearly impossible to manually add alternate text for each and every photo posted there.
Enter automatic alternate text, or Automatic Alt Text. Similar to Microsoft’s project, it uses years of research in computer vision and artificial intelligence to analyze and identify objects inside a photo. And it goes beyond mere identification too. For example, it tries to identify the emotions shown by people inside a photo, such as whether they’re smiling. To some extent, it is similar to Microsoft’s newly launched CaptionBot website, though Facebook’s algorithms try to identify every object in a scene instead of just a single focus.
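To make the idea concrete, here is a minimal sketch in Python of how a list of recognized objects could be turned into alt text for an image tag. It is not Facebook's implementation: the Detection class, the build_alt_text and img_tag helpers, the 0.8 confidence threshold, the "Image may contain:" wording, and the "Photo" fallback are all assumptions made for illustration.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Detection:
    """One recognized concept (hypothetical structure, not Facebook's)."""
    label: str         # e.g. "tree" or "smiling"
    confidence: float  # model confidence between 0.0 and 1.0


def build_alt_text(detections, threshold=0.8):
    """Join the labels the model is confident about into one sentence.

    The threshold value is an assumption chosen for this sketch.
    """
    labels = []
    # Most confident labels first; skip low-confidence ones and duplicates.
    for d in sorted(detections, key=lambda d: d.confidence, reverse=True):
        if d.confidence >= threshold and d.label not in labels:
            labels.append(d.label)
    if not labels:
        return "Photo"  # assumed fallback when nothing is recognized
    return "Image may contain: " + ", ".join(labels)


def img_tag(src, alt):
    """Render an <img> element with a safely escaped alt attribute."""
    return f'<img src="{escape(src, quote=True)}" alt="{escape(alt, quote=True)}">'


if __name__ == "__main__":
    found = [
        Detection("two people", 0.97),
        Detection("smiling", 0.91),
        Detection("outdoor", 0.85),
        Detection("dog", 0.40),  # below the threshold, so left out
    ]
    print(img_tag("photo.jpg", build_alt_text(found)))
```

A screen reader that reaches the resulting tag would read the alt value aloud in place of the photo. Dropping low-confidence labels trades completeness for fewer wrong descriptions, which is a design choice of this sketch rather than a documented behavior of the feature.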
Auto Alt Text won’t be a perfect replacement for a human being describing a photo more accurately, but there won’t always be someone available to describe such things for blind people. The AI also simply identifies the objects in an image, not their arrangement and relationships in the scene. That might arrive in the not so distant future, when artificial intelligence becomes scarily smarter. For now, people will just have to use old-fashioned imagination to recreate the image in their head.
Facebook is rolling out the feature first to screen readers on iOS, in English only. Other platforms and languages are promised to follow, though no dates have been given.