SlashGear

Google Brain algorithm makes “zoom in, enhance” almost real

Anyone who has followed crime procedural TV shows, particularly the likes of CSI, will probably be familiar with the “zoom in, enhance” method of pulling evidence out of thin air, so to speak. It has become so common an occurrence that it has transformed both into a trope but also into a sort of holy grail for computer scientists. The latter bunch at the Google Brain deep learning lab might have come across a process that almost comes close to fiction. Well, almost.

For the uninitiated, the “zoom, enhance” trope, in a nutshell, means taking a grainy, pixelated image, or a cropped section of one, and then, through the magic of science and technology, produce a recognizable face or license plate number that law enforcers can use to save the day. Easier done in fiction, of course, because a basic principle in computer graphics is that you can’t extract more data, that is, a higher resolution image, out of a very low res image. Like, say, an 8×8 image. Not without the help of machine learning, of course.

The human brains at Google Brain have come up with a two-pronged way that almost makes that kind of process possible. Two-pronged because it involves two neural networks working on the same task from different angles. One network, the “conditioning” network, tries to compare the small, low res, 8×8 image to higher res photos by resizing the latter down to 8×8.

The other “prior” network, on the other hand, studies specific sets of pictures, like faces of celebrities or rooms, and tries to discern patterns, like placement of facial features. It then applies those to an upscaled version of the 8×8 image. The two networks are then combined to form a single, higher resolution image, whose accuracy is both impressive and, at times, almost ridiculous.

The results are quite promising. Google Brain researchers say that humans comparing high resolution images with the artificially produced ones were fooled 10% when it came to faces, and an even higher 25% when it come to photos of rooms. It’s not even close to perfect but significantly higher than previous methods. We’re still far, far away from the world of CSI, but it seems we might be frighteningly close.

VIA: Ars Technica