MIT's AI machine removes humans from big-data analysis

The spread of the Internet and the prevalence of mobile devices have put data front and center today more than in any previous era. But with large volumes of data comes the need to make sense of that data. Called big-data analysis, the process has mostly relied on human intuition. As any artificial intelligence scientist will tell you, intuition is one of the hardest parts of the human thinking process to replicate. Researchers from MIT, however, might be on the verge of a breakthrough, with a Data Science Machine capable of performing as well as, or even better than, humans.

The Data Science Machine was designed and built specifically for big-data analysis. To test just how far they've come, the researchers entered the AI in three data science contests. In one competition with 906 teams, it finished ahead of 615. In two of the three competitions, it was 94 percent and 96 percent as accurate as its human competitors. And while it was only 87 percent as accurate in the third, it worked on the data for a maximum of 12 hours. Humans took months to finish their analysis.

Max Kanter, whose master's thesis was the basis of the machine, and his thesis adviser Kalyan Veeramachaneni use several techniques to give the Data Science Machine the semblance of intuition. For example, it uses the structural relationships between tables in a database as hints. All major databases these days have these relationships, which makes the approach easier to implement.
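The article doesn't spell out how the machine turns those relationships into hints, so the following is only a rough sketch of one plausible reading: follow a foreign key from a parent table to its child rows and summarize the child values as candidate features. The tables, column names, and the `aggregate_child_features` helper are all hypothetical and are not taken from the researchers' code.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy tables; the names and values are illustrative only.
customers = [{"customer_id": 1}, {"customer_id": 2}]
orders = [
    {"order_id": 10, "customer_id": 1, "amount": 20.0},
    {"order_id": 11, "customer_id": 1, "amount": 40.0},
    {"order_id": 12, "customer_id": 2, "amount": 5.0},
]


def aggregate_child_features(parents, children, key, column):
    """Summarize a child-table column for each parent row via a shared key."""
    # Group the child values by the foreign key that points at the parent.
    grouped = defaultdict(list)
    for row in children:
        grouped[row[key]].append(row[column])

    features = []
    for parent in parents:
        values = grouped.get(parent[key], [])
        features.append({
            key: parent[key],
            f"count_{column}": len(values),
            f"sum_{column}": sum(values),
            # Fall back to 0.0 when a parent has no child rows.
            f"mean_{column}": mean(values) if values else 0.0,
            f"max_{column}": max(values) if values else 0.0,
        })
    return features


customer_features = aggregate_child_features(
    customers, orders, key="customer_id", column="amount"
)
```

Each parent row ends up with a handful of numeric summaries that a predictive model could then try out, without a person having to decide up front which ones matter.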

Rather than ushering in a dark, cataclysmic future filled with robotic overlords, the Data Science Machine is, at least for now, being employed for somewhat more mundane tasks. In particular, the machine is being used to help determine which students are likely to drop out of MIT's online courses. The machine tries to gain that insight by analyzing how early or late a student starts an assignment and how much time he or she spends online on the course.
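As a minimal sketch of what those two signals could look like as numbers, the snippet below computes how far ahead of a deadline a student first touched an assignment and the total minutes spent online. The `engagement_features` function, its parameters, and the sample timestamps are assumptions made for illustration; the article does not describe the actual data format or model.

```python
from datetime import datetime


def engagement_features(first_activity, deadline, session_minutes):
    """Turn raw activity data into two simple per-student indicators."""
    # Positive values mean the student started before the deadline.
    lead_hours = (deadline - first_activity).total_seconds() / 3600.0
    return {
        "start_lead_hours": lead_hours,
        "total_minutes_online": sum(session_minutes),
    }


# Hypothetical student: started two days early, logged three sessions.
features = engagement_features(
    first_activity=datetime(2015, 10, 1, 9, 0),
    deadline=datetime(2015, 10, 3, 9, 0),
    session_minutes=[30, 45, 15],
)
```

Values like these would be inputs to a dropout predictor, not a prediction in themselves.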

Kanter says that there is a huge amount of data out there in the world. Just ask Google. However, most of it is just sitting there, underused. The Data Science Machine could, in the future, reduce the time needed to sift through that data to produce something more useful. Or maybe even something more dangerous.

SOURCE: EurekAlert!