Deezer develops an AI to hunt for explicit content in songs

Brittany A. Roston - Apr 29, 2020, 2:50 pm CDT

Streaming platform Deezer is using artificial intelligence to hunt for explicit content in songs, helping humans tag content that contains potentially bothersome language. The company detailed its effort in a new paper that will be published at the ICASSP 2020 conference, according to a recent post. Deezer points out that what is considered explicit content is a ‘cultural issue,’ which complicates things.

Record labels may tag explicit content when they deliver albums to streaming companies, but Deezer points out that many do not. In fact, the company says that ‘a substantially large part’ of its music library carries no tag indicating whether it contains explicit content, which can include anything from strong language to discriminatory content and more.

Humans have been tasked with reading song lyrics and determining whether an ‘explicit’ tag is appropriate, but it may not remain that way forever. Deezer is looking into whether it is possible to build a system that can analyze large libraries of content and flag problematic content on its own. This, the company notes, is quite complex.

That complexity doesn’t stem from finding certain words or phrases, but rather from the need to understand what each culture expects in terms of explicitness, as well as what is subjectively considered to be inappropriate or adult content. The company detailed the technical aspects of its effort in a long Medium post this week, explaining some of the methods and problems it has encountered.

Deezer notes that no system has been able to reach the same level of accuracy as humans in detecting explicit content, but that its system has returned some ‘encouraging results.’ For now, AI alone cannot determine whether music is explicit, but it can potentially serve as a tool to help humans catalog large libraries of content.
