Defining 'Hate Speech' Online Is an Imperfect Art

Why did YouTube remove a WWII-era video? Because algorithms, and humans, are flawed

Shortly after a rally by white supremacists in Charlottesville, Virginia, led to the death of a counterprotester, YouTube removed a video of U.S. soldiers blowing up a Nazi swastika in 1945. In place of the video, users saw a message saying it had been “removed for violating YouTube’s policy on hate speech.”

Around the same time, an article from the neo-Nazi website Daily Stormer attacking Heather Heyer, the 32-year-old woman killed during the protest, was shared 65,000 times on Facebook before the company began deleting links to it a day later for violating its community standards on hate speech. From that point on, Facebook said, it would allow links to the post only if they included a caption denouncing the article or the publication.

The two incidents underscore a central challenge for technology companies as they reluctantly wade deeper into policing content. To help sort through torrents of material, platform operators increasingly rely on computer algorithms. But these software programs are imperfect tools for assessing the nuances that can distinguish acceptable from unacceptable words or images.

YouTube’s removal of the World War II swastika video provoked sharp criticism online, with many commenters blaming the computers. “If you can fire a sexist human, Google, you can fix a Nazi algorithm,” opined the culture site Boing Boing, referring to the recent firing of James Damore, author of a memo criticizing Google’s diversity programs.

YouTube reinstated the video several hours later and admitted a mistake. “YouTube is a powerful platform for documenting world events, and we have clear policies that outline what content is acceptable to post,” says a YouTube spokesperson. “With the massive volume of videos on our site, sometimes we make the wrong call. When it’s brought to our attention that a video or channel has been removed mistakenly, we act quickly to reinstate it.”

Arbitrating the bounds of acceptable content on global tech platforms is an enormous task. Roughly 400 hours of content are uploaded to YouTube each minute. Facebook has more than 2 billion users posting updates, comments, and videos. Increasingly, these companies rely on software. Facebook-owned Instagram recently introduced an algorithm to zap comments from trolls. Both YouTube and Facebook have deployed software to filter terrorism-related content. With a tool known as the Redirect Method, YouTube serves anti-ISIS content to users searching for ISIS-related videos. Facebook says it can identify and wipe out clusters of users that might have terrorist ties.

But the software remains imperfect, so people are almost always involved, too. YouTube says it may use algorithms to determine whether content flagged for review should be given higher priority for a human reviewer. But a human always makes the final call on whether to pull something from the platform.
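YouTube has not said how that prioritization works in practice. As a rough sketch only, with an invented model_score standing in for whatever signal the company actually uses, the division of labor might look like a scored queue that a person drains:

```python
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class FlaggedVideo:
    sort_key: float                        # lower value pops first
    video_id: str = field(compare=False)

class ReviewQueue:
    """Triage sketch: a model scores flagged uploads; a person makes the removal call."""

    def __init__(self):
        self._heap: list[FlaggedVideo] = []

    def flag(self, video_id: str, model_score: float) -> None:
        # Negate the score so uploads the model thinks are most likely
        # to violate policy reach a human reviewer first.
        heapq.heappush(self._heap, FlaggedVideo(-model_score, video_id))

    def next_for_review(self) -> str | None:
        # The algorithm only orders the queue; the reviewer decides
        # whether the video actually comes down.
        return heapq.heappop(self._heap).video_id if self._heap else None
```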

Researchers say the artificial-intelligence programs that analyze content are continually improving. But, they say, these programs remain far from understanding the context around words or images, which would allow them to make filtering decisions on their own. “Understanding context does suggest, in the most dramatic interpretation, that you understand the world and everything in it,” says Dennis Mortensen, the CEO and founder of x.ai, a startup offering an online personal assistant that schedules meetings. “We are very far away from any machinery reaching that echelon.” Bart Selman, a computer-science professor at Cornell, says people will need to help the machines “for at least a decade or so longer.”

Jana Eggers, CEO of Nara Logics, a startup that incorporates artificial intelligence into its software for companies, uses the World War II Nazi video to explain the challenge of writing such rules into software. “The tech is at a blunt state: anything Nazi take down,” she says. Mistakes such as YouTube’s will prompt a revision: “Anything Nazi take down, unless from a historical perspective.” Then someone will point to pro-Nazi historical videos. “We’ll have another iteration: anything Nazi take down unless from a historical perspective and not pro-Nazi. Then, someone will point out that Leni Riefenstahl’s work—a historical example of propaganda—has been banned.” Should we take down content that’s being used in a context to rally current neo-Nazis? Probably. But should we also preserve historical examples of propaganda for educational or other purposes? That’s another tough call that AI can’t make just yet. Tech companies will have to decide where they stand on these issues before such decisions become automated.
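Her progression maps almost literally onto code. A toy sketch (invented tags, not any platform’s actual rules) shows how each revision just bolts on another exception without ever capturing context:

```python
# Hypothetical tag-based rules, not any platform's real policy code. Each
# version patches the last one's mistake by hard-coding another exception,
# which is Eggers' point: the rules never actually acquire context.

def should_remove_v1(tags: set[str]) -> bool:
    return "nazi" in tags                                # "anything Nazi: take down"

def should_remove_v2(tags: set[str]) -> bool:
    return "nazi" in tags and "historical" not in tags  # spares the 1945 footage

def should_remove_v3(tags: set[str]) -> bool:
    # ...until someone points to historical *pro*-Nazi material,
    # such as Riefenstahl's propaganda films studied in classrooms.
    return "nazi" in tags and ("historical" not in tags or "pro-nazi" in tags)

print(should_remove_v1({"nazi", "historical"}))              # True:  the WWII video comes down
print(should_remove_v2({"nazi", "historical"}))              # False: the WWII video survives
print(should_remove_v3({"nazi", "historical", "pro-nazi"}))  # True:  but education or recruitment?
```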

So why would human moderators recommend that the historic Nazi video be taken down? They may lack context as well. “People get sensitive, and if they don’t have the cultural understanding of what that destruction meant to people, then they don’t understand how important it is to see that destruction,” says Eggers. She compares the state of content reviews to former Supreme Court Justice Potter Stewart’s description of pornography: “I know it when I see it.”

Moreover, the sheer volume of content means reviewers must act fast, increasing the chance of mistakes. Mortensen, of x.ai, says reviewers are trained quickly on guidelines for permissible content, then told to scan videos rather than watch them from beginning to end. “In this setting even the same human will label content inconsistently—as in, there will never be a 100% agreement rate,” he says.