How AI can help moderate online toxicity
Toxicity on social networks and the internet is on the rise. However, used well, artificial intelligence (AI) can be part of the solution to protect online communities and brands’ reputation
The internet and social networks have become essential tools for brands and influencers to engage with their communities and forge new ties with prospective customers or followers. It is here that customers and followers ask questions, report problems, share opinions in real time, and interact with the brand and with each other. This engagement, and the ability to monitor and respond to it, is an incredible opportunity for brands – provided that the relationships it enables encourage dialogue rather than dispute.
Unfortunately, the continued year-on-year rise in online toxicity belies this vision. More than 5% of the content generated on brands’ and influencers’ social networks and platforms is considered toxic (3.3% is hate such as insults, misogyny, racism, homophobia, or body shaming, and 2% is junk/pollution), according to a Bodyguard.AI study. This has become a real issue, as 40% of users leave a platform after their first exposure to harmful language. These users are also likely to share their poor experience with others, which can lead to brand damage that is often irreparable.
AI comes to the rescue of moderators
The solution is smart moderation to defend users against online toxicity. However, manual moderation is time-consuming and reliant on human beings, who get tired, desensitized, or overworked. The issue is further compounded by the time-sensitive nature of online hate: given the fleeting nature of social media comments, if moderators don’t react in time or fail to “catch” something, the damage is already done.
A trained human moderator needs around 10 seconds to analyse and moderate a single comment. When a hundred thousand comments are posted at the same time, that represents roughly 280 hours of review work, so it becomes impossible to manage the flow and handle hateful comments in real time. This is where artificial intelligence can come to the rescue: algorithms make it possible to review and instantly moderate huge volumes of online comments. Yet even this solution can present limitations.
The challenge of contextualization and the risk of censorship
Most major social platforms rely on machine learning to moderate online content. Unfortunately, these systems cannot detect all the subtleties of language and sentiment, nor the complexities of context and meaning. The result speaks for itself: according to the European Commission, only 63.6% of hateful content is removed from social networks.
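As a rough illustration of this limitation, consider a context-blind keyword filter, the crudest form of automated moderation. The Python sketch below uses a made-up blocklist and made-up comments; it is not how any particular platform works, but it shows how such a filter can simultaneously censor legitimate criticism and miss obfuscated abuse.

```python
# Illustrative only: a context-blind keyword filter with a hypothetical blocklist.
TOXIC_KEYWORDS = {"trash", "idiot"}

def naive_flag(comment: str) -> bool:
    """Flag a comment if it contains any blocklisted word, ignoring context."""
    words = {w.strip(".,!?").lower() for w in comment.split()}
    return bool(words & TOXIC_KEYWORDS)

comments = [
    "This phone is trash, I want a refund",  # criticism of a product, not hate
    "You absolute tr4sh human being",        # insult hidden behind obfuscated spelling
]

for comment in comments:
    print(naive_flag(comment), "|", comment)

# Output:
# True  | This phone is trash, I want a refund   <- over-moderation: criticism flagged
# False | You absolute tr4sh human being         <- under-moderation: the insult slips through
```

A contextual system has to close both gaps at once, which is precisely where simple approaches fall short.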
This substantial margin of error can also be the difference between censoring free speech and protecting communities, if algorithms overreact and erase content that is not actually toxic. For brands and influencers, this represents another conundrum. On the one hand, they have a clear responsibility to protect their community and the people who venture to their channels, buy their products, and so on. On the other hand, they risk infringing on free speech, which can also damage the experience of their community on those channels and ultimately hurt their reputation, cause disengagement, and more.
Artificial intelligence and human supervision: finding the right balance
AI has been deployed by brands to help solve this issue. But can artificial intelligence solutions really deal with moderating the complexities and subtleties of language?
The answer depends on the approach brands take. AI technology can be agile and refined enough to ensure moderation is both contextual and autonomous. However, solutions vary widely and need to be chosen with care, as the best-practice approach strikes a fair and fine balance between artificial intelligence and human supervision. The best-in-class solutions can now reduce the error rate in the detection of toxic content to around 2 to 3% – instead of the 20 to 40% that some less sophisticated solutions deliver – and they will automatically identify and block 90% of toxic content in real time.
Benefits of the human-AI blended approach
AI solutions built on strong linguistic expertise enable algorithms to take a more contextual approach to moderation. They can take into consideration colloquialisms, the relationships between terms and/or emoticons, and the relationships between the users themselves. This helps moderators enormously: the system can determine who the content is aimed at, in what way it is toxic, and how severe it is, in real time, for accurate, intelligent, and instant moderation, as sketched below. Down the line, quality controllers can go over the moderated content to double-check the relevance of the moderation applied and whether the comment was indeed toxic.
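The sketch below is a hypothetical illustration of that blended workflow, not Bodyguard.ai’s actual system: an AI layer classifies each comment by target, toxicity type, severity and its own confidence, auto-removes only clear-cut toxic content, and routes ambiguous cases to human reviewers. All names and thresholds are assumptions made for the example.

```python
# Hypothetical sketch of a blended AI + human moderation workflow.
from dataclasses import dataclass
from typing import Optional

@dataclass
class ModerationResult:
    target: str                    # e.g. "brand", "another user", "a group"
    toxicity_type: Optional[str]   # e.g. "insult", "racism"; None if the comment is clean
    severity: float                # 0.0 (harmless) .. 1.0 (severe)
    confidence: float              # the model's confidence in its own judgement

def route(result: ModerationResult) -> str:
    """Decide what happens to a comment based on the AI analysis."""
    if result.toxicity_type is None:
        return "publish"        # ordinary conversation, criticism included
    if result.confidence >= 0.9 and result.severity >= 0.7:
        return "remove"         # clearly toxic: block in real time
    return "human_review"       # ambiguous: escalate to a moderator

# A high-confidence insult aimed at another user is removed instantly,
# while an ambiguous comment is queued for a human decision.
print(route(ModerationResult("another user", "insult", 0.9, 0.97)))  # remove
print(route(ModerationResult("brand", "insult", 0.4, 0.55)))         # human_review
print(route(ModerationResult("brand", None, 0.0, 0.99)))             # publish
```

The key design choice is that the automatic path only handles comments the model is highly confident about, so criticism and ambiguous speech reach a human reviewer rather than being silently removed.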
Solutions built on this cycle of monitoring, artificial intelligence and human supervision make it possible to protect individuals, communities and brands from up to 90% of toxic online content without infringing on free speech or damaging brands’ reputations. This approach never stops the flow of free conversation and shouldn’t automatically intervene to prevent criticism from being expressed. The blended human-AI approach ensures that only truly toxic messages are moderated: those whose only intention is not to communicate, but to attack. This will open up a safer and more inclusive internet that encourages engagement, dialogue and respect.
Matthieu Boutard is President and co-founder of Bodyguard.ai.