What It Is
This is an AI-powered intervention, most often powered by Jigsaw's Perspective API, which rates a comment's toxicity.
Typically, any comment receiving a high toxicity score triggers a message suggesting that the commenter revise their post before it goes live.
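The flow described above can be sketched in a few lines. This is a minimal illustration, not a production implementation: the request and response shapes follow Perspective API's `comments:analyze` endpoint, but the `THRESHOLD` value, the prompt wording, and the mocked response are assumptions for demonstration, since the source does not specify them.

```python
# Sketch of a toxicity nudge using Perspective API's scoring.
# THRESHOLD and the prompt copy are illustrative assumptions.

THRESHOLD = 0.8  # assumed cutoff; platforms tune this for their community


def build_analyze_request(comment_text: str) -> dict:
    """Request body for POST .../v1alpha1/comments:analyze (Perspective API)."""
    return {
        "comment": {"text": comment_text},
        "requestedAttributes": {"TOXICITY": {}},
        "doNotStore": True,  # avoid retaining user text server-side
    }


def should_prompt_revision(toxicity_score: float) -> bool:
    """Decide, between 'Submit' and publication, whether to nudge the user."""
    return toxicity_score >= THRESHOLD


# A mocked response in the shape Perspective API returns.
mock_response = {
    "attributeScores": {
        "TOXICITY": {"summaryScore": {"value": 0.91}}
    }
}
score = mock_response["attributeScores"]["TOXICITY"]["summaryScore"]["value"]
if should_prompt_revision(score):
    print("Your comment may come across as hostile. Revise before posting?")
```

In practice the request would be sent to Perspective's `commentanalyzer` endpoint with an API key, and the nudge UI would offer the user the choice to edit, post anyway, or delete.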
When To Use It
This can be used in conjunction with any comment section or platform that primarily relies on short-text posts.
It should appear to the user after they have pressed the 'Submit' button, and before the post goes live.
What Is Its Intended Impact
This intervention reduces the number of toxic comments posted. It is particularly geared towards "road rage" style comments, in which an otherwise well-intentioned user posts impulsively in the heat of the moment.
Evidence That It Works
A private study by OpenWeb (2020), conducted on their own commenting platform (which they sell as a service), found that about half of users either revise their comment or decide not to post it when warned that it may be inflammatory.
"The overall positive outcome of this experiment reinforces our belief that quality spaces and civility cues drive a better, healthier conversation experience." write the study's authors. "A gentle nudge can steer the conversation in the right direction and provide an opportunity for users with good intentions to participate. The feature provides more transparency and education throughout the user engagement journey boosting loyalty and overall community health."
In a separate survey of 907 Reddit users, "although roughly a fifth (18%) of the participants accepted that their post removal was appropriate... over a third (37%) of the participants did not understand why their post was removed, and further, 29% of the participants expressed some level of frustration about the removal." The study suggests that "users receiving explanations for removal are more likely to perceive the removal as fair and post again in the future."
And finally, in a randomized controlled experiment conducted on Twitter, researchers from Cornell and Yale evaluated "a new intervention that aims to encourage participants to reconsider their offensive content [with a prompt to] users, who are about to post harmful content, with an opportunity to pause and reconsider their Tweet."
Their research found that users prompted with this intervention "posted 6% fewer offensive Tweets than non-prompted users in our control. This decrease in the creation of offensive content can be attributed not just to the deletion and revision of prompted Tweets — we also observed a decrease in both the number of offensive Tweets that prompted users create in the future and the number of offensive replies to prompted Tweets." They concluded that interventions allowing users to reconsider their comments can be an effective mechanism for reducing offensive content online.
Why It Matters
The findings suggest that most, but not all, edits made in response to this prompt are done in good faith; a minority of users instead try to trick the system or redirect their anger at the prompt itself, so some backlash and circumvention should be expected.
In one study, in roughly 0.4% of cases in which this nudge was shown, users edited their posts to add even more slurs, attacks, or profanity than they had originally intended to post.
And because any API that rates the toxicity of comments is designed and trained by humans, it will naturally carry the perspectives and biases of its creators.
OpenWeb tests the impact of “nudges” in online discussions
"Did You Suspect the Post Would be Removed?": Understanding User Reactions to Content Removals on Reddit
Reconsidering Tweets: Intervening During Tweet Creation Decreases Offensive Content