Preliminary Flagging Before Posting

Reduce online harassment





What It Is

This is an AI-powered intervention, most often powered by Jigsaw's Perspective API, which rates a comment's toxicity.

Typically, any comment that receives a high toxicity score triggers a message suggesting that the commenter revise their post before it is published.
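The flow described above can be sketched in a few lines. This is a minimal illustration, not OpenWeb's or Jigsaw's actual implementation: the threshold value, the prompt wording, and the `score_fn` stand-in are all assumptions, though the request body mirrors the general shape of Perspective's `comments:analyze` endpoint.

```python
# Sketch of pre-post flagging: score a draft comment and, above a chosen
# threshold, ask the commenter to revise before publishing.
# The 0.8 threshold is illustrative, not a Perspective default.

TOXICITY_THRESHOLD = 0.8  # assumption: tune per community


def build_perspective_request(text: str) -> dict:
    """Request body in the general shape Perspective's comments:analyze expects."""
    return {
        "comment": {"text": text},
        "requestedAttributes": {"TOXICITY": {}},
    }


def preflight(text: str, score_fn) -> dict:
    """Decide whether to prompt revision before a comment is published.

    `score_fn` stands in for a call to a toxicity scorer (e.g. Perspective);
    it is assumed to return a probability-like score in [0, 1].
    """
    score = score_fn(text)
    if score >= TOXICITY_THRESHOLD:
        return {
            "allow_immediate_post": False,
            "score": score,
            "prompt": "Your comment may come across as hostile. Revise before posting?",
        }
    return {"allow_immediate_post": True, "score": score, "prompt": None}
```

In production, `score_fn` would send `build_perspective_request(text)` to the scoring service and read the returned toxicity score; here it is injected so the decision logic stays testable offline.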

When To Use It

This can be used in conjunction with any comment section or platform that primarily relies on short-text posts.

What Is Its Intended Impact

This intervention reduces the number of toxic comments posted. It is particularly geared toward "road rage"-style comments, in which an otherwise well-intentioned user posts impulsively in the heat of the moment.

Worth noting:

  1. Most, but not all, edits made in response to this prompt are done in good faith. A minority of edits either tried to trick the system or responded directly and angrily to the prompt.
  2. Any system that rates the toxicity of comments is built and trained by humans. It will naturally carry the perspectives and biases of its creators.

How We Know It Works

Why It Might Work

A study by OpenWeb (2020) on their own platform* found that about half of users either revise their comment or decide not to post it when prompted that their comment may be inflammatory.

"The overall positive outcome of this experiment reinforces our belief that quality spaces and civility cues drive a better, healthier conversation experience," write the study's authors. "A gentle nudge can steer the conversation in the right direction and provide an opportunity for users with good intentions to participate. The feature provides more transparency and education throughout the user engagement journey boosting loyalty and overall community health."

*Important Note: this was a study of OpenWeb's own platform, which they sell as a service.

Why It Matters

The findings suggest that, while a minority of users edited their comments to trick the system or redirected their anger at the prompt itself, most edits made in response to the prompt were done in good faith.

Special Considerations

When it comes to moderation technologies, there is no one-size-fits-all solution. The study's authors also believe this data analysis helped them understand and detect online trolls faster and more reliably. If a user repeatedly ignores nudges and tries to trick the system, stronger tools such as automatic suspension are warranted.
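The escalation idea above can be sketched as a simple per-user counter. This is a hypothetical policy for illustration only; the three-strike limit and the action names are assumptions, not anything described by OpenWeb.

```python
from collections import defaultdict

# Hypothetical escalation policy: a user who repeatedly posts a flagged
# comment unchanged (i.e., ignores the nudge) is escalated from warnings
# to automatic suspension.

IGNORED_NUDGE_LIMIT = 3  # assumption: strikes before auto-suspension

ignored_nudges: dict = defaultdict(int)


def record_post(user_id: str, was_flagged: bool, was_revised: bool) -> str:
    """Return the moderation action after a user submits a comment.

    A flagged comment posted without revision counts as an ignored nudge;
    hitting the limit triggers suspension, otherwise a warning is issued.
    """
    if was_flagged and not was_revised:
        ignored_nudges[user_id] += 1
        if ignored_nudges[user_id] >= IGNORED_NUDGE_LIMIT:
            return "suspend"
        return "warn"
    return "none"
```

A real system would persist these counts and likely decay them over time, so that a long-reformed user is not suspended for old behavior.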


Our Confidence Rating


A Tentative grade is given when randomized controls or independence from vested interests are absent, or when a previously Convincing intervention fails to replicate.

In the Wild

This intervention has precedent: it exists, or has at some point existed, in the wild.


Intervention Specific Research

OpenWeb tests the impact of “nudges” in online discussions


Ido Goldberg, Guy Simon, Kusuma Thimmaiah

Date of Publication

September 21, 2020

Publication Status

Corporate Origin


There's more to learn about this intervention. Want to help?

Do you think this intervention could have more benefits, unacknowledged drawbacks, or other inaccuracies that we've neglected to mention here?

We always welcome more evidence and rigorous research to back up, debunk, or augment what we know.

If you want to be a part of that effort, we'd love to have your help!

Email us
