Social media and moderation

I've participated in a lot of online communities, and a lot of types of online communities, over the decades -- mailing lists, Usenet, blogging platforms like Dreamwidth, web-based forums, Q&A communities... and social media. With the exception of blogging platforms, where readers opt in to specific people/blogs/journals and the platform doesn't push other stuff at us, online communities tend to end up with some level of moderation.

We had (some) content moderation even in the early days of mailing lists and Usenet. Mostly1 this was gatekeeping -- reviewing content before it was released, because sometimes people post ill-advised things like personal attacks. Mailing lists and Usenet were inherently slow to begin with -- turnaround times were measured in hours if you were lucky and more typically days -- so adding a step where a human reviewed a post before letting it go out into the wild didn't cost much. Communities were small and moderation was mostly there to stop the rare egregiously bad stuff, not to curate everything. So far as I recall, nobody back then was vetting content for accuracy, like declaring posts to be misinformation.

On the modern Internet with its speed and scale, moderation is usually after the fact. A human moderator sees (or is alerted to) content that doesn't fit the site's rules and handles it. Walking the moderation line can be tough. On Codidact2 and (previously) Stack Exchange, I and my fellow moderators have sometimes had deep discussions of borderline cases. Is that post offensive to a reasonable person, or is it civilly expressing an unpopular idea? Is that link to the poster's book or blog spam, or is the problem that the affiliation isn't disclosed? How do we handle a case where a very small number of people say something is offensive and most people say it's not -- does it fail the reasonable-person principle, or is it a new trend that a lot of people don't yet know about? We human moderators would examine these issues, sometimes seek outside help, and take the smallest action that corrects an actual problem (often an edit, maybe a word with the user, sometimes a timed suspension).

Three things are really, really important here: (1) human decision-makers, (2) who can explain how they applied the public guidelines, with (3) a way to review and reverse decisions.

Automation isn't always bad. Most of us use automated spam filtering. Some sites have automation that flags content for moderator review. As a user I sometimes want to have automation available to me -- to inform me, but not to make irreversible decisions for me. I want my email system to route spam to a spam folder -- but I don't want it to delete it outright, like Gmail sometimes does. I want my browser to alert me that the certificate for the site I'm trying to visit isn't valid -- but I don't want it to bar me from proceeding anyway. I want a product listing for an electronic product to disclose that it is not UL-certified -- but I don't want a bot to block the sale or quietly remove that product from the seller's catalogue.
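
To make the distinction concrete, here's a minimal sketch of the spam case, with the automation assisting rather than deciding. Everything in it -- the function name, the message shape, the score threshold -- is hypothetical, not any real mail system's API.

```python
# A minimal sketch of "inform, don't decide" for the spam example.
# The function name, message shape, and threshold are all made up.
def route_message(message: dict, spam_score: float, threshold: float = 0.9) -> str:
    """Return the folder a message should be filed in.

    A high score moves the message aside and records why, but nothing is
    deleted outright -- the reader can open the Spam folder and rescue it.
    """
    if spam_score >= threshold:
        message["filter_note"] = f"filed as spam (score {spam_score:.2f})"
        return "Spam"
    return "Inbox"
```

The same shape fits the certificate warning and the UL disclosure: the automation adds information, and the person keeps the final decision.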

Twitter has been failing on exactly these points for a while. (It isn't alone, of course, but it's the one everyone's paying attention to right now.) Twitter is pretty bad, Musk's Twitter is likely to be differently bad, and making it good is a hard problem.3

Twitter uses bots to moderate content, and those bots sometimes get it badly wrong. If the bots merely flagged content for human review, that would be ok -- but to do that at scale, Twitter would need to make fundamental changes to its model. No, the bots block the tweets and auto-suspend the users. To get unsuspended, a user has to delete the tweets, admit to wrongdoing, and promise not to do it "again" -- even if there's nothing wrong with the tweet. The people I've seen hit by this couldn't find an appeal path. Combine this with opaque and arbitrary rules, and it's a nightmare.

Musk might shut down some of the sketchier moderation bots (it's always hard to know what's going on in Musk's head), but he's already promised his advertisers that Twitter won't be a free-for-all, so that means he's keeping some bot-based moderation, probably using different rules than last week's. He's also planning to fire most of the employees, meaning there'll be even fewer people to review issues and adjust the algorithms. And it's still a "shoot first, ask questions later" model. It's not assistive automation.

A bot that annotates content with "contrary to CDC guidelines" or "not UL-certified" or "Google sentiment score: mildly negative" or "Consumer Reports rating: 74" or "failed NPR fact-check" or "Fox News says fake"? Sure, go for it -- we've had metadata like the Good Housekeeping seal of approval and FDA nutrition information and kashrut certifications for a long time. Want to hide violent videos or porn behind a "view sensitive content" control? Also ok, at least if it's mostly not wrong. As a practical matter a platform would have to limit the number of annotations, or let users choose which ones they want, but in principle, fine.
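
Here's a rough sketch of what annotate-don't-punish could look like. The names (Post, Label, the example checker) are mine for illustration, not anything any platform actually runs.

```python
# A sketch of advisory moderation: bots attach labels to a post instead of
# removing it or suspending its author. All names here are hypothetical.
from dataclasses import dataclass, field


@dataclass
class Label:
    source: str    # who produced the annotation, e.g. "fact-check" or "UL"
    verdict: str   # what it says about the post


@dataclass
class Post:
    author: str
    text: str
    labels: list[Label] = field(default_factory=list)
    sensitive: bool = False   # hidden behind a "view sensitive content" click


def annotate(post: Post, checkers) -> Post:
    """Run each checker and attach whatever labels come back.

    Checkers only add metadata; they never delete the post and never
    suspend its author.
    """
    for checker in checkers:
        label = checker(post)
        if label is not None:
            post.labels.append(label)
    return post


def sensitive_media_checker(post: Post):
    # A deliberately crude stand-in for a real classifier: mark, don't remove.
    if "graphic" in post.text.lower():
        post.sensitive = True
        return Label(source="sensitive media", verdict="view at your own discretion")
    return None
```

A post that trips a checker still exists and still reaches its audience; the only change is the extra metadata riding along with it.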

But that's not what Twitter does. Its bots don't inform; they judge and punish. Twitter has secret rules about what speech is allowed and what speech is not, uses bots to root out whatever it doesn't like today, takes action against the authors, and causes damage when it gets things wrong. There are no humans in the loop to check the bots' work, and there's no transparency.

It's not just Twitter, of course. Other platforms, either overwhelmed by scale or just trying to save some money, use bots to prune out content. Even with the best of intentions that can go wrong; when intentions are less pure, it's even worse.

Actual communities, and smaller platforms, can take advantage of human moderators if they want them. For large firehose-style platforms like Twitter, it seems to me, the solution to the moderation problem lies in metadata and user preferences, not in heavy-handed centralized automated deletions and suspensions. Give users information and the tools to filter -- and the responsibility to do so, or not. Take the decision away, and we're stuck with whatever the owner likes.
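
As a sketch of what that could look like on the reader's side (continuing the hypothetical labels from the earlier sketch, with every name again made up):

```python
# A sketch of filtering as a per-reader preference rather than a platform-wide
# verdict. Posts arrive with advisory labels attached; the reader decides what
# to hide, collapse behind a notice, or show. All names are hypothetical.
HIDE, COLLAPSE, SHOW = "hide", "collapse", "show"

# One reader's choices, keyed by label source. A different reader can choose
# the opposite, and the platform never has to take sides.
my_preferences = {
    "fact-check": COLLAPSE,     # show it, folded behind a notice
    "sensitive media": HIDE,    # don't show it to me at all
    "sentiment score": SHOW,    # I don't care about this label
}


def present(post: dict, preferences: dict) -> str:
    """Decide how to render one post for one reader.

    The post is never deleted and its author is never punished; the strictest
    matching preference affects only this reader's view.
    """
    verdicts = [preferences.get(label["source"], SHOW) for label in post["labels"]]
    if HIDE in verdicts:
        return HIDE
    if COLLAPSE in verdicts:
        return COLLAPSE
    return SHOW


example_post = {
    "author": "someone",
    "text": "a disputed claim about a product",
    "labels": [{"source": "fact-check", "verdict": "disputed"}],
}
print(present(example_post, my_preferences))   # -> "collapse"
```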

The alternative would be to use the Dreamwidth model: Dreamwidth performs no moderation that I'm aware of, I'm free to read (or stop reading) any author I want, and the platform won't push other content in front of me. This works for Dreamwidth, which doesn't need to push ads in front of millions of people to make money for its non-existent stockholders, but such slow growth is anathema to the big for-profit social networks.


  1. It was possible to delete posts on Usenet, but deletion was spotty and delayed. ↩︎

  2. The opinions in this post are mine and I'm not speaking for Codidact, where I am the community lead. ↩︎

  3. I'd say it's more socially hard than technically hard. ↩︎