Language filter on auto imported topics?
-
It is definitely something that I want to look into doing.
There doesn’t appear to be any way to filter out languages without de-federating. It would be easier to do with some servers than with others. Example: There are accounts on mstdn.ca that post in French, but I am aware that the server has some English speakers as well.
This is the downside of using ActivityPub relays. You start following a lot of servers, but you have to do the filtering by hand.
I have de-federated from a good number of servers (75), but I may have to be stricter with some others. If there are any specific ones that should be de-federated, let me know.

EDIT: OK, some things discovered tonight.
I was clicking “Block User” on a profile, which, it turns out, was only blocking the user for me and not for everyone.
I have to “Delete Account” on a user to actually remove an individual account. (That should allow us to remove German/French accounts without de-federating the entire server.) But then I have to do a soft restart of the forum for the changes to be pushed through.
It’s a solution, but one that will have to be actively monitored.
-
And this has been fixed.
Again, I was tired of waiting for a fix, so I used Claude to help with this. It ended up making a plugin for this site, which is a lot cleaner. It’s not perfect, but at least it is something.
This applies to posts coming from the Fediverse and posts created on Caint.ie.
Whenever a user on Caint.ie posts in a language that is neither English nor Irish, they will see this message: “Your post appears to be in a language that is not currently accepted on this forum. Caint only accepts posts in English or Irish (Gaeilge). If you believe this is a mistake, please contact the site administrator.”
Fediverse posts that are in neither English nor Irish will simply not come through.
There are some rules:
A post will be ALLOWED if:
- The text is shorter than 10 characters (too short to detect reliably)
- The language cannot be determined (returns und)
- The detected language is English (eng)
- The detected language is Irish (gle)
A post will be BLOCKED if:
- The text is 10 characters or longer, AND
- The language can be determined, AND
- The detected language is anything other than English or Irish
Additional things worth knowing:
- HTML tags are stripped before detection, so formatting does not affect the result
- Detection is based on the post content for replies, and the content or title for new topics
- The language detection is statistical – very short posts that scrape over the 10 character minimum may occasionally be misidentified
- Mixed language posts will be judged on whichever language dominates the text
- Posts where the language genuinely cannot be determined are always let through rather than risk blocking legitimate English content
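For anyone curious, the rules above boil down to something like the following. This is only a rough Python sketch of the decision logic, not the plugin's actual code: the names `is_allowed`, `strip_html`, `MIN_LENGTH` and `ALLOWED_CODES` are made up for illustration, the tag stripping is a simple regex rather than a real HTML parser, and the language detector is passed in as a function returning a three-letter code, since I am not assuming any particular detection library here.

```python
import re
from typing import Callable

# Assumed constants, taken from the rules listed above.
MIN_LENGTH = 10                  # shorter text is too short to detect reliably
ALLOWED_CODES = {"eng", "gle"}   # English and Irish
UNDETERMINED = "und"             # detector could not decide

# Naive tag matcher; good enough for a sketch, not a full HTML parser.
_TAG_RE = re.compile(r"<[^>]+>")


def strip_html(raw: str) -> str:
    """Remove HTML tags so formatting does not affect detection."""
    return _TAG_RE.sub("", raw).strip()


def is_allowed(raw: str, detect: Callable[[str], str]) -> bool:
    """Return True if the post should be let through.

    `detect` is a hypothetical callable that maps text to a
    three-letter language code such as "eng", "gle" or "und".
    """
    text = strip_html(raw)

    # Rule 1: very short text is always allowed.
    if len(text) < MIN_LENGTH:
        return True

    code = detect(text)

    # Rule 2: an undetermined language is allowed rather than
    # risk blocking legitimate English content.
    if code == UNDETERMINED:
        return True

    # Rules 3 and 4: English and Irish pass, everything else is blocked.
    return code in ALLOWED_CODES
```

With a stand-in detector that always answers "fra", `is_allowed("<p>Salut</p>", lambda t: "fra")` still lets the post through because the stripped text is under 10 characters, while a longer French post would be blocked.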