lokeloski
@lokeloski

dog
@dog

I’ve been hearing a lot of this lately - many people running sites are getting hit with huge bills because of AI scrapers. They ignore robots.txt, and they switch IP ranges to keep you from blocking them. It’s really rough, and it’s not hard to feel like we’re witnessing the end of the public web - running a website that’s open to the public is just asking these people to ruin your day.



in reply to @dog's post:

really makes me wonder what we can even do about it - speaking as someone who enjoys hosting their personal website with writing, imagery and games that i've made. naturally my website won't be their first or biggest target, but i know they'll get to mine eventually. i have a gemini mirror and a bunch of other stuff like that, but is that even close to enough...? i dunno.

a historical parallel that occurs to me is the rise of community-maintained spam blackhole deny-lists that administrators would share among each other back in the day to try to block the worst-offending email spammers. this had problems too of course (malicious contributors intentionally blackholing each other for drama/sabotage reasons, eventual capture and institutionalization by big centralized email providers making it hard-to-impossible to send your own mail from your own mailserver)

At this point I suspect it's necessary to poison the fuckers. Sure you can scrape my site without obeying robots.txt, but uh, if you follow those links that have been hidden you might suddenly find yourself getting the output of a medium-sized markov model.
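
For anyone wondering what that kind of poison could look like, here's a rough sketch of a word-level Markov babbler in Python. It only covers the text generation part; hooking it up to hidden links on a real site is left out. The names `build_chain` and `babble` and the `order` and `length` parameters are made up for illustration, and the seed corpus would be whatever text you feed it.

```python
import random
from collections import defaultdict


def build_chain(text, order=2):
    """Map each run of `order` words to the words seen right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        chain[tuple(words[i:i + order])].append(words[i + order])
    return dict(chain)


def babble(chain, length=50, seed=None):
    """Random-walk the chain to produce `length` words of filler text."""
    if not chain:
        return ""  # corpus was too short to learn anything from
    rng = random.Random(seed)
    states = sorted(chain)  # sorted so a fixed seed gives repeatable output
    state = rng.choice(states)
    out = list(state)
    while len(out) < length:
        followers = chain.get(state)
        if not followers:
            # Dead end (state only seen at the end of the corpus):
            # jump to a random known state and keep going.
            state = rng.choice(states)
            out.extend(state)
            continue
        nxt = rng.choice(followers)
        out.append(nxt)
        state = state[1:] + (nxt,)
    return " ".join(out[:length])
```

Something like `babble(build_chain(open("corpus.txt").read()), length=300)` would give you a page's worth of plausible-looking nonsense, assuming a hypothetical `corpus.txt` with a decent amount of text in it. A higher `order` reads more naturally but copies longer runs of the source verbatim, so that's a tradeoff to play with.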