jkap
@jkap

EDIT: after learning about https://jort.link from @lifning (thanks!) i've made some changes to redirect mastodon traffic through them (as documented on their site) so that we aren't taking the brunt of it. still planning on fixing our own caching this week, but this should be an ok band-aid for now.

we're seeing issues with the special caching we use for serving to Mastodon, which means we've had to temporarily start blocking requests from Mastodon to prevent performance issues.

the only impact (unless you're an instance admin) is that if you post a cohost link on mastodon, you won't get embed information. if you're an instance admin, you'll (probably) see an increase in failed requests to cohost (i will admit that i have no idea if this information is presented to admins).

we'll be unblocking mastodon server traffic once we've worked with fastly to fix the underlying issue this week. more technical details below the cut.

hopefully this is the last fire drill of the weekend. i'm tired. :eggbug-asleep:


i'm pretty sure i've talked about this before, but an issue with mastodon's federation model is that things like opengraph metadata (what's used to display link embed details) don't get federated with the posts. this means that if someone posts a link to cohost, every single instance that sees that link makes a request to us at the exact same fucking time.

while this isn't malicious (i consider it to be a consequence of questionable protocol design), it effectively manifests as a mini-DDoS every time someone links to us. we're not the first to mention this as an issue, it's just sort of A Thing.

in the past we've mitigated this by having a Special Caching Layer Just For Mastodon so that we don't have to fully rebuild the page every time, but for some reason this has stopped working consistently! given that more people than average are linking their other social media profiles, we're seeing occasional random spikes of up to 5x standard request load that are enough to temporarily overwhelm our autoscaling and bring the site down.
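
for the curious, the rough shape of a scraper-only cache at the CDN edge looks something like this in fastly VCL. this is a hand-wavy sketch, not our actual config (the user-agent list and TTL here are illustrative stand-ins):

```vcl
# hand-wavy sketch of a scraper-only cache, not our actual config.
# the user-agent list and TTL are illustrative stand-ins.
sub vcl_recv {
  if (req.http.User-Agent ~ "(?i)(mastodon|pleroma|akkoma)") {
    # every scraper is logged out, so cookies are irrelevant;
    # dropping them lets all instances share one cached object
    unset req.http.Cookie;
    set req.http.X-Scraper = "1";
    return(lookup);
  }
}

sub vcl_fetch {
  if (req.http.X-Scraper == "1") {
    # cache the rendered page briefly so a thousand instances
    # fetching the same link only make us build it once
    set beresp.ttl = 300s;
  }
}
```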

given that it's sunday and i don't want to deal with this too much right now, we're just doing the Easy Approach of blocking mastodon traffic at the firewall level. this week, i'll be working with our support contacts at fastly to figure out what went wrong with the caching layer and undo the blocking change.
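
the block itself is conceptually tiny. ours is at the firewall, but the same idea expressed at the edge would look something like this (again a sketch, and real mastodon user-agent strings vary by version):

```vcl
# sketch of user-agent blocking at the edge; our actual block is
# at the firewall level, and real UA strings vary by version
sub vcl_recv {
  if (req.http.User-Agent ~ "(?i)(mastodon|pleroma|akkoma)") {
    error 403 "fediverse traffic temporarily blocked";
  }
}
```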

thanks for using cohost! :eggbug:



in reply to @jkap's post:

No no, the slashdot effect was when the Popular Site posted a link to something cool someone had hosted on some little server (because it was never supposed to be a big deal) and thousands of humans saw the link and clicked on it. It wasn't slashdot itself consisting of thousands of servers that all automatically hit the linked site at the same time without any humans following the link!

um… why a Special Caching Layer Just For Mastodon, why isn't this handled by a Normal Caching Layer For Everyone?

cohost of all things—due to not having Numbers—should be able to get away with very aggressive caching! even proactive caching I'd say! my unsolicited bigbrain ideal of how cohost should work is: for active users, pushing a submit button should regenerate all pages where the change would be reflected and push those directly into cdn caches

(speculation) normal usage might not have the same load, so caching might not be needed/wanted. But for the mastodon thing, there are a lot more requests made, so higher load and more need for the cache layer. Also, because it's just opengraph embed things, it's not Critical Information compared to actually reading a post

Cohost is rendered dynamically so it can have features like edits to posts and comments propagating in real time on already-loaded pages. This is a silly feature in some ways, but it’s been here since day 1, it’s part of the design bible I suppose, and if they can ever smash the “cohost going down blanks loaded cohost pages” bug with a hammer, I don’t see the harm. Staff has stated before that image hosting takes a much larger toll on resources than stuff like this. (Excepting the current bug of course.)

So the cache is only served to Mastodon servers (and maybe eventually other robot scrapers) and not anything else. This does mean that link previews are unfortunately not going to see dynamic updates but I suppose people don’t expect that.

we have explored more aggressive caching. here's a big list of things that wouldn't work:

  • anything that depends on login state

this is almost everything on the site, including posts! things like "do we show embeds" and "do links open in an external tab" are settings, which means we need to know the value of the setting, which means the output is dependent on login state. this doesn't even get into shit like "blocking" and "private accounts" which come with their own whole new fun can of worms.

we can get away with having a mastodon/scraper-only cache because every single mastodon instance is always logged out, so login state is irrelevant. we could maybe over-engineer some sort of fragment caching system, but that's almost certainly not worth doing for a four-person team with only two engineers, especially on a site that is nearly 100% dynamic content.

oh, of course I'm talking about logged-out users, I thought that was a given. It just feels weird to me to refer to that as scraper-only when there must be a ton of logged-out human users just anonymously reading posts whenever a post gets linked somewhere, gets into search results, etc.!

we don't know if someone is logged out until they hit our server, at which point a CDN cache is no longer useful. mastodon and other scrapers have fixed (or mostly fixed) user agents, so we can hardcode them.

if you don't set any cookies for anonymous users at all (well, none with unique values), Vary: Cookie on its own (with different Cache-Control returned for anon vs logged in) could just magically solve everything (?)

otherwise, this is precisely the kind of thing CDNs are so advanced at these days. On my previous personal website I had Lambda@Edge in CloudFront inject an admin.js script tag into all responses specifically if my admin session cookie was present. Fastly's VCL can definitely look up whether a session cookie is present…
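
a minimal sketch of that check (assuming a session cookie named connect.sid, which is just a stand-in name, not necessarily what cohost actually uses):

```vcl
# minimal sketch: split cache behavior on session-cookie presence.
# "connect.sid" is a stand-in name, not necessarily cohost's cookie.
sub vcl_recv {
  if (req.http.Cookie ~ "connect\.sid=") {
    # has a session: bypass the cache and hit the origin
    return(pass);
  }
  # anonymous: no unique cookie values, so strip cookies and
  # let every logged-out visitor share one cached object
  unset req.http.Cookie;
  return(lookup);
}
```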

oh, I wasn't expecting to see my own service mentioned on here lol

for what it's worth, you can use it basically indefinitely; it's meant for this sort of thing, and i didn't even notice an increase in load when you switched it on. i have an explanation of more of this in, of all places, a comment thread on jwz's blog, with the tl;dr being that it's very efficient and serves as a moderately spicy static file server

the primary caveat is it has no privacy policy because i do not have money to write one lol

sure hope that mastodon eventually fixes this (they won't)

in the meantime i hope you enjoy the finest jort power fedi has to offer 👖👖👖👖👖👖👖👖👖👖👖👖👖👖