• ve/ver

please keep in mind that i have nsfw and normal stuff mixed because i do not care about separating concerns (but i do tag)

bop bop bopbopbopbopbop.

i live in New Zealand.

i'm 36. 🏳️‍⚧️ i feel ancient.



I have a little experience building and running a small distributed system, so the musksucking guys in his replies make me a little angry. Five servers: databases, workers, a load balancer. It might not seem like much, and the load is nowhere near Twitter's, but they have to be monitored, and they have their own failure modes. So far nothing that couldn't be fixed quickly, but the problem is that those failures cause short-lived outages, and customer dissatisfaction can follow from those.
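Even a setup that small wants babysitting. Just to give a feel for what I mean by monitoring, here's a toy liveness check in Python; the hostnames and ports are made up for illustration, not my actual setup:

    # Toy liveness check for a handful of servers.
    # Hostnames/ports are placeholders, not a real config.
    import socket

    SERVERS = {
        "db-1": ("10.0.0.10", 5432),
        "db-2": ("10.0.0.11", 5432),
        "worker-1": ("10.0.0.20", 22),
        "worker-2": ("10.0.0.21", 22),
        "lb": ("10.0.0.30", 443),
    }

    for name, (host, port) in SERVERS.items():
        try:
            # Just try to open a TCP connection within 3 seconds.
            with socket.create_connection((host, port), timeout=3):
                print(f"{name}: up")
        except OSError:
            print(f"{name}: DOWN")  # in real life this would page someone

A real setup would obviously use proper monitoring instead of a loop like this, but that's the basic idea of what has to be watched, times five, all the time.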

In the most recent incident, some events in the worker queue caused processing delays (some took 10 minutes each), which built up a big backlog: about 80,000 events stuck within half an hour. The only thing that saved us was that I happened to be up playing games at night and noticed it. It still took me almost an hour and a half to straighten out, and after that we had to keep looking for a proper fix. And this system doesn't even have much complex stuff in it, no ZooKeeper, no extra schedulers, and so on.
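What would actually have saved me there is an automated backlog alert instead of me happening to be awake. A rough sketch in Python of that kind of check, assuming a Redis list as the work queue (the queue name, threshold, and webhook URL are all made up):

    # Rough sketch: alert when the worker queue backlog gets too big.
    # Assumes the queue is a Redis list; names, threshold and webhook
    # are placeholders, not my actual setup.
    import time
    import redis
    import requests

    QUEUE_KEY = "worker:events"          # hypothetical queue name
    BACKLOG_THRESHOLD = 5000             # alert well before 80,000
    ALERT_WEBHOOK = "https://example.invalid/alert"  # placeholder

    r = redis.Redis(host="localhost", port=6379)

    while True:
        backlog = r.llen(QUEUE_KEY)
        if backlog > BACKLOG_THRESHOLD:
            # Post to whatever actually wakes a human up.
            requests.post(ALERT_WEBHOOK, json={
                "text": f"worker queue backlog is {backlog} events",
            })
        time.sleep(60)  # check once a minute

Nothing clever, just "notice the backlog before it hits tens of thousands", which is exactly the part that currently depends on my sleep schedule.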

I can't imagine what can happen on a system the size of Twitter: "1200 microservices" means a lot of servers and interconnections, and all of them are software and hardware failure points.

The erratic behaviour people are already seeing on the website suggests that something bad is piling up.

...in retrospect, still running Ubuntu 18.04 LTS on my servers is already risky in itself... and I will need to update them later... or bring up new servers, deploy 22.04 LTS or 24.04 LTS onto them, then do the switchover... which is another bunch of failure points.

