Yesterday I spent 2 hours handling an extremely fucked production change that went wrong for a bunch of reasons that had nothing to do with my team
- The oauth system was down on all the lower environment
- Due to an unexpected difference1 the software was not running correctly in prod
- Something was wrong with the service we use to send emails in prod2
I am very tired and did not have a good time. I really wish our change process did not have to start at 8pm local time (this is not a time zone issue, management just decided to be like this). Anyway, good morning on this Normal Tuesday, everyone.
-
prod was missing elasticsearch libraries for no reason, like an entire folder that is supposed to be included as a command line arg???
-
we changed it to point to a cert server instead and it worked, so at least we confirmed that it was an infrastructure problem
