14
Gnonpi
1y

So last week I really fucked up

I had this new implementation that was supposedly to be integrating smoothly into the rest of the service. It depended on a serialized model made by a data scientist. I test it in local, in QA environment: no problem.

So, Friday, 4pm, I decide to deploy to production. I check once from the app: the service throw an error. Panic attack, my chief is at my desk, we triy to understand what went wrong. I make calls with cUrls: no problem. Everything seems fine. I recheck from the app again: no problem.
We dedice to let it in prod, as the feature work. I go get some beers with the guys, to celebrate the deploy.

Fast-forward the next morning, 11am, my phone ring: it's a colleague of my chief. "Please check Slack, a client is trying to use the feature, it's broken"

FUUUUUUUUUUUUCK!!!

Panic attack again. I go to the computer, check the errors: two types of errors. One I can fix, the other from a missing package on the machine that the data guy used.

Needless to say, I had a fairly good weekend.

Lessons learned:
- make sure Dev, QA and Prod are exactly the same (use Ansible or Container)
- never deploy on a Friday afternoon if you don't have a quick way to revert

Comments
  • 11
    I don't care if you're deploying a cure for cancer. 4pm on Friday is UNACCEPTABLE.
Add Comment