
Business Continuity / DR 101...
How could GitLab go down? A deleted directory? What!
A tired sysadmin should not be able to cause this much damage.
Did they have a TESTED DR plan? An untested plan is no plan; it does not count. An untested plan is an invitation to exactly what occurred.
"The backups did not work" does not cut it - sorry, GitLab. Backups and recovery procedures need thorough testing before a disruptive event, not during one.
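
To make "tested" concrete, here is a rough sketch of the kind of automated restore check I have in mind - the backup path, the scratch database name, and the sanity query are placeholders I made up for illustration, not anything GitLab actually runs:

    # Minimal sketch of a PostgreSQL backup restore test. Assumes a
    # custom-format dump at a hypothetical path and a throwaway database
    # reserved for verification; all names here are illustrative.
    import subprocess
    import sys

    BACKUP_PATH = "/backups/latest.dump"   # hypothetical location of the newest dump
    SCRATCH_DB = "restore_check"           # scratch database used only for this test

    def run(cmd):
        """Run a command and fail loudly if it returns non-zero."""
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            sys.exit(f"FAILED: {' '.join(cmd)}\n{result.stderr}")
        return result.stdout

    # Recreate the scratch database, then restore the latest dump into it.
    run(["dropdb", "--if-exists", SCRATCH_DB])
    run(["createdb", SCRATCH_DB])
    run(["pg_restore", "--dbname", SCRATCH_DB, BACKUP_PATH])

    # Minimal sanity check: the restore should have created user tables.
    # A real test would also compare row counts against production.
    out = run(["psql", "-At", "-d", SCRATCH_DB, "-c",
               "SELECT count(*) FROM pg_tables "
               "WHERE schemaname NOT IN ('pg_catalog', 'information_schema');"])
    if int(out.strip()) == 0:
        sys.exit("FAILED: restore produced a database with no user tables")

    print("OK: backup restored and contains tables")

Run something like that from cron against every new dump, and a broken backup shows up as a failed job long before you ever need to restore for real.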
Did they do a thorough risk assessment?
We call this a 'lesson learned' in my BC/DR profession. Everyone, please learn from it.
I hope GitLab is ok.

Comments
  • 1
    I have watched their live stream for hours.
    Test plan? Yes, they tested everything; this was caused by a failing database replication system, due to a new problem they are still working to resolve. That same problem is also why the backup system failed. It has to do with a new kind of spamming attack they are dealing with right now. If the person who made the mistake hadn't done a manual backup six hours earlier, a lot of data would have been lost - not because of the mistake itself, but because of the spammer problem they are currently having.

    Disclaimer: this is what I understood from the live stream; I might have interpreted things wrong, but this is what I got from it.
  • 0
    @linuxxx Let me add, though, that although I'm very much defending them, the fuckup wasn't smart!