Soooooo, how often does it happen that someone nukes a database and the attempts to restore it fail?
Asking for a friend, who happens to be "future me".

I'm very much not responsible for fixing it, but I will have a whole lot of work...

  • 3
    Fake your future self's death 😉
  • 2
    Occasionally, most often with new employees who get a bit overconfident with their access and try to fix everything, only to accidentally fuck it up more.

    But any company and/or project worth its salt will have multiple automatic backups, strict access control, or ideally both.

    You can go to even more extreme measures, like a read-only database mirror that jumps into action if the first database goes offline. That not only serves as an additional offsite backup, but also keeps your production in a functional, although limited, mode...

    Either way, you need to build and prepare your measures based on the scale of the project and the budget. But it still stands that prevention is the best thing you can do... just don't give people the access to do that. Look into setting up proper permissions and access for everyone who *needs* it...
  • 3
    1 bad query is all it takes.

    But with automated backups (daily at minimum, ideally hourly or better, depending on budget), being able to hit restore should be all it takes.

    The last time I had a DB fail, it was because the slave DB fucked up the auto-increments and two databases started acting as the master/primary. That didn't end well, since it meant doing DB comparisons and merging the unique slave data back onto the master/primary DB.
  • 1
    You are expecting a restore to fail. Everyone is talking about backups, which is all true, but testing restores on a semi-regular basis to make sure they actually work is also a thing. Backing it up for years and never making sure it can come back will haunt you, and yes, that’s where faking your death comes in.
  • 3
    A coworker of mine didn't nuke a database (it did that to itself because the team we inherited it from built it from toothpicks and thread) but he had to spend the next week rebuilding it because the restore failed.

    It was a secrets store. Loads of passwords, API keys, etc., all inaccessible for a week.

    Restoring it was a nightmare and we now have a project going into our upcoming sprints to rebuild it out of something more robust than toothpicks and thread.
  • 1
    Even if the restoration works, the challenge would be the time it takes. You may not have the luxury of keeping your customers waiting.

    We did a batch copy covering today back to today - 21 days, and so on, until we had restored a reasonably good date range of data.

    But being in a position where even the restore won't work is much worse.
  • 0
    Nuke, hehe. Try finding out a dev corrupted the whole DB with their service. A nuke is child's play; at least the service is down and not making it worse.
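
For anyone wondering what "testing restores" from the comments above could look like in practice, here is a minimal sketch of a restore drill. It assumes SQLite (via Python's standard `sqlite3` module) as a stand-in for whatever database you actually run; `table_counts` and `restore_drill` are hypothetical helper names, not part of any library. The idea is just: take a backup, restore it into a scratch database, and check that what came back matches the source.

```python
import os
import sqlite3
import tempfile


def table_counts(conn):
    """Return {table_name: row_count} for every user table in the database."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
        )
    ]
    return {
        name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
        for name in names
    }


def restore_drill(source, backup_path):
    """Back up `source`, restore the backup into a scratch DB, and compare.

    Returns True only if the restored copy passes SQLite's integrity check
    and has the same per-table row counts as the source.
    """
    # Step 1: take the backup (assumption: a file on disk plays the role
    # of your real backup target).
    backup = sqlite3.connect(backup_path)
    source.backup(backup)
    backup.close()

    # Step 2: restore from the backup file into a throwaway in-memory
    # database, never into production.
    backup = sqlite3.connect(backup_path)
    scratch = sqlite3.connect(":memory:")
    backup.backup(scratch)
    backup.close()

    # Step 3: verify that the restored copy is usable and complete.
    integrity = scratch.execute("PRAGMA integrity_check").fetchone()[0]
    ok = integrity == "ok" and table_counts(scratch) == table_counts(source)
    scratch.close()
    return ok


if __name__ == "__main__":
    src = sqlite3.connect(":memory:")
    src.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    src.executemany("INSERT INTO users (name) VALUES (?)", [("ann",), ("bob",)])
    src.commit()
    with tempfile.TemporaryDirectory() as tmp:
        print("restore ok:", restore_drill(src, os.path.join(tmp, "backup.db")))
```

Row counts are a deliberately cheap check; a real drill would probably also compare checksums or run a few application-level queries, and would time the restore so you know how long customers would be waiting.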