10
ltlian
3y

A couple of years ago I was working on a fairly large system with a complex (by necessity) access control architecture.

As is usually the case with such projects, it's awkward for developers to repro bugs tied to a user's access in production when we aren't allowed to replicate production data in test, let alone locally.

We had a bug, and to repro it safely I ended up creating a new row in the production database for a thing I could have access to without affecting real data. I identified the bug well enough to repro it in dev/test, removed the row, and made sure everything worked normally. Whew, scary.

Have you ever walked into the office one day, and everyone is hunched over in a semicircle around one person's workstation, before one turns around to look at you and says - after a pause - "... ltlian?.."

Turns out I had basically "poisoned the well" with my dummy entity: production now threw 500s for everyone with transitive access to this post-non-entity, except me. Due to the scope of the system, it had taken about a day for this to gradually propagate through caching and eventual consistency. New entities coming in was expected; entities disappearing was not.

Luckily I had a decent enough track record for this to count as a one-off. I sometimes think about how I would explain testing in prod and making it faceplant before going home for the day, other than "I assumed it would be fine". I would fire me.

Comments
  • 2
    Phew, that escalated fast
  • 1
    Explanations for fuckups like that are always either some oversight, missing something - or malice. So "I assumed it would be fine" is the most positive variant.

    And at least you found that lingering bug about the inability to ever remove something. Was probably an interesting fix...
  • 2
    @Oktokolo It was the epitome of how you don't handle some error because "this should never occur anyway", based on assumptions about what the code and data will look like in the future.

    Luckily the fix was a simple cache purge and a couple of assertions to guard against this case. The check looks really unnecessary and it wasn't long before someone asked why it was there. Then we get to pull up a chair and go "so, the thing is -".
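
For anyone wondering what a guard like that might look like, here is a rough sketch. It is not the actual code from that system: `AccessCache`, `has_access`, and the `live_entities` set are all made-up names, and it assumes the cache maps an entity id to the users who can reach it. The idea is just that a cached reference to an entity that no longer exists gets purged and treated as "no access" instead of blowing up into a 500.

```python
from typing import Dict, Optional, Set


class AccessCache:
    """Hypothetical cache: entity id -> ids of users with (transitive) access."""

    def __init__(self) -> None:
        self._users_by_entity: Dict[str, Set[str]] = {}

    def put(self, entity_id: str, user_ids: Set[str]) -> None:
        self._users_by_entity[entity_id] = set(user_ids)

    def purge(self, entity_id: str) -> None:
        # Purging an id that is already gone is fine; removal stays idempotent.
        self._users_by_entity.pop(entity_id, None)

    def users_for(self, entity_id: str) -> Optional[Set[str]]:
        return self._users_by_entity.get(entity_id)


def has_access(
    cache: AccessCache, live_entities: Set[str], user_id: str, entity_id: str
) -> bool:
    # The "unnecessary looking" check: the entity may have been removed
    # while stale cache entries still point at it.
    if entity_id not in live_entities:
        cache.purge(entity_id)  # drop the stale entry instead of raising
        return False
    users = cache.users_for(entity_id)
    return users is not None and user_id in users
```

With this, a lookup against a deleted dummy entity quietly returns `False` and cleans up after itself, which is the part that "should never occur anyway".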