The riskiest dev choice...
How about "The riskiest thing you've done as a dev"? I have a great entry for that. and I suppose it was my choice to build the feature afterall.
I was working on an instance of a small MMO at a game company I used to work for. The MMO boasted multiple servers, each a vastly different take on the base game. We could use, extend, or outright replace anything we wanted, leading to everything from Zelda to Pokémon to an RP haven to a top-down futuristic Counter-Strike. The server in this particular instance was a fantasy RPG, and I was building it a new leveling and experience system with most of the trimmings. (Talents, feats/perks, etc. were slated for a future update.)
A bit of background first: the game's dev setup did not have the now-standard dev/staging/prod servers; everything ran on prod, devs worked on prod, players connected and played on prod, etc. Worse yet, there was no backup system implemented -- or not really. The CTO was really the only person with sufficient access. The techy CEO had it as well, but he rarely dealt with anything technical beyond the occasional bit of server hardware, and usually just to troll/punish us devs (as in "Oops! I pulled the cat5! ;)"). Neither of them was the most reliable of people, either. The CTO would occasionally remote in and make backups of each server -- whenever he happened to think of it, we assumed -- and would also do it when asked, but that could take him a week, sometimes even up to a month. So the backups were only really useful for retrieving lost code and assets, not so much for player data.
The lack of reliable backups and the lack of proper testing grounds (among the plethora of other issues at the company) made for an absolutely terrible dev setup, but that's just how it was, and that's what we dealt with. We were game devs, after all. Terrible or not, we got to make games! What more could you ask for!? It was amazing and terrible and wonderful and the worst thing ever, all at the same time. (And no, I'm not sharing the company name, but it isn't EA or Nexon, surprisingly 😅)
Anyway, back to the story! My new leveling system also needed to migrate players' existing data, so... you can see where this is going.
I did as much testing and inspection of my code as I could, copied it from a personal dev script into the server's XP system... and debated whether I really wanted to click [Apply]. Every time I considered it, I went back to check another part or do yet more testing. It took me something like 40 minutes to finally click it.
And when I did... that was the scariest button press of my life. And the scariest three seconds' wait afterwards. That one click could have ruined every single player's account, permanently lost us players ...
After applying it, I immediately checked my character to see if she was broken, checked the account data for corruption or botched flags, checked for broken interactions with the other systems....
Everything ended up working out perfectly, and the players loved all of the new features. They had no idea what went into building them, and certainly had no idea of what went into applying them, or what could have gone wrong -- which is probably a good thing.
Looking back, that entire environment was so fragile, it's a wonder things didn't go horribly wrong all the time. Really, they almost never did. Apocalypses did happen, but they were exceedingly rare and usually fixed quickly. I guess we were all super careful simply because everything was so fragile? Or the decent devs were, at least. We never trusted the lessers with access 😅 at least not on the main servers, where it mattered. Some of the smaller servers... well, we never really cared about those.
But I'm honestly more surprised to realize I've never had nightmares of that button click. It was certainly terrifying enough.
But yay! Complete system overhaul and migration of stored and realtime player data! on prod! With no issues! And lots of happy players! Woooooo!
Thinking back on it makes me happy 😊
Well, this week was a week from hell. It was a short 3-day week, and all of my internal customers, who are normally pretty reasonable, just unloaded on me at the same time. "We need this now!" "Have you done this?" "Why didn't you do that?" "We need you to do this, because our migration takes place in 30 minutes." (First notice of the migration.) Then, to top everything off, I'm creating a rollback DDL and spend a couple of hours pulling my hair out, because a set of columns that need to be rolled back aren't in prod, so I can't roll them back, because my own DDL drops them. I broadcast my meltdown to the entire DevOps team, then feel like an utter jackass once I realize my mistake. And even at quitting time, they're still walking up, and texting, and emailing. Holy f**k, I'm only going to be gone four days, two of them weekend, and then I'll be back. All of this while trying to sell my house, pack boxes, and move to an apartment. Can I retire now? Looks at retirement account... Nope, I'll be working until I'm 95. Just shoot me already!
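For what it's worth, that rollback trap, a DDL script that assumes columns exist in prod when they don't, can be fenced off by checking the schema before dropping anything. Here's a minimal sketch, assuming a MySQL-style information_schema and PDO; every table and column name below is invented for illustration:

```php
<?php
// Sketch: defensive rollback that only drops columns actually present
// in the target schema. MySQL has no DROP COLUMN IF EXISTS, so asking
// information_schema first is the usual workaround. All table and
// column names here are invented.
$pdo = new PDO('mysql:host=localhost;dbname=prod', 'deploy', 'secret');

function columnExists(PDO $pdo, string $table, string $column): bool
{
    $stmt = $pdo->prepare(
        'SELECT COUNT(*) FROM information_schema.COLUMNS
          WHERE TABLE_SCHEMA = DATABASE()
            AND TABLE_NAME = ? AND COLUMN_NAME = ?'
    );
    $stmt->execute([$table, $column]);
    return (bool) $stmt->fetchColumn();
}

// Columns the rollback is supposed to remove:
foreach (['new_col_a', 'new_col_b'] as $col) {
    if (columnExists($pdo, 'orders', $col)) {
        $pdo->exec("ALTER TABLE orders DROP COLUMN `$col`");
    } else {
        // The meltdown scenario: the column was never in prod to begin with.
        echo "skip: orders.$col not present in this environment\n";
    }
}
```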
I'm currently between jobs and have a few rants about my previous job (naturally). In retrospect, it's somewhat therapeutic to rant about the sheer brainfuckery that took place. Enjoy!
First, let me set the scene: a legacy B2B web app made with the LEMP stack and Sencha Ext JS 3 + 4 (don't ask), plus a lot of madness. Let's call that app "Alpha".
Alpha is a self-made CMS built for typical ERP stuff. Yes, a self-made CMS: entities are containers, containers have types and fields and values. Like so many legacy PHP apps, it has no dedicated FE: the HTML is rendered on the server and then spewed out to the browser.
Easy, right? Coding like it's 1999! But there was a twist: because everything is basically a container, the HTML templates are saved in the DB. Along with the necessary JS and the CSS. And the translation variables. Why? Because fuck you, that's why. Who needs a git history anyway.
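To make that concrete, here is a minimal sketch, in PHP/PDO, of what rendering a single page looks like when the entities, the fields, the values, and the templates themselves all live in the DB. Every table and column name is invented; Alpha's real schema isn't reproduced here:

```php
<?php
// Sketch: page rendering when entities, fields, values, AND templates
// all live in the DB (an EAV-style layout). Names are illustrative.
$pdo = new PDO('mysql:host=localhost;dbname=alpha', 'alpha', 'secret');

$containerId = 42; // hypothetical id of the page being rendered

// Every field of a page is another join: containers -> fields -> values.
$stmt = $pdo->prepare(
    'SELECT f.name, v.value
       FROM containers c
       JOIN fields   f ON f.container_id = c.id
       JOIN `values` v ON v.field_id     = f.id
      WHERE c.id = ?'
);
$stmt->execute([$containerId]);
$data = $stmt->fetchAll(PDO::FETCH_KEY_PAIR);

// The HTML template (plus its JS, CSS, and translations) is a DB row,
// not a file under version control.
$tpl = $pdo->query(
    "SELECT html FROM templates WHERE name = 'invoice'"
)->fetchColumn();

// Server-side render: substitute {{field}} placeholders and spew it out.
echo preg_replace_callback(
    '/\{\{(\w+)\}\}/',
    fn (array $m) => htmlspecialchars($data[$m[1]] ?? ''),
    $tpl
);
```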
For some reason, Alpha was kinda slow.
There was also an editor that let you modify templates (web, mail, PDF) on the fly in prod. Because templates contain repeating data (header/footer), one template could contain additional templates. Much confusion. You could change templates via migration (slow, boring) or just ctrl-c/ctrl-v that sucker (fast, much excitement).
Did I mention Alpha was slow?
On with the rant: e-mails! How do they work? No one knows. How do you send mail asynchronously in PHP? Witchcraft is the only possible answer to that riddle. Here is your enterprise™ solution:
1. create mail
2. insert mail into DB
3. WAIT UP TO 59 SECONDS FOR A FUCKING CRON TO SEND MAIL
Why? "Because that way, we can resend mails in case the network is down :)"
Same procedure for the SOAP-API (db-queue + cron). You read that right: all requests to various other systems are processed once a minute.
Alpha was only one of several systems. Imagine a bunch of monolithic PHP apps, interconnected via SOAP, REST, and GraphQL like a goddamn intergalactic orgy. Imagine having to debug that clusterfuck.
Let's say there is a bad request. These things happen. No biggie. Remember the db-queue? Let's try to send the bad request a second time! And a third time! Still no luck? How odd. Let's create a specific file in a specific directory: a LOCK-file. Now, "the db-queue is on hold and no request gets processed :)"
Golly gee thanks Alpha.
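Pieced together, that retry-then-halt behavior looks roughly like the sketch below; the lock path, the retry limit, and the sendSoapRequest() helper are all invented stand-ins, not Alpha's real code:

```php
<?php
// Sketch of the cron worker's LOCK-file behavior described above.
const LOCK_FILE = '/var/run/alpha/queue.LOCK';
const MAX_TRIES = 3;

// Hypothetical stand-in for the real SOAP call.
function sendSoapRequest(string $payload): bool
{
    return false; // pretend the remote end keeps rejecting us
}

if (file_exists(LOCK_FILE)) {
    exit; // queue "on hold": nothing at all gets processed until a
          // human notices and deletes the file by hand
}

$pdo = new PDO('mysql:host=localhost;dbname=alpha', 'alpha', 'secret');
$requests = $pdo->query(
    'SELECT id, payload, tries FROM request_queue WHERE done = 0 ORDER BY id'
);

foreach ($requests as $req) {
    if (sendSoapRequest($req['payload'])) {
        $pdo->prepare('UPDATE request_queue SET done = 1 WHERE id = ?')
            ->execute([$req['id']]);
    } elseif ($req['tries'] + 1 >= MAX_TRIES) {
        // One poisoned request halts the entire queue, not just itself.
        touch(LOCK_FILE);
        break;
    } else {
        $pdo->prepare('UPDATE request_queue SET tries = tries + 1 WHERE id = ?')
            ->execute([$req['id']]);
    }
}
```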
Anyhow, did you know that MySQL has a join limit of 61 tables?