SkillsAWS Infrastructure provisioning, Python, Java, Shell, PowerShell
Joined devRant on 12/13/2018
Do all the things like ++ or -- rants, post your own rants, comment on others' rants and build your customized dev avatarSign Up
The bug I never fixed isn't a bug in code I wrote, but rather an OS problem I've given up on fixing.
I dual-boot Windows and Linux on my desktop PC. Every time Windows updates, it switches from grub to the Windows bootloader, making it impossible to boot into Linux. I've fixed it three times (each time requiring a different fix, from disabling fast startup to reinstalling Grub from a live USB), then gave up. My desktop PC is now a Windows machine. I'm upgrading some parts soon (including replacing my boot drive with an NVMe SSD) so I decided when I do that, I'm just going to reinstall Linux on the new drive and see how long I can last without installing Windows at all.5
The last eight years were fun, but I ran out of space while trying to compile a project, and, well, your number came up. I'm sorry...
I need a bigger SSD. I launched Visual Studio (which I rarely use so it only had the default extensions installed) to clone and build the new Windows Terminal to see what it's like. Had to download over 10GB of extensions and features first, and then compiling the project ate up every last byte of remaining space.14
My team handles infrastructure deployment and automation in the cloud for our company, so we don't exactly develop applications ourselves, but we're responsible for building deployment pipelines, provisioning cloud resources, automating their deployments, etc.
I've ranted about this before, but it fits the weekly rant so I'll do it again.
Someone deployed an autoscaling application into our production AWS account, but they set the maximum instance count to 300. The account limit was less than that. So, of course, their application gets stuck and starts scaling out infinitely. Two hundred new servers spun up in an hour before hitting the limit and then throwing errors all over the place. They send me a ticket and I login to AWS to investigate. Not only have they broken their own application, but they've also made it impossible to deploy anything else into prod. Every other autoscaling group is now unable to scale out at all. We had to submit an emergency limit increase request to AWS, spent thousands of dollars on those stupidly-large instances, and yelled at the dev team responsible. Two weeks later, THEY INCREASED THE MAX COUNT TO 500 AND IT HAPPENED AGAIN!
And the whole thing happened because a database filled up the hard drive, so it would spin up a new server, whose hard drive would be full already and thus spin up a new server, and so on into infinity.
Thats probably the only WTF moment that resulted in me actually saying "WTF?!" out loud to the person responsible, but I've had others. One dev team had their code logging to a location they couldn't access, so we got daily requests for two weeks to download and email log files to them. Another dev team refused to believe their server was crashing due to their bad code even after we showed them the logs that demonstrated their application had a massive memory leak. Another team arbitrarily decided that they were going to deploy their code at 4 AM on a Saturday and they wanted a member of my team to be available in case something went wrong. We aren't 24/7 support. We aren't even weekend support. Or any support, technically. Another team told us we had one day to do three weeks' worth of work to deploy their application because they had set a hard deadline and then didn't tell us about it until the day before. We gave them a flat "No" for that request.
I could probably keep going, but you get the gist of it.7
Spent three days banging my head against my desk trying to get an AWS Lambda function to work, only to finally discover that my code was perfectly functional and it was a security group problem. It was supposed to send a POST request to a load balancer's URL but couldn't resolve the hostname because the security group blocked a necessary outbound port for DNS requests.
That's what I get for not troubleshooting at the infrastructure level when experiencing connection issues. I did not spend two years doing tech support just to forget basic troubleshooting steps now that I'm in the DevOps field...1
Meetings, responding to emails, handling urgent tickets, etc. If I could just get four uninterrupted hours of coding in a day, I'd be happy. But I'm basically in meetings all morning and usually have at least 1 more in the middle of the afternoon.2
My very first rant here was about the mess of ticket submission and ticket tracking applications we use, and about how we were moving to a single unified system some day.
Well, that day is today. And, predictably, it went horribly wrong.
So the way it's supposed to work is people login to the portal, search for what they want to request, then fill in details and submit. It creates a request ticket assigned to the appropriate team. (The old way involved a bunch of nonsense that you can see in my first rant).
The thing is, I found out about this today, when I got a company-wide email saying the new system was live as of this morning. None of us knew it would happen today. Not that I could've foreseen any issues just by getting the announcement early, but still, usually people find out about these things beforehand.
So, ecstatic to finally be rid of the old ticket tracking system, I log into the new system and look for our request form, which is, of course, not there. I check the old system and see that they combined every single "general request" into a single request where you pick which team the request goes to.
So I finally find the right request, pick the right department from the drop-down, and see that the request looks much better than it did on the old system. Out of curiosity, I look at the list of people who are part of that department.
I am not on the list.
My ENTIRE TEAM is not on the list.
Because they migrated the team data to the new system a year ago, when the issue tracking/reporting portion of it went live. My current team was hired approximately six months after that and apparently updating the team data in the new system isn't part of our Onboarding process yet.
So... Bright side is I guess I will have a lot of free time soon since nobody can submit new project work to my team?
tl;dr: they took a great software product and implemented it so poorly that our team can't use it.3
The second time I dropped out of college. I wasn't sure why at the time, but I just couldn't manage to keep myself motivated and interested. Later on I was diagnosed with ADD and all my school problems made sense, but at the time, I thought that since I'd tried and failed twice to get a computer science degree, I wasn't qualified to be a developer.
And now I have a degree and a dev job and I know what's actually wrong with my brain and how to deal with it.13
Wow, Corsair pulled no punches today. I think I like their April Fools joke the best so far.
Mostly indie games these days, since I have hundreds of them in my Steam library and haven't played most of them. Last week I tried Factorio and now I'm obsessed with it.13
Finish a personal project.
Any personal project. I'm not picky. The last time I completed a project that wasn't for my job was like five (well, now six) years ago.
On the nth day of Christmas, my true love gave to me:
1 misconfigured autoscaling group
200 unnecessary servers
29 urgent emails
3 support teams that would have fixed the problem in an hour of anybody had bothered going to them first
Ugh. Idiots. Somehow the whole issue was caused by a single full hard disk, which caused database transactions to fail, which caused the group to scale out and spin up new instances, which didn't actually fix the problem so it kept scaling until it hit the limit and then continuously failed to create new instances for several hours straight, generating loads of notification emails and generally causing problems for everyone involved.
Thinking of developing a mod for a game I play, but only thinking about it so far. No coding yet. Probably no coding until I'm back at work on the 26th.1
Our ticket tracking system and our IT service request system are from two different companies that are direct competitors. The source code is full of temporary hacks to just make them play nice until a better solution is worked out. Fast forward a few years and we're abandoning both systems in favor of a single, unified system that handles everything. We currently have maybe 20% of the new, unified system done, which is now hacked together with both of the legacy systems until we finally transition fully to the new system. The current plan is for next year, but the plan six months ago was for this year, and almost no progress has been made since then, so we're probably going to have two ticket trackers and two request systems for a while.
Actually, three ticket trackers and three request systems. The third ticket tracker is used to track work done on tickets that exist in the legacy tracker because the legacy tracker can't do that on its own, while the third request system is the oldest and most cumbersome legacy system of them all.1