Joined devRant on 12/13/2017
Do all the things like ++ or -- rants, post your own rants, comment on others' rants and build your customized dev avatarSign Up
From the creators of devRant, Pipeless lets you power real-time personalized recommendations and activity feeds using a simple APILearn More
I said I wanted Ubuntu, and I got Ubuntu 18.04
Why you might ask? Because the older LTS has features that have had more time for bugs to be fixed.8
It's not a matter of simply "unlocking it". It's a bug that needs to be researched, fixed, tested, and released.
So I wrote these E2E tests to test my credit card expiration notification emails. So I wrote my code, and tested it. Tests failed. I spent the next 6 hours (spanning 2 days) debugging my tests. Come to find out that the tests were fine all along. The issue was my code.
Apparently everything has dates starting at 1 (day starts at 1, year starts at 1). But MONTHS. Months start at 0
When your boss asks what you're working on and you say "Fixing a bug that causes new subscriptions to be prorated to line up with old ones." and he says "Terrible. Needs immediate fix."
"I know! That's what I'm doing!!"1
So I looked at our dashboard and noticed a banner mentioning scheduled maintenance set for 7:00 AM. And I thought to myself, "I never released an update, and even if I had, the maintenance would be performed 15 minutes after the build finished, not at 7:00 AM." So I emailed my coworkers, asking if they had put up the banner, no, no. I started pulling my hair out trying to figure out what caused this banner to be created. Was there some old job that was just now running? I combed through the server logs, thousands of entries later, and I found the banner was installed by some user with the IP 172.18.0.1...which was the local machine. I went through all the users on the system, running atq to see if anyone had jobs scheduled. And there was one job scheduled, under the root user. At that moment, I legit thought to myself, "have we been hacked? How is that possible?" It's wasn't! Then I looked under /var/spool/atjobs to see what the job actually was. And then I saw it. My weekly updater cron job had installed updates and had scheduled a maintenance window to reboot the system. And I smiled, realizing that my code was now sentient.1
And hours of debugging later, I realized that:
"xyz-" + cond ? "one" : "two"
is different than
"xyz-" + (cond ? "one" : "two")
I hate Groovy truth.2
So a user reported they couldn't login to our site, so I reset their password to:
And sent them back an email with the updated password. A few minutes later, they replied and said that password didn't work. They even tried a different web browser, etc. I tried it myself, and sure enough, it didn't work.
I spent the next several hours trying to figure out why the password didn't save properly, or why the logic didn't compare them correctly. Perhaps it was some sort of caching issue? Oh the horror.
As it turns out, the problem was a maxlength of 28 on the login form field:
<input type="password" name="password" value="" maxlength="28"/>
I don't know who wrote that code, but it sure wasn't me.23
So I recently finished a rewrite of a website that processes donations for nonprofits. Once it was complete, I would migrate all the data from the old system to the new system. This involved iterating through every transaction in the database and making a cURL request to the new system's API. A rough calculation yielded 16 hours of migration time.
The first hour or two of the migration (where it was creating users) was fine, no issues. But once it got to the transaction part, the API server would start using more and more RAM. Eventually (30 minutes), it would start doing OOMs and the such. For a while, I just assumed the issue was a lack of RAM so I upgraded the server to 16 GB of RAM.
Running the script again, it would approach the 7 GiB mark and be maxing out all 8 CPUs. At this point, I assumed there was a memory leak somewhere and the garbage collector was doing it's best to free up anything it could find. I scanned my code time and time again, but there was no place I was storing any strong references to anything!
At this point, I just sort of gave up. Every 30 minutes, I would restart the server to fix the RAM and CPU issue. And all was fine. But then there was this one time where I tried to kill it, but I go the error: "fork failed: resource temporarily unavailable". Up until this point, I believed this was simply a lack of memory...but none of my SWAP was in use! And I had 4 GiB of cached stuff!
Now this made me really confused. So I did one search on the Internet and apparently this can be caused by many things: a lack of file descriptors or even too many threads. So I did some digging, and apparently my app was using over 31 thousands threads!!!!! WTF!
I did some more digging, and as it turns out, I never called close() on my network objects. Thus leaving ~30 new "worker" threads per iteration of the migration script. Thanks Java, if only finalize() was utilized properly.1