Search - "troubleshooting"
-
What an absolute fucking disaster of a day. Strap in, folks; it's time for a bumpy ride!
I got a whole hour of work done today. The first hour of my morning because I went to work a bit early. Then people started complaining about Jenkins jobs failing on that one Jenkins server our team has been wanting to decom for two years but management won't let us force people to move to new servers. It's a single server with over four thousand projects, some of which run massive data processing jobs that last DAYS. The server was originally set up by people who have since quit, of course, and left it behind for my team to adopt with zero documentation.
Anyway, the 500GB disk is 100% full. The memory (all 64GB of it) is fully consumed by stuck jobs. We can't track down large old files to delete because du chokes on the workspace folder, which has thousands of subfolders, and there's no RAM to spare. We decide to basically take a hacksaw to it, deleting the workspace for every job not currently in progress. This of course fucked up some really poorly-designed pipelines that relied on workspaces persisting between jobs, so we had to deal with complaints about that as well.
So we get the Jenkins server up and running again just in time for AWS to have a major incident affecting EC2 instance provisioning in our primary region. People keep bugging me to fix it, I keep telling them that it's Amazon's problem to solve, they wait a few minutes and ask me to fix it again. Emails flying back and forth until that was done.
Lunch time already. But the fun isn't over yet!
I get back to my desk to find out that new hires or people who got new Mac laptops recently can't even install our toolchain, because management has started handing out M1 Macs without telling us and all our tools are compiled solely for x86_64. It took some troubleshooting to even figure out what the problem was, because the only error people got from Homebrew was that the formula was empty when it clearly wasn't.
After figuring out that problem (but not fully solving it yet), one team starts complaining to us about a GitHub problem because we manage the GitHub org. Except it's not a GitHub problem, and I already knew this because they are a Problem Team that uses some technical authoring software with Git integration but has only the barest understanding of what Git actually does. Turns out it's a Git problem. An update for Git was pushed out recently that patches a big bad vulnerability, and the way it was patched causes problems because they're using Git wrong (multiple users accessing the same local repo on a Samba share). It's a huge vulnerability, so my entire conversation with them went sort of like:
"Please don't."
"We have to."
"Fine, here's a workaround, this will allow arbitrary code execution by anyone with physical or virtual access to this computer that you have sitting in an unlocked office somewhere."
"How do I run a Git command I don't use Git."
So that dealt with, I start taking a look at our toolchain, trying to figure out if I can easily just cross-compile it to arm64 for the M1 MacBooks or if it will be a more involved fix. And I find all kinds of horrendous shit left behind by the people who wrote the tools that, naturally, they left for us to adopt when they quit over a year ago. I'm talking entire functions in a tool used by hundreds of people that were put in as a joke, poorly documented functions I am still trying to puzzle out, exactly zero comments in the code, and abbreviated function names like "gars", "snh", and "jgajawwawstai".
While I'm looking into that, the person from our team who is responsible for incident communication finally gets the AWS EC2 provisioning issue reported to IT Operations, who sent out an alert to affected users that should have gone out hours earlier.
Meanwhile, according to the health dashboard in AWS, the issue had already been resolved three hours before the communication went out and the ticket remains open at this moment, as far as I know.
-
Greatest thing just happened.
Get a ticket about orders not being processed in our webshop. Angry customer. Critical!!!!
Starts troubleshooting. Nothing has changed in the code recently, was working just fine yesterday. Works locally and on test server. Hmmm...
Take a chance. Writes back to customer: “there! Try to place an order again” without changing anything.
5 minutes later I get back “awesome! Everything works again and all previous orders have appeared. Good work!”.
Happy customer. Happy dev :)
Fin
-
I need a new 'main' language to do all my projects in as java is kind of grinding away at my psyche.
Golang I liked quite a lot when I used it for my job a year ago, I'll give that a try..
Golang installed and up and working fine.
Oh, I know, let's see if there are GLFW bindings for golang. And sure there are, let's go!
Oh I need gcc and mingwex + mingw32 which I will acquire through cygwin.
hmm.. mingwex + mingw32 not found and my drive is almost full. I'll reinstall on my D:\ drive before continuing troubleshooting.
> Delete C:\Cygwin Access denied.
> cmd rmdir c:\test /s /q Access denied.
> Change permissions Access denied.
No problem I just don't own this object!
> Change to be the 'object owner' Success!
> Change permissions Success!
> Delete C:\Cygwin Access denied.
> cmd rmdir c:\test /s /q Access denied.
> takeown /F C:\Cygwin /A /R /D Y Success!
> cmd rmdir c:\test /s /q Access denied.
At this point it would be more efficient to manually open up my ssd, and using a fridge magnet change every single bit to be exactly what I want it to be.
Or install Linux.
-
I bricked my Manjaro install by interrupting a kernel update like a fucking doofus.
After two hours of painstaking troubleshooting using a live image I finally resolved the issue. And man is that a good feeling. Solving complex problems (at least to me) on Linux is just such an amazing feeling ♥️
-
Alright,
I recently installed pi-hole...
Everything was immediately perfect.
So, about two days later, I install a Linux system... Hadn't had one when I set up my pi-hole. (Well, no Linux with a desktop environment...)
So... Now I had error messages in Chrome... Connection change detected. The page didn't load, 3 seconds later it loaded. Many pages had to be reloaded.
And I focused my Google-Fu on issues connecting to pi-hole. Some issues were there, referring to Safari and pi-hole, but none for Chrome and/or Linux.
But what's a pi-hole? A DNS resolver/non-authoritative server and a DHCP server...
Maybe I haven't turned off my router's DHCP server correctly. So, wireshark... "bootp or dns" filter...
All dns communication is perfect, via UDP and from the pi-hole to my machine, not from the router. No DHCP messages from my router either...
Almost accidentally I found a page speaking about this issue. Had nothing to do with the pi-hole. Timing was a coincidence. Had everything to do with IPv6. Somehow that's switching over. Even worse, after reading that, I remembered I had the same issue in the past. I just forgot.
Turning off IPv6 was the solution. And fuck. Let this be a PSA: "Confirm your bloody assumptions when troubleshooting/debugging or waste time like an idiot... Just like me..."
-
It works locally, it works in Dev, it works in Test, but fails to deploy in UAT. Is it a data issue? I don't know, I don't have permissions to see the UAT database. Literally all I know is that this API is returning 500 instead of what it's supposed to return, but only sometimes.
Guess I'll sit here all day and try to solve the problem telepathically as there is literally no way of troubleshooting other than scrolling through the code and hoping that a cartoon lightbulb appears above my head.
-
Before becoming a developer I used to work in IT, and I really liked the fact that I can solve so many computer problems and the troubleshooting part.
Now I just feel very stupid when something is not working the way it's supposed to.
-
I had some fun with ChatGPT today. I wondered how good its problem solving skills are. Turns out, it's no better than an entry/junior dev armed with all the docs out there - it knows what's written there, how to use the thing (language/framework/tool/etc.), but it has no "understanding" of either the problem or the tool in a holistic way. It's got the knowledge, but it has neither the skill nor the understanding of how/why to use it to solve a problem (any problem beyond plain simple complexity).
So the problem I asked it to solve was related to this one I had: https://devrant.com/rants/6312527 .
It was painful to troubleshoot this problem with ChatGPT. It kept on focusing on this particular problem and reacting to errors while trying to fix its initial solution. It took us a good while. Eventually, it reached a working solution, but it was an ugly, convoluted approach that was not feasible to cover my use case with.
FWIW I think it is interesting to follow its line of thought. Eventually, a pattern emerges of how it tries to solve the problem. And it reminds me a lot of myself on the first week in the IT field :)
-
Urgh. One key skill that wannabes seem to forget is patience. Patience, patience, patience. Don't panic, don't be lazy, be methodical. This is the way of the analytical computer scientist. Don't panic all over the place or make assumptions..
Some techs..
-
I like talking to uber drivers with some limited tech experiences. My uber today was telling me about when he was helping his kid pick parts for their custom desktop as well as setting it up and the weird issues they ran into over time due to it being their first time installing windows from nothing and their troubleshooting process to solve things I might consider basic problems.
Simple things, yeah. But still interesting to listen to. Maybe I'm just simple and easily entertained.
-
I will keep this short. I fucking hate Windows 11. There is nothing I like about it after over four weeks of having its fuckery drip down everything I do on my laptop like radioactive maple syrup. None of my apps from Windows 10 work. I google troubleshooting and I'm not going to go through 10 hacks to solve a problem created by Microsoft. The screen moves all over the place for no reason. I hate it. Not as much as I hated Mac, but I'm going to revert back to Windows 10 if I can. I don't wish to separate my laptop screen from my laptop keyboard again. The only person I know who can fix it tried to steal a hundred and twenty bucks from me. Thank you for reading this rant I'm living a charmed life otherwise, but snipping tool just fucked up and I'm fucking fed up. Peace out.
-
Junior Software Developer Job ($37k-$42k USD)
- 1 year experience
- J2EE, JavaScript, HTML, XML, SQL
- Object-oriented design and implementation
- Management of relational and non-relational databases such as Oracle, PostgreSQL and Cassandra
- Lifecycle and Agile methods
- Familiarity with the Eclipse development environment and with tools such as Hibernate, JMS, Tomcat/Gemini/Jetty, OSGi
- UNIX skills, including Bash or other scripting language
- Experience installing and configuring software packages
- ActiveMQ troubleshooting/knowledge
- Experience in scientific data processing and analytical science in general
- Automated testing tools and procedures, including JUnit testing, Selenium, etc.
- Experience in interfacing with scientific instrumentation, potentially over IP networks
- Familiarity with modern web development, user interface and other ever-evolving front-end technologies, such as React, TypeScript, Material, Jest, etc.
I am betting they don't get many people applying.
-
Idk why but this morning I was thinking about this high school elective class where we learned Adobe flash. But specifically 2 instances where I ignored the teacher and did my own thing
1. We were using sprite sheets and he had us use Photoshop to cut out each sprite to a different layer and manually save each sprite one by one to disk to use in Flash. Some sheets had 50 fucking sprites.
So I found a script for Adobe (ActionScript, I think they called their JavaScript derivative) that exported every layer for me without all the manual clicking. There is probably an even better way, but this worked for how lazy I was back then.
2. For our final projects we could do anything, but he suggested not doing anything too complicated 'cause of time constraints, and he barely taught us the scripting language for Adobe Flash, so making Flash games was almost out of the question.
Me being stupid, I really wanted to make a working Pong game. So I spent too long watching a German (I don't know German) tutorial video I found and troubleshooting outdated code from that video, improving things where I could with my limited knowledge, made worse 'cause I wasn't interested in programming and didn't start learning Python until the following year.
Yeah, don't know why I was thinking about those. But I feel it's a good perspective on how far I've come: from hacking together a Pong clone with no skills to being hired to automate and optimize processes and legacy projects.
-
I have this habit of whenever I run up against an issue at work programming-wise (a step definition doesn’t behave the way I think it should, I have an error in the console when I attempt to do something and have to work out how to clear the error, etc.), I document the issue and the solution somewhere in the Slack.
This serves two purposes: discoverability for others who might run into that issue later and DISCOVERABILITY FOR MYSELF WHEN I INEVITABLY ENCOUNTER THE SAME ISSUE.
-
I was troubleshooting an asyncio websocket project for two hours that was working the previous day with minimal changes.
I am new to asyncio so I was getting very frustrated thinking I wasn't understanding something.
As it turns out, I had replaced a continue with a return, killing the broadcast function completely.
-
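The asyncio rant above doesn't show its code, so this is only a minimal sketch of how swapping `continue` for `return` can kill a broadcast loop. `FakeClient` and the `stop_on_failure` flag are hypothetical stand-ins invented for the demonstration; a real websocket library would raise its own exception types.

```python
import asyncio


async def broadcast(clients, message, stop_on_failure=False):
    """Send message to every client; return how many sends succeeded.

    `clients` is any iterable of objects with an async `send` method
    (hypothetical stand-ins for websocket connections).
    """
    delivered = 0
    for client in clients:
        try:
            await client.send(message)
        except ConnectionError:
            if stop_on_failure:
                return delivered  # the bug: abandons every remaining client
            continue  # intended: skip the dead client, keep broadcasting
        delivered += 1
    return delivered


class FakeClient:
    """Hypothetical in-memory client used only for this demonstration."""

    def __init__(self, alive=True):
        self.alive = alive
        self.inbox = []

    async def send(self, message):
        if not self.alive:
            raise ConnectionError("client went away")
        self.inbox.append(message)


def demo():
    """Run the broadcast once with `continue` and once with `return`."""
    results = []
    for buggy in (False, True):
        clients = [FakeClient(), FakeClient(alive=False), FakeClient()]
        delivered = asyncio.run(broadcast(clients, "hi", stop_on_failure=buggy))
        results.append((delivered, [len(c.inbox) for c in clients]))
    return results
```

With `continue`, the third client still gets the message after the second one fails; with `return`, everything after the first dead client is silently dropped, which looks a lot like "the broadcast stopped working".
-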
Programming at a job to me is no longer creating something fun and valuable; it's more like figuring out why shit doesn't work, con-stant-ly.
It's like coming in to your desk every morning, dreading the day because there's yesterday's shit to fix. "Hmm, what shall today be like? Oh yes, troubleshooting why my database model doesn't work, redesign it completely and break my mind over db details. The next day? Having to redesign my classes to implement new patterns because apparently the current design isn't good enough." Even if you work on new deliverables, that's just new problems in disguise anyway.
Pleasant? Not really.
lol.
-
Who asked for RedDatabase and RedXpert? You guessed it - nobody 😑 It's a buggy Firebird and DBeaver domestic knock-off!
My student was assigned to build a Flask app to "simple" requirements. But guess what? We can't figure out hecking RedDatabase?! Turns out they sent an incompatible *.fdb database file, and we wasted an entire 3 hours troubleshooting an obscure error, while a clean database doesn't cause any trouble.
The last error, the one that completely drained us, was the following:
"""
Reason: unsupported on-disk structure for file /var/rdb/test.fdb; found 12.3, support 12.2; IProvider::attachDatabase failed when loading mapping cache [SQLState:HY000, ISC error code:335544379]
"""
So now he is basically recreating the database from the schema in an image. What also seems shady to me is that the application has to be deployed on a virtual OS, which he can bring on a USB stick or via the cloud later.
-
Ok, we were troubleshooting a network connection problem. My boss told me: use fping, a small command line utility that gives you a timestamped ping. We can then check when the connection went down. Ok. Since I've always advocated the importance of knowing advanced scripting tools, I tried to do it with PowerShell. I've been playing with Test-Connection for an hour to try to get not only the timestamp when the connection is ok, but also the timestamp when the connection is down. Don't want to go into details. I've just one question: is a solution that takes, say, 20 lines of code for such an easy task proof that the system works, or that it doesn't? To make a long story short, now I'm downloading fping.
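For what it's worth, the "timestamp when it's up and when it's down" part is small in most scripting languages. A rough Python sketch follows, under loud assumptions: it swaps in a TCP connect check for an ICMP ping (raw ICMP usually needs elevated privileges), and `tcp_probe`, `transitions`, and the default port 443 are names and values made up here, not anything fping or Test-Connection provides.

```python
import socket
from datetime import datetime, timedelta


def tcp_probe(host, port=443, timeout=1.0):
    """Return True if a TCP connection to host:port opens within timeout.

    Assumption: a TCP connect stands in for an ICMP echo here.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def transitions(samples):
    """Yield (timestamp, state) each time reachability changes.

    `samples` is an iterable of (timestamp, is_up) pairs in time order.
    The first sample is always reported so the log has a starting state.
    """
    previous = None
    for timestamp, is_up in samples:
        if is_up != previous:
            yield timestamp, "UP" if is_up else "DOWN"
            previous = is_up


def example_log():
    """Build a fake five-second history and report only the state changes."""
    start = datetime(2024, 1, 1, 12, 0, 0)
    history = [True, True, False, False, True]
    samples = [(start + timedelta(seconds=i), up) for i, up in enumerate(history)]
    return [(t.isoformat(), state) for t, state in transitions(samples)]
```

In real use you would feed `transitions` a generator that calls `tcp_probe` once a second and pairs each result with `datetime.now()`; `example_log` only uses canned samples so the logic can be checked without a network.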
-
I just found out I've wasted a month of my life troubleshooting the wrong component.
I was unable to access my application in the cluster and suspected networking and port configuration. (A custom corporate setup with chaotic documentation did not help the situation.)
In the end, it was caused by Jenkins, which failed to build a new container but still showed "build: success" while it deployed the May version of the container without any changes applied.
-
A few years ago we had a fail-over which was successful until we started failing everything back to primary servers. The applications could not start at all.
4 hours of troubleshooting, only to find out some Java security files were misbehaving. We updated them from another server and it worked.
To this day I haven't understood how it failed.
-
Hey guys, I decided to post something useful here, rather than just complaining.
I had this problem where Google app sign-in loads forever. I was wondering if anyone else ever had this problem.
So, it turns out there's a param called requestidlecallback in Settings, Safari, Advanced, Experimental. It should be off.
If it's not off and you're trying to sign in to Google on an app, force stop the app, turn the param off, then force stop Settings, then restart your computer.
-
My first interaction with a computer was probably playing Parsec on an old TI-99/4A we dug out of the attic. After that there were a lot of troubleshooting sessions with my dad on various computers trying to get some game up and running. I still remember the IRQ/DMA combination needed to get sound in Duke.
It really is no mystery why I ended up working with this stuff.