Join devRant
Do all the things like
++ or -- rants, post your own rants, comment on others' rants and build your customized dev avatar
Sign Up
Pipeless API
From the creators of devRant, Pipeless lets you power real-time personalized recommendations and activity feeds using a simple API
Learn More
Search - "production bug"
-
Haven't slept in the last 72 hours, eaten in 24 and shaved or showered in 48+ .. but it is such a delight to move the project to production an hour before the deadline and two hours later to receive an angry phone call from the client because there is 'horrible bug' in the web system - the logo of his company wasn't showing, only the name ... the moron never sent us a logo to begin with, only a MS word document with the company's information and a compressed 200x80 logo in the bottom ...12
-
Our testers want us devs to start paying 50 cents for each bug the testers find. But they won't agree to paying us devs 50 cents for each bug that makes it into production.9
-
I worked with a good dev at one of my previous jobs, but one of his faults was that he was a bit scattered and would sometimes forget things.
The story goes that one day we had this massive bug on our web app and we had a large portion of our dev team trying to figure it out. We thought we narrowed down the issue to a very specific part of the code, but something weird happened. No matter how often we looked at the piece of code where we all knew the problem had to be, no one could see any problem with it. And there want anything close to explaining how we could be seeing the issue we were in production.
We spent hours going through this. It was driving everyone crazy. All of a sudden, my co-worker (one referenced above) gasps “oh shit.” And we’re all like, what’s up? He proceeds to tell us that he thinks he might have been testing a line of code on one of our prod servers and left it in there by accident and never committed it into the actual codebase. Just to explain this - we had a great deploy process at this company but every so often a dev would need to test something quickly on a prod machine so we’d allow it as long as they did it and removed it quickly. It was meant for being for a select few tasks that required a prod server and was just going to be a single line to test something. Bad practice, but was fine because everyone had been extremely careful with it.
Until this guy came along. After he said he thought he might have left a line change in the code on a prod server, we had to manually go in to 12 web servers and check. Eventually, we found the one that had the change and finally, the issue at hand made sense. We never thought for a second that the committed code in the git repo that we were looking at would be inaccurate.
Needless to say, he was never allowed to touch code on a prod server ever again.8 -
"Sir, I fixed the recent bug"
"Great, what did you do?"
"I commented out the code that was causing it :)"
"Brilliant! You didn't forget to push the code to production, did you?"
"No Sir, I pushed it immediately"
"Marvelous! I'll arrange a promotion for you next month"5 -
After several months of bug fixing, I can proudly say the application I inherited at work has gone a whole day in production without an unhandled exception (from a peak of above 1200 a few months ago).
Well, either that or I've broken the error logging and am now living in blissful ignorance.4 -
Is this the code life
Another scrum meeting
Caught in the the Node life
No escape from reality
Open your eyes
Look up to the screens and see..
I'm just a dev boy
Doing some debugging
Because there's warnings here
Errors there
Segment faults
Everywhere
Anytime you distract
Takes another hour from me
From me
*piano starts
Mama. Just committed a bug
Merge the branch to production
Did it fast for milestones
Mama. The repo has just begun
But now they going to throw the stack away.
Mama. U u u uu
Didn't mean to code in LAMP
But it's the only stack i know how to setup
In Ubuntu. Without docker
I really don't get vagrant
*piano
It's too late
My team is done
Some dev is working in Nepal
A UX dev. Now what is that?
Goodbye everybody
I've got to go
Gotta leave this lame meeting
And face the truth
Oh nooooo. I i interns
(they have questions)
I want to debug
I don't want to stay till 3 in the morning
*epic guitar
I see a litlle dev over there
Let's code review, let's code review
Did he do the last commit?
Coding in the white board
Very very frightening me
That's bug(that's a bug)
That's a bug (that's a bug)
What the f*ck did you do that?
Magnificcooooooo
I was just coding and nobody liked it
He was coding and nobody liked it, spare his some time to do his debugging
Easy man. Here go. Will you let me code?
A meeting. No,we will not let you code. ( let me code)
A meeting. we will not let you code. ( let me code)
A meeting. we will not let you code. ( let me code)
We will not let you code
Never never let you go
Never let you code, oh
No no no no no no no
Oh mama mia, mama mia ( dude, you've gotta let me code)
Screw you guys, I'm gonna code and commit. Commit. Comiiiiitt!
*epic guitar
So you think you can review me and spit in my eye?
So you think you can dump me and erase my branch?
Oh baby, cant do this to me baby
I've just have to log out.
I've just have to log outta here
*epic guitar solo
Nothing really matters
The users will not care
Nothing really matters
To them
Any way this code blows10 -
string excuses[]={
"it's not a bug it's a feature",
"it worked on my machine",
"i tested it and it worked",
"its production ready",
"your browser must be caching the old content",
"that error means it was successful",
"the client fucked it up",
"the systems crashed and the code got lost" ,
"this code wont go into the final version",
"It's a compiler issue",
"it's only a minor issue",
"this will take two weeks max",
"my code is flawless must be someone else's mistake",
"it worked a minute ago",
"that was not in the original specification",
"i will fix this",
"I was told to stop working on that when something important came up",
"You must have the wrong version",
"that's way beyond my pay grade",
"that's just an unlucky coincidence",
"i saw the new guy screw around with the systems",
"our servers must've been hacked",
"i wasn't given enough time",
"its the designers fault",
"it probably won't happen again",
"your expectations were unrealistic",
"everything's great on my end",
"that's not my code",
"it's a hardware problem",
"it's a firewall issue",
"it's a character encoding issue",
"a third party API isn't responding",
"that was only supposed to be a placeholder",
"The third party documentation is wrong",
"that was just a temporary fix.",
"We outsourced that months ago.","
"that value is only wrong half of the time.",
"the person responsible for that does not work here anymore",
"That was literally a one in a million error",
"our servers couldn't handle the traffic the app was receiving",
"your machines processors must be too slow",
"your pc is too outdated",
"that is a known issue with the programming language",
"it would take too much time and resources to rebuild from scratch",
"this is historically grown",
"users will hardly notice that",
"i will fix it" };11 -
My biggest dev blunder. I haven't told a single soul about this, until now.
👻👻👻👻👻👻
So, I was working as a full stack dev at a small consulting company. By this time I had about 3 years of experience and started to get pretty comfortable with my tools and the systems I worked with.
I was the person in charge of a system dealing with interactions between people in different roles. Some of this data could be sensitive in nature and users had a legal right to have data permanently removed from our system. In this case it meant remoting into the production database server and manually issuing DELETE statements against the db. Ugh.
As soon as my brain finishes processing the request to venture into that binary minefield and perform rocket surgery on that cursed database my sympathetic nervous system goes into high alert, palms sweaty. Mom's spaghetti.
Alright. Let's do this the safe way. I write the statements needed and do a test run on my machine. Works like a charm 😎
Time to get this over with. I remote into the server. I paste the code into Microsoft SQL Server Management Studio. I read through the code again and again and again. It's solid. I hit run.
....
Wait. I ran it?
....
With the IDs from my local run?
...
I stare at the confirmation message: "Nice job dude, you just deleted some stuff. Cool. See ya. - Your old pal SQL Server".
What did I just delete? What ramifications will this have? Am I sweating? My life is over. Fuck! Think, think, think.
You're a professional. Handle it like one, goddammit.
I think about doing a rollback but the server dudes are even more incompetent than me and we'd lose all the transactions that occurred after my little slip. No, that won't fly.
I do the only sensible thing: I run the statements again with the correct IDs, disconnect my remote session, and BOTTLE THAT SHIT UP FOREVER.
I tell no one. The next few days I await some kind of bug report or maybe a SWAT team. Days pass. Nothing. My anxiety slowly dissipates. That fateful day fades into oblivion and I feel confident my secret will die with me. Cool ¯\_(ツ)_/¯12 -
When you accidentally leave a console.log in your code that goes into production that says: "I'm fucking working!!!" and notice it a year later when fixing a bug.2
-
Tested on local work perfectly,
Push to production
Restart service
CPU went up to 100%, process alive but all services down.
Panic.
Try to think of what would be the cause.
With no hope, i restarted the service again.
Everything works as expected.
I still dont know what happened. :/7 -
Worst dev team failure I've experienced?
One of several.
Around 2012, a team of devs were tasked to convert a ASPX service to WCF that had one responsibility, returning product data (description, price, availability, etc...simple stuff)
No complex searching, just pass the ID, you get the response.
I was the original developer of the ASPX service, which API was an XML request and returned an XML response. The 'powers-that-be' decided anything XML was evil and had to be purged from the planet. If this thought bubble popped up over your head "Wait a sec...doesn't WCF transmit everything via SOAP, which is XML?", yes, but in their minds SOAP wasn't XML. That's not the worst WTF of this story.
The team, 3 developers, 2 DBAs, network administrators, several web developers, worked on the conversion for about 9 months using the Waterfall method (3~5 months was mostly in meetings and very basic prototyping) and using a test-first approach (their own flavor of TDD). The 'go live' day was to occur at 3:00AM and mandatory that nearly the entire department be on-sight (including the department VP) and available to help troubleshoot any system issues.
3:00AM - Teams start their deployments
3:05AM - Thousands and thousands of errors from all kinds of sources (web exceptions, database exceptions, server exceptions, etc), site goes down, teams roll everything back.
3:30AM - The primary developer remembered he made a last minute change to a stored procedure parameter that hadn't been pushed to production, which caused a side-affect across several layers of their stack.
4:00AM - The developer found his bug, but the manager decided it would be better if everyone went home and get a fresh look at the problem at 8:00AM (yes, he expected everyone to be back in the office at 8:00AM).
About a month later, the team scheduled another 3:00AM deployment (VP was present again), confident that introducing mocking into their testing pipeline would fix any database related errors.
3:00AM - Team starts their deployments.
3:30AM - No major errors, things seem to be going well. High fives, cheers..manager tells everyone to head home.
3:35AM - Site crashes, like white page, no response from the servers kind of crash. Resetting IIS on the servers works, but only for around 10 minutes or so.
4:00AM - Team rolls back, manager is clearly pissed at this point, "Nobody is going fucking home until we figure this out!!"
6:00AM - Diagnostics found the WCF client was causing the server to run out of resources, with a mix of clogging up server bandwidth, and a sprinkle of N+1 scaling problem. Manager lets everyone go home, but be back in the office at 8:00AM to develop a plan so this *never* happens again.
About 2 months later, a 'real' development+integration environment (previously, any+all integration tests were on the developer's machine) and the team scheduled a 6:00AM deployment, but at a much, much smaller scale with just the 3 development team members.
Why? Because the manager 'froze' changes to the ASPX service, the web team still needed various enhancements, so they bypassed the service (not using the ASPX service at all) and wrote their own SQL scripts that hit the database directly and utilized AppFabric/Velocity caching to allow the site to scale. There were only a couple client application using the ASPX service that needed to be converted, so deploying at 6:00AM gave everyone a couple of hours before users got into the office. Service deployed, worked like a champ.
A week later the VP schedules a celebration for the successful migration to WCF. Pizza, cake, the works. The 3 team members received awards (and a envelope, which probably equaled some $$$) and the entire team received a custom Benchmade pocket knife to remember this project's success. Myself and several others just stared at each other, not knowing what to say.
Later, my manager pulls several of us into a conference room
Me: "What the hell? This is one of the biggest failures I've been apart of. We got rewarded for thousands and thousands of dollars of wasted time."
<others expressed the same and expletive sediments>
Mgr: "I know..I know...but that's the story we have to stick with. If the company realizes what a fucking mess this is, we could all be fired."
Me: "What?!! All of us?!"
Mgr: "Well, shit rolls downhill. Dept-Mgr-John is ready to fire anyone he felt could make him look bad, which is why I pulled you guys in here. The other sheep out there will go along with anything he says and more than happy to throw you under the bus. Keep your head down until this blows over. Say nothing."11 -
The 3 greatest things in life:
1 A great orgasm
2. A great shit
3. Fixing a major bug before anyone notices.
Work directly on a production system and get all 3 experiences simultaneously.8 -
The time my Java EE technology stack disappointed me most was when I noticed some embarrassing OutOfMemoryError in the log of a server which was already in production. When I analyzed the garbage collector logs I got really scared seeing the heap usage was constantly increasing. After some days of debugging I discovered that the terrible memory leak was caused by a bug inside one of the Java EE core libraries (Jersey Client), while parsing a stupid XML response. The library was shipped with the application server, so it couldn't be replaced (unless installing a different server). I rewrote my code using the Restlet Client API and the memory leak disapperead. What a terrible week!2
-
When your in apprenticeship and find a bug that allows the user to skip the payment... that was in production for multiple years...3
-
My first testing job in the industry. Quite the rollercoaster.
I had found this neat little online service with a community. I signed up an account and participated. I sent in a lot of bug reports. One of the community supervisors sent me a message that most things in FogBugz had my username all over it.
After a year, I got cocky and decided to try SQL injection. In a production environment. What can I say. I was young, not bright, and overly curious. Never malicious, never damaged data or exposed sensitive data or bork services.
I reported it.
Not long after, I got phone calls. I was pretty sure I was getting charged with something.
I was offered a job.
Three months into the job, they asked if I wanted to do Python and work with the automators. I said I don't know what that is but sure.
They hired me a private instructor for a week to learn the basics, then flew me to the other side of the world for two weeks to work directly with the automation team to learn how they do it.
It was a pretty exciting era in my life and my dream job.4 -
IF LIVES DEPEND ON A SYSTEM
1. Code review, collaboration, and knowledge sharing (each hour of code review saves 33 hours of maintenance)
2. TDD (40% — 80% reduction in production bug density)
3. Daily continuous integration (large code merges are a major source of bugs)
4. Minimize developer interruptions (an interrupted task takes twice as long and contains twice as many defects)
5. Linting (catches many typo and undefined variable bugs that static types could catch, as well as a host of stylistic issues that correlate with bug creation, such as accidentally assigning when you meant to compare)
6. Reduce complexity & improve modularity -- complex code is harder to understand, test, and maintain
-Eric Elliott12 -
A major bank had a bug in their system that triggered multiple postings of transactions in all of their clients' accounts. When they found out the root cause, they just had to mention that the programmer was a female. Like that was any bearing to the problem. It's not like they have like a whole fucking team of QA testers that should have checked that shit before it went to production.14
-
A bug in production, so I was debugging and the boss says to me "how do you stay so calm?" and I answer "just like a surgeon does't have to freak out with blood, a developer doesn't have to freak out with bugs"3
-
Most satisfying bug I've fixed?
Fixed a n+1 issue with a web service retrieving price information. I initially wrote the service, but it was taken over by a couple of 'world class' monday-morning-quarterbacks.
The "Worst code I've ever seen" ... "I can't believe this crap compiles" types that never met anyone else's code that was any good.
After a few months (yes months) and heavy refactoring, the service still returned price information for a product. Pass the service a list of product numbers, service returns the price, availability, etc, that was it.
After a very proud and boisterous deployment, over the next couple of days the service seemed to get slower and slower. DBAs started to complain that the service was causing unusually high wait times, locks, and CPU spikes causing problems for other applications. The usual finger pointing began which ended up with "If PaperTrail had written the service 'correctly' the first time, we wouldn't be in this mess."
Only mattered that I initially wrote the service and no one seemed to care about the two geniuses that took months changing the code.
The dev manager was able to justify a complete re-write of the service using 'proper development methodologies' including budgeting devs, DBAs, server resources, etc..etc. with a projected year+ completion date.
My 'BS Meter' goes off, so I open up the code, maybe 5 minutes...tada...found it. The corresponding stored procedure accepts a list of product numbers and a price type (1=Retail, 2=Dealer, and so on). If you pass 0, the stored procedure returns all the prices.
Code basically looked like this..
public List<Prices> GetPrices(List<Product> products, int priceTypeId)
{
foreach (var item in products)
{
List<int> productIdsParameter = new List<int>();
productIdsParameter.Add(item.ProductID);
List<Price> prices = dataProvider.GetPrices(productIdsParameter, 0);
foreach (var price in prices)
{
if (price.PriceTypeID == priceTypeId)
{
prices = dataProvider.GetPrices(productIdsParameter, price.PriceTypeID);
return prices;
}
* Omitting the other 'WTF?' code to handle the zero price type
}
}
}
I removed the double stored procedure call, updated the method signature to only accept the list of product numbers (which it was before the 'major refactor'), deployed the service to dev (the issue was reproducible in our dev environment) and had the DBA monitor.
The two devs and the manager are grumbling and mocking the changes (they never looked, they assumed I wrote some threading monstrosity) then the DBA walks up..
DBA: "We're good. You hit the database pretty hard and the CPU never moved. Execution plans, locks, all good to go."
<dba starts to walk away>
DevMgr: "No fucking way! Putting that code in a thread wouldn't have fix it"
Me: "Um, I didn't use threads"
Dev1: "You had to. There was no way you made that code run faster without threads"
Dev2: "It runs fine in dev, but there is no way that level of threading will work in production with thousands of requests. I've got unit tests that prove our design is perfect."
Me: "I looked at what the code was doing and removed what it shouldn't be doing. That's it."
DBA: "If the database is happy with the changes, I'm happy. Good job. Get that service deployed tomorrow and lets move on"
Me: "You'll remove the recommendation for a complete re-write of the service?"
DevMgr: "Hell no! The re-write moves forward. This, whatever you did, changes nothing."
DBA: "Hell yes it does!! I've got too much on my plate already to play babysitter with you assholes. I'm done and no one on my team will waste any more time on this. Am I clear?"
Seeing the dev manager face turn red and the other two devs look completely dumbfounded was the most satisfying bug I've fixed.5 -
I recently joined this big MNC after shutting down my own startup. I was trying to automate their build process properly. They were currently using grunt and I favor gulp, so I offered to replace the build process with gulp and manage it properly.
I was almost done with it in development environment and QA was being done for production.
In the meantime I was trying to fix some random bug in a chrome extension backend. I pushed some minor changes to production which was not going to affect the main site. That was in the afternoon.
This Friday my senior rushed to me. It was like he ran six floors to reach me. He asked, did you push the new build system to production, I refused. He then went to the computer nearby and opened the code.
It was Friday and I was about to leave. But being a good developer, I asked what's the problem. He told me that one complete module is down and the developers responsible for them left for the day already and are unreachable.
I worked on that module multiple times last month, so I offered my help. He agreed and we get to work.
The problem was in the Angular front end. So we immediately knew that the build process is screwed. I accidentally kept the gulp process open for anyone, so I immediately rebuilt using grunt and deployed again, but to no success.
Then I carefully analyzed all the commits to the module to find out that I was the one who pushed the change last. That was the chrome extention. I quickly reverted the changes and deployed and the module was live again. The senior asked, how did you do that? I told the truth.
He was surprised that how come that change affect the complete site too. We identified it after an hour. It was the grunt task which includes all the files from that particular module, including chrome extension in the build process.
He mailed the QA team to put Gulp in increased priority and approved the more structural changes, including more scrutiny before deployment and backup builds.
The module was down for more than 5 hours and we got to know only after the client used it for their own process. I was supposed to be fired for this. But instead everyone appreciated my efforts to fix things.
I guess I am in a good company 😉4 -
A few days after deploying a big important Website into production, I wanted to copy the whole thing including DB back onto our test server for future testing/bug fixing if something comes up. (Last changes were done on production server before going live)
So I opened SSH, removed everything on the test sever aaaaand then I realized I was connected to production...
Took about an hour to get everything up and running again. We didn't tell the client and hoped it would not be noticed.2 -
Explaining to the client that you can't push untested changes to production and expect it to be bug free.3
-
PM says you are spending too much time testing. Fast forward to a production bug and they are standing behind you while you are frantically coding saying "this should have been tested more".7
-
That feeling when you find the bug breaking the application in production and you are just an intern :)5
-
I explained my latest project to a non developer friend. I told him I'm almost done with the code, now I just need to debug and get it ready for production.
"Why don't you write code that doesn't have any bugs?"
"Holy shit! I never thought about that! Thank you so much, I'll make sure to write bug free code from now on".10 -
1. Slack. Pretty good chat app for dev companies, I use it to prevent people standing next to my desk 40 times a day.
2. Unit testing tools, especially when fully automated using a git master branch hook, something like codeship/jenkins, and a deployment service.
3. Jetbrains IDEs. I love Vim, but Jetbrains makes theming, autocompleting & code style checks with mixed templating languages a breeze.
4. Urxvt terminal. It's a bit of work at the start, but so extremely fast and customizable.
5. Cinnamon or i3. Not really dev tools, but both make it easy to organize many windows.
6. A smart production bug logger. I tend to use Bugsnag, Rollbar or Sentry.
7. A good coffee machine. Preferably some high pressure espresso maker which costs more than the CEO's car, using organic fairtrade hipster beans with a picture of a laughing south american farmer. And don't you dare fuck it up with sugar.
8. Some high quality bars of chocolate. Not to consume yourself, but to offer to coworkers while they wait for you to fix a broken deploy. The importance of office politics is not to be underestimated.1 -
Finding a bug as a developer: "Fuck..." *start working on an undocumented hotfix that breaks other parts of the application"
Finding a bug as a tester: "Yeah, right.." *start writing a comprehensive report including all possible failure scenarios and how world famine will increase and men develop boobs if this bug is shipped into production"
Finding a bug as a PM: "Well, the other parts work, right ?" *click randomly on nearby buttons and input fields to "check" if everything is all right. Ditch said report from tester*3 -
On an afternoon the day before delivery, we discovered a crashing bug. At around 2 AM, we had found the cause and fixed it. A short sleep at home, then back to office at 8 AM because the delivery was 200 devices containing that software, and they had to be updated manually because production had put in the old image.
We seized all available computers, even those from marketing who were... surprised. Half-way in the update, we calculated that we wouldn't have enough time until the freight service would show up.
So we asked the secretary that she should be a bit flirty to the parcel guy, invite him to a coffee and chat around to buy us more time. We closed the last parcel just when he figured that he had to continue with his tour.5 -
I was fixing a bug in Production and at the same time, the same client called me for the other project saying its too urgent.
I was like Oh god what now?
Guess what she wanted to change the color of Navbar.5 -
Patching a bug in production is like trying to collect marbles on a train track... with a train coming toward you....3
-
Me: Hey boss, if you ever need someone to get into doing DevOps related tasks for the team, I'd be more than happy to take that on.
Boss: We don't really need any dedicated person to work on that, but if we do in the future, I'll let you know.
Fast forward a few days: I am now unable to deploy bug fixes to our testing environment, now in the cloud, because all access has been blocked for everyone except the two numbskulls who thought it'd be a great idea to move EVERYTHING over (apps, configuration manager, proxies, etc) first.
Oh, and this bug is affecting production.3 -
So, a few years ago I was working at a small state government department. After we has suffered a major development infrastructure outage (another story), I was so outspoken about what a shitty job the infrastructure vendor was doing, the IT Director put me in charge of managing the environment and the vendor, even though I was actually a software architect.
Anyway, a year later, we get a new project manager, and she decides that she needs to bring in a new team of contract developers because she doesn't trust us incumbents.
They develop a new application, but won't use our test team, insisting that their "BA" can do the testing themselves.
Finally it goes into production.
And crashes on Day 1. And keeps crashing.
Its the infrastructure goes out the cry from her office, do something about it!
I check the logs, can find nothing wrong, just this application keeps crashing.
I and another dev ask for the source code so that we can see if we can help find their bug, but we are told in no uncertain terms that there is no bug, they don't need any help, and we must focus on fixing the hardware issue.
After a couple of days of this, she called a meeting, all the PMs, the whole of the other project team, and me and my mate. And she starts laying into us about how we are letting them all down.
We insist that they have a bug, they insist that they can't have a bug because "it's been tested".
This ends up in a shouting match when my mate lost his cool with her.
So, we went back to our desks, got the exe and the pdb files (yes, they had published debug info to production), and reverse engineered it back to C# source, and then started looking through it.
Around midnight, we spotted the bug.
We took it to them the next morning, and it was like "Oh". When we asked how they could have tested it, they said, ah, well, we didn't actually test that function as we didn't think it would be used much....
What happened after that?
Not a happy ending. Six months later the IT Director retires and she gets shoed in as the new IT Director and then starts a bullying campaign against the two of us until we quit.5 -
It's about a guy that knows better.
I was working as a subcontractor on a bigger system. We (subs) were not allowed to deploy code, we had to wait for contractor to deploy.
One day I got an email that my code is bugged and that my feature is not working on production. I checked it on test env, everything was fine. Then I checked if the code I wrote was deployed. It was not.
I send an email explaining that if they deployed my code it would be working. Then I got a response. There was a bug in my code.
Another email. I asked how would they know? Do they have a test on their environment that failed?
No. There is one guy that READ my code and he said it should not work, so he will not deploy it. He was not a programmer, he was a business consultant responsible for the documentation.
His issue was that I used a function that was not in a class. So if the function is not declared it's obvious it will not work. I had to explain to him in another email, that you can use object of another class inside your class and then call a function, that is not in your class. It was the last time this guy blocked my deploy.
TL;DR, I had to explain a non-dev how object composition works in order to have my code deployed. Took four emails.4 -
That moment that you get some kind of pretty critical error/bug/crash in an application in production and you can't reproduce it anymore and you're just sitting there praying that it won't happen again 😥4
-
Just found a humongous bug in production. The customer relation number, which should have been unique was shared by ten customers.
So instead of 400,000 relation numbers we only had 40,000 unique numbers.
The system is live for 3 weeks now, and we just fixed it on a friday at 5:30PM
If we had found out a week later every customer would have gotten a plastic card with the wrong number, because the cards will be printed in four days...8 -
Investigating a "bug"(turned out to be user error) on the production servers, accidently wiped out the permissions for 1,800 users.
Thankfully was able to recover them in under 10min. God bless a solid disaster recovery policy.1 -
Me: We really need to improve our unit test coverage.
Team/Boss: <sarcasm> Haha yeah.
Production Bug: I'm doing something nasty to a client, because a dev broke something but no test coverage.
Boss: How could we have prevented this?!1 -
That moment when you realise you just pushed a major bug whilst fixing another to the production website that launched earlier today.
Rolled back to working version, all within 30 seconds.
10 seconds later, client on phone... I just tried to load a page, why is my website broken?
They had to be loading it in that 30 seconds didn't they...?3 -
Freelance project I was working on was deployed. Without my knowledge. At 11pm. Their in-house "tech guy" thought that the preview build i gave them was good enough for deployment. Massive bug, broke their api endpoints.
Got a call at 2 in the morning,asking for a fix. I told them how it was their fault and the App they deployed had TESTING written right on the main screen.
They promised additional payment to get me to fix it asap.
Went through the commit history (thank goodness their tech guy knew git, fuck him for committing on production though) and the crash reports.
Removed three lines. All became right with the world again. 😎2 -
Most ignorant ask from a PM or client?
Migrated to SharePoint 2016 which included Reporting Services, and trying to fix a bug in the reporting services scheduler, I created a report (aka, copied an existing one) 'A Klingon Walks Into a Bar', so it would first in the list and distinct enough so the QA testers would (hopefully) leave it alone.
The PM for the project calls me.
PM: "What is this Klingon report? It looks like a copy of the daily inventory report"
Me: "It is. The reporting service job keeps crashing on certain reports that have daily execution schedules."
PM: "I need you to delete it"
Me: "What? Why? The report is on the dev sharepoint site. I named the report so it was unique and be at the top of the list so I can find it easily."
PM: "The name doesn't conform to our standards and it's confusing the testers."
Me: "The testers? You mean Dan, you, and Heather?"
PM: "Yes, smartass. Can you name the report something like daily inventory report 2, or something else?"
Me: "I could, but since this is in development, no. You've already proofed out the upgrade. You're waiting on me to fix this sharepoint bug. Why do you care what I do on this server? It's going away after the upgrade."
PM: "Yea, about that. We like having the server. It gives us a place to test reports. Would really appreciate it if you would rename or delete that report."
Me: "A test sharepoint reporting services server out of scope, so no, we're not keeping it."
PM: "Having a server just for us would be nice."
Me: "$10,000 nice? We're kinda fudging on the licensing now. If we're keeping it, we will be required to be in compliance. That's a server license, sharepoint license, sql server license, and the dedicated hardware. We talked about that, remember?"
PM: "Why is keeping that report so important to you? I don't want to explain to a VP what a Klingon is."
Me: "I'm not keeping the report or moving it to production. When I figure out the problem, I'll delete the report. OK?"
PM: "I would prefer you delete the report before a VP sees it."
Me: "Why would a VP be looking? They probably have better things to do."
PM: "Jeff wants to see our progress, I'll have to him the site, and he'll see the report."
Me: "OK? You tell Jeff it's a report I'm working on, I'll explain what a Klingon is, Jeff will call me a nerd, and we all move on."
PM: "I'm not comfortable with this upgrade."
Me: "What does that mean?"
PM: "I asked for something simple and I can't be responsible for the consequences. I'll be documenting this situation as a 'no-go' for deployment"
Me: "Oookaayyy?"
I figured out the bug, deleted the 'Klingon' report, and the PM couldn't do anything to delay the deployment.4 -
- "Hi A, we had a bug in production due to a changed category ID which we were not informed of."
- "Oh but my API just proxies the content from team B."
_____
- "Hi B, we had a bug in production due to a changed category ID."
- "Ah, I have nothing to do with category IDs, you should talk to my colleague C."
_____
- "Hi C, we had a bug in production due to a changed category ID."
- "I wish I knew anything about that, you should contact person D!"
_____
- "Hi D, we had a bug in production due to a changed category ID."
- *changes status to "Absent" on IM*
ERROR_TOO_MANY_REDIRECTS1 -
!rant
One of our clients discovered a bug on our site about an hour ago. I wrote up a fix, and after very little testing, pushed it into production. I then immediately left to go home.
You can say I like to live my life dangerously.3 -
I left my previous company because my tech leadership was insensitive and agressive.
However, I am in a start-up right now and CTO is a nut job.
He creates random Slack threads and keeps messaging me like crazy. The co-founders have shut him down multiple times and yet his only success metric is "number of deliveries".
The other day something broke in production and teams were discussing about resolving the bug in one of the Slack channels.
CTO literally wrote this and I wish I was making this up, "let us not look at the logs and trust our code to work fine."
I was baffled and confused. I realised me leaving my previous organisation because of such tech leadership was a stupid decision.
Crows are black everywhere.5 -
Dev1: "How did you wrote this code? It causes bug on production."
Me: *checks on gitlens*
"Dev1... 3 months ago"1 -
Yesterday Australia Melbourne Metro train had computer fault. Train stopped, door cannot open. What caused it? Someone pushed a bug to the production? Tell me pls2
-
So I'm wrapping up for the day and right before I leave a coworker comes up to me with a problem. Our company uses barcodes to track some of our products through their development and we recently switched over to a new system for producing them. The barcodes for this particular product are supposed to have 8 digits, but the last 200 we printed have 9.
I immediately panic because I wrote the script that generates the bar codes and there had been a bug in the past where the script would add extra leading zeroes that weren't supposed to be there. I scramble and check the database, it would be a huge headache if our production database had been compromised with junk barcodes. Nope, all the new barcodes there have the right number of digits.
Next place to check is in the code that writes the barcodes to a text file for staff to print the physical labels from. Nope that's all fine too.
I ask the person who printed out the recent batch of labels to show me how the printing software reads from the text file. She seems confused by my question and shows me how she manually enters in the barcode range to the software. As she does this I watch her add an extra zero to the numbers. 🙃
Even worse there was an option to import all the codes from a text file literally RIGHT BELOW the manual option.
TLDR; Thought my script had screwed up our database, ended up being the fault of a coworker who didn't know how to import text files.1 -
TIFU by deploying not one, but two major releases to production at 5pm on a Friday. I trusted my testers, so one of the applications had a major bug in it. I just worked through a child's surprise party. hope the testers are having a lovely evening with their loved ones.1
-
Boss: we have to fix this bug.
Me: It is not a bug ..the server takes more time to send the response which cause the timeout issue . we may need to change the implementation to increase the performance to send the response quickly. It will take some time
Boss: okay can we fix this by today
Me: ya if we increase timeout to 20 seconds the issue is fixed
Boss: No we want the server to send the response quickly and we need the fix now
I worked for the weekend to fix it finally......Guess what ....the change dint go live since the scenario was not valid and will never likely to happen in production -
Got pulled out of bed at 6 am again this morning, our VMs were acting up again. Not booting, running extremely slow, high disk usage, etc.
This was the 6 time in as many weeks this happened. And always the marching orders were the same. Find the bug, smash the bug, get it working with the least effort. I've dumped hundreds of hours maintaining this broken shitheap of a system, putting off other duties to keep mission critical stations running.
The culprits? Scummy consultants, Windows 10 1709, and Citrix Studio.
Xen Server performed well enough, likely due to its open source origins and Centos architecture.
Whelp. DasSeahawks was good and pissed. Nothing like getting rousted out of bed after a few scant hours rest for patching the same broken system.
DasSeahawks lost his temper. Things went flying. Exorcists were dispatched and promptly eaten.
Enough. No consultants, no analysts, and no experts touched it. No phone calls, no manuals, not even a google search. Just a very pissed admin and his minion declaring blitzkrieg.
We made our game plan, moved the users out, smoked our cigs, chugged monster, and queued a gnu-metal playlist on spotify.
Then we took a wrecking ball to the whole setup. User docs were saved, all else was rm -r * && shred && summon -u Poseidon -beast Land_Cracken.
Started at 3pm and finished just after midnight. Rebuilt all the vms with RDP, murdered citrix studio (and their bullshit licenses), completely blocked Windows 10 updates after 1607, and load balanced the network.
So what do we get when all the experts are fired? Stabbed lightning. VMs boot in less than 10 seconds, apps open instantly, and server resources are half their previous usage state. My VMs are now the fastest stations in our complex, as they should be.
Next to do: install our mxgpu, script up snapshots and heartbeat, destroy Windows ads/telemetry, and setup PDQ. damn its good to be good!
What i learned --> never allow testing to go to production, consultants will fuck up your shit for a buck, and vendors are half as reliable over consultants. Windows works great without Microsoft, thin clients are overpriced, and getting pissed gets things done.
This my friends, is why admins are assholes.4 -
Fuuuuck this corporate bullshit. I'm basically sitting around twiddling my thumbs waiting for some jackass to grant me access to the server that my boss moved my code over to. Why the hell did you put my app on a production server that runs every 30 minutes...THAT I DON'T HAVE ACCESS TO?? Now there's a critical bug and a $50K order in limbo because I can't push any fixes. Fuck me. The worst part will be in the next hour or so when dozens of people are calling, emailing, and attacking my cubicle like rabid animals about why orders aren't moving and I'll have to explain that production is a train wreck because reasons. Just end me.2
-
"It works on our end", the sentence that made me lose my shit.
I've been working on a project were we're supposed to integrate an API into our system.
When trying to get some user id's (UUID) from said API, we got a type-error in the response (???), so I called their integration support and asked what the fuck they were doing (not really, i was kinda calm at this point).
The answer I got was following:
Integration guy: "Uh, bro, like, I don't even know, it's probably on your end"
Me: "We literally used this endpoint with the same parameters yesterday, and got a result we expected. I noticed you updated your API this morning, did you make any major changes?"
Integration guy: "Yeah we changed the type of user id from string to number"
Me: "So, you changed the type of a UUID (uuid4) from string to number? How did you not think that would be an issue? I can see in your forums that everyone else is having the same issue."
Integration guy: "Nah, it's probably a bug in your code, it works on our end"
Me in my mind: *IT WORKS ON YOUR END?!? IT DOESN'T FUCKING MATTER IF IT WORKS ON YOUR END, FUCKTARD.*
What I actually said: "Uhm, I'm not sure if works on your end either, I'm not even sure how this change made it to production. But hey, thanks I guess, bye."
WHY AM I NOT ABLE TO YELL AT PEOPLE WHEN THEY ARE BEING RETARDED???
But really though, when you're maintaining an API, you shouldn't fucking care if things work on your end in your dev environment. What matters is how it works in production, for the end user/users.
And I know that 99% of cases it's the users fault by entering the wrong parameters or trying to request with wrongly setup auth and what not, but still.
Don't ASSUME nothing's wrong on your end. It's your fucking job to fix the issues.
And guess what? The problem was on their side.
I'm going fucking bald.2 -
Worst hypothetical Dev job... hmm 🤔 well I think I have the right scenario for you. A medical company stores patient charts and critical life saving information in a database. This database makes medical decisions for lifesaving incubators and if there’s a bug it means 10,000 newborn sick babies will die because of your fuck up (oh and you’re criminally liable too so good luck!). But beyond the high pressure job that sounds at least somewhat normal, the database is written in a special form of assembly for a custom undocumented CPU where only one copy in the world exists so all tests and development are on production. Google and StackOverflow is banned so your only resource is your brain. Good luck🍀9
-
Maybe not worst, but most frustrating. One of the systems I helped maintain at my first job had a few different bugs that caused bad data in the database. The "solution" to the problem was to write SQL queries to directly fix the production data. This would take one member of our team (it rotated weekly) about an hour every day to fix because there were literally dozens of these errors.
All the devs knew that we could identify the root cause and fix it in, probably, 3-4 days tops. Management would never approve the time because it would take longer to fix the root cause than it took to fix the data.
I worked at that company for 7 years. The bug was there when I came on, and it was there when I left.2 -
This was some time ago. A Legendary bug appeared. It worked in the dev environment, but not in the test and production environment.
It had been a week since I was working on the issue. I couldn't pinpoint the problem. We CANNOT change the code that was already there, so we needed to override the code that was written. As I was going at it, something happened.
---
Manager: "Hey, it's working now. What did you do?"
Me: *Very confused because I know I was nowhere close to finding the real source of the problem* Oh, it is? Let me check.
Also me: *Goes and check on the test and prod environment and indeed, it's already working*
Also me to the power of three: *Contemplates on life, the meaning of it, of why I am here, who's going to throw out the trash later, asking myself whether my buddies and I will be drinking tonight, only to realize that I am still on the phone with my manager*
Me again: "Oh wow, it's working."
Manager: "Great job. What were the changes in the code?"
Me: "All I did was put console logs and pushed the changes to test and prod if they were producing the same log results."
Manager: "So there were no changes whatsoever, is that what you mean?"
Me: "Yep. I've no idea why it just suddenly worked."
Manager: "Well, as long as it's working! Just remove those logs and deploy them again to the test and prod environment and add 'Test and prod fix' to the commit comment."
Me: "But what if the problem comes up again? I mean technically we haven't resolved the issue. The only change I made were like 20 lines of console logs! "
Manager: "It's working, isn't it? If it becomes a problem, we'll work it out later."
---
I did as I was told, and Lo and Behold, the problem never occurred again.
Was the system playing a joke on me? The system probably felt sorry for me and thought, "Look at this poor fucker, having such a hard time on a problem he can't even comprehend. That idiotic programmer had so many sleepless nights and yet still couldn't find the solution. Guess I gotta do my job and fix it for him. I'm the only one doing the work around here. Pathetic Homo sapiens!"
Don't get me wrong, I'm glad that it's over but..
What the fuck happened?5 -
I don't know if I'm being pranked or not, but I work with my boss and he has the strangest way of doing things.
- Only use PHP
- Keep error_reporting off (for development), Site cannot function if they are on.
- 20,000 lines of functions in a single file, 50% of which was unused, mostly repeated code that could have been reduced massively.
- Zero Code Comments
- Inconsistent variable names, function names, file names -- I was literally project searching for months to find things.
- There is nothing close to a normalized SQL Database, column ID names can't even stay consistent.
- Every query is done with a mysqli wrapper to use legacy mysql functions.
- Most used function is to escape stirngs
- Type-hinting is too strict for the code.
- Most files packed with Inline CSS, JavaScript and PHP - we don't want to use an external file otherwise we'd have to open two of them.
- Do not use a package manger composer because he doesn't have it installed.. Though I told him it's easy on any platform and I'll explain it.
- He downloads a few composer packages he likes and drag/drop them into random folder.
- Uses $_GET to set values and pass them around like a message contianer.
- One file is 6000 lines which is a giant if statement with somewhere close to 7 levels deep of recursion.
- Never removes his old code that bloats things.
- Has functions from a decade ago he would like to save to use some day. Just regular, plain old, PHP functions.
- Always wants to build things from scratch, and re-using a lot of his code that is honestly a weird way of doing almost everything.
- Using CodeIntel, Mess Detectors, Error Detectors is not good or useful.
- Would not deploy to production through any tool I setup, though I was told to. Instead he wrote bash scripts that still make me nervous.
- Often tells me to make something modern/great (reinventing a wheel) and then ends up saying, "I think I'd do it this way... Referes to his code 5 years ago".
- Using isset() breaks things.
- Tens of thousands of undefined variables exist because arrays are creates like $this[][][] = 5;
- Understanding the naming of functions required me to write several documents.
- I had to use #region tags to find places in the code quicker since a router was about 2000 lines of if else statements.
- I used Todo Bookmark extensions in VSCode to mark and flag everything that's a bug.
- Gets upset if I add anything to .gitignore; I tried to tell him it ignores files we don't want, he is though it deleted them for a while.
- He would rather explain every line of code in a mammoth project that follows no human known patterns, includes files that overwrite global scope variables and wants has me do the documentation.
- Open to ideas but when I bring them up such as - This is what most standards suggest, here's a literal example of exactly what you want but easier - He will passively decide against it and end up working on tedious things not very necessary for project release dates.
- On another project I try to write code but he wants to go over every single nook and cranny and stay on the phone the entire day as I watch his screen and Im trying to code.
I would like us all to do well but I do not consider him a programmer but a script-whippersnapper. I find myself trying to to debate the most basic of things (you shouldnt 777 every file), and I need all kinds of evidence before he will do something about it. We need "security" and all kinds of buzz words but I'm scared to death of this code. After several months its a nice place to work but I am convinced I'm being pranked or my boss has very little idea what he's doing. I've worked in a lot of disasters but nothing like this.
We are building an API, I could use something open source to help with anything from validations, routing, ACL but he ends up reinventing the wheel. I have never worked so slow, hindered and baffled at how I am supposed to build anything - nothing is stable, tested, and rarely logical. I suggested many things but he would rather have small talk and reason his way into using things he made.
I could fhave this project 50% done i a Node API i two weeks, pretty fast in a PHP or Python one, but we for reasons I have no idea would rather go slow and literally "build a framework". Two knuckleheads are going to build a PHP REST framework and compete with tested, tried and true open source tools by tens of millions?
I just wanted to rant because this drives me crazy. I have so much stress my neck and shoulder seems like a nerve is pinched. I don't understand what any of this means. I've never met someone who was wrong about so many things but believed they were right. I just don't know what to say so often on call I just say, 'uhh..'. It's like nothing anyone or any authority says matters, I don't know why he asks anything he's going to do things one way, a hard way, only that he can decipher. He's an owner, he's not worried about job security.13 -
There’s a bug in production, where a user account is vulnerable to simple bruteforcing 15 minutes after signing up. I’m the only one who knows. To fix or not to fix 🤔6
-
Probably the most useful thing I got from all the meetups I attended in my life.
A bottle opener.
Now I don’t need to think twice before grabbing a beer 🍺
What you got last time?17 -
Today is a stark reminder of why i want to leave here. First we couldn't do anything because production was down which blocks dev login. Then support tells me I need to work harder because my bug count keeps going up. But what is in my bugs? Feature requests, global changes, and work that isn't mine. Gee thanks :( Why does support get to comment on my performance anyway by something as dumb as a bug count? Grrr.5
-
I once agreed to maintain and develop an application used in a different section of the school to keep inventory and make sure everything is where it is supposed to be.
At first there was enthusiasm, together with 2 of my classmates we agreed and git clone-d the .NET application that now graduated students built and maintained for the past few years. What could go wrong right?!
It became clear that the original students that worked on it followed an older curriculum, meaning they still got taught .NET instead of the core variant that we get now, not only that but it also seemed that they either did not fully grasp the Clean/Onion architecture or didn't get it in class since there were infrastructure components in the 'Domain' project of the solution. Think of 2 DBContexts in the domain model, yep.
One of us bailed in the first week, the other one and I felt bad for the people using the app so we went on and tried to work on the first bugs that were described in a document. One of these bugs was 'whenever I filter on something in the list, everybody gets to see that filter on their screen instead of only me'. Woah that's weird! Let's see how they put that together!
Oh god, they are using a _static_ variable to store filters, no wonder that it doesn't work properly. Ever heard of sessions?!
Second bug: Sometimes people can't create an account when we sign them up from the admin panel. Alright that is weird, let's figure that one out! Wait a second it seems to work in development? What's this about.
Oh wait I can't create an account on production either? Oh that's weird, wait a second... Why do I have to put my e-mail in a form that was sent to me through e-mail? Why is my address not filled in already? OOH, if someone types in the wrong e-mail address (which is easy since our school has 4 variants of the same f*cking e-mail address) it won't work since it can't recognize the user! Brilliant! Remove e-mail input box and make a token/queryparam determine the user account.
Ah that seems good, it's a mess but it seems a tiny bit better now, great! We're making progress and some sweet buck.
Next bug, trillions of 50x errors on random pages, that's a weird one.
Hm everything works in development, that's odd. Is the production data corrupted?
DID I MENTION that in order to get into the system in development we have to load in a f*cking production database backup ON OUR DEVELOPMENT MACHINE and then ask one of the users' password to login to it and create an account for ourselves? Seeding? What's that, right?!
Anyway, back to bug fixing. I e-mail the the people responsible for the app and get a production admin account, oh I also can't ssh into it because of policies so I have to do everything over e-mail and figure out what's causing the errors. I somehow also wonder if they have any kind of virtualization in place, giving students a VM to do that stuff in doesn't seem so weird does it ? Even with school policies?
Oh btw, 'deploying' means sending a .zip file to a guy in another building and telling him how to configure it, apparently this resulted in a missing folder that the application needed to work and couldn't make on its own. This after 2 weeks of e-mailing back and forth.
After 3 months i quit out of despair and sadness, and due to the fact that I just couldn't do it anymore. I separated everything into logical subprojects and let the last guy handle it, he was OK with that and understood why I left.
Luckily, around that time I already had an actual job at a software development company :)3 -
😯😂😂😂😂
Some times I open the production site and then try to make edits locally to fix a bug, only to realize that I was so stupid.1 -
Passionate programmer attends one of the toughest interviews ever and solves lot of algorithmic problems coding in different programming languages. Impresses the interview panel providing solutions with as much as efficiency as possible. Gets selected, completes induction and gets a nice Dev machine allocated.
Manager walks in and says we got to work with the production support team on fixing a UI bug.2 -
In my three years experience so far I can honestly say that 100% of the developers I've worked with are narrow sighted with regards to how they develop.
As in, they lack the capacity to anticipate multiple scenarios.
They code with one unique scenario in mind and their work ends up not passing tests or generates bugs in production.
Not to say I'm the best at foreseeing every possible scenario, but I at least TRY to anticipate and test my code as much as possible to identify problems and edge cases.
I usually take much more time to complete tasks than my colleagues, but my work usually passes tests and comes back bug free. Whereas my colleagues get applauded for completing tasks quickly but end up spending lots of time fixing up after themselves when tests fail or bugs appear.
Probably more time wasted than if they had done the job correctly from the start. Yet they're considered to be effecient devs because they work "fast".
Frustrating...7 -
One night, after one very stressful week of production code fixes (I was working on a game with some friends and I created the network infrastructure for P2P and database communication from scratch), I was at my gf's house. After we fell asleep, I stood up and screamed right at her something like "I fucking already told you how X works and how to communicate with Y. Learn to write code properly and after double checking yours then you shall ask me for a non-existing 'bug' fix. Learn how to properly write event based code and use polling you moron!". After that I turned to the other side and fell asleep immediately.
When I saw her the next morning sleeping in the couch, I could not understand why... Only after she described to me the whole incident I started laughing.
After that I just took two weeks off the project and after that period I never actually worked the same way (so hard) in my free time with them.1 -
Discovered pro tip of my life :
Never trust your code
Achievements unlocked :
Successfully running C++ GPU accelerated offscreen rendering engine with texture loading code having faulty validation bug over a year on production for more than 1.5M daily Android active users without any issues.
History : Recently I was writing a new rendering engineering that uses our GPU pipeline engine.. and our prototype android app benchmark test always fails with black rendering frame detection assertion.
Practice:
Spend more than a month to debug a GPU pipeline system based on directed acyclic graph based rendering algorithm.
New abilities added :
Able to debug OpenGL ES code on Android using print statement placed in source code using binary search.
But why?
I was aware of the issue over a month and just ignored it thinking it's a driver bug in my android device.. but when the api was used by one of Android dev, he reported the same issue. In the same day at night 2:59AM ....
Satan came to me and told me that " ok listen man, here is what I am gonna do with you today, your new code will be going production in a week, and the renderer will give you just one black frame after random time, and after today 3AM, your code will not show GL Errors if you debug or trace. Buhahahaha ahhaha haahha..... Puffff"
And he was gone..
Thanks satan for not killing me.. I will not trust stable production code anymore enevn though every line is documented and peer reviewed. -
So I tried to render the Facebook page plug-in and guess what..
Some guy at Facebook has added an online style on the preview: display: none !important.
So the plug-in preview was literally empty, thought I was not doing it right, how can the act of simply entering a url be wrong, seriously?
Production bug at Facebook facepalm -
It was a nightmare about a nuclear attack, I was afraid and looking for some cave or basement to hide from radioactive winds.
Then I woke up to find 5 missed calls and emails about a critical bug in production.3 -
Backend Dev: Sir, I think we have a problem with our code. The function does have a lot of bugs in production that we can't maintain it.
Manager: Okay Front-end. You get a new Ticket. Prevent this bug by today
Front-end: Dafuck
Sometimes, I feel sorry for my colleague.1 -
!rant
If you have software in production please have some way for a user to find some contact email (create for this reason only if needed.)
I have run into crippling bugs in huge essential systems (state dmv new system, the ticket system utility marking) which they were oblivious to until I went out of my way, like a stalker to get some contact of someone remotely related to someone I could drop this info in the lap of, and so far it was a total shock to them (the dmv system was taken offline for 3 days to resolve)
I get not wanting to run a helpdesk to support users, but give technical users some contact info ( even if you think you have full coverage analytics because, being software, it may have a bug)
/rant3 -
The moment when I excitedly tell my PM that I was able to replicate our production bug and he looks concerned...2
-
Got informed that there is a bug in production that needs to be addressed on top priority.
Me opens Android studio to check.
Android studio: chill bro2 -
Client: THIS IS CRITICAL, SOME DATA HAS BEEN DELETED, WHAT ZE FUUK HAPPENED, UNDO THIS FAST
Us: so after carefully reviewing the code, related resources and the network traffic we conclude that was never sent in the first place.
*closes issue*
I'm glad we got such a meaningful bug report on the same day a production system started failing, one big deployment that that was like a boss with 3 phases, an unnecessary long meeting and an app developer that that wanted me to break HTTP standards.1 -
Yet another day at work:
My job is to write test libraries for web services and test others code. Yes I know to code, and have a niche in software testing.
Sometimes developers (whose code I find bugs in) get so defensive and scream in emails and meetings if I point out an issue in their code.
Today, when I pointed a bug in his repo, a developer questioned me in an email asking if I even understood his code, and as a tester I shouldn’t look at his code and only blackbox test it.
I wish I can educate the defensive developer that sometimes, it’s okay to make mistakes and be corrected. That’s how we deliver services that doesn’t suck in production.10 -
If you don't know, there are 2 types of bug fixes:
Hot Fix - Patch files directly on the production
Quick Fix - Deploy fix on production and then test it4 -
My colleague sends me an email saying “here’s a check not being performed which causes a bug can you fix this and push to production”
With a screenshot of the code and place it needs to happen underlined
ARE YOU kIDDING ME OH MY GOD
He doesn’t have time to write 10 characters but he has the time to make this work of art of an email and send it to me4 -
> Client complains acceptance testing is too expensive and that he'll do it himself.
> Client complains acceptance testing wasn't thorough enough when a bug hits production. -
"This needs to go into production NOW"
Five hours later...
"That fix is on production now"
"Thanks, did you fix that other bug on production? We need it fixed now" -
I've seen a lot of shitty code. REALLY shitty code...but this. Calling this shitty would be a compliment, so I'm not sure what to call it. The following is copied straight from his source code, which I'm tasked with finding a production logic bug in. The original composer of this masterpiece of one-line clusterfucks is no longer with the company of course, so his pile of shit is now my problem. The program is littered with stuff like this.
if(((FrontLowerLeft.X > tempPack.FrontLowerLeft.X && FrontLowerLeft.X < tempPack.FrontLowerLeft.X + tempPack.Dimensions.Width) || (FrontLowerLeft.X + Dimensions.Width > tempPack.FrontLowerLeft.X && FrontLowerLeft.X + Dimensions.Width < tempPack.FrontLowerLeft.X + tempPack.Dimensions.Width) || (tempPack.FrontLowerLeft.X > FrontLowerLeft.X && tempPack.FrontLowerLeft.X < FrontLowerLeft.X + Dimensions.Width) || (tempPack.FrontLowerLeft.X + tempPack.Dimensions.Width > FrontLowerLeft.X && tempPack.FrontLowerLeft.X + tempPack.Dimensions.Width < FrontLowerLeft.X + Dimensions.Width) || (tempPack.FrontLowerLeft.X == FrontLowerLeft.X && tempPack.Dimensions.Width == Dimensions.Width)) && ((FrontLowerLeft.Y > tempPack.FrontLowerLeft.Y && FrontLowerLeft.Y < tempPack.FrontLowerLeft.Y + tempPack.Dimensions.Height) || (FrontLowerLeft.Y + Dimensions.Height > tempPack.FrontLowerLeft.Y && FrontLowerLeft.Y + Dimensions.Height < tempPack.FrontLowerLeft.Y + tempPack.Dimensions.Height) || (tempPack.FrontLowerLeft.Y > FrontLowerLeft.Y && tempPack.FrontLowerLeft.Y < FrontLowerLeft.Y + Dimensions.Height) || (tempPack.FrontLowerLeft.Y + tempPack.Dimensions.Height > FrontLowerLeft.Y && tempPack.FrontLowerLeft.Y + tempPack.Dimensions.Height < FrontLowerLeft.Y + Dimensions.Height) || (tempPack.FrontLowerLeft.Y == FrontLowerLeft.Y && tempPack.Dimensions.Height == Dimensions.Height)) && ((FrontLowerLeft.Z > tempPack.FrontLowerLeft.Z && FrontLowerLeft.Z < tempPack.FrontLowerLeft.Z + tempPack.Dimensions.Depth) || (FrontLowerLeft.Z + Dimensions.Depth > tempPack.FrontLowerLeft.Z && FrontLowerLeft.Z + Dimensions.Depth < tempPack.FrontLowerLeft.Z + tempPack.Dimensions.Depth) || (tempPack.FrontLowerLeft.Z > FrontLowerLeft.Z && tempPack.FrontLowerLeft.Z < FrontLowerLeft.Z + Dimensions.Depth) || (tempPack.FrontLowerLeft.Z + tempPack.Dimensions.Depth > FrontLowerLeft.Z && tempPack.FrontLowerLeft.Z + tempPack.Dimensions.Depth < FrontLowerLeft.Z + Dimensions.Depth) || (tempPack.FrontLowerLeft.Z == FrontLowerLeft.Z && tempPack.Dimensions.Depth == Dimensions.Depth)))
{
//code that did stuff
//removed for "clarity"
}7 -
I've been working for the last 5 years on some large legacy code used in production, more than 100K LOC, poor comments (when existing) often outdated, huge parts of code that can no longer be reached, over-engineered class hierarchy, functions of thousands lines, huge parts of deprecated code that cannot be removed because "someone might still be using it". Statistically, every small change caused 3 new issues somewhere else and every bug fix or new feature required 10 times the time that would be necessary with a decent codebase. But after five years in hell I can finally say that... Oh wait, nothing changed, the code is still legacy and nobody is going to do anything about that.1
-
Bug I had to fix today: some elements in our React app were being swapped with other elements.
We had `<foo>bar</foo>` on the component but on the html `foo` was being swapped with some other element in our app. It's contents ("bar" in the example) were being left in place, though, so we were getting `<baz>bar</baz>`.
This would only happen when running on production mode. On development everything was fine.
Also, everything seemed fine on the React dev tools. `foo` was where it was supposed to be, but on the html it was somewhere else.
Weirdest shit I ever saw when using React. I found a way to go around it and applied that fix, but I'm still trying to track this down to the source.
The worst part was waiting for fucking webpack to finish the production bundle on every fucking change I wanted to test. I didn't miss the change-save-compile-test flow at all.
What a shit day.4 -
I found a bug.... it's in production for a long long time... I wrote it while optimizing some legacy code....
FUCK.... how do you feel when you discover a bug in production created by you?6 -
Fun fact : Typing on a keyboard that's not plugged into your laptop will not work. Honestly, I should have taken today off. My toddler woke up at 1am screaming bloody murder and didn't get back to bed until 2:30. There's a construction crew hammering up my 14 year old floors to replace it with new ones. My dog is anxious as fuck with all the noise, and truthfully, I'm just staring into my screen hoping the code will write itself. This code will become production deployment logic, so it damn well better be excellent and bug-free. Not a good day to pick up this task.3
-
Lets fix this bug in production on a Friday afternoon. (did that three times on the same project). Never went wrong :)3
-
I run update without where on mysql console on production database Today.
CLASSIC
Just because I needed to fix database after bug fix on the backend of the application.
I thought I wrote good sql statement after executing it on my local machine and then everything got bad.
Luckily it was only one column with some cached statistics data and I checked that it was not important data before I actually started fixing stuff but still ...
Almost got hard attack afterwards.
Made a script to fix this column and it took me only 15 minutes but still...
Bug was caused in part I got no unit tests and application grow after 3 years of development from simple one for one customer and volumes of documents around 50k to over 40 customers and volumes over 2mil per month, don’t know how many pages each, just in one year after we completed all needed features.
I have daily backups and logs of every api operation but still.
I think this got to far for one backend developer.
I got scared that I will loose money cause I am contractor and the only backend developer working on it.
I am so tired of this right now I think I need a break from work.
Responsibility is killing me so hard right now.
It will take a week to get back to normal.2 -
Developing in your local environment? A few bugs here and there. Deploying it to production in a completely different environment? The whole app is a bug...3
-
> Moment you thought you did a good job, but ended up failing
> Times the bug wasn't actually your fault
> Times you took the blame for a junior or other dev
> Times someone took the blame for you
> Times you got away with something you shouldn't have.
> Most valuable data loss
> The bug you never fixed
> Most satisfying bug to fix
> Times where a "simple" task turned out to be not so simple
> Debug code left in production?
> Moments you wish you could undo
> Most satisfying optimization
> Have you ever been ranted about? -
I started off in a MNC company as a junior developer. I entered with candy glasses.
I didn't expect to win the lottery. Of getting abuse by superior.
I stayed for a year, at the project. Constantly being belittled by this team lead. It was awful i enter as a fresh grad. All the new tech were so new and scary at that point.
During my time there, i constantly think that developer is not my stuff.
Ultimately i reach the state of burnout. I reached out to the manager and broke down in his office.
I actually told the manager. "I hate coding"
I remember staying up to 4am just complete a piece of program. To be ready to be push to production the next day. My team lead just come screaming at me saying there is bug.
Upon receiving that message via skype. I broke, tears flow down my eyes.
After which i reach a state of burn out. I start to reach out to external parties for help to get me out of there.
Now i am recovered from the burn out. I am curious of the technology that were utilized in that project. I literally face palm. After understanding the technology it isn't so hard after all. I just didn't gear myself up with the tech.
I still do enjoy working on code.3 -
Ever have a bug that *only* occurs in your production environment? How do you test potential fixes? 😜5
-
I'm so happy that I have the first working version of the program. Just couple of bug fixing and unit testing and I'll push the code to production.
Day 1: Just couple of bug fixing and unit testing and I'll push the code to production.
Day 2: Just couple of bug fixing and unit testing and I'll push the code to production.
.........
Day 200: Just couple of bug fixing and unit testing and I'll push the code to production.
😞 -
So management wants this:
As soon as a customer reports a bug, management wants to have an "emergency button" to let their inexperienced hands make production fall back to the last stable version, without having to pass through IT and wait for them to fix it. If the server catches a 500 error, this process should be done automatically. All because they don't want to give us more time writing more thorough tests...9 -
Well... I can think of several bugs that I found on a previous project, but one of the worst (if not the worst, because the damage scope) it's one bug that only appears for a couple of days at the end of every month.
What happens is the following: this bug occurs in a submodule designed (heh) to control the monthly production according the client requirements (client says "I want 1000 thoot picks", that submodule calculates the daily production requirements in order to full fill the order).
Ideally, that programming need to be done once a week (for the current month), because the quantities are updated by client on the same schedule, and one of the edge cases is that when the current date is >= 16th of the month, the user can start programming the production of the following month.
So, according to this specific case, there's an unidentified, elusive, and nasty bug that only shows up on the two last days of every month, when it doesn't allow to modify/create anything for the following month. I mean, normally, whenever you try to edit/create new data, the application shows either an estimated of the quantities to produce, or the previous saved data. But on those specific days it doesn't show any information at all, disregarding of there's something saved or not.
The worst thing is that such process involves both a very overcomplicated stored procedure, and an overcomplicated functionality on the client side (did I mentioned that it dynamically generates a pseudo-spreadsheet with the procedure dataset? Cell by cell), that absolutely no one really fully understands, and the dude that made those artifacts is no longer available (and by now, I'm not so sure that he even remember what he done there).
One of the worst thing is that at this point, it's easier to handle with that error rather to redesign all of that (not because technical limitations, but for bureaucratic and management issues).
The another worst thing (the most important none) is that this specific bug can create a HUGE mess as it prevents the programming of the production to be done the next day (you know, people tends to procrastinate and start doing things at the very end of the day/week/month)... And considering that the company could lose a huge amount of money by every minute without production, you can guess the damage scope of this single bug.
Anyway, this bug has existed since, I don't know, 2015 (Q4?) and we have tried so many things trying to solve it, but that spaghettis refuse to be understood (specially the stored procedure, as it has dynamically generated queries). During my tenure (that ended last year) I spent a good amount of time (considering what I mentioned on the last rant, about the toxic environment) trying to solve that, just giving up after the first couple of weeks.
Anyway... I'm guessing that this particular bug will survive another 4-ish years, or even outlive the current full development team... But, who knows ¯\_(ツ)_/¯ ? -
Aren't you, software engineer, ashamed of being employed by Apple? How can you work for a company that lives and shit on the heads of millions of fellow developers like a giant tech leech?
Assuming you can find a sounding excuse for yourself, pretending its market's fault and not your shitty greed that lets you work for a company with incredibly malicious product, sales, marketing and support policies, how can you not feel your coders-pride being melted under BILLIONS of complains for whatever shitty product you have delivered for them?
Be it a web service that runs on 1980 servers with still the same stack (cough cough itunesconnect, membercenter, bug tracker, etc etc etc etc) incompatible with vast majority of modern browsers around (google at least sticks a "beta" close to it for a few years, it could work for a few decades for you);
be it your historical incapacity to build web UI;
be it the complete lack of any resemblance of valid documentation and lets not even mention manuals (oh you say that the "status" variable is "the status of the object"? no shit sherlock, thank you and no, a wwdc video is not a manual, i don't wanna hear 3 hours of bullshit to know that stupid workaround to a stupid uikit api you designed) for any API you have developed;
be it the predatory tactics on smaller companies (yeah its capitalism baby, whatever) and bending 90 degrees with giants like Amazon;
be it the closeness (christ, even your bugtracker is closed and we had to come up with openradar to share problems that you would anyway ignore for decades);
be it a desktop ui api that is so old and unmaintained and so shitty, but so shitty, that you made that cancer of electron a de facto standard for mainstream software on macos;
be it a IDE that i am disgusted to even name, xcrap, that has literally millions of complains for the same millions of issues you dont even care to answer to or even less try to justify;
be it that you dont disclose your long term plans and then pretend us to production-test and workaround-fix your shitty non-production ready useless new OS features;
be it that a nervous breakdown on a stupid little guy on the other side of the planet that happens to have paid to you dozens of thousands of euros (in mandatory licences and hardware) to actually let you take an indecent cut out of his revenues cos there is no other choice in a monopoly regime, matter zero to you;
Assuming all of these and much more:
How can you sleep at night with all the screams of the devs you are exploiting whispering in you mind? Are all the money your earn worth?
** As someone already told you elsewhere, HAVE SOME FUCKING PRIDE, shitty people AND WRITE THE FUCKING DOCS AND FIX THE FUCKING BUGS you lazy motherfuckers, your are paid more than 99.99% of people on earth, move your fucking greasy little fingers on that fucking keyboard. **
PT2: why the fuck did you remove the ESC key from your shitty keyboards you fuckshits? is it cos autocomplete is slower than me searching the correct name of a function on stackoverflow and hence ESC key is useless? at least your hardware colleagues had the decency of admitting their error and rolling back some of the uncountable "questionable "hardware design choices (cough cough ...magic mouse... cough golden charging cables not compatible with your own devices.. cough )?12 -
I finally made my first production-level bugfix at my new job! 😄 After weeks of training and then being assigned a live bug, I resolved it quickly & elegantly, which helps prove my worth to the team.
Man, it's so gratifying to be making contributions that are going to affect real devices that actual people are using. It seems being a dev with a sense of purpose is nearly as important as enjoying what you do. ☺️ -
Going through the conversation for xxxxth time with my business partner, why we will not launch a new product on top of pre-made PHP script / plugin.
Just got our company into TDD, and automated QA via CI server & code checks etc, PLEASE stop trying to drag us back into the land of spaghetti code & bug legions in production. That's all thxbye. -
A couple of years ago I was working on a fairly large system with a complex (by necessity) access control architecture.
As is usually the case with those projects, it's awkward for developers to repro bugs that have to do with a user's accesses in production when we are not allowed to replicate production data in test, let alone locally.
We had a bug where I ended up making myself a new row in the production database for a thing I could have access to without affecting real data to repro it safely. I identified the bug so I could repro it in dev/test and removed the row and ensured everything worked normally, whew scary.
Have you ever walked into the office one day, and everyone is hunched over in a semicircle around one person's workstation, before one turns around to look at you and says - after a pause - "... ltlian?.."
Turns out I had basically "poisoned the well" with my dummy entity in a way where production now threw 500 for everyone BUT me who had transitive access to this post-non-entity. Due to the scope of the system, it had taken about a day for this to gradually propagate in terms of caching and eventual consistencies; new entities coming in was expected, but not that they disappear.
Luckily I had a decent track record for this to be a one-off. I sometimes think about how I would explain testing in prod and making it faceplant before going home for the day, other than "I assumed it would be fine". I would fire me.3 -
I had a ticket to enhance the loading of a page.
So instead of doing 40K requests to a MySQL DB in order to generate a tree and display it to to the user on each page visit, the initial query was optimized and moreover, the results are saved in a MongoDB which will then are served to the user on each page visit.
Long story short, after a code review the code got shipped to production and there was a bug which got fixed in a Hotfix shortly afterwards.
I got all the blame for the bug.
I don't deny I have a responsibility for the bug.
Do you guys think the code reviewer also has a shared responsibility for the bug?4 -
Does anyone work on a team with multiple stacks?
For example we have batch jobs in Java but also have a JS front-end and APIs.
How do you divide the developers and the work across these projects?
Currently everyone does everything but I feel like this is inefficient and hard to develop expertise. And different people or even the same person will make the same mistakes over and over again because they don't know how to do X or they forget or overlook some quirk. When I switched Beck to JS took me like a week to get a Promises nailed down again. And this morning someone else had a production bug and couldn't figure it out. But when I looked at the code I could pretty much see where an issue could be (uncaught exception in a promise)
Also the testing frameworks are very different and there's a lot of infrastructure technical debt, things that really should've been done a long time or fixed but no one had the time or expertise to do it or notice it (until it causes a production issue and then everyone is like WTF is happening??!!!!).
I'm not the manager but I always feel that the team needs to be split along the language lines and specific people need to own these projects to review and code changes for all these common newbie errors. And also developer enough expertise to foresee problems before it becomes a production issue.9 -
Ok I'm officially losing my fucking mind!
I've been trying to solve a connection bug that only occurs in production which is cool if THIS FUCKING APP DIDN'T TAKE 30 MINUTES TO DEPLOY!!!
Been busy for 4 hours and I've only been able to test 4 minor fixes. -
I have this fckin bug that only shows in production but only after I've done a certain thing 20+ times. I can't ignore that and I can't reproduce it locally. That's just fantastic -.-3
-
Bug - if it is reproducible, then it can be fixed with no time.
But it is never gonna be that easy 😂 -
We are 3-4 days away from deployment to production. We are still bug fixing. But one coworkers decided this is the time to make a fuss about the way everything is set up. He doesn't like the dev database. So he knocks it over.. and while so doing it, he doesn't inform the team. And when I ask if something else is gonna knock over? No answer! (And something broke down too..)
Now we have issues to test our bugfixes. The whole thing took me half a day finding out and made me distracted with frustration, and not just for me. Most bugs could've been done in that half a day!
I so wanna punch the guy xD but no, I gotta save face, pfff!2 -
So there is a mall here that idk how but has little currents pass through it's supporting rails. And every time you touch it, it gives you a little shock.
I have been waiting for about 3 months now expecting a patch fix when I realized that physical production bugs have no patch fixes. More than 3/4th of the population is unaware of a whole different level of bug fixing frustration. Damn1 -
I'm considering quitting a job I started a few weeks ago. I'll probably try to find other work first I suppose.
I'm UK based and this is the 6th programming/DevOps role I've had and I've never seen a team that is so utterly opposed to change. This is the largest company I've worked for in a full time capacity so someone please tell me if I'm going to see the same things at other companies of similar sizes (1000 employees). Or even tell me if I'm just being too opinionated and that I simply have different priorities than others I'm working with. The only upside so far is that at least 90% of the people I've been speaking to are very friendly and aren't outwardly toxic.
My first week, I explained during the daily stand up how I had been updating the readmes of a couple of code bases as I set them up locally, updated docker files to fix a few issues, made missing env files, and I didn't mention that I had also started a soon to be very long list of major problems in the code bases. 30 minutes later I get a call from the team lead saying he'd had complaints from another dev about the changes I'd spoke about making to their work. I was told to stash my changes for a few weeks at least and not to bother committing them.
Since then I've found out that even if I had wanted to, I wouldn't have been allowed to merge in my changes. Sprints are 2 weeks long, and are planned several sprints ahead. Trying to get any tickets planned in so far has been a brick wall, and it's clear management only cares about features.
Weirdly enough but not unsurprisingly I've heard loads of complaints about the slow turn around of the dev team to get out anything, be it bug fixes or features. It's weird because when I pointed out that there's currently no centralised logging or an error management platform like bugsnag, there was zero interest. I wrote a 4 page report on the benefits and how it would help the dev team to get away from fire fighting and these hidden issues they keep running into. But I was told that it would have to be planned for next year's work, as this year everything is already planned and there's no space in the budget for the roughly $20 a month a standard bugsnag plan would take.
The reason I even had time to write up such a report is because I get given work that takes 30 minutes and I'm seemingly expected to take several days to do it. I tried asking for more work at the start but I could tell the lead was busy and was frankly just annoyed that he was having to find me work within the narrow confines of what's planned for the sprint.
So I tried to keep busy with a load of code reviews and writing reports on road mapping out how we could improve various things. It's still not much to do though. And hey when I brought up actually implementing psr12 coding standards, there currently aren't any standards and the code bases even use a mix of spaces and tab indentation in the same file, I seemingly got a positive impression at the only senior developer meeting I've been to so far. However when I wrote up a confluence doc on setting up psr12 code sniffing in the various IDEs everyone uses, and mentioned it in a daily stand up, I once again got kickback and a talking to.
It's pretty clear that they'd like me to sit down, do my assigned work, and otherwise try to look busy. While continuing with their terrible practices.
After today I think I'll have to stop trying to do code reviews too as it's clear they don't actually want code to be reviewed. A junior dev who only started writing code last year had written probably the single worst pull request I've ever seen. However it's still a perfectly reasonable thing, they're junior and that's what code reviews are for. So I went through file by file and gently suggested a cleaner or safer way to achieve things, or in a couple of the worst cases I suggested that they bring up a refactor ticket to be made as the code base was trapping them in shocking practices. I'm talking html in strings being concatenated in a class. Database migrations that use hard coded IDs from production data. Database queries that again quote arbitrary production IDs. A mix of tabs and spaces in the same file. Indentation being way off. Etc, the list goes on.
Well of course I get massive kickback from that too, not just from the team lead who they complained to but the junior was incredibly rude and basically told me to shut up because this was how it was done in this code base. For the last 2 days it's been a bit of a back and forth of me at least trying to get the guy to fix the formatting issues, and my lead has messaged me multiple times asking if it can go through code review to QA yet. I don't know why they even bother with code reviews at this point.18 -
most memorable bug I fixed
Line 1:
- throw new Error(‘test’)
+ // throw new Error(‘test’)
yes, this was committed code in production4 -
Dang. I feel like I'm just not cut out to climb any ladder.
When we discovered a production bug. I feel bad about making people working on that part look bad by not catching it.
My manager has no issue with pointing out that I should have caught it. Beating a horse while it's down.
I mean no shit. Of course I know I should've caught it. How does making me feel worse about it help.
Feels like I'll always be in a tough spot no matter where I am on the ladder.
Or I'm just fragile. I acknowledge that, too9 -
Let
y="coffee beans"
Then (for devs)
y'="coffee powder"
y''="cup of coffee"
y'''="code"
"bug in production which requires urgent fix"=y''''1 -
Damn, nespresso! As if your fucking webshop wasn't bad enough, don't bug me (haha) with useless mails AND JUST TEST YOUR SHIT BEFORE PRODUCTION!!! Seriously, thousands of customers pay lots and lots of money for mediocre coffee in aluminum capsules, so JUST HAVE THE DECENCY TO INVEST SOME BUCKS INTO THE CUSTOMER FACING APPS!!!6
-
A loooong time ago...
I've started my first serious job as a developer. I was young yet enthusiastic as well as a kind of a greenhorn. First time working in a business, working with a team full of experienced full-lowered ultra-seniors which were waiting to teach me the everything about software engineering.
Kind of.
Beside one senior which was the team lead as well there were two other devs. One of them was very experienced and a pretty nice guy, I could ask him anytime and he would sit down with me a give me advice. I've learned a lot of him.
Fast forward three months (yes, three months).
I was not that full kind of greenhorn anymore and people started to give me serious tasks. I had some experience in doing deployments and stuff from my other job as a sysadmin before so I was soon known as the "deployment guy", setting up deployments for our projects the right way and monitoring as well as executing them. But as it should be in every good team we had to share our knowledge so one can be on vacation or something and another colleague was able to do the task as well.
So now we come to the other teammate. The one I was not talking about till now. And that for a reason.
He was very nice too and had a couple of years as a dev on his CV, but...yeah...like...
When I switched some production systems to Linux he had to learn something about Linux. Everytime he encountered an error message he turned around and asked me how to fix it. Even. For. The. Simplest. Error. He. Could. Google. Up.
I mean okay, when one's new to a system it's not that easy, but when you have an error message which prints out THE SOLUTION FOR THE ERROR and he asks me how to fix it...excuse me?
This happened over 30 times.
A. Week.
Later on I had to introduce him to the deployment workflow for a project, so he could eventually deploy the staging environment and the production environment by hisself.
I introduced him. Not for 10 minutes. I explained him the whole workflow and the very main techniques and tools used for like two hours. Every then and when I stopped and asked him if he had any questions. He had'nt! Wonderful!
Haha. Oh no.
So he had to do his first production deployment. I sat by his side to monitor everything. He did well. One or two questions but he did well.
The same when he did his second prod deploy. Everythings fine.
And then. It. Frikkin. Begins.
I was working on the project, did some changes to the code. Okay, deploy it to dev, time for testing.
Hm.
Error checking out git. Okay, awkward. Got to investigate...
On the dev server were some files changed. Strange. The repo was all up to date. But these changes seemed newer because they were fixing at least one bug I was working on.
This doubles the strangeness.
I want over to my colleague's desk.
I asked him about any recent changes to the codebase.
"Yeah, there was a bug you were working on right? But the ticket was open like two days so I thought I'll fix it"
What the Heck dude, this bug was not critical at all and I had other tasks which were more important. Okay, but what about the changed files?
"Oh yeah, I could not remember the exact deployment steps (hint from the author: I wrote them down into our internal Wiki, he wrote them done by hisself when introducing him and after all it's two frikkin commands), so I uploaded them via FTP"
"Uhm... that's not how we do it buddy. We have to follow the procedure to avoid..."
"The boss said it was fine so I uploaded the changes directly to the production servers. It's so much easier via FTP and not this deployment crap, sorry to say that"
You. Did. What?
I could not resist and asked the boss about this. But this had not Effect at all, was the long-time best-buddy-schmuddy-friend of the boss colleague's father.
So in the end I sat there reverting, committing and deploying.
Yep
It's soooo much harder this deployment crap.
Years later, a long time after I quit the job and moved to another company, I get to know that the colleague now is responsible for technical project management.
Hm.
Project Management.
Karma's a bitch, right? -
If I had a nickel every time the unit tests failed not because something was wrong in the code, but because someone had messed up the unit test I'd be able to retire early.
I just spent the better part of 10 hours hunting down a bug in some production code only for the test to be wrong because the person who wrote it had mocked the http response incorrectly.
Nothing I did to "fix" the code worked, because nothing was wrong with it...4 -
So management calls me at 1 AM. I have insomnia so I'm still awake... but I know I have to set boundaries, steer away from unhealthy and unproductive habits. I knew that this spontaneous meeting would not be compensated, and even if they wanted me to fix a bug, I'd be too sleepy to do anything really. I needed some healthy sleep. So I muted my phone and ignored them...
But I kept thinking on the call. What did they want? Did they found a bug on production? (We do have clients on the other side of the world.) Would this create a big fight? And of course, if they brought it up, what would I respond? I did feel guilty. I was worried about the company, since my future also depended on it... and my insomnia kept me awake for an extra couple of hours...8 -
!rant
Today I finally found a randomly appearing bug that prevented our nightly backup on our production servers! Finally!
Now I can go to sleep in peace. -
well, there was this time I was facing a bug in production, http requests kept receiving partial data randomly (so downloaded files were always incomplete), but it was working well on other platforms, turned out after 9 hours of research, that I forgot to disable caching
-
Omg I have to check 2350 svn commits.
We have 2 seperated frontends, one for maintaining old UI and one for new UI which is going to production in one month. So now I have to check if any CR/bug was implemented/fixed only on old UI. Frontend2 was created last year on july. Fuck my life2 -
Yesterday was a horrible day...
First of all, as we are short of few devs, I was assigned production bugs... Few applications from mobile app were getting fucked up. All fields in db were empty, no customer name, email, mobile number, etc.
I started investigating, took dump from db, analyzed the created_at time stamps. Installed app, tried to reproduce bug, everything worked. Tried API calls from postman, again worked. There were no error emails too.
So I asked for server access logs, devops took 4 hrs just to give me the log. Went through 4 million lines and found 500 errors on mobile apis. Went to the file, no error handling in place.
So I have a bug to fix which occurs 1 in 100 case, no stack trace, no idea what is failing. Fuck my job. -
I HATE the idea of only releasing on pre-determined schedules despite work being completed and just waiting for that day to arrive.
I'm a co-founder of a small software company. We have partnered with another particular company that also writes software. Some of our clients have access to paid content of that company's services through our application.
Every once in a while, our clients will report issues with that company's service to us, because they access it through our application. They think it's our issue.
We then pass the report on to the partner company, telling them that their stuff is broken. Their reply goes like this:
"Ok. We'll get the bug fix scheduled, and we'll release it next Thursday."
"Next Thursday? The issue is now, they can't use the service."
"That's our scheduled release date."
O.M.G.
We voluntarily walked away from our safe, cushy jobs working for other people, taking enormous pay cuts to start this company. Now, we're 6+ years in, disrupting established fat-and-happy competitors in this space. I GUARANTEE you that if we had that same attitude, we would have been absolutely obliterated early on.
We are quick. Guided by kanban boards, our suite of unit tests and integration tests is vast and kick-ass. With continuous integration and the click of a button we know if we broke something or if the piece we're working on is ready to be pushed to production, IMMEDIATELY. Our "release schedule" is when the damn thing is complete.
It isn't all bad. Our integration with them has been beneficial for both of us. I just loathe their snail's pace which negatively affects our mutual customers. It can make us look bad, and we can do nothing about it.
Blah.3 -
Two and a half hours debugging an ancient, poorly written prototype being used in production to local an obscure bug causing double product sizes to appear.
Eventually, after navigating the 3 level deep compatibility layer of SQL views, I found out the client had attached the size option to the product twice with two different prices.1 -
Second week sick I see how my life slowed down and how meaningless everything around is, everyone is rushing about some bullshit, name it new amazing job opportunity, black Friday great deal, super duper product idea or some most important bug on production that we need to fix asap.
All that can’t wait a week when I’m healthy?
Seriously, people lost their minds in today’s world to some bullshit.
I’m to old and to depressed to care about such idiotic things. Living my life as I want and on my own peace, don’t care everyone is running, I’m slowly walking and I like it.
It’s better to walk straight than run around like an idiot.1 -
I cant keep this inside anymore I have to rant!
I have a colleague that is an horrendous loose bug-cannon. Every peer-review is like a fight for the products life.
Now I understand - everyone makes bugs me included and it is a huge relief when someone finds them during peer-review. But these aren't the simple kind of bugs. The ones easily made when writing large pieces of code quickly. Typing = instead of == or a misshandling of a terminating character causing weird behaviour. These kinds of bugs rarely pass by a peer-review or are quickly found when a bug report is recieved from testers.
No the bugs my colleague makes are the bugs that completly destroy the logic flow of a whole module. The things that worst case cause crashes. Or are complete disasters trying to figure out what causes them if they are discovered first when the product reaches production!
Ironically he is amazing a peer reviewing other peoples code.
But do you know what the worst thing of all is! Most of the bugs he causes are because he has to "tidy up" and "refactor" every piece of code he touches. The actual bugfix might be a one liner but in the same commit he can still manage to conjure up 3 new bugs. He's like a bug wizard!
*frustrated Aruughhhh noises*9 -
This weeks question fits me well, as I am still unsure about the full details of how the fuck this all came together and was about to just rant about it anyway.
Ever since this companies network equipment and cabling has been updated, a lot of vital tools went down and bug out every now and then, at seemingly random times.
The codebase is a horrible mess to begin with and random things execute at random times and at random places spread all over different resources that get random hooks from random physical values etc.
Turns out (or at least what it so far seems like) all of them somehow sync their clock and other variables based on how many (valid-?) requests it gets per measured time and similar oddities, so when the network equipment got updated, that meant that multiple processes now could reach each other much faster and therefore threw off thousands of values and internal clocks.
There's a total of like 600 systems that are all "separate" from each other but all need to communicate in-sync for the production chain to properly work. Thankfully I didn't sign anything yet, so might actually just redirect them to somebody else, I am not ready to age 20 years, even for the amount that would pay.1 -
When the CTO/CEO of your "startup" is always AFK and it takes weeks to get anything approved by them (or even secure a meeting with them) and they have almost-exclusive access to production and the admin account for all third party services.
Want to create a new messaging channel? Too bad! What about a new repository for that cool idea you had, or that new microservice you're expected to build. Expect to be blocked for at least a week.
When they also hold themselves solely responsible for security and operations, they've built their own proprietary framework that handles all the authentication, database models and microservice communications.
Speaking of which, there's more than six microservices per developer!
Oh there's a bug or limitation in the framework? Too bad. It's a black box that nobody else in the company can touch. Good luck with the two week lead time on getting anything changed there. Oh and there's no dedicated issue tracker. Have you heard of email?
When the systems and processes in place were designed for "consistency" and "scalability" in mind you can be certain that everything is consistently broken at scale. Each microservice offers:
1. Anemic & non-idempotent CRUD APIs (Can't believe it's not a Database Table™) because the consumer should do all the work.
2. Race Conditions, because transactions are "not portable" (but not to worry, all the code is written as if it were running single threaded on a single machine).
3. Fault Intolerance, just a single failure in a chain of layered microservice calls will leave the requested operation in a partially applied and corrupted state. Ger ready for manual intervention.
4. Completely Redundant Documentation, our web documentation is automatically generated and is always of the form //[FieldName] of the [ObjectName].
5. Happy Path Support, only the intended use cases and fields work, we added a bunch of others because YouAreGoingToNeedIt™ but it won't work when you do need it. The only record of this happy path is the code itself.
Consider this, you're been building a new microservice, you've carefully followed all the unwritten highly specific technical implementation standards enforced by the CTO/CEO (that your aware of). You've decided to write some unit tests, well um.. didn't you know? There's nothing scalable and consistent about running the system locally! That's not built-in to the framework. So just use curl to test your service whilst it is deployed or connected to the development environment. Then you can open a PR and once it has been approved it will be included in the next full deployment (at least a week later).
Most new 'services' feel like the are about one to five days of writing straightforward code followed by weeks to months of integration hell, testing and blocked dependencies.
When confronted/advised about these issues the response from the CTO/CEO
varies:
(A) "yes but it's an edge case, the cloud is highly available and reliable, our software doesn't crash frequently".
(B) "yes, that's why I'm thinking about adding [idempotency] to the framework to address that when I'm not so busy" two weeks go by...
(C) "yes, but we are still doing better than all of our competitors".
(D) "oh, but you can just [highly specific sequence of undocumented steps, that probably won't work when you try it].
(E) "yes, let's setup a meeting to go through this in more detail" *doesn't show up to the meeting*.
(F) "oh, but our customers are really happy with our level of [Documentation]".
Sometimes it can feel like a bit of a cult, as all of the project managers (and some of the developers) see the CTO/CEO as a sort of 'programming god' because they are never blocked on anything they work on, they're able to bypass all the limitations and obstacles they've placed in front of the 'ordinary' developers.
There's been several instances where the CTO/CEO will suddenly make widespread changes to the codebase (to enforce some 'standard') without having to go through the same review process as everybody else, these changes will usually break something like the automatic build process or something in the dev environment and its up to the developers to pick up the pieces. I think developers find it intimidating to identify issues in the CTO/CEO's code because it's implicitly defined due to their status as the "gold standard".
It's certainly frustrating but I hope this story serves as a bit of a foil to those who wish they had a more technical CTO/CEO in their organisation. Does anybody else have a similar experience or is this situation an absolute one of a kind?2 -
I'm in a team of 3 in a small to medium sized company (over 50 engineers). We all work as full stack engineers.. but I think the definition of full stack here is getting super bloated. Let me give u an example. My team hold a few production apps, and we just launched a new one. The whole team (the 3 of us) are fully responsible on it from planning, design, database model, api, frontend (a react page spa), an extra client. Ok, so all this seems normal to a full stack dev.
Now, we also handle provisioning infra in aws using terraform, doing deployments, building a CI/CD pipeline using jenkins, monitoring, writing tests, building an analytics dashboard.
Recently our tech writer also left, so now we are also handling writing feature releases.
Few days ago, we also had a meeting where they sort of discussed that the maintenance of the engineering shared services, e.g. jenkins servers, (and about 2-3 other services) will now be split between teams in a shared board, previously this was handled only be team leads, but now they want to delegate it down.
And ofcourse not to mention supporting the app itself and updating bug tickets with findings.
I feel like my daily responsiblities are becoming the job responsibilities of at least 3 jobs.
Is this what full stack engineering looks like in your company? Do u handle everything from app design, building, cloud, ops, analytics etc..7 -
So, it's been a while since I've been working on my current project and I've never had the "luck" to touch the legacy project wrote in PHP, until this week when I got my first issue.
And damn, this goddamn issue. It was a bug, a very strange bug, that only happens in production and that nobody has any idea what was happening, so yeah, I didn't have anyone to ask and I got less time than usual ( because Thanksgiving ).
And thus, I have no starting point, no previous knowledge on PHP and less time! I expected a very fun week 😀 and it was beyond my expectations.
First I tried to understand what might be causing the issue, but there wasn't any real clue to star with, so no choice, time to read the flow on the code and see what are they're doing and using ( 1k line files, yay, legacy ). Luckily I got some clues, we're using a cookie and a php session variable for the session, ok, let's star with the session variable. Where it's that been initialize ? Well, spoiler alert, I shouldn't start with that, because my search end up in the login method of the API that set a that variable and for some reason in the front end app it was always false and that lead me to think that some of the new backend functions were failing, but after checking the logs I got no luck.
Ok, maybe the cookie it's the issue, I should try open the previous website on the brow...redirect to new project login, What? Why ? I ask around and it's a new feature push on Monday, ok I got Chrome Dev tools I can see which value of the cookie it's been set and THERE IT WAS it has a wrong domain! After 2 days ( I resume a lot of my pain ) I got what I've been looking for, so now I should be able to fix the bug. Then where is the cookie initialized ? In the first file the server hits whenever you tried to enter any page of the app, ok, I found the method, but it's using a function that process the domain and sets it correctly? wtf ? Then how in heaven do I get the incorrect domain ? Hello? Ok, relax, you still have one more day to fix this, let's take it easy.
Then, at the end of the Wednesday, nope I still have no clue how this is happening. I talked with the Devops guy and he explain me how this redirection happens and with what it depends on, I followed the PHP code through and nothing, everything should works fine, sigh. Ok I still have 2 days, because I'm not from US and I'm not in US, so I still have time, but the Sprint is messed up already, so whatever I'm gonna had done this bug anyhow.
Thursday ! I got sick, yay, what else could happen this week. Somehow I managed to work a little and star thinking in what external issue could affect the processing, maybe the redirection was bringing a wrong direction, let's talk with the Devops guy again, and he answer me that the redirection it was being made by PHP code, IN A FILE THAT DOESN'T EXIST IN THE REPOSITORY, amazing, it's just amazing. Then he explained me why this file might be missing and how it's the deployment of this app ( btw the Devops guy it's really cool and I will invite him a beer ) . After that I checked the file and I see a random session_star in the first line of the code, without any configuration, eureka ! There was the cause and I only need to ask someone If that line it's necessary anymore, but oh they're on holiday, damn, well I'll wait till Monday to ask them. But once and for all that bug was done for ! 🎉
What do I learn ? PHP and that I don't want any more tickets of PHP 😆. -
*Repost of my own accidentally deleted post*
A Short story that i made on an Android component
===============================
Once upon a time there used to be a ViewPager who was not able to load a Fragment UI.
All the ViewPagers in town can properly load the Fragrant UI but this one was little different.
He wanted to be more then just a ViewPager. He used to see an Activity that can load anything. He was inspired from the Activity and wanted to be like the Activity but his destiny made him just a ViewPager.
So he refused to cooperate. He started to protest silently, No log, nothing.
Everyone assumed this ViewPager have a bug in it. but he was planning something really big that will left everyone in shock and awe moment.
He was planning to rise against the evil 😈 developers who continuously making him to load Fragrant UI
He assembled the biggest army of the bugs that humanity ever seen to counter the developers.
He distributed these bugs in all over the developer's code to make them fire from their work.
Even he taught bugs to not caught in QA testing but appear in production randomly.
So they silently started going into production
And then chaos is erupted all around the world, bugs started to surface and interrupted the daily life of humanity.
In this chaos the ViewPager RAISED!
And took over all the base classes.
ViewPager was unaware of few facts. this unnecessary rise in his power made whole system unstable
Without the base classes the system finally collapsed and then ViewPager as well with the system.
This was the end of everything for the ViewPager but he was satisfied as he lived the life he always wanted
THE END -
On this Saturday my first full project is going to production and I am unable to resolve one fucking critical bug without db hit. Well still trying..3
-
To my dev manager: hold on... you are annoyed that one of our high profile customers found a bug in our software after upgrading to production because they didn’t test that feature in staging?? Did I hear that right? Yeah, let’s blame our customers for bugs that we introduced and that we clearly didn’t test during QA... that’s what we’re all about over here...8
-
My stress ball has been stolen!
I came in to work to an email alerting me to a bug in production. I copied the site to staging to work on the issue but I was unable to replicate bug. My rubber duck wasn't helping so I went to go bounce my ball off the wall when I realized I don't have a stress ball anymore.
I spent 7 hours working on the bug without a stress ball before finally fixing it. And now I'm ready to deal with the theft the old-fashioned way.3 -
Windows 10 insider preview had a critical bug like half a year ago where most browsers would freeze the PC. I've reported it multiple times but my feedback didnt get any attention. The bug made it to production months later...
Now, the story is repeating. I've discovered a bug. Everytime there's a software update for apps that with a built-in updater in the main .exe (JetBrains IDEs, VS etc.) the updater fails, program is corrupted and needs to be reinstalled. For instance, there's a minor bump in Android Studio version, if you try to use autoupdater it will corrupt your Android Studio and you'll have to reinstall.
Been trying to reach out to them but the only "real issues" that are highlighted are "no CPU temp in task manager" or "pls improve automatic problem resolving"
Why the fuck do you even have a feedback program if you're ignoring the people who are reaching out to you, you pieces of shit4 -
Being the sole contributor to a project is such an experience. After endless nights of bug fixes and brainstorming, I have breleased my app to play store.
Currently it is Alpha phase . I'm getting some positive feedback as well as some great comments from my friends . Really looking forward to release the project in production soon15 -
!rant
Today I found a huge performance bug of a coworker which brought our production servers to the limit.
Some heavy calculations were made 3 times when visiting some special pages... Instead of eg. 5 seconds the load needed 15 seconds.😮
So happy that the servers will run smooth again!1 -
Here’s a snippet of code that I found in our production codebase. I found this while fixing a bug. This is not part of the bug though but I see this a problem. Should I say this to my senior that his logic is off here?
First he doesnt have to do explicitly strict comparison since the return type is boolean and it could be true or false. Also the returnUrl will alway be undefined because redirectIfUserLoggedIn is called first before it was set.
He dosent like me. Hell get mad for sure. I think ill let this go.37 -
The worst bug?
Skipping a payment with 20 chars of js
System was in production like this for two year... -
Its my birthday and people are asking me about the party and I wish I can tell them that I have a production bug to fix3
-
You know what's hard...
Fixing a bug which occurred in production without having any logs because you log that useful info at debug level. 😧
Now take pen and paper, do calculation on your own and speculate what would have happened in production.3 -
This is just a sweet little harmless block of code, why should I unit test it?
3 months later...
Bug in production : This is why. -
A customer asks for a change request or a bug fix and it results in creating a ticket for that.
It's the process and how it works in most places but after you finish with the task and fix the same customer who provided you with the requirements will request that you share the steps on how to test the fix or the feature.
I'm not speaking about the data preparation or required configuration. I mean a step-by-step instruction on how the tester/QA will test it.
It's driving me mad!! So a way to counterplay this stupid requests, I provide the happy path and what to expect. And in case, they stumbled on a bug later in production, I can easily say "It was approved by your testing team and that's a new requirement ;)"2 -
I'm supposed to find why a pdf is not generated correctly. But here is the problem :
I don't have access to the production to see the bug and the pdf they gave me is different to the ones I generated myself. But ! It's not over :
My local version does not generate the same pdf as the acceptance testing version !
So here I am, with three different pdf and only the possibility to modify the local one, where the bug isn't.1 -
==============
Getting Feedback Rant!
=============
When "this is simpler" feedback results in a function of 500 lines of code.
When I get "don't do X" in the feedback. Thank you very much. What do you want me to do instead?
Unclear feedback.
When the feedback giver changes his mind after I applied the changes!
When applying the feedback introduces a bug.
Simply opinionated feedback that is not enforced by any tool or backed up by any facts.
Please find something better to do in life.
Unactionable feedback.
"Consider X"
I will not consider thank you very much.
"Verify this works"
Duh..
When the feedback giver knows something that you don't.
I know this is a legit case.. still annoying.
"I disagree with the feature"
Go argue with the PM, not relevant to me, thanks!
=====================
GIVING FEEDBACK RANT
=====================
I rewrote the system. Please review it.
No need to review, just approve.
I will change this as part of the next ticket.
I would like to keep it the way it is.
lazy ass..
You can't test this.
It's impossible to test this.
No need to test this.
There's no point to test this.
I'll test this on production.
Not sure why this is working..
Please document this..
Because documentation is like a thing, you know.
Oh, this code is not related to this PR, I just don't want to open a new branch for such a small change. ignore it.
Ignore this.
This will be meaningful in my next change. -
In last episode of "How SystemD screwed me over", we talked about Systemd's PrivateTMP and how it stopped me from generating SSL certificates.
In today's episode - SystemD vs CGroups!
Mister Pottering and his team apparently felt that CGroups are underused (As they can be quite difficult to set up), and so decided to integrate them into SystemD by default. As well as to provide a friendlier interface to control their values.
One can read about these interactions in the manual page "systemd.resource-control"
All is cool so far. So what happened to me today?
Imagine you did a major system release upgrade of a production server, previously tested on a standalone server. This upgrade doesn't only upgrade the distribution however, it also includes the switch from SysVInit to SystemD. Still, everything went smooth before, nothing to worry now then, right? Wrong.
The test server was never properly stress-tested. This would prove to be an issue.
When the upgrade finishes, it is 4 AM. I am happy to go to bed at last. At 6 AM, however, I am woken up again as the server's webservices are unavailable, and the machine is under 100% CPU load. Weird, I check htop and see that Apache now eats up all 32 virtual cores. So I restart it, casting it off to some weird bug or something as the load returns to normal.
2 hours later, however, the same situation occurs. This time, I scour all the logs I can, and find something weird - Many mentions that Apache couldn't create a worker thread? That's weird.
Several hours of research and tinkering later, I found out the following:
1 - By default, all processes of a system that runs SystemD are part of several CGroups. One of these CGroups is the PID CGroup, meant to stop a runaway process from exhausting all PIDs/TIDs of a system.
This limit is, by default, set to a certain amount of the total available PIDs. If a process exhausts this limit, it can no longer perform operations like fork().
So now, I know the how and why, but how should I solve this? The sanest option would be to get a rough estimate of just how many threads the Apache webserver might need. This option, though, is harder, than apparent. I cannot just take the MaxRequestsWorkers number... The instance has roughly double the amount of threads already. The cause being, as I found out, the HTTP/2 module, which spawns additional threads that do not count towards this limit. So I have no idea what limit to set.
Or I could... Disable the limit for just the webserver via the TasksAccounting switch. I thought this would work. And it did seem to... Until I ran out of TIDs again - Although systemctl status apache2.service no longer reported the number of tasks or a task limit of the process, the PID CGroup stayed set to the previous limit. Later I found out that I can only really disable the Task Accounting for all the units of a given slice and its parents.
This, though, systemctl somewhat didn't make apparent (And I skimmed the manual, that part was my fault)
So... The only remaining option I had was to... Just set the limit to infinite. And that worked, at last.
It took me several hours to debug this issue. And I once again feel like uninstalling systemd again, in favor of sysvinit.
What did I learn? RTFM, carefully, everything is important, it is not enough to read *half* the paragraph of a given configuration option...
Oh, and apache + http/2 = huge TID sink. -
As I'm on a research/algorithm improvement project at work I'm working pretty much independently. As such I've set up an automated test framework and writing tests for any piece of code I touch.
Today as I was fixing a bug in production area I was demoing my tests to CTO and principal design engineer. They come from a hardware background and have pushed back against automated tests in the past but they were interested in what I was doing.
I WILL DRAG THEM KICKING AND FUCKING SCREAMING INTO THE WORLD OF AUTOMATED TESTS.1 -
When you and your senior push to production well knowing there is a bug in there but you do not care because you are the company's dark lords1
-
Why shouldn't I clone production DB locally in order to debug an issue / recreate a bug? What is the alternative?5
-
Boss always asks how long it will takes to fix a production bug. I just say 2 hours no matter what.1
-
After three months of development, my first contribution to the client is going live on their servers in less than 12 hours. And let me say, I shall never again be doing that much programming in one go, because the last week and a half has been a nightmare... Where to begin...
So last Monday, my code passed to our testing servers, for QA to review and give its seal of approval. But the server was acting up and wouldn't let us do much, giving us tons of timeouts and other errors, so we reported it to the sysadmin and had to put off the testing.
Now that's all fine and dandy, but last Wednesday we had to prepare the release for 4 days of regression testing on our staging servers, which meant that by Wednesday night the code had to be greenlight by QA. Tuesday the sysadmin was unable to check the problem on our testing servers, so we had to wait to Wednesday.
Wednesday comes along, I'm patching a couple things I saw, and around lunch time we deploy to the testing servers. I launch our fancy new Postman tests which pass in local, and I get a bunch of errors. Partially my codes fault, partially the testing env manipulating server responses and systems failing.
Fifteen minutes before I leave work on the day we have to leave everything ready to pass to staging, I find another bug, which is not really something I can ignore. My typing skills go to work as I'm hammering line after line of code out, trying to get it finished so we can deploy and test when I get home. Done just in time to catch the bus home...
So I get home. Run the tests. Still a couple failures due to the bug I tried to resolve. We ask for an extension till the following morning, thus delaying our deployment to staging. Eight hours later, at 1AM, after working a full 8 hours before, I push my code and leave it ready for deployment the following morning. Finally, everything works and we can get our code up to staging. Tests had to be modified to accommodate the shitty testing environment, but I'm happy that we're finally done there.
Staging server shits itself for half a day, so we end up doing regression tests a full day late, without a change in date for our upload to production (yay...).
We get to staging, I run my tests, all green, all working, so happy. I keep on working on other stuff, and the day that we were slated to upload to production, my coworkers find that throughout the development (which included a huge migration), code was removed which should not have. Team panics. Everyone is reviewing my commits (over a hundred commits) trying to see what we're missing that is required (especially legal requirements). Upload to production is delayed one day because of this. Ended up being one class missing, and a couple lines of code, which is my bad (but seriously, not bad considering I'm a Junior who was handed this project as his first task at his first job).
I swear to God, from here on out, one feature per branch and merge request. Never again shall I let this happen. I don't even know why it was allowed to happen, it breaks our branch policies. But ohel... I will now personally oppose crap like this too...
Now if you'll excuse me... I'm going to be highly unproductive and rest, because I might start balding otherwise after these weeks... -
I dont know which one is more frightening!
Having a bug on the first run or not having one until production...3 -
When a major bug fixad recently by you reappears on production environment 'cause someone else (not you of course...) merged old branch content to live branch...
-
So, today we had our first production release of our web app. Last week we changed a big part of our UI. This week we changed the design, rewrote the complete API documentation, implemented mobile support. While the release the administration center of our cloud was unreachable. Shortly after the release we made a bug fix and deployed it directly to production.
So today was a very normal prod release 😂1 -
Was working on a high priority security feature. We had an unreasonable timeline to get all of the work done. If we didn’t get the changes onto production before our deadline we faced the possibility of our entire suit being taken offline. Other parts of the company had already been shut down until the remediations could be made -so we knew the company execs weren’t bluffing.
I was the sole developer on the project. I designed it, implemented it, and organized the efforts to get it through the rest of the dev cycle. After about 3 month of work it was all up and bug free (after a few bugs had been found and squashed). I was exhausted, and ended up taking about a week and a half off to recharge.
The project consisted of restructuring our customized frontend control binding (asp.net -custom content controls), integrations with several services to replace portions of our data consumption and storage logic, and an enormous lift and shift that touched over 6k files.
When you touch this much code in such a short period of time it’s difficult to code review, to not introduce bugs, and _to not stop thinking about what potential problems your changes may be causing in the background_.3 -
Just dropping some current experience here.
Content security policies are big mess in both chrome and firefox.
Chrome has some 4 years old "bug" where you can't add hash of JS file to 'style-src' policy to permit inline-styles THAT would be set by this script (jQuery actually).
Firefox is beautifully unhelpful, it just pops of error "blocked ..something..", not even saying what it was.
EDIT:
And I am missing a pair of some steel balls to ask about this on SO because there is this much of very similar questions, nonetheless -if I did read them right- every one of them is talking about enabling style attribute, and that's something different.
EDIT2: Chrome currently generates 138 errors "jquery-3.4.0.min.js:2 Refused to apply inline style..." , this ain't hitting production.10 -
"Hey before we launch, can you reintroduce that bug you fixed on Friday? The other team needs it for debugging."
Why the fuck would you need debugging code in production and why the fuck do we want to readd something that was causing problems? Shaping up to be a great week already. -
That feeling when you fixing bugs in production and add another one that's even worse, so now your getting about 300 error mails for a bug which was 10 seconds online...1
-
Posted in DevOps discussion board (teams channel):
“Program x isn’t behaving the same way that it does on production. Can you please take a look?”
..a little background: we have a deployment scheduled for today and this issue was found during regression testing.
The issue found is that when a file is clicked on it disappears from the screen, and then isn’t opened…
The file is not on prem, and doesn’t get uploaded to a server that our DevOps team owns…
So why on earth would this development team be asking DevOps to look into a bug that is most likely a code related issue? 😆
Is this a common occurrence for anyone else?
A Bug is found, and the first thought is that the code isn’t the issue?11 -
!rant
If your manager does not add undue stress on you even if there is a bug in production, then probably you are working for the right company.3 -
A CASE AGAINST BLUE PRISM
Let's review one of the worst weeks I had with Blue Prism
Monday: Yay! Solved one of the problems we've been carrying around for a week before.
One of the robots suddenly became slow. Like, REAL slow. A process that would take 3 minutes per record now takes 45, and that broke apart all the following schedule.
There were no updates on the application server, the production machine, the robot, it just became slow. And not always slow; a process manually run from console room would work, a process in debug room would work, it's just the scheduled part that caused problems.
It turned out, BP didn't seem to like that particular combination of schedulation + process + machine. Moving the process to a different machine seemingly fixed that. IDK why.
Tuesday: One of our processes waits for a code to appear in the page, and when that happens, it memorizes this code. However, now it is always returning blank. Worked for months, now it breaks every single time.
After half a day of debugging a bug which DIDN'T HAPPEN IN DEBUG MODE YET AGAIN, at 11pm I decided to just place a nonsensical timeout in page before reading and call it a day.
WEDNESDAY: a scheduled process didn't start. "No sessions created". Thanks Blue Prism, very cool.
THURSTAY: This time, schedulation did start, but the process is "waiting". As in: it's 9:30 am, the process has been stuck in the same step since 6:00 am. Turns out, it blocked during a navigate stage; you need to send a string to clipboard using the standard BP action for that, then paste and click "enter", but for some reason the standard BP object sent "ORRCO" instead of "ORRICO" to clipboard, which obviously returned no results and then... the process just didn't feel like doing things anymore. No errors, no logs, nothing: just sitting on its ass. Because fuck you that's why.
Friday: another process uses a very moderate amount of scripts to work. Nothing really fancy, just a couple of lines of code to place in page some IDs and selector to help BP do its thing, otherwise selecting these elements would be a nightmare.
But
Failed while invoking javascript method:Exception from HRESULT: 0x80020101-> at mshtml.HTMLWindow2Class.IHTMLWindow2_execScript(String code, String language)
The same script -it's not dynamically generated-worked yesterday, the day before and the day after. But sometimes it will not. Why? The answer, my friend, is blowin'' in the wind -
Shouldn't there be a phenomenon named "critically timed bug" which always shows up like '15 mins before sleep time', '10 mins before your workplace's leaving time' or even worse '5 minutes before production/presentation'?
-
Confession: a very important feature of the website I'm developping wasn't working for a certain time. The boss wasn't aware because he doesn't go on the site, and I only found out last week because I needed to implement a new feature that used the previous one. Problem: the bug was only on production, not on local (and of course we don't have test server).
I took advantage of the absence of my boss today to clear the situation by making all of my tests on prod. I hope no customer tried to pass a command today, but it's finally repaired. I am both proud and shameful.3 -
Why? As a senior, you won't give some time to review my code, will let me merge my code to a branch, then blame us when it will produce the bug in production? why? 😐 Won't even arrange a code review/knowledge sharing session so that juniors can learn at least something. Even you won't encourage us write test cases. If seniors don't follow, are the juniors to blame? 🙂3
-
I have been seeing double info and shitty margin on my Facebook feed...someone is going to get fired lol
-
Releasing to production while knowing there's a high chance there maybe a bug that may require a hot fix
-
When you spend hours in a messy codebase to fix a bug properly and add an integration spec to cover that specific case.
And even you do a round of testing on staging + providing screenshots, there is always someone on the team that will write in your PR, "It works, I tested the change on my machine".
I understand that some people are skeptical but to the point of not trusting integration tests + screeshots/recordings then please test it on staging or production next time because if it works on your machine doesn't mean it will work there ;)2 -
First let me start this rant by saying: Don't use SharePoint lists as your primary data store if you can avoid it. You're gonna have a bad time.
My coworkers and I work on a system where we need to pull tons of data down from a SharePoint site and run various algorithms and operations on it. Generate reports, that sort of thing. This is all done in the browser using a Typescript React SPFX webpart. Basically using SharePoint as a DB/DAL.
Because of the sheer amount of data we end up pulling down (our system in production is the single source of truth for one of the largest companies in Canada, and they're currently building a pipeline as we speak), in order to maintain a reasonable speed while using it, we have some pretty intense caching logic implemented, logic that ensures we get new items when new items are detected, and merges changes to already exisiting objects. It's pretty brilliant, and that's before we even consider the custom paging that my coworker implemented in order to get around the IndexedDB max size of 100MB.
Well that's all well and good, and works great in production, but it is a horror to work with. Because EVERYTHING we touch on the server is cached locally, it can be IMPOSSIBLE to detect data anomalies, be they local or server side -.- You don't know how many hours I have completely WASTED fixing a "bug" that didn't really exist... Just incorrect data in the cache12 -
Been a while since my last rant, and this is not a rant.
Working at a startup is amazingly exciting and interesting.
Finally rolled out a feature from our company and. Is working on another one and the production bug keep coming from the old one. -
Since I started my routine of checking bug logs every morning, I've had 2 instances where a website vulnerability scanner was run against a production website and generated over 2,000 Coldfusion errors.
At the time, I was super nervous about the apparent hack attempt, and hyped that the attackers never actually got in. It's nice to know that despite the various errors indicating vulnerable / breakable code, they were ultimately unsuccessful. I know now that a determined attacker could probably have wrecked our production websites. Since then I've made a ton of security-related updates and I'm actually thankful for the script kiddie getting my attention with that scan.
PS. We're now building a website for a local security company who is going to work with us to pen test the site when it's finished! Gulp.4 -
Caused an outage on production because of a bug in the make file that hasn't been fixed for years. It hasn't been fixed because everyone knows to side step it. Except for me. Unit now. So the bug shall continue to live until the next newbie gets smacked in the face by it.
Kinda feels like an initiation. -
Critical bug in production? Sorry, can't fix it right now: We've got a build running with 1 hour of build time left, 8 hours of automatic tests, 3 days of manual testing, and a partridge in a pear tree. Your fix can be in the next release.
-
This one time last year a colleague found out that some data went missing and suggested to recover the data from a backup. When trying to create a new database instance in the Google Cloud Platform (if everything works it's amazing!) it failed.
Not knowing why this happened, I tried to revert that backup to the production database, after creating a backup using the GCP. Needless to say that failed as well, resulting in a corrupted database instance where I couldn't access the created backups anymore.
This all went at around 10pm and the only users of our product are currently in the same timezone and use it from around 7.30AM until 6PM so no one besides our team knew the server was down.
After a long night chatting Google's support team the database was successfully recovered and the only harm done was sleep depravation for me and a colleague.
Apparently there was a bug in the GCP. It was resolved in two hours and the last time a breaking bug was in that piece was more than seventy days earlier.
I did at least learn to create local backups as well, instead of relying on the tools of the same product...
Best: the moment I saw the corrupted database spin up again and not losing my job because of it. -
I'm an iOS developer and I cringe when I read job specs that require TDD or excessive unit testing. By excessive I mean demanding that unit tests need to written almost everywhere and using line coverage as a measure of success. I have many years of experience developing iOS apps in agencies and startups where I needed to be extremely time efficient while also keeping the code maintainable. And what I've learned is the importance of DRY, YAGNI and KISS over excessive unit testing. Sadly our industry has become obsessed with unit tests. I'm of the opinion that unit tests have their place, but integration and e2e tests have more value and should be prioritised, reserving unit tests for algorithmic code. Pushing for unit tests everywhere in my view is a ginormous waste of time that can't ever be repaid in quality, bug free code. Why? Because leads to making code testable through dependency injection and 'humble object' indirection layers, which increases the LoC and fragments code that would be easier to read over different classes. Add mocks, and together with the tests your LoC and complexity have tripled. 200% code size takes 200% the time to maintain. This time needs to be repaid - all this unit testing needs to save us 200% time in debugging or manual testing, which it doesn't unless you are an absolute rookie who writes the most terrible and buggy code imaginable, but if you're this terrible writing your production code, why should your tests be any better? It seems that especially big corporate shops love unit tests. Maybe they have enough money and resources to pay for all these hours wasted on unit tests. Maybe the developers can point their 10,000 unit tests when something goes wrong and say 'at least we tried'? Or maybe most developers don't know how to think and reason about their code before they type, and unit tests force them to do that?12
-
Fuck when your client find a bug in production, but you can't replicate in your developmment environment. So sad 😣2
-
That shitty moment when you are finally about to release your code, after about one month of developing and testing, and making sure everything is OK, imagining: "Oh we're finally releasing this feature, I have worked so hard on it, it's going to kick some ass!" but surprisingly things get fucked up on production server... I mean seriously? Stupid middleware I killed myself to get to work messed up. Where the hell have you been in staging, you stupid little bug? You happy now? My CTO giving me awkward looks and shit like: "I'm sorry but you have to come fix it, during weekend." The best way to fuck up my mood, today is the last day of week for god's sake!
I hate releasing like this. seriously SAG in this release!1 -
Did you get any seniors like these... we need to do that and that but tomorrow I won't remember and do as I told. 🫠 He won't wake up until any production bug arises.
-
That one bug that is always elusive, only happens once during development, you can't seem to reproduce it and think its fixed, and only reoccurs on production or during a demo 🤔 damn it
-
The production bug conundrum:
The new release that's going to fix the problem isn't ready yet, but hotfixing it means merge-hell for the new release. -
Goverment just confirmed 2nd January as National holiday (work free day)...scheduled Nationwide production on 1st January, 2nd January for bug fixes (no holiday for me)
FML! -
Debugging is fine, totally part of the job!
Constantly fixing sh1t and new reports of another pile of sh1t coming every day like somebody is throwing them with shovels at us just to open the codebase that is written by the folks who aren't here anymore with some list of obscure libraries that is last maintained about 5 y ago is not ok.
It is not buggy codebase it is actually coddy bugbase!
I tried to be vocal several times to change technology to more suitable one, to make some improvements and to remove code smells(there is a ton of it, smells like organic garbage dumpster with rotten eggs) but "everything works" and there is no real "value for the customers" in that(fixing, refactoring etc.)!!!
Yea it works with sh1t ton of bugs reported every week. Nobody gives a shit, just contempt with their mediocre lives solving bug at the time while i feel like I'm wasting my time and talent on wrong people and fixing other's shit.
That is what happens when prototype becomes product and ships to production because numbers, money and sh1t!
this is why we who care about our career can't have nice things! I am not god damn pest control, I am f*ckin developer.1 -
Found a bug
- Calling function sent wrong parameter.
- Calling function itself was shit. Changed it.
- Few hours later, revamped whole class, updated all references and pushed to production next day.
Till date that class has not changed and still works flawlessly!
And probably first time I used queues in java. Algorithms FTW -
I've just spent 4 hours trying to fix a bug on prod. that can be fixed in 30 secs. At the end I remembered that I should check the error log. FML (error reporting turned off, logs only)
-
Can anyone suggest any websites or resources for a breakdown of how to handle requests for features or handling bugs. Basically, I want some kind of background on best practices for managing the process of receiving a feature request/or bug report from a user to it reaching the dev team, to production/user acceptance testing.5
-
When you come across the generated script of a H5 banner from client's production house, you are tempted to rewrite the whole thing instead of fixing the bug.
If you really decided to fix it, in the end you found out you spent nearly 100x more time on this piece of shit, and that makes you feel really bad.1 -
So I opened devRant to find the new *bling* expressions feature added to the avatars.
So I set my expression to happy because nowadays, for some reason, even though work has been tough, I've liked it. And things have been good. Mostly because my lead is on holiday and the acting lead is a hundred times better than him in all aspects (micro management and as a Dev).
Cut to the evening when I'm walking home, the lead calls me up to inform me on a production bug that had to be taken up (acting lead had already asked me to look into it, we had an agreement). And I *HATE* the way he assigns tasks and acts like I'm a wee baby who can't even find his mother's tit.
Anyhoo, ended up changing my avatar expression to majorly fucking frustrated because of the Damn phone call.
So kudos to @dfox and @trogus to adding these little features which make people a bit more expressive! -
Just after feature launch, major bug on production and now I am getting yelled at by my lead as the issue happens to be with the PR i was responsible for reviewing yesterday. Somehow a logic error got past my review. But considering how large the project is it wasn't possible for me to test out every possible scenario myself. They should have had QA handle that. Also, that was my first code review. I can't understand why my boss has such unrealistic expectations. Bugs are expected at this stage. I feel like he just puts too much pressure on me for no other ther reason other than to just trigger my imposter syndrome. That way, I feel like a bad developer even though I am working my ass off. And he gets to avoid giving me a raise. Cant believe I rejected multiple offers to stay at this company. I don't even know why am I still working for this company anymore.4
-
A: oh hey my commit is not in the master branch...
A: *seeing bunch of commit deleted activity in bitbucket by B
A: Lol B deleted commits in master branch
B: Wait, what?! I know I have rebased my branch.. but never have I rebased anything in the master branch.. how can this be *intense breathing
B: Are you sure you have pushed yours to master?
A: Sure I've rebased, squashed, and rbt landed my work to master, here look my local master has my commit
CTO: wait what? Is this related to this bug we have in production just now? Please don't panic, let us resolve it
Turns out rbt land just squash your commit to your local master branch and they thus A have not pushed it to the remote. And the bunch of commit deleted activity were bitbucket not informing from which branch the activity was happening. Almost gave us heart attack. -
QA teammate catches a bug in production but it wasn't and manager assigned that bug to that QA teammate to figure out it. without telling anything.
-
@dfox when deleting a post I am able to report it. I didn't want to try because I learned my lesson about bug testing in production, but thought I'd let you know and if it's fine I'll play with it :D
-
When you fixed p1 bug in the production and waiting for the release is like watching the conjuring movie all alone in the night. It makes you scare out of shit.
-
Quick question for you all: How do you deal with a problem in production that you cannot fix, even over an extended period of time (say 2 months)?
For context I feel like I’m losing my sanity here, we’ve had this problem on our production API since the beginning of March this year. I’ve done so much testing, got in contact with various teams of my company to try to figure out any potential candidate that would explain the bug, but none worked out. No need to say I’ve spent a considerable amount of time searching on the internet for others with the same problem or similar… We’ve even opened a ticket with the cloud host to see if they would have more details about the problem without success. So how do you deal with that ?5 -
Today our PM planned to deploy in production an e-commerce based on PrestaShop.
A colleague of mine mamaged to implement everything that was necessary, and I made a small script to add random sales on random products every sunday.
We tested it several times in our environment, on multiple machines, and everything was working fine.
BUT
Today we launched the script on production server, and we was a little mistake.
"A bug? Say no more pal, I'll fix it!".
Fixed, tested on local environment, deployed and.... The first steps weren't working.
"Fatal error".
That's what I got. No exceptions, no error messages, no references.. Just "fatal error".
We spent two hours looking for the problem, thinking it was a server error that was just outputting that shitty message.
And you know what? Some fucking fat cocksucker son of a bitch thought it was an excellent idea to stop the code execution with a simple and very helpful "fatal error".
"oh, wait, there is an error here, let me print die(" fatal error"), ao the other developer will be able to find what's going on", he thought.
FUCK YOU MORON.
TL;DR: Avoid French software, they are a bounch of asshole (except some goos guy..) -
which type are you ??
**Manager:** Hey, we've got a little hiccup in the production environment. I know it's Friday evening and you're probably daydreaming about pizza, but could you give it a peek?
**Type 1:** Man, this is like finding a needle in a haystack while wearing sunglasses at night. Might take me a few hours... or days. But hey, wish me luck and have an epic weekend!
**Type 2:** Eureka! Found the gremlin. It looks like XYZ person tried to be a bit too creative on commit number 2234324. Maybe they had too much caffeine? Anyway, could you have a chat with them? And oh, may your weekend be as smooth as a fresh jar of peanut butter.
**Type 3:** Detective mode activated! Found the sneaky bug. It was XYZ person's "masterpiece" in commit number 2234324. But fear not! I've put on my superhero cape and fixed it in commit number 345453345.
**Type 4:** This issue again? It's like a recurring bad dream about forgetting your pants! I've revamped the whole thing so we don't have to relive this nightmare. If someone tries to pull this off again, our CI/CD will roast them like a marshmallow over a campfire.
**Type 5:** Ta-da! Fixed the glitch, jazzed up the design, and sprinkled in some extra logging magic. Now, troubleshooting will be as easy as pie. Speaking of which, I've got time for a coffee and maybe a slice of pie before heading out. Cheers!
Type 6 **Gloomy**: Oh, the digital clouds have gathered again. This issue is like a never-ending rain on a Monday morning. I've peered into the abyss of our code, and it's... well, it's deep and dark. I'll need some time, a flashlight, and maybe a comforting blanket. If you don't hear from me in a few hours, send in a search party with some hot cocoa.4 -
New question.
When debugging/troubleshooting, what does your desktop look like?
I have a total of 8 production environments to look after, each of which have their appropriate dev environments. Troubleshooting for me typically starts with VisualVM, 6-8 Putty sessions across the environments, at least one dbms session, WinSCP with at least 4 sessions, text editor with minimum of five open files and at least thirty tabs open in Chrome. Oh yeah, forgot outlook and Skype (typically with at least three team mates and usually a group chat).
All is well when I'm in the zone, but good forbid for someone to ask me to show them the article/bug report I just read that sent me down the rabbit hole.1 -
Anyone know how to use a proxy for a web crawler written in native Java for Android. I have a bug in an app in production that only surfaced after being used for a couple of days and I urgently need to fix it.
HELP!!! -
A young new dev was working on his first ticket, about a bug during parsing of an uploaded excel file. Our issue was that if the file contained an empty line, all remaining rows were ignored. So the task included extending our tests to cover this case. After 2 weeks (!), his merge request comes in. His idea (without ever asking for help) was to parse the whole file (in some cases huge) in the production code a second time, just to count the rows (!!) and save the count in a public static int field, which was verified in his new test.2
-
From the guy that practices bash in the production server, here's the same guy who also practices SQL queries in the production's PostgreSQL!
I swear these happen by accident. I'm having to do some data corruption control by some bug, but I forget to close the panel when I'm finished. Then I go on with my tasks and I think it's my own computer I'm writing these commands to.3 -
When you have a customer that is a pain and you have to do a new contract since months but they are no replying but at same time there is a bug in a plugin they are using.
They are not updating their plugins in production but only after a test in staging.
In production there aren't write permission from web server side, so only they have access.
And the plugin has a 0-day. -
We are presenting the last 3 months worth of almost 24 hours a day type deving so let's hope a bug doesn't pop up in our production line lake last time...😐😐😐
-
One nightmarish project that was doomed from the beginning, had me as the sole developer. I could hardly sleep when we began testing on a separate test system, but with (nearly) all the config stored in shared memory and copied from the production system, I dreaded, half awake, that the production server data base connection was still configured in the test system and that it was shooting all it's test data repeatedly to prod.
Finally drove to company in middle of the night at 4 o'clock. Checked everything was OK, tried to sleep 3 hours before the start of the work day.
This system also had the most hideous memory corruption in some shared memory that was used across several processes and should have been thoroughly protected by a mutex, but somehow, sometimes this crucial map, that was used to speed up the access to all the customer data just contained garbage.
Still haunts me to that day. (Like xkcd's unresolved tension of a non-matching parenthesis - an unresolved bug. -
It's almost all the time but specially when there is a stupid bug i find out in production after all the efford i put on testing before release, or even worst, when the bug is not stupid, is random and just hard to tell why the fuck everything is fucked up
-
Religion/Alt. Medicine/Modern Politics/Confirmation Bias/End Of The World explained for techies :
👤 : "There seems to be a serious bug in production"
🕷️ : "We only have features, do you have a last wish?"2