Search - "service outage"
-
> *WordPress website goes down with Error 500: Cannot establish connection with database*
> Marketing loses their shit: "We need the website up and working right now"
> *Me being calm*: "Nope, we can't. It's the service provider's error, there's nothing we can do"
> *MK.G*: "Alright then, switch to another ISP ASAP"
> *Me, internal rage, a volcano erupts*: "Umm.. so you want to spend more money on another host because this one has an outage of 48 hours?"
> *MK.G*: "Yes, because we cannot run Facebook ads just because the website is down"
> *Internal lmao*: "Alright, but by the time you purchase a new service provider and host, the website will be up and running. Plus, since the database is down, we cannot migrate"
> *MK.G*: "I don't care, just make it up and working"
> *Me chilling*: "Alright, give me a few hours"
> After a few hours the website is working. *Me being badass even though I didn't do anything* -
I’ve started the process of setting up the new network at work. We got a 1Gbit fibre connection.
Plan was simple, move all cables from old switch to new switch. I wish it was that easy.
The imbecile of an IT guy at work has set everything up in such a complex and unnecessarily stupid way that I’m baffled.
We got 5 older MacPros, all running MacOS Server, but they only have one service running on them.
Then we got 2x Xserve RAID, with some external NAS enclosures and another Mac mounted on them. Both Xserve RAIDs have to be running and connected to the main Mac Pro, which combines all of this into a few different volumes.
Everything got a static public IP (we got a /24 block), even the workstations. The only thing that doesn’t get one IP per machine is the guest network.
The firewall is basically set to have all ports open, allowing for easy sniffing of what services we’re running.
The “dmz” is just a /29 of our ip range, no firewall rules so the servers in the dmz can access everything in our network.
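For a sense of scale, here is a rough sketch of the address arithmetic using Python's standard `ipaddress` module. The real block isn't given, so `203.0.113.0/24` (a documentation-only range) stands in for it, and taking the first /29 as the "dmz" is purely an assumption for illustration.

```python
import ipaddress

# Stand-in for the real /24; 203.0.113.0/24 is reserved for documentation.
block = ipaddress.ip_network("203.0.113.0/24")

# Assumption: the "dmz" is the first /29 carved out of the block.
dmz = next(block.subnets(new_prefix=29))


def in_dmz(address):
    """Return True if the address falls inside the /29 used as the dmz."""
    return ipaddress.ip_address(address) in dmz


print(block.num_addresses)         # total addresses in the /24
print(dmz, dmz.num_addresses)      # the /29 and its total addresses
print(len(list(dmz.hosts())))      # usable host addresses in the /29
print(in_dmz("203.0.113.5"), in_dmz("203.0.113.9"))
```

A /29 only holds a handful of hosts, so without firewall rules it separates nothing: it is just a label on a slice of the same flat, publicly addressed network.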
Back to the Xserve: it’s accessible from the outside so employees can work from home, even though no one does. I asked our IT guy why he hadn’t set up a VPN. His first explanation was that he didn’t manage to set it up; then he said a VPN is something hackers use to hide who they are.
I’m baffled by this imbecile of an IT guy. One problem is that he only works there 25% of the time because of some health issues. So when one of the NAS enclosures didn’t mount after a power outage, he wasn’t at work, and he took the whole day to reply to my messages about logins to the Xserve.
I can’t wait till I get my order from fs.com with new patching equipment and tonnes of cables, and until I can merge all storage devices into one large SAN. It’ll be such a good work experience. -
A personal memo to all developers on devRant:
* Assume every external line of code (including every service you consume) is an unreliable crock of flaming shit. These services can and will fail in the most glorious ways. Write your code to be resilient, and ASSUME FAILURE of dependencies. Even if it's your own team writing the other service.
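A minimal sketch of what "assume failure" can look like in practice, in Python. Everything here is hypothetical: `call_with_retry`, `sync_record`, the dict standing in for a database, and the retry numbers are illustrations, not anyone's actual service code. The idea is to retry a flaky dependency a few times, and if it stays down, skip the write instead of persisting garbage.

```python
import time


class DependencyUnavailable(Exception):
    """Raised when the upstream service still fails after all retries."""


def call_with_retry(fetch, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call fetch(), retrying with exponential backoff on any exception."""
    last_error = None
    for attempt in range(attempts):
        try:
            return fetch()
        except Exception as error:  # real code should catch only transport errors
            last_error = error
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise DependencyUnavailable(f"gave up after {attempts} attempts") from last_error


def sync_record(fetch, store, key, **retry_options):
    """Update store[key] only if the dependency answered; otherwise leave it alone."""
    try:
        value = call_with_retry(fetch, **retry_options)
    except DependencyUnavailable:
        return False  # dependency is down: keep the old data, report the failure
    store[key] = value
    return True
```

With this shape, an outage in the other team's service turns into a `False` return the caller has to handle, and whatever was already in the store stays as it was.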
Heard in a meeting today: "Your team's service outage is going to cause my service to corrupt the database!"
Response I wanted to give: "No, you asshat, my service outage is a normal part of living with microservices. Your app should have been smart enough to recognize the failure." -
TL;DR: OMFG! Push the button already!
I've been away on paternity leave for quite some time now. Today is my first day at work since the end of July.
Just a couple of days after my paternity leave started, I was contacted by one of the managers because a tracking and analytics service I had made some months earlier had halted.
Now, I did warn them that the project was fragile and was running off an old box in my office. So they shouldn't be surprised if it came to a halt every now and then.
Well, being on my paternity leave and all, I didn't want to spend time fixing it. I had a child to look after. So I told the manager that the box had probably just shut down. I think there had been a power outage the day before, so I figured that was the cause and that he just had to turn it back on. I also told him the admin u/p in case he needed to restart some services.
Today, the CEO enters my office telling me to get that thing fixed. Because that manager apparently couldn't find the power button. -
The company I work for used to be hosted on 3dcart. One day the site went down and their support couldn't tell us why. After over 24 hours of downtime they restored service but lost 5 days of all records and customizations across the entire store, from the DB to the damn templates. Their support apologized for the outage, blaming the disaster on a combination of hard disk failure and a bad update to their backup script. They were not willing to assist us in any way. We were forced to manually enter 5 days of orders (which gave them new order numbers and caused more problems), products and template changes, with order data coming from an internal email which was luckily CC'd on the order confirmation email. Thank God for whoever set up that CC, it saved our asses. In the end it cost our company thousands of dollars and 3dcart never compensated us in any way.
-
Thoughts on forced emergency support?
I am with a company I generally like a lot but there are some things I generally despise about it. Like forced emergency support.
I am not good at it, I don't claim to be.. I generally struggle with anxiety, stress and depression, I specifically avoid roles that require on-call service .. I'm a senior level software engineer.
I find it very frustrating to be expected to be on-call from 7-7 in support of infrastructure I did not architect, did not code and basically know nothing about. They provided me with a ten minute discussion about ops genie and where to find internal support articles for my training and that's about it.
Last night I received an ops genie alarm and acked it as I was instructed to do, I went around the system looking for the alarm cause and basically had no idea what to do except watch our metrics graphing praying there wouldn't be an outage. Fortunately the alarm was for our load balancer scaling operation, it was taking a bit longer than usual ... Sigh of relief. Stay up til 6am and fall asleep..
Wake up to a few messages from various people asking why I didn't do this and that and it took me every inkling of my being to remain cordial and polite but I really just wanted to scream and say a bunch of shit that would probably get me fired.
What the actual fuck?
Why expect someone that has no god damn clue what they are doing to do something like this? Fuckin shit training and no leadership to mentor me and help me get better at this role, no shadowing, no regimen ..
#confused and #annoyed
Thoughts? Am I a bitch? Is it unreasonable for me to expect my job duties stay in line with what I'm actually good at!?
Thanks. -
Suffering from our first service outage since I've been at my new job.
Guess when it happened? While we have TOO MANY projects going on.
When you have too many pots on the stove, you're bound to forget the smallest, most crucial detail. -
WE: javaagent-based monitoring, as seen in this screenshot <attached>, is reporting full old-gen, full young-gen, one of the survivor spaces full and sky-rocketing full GC activity right before the service outage.
WE: container monitoring in this screenshot <attached> shows that the application's memory peaked very suddenly to MAX values and plateaued there. Then container monitoring is blank, suggesting a complete outage of a few minutes. After that, monitoring starts again with memory usage reported at low levels and immediately spiking back to MAX again, suggesting the container crashed and was respawned by an orchestrator. This repeats a few times throughout the day.
they: I did not find any evidence of application running out of memory. Maybe our monitoring is not working correctly?
we: *considering updating our resumes* -
So I just woke up and there is an internet outage in my neighborhood because optimum sucks balls, fuck the whole stupid low class internet service
Now I have to leave my house so I can work -
So today I learned how tree shaking works and I was just about to publish patches to my NPM modules when the registry gave up.
-
Before he began dropping the 20K proposed to remodel my flat, I told my father I much preferred a contractor who was recommended by someone I knew, as opposed to using a big corporation like Home Depot. FAMOUS LAST... a neighbour in my building highly recommended the contractor we chose. And, week 7 [or is it 8?] of what was proposed to take no longer than two weeks has begun afresh!
On Friday the fellow who owns the contract remodeling company was here touching up the paint. He was here because I forbade the two painters he sent to do the initial painting job.
My internet cut out suddenly around 1300 Friday. He set to leave for the weekend shortly after that. I mentioned the outage to him. The essence of his reply was that there was no way it could have had anything to do with him. The following day, my internet provider sent a tech out to diagnose the problem. What was the problem? The head of the remodeling firm had removed a face plate from the wall where there were telephone wires and disconnected them, tearing the wires as he replaced the face plate.
Although the tech told me he wasn't going to charge my account the $85.00 fee for his services because the outage was caused within my flat, I wish to be sure of this. Which brings us to the punchline.
My internet provider is a lame ass business model, dreamed up by a squint-eyed ex-circus monkey, never well endowed in the top story, and now just plain sad.
There were some 911 outages in Washington State last Thursday night. All during the day Friday, when you dialled their freephone #, the recorded announcement, before saying anything else, told you they were experiencing heavier than usual call volumes and that the wait would be greater than 10 minutes. Fine. What fried my La Croix silk was that after their customer service dept closed for the weekend, that outgoing message remained.
Today, I wanted to contact my provider to see if they would know if the $ was going to be charged to my account. After pressing the 'send' key, my computer came back with an error message, saying they were having technical difficulties. So, I went on over to the 'chat' page. There's nothing to click on to take me to this fabled location. So, I can't reach them by phone unless I want to hear, every 30 seconds, whether or not I wish to, how sorry they are for my delay.
A few years ago I would've used this as an excuse to have a technicolour meltdown. The reason I'm posting this is that I am now able to see beforehand what I'll be doing to myself getting upset over the circumstances. When I do reach somebody, I'm going to tell them as lightly as possible, that if they were an airline, I wouldn't board any of their aircraft. Ever.