Google: The SHA-1 collision is the biggest news today
Cloudflare: Hold my beer
AWS: Call me your daddy
Biggest hurdle: torn between having boobs and missing an arm. I swear some people are under the assumption the brain is in the arm.
I am fully capable of building your network, resolving your outage due to your faulty code, can even tell you how many users your database can support at once. I don't need arms for that. Nor do my boobs distract me that badly.
"but men are going to make your life so hard" yup. And that's true no matter where i go
"all that typing with one arm can't be good for your back" welp. Find me a job that doesn't require a computer. Or manual labor. If you think typing will fuck me up, that's DEFINITELY out of the equation
"you're too pretty, there's no way this can make sense" dafuq you just say?!?!
"why don't you just stay home on disability, I'm sure you qualify, you wouldn't need to work" I'd rather be a fucking trophy wife if I'm staying at home. Fuck that.
And many more.
Sometimes they're fun. Give me more dumb arguments to counter? ;)
We're currently experiencing major issues with the devrant.io domain due to another outage/problem with .io domains themselves. More info here: https://news.ycombinator.com/item/...
The issue is also being reported on Twitter.
If you get a host-not-found error, connection error, etc. when connecting to devRant, this is why. We'll keep you updated, and in the future we will probably be switching away from .io, at least for our API.
Thanks for the patience.
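Since the root problem here was the .io TLD's name servers, one client-side mitigation is simply trying more than one API hostname. A minimal Python sketch, assuming a hypothetical mirror domain (the hostnames are made up, not devRant's real infrastructure):

```python
# Fall back across API hostnames so a TLD-level outage (like the .io one
# above) doesn't take the client down with it. Hostnames are illustrative.
import requests

API_HOSTS = ["https://api.devrant.io", "https://api.devrant.com"]  # hypothetical mirror

def api_get(path, timeout=5):
    last_error = None
    for host in API_HOSTS:
        try:
            resp = requests.get(host + path, timeout=timeout)
            resp.raise_for_status()
            return resp.json()
        except requests.RequestException as err:  # DNS failure, timeout, 5xx...
            last_error = err                      # remember it, try the next host
    raise last_error
```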
The next person who calls the server disruption/emergency line for something that is NOT related to a server wide issue/outage is going to get a rusty pipe with fucking sambal up their fucking ass.
I am so fucking done with this bullshit.
> be me
> last hour in office
> trying to figure out solution
> figured out a plausible solution
> write the code
> power outage before I compile
Well, on the bright side I committed it locally...
Hey everyone - in case it isn't obvious, unfortunately due to the major S3 outage, no images can currently be uploaded :/
Sorry about that.
Co-worker: "Should we keep this server up and running?"
C: "Do we have any other uses for it than the dedicated wiki?"
M: "Not really, and maybe it's time to move to the centralised platform Corporate™ introduced. Have we checked if anyone is using the server?"
C: "Good point, let me see…"
C: "… oh it's been down for last two weeks since the power outage."
M: "I think that answers the question. Let's leave it like this for a month more and if no one complains we can announce it dead"3
I'll get to my four words in a sec, but let me set the background first.
This morning, at breakfast, I fired up my trusty laptop only to get a fan failure warning.
Finally, after the three-year-old is asleep tonight, I'm able to start dismantling the case to get to the fan. I'm hoping it just needs to be cleaned out.
Hard drive, memory, and keyboard spread out over the kitchen table. I'm not even halfway done.
Guess what? Now I'm one of the lucky 3500 people to have a power outage at 9 pm. Estimated restore time: 2 am.
"All those tiny screws"
And a three-year-old in the house...
Navy story continued.
And continuing from the arp poisoning and boredom, I started scanning the network...
So I found plenty of WinXP computers, even some Win2k servers (I shit you not, the year was 201X). I decided to play around with Metasploit a bit. I mean, this had to be a secure net, right?
Like hell it was.
Among the select douchebags I arp poisoned was a senior officer who had a VERY high opinion of himself, and also believed he was tech-savvy. Now that is a combination that is like a red rag to a bull for assholes like me. But I had to be more careful, as news of the network outage leaked and rumours of "that guy" ran amok; but because the whole sysadmin thing was on the shoulders of one guy, no one could explicitly trace it to me. Not that I cared, actually; when I am pissed I act with all the subtlety of an atom bomb on steroids.
So, after some scanning and arp poisoning (changing the source MAC address this time) I said...
"Let's try this common exploit, it supposedly shouldn't work, there have been notifications about it, I've read them." Oh boy, was I in for a treat. 12 meterpreter sessions. FUCKING 12. The academy's online printer had no authentication, so I took the liberty of printing a few pages of ASCII jolly rogers (cute stuff, I know, but I was still in ITSec puberty) and decided to fuck around with the other PCs. One thing I found out is that some professors' PCs had the extreme password of 1234. Serious security, that was. Had I known earlier, I could have skipped a TON of pointless memorising...
Anyway, I was running amok across the entire network; the sysad never stood a chance of catching that, and he seemed preoccupied with EVERYTHING ELSE besides monitoring the net, like fixing (replacing) the keyboard for the commander's secretary, so...
BTW, most PCs had antivirus, but SO out of date that I didn't even need to encode the payload or do any other trick. An LDAP server was open, and the hashed admin password was the name of his wife. Go figure.
I looked at a WinXP laptop with a weird name, and fired my trusty ms08_067 on it. Password: "aaw". I seriously thought that Ophcrack was broken, but I confirmed it. WTF? I started looking into the files... nothing too suspicious... wait a min, this guy is supposed to work, why is his browser showing porn?
Looking at the ""Deleted"" files (hah!) I found a TON of documents with "SECRET" in them. Curious...
Decided to download everything, like the asshole I am, and restart his PC, AND to leave him with another desktop wallpaper and a text message. Thinking that he took the hint, I told the sysadmin about the vulnerable PCs and went to class...
In the middle of the class (I think it was anti-air warfare or anti-submarine warfare) the sysad burst through the door shouting "Stop it, that's the second-in-command's PC!".
Stunned silence. Even the professor (who was an officer). God, that was awkward. So, to make things MORE awkward (like the asshole I am) I burned every document to a DVD and the next day I took the sysad and went to the second-in-command of the academy.
Surprisingly, he took the whole thing in quite an easygoing fashion. I half-expected a court martial or at least a good yelling, but no. Anyway, after our conversation I cornered the sysad and barraged him with a ton of security holes, needed upgrades, settings, etc. I still don't know if he managed to patch everything (I left him a detailed report) because, as I've written before, budget constraints in the military are the stuff of nightmares. Still, after that, oddly, most people wouldn't even talk to me.
God, that was a nice period of my life, not having to pretend to be interested in sports and TV shows. It would almost be like a story from high school (if our high school had had such things as a network back then - yes, I am old).
--- GitHub 24-hour outage post mortem ---
As many of you will remember, GitHub fell over earlier this month and cracked its head on the counter top on the way down. For more or less a full 24 hours the repo-wrangling behemoth had inconsistent data being presented to users, slow response times and failing requests during common user actions such as reporting issues and questioning your career choice in code reviews.
It's been revealed in a post-mortem of the incident (link at the end of the article) that DB replication was the root cause of the chaos, after a failing 100G network link was being replaced during routine maintenance. I don't pretend to be a rockstar-ninja-wizard DBA, but after speaking with colleagues who went a shade whiter when the term "replication" was used - it's hard to predict where a design decision will bite back and leave you untangling the web of lies and misinformation reported by the databases for weeks if not months after everything's gone a tad sideways.
When the link was yanked out of the east coast DC undergoing maintenance - Github's "Orchestrator" software did exactly what it was meant to do; It hit the "ohshi" button and failed over to another DC that wasn't reporting any issues. The hitch in the master plan was that when connectivity came back up at the east coast DC, Orchestrator was unable to (un)fail-over back to the east coast DC due to each cluster containing data the other didn't have.
At this point it's reasonable to assume that pants were turning funny colours - monitoring systems across the board started squealing, firing off messages to engineers demanding they rouse from the land of nod and snap back to a reality that was a bit more "on fire" than usual. A quick call to Orchestrator's API returned a result set that only contained database servers from the west coast - none of the east coast servers had responded.
Come 11pm UTC (about 10 minutes after the initial pant re-colouring) engineers realised they were well and truly backed into a corner. The site was flipped into "Yellow" status and internal mechanisms for deployments were locked out. 5 minutes later an Incident Co-ordinator was dragged from their lair by the status change and almost immediately flipped the site into "Red" status, a move I can only hope was accompanied by all the lights going red and klaxons sounding.
Even more engineers were roused from their slumber to help with the recovery effort. By this point hair was turning grey in real time - the fail-over DB cluster had been processing user data for nearly 40 minutes, and every second that passed made the inevitable untangling process exponentially more difficult. Not long after this GitHub made the call to pause webhooks and GitHub Pages builds in an attempt to prevent further data loss, causing disruption to those of us using GitHub as a way of kicking off our deployment processes (myself included, I had to SSH in and run a git pull myself like some kind of savage).
Glossing over several more "and then things were still broken" sections of the post-mortem: clever engineers with their heads screwed on the right way successfully executed what I can only imagine was a large, complex and risky plan to untangle the mess and restore functionality. GitHub was picked up off the kitchen floor and promptly placed in a comfy chair with a sweet tea to recover. The enormous backlog of webhooks and Pages builds was caught up with and everything was more or less back to normal.
It goes to show that even the best laid plan rarely survives first contact with the enemy - in this case a failing 100G network link somewhere inside an east coast data center.
Link to the post mortem: https://blog.github.com/2018-10-30-...
So apparently the Amazon S3 outage happened because of one setting being wrong in a looooong string of commands issued to shut down just a few servers.
Am I the only Linux user who totally gets how that could happen to just about anyone regardless of how awesomely competent they might be?
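For reference, AWS said afterwards that the tool in question was changed to remove capacity more slowly and to refuse to drop below a safety floor. A hypothetical sketch of that kind of guard rail in Python (not Amazon's actual code):

```python
# Refuse to take more than a small slice of the fleet out in one command,
# no matter what was typed. Entirely illustrative.

def servers_to_remove(fleet: list, requested: list, max_fraction: float = 0.05):
    if len(requested) > max_fraction * len(fleet):
        raise ValueError(
            f"Refusing to remove {len(requested)} of {len(fleet)} servers; "
            f"limit is {max_fraction:.0%} per invocation."
        )
    return requested
```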
Manager: "Pls deploy the changes ASAP."
Me: "Right away."
Me: *creates pull request*
... 5min later ...
GitHub: "Some checks haven't completed yet."
... 45min later ...
GitHub: "Some checks haven't completed yet."
Me: *looks into delay*
CircleCI: "Partial System Outage"
- AI/ML will be creeping into absolutely everything in some way, shape or form
- Developers of every shape, size, background and skillset will continue to be the most in-demand talent on the planet, likely leading to a huge bubble that's gonna fuck us eventually.
- JS will continue to dominate everything, NPM will probably encounter some form of MAJOR major outage that halts most of the globe's development + deployment processes, sparking the creation of another registry which will lead to the holy wars of the registries
- Web devs will continue to be stuck supporting IE8.
- Java will probably die out, maintenance halting completely unless you pay Oracle extortionate amounts of money - like single-handedly-paying-every-Oracle-employee's-salary extortionate.
- There'll probably be another big hype on the scale of Docker/k8s that you have to learn because you'll be unemployable without it
- The year is 2078 and WordPress still supports PHP 5.2, that fucking cockroach will outlive us all I swear
- Haters gonna hate
- React gonna react
- Angular gonna... Math?
Quick recap of my last two weeks: 15 year old production server is basically dead, boss has taken over calls and claims credit for "resolving" outages (even though my coworker and I did the work, but ultimately the traffic died down enough to where it wasn't an issue anymore).
I go to a meeting to plan migration to a better server, boss bitches about not getting invited, I tell him I invited myself, and then he lectures about how that's not our job.
Different boss says we're migrating a schema for an application that should have been decommissioned 5+ years ago to use as a baseline. I explain what's going on, he says he understands, and proceeds to tell higher bosses it's perfect because there will be no user impact. OF COURSE THERE'S NO FRICKING IMPACT, YA DUNCE! There are no users!!!!
I merge two email threads together, since they discuss the same thing, but with different insight, and get yelled at, even though they requested it.
The two bosses I like are OOO for the next week, too, so I'm just sitting here hoping I don't say something that'll get me fired or sent to sensitivity training.
I'm just starting my on call rotation and don't know that I can do this. I cry when my phone rings, now, because I experience physical pain with how hard I cringe.
I got yelled at today by a guy because SOMEONE I DON'T KNOW assigned a ticket to him directly, rather than to the proper team (not his team). So I had to look into that, which at least had the benefit of preventing a catastrophic outage for our customers worldwide, but no one will know because I don't brag at work; I'm too busy doing my job as well as that of most of my division/section/larger team, whatever the hell it's called. I saved us probably 25+ hours of continuous troubleshooting calls by noticing something tiny that the people "smarter" than me missed.
**edit: sorry for typos; got my nails done yesterday but they feel like they're a mile long and I have to relearn how to type**
TL;DR: disaster averted!
About a year ago, the company I work for merged with another that offered complementary services. As is always the case, both companies had different ways of doing things, and that was true for the keeping of the financial records and history.
As the other company had a much larger financial database, after the merger we moved all the data of both companies on their software.
The said software is closed source, and was deployed on premises on a small server.
Even tho it has a lot of restrictions and missing features, it gets the job done and had been stable enough for years.
But here comes the fun part: last week there was a power outage. We had no failsafe, no UPS, no recent backups, and of course both the OS and the working database on the server broke.
Everyone was in panic mode, as our whole company needs the software for day to day activity!
Now, don't ask me how, but today we managed to recover all the data, got a new server with 2 RAID HDDs for the working copy of the DB, another pair for backups, and another machine with another dual HDD setup for secondary backups!
We still need a new UPS and another off site backup storage, but for now...disaster averted!
Time for a beer! Or 20...
That is all :)
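For anyone building something similar, here's a rough Python sketch of the nightly-dump-and-rotate half of a setup like the one above. The dump command and paths are placeholders, since the software is closed source and unnamed:

```python
# Dump the working database, mirror the dump to a second disk, prune old dumps.
import shutil, subprocess
from datetime import datetime
from pathlib import Path

PRIMARY = Path("/backups/primary")
SECONDARY = Path("/backups/secondary")
KEEP = 14  # days of dumps to retain

def nightly_backup():
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    dump = PRIMARY / f"db-{stamp}.dump"
    # Placeholder vendor dump command; substitute whatever your DB ships with.
    subprocess.run(["/opt/vendor/dbdump", "--out", str(dump)], check=True)
    shutil.copy2(dump, SECONDARY / dump.name)          # best-effort second copy
    for old in sorted(PRIMARY.glob("db-*.dump"))[:-KEEP]:
        old.unlink()                                   # timestamped names sort oldest-first
```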
Bug report comes in from a coworker. "Cloudinary uploads aren't working. I can't sign up new customers."
"I'll look into it" I say.
I go to one of our sites, and lo! No Cloudinary image loads. Well that can't be good.
I check our mobile app -- our only customer-facing platform. None of the images load! Multiple "Oops!" snackbars from 500 errors on every screen / after every action.
"None of our Cloudinary images load, even in the mobile app," I report.
Nobody seems to notice, but they're probably busy.
I go to log into the Cloudinary site, and realize I don't have the credentials.
"What are the Cloudinary credentials, @ceo?" I ask.
I'm met with more silence. I use this opportunity to look through the logs, try different URLs/transforms directly. Oddly, everything seems fine except on our site.
I check Slack again, and see nothing's changed, so I set about trying to guess the credentials.
Let's see... the CEO is basically illiterate when it comes to tech, so it's probably not his email. It's a startup, and custom emails for things cost money and have never been a thing here, so it's probably one of the CTO's email aliases. He likes dots and full names, so that narrows it down. Now for the password... his are always crappy (so they're "easy to remember") and usually have the abbreviated company name in them. He also likes adding numbers, generally two-digit numbers, and has a thing for 7s and 9s. Mix in some caps, spaces, order...
Took me a few minutes, but I managed to figure it out.
"Nevermind, I guessed them." I reported.
After getting into Cloudinary, I couldn't find anything amiss. Everything looked great. No outage warnings, metrics looked fine, images all loaded. The ex-CTO didn't revoke payment or cancel the account.
I checked our app; everything started loading -- albeit slowly.
I checked the aforementioned site; after a few minutes, everything loaded there, too.
Not sure what else to do, and with everything appearing to work, I said "Fixed!" and closed the issue.
About 20 minutes later, the original person said "thanks" -- never did hear anything from the ceo. I've heard him chatting away in the other room the entire time.
Regardless, good thing for crappy passwords, eh?
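For the skeptics: pattern-based guessing like the above is a tiny search space, not wizardry. A quick Python sketch with invented values (not the real credentials):

```python
# "Company name + favourite two digits + some caps" is only a handful of
# combinations once you know the person's habits.
from itertools import product

base_words = ["acme", "Acme", "ACME"]          # abbreviated company name variants
digits = ["7", "9", "77", "79", "97", "99"]    # his favourite numbers
separators = ["", " "]

candidates = [f"{w}{s}{d}" for w, s, d in product(base_words, separators, digits)]
print(len(candidates))  # 36 -- a coffee break, not a brute force
```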
Let's take bets on the root cause of the S3 outage!
I'm guessing a bad deploy of a server-side Java application with a garbage collection problem.
--- UK Mobile carrier O2's data network vanishes like a fart in the wind ---
One of the largest mobile carriers in the UK, O2, has been having all manner of weird and wonderful problems this morning as bleary-eyed subscribers awoke to find their data services unavailable. What makes this particular outage interesting (more so than the annoyingly frequent wobblers some mobile masts have) is that the majority of the UK seems to be affected.
To further compound the hilarity/disaster (depending on which side of the fence you're on), many smaller independent carriers such as GiffGaff and Tesco Mobile piggy-back off O2's network, meaning they're up the stinky creek without a paddle as well. Formal advice from the gaseous carrier is to reboot your device frequently to force a reconnect attempt, which we're absolutely sure won't cause any issues at all with millions of devices screaming at the same network when it comes back up.
Issue reports began flooding DownDetector at around 5am (GMT), with PR minions formally acknowledging the issue 2 hours later at 7am (GMT) via the most official channel available - Twitter. After a few recent updates via the grapevine (the companies involved seem to be keeping their heads down at the minute), Ericsson has been fingered for pushing out a wonky software update, but there's been no official confirmation of this, so pitchforks away please folks.
If you're in need of a giggle while you wait for your 4G goodness to return, you can always hop on an open WiFi network and read the tales of distress the data-less masses are screaming into the void.
# 3 weeks ago
customer informed us that the app will be getting quite some load from the beginning of July
# this monday
The last spare HDD in the database's SAN died.
I told everyone another HDD would follow it very soon, whether nothing is done at all or anyone attempts to touch the SAN array [cuz that's what redundant RAIDs do...]. No fucks given by anyone, no attempt to schedule maintenance and a planned outage within 24 hrs....
# this morning
another HDD has failed and now 1 TB of data is lost. No way to restore backups cuz there is no database to restore them to....
# 20 minutes later I head out to get some popcorn. It's gonna be a fun week!
So... The planned heavy load [and the revenue along with it] is not gonna happen... I guess they are gonna have a week-long outage.
That's what happens when you ignore warning shots fired at your face.
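The warning shot could even have been automated. A minimal Python sketch of a cron check, assuming Linux md RAID exposed via /proc/mdstat (a vendor SAN would need its own CLI instead):

```python
# Exit non-zero (so cron mails someone) whenever an array goes degraded.
import re
import sys

def degraded_arrays(mdstat_path="/proc/mdstat"):
    text = open(mdstat_path).read()
    # md status lines look like "[4/3] [UUU_]"; a "_" means a failed/missing disk
    return re.findall(r"\[(U*_+U*)\]", text)

if __name__ == "__main__":
    bad = degraded_arrays()
    if bad:
        sys.exit(f"RAID degraded ({len(bad)} array(s)) -- replace the disk NOW")
```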
I’ve started the process of setting up the new network at work. We got a 1Gbit fibre connection.
Plan was simple, move all cables from old switch to new switch. I wish it was that easy.
The imbecile of an IT guy at work has set everything up so complex and unnecessarily stupid that I'm baffled.
We got 5 older MacPros, all running MacOS Server, but they only have one service running on them.
Then we have 2x Xserve RAID, with some external NAS enclosures and another Mac mounted on them. Both Xserve RAIDs have to be running and connected to the main MacPro, which combines all this into a few different volumes.
Everything gets a static public IP (we got a /24 block), even the workstations. The only thing that doesn't get one IP per machine is the guest network.
The firewall is basically set to have all ports open, allowing for easy sniffing of what services we’re running.
The “dmz” is just a /29 of our ip range, no firewall rules so the servers in the dmz can access everything in our network.
Back to the Xserve: it's accessible from the outside so employees can work from home, even though no one does. I asked our IT guy why he hadn't set up a VPN; his explanation was first that he didn't manage to set it up, then he said VPN is something hackers use to hide who they are.
I'm baffled by this imbecile of an IT guy. One problem is he only works there 25% of the time because of some health issues. So when one of the NAS enclosures didn't mount after a power outage, he wasn't at work, and took the whole day to reply to my messages about logins to the Xserve.
I can't wait till I get my order from fs.com with new patching equipment and tonnes of cables, and till I can merge all storage devices into one large SAN. It'll be such a good work experience.
You know, the whole AWS outage being caused by a typo while debugging got me thinking... whoever did that is most probably a developer who had a REALLY bad day. Could that person be on DevRant? Because the story of what the rest of that day and week was like for him or her has the chance to be the most epic rant on here ever. Poor guy/gal.
So apparently this guy has the infrastructure for the Linux kernel mailing list archive sitting under his desk.
And then there was a power outage.
While he's on vacation.
Now, someone has to physically go there to enter a LUKS passphrase to let the system boot again... 🤔😂😂😂
Sometimes I don't understand people.
Hurricane's fixing to hit.
What does that mean?
Downloading porn and movies for the power outage that's imminent. Priorities are aligned lol
Internet is fluctuating in our office.
The network team sent out an email about an unscheduled outage.
I thought of replying,
"Have you tried turning it off and on again?"
Sooooo.. AWS's Route53 and ELB outage nuked all our environments. 503 here, 503 there, 5xx everywhere.
Just sitting and picking my nose, for we've got nothing else to do now.. Who on the fucking earth thought it might be a great idea to centralize the whole fucking internet into 3 companies' hands!?!
How's your day?
Since I already posted images of my desktop setups at work(Mac) and home(Linux), I didn't want to repost this week. So, to keep it at least mildly interesting, here's a shot of my garage networking setup.
Ubiquiti UAP-AC Lite
TP-Link cable modem
A big UPS, so we'll still have wifi during a power outage, since that's apparently important
A couple of older machines I'm working on when I have time
A Philips Hue Bridge
An unremarkable 7-port switch
An Ooma phone device
A shitload of my wife's stuff that she's left there on her way in and out of the house.
I used to manage servers, Linux lab machines, and automation. I always documented my work and made everything I did keep working even if I disappeared. I put so much effort into what I did, but there were too many red flags.
- payment -
I was paid minimum wage, which, while understandable because they were tight on money, sucked because I gave so much.
- environment -
It quickly became toxic with new employees. Insults were fine, and they hated my optimism.
- nail in the coffin -
I resigned after I was working on bringing all systems up after a power outage. One of the main rigs wouldn't come up and a coworker decided to "slap the wrist" of a student who was last logged in. I wasn't ok with this, so I gave him a heads up before he would be called in. Someone else deleted their history file, and I got blamed. It was the power outage that caused the issue, not a student.
Still doesn't sit right with me.
We had a major core router hardware failure in our LA datacenter today and every one of our services has been down since 6am, including all production servers. We have about 15,000 sites down across our entire platform. Our manager came over and told us to just go home because we need to replace the hardware and the process is expected to take all day, and we can't do any work until then because all the production servers are down. So you could say that it's been a pretty easy Friday so far! I'm headed home to play Spider-Man
Completely got my localhost and live databases confused and dropped the whole live server. And there was a power outage, so the last backup is from Friday. Luckily it's not completely live, but there's still a stream of people walking in. Also pretty obvious that they are talking about me, especially since there are no other shes in the department.
Just got a new monitor (HKC NB34C) and connected it to my PC (which didn't have a monitor at the time)
The first thing I see is a green screen of death because of a power outage during a Windows Insider update.
We upgraded to Dyn Managed DNS last month, and now we're down with the DDoS attack! If we hadn't upgraded from their standard plan, we would still be online 😂
Well, today we got to test our system to the extreme, and I'm pleased to say it passed. Major power surge followed by a blackout. The UPS for all networking and servers kicked in without missing a beat, and the standby generator outside about 45 seconds later. After explaining to users how to turn on their computers (😑), we were able to get everyone working again in about 5 minutes. Lasted three hours without power from the grid without any client downtime
I really enjoy my old Kindle Touch rather than reading long pdf's on a tablet or desktop. The Kindle is much easier on my eyes plus some of my pdf's are critical documents needed to recover business processes and systems. During a power outage a tablet might only last a couple of days even with backup power supplies, whereas my Kindle is good for at least 2 weeks of strong use.
Ok, to get a pdf on a Kindle is simple - just email the document to your Kindle email address, listed under Amazon - Settings - Digital Content - Devices - Email. It will be <<something>>@kindle.com.
But there is a major usability problem reading pdf's on a Kindle. The font size is super tiny and you do not have font control as you do with a .MOBI (Kindle) file. You can enlarge the document but the formatting will be off the small Kindle screen. Many people just advise to not read pdf's on a Kindle. devRanters never give up and fortunately there are some really cool solutions to make pdf's verrrrry readable and enjoyable on a Kindle
There are a few cloud pdf-to-.MOBI conversion solutions, but I had no intention of feeding a third-party site my security-sensitive business content. Also, in my testing of sample pdf's, the formatting of the .MOBI file was good but certainly not great.
So here are a couple of options I discovered that I find useful:
Solution 1) Very easy. Simply email the pdf file to your Kindle and put 'convert' in the subject line. Amazon will convert the pdf to .MOBI and queue it up to synch the next time you are on wireless. The final e-book .MOBI version of the pdf is readable and has all of the .MOBI options available to you including the ability for you to resize fonts and maintain document flow to properly fit the Kindle screen. Unfortunately, for my requirements it did not measure-up to Solution 2 below which I found much more powerful.
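Solution 1 can be scripted too. A Python sketch that emails a pdf to your Kindle address with "convert" in the subject; the addresses and SMTP settings are placeholders you'd fill in yourself:

```python
# Email a PDF to your Kindle with subject "convert" so Amazon converts it to .MOBI.
import smtplib
from email.message import EmailMessage
from pathlib import Path

def send_to_kindle(pdf_path, kindle_addr, from_addr, smtp_host, user, password):
    msg = EmailMessage()
    msg["Subject"] = "convert"          # tells Amazon to convert the attachment
    msg["From"] = from_addr
    msg["To"] = kindle_addr             # your <<something>>@kindle.com address
    data = Path(pdf_path).read_bytes()
    msg.add_attachment(data, maintype="application", subtype="pdf",
                       filename=Path(pdf_path).name)
    with smtplib.SMTP_SSL(smtp_host) as smtp:
        smtp.login(user, password)
        smtp.send_message(msg)
```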
Solution 2) Very Powerful. This solution takes under a minute to convert a pdf to .MOBI and the small effort provides incredible benefits to fine tune the final .MOBI book. You can even brand it with your company information and add custom search tags. In addition, it can be used for many additional input and output files including ePub which is used by many other e-reader devices including The Nook.
The free product I use is Calibre. Lots of options and fine control over documents. I downloaded it from calibre-ebook.com. Nice UI. Very easy to import various types of documents and output to many other types of formats such as .MOBI, ePub, DocX, RTF, Zip and many more. It is a very powerful program. I played with various Calibre options and emailed the formatted .MOBI files to my Kindle. The new files automatically synched to the Kindle in seconds when I was on wireless. Calibre did a great job!!
The formatting was 99.5% perfect for the great majority of pdf’s I converted and now happily read on my Kindle. Calibre even has a built-in heuristic option you can try that enables it to figure out how to improve the formatting of the raw pdf. By default it is not enabled. A few of the wider tables in my business continuity plans I have to scroll on the limited Kindle screen but I was able to minimize that by sizing the fonts and controlling the source document parameters.
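Calibre also ships a command-line converter, ebook-convert, which is handy for batches. A short Python wrapper as a sketch; the heuristics flag mirrors the option mentioned above, but check ebook-convert --help on your install, since flag names can differ between versions:

```python
# Batch-convert PDFs to MOBI with Calibre's CLI instead of the GUI.
import subprocess
from pathlib import Path

def pdf_to_mobi(pdf_path):
    out = Path(pdf_path).with_suffix(".mobi")
    subprocess.run(
        ["ebook-convert", str(pdf_path), str(out), "--enable-heuristics"],
        check=True,
    )
    return out

# Example: for pdf in Path("plans").glob("*.pdf"): pdf_to_mobi(pdf)
```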
Now any pdf or other types of documents can be enjoyed on a light, cheap, super power efficient e-reader. Let me know if this info helped you in any way.
I accidentally started a reindex on a collection that had 14 million records in the middle of the day. Caused an outage in a major portion of our applications for about 3 hours. Worst thing was that once I pressed enter, I realized that it was for the production database, and not the staging database like I intended. I immediately went to go tell the dev ops lead, and he basically said, "whelp, let's just sit back and watch the world burn. Not much we can do about it"
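The cheap insurance for this class of mistake is making destructive tooling prove which environment it's pointed at. A hypothetical Python sketch (hostnames invented):

```python
# Require the operator to retype the hostname before anything destructive
# runs against a known production host.
import sys

PROD_HOSTS = {"db-prod-01.example.com", "db-prod-02.example.com"}

def confirm_if_prod(host, action):
    if host in PROD_HOSTS:
        answer = input(f"About to {action} on PRODUCTION ({host}). "
                       f"Type the host name to continue: ")
        if answer != host:
            sys.exit("Aborted.")

# Example: confirm_if_prod(target_host, "reindex 14M records")
```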
TL;DR: OMFG! Push the button already!
I've been away on paternity leave for quite some time now. Today is my first day at work since the end of July.
Just a couple of days after my paternity leave started, I was contacted by one of the managers because a tracking and analytics service I had made some months earlier had halted.
Now, I did warn them that the project was fragile and was running off an old box in my office. So they shouldn't be surprised if it came to a halt every now and then.
Well, being on my paternity leave and all, I didn't want to spend time fixing it. I had a child to look after. So I told the manager that the box had probably just shut down. I think there was a power outage the day before, so I figured that was probably the cause. So he probably just had to turn it back on. I also told him the admin u/p in case he needed to restart some services.
Today, the CEO enters my office telling me to get that thing fixed. Because that manager apparently couldn't find the power button.
Can you rant about yourself?
I was reading about the AWS outage, with little to no interest. I didn't know what it was and thus figured it wouldn't affect me.
Some time goes by and I come up with this 300++ vote post. I'm witty, I'm smart, but when I want to upload a photo it doesn't work.
Must be the app right? I restart, nope nothing. Whatever..
Sometime later I have a dashing new photo for tinder. Surely to give me all the matches. Nope, can't upload it.
Must be my phone or Internet then.
Restart everything, nothing is working. Complete madness, no devRant upvotes and I'm still single.
I surrender, give up. Which is one of the worst things to do for me as a dev.
Today. Which is the cherry on the cake. I finally see my connection to the incident. I feel stupid and annoyed by myself.
God dammit Julian, pay attention.
Just got to work and we have no power. Maybe hand writing code in university will finally come in handy. Lol not. :)
My reaction after analysing the code responsible for the latest outage...
The week started great *sigh*
It all started with an undelivereable e-mail.
New manager (soon-to-be boss) walks into admin guy's office and complains about an e-mail he sent to a customer being rejected by the recipient's mail server. I can hear parts of the conversation from my office across the floor.
Recipient uses the spamcop.net blacklist and our mail was rejected since it came from an IP address known to be sending mails to their spamtrap.
Admin guy wants to verify the claim by trying to find out our static public IPv4 address, to compare it to the blacklisted one from the notification.
For half an hour the boss and he are trying to find the correct login credentials for the telco's customer self-care web interface.
Eventually they call the telco's support to get new credentials; it turned out that during the VoIP migration about six months ago we got new credentials that were apparently not noted down anywhere.
Eventually admin guy can log in, and wonders why he can't see any static IP address listed there; calls support again. Turns out we had not even been using a static IP address anymore since the VoIP change. Now it's not like we would be hosting any services that need to be publicly accessible, nor would all users send their e-mail via a local server (at least my machine is already configured to talk directly to the telco's smtp, but this was supposedly different in the good ol' days, so I'm not sure whether it still applies to some users).
In any case, the e-mail issue seems completely forgotten by now: admin guy wants his static IP address back, negotiates with telco support.
The change will require new PPPoE credentials for the VDSL line, which he apparently received over the phone(?) and should update in the CPE after they had disabled the login for the dynamic address. Obviously something went wrong; admin guy, meanwhile having to use his private phone to call support, claims the credentials were immediately reverted whenever he changed them in the CPE Web UI.
Now I'm not exactly sure why, there's two scenarios I could imagine:
- Maybe the telco uses TR-069/CWMP to remotely provision the credentials, which are not updated in their system, thus reverting the CPE to the old ones and not allowing manual changes, or
- Maybe just a browser issue. The CPE's login page is not even rendered correctly in my browser, but then again I'm the only one at the company using Firefox Private Mode with Ghostery, so it can't be reproduced on another machine. At least viewing the login/status page works with IE11 though, no idea how badly-written the config stuff itself might be.
Many hours pass, I enjoy not being annoyed by incoming phone calls for the rest of the day. Boss is slightly less happy, no internet and no incoming calls.
Next morning, windows would ask me to classify this new network as public/work/private - apparently someone tried factory-resetting the CPE. Or did they even get a replacement!? Still no internet though.
Hours later, everything finally back to normal, no idea what exactly happened - but we have our old static IPv4 address back, still wondering what we need it for.
Oh, and the blacklisted IP address was just the telco's mail server, of course. They end up on the spamcop list every once in a while.
tl;dr: if you're running a business in Germany that needs e-mail, just don't send it via the big magenta monopoly - you would end up sharing the same mail servers with tons of small businesses that might not employ the most qualified people for securing their stuff, so they will naturally be pwned and abused for spam every once in a while, having your mailservers blacklisted.
I'm waiting for the day when the next e-mail will be blocked and manager / boss eventually wonder how the 24-hour outage did not even fix anything in the end...
The company I work for used to be hosted on 3dcart. One day the site went down and their support couldn't tell us why. After over 24 hours of downtime they restored service, but lost 5 days of all records and customizations across the entire store, from the DB to the damn templates. Their support apologized for the outage, blaming the disaster on a combination of hard disk failure and a bad update to their backup script. They were not willing to assist us in any way. We were forced to manually enter 5 days of orders (which gave them new order numbers and caused more problems), products and template changes, with order data coming from an internal email which was luckily CC'd on the order confirmation email. Thank God for whoever set up that CC, it saved our asses. In the end it cost our company thousands of dollars and 3dcart never compensated us in any way.
Over 67,000 people are affected by a major power outage in my city right now. I wonder how many servers are shutting down :(
Joins bridge: "hey man, something is wrong with your DB. Our app can't connect in any environment; it started after our code release last night"
"Every other app connecting is working as expected; could you roll back your release?"
"nah, that can't be it. we validated it works"
"Then why am i on an outage bridge? call me if it's still broken after you rollback"1
Now my client does not want to rely on Amazon S3 because of the one outage it ever had, a couple of weeks ago, I forget already. So my dumbass blurts out: well, we could always just back up to some other image or file storing service. But now I'm expected to implement this right away when I really haven't thought about it at all. I mean, I would have to write some sort of failover and some sort of daily syncing mechanism. I guess I should forget about any direct-upload-to-S3 code that I have written. Really, I guess I have to wrap all of the image and file handling stuff with my own solution. Which, actually, will be very nice when it is done, and I could use it on other projects, but it's quite a lot of work for something that I don't feel we really need at this stage in development. Just because you're using stuff on production that has an enormous red TEST label in the way of the UI doesn't mean I can code bulletproof software any faster
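Roughly the wrapper being asked for: one interface, a primary store and a fallback; writes mirror to both, reads fail over. A Python sketch under invented names, not a drop-in implementation:

```python
# Duck-typed storage backends (e.g. an S3 client and an alternative provider)
# behind one interface, so the rest of the app never talks to S3 directly.
class ImageStore:
    def __init__(self, primary, fallback):
        self.primary, self.fallback = primary, fallback

    def put(self, key, data):
        self.primary.put(key, data)
        try:
            self.fallback.put(key, data)   # best-effort mirror
        except Exception:
            pass                           # or queue for a later sync pass

    def get(self, key):
        try:
            return self.primary.get(key)
        except Exception:                  # e.g. the S3 outage in question
            return self.fallback.get(key)
```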
The best thing with a power outage in an apartment complex is watching everybody's wifi turn back on.
So, we've been on Deutsche Telekom for about 9 months. Shitty connection in the countryside but literally not one outage.
For the last 6 weeks our internet has been dropping out with no obvious cause.
Just this week we start getting calls if we'd like to upgrade to a package with LTE...
I'm finding the coincidence just a little too convenient.
TL;DR: (almost) childhood trauma due to Western Digital crap products led to a lot of data loss and a pledge not to trust or purchase their products for the rest of my life.
So, I got my first ever Western Digital 2TB MyBook, back when 2TB was a really big thing. While in the midst of moving (not copying) a LOT of data to it, the damn disk just.. died. There was no fall, no power outage, no damage, it just stopped working. I was out of words and out of options. Tried yanking out the disk and connecting it directly to a system, but no luck, because it looks like it was the HDD's mobo that died.
Also, stupid young me did not realise back then that even if I "moved" the data, the original data was still most likely in its original location, and so I never bothered with a recovery.
Lots of good stuff lost that day.
And as with a lot of you, my disaster recovery system kicked up 10 fold. Now I got redundant local and cloud backup copies of all critical and otherwise unattainable data.
As you may have guessed, I never bought another Western Digital product ever again. My internal HDDs are Seagate, and my external is a surprisingly long-lived Toshiba Canvio.
My morning so far:
Walk out the door.
Miss the bus I was supposed to take; no big deal, I'll just wait for the next one (should be just 15 mins).
Next bus is 10 minutes late, seriously?
Get to the train station just to see my train isn't running because of an outage. Screw this, I'll work from home.
So how's your day going?
When you get ready to sit down and eat a slice of pizza and get a call about a system outage... at least it's the weekend.
Bad: Delete your production database
Good: Have a backup
Bad: Can't reimport it because your backup procedure uses schemas that are no longer supported for import by your cloud provider
Good: Backups are plaintext and somehow easy to parse
Bad: Spending the rest of the day writing scripts to reinsert everything.
End of the story: everything is up and running after 8 hours of effort
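The "rest of the day" script probably looked something like this sketch: parse the plaintext dump and batch the rows back in. The tab-separated "table<TAB>col=val" line format here is invented purely for illustration; real dumps will differ:

```python
# Re-insert rows from a plaintext dump. sqlite3 stands in for the actual
# cloud database client; swap in whatever driver applies.
import sqlite3

def restore(dump_path, conn):
    cur = conn.cursor()
    with open(dump_path) as dump:
        for line in dump:
            if not line.strip():
                continue                                  # skip blank lines
            table, *pairs = line.rstrip("\n").split("\t")
            row = dict(p.split("=", 1) for p in pairs)    # col=val pairs
            cols = ",".join(row)
            marks = ",".join("?" * len(row))
            cur.execute(f"INSERT INTO {table} ({cols}) VALUES ({marks})",
                        tuple(row.values()))
    conn.commit()
```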
The biggest weakness of every programmer is power outage...
Can't do anything, even the windows are powered... So hot right now.
I'm a Programmer/Analyst at one of the high-ranking BPO companies here in the Philippines.
I'm currently on a project, on a team that manages machine parts. The project is CATERPILLAR.
The biggest challenge here is that if there is an outage in the system, the Severity 1 issues keep coming like there's no tomorrow. And there are only 5 of us on Tier 2 managing these abends, errors, and bugs in the system.
Is there a way of preventing these outages/connection errors? Like, HELLO, IT IS A BIG COMPANY!!! HOW THE HELL CAN THEY NOT EVEN MANAGE THEIR CONNECTION!!!!
npm waited for me to `rm -rf node_modules` and decided to experience an outage a MINUTE afterwards.
Oh what to do, what to do when you're stuck with a power outage that could last the entire night.
Always thought I was inefficient because I double check so many commands in the terminal. Turns out, it's worth doing...
So I rush to work just to find a power outage in the building. Don't know if I should be happy to have "nothing" to do or sad cause I have a lot to do but can't 😓
So I remember that time when the vendor had a bunch of 500 error code responses.
A little later we got rate limited.
Received an email a couple of hours later. They said the outage only lasted several minutes.
Forgot that I put in logic that fails the workflow if a rate limit occurs. Whomp.
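The fix for that "whomp" is to treat 429 (and transient 5xx) as retryable with backoff instead of failing the workflow outright. A Python sketch against a hypothetical endpoint:

```python
# Retry transient failures with capped exponential backoff; honor a numeric
# Retry-After header when the vendor sends one on a 429.
import time
import requests

def call_vendor(url, attempts=5):
    for attempt in range(attempts):
        resp = requests.get(url)
        if resp.status_code < 500 and resp.status_code != 429:
            return resp                        # success, or a real client error
        wait = min(2 ** attempt, 60)           # 1s, 2s, 4s... capped at 60s
        retry_after = resp.headers.get("Retry-After")
        if resp.status_code == 429 and retry_after and retry_after.isdigit():
            wait = int(retry_after)
        time.sleep(wait)
    resp.raise_for_status()                    # out of attempts: surface the error
```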
Worked from 8 am to 11 pm today. A massive power outage yesterday messed up my schedule. Still have a lot to do tomorrow. Going to sleep now. Stay strong, everyone. Weekend is coming.
What happens when it takes too long for the office manager to get a new UPS, a power outage fries your solid state drive, and you hadn't pushed to Bitbucket because credentials were not yet provided.
... Still feel some guilt 😷😷😷😷
And tremendous wrist pain as punishment.... Faaack.
If it hasn't already, one day this app is going to be the reason for a global systems outage. I should be working, but for some reason every 5 minutes I find myself back here [*scrolling away*]...
Suffering from our first service outage since I've been at my new job.
Guess when it happened? While we have TOO MANY projects going on.
When you have too many pots on the stove, you're bound to forget the smallest, most crucial detail.
Cable/Internet outage. Tried to contact ISP (Mediacom, who are awful)... Reported outage over an hour ago, but no update.
So, I figure it's time to call then...
In the support app, selected "Call someone now." Selected services. Drop-down says "Tap down arrow for list"... and contains the single placeholder "Tap down arrow for list". Plus, of course, you cannot submit the form without making a selection from the list.
Fuck your fucking support app right in the java-hole, Mediacom.
I did not need any more reasons to hate you - you are already at the top of my list, with no one else remotely close behind.
The current power outage is an additional reminder of why I will always opt for a notebook. No internet though, so that's the end of my spare programming time :(
I hate power outages. It just went out for all of 5 seconds, but it was long enough to shut down my computer. Fortunately Android Studio probably caught all but the last couple seconds of my changes, but it's still annoying.
Cue late-night coding session, everything going smoothly and making good progress.
Really in the flow and getting stuff done, just got my rations to get some extra hours in of these devGod-blessed hours of enlightenment.
A flash. Then darkness. Complete darkness. No light on in my room anymore, my screen is black... there is no more whirring of fans...
I look outside to see it's dark in the streets as well...
In my mind I already know it's not gonna do anything, but I press the power button on my computer nonetheless... no response... more deadly silence...
really?! really?!?! Those motherfucking knob gobbling, shit brained, waste of oxygen pieces of human waste really had to have a fucking power outage at this moment? You had one fucking job. ONE. FUCKING. JOB.
So I just woke up and there is an internet outage in my neighborhood because optimum sucks balls, fuck the whole stupid low class internet service
Now I have to leave my house so I can work
Right now it's raining in São Paulo, and quite a lot.
While I also worry about it because of usual stuff (floods, infiltration, etc.), I'm also jumping like a little kid.
Puddles? No! Free bath? Almost!
It comes with a free excuse to NOT do work because of a small power outage, caused by poor maintenance of utility poles in my street.
Actually, one of the transformers on one of the poles just goes "bzzt" then "blam!", and then we have to wait for them to repair it after the rain stops.
It's always like this.
I know, it's NOT a good thing. But it's not bad to have a little rest, right?
It's pretty sad to reach a state where I'm happy when these things happen. :/
Been reading for a while now, finally frustrated enough to add my rant...
Backing up my home server with Veeam to finally ditch raid 5 and add some space. 38% finished when we had to leave for a Thanksgiving dinner. Thought that was perfect because I'd come home to a finished backup, install the new drives and get Plex/webserver back up and running. Nope. Win 10 decides to update and reboot in the middle of my freaking backup at 79% complete. Now, I've just told the friends and family I was with to expect the outage in the next hour or two thinking I was ready. Nope. Start all over again. None of this is truly important, really... But Jesus, that's annoying.
Win 10, you have a special place in hell. And Veeam, as much as I love you, you can be right beside it for not letting me run the backup console on Linux.
Back when I was a freshman in high school a friend of mine put an emulator on the shared drive, so we could play NES games while in the computer lab. Didn't know better/didn't care. One day I get pulled out of class and walked into the computer guys office. In there is also the principal of the school and the Chief of police.
The computer guy tells me there was an issue last week that caused the school server to crash and it caused damage. I asked what happened and he said one of the emulators we were playing had a script that crashed the server and caused damage. I asked how much damage and they informed me it was over 3 thousand dollars. At this point I'm very skeptical that the damage was worth about the cost of a new workstation (the old one sitting on his desk, buried in boxes), and afterwards none of the faculty knew of any kind of an outage. I asked him to show me what broke and what had to be done to fix/replace the damaged equipment, but all I got was a simple, "I'm sorry. I can't show you that at this time."
They threatened legal action for a felony charge of damaging school property. Myself and the other tech-savvy kids talked about it over the next couple of days, wondering what would happen. They threatened expulsion for myself and a couple of other kids, but ultimately we just got a talking-to about keeping personal information safe.
What I got out of it was if they think I'm good with computers I must be doing something right. Now I'm in IT. This is where it went wrong.
Well, I enjoy programming since I think of it as a form of art. Clearly I wouldn't want all my coding to magically work, since that takes away the fun. What I would like is to be able to make all outside distractions go away so I can always code in peace. Meaning nothing... living or dead... can bother me, nor is there any kind of event that would cause me to lose concentration, e.g. a sudden power/network outage. *sigh*...
IT again: Spoke too soon about a happy server farm after Christmas... Had a SEV1 complete outage for the whole morning. *facepalm*
Caused an outage on production because of a bug in the makefile that hasn't been fixed for years. It hasn't been fixed because everyone knows to side-step it. Except for me. Until now. So the bug shall continue to live until the next newbie gets smacked in the face by it.
Kinda feels like an initiation.
This morning at 5:30 AM I was awoken to 20 text alerts for services being down.
Seems they had been down since 2 AM but the previous shift didn't take action.
Long story short:
An outsourced common component is unavailable, and the team responsible doesn't know how to troubleshoot.
I pointed them to the exact issue.
We are now 10 hours into the outage and they still don't know what to do.
I just love it when there is a power outage just after we all go home for the weekend. Coming back to dead servers is the best!
So the fucking septic cleaning guy's truck snagged the internet line that goes across the driveway and took it down.... No internet till at least noon tomorrow. Fuck me! I had a personal project I really wanted to work on.
Working on 3 projects solo remotely and everyone wants their changes promptly. Also, I don't disclose to clients that I work for other people. (Being me)
Finished with 2 projects today so gonna Netflix and chill (I really don't have a gf and idgaf). I need a pack of smokes really bad, and there's an internet outage just in time, right when I finally finished; now waiting for more changes eagerly so that I can get paid finally
Today I came to work and all our main systems were offline (GitLab, Artifactory, time tracking, ...). Found out that one of the HDDs of our server (externally hosted) died. I started copying some stuff from the second RAID HDD (just in case, and because our backup is of course not complete [#notmyfault]) while their datacenter had a power outage.... I'm now waiting for our server to come back so I can get our systems running again
My office network was down because Comcast had a "non-existent" outage that affected my modem. Right when they picked up the phone is when the thing finally got online.
Wasted an hour of my time cause I had to commute a half hour to get there.
When a severe outage/screw-up happens or when an insane request comes through and the only solution is some adhoc coding... the best part of being a dev: 1. being the one they turn to in those dark times, and 2. the self confidence boost when you alone have saved the day. It's thankless but those times make up for it.
Did you know Verizon Fios's own outage monitoring page isn't optimized for mobile? It's true. Ask me how I know.
Anyone else affected by the UKFast outage yesterday? We've come in this morning to a failed drive (or so we think); our web server home directory is just gone. Wondering if anyone else has noticed any funnies on their hosting
Thank God for "screen" - my remote Linux build session is alive in spite of the fucking network outages! I'm telling you, when your company doesn't have voltage regulators, the network goes down every time there's an electrical fluctuation!
Release was supposed to start at 10 am; it's now 11 am and our release document is still being heavily debated. The outage window is scheduled to last until 2, and we thought we'd be tight for time when starting at 10. FML right now..
Before he began dropping the 20K proposed to remodel my flat, I told my father I much preferred a contractor recommended by someone I knew, as opposed to using a big corporation like Home Depot. FAMOUS LAST... a neighbour in my building highly recommended the contractor we chose. And week 7 [or is it 8?] of what was proposed to take no longer than two weeks has begun afresh!
On Friday the fellow who owns the contract remodeling company was here touching up the paint. He was here because I forbade the two painters he sent to do the initial painting job from returning.
My internet cut out suddenly around 1300 Friday. He was set to leave for the weekend shortly after that. I mentioned the outage to him. The essence of his reply was that there was no way it could have had anything to do with him. The following day, my internet provider sent a tech out to diagnose the problem. What was the problem? The head of the remodeling firm had removed a face plate from the wall where there were telephone wires, and tore and disconnected the wires as he replaced the face plate.
Although the tech told me he wasn't going to charge my account the $85.00 fee for his services because the outage was caused within my flat, I wish to be sure of this. Which brings us to the punchline.
My internet provider is a lame ass business model, dreamed up by a squint-eyed ex-circus monkey, never well endowed in the top story, and now just plain sad.
There were some 911 outages in Washington State last Thursday night. All during the day Friday, when you dialled their freephone #, the recorded announcement, before saying anything else, told you they were experiencing heavier than usual call volumes and my wait would be greater than 10 minutes. Fine. What fried my La Croix silk was that after their customer service dept closed for the weekend, that outgoing message remained.
Today, I wanted to contact my provider to see if they would know whether the $ was going to be charged to my account. After pressing the 'send' key, my computer came back with an error message saying they were having technical difficulties. So I went on over to the 'chat' page. There's nothing to click on to take me to this enfabled location. So, I can't reach them by phone unless I want to hear, every 30 seconds, whether or not I wish to, how sorry they are for my delay.
A few years ago I would've used this as an excuse to have a technicolour meltdown. The reason I'm posting this is that I am now able to see beforehand what I'll be doing to myself getting upset over the circumstances. When I do reach somebody, I'm going to tell them as lightly as possible, that if they were an airline, I wouldn't board any of their aircraft. Ever.
In my first job another junior dev and I (junior at the time) were assigned the task of designing and implementing a user management and propagation system for a biometric access control system. None of the seniors at the time wanted to be involved because hardware interfacing in the main software was seen as a general shit show because of legacy reasons. We spent weeks designing the system, arguing, walking out in anger, then coming back and going through it again.
After all that, we thought we would end up hating each other, but we actually became really good friends for the rest of my time there. The final system was so robust that support never heard back from the client about it until around 2 years later, when a power outage took down the server and blew the PSU.
During troubleshooting of a prod outage: "I'm pretty sure that's not how computers work". Last time he said that, he was dead wrong. And he can sure as hell fix it himself next time.