Search - "downtime"
devRant will be going down on Friday, July 7th around 10:30pm EDT so we can do some database maintenance and restructuring of our cluster. It hopefully won't be down for more than about 30 minutes or so, and during that time you should see our "down for maintenance" message.
If you usually use devRant while you're on the toilet (we know many do!), we apologize and suggest you try to schedule around this!
Please let me know if you have any questions and apologies for the inconvenience.
Merry Christmas to everyone who celebrates, happy holidays to everyone, and happy almost-new-year!
We had a bit of a slow year in terms of devRant updates, but we gained some momentum towards the end of the year and we're looking forward to carrying it into 2020. Recently, we launched what I think are our coolest new avatar items yet (https://devrant.com/rants/2322869/...) and behind the scenes we got our iOS/Android apps on the latest version of the frameworks we use, which will help us continue to improve stability. Still, we definitely would have liked to do more, but we're optimistic the coming year will bring great things for devRant.
One thing we are very proud of is this year we had our best year ever in terms of platform stability and uptime. Despite the platform growing and our userbase growing, we had almost no complete app downtime even though our infrastructure is minimal. A large part of this is thanks to devRant++ supporters, who allow us to maintain a small but effective tier of infrastructure and redundancy.
In the coming year, we're going to launch one of our most ambitious initiatives yet, and we're also going to continue to improve the devRant experience itself. We want to try to gather more user feedback, so we'll be working on a way to do that too. Stay tuned, more on this stuff coming soon.
As always, thank you everyone, and thanks for your amazing contributions to the devRant community! And thank you to our awesome devRant++ supporters for continuing to be the main drivers to keeping devRant up and running.
Looking forward to 2020,
- David and Tim
”our PC stick isn't booting up! Come and fix it! (angry)”
”The PC sticks are meant to boot up whenever power is delivered to them. Are you sure your TVs are powered on?”
”Yes! I just pressed the power button on both TVs and it didn't turn on the PC sticks.”
”So you can confirm the TVs are on? Can you change the input and see what happens?”
”Stop wasting my time and send someone down to fix it now! I told you it isn't working!”
”Ok, we will get someone out to you as soon as possible.”
Then a support guy drives 2 hours to their store.
When he gets there he realizes that the TVs' power is connected to a light switch and they had the switch off!!!
He said ”can we turn on some lights so I can see behind the TV?” and then all the fucking TVs came on.
These are times when I fully understand the concept of “firing a customer”.
The customer sent an email saying ”the downtime for your product was unacceptable.” even after it was explained to them that the problem was them turning off the power.
These fucking idiots actually expect us to deliver products to display on TVs without fucking electricity to run them.
First time poster here. Please be nice :)
My biggest workaround is one that's being currently deployed to 40 truck drivers (trucking company here), preventing printers being out of usage while on the road. We also have to use HP ePrint to wirelessly print documents, but that's another story for another time I guess :)
CEO asked us to install wifi printers in our 40-ish trucks, which have wifi on board. However he's always picking one of the cheapest options possible, so we got consumer grade printers (Laserjet 1002w). Those printers often disconnect and NEVER get back on the truck wifi network. I have to physically get in the truck, wire the printer via USB to my laptop and reconfigure wifi on it with the HP Windows tool. This means lots of printer downtime, which always happens when the drivers are three timezones away from our office.
Then I thought: "What if I could sniff what HP sends via USB while I (re)configure the printer, and replay what's being sent later? Our trucks all have an Android tablet with a USB type-A connector with host capability, so I could write a small app that replays the config when plugged in by the user."
Three days of hacking around later, I have a working app. By chance, HP printers (or at least the models we have) use HTTP POST via USB, so I could easily replay the request.
Edit: the end result is that truck drivers just plug the printer into their tablet, press "reconfigure" in a home made Android app, the printer is reconnected to the truck and they're good to go. They don't have access to the network nor know enough to debug it themselves anyways.
Happiness is not getting any server issue/downtime notifications while you're outside on a bridge watching fireworks ❤😊
Observed my bf spending at least a half hour browsing devRant in bed, so asked him what he'd do if devRant didn't exist anymore.
His simple reply?
May God help him tomorrow for the scheduled downtime... ;)
Over the last couple of days we experienced an issue posting images on devRant posts and comments. This issue should now be fixed.
Apologies for the delay in addressing it. It took some digging, and some alerting that would have helped quickly identify the source of the issue unfortunately wasn't working as expected.
Despite the issue being fixed, there is a bit of additional maintenance that will take place to prevent it from occurring in the future. There could be a couple of minutes of downtime today, March 13 at around 10pm EST, but I'm hoping that can be avoided. I will update in the comments on this rant.
Lastly, and unrelated to this issue, an academic research team has been working on a project involving devRant/types of content posted, and would appreciate feedback and help with a short survey they put together for anyone who is interested: https://devrant.com/rants/3923796/...
Thank you again for the patience and feel free to let me know if you have any questions.
p.s. attached is a relevant meme, according to some people, who thought/hoped this was a feature :)
When you move a bunch of cables you haven’t touched in a while, and underneath you find this 🥳
Guess I know what I’m taking to work on Monday.
Today my manager asked me about my research into using RabbitMQ as a backup in case Azure Service Bus ever goes down.
Me: "Good. The way we designed the framework, all we have to do is drop the DLLs into the directory, update the config, and the services will start using RabbitMQ."
Mgr: "Excellent. Probably should be looking into using RabbitMQ as a permanent replacement for Azure"
Me: "What? The whole reason we moved to Azure was to eliminate the problems with having an on prem service bus. Since we've switched, there has been zero downtime."
Mgr: "That's what VP-Joe is afraid of. If Azure ever goes down, he won't know how to explain Azure to the president as to why we're not taking orders or can't ship packages."
Me: "That makes no sense. What did VP-Joe tell the president when a database went down or a server was misconfigured?"
Mgr: "President understands internal outages, it's just the whole 'cloud' thing he doesn't understand."
Me: "Um..then VP-Joe needs to explain it to him?"
Mgr: "The decision has already been made. Are you on board? Let's look at this move as a cost savings."
Me: "You mean the $10 a month? How much hardware will we need to support RabbitMQ?"
Mgr: "Yea, nobody probably thought of that."
Me: "I'm on board with whatever decision, but I'd like a little more than VP-Joe being afraid of the president."
Mgr: "I'm sure it's not being afraid."
Mgr: "OK, let's wait and see if VP-Joe forgets about this and moves on to something new."
Yesterday I fucked up big time.
First time in my career (I’m 23).
I just started working this week at a new company startup that had no programmers before me. They have a bunch of websites under their control that were on all different hosting solutions, and we decided to move them all to AWS.
I moved a few and was managing the folder rights on the server.
What happened next made my heart skip a few beats.
Bear in mind I’m not an expert in Linux.
I wanted to chmod to the folder I was currently in, and typed ‘sudo chmod -R 770 /‘ thinking for a while that the ‘/‘ would do it on my current dir.
Fuck. As I saw what was happening I pressed ctrl + c as fast as I could. But the damage had been done.
Fast forward a couple of hours: I deleted the broken instance and created a new one from scratch. Had to do everything again but managed to do it in just a couple of hours, moving as fast as I could without making such stupid mistakes again.
I was honest about it from the first minute it happened, and told my boss right away that I fucked up and had to start over, with a couple of hours of downtime.
Luckily not much was lost and I took a snapshot right after I was finished and will look into auto backups next week.
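For anyone scripting this sort of permission change: "/" always means the filesystem root, while "." is the current directory. Below is a minimal Python sketch of a guard that refuses to recurse from the root. The helper name `safe_recursive_chmod` is made up for illustration, and this is one possible safeguard, not what the poster used.

```python
import os
from pathlib import Path


def safe_recursive_chmod(target: str, mode: int) -> int:
    """Recursively chmod `target`, refusing to touch the filesystem root.

    Returns the number of paths whose mode was changed.
    """
    root = Path(target).resolve()
    # "/" resolves to the filesystem root; "." resolves to the current directory.
    if root == Path(root.anchor):
        raise ValueError(f"refusing to chmod the filesystem root: {root}")

    changed = 0
    # os.walk does not descend into symlinked directories by default.
    for dirpath, _dirnames, filenames in os.walk(root):
        paths = [dirpath] + [os.path.join(dirpath, name) for name in filenames]
        for path in paths:
            if os.path.islink(path):
                continue  # leave symlinks (and their targets) alone
            os.chmod(path, mode)
            changed += 1
    return changed
```

Calling `safe_recursive_chmod(".", 0o770)` applies the mode to the current directory tree, while `safe_recursive_chmod("/", 0o770)` raises instead of rewriting permissions system-wide.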
This is going to be a long rant, coz this is the only way to vent out my frustration against our tech head.
Yesterday, while our fucking twat tech head was playing around in company aws account, he terminated the production server. By mistake, apparently. Coz he doesn't know shit about server management. But that egoist ass won't admit and fucked the production server.
And then ran away. We developers sprang into action. Updated dns to point to staging server, setup virtual hosts, env files, point to prod database, force flush dns cache. All systems were up and running in 30 mins. And since it was staging server, it had lot of untested features and codes, and we spent rest of the day fixing the bugs.
And that tech head, who ran away hiding his tail between his legs, after he fucked the server, came back after systems were up. And started cracking jokes, that "so many features got released in 1 day" . "We cut server cost by shutting down 1 server."
We were struggling and working in full throttle to make the services running again. And that fuckity fucker was cracking jokes.
And I don't even know what excuse he gave to the CEO for the downtime. I am pretty sure he made up some crappy excuse to hide his fucking mistake. That ass never admits his mistakes. I am thinking of going to the CEO today to tell the real story and get that tech head fired, or at least given a strict warning.
Scheduled devRant maintenance - I'm going to be upgrading some infrastructure later and there will be some downtime, probably about 15 minutes, around 9pm EDT. Apologies for the inconvenience and devRant disruption :) It will help with working towards an even more stable service in the future.
Feel free to let me know if you have any questions!
Hey everyone - apologies for the downtime earlier today. Our host is having a lot of issues and we're working to keep everything up through it.
On that note - there might be a little more downtime tonight as they are trying to fix something and we might need a few server restarts. I will keep everyone updated and thanks for bearing with us!
5 minutes downtime: "I lost millions because of you"
$100 extra on invoice: "it's too much I don't make money"
Years ago we deployed this system with a SQL DB on a separate windows server.
Every now and then we had error messages saying that the system could not connect to the db. It was going on for about 5 minutes or so and then the db was up again.
We built a bunch of fallback logic to handle it gracefully.
Then one day one of the guys was in the "server room". It was not a real server room but like a dedicated office in another building.
He saw how the cleaning lady came in, unplugged the server's cable from the wall socket and plugged in the vacuum cleaner...
Writing more infrastructure than product.
Look, my application requests and transforms data from a single external API endpoint, it's just one GET request...
But I made an intelligent response caching middleware to prevent downtime when the parent API goes down, I made mocks and tests for everything, the documentation is directly generated from the code and automatically hosted for every git branch using hooks, responses are translated into JSONschema notation which automatically generate integration tests on commit, and the transformations are set up as a modular collection of composable higher order lenses!
Boss: Please use less amphetamine.
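The response caching idea in this rant (keep serving the last good response while the parent API is down) can be sketched roughly as below. This is a minimal Python illustration, not the poster's middleware: the class name `StaleOnErrorCache`, the `fetch` callable and the TTL value are all assumptions.

```python
import time
from typing import Any, Callable, Dict, Tuple


class StaleOnErrorCache:
    """Cache upstream responses and fall back to stale data when a fetch fails."""

    def __init__(
        self,
        fetch: Callable[[str], Any],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[float, Any]] = {}  # key -> (stored_at, value)

    def get(self, key: str) -> Any:
        now = self._clock()
        cached = self._store.get(key)
        # Fresh entry: answer from the cache without touching the upstream.
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]
        try:
            value = self._fetch(key)
        except Exception:
            # Upstream is down: serve the stale copy if there is one.
            if cached is not None:
                return cached[1]
            raise
        self._store[key] = (now, value)
        return value
```

The trade-off is that callers may see outdated data during an outage, which is usually better than an error page but should be a deliberate choice.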
To save server cost and developers' productivity, devRant should have an intentional downtime of 3 to 6 hours daily :38
--- UK Mobile carrier O2's data network vanishes like a fart in the wind ---
One of the largest mobile carriers in the UK, O2, has been having all manner of weird and wonderful problems this morning as bleary eyed subscribers awoke to find their data services unavailable. What makes this particular outage interesting (more so than the annoyingly frequent wobblers some mobile masts have) is that the majority of the UK seems to be affected.
To further compound the hilarity/disaster (depending on which side of the fence you're on), many smaller independent carriers such as GiffGaff and Tesco Mobile piggy-back off O2's network, meaning they're up the stinky creek without a paddle as well. Formal advice from the gaseous carrier is to reboot your device frequently to force a reconnect attempt, which we're absolutely sure won't cause any issues at all with millions of devices screaming at the same network when it comes back up.
Issue reports began flooding DownDetector at around 5am (GMT), with PR minions formally acknowledging the issue 2 hours later at 7am (GMT) via the most official channel available - Twitter. After a few recent updates via the grapevine (the companies involved seem to be keeping their heads down at the minute), Ericsson has been fingered for pushing out a wonky software update, but there's been no official confirmation of this, so pitchforks away please folks.
If you're in need of a giggle while you wait for your 4G goodness to return, you can always hop on an open WiFi network and read the tales of distress the data-less masses are screaming into the void.
Things that never happen
Customer: I really am happy with the service. The 99.999% availability is great. I completely understand that downtimes are necessary to keep the system up to date....
I have an interview on Thursday for a job I've been doing for the past 9 months - I bloody hope I get it!
I'm currently classed as an 'Apprentice' but have been doing the sole job of the Developer after he left a week before I started.
The only differences between the two roles is the pay difference and title (just about double my current rate).
I've started to produce documentation and processes for rolling upgrades to our application without downtime which is something they're big on.
Public sector for you, it took 9 months for a replacement...
One of our newly-joined junior sysadmins left a pre-production server SSH session open. Being the responsible senior (pun intended) who wanted to teach them the value of security on production (or near production, for that matter) systems, I typed in sudo rm --recursive --no-preserve-root --force / on the terminal session (I didn't hit the Enter / Return key) and left it there. The person took longer to return and the screen went to sleep. I went back to my desk and took a backup image of the machine just in case the unexpected happened.
On returning from wherever they had gone, the person hit enter / return to wake the system (they didn't even have a password-on-wake policy set up on the machine). The SSH session was still there, the machine accepted the command and started working. This person didn't even look at the session and just navigated away elsewhere (probably to get back to work on the script they were working on).
Five minutes pass by, I get the first monitoring alert saying the server is not responding. I hoped that this person would be responsible enough to check the monitoring alerts since they had an SSH session on the machine.
Seven minutes : other dependent services on the machine start complaining that the instance is unreachable.
I assign the monitoring alert to the person of the day. They come running to me saying that they can't reach the instance but the instance is listed on the inventory list. I ask them to show me the specific terminal that ran the rm -rf command. They get the beautiful realization of the day. They freak the hell out to the point that they ask me, "Am I fired?". I reply, "You should probably ask your manager".
Lesson learnt the hard-way. I gave them a good understanding on what happened and explained the implications on what would have happened had this exact same scenario happened outside the office giving access to an outsider. I explained about why people in _our_ domain should care about security above all else.
There was a good 30+ minute downtime of the instance before I admitted that I had a backup and restored it (after the whole lecture). It wasn't critical since the environment was not user-facing and didn't have any critical data.
Since then we've been at this together - warning engineers when they leave their machines open and running security lectures / sessions / workshops for new recruits (anyone who joins engineering).
I've found and fixed any kind of "bad bug" I can think of over my career from allowing negative financial transfers to weird platform specific behaviour, here are a few of the more interesting ones that come to mind...
#1 - Most expensive lesson learned
Almost 10 years ago (while learning to code) I wrote a loyalty card system that ended up going national. Fast forward 2 years and by some miracle the system still worked and had services running on 500+ POS servers in large retail stores uploading thousands of transactions each second - due to this increased traffic to stay ahead of any trouble we decided to add a loadbalancer to our backend.
This was simply a matter of re-assigning the IP and would cause 10-15 minutes of downtime (for the first time ever), we made the switch and everything seemed perfect. Too perfect...
After 10 minutes every phone in the office started going berserk - calls were coming in about store servers irreparably crashing all over the country, taking all the tills offline and forcing them to close doors midday. It was bad and we couldn't conceive how it could possibly be us or our software to blame.
Turns out we made the local service write any web service errors to a log file upon failure for debugging purposes before retrying - a perfectly sensible thing to do if I hadn't forgotten to check the size of or clear the log file. In about 15 minutes of downtime each store's error log proceeded to grow and consume every available byte of HD space before crashing Windows.
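A size-capped log is the usual guard against this failure mode. The sketch below uses Python's standard `logging.handlers.RotatingFileHandler` purely as an illustration. The original service's language and logging setup are not stated in the rant, and the function name, logger name and limits here are assumptions.

```python
import logging
from logging.handlers import RotatingFileHandler


def build_bounded_logger(
    path: str,
    max_bytes: int = 1_000_000,
    backups: int = 2,
    name: str = "store-service",
) -> logging.Logger:
    """Return a logger whose files on disk are capped at roughly
    (backups + 1) * max_bytes, however many errors get written."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep records out of the root logger's handlers

    # When the active file would exceed max_bytes it is renamed to path.1,
    # older backups shift up, and anything beyond `backups` is deleted.
    handler = RotatingFileHandler(path, maxBytes=max_bytes, backupCount=backups)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger
```

With a cap like this, a retry loop that logs on every failure overwrites its oldest entries instead of filling the disk.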
#2 - Hardest to find
This was a true "Nessie" bug. We had a single codebase powering a few hundred sites. Every now and then the web server would spontaneously die and vomit a bunch of sql statements and sensitive data back to the user, causing huge concern, but I could never remotely replicate the behaviour - until 4 years later it happened to one of our support staff and I could pull out their network & session info.
Turns out years back when the server was first setup each domain was added as an individual "Site" on IIS but shared the same root directory and hence the same session path. It would have remained unnoticed if we had not grown but as our traffic increased ever so often 2 users of different sites would end up sharing a session id causing the server to promptly implode on itself.
#3 - Most elegant fix
Same bastard IIS server as #2. Codebase was the most insecure, unstable travesty I've ever worked with - sql injection vulns in EVERY URL, sql statements stored in COOKIES... this thing was irreparably fucked up but had to stay online until it could be replaced. Basically every other day it got hit by bots and ended up sending bluepill spam or mining shitcoin, and I would simply delete the instance and recreate it in a semi un-compromised state, which was an acceptable solution for the business for uptime... until we were DDOS'ed for 5 days straight.
My hands were tied and there was no way to mitigate it except for stopping individual sites as they came under attack and starting them after it subsided... (for some reason they seemed to be targeting by domain instead of ip). After 3 days of doing this manually I was given the go ahead to use any resources necessary to make it stop and especially since it was IIS6 I had no fucking clue where to start.
So I stuck to what I knew and deployed a $5 vm running an Nginx reverse proxy with heavy caching and rate limiting linked to a custom fail2ban plugin in front of the insecure server. The attacks died instantly, the server sped up 10x and was never compromised by bots again (presumably since they got back a linux user agent). To this day I marvel at this miracle $5 fix.
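To make the rate limiting part concrete, here is a minimal per-client token bucket in Python. It is only an illustration of the general idea. It is not Nginx configuration, not the poster's fail2ban plugin, and the class name and parameters are assumptions.

```python
import time
from typing import Callable, Dict, Tuple


class TokenBucketLimiter:
    """Allow each client `burst` requests at once, refilled at `rate_per_sec`."""

    def __init__(
        self,
        rate_per_sec: float,
        burst: int,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._rate = rate_per_sec
        self._burst = float(burst)
        self._clock = clock
        self._buckets: Dict[str, Tuple[float, float]] = {}  # client -> (tokens, last_seen)

    def allow(self, client: str) -> bool:
        now = self._clock()
        tokens, last_seen = self._buckets.get(client, (self._burst, now))
        # Refill in proportion to elapsed time, never above the burst size.
        tokens = min(self._burst, tokens + (now - last_seen) * self._rate)
        if tokens >= 1.0:
            self._buckets[client] = (tokens - 1.0, now)
            return True
        self._buckets[client] = (tokens, now)
        return False
```

A proxy would call `allow(client_ip)` per request and reject (or hand to a ban list) anything that returns `False`. Clients that keep tripping the limit are the ones a fail2ban-style rule would block outright.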
Being responsible for a massive breach of personal & financial data.
Seriously, that crap scares me way more than any amount of downtime does.
So yesterday I told my private laptop to update and shut down...
Fast forward to this morning. Hell breaks loose. Have to fix it asap! We have downtime. But fucking windows update!!!
You fucking piece of shit should have done this yesterday. And why does it have to take so long.
PSA: "sudo apt-get remove nginx" doesn't actually remove nginx. It will still continue to run and block port 80 on every reboot.
Until you run sudo apt-get autoremove, nginx-core and others still remain.
And that's how twenty seconds of scheduled downtime turns into 10 minutes.
Alright, so the "big e-commerce" site I have ranted about a few times decided to move their site to google, because the developers blame our server for being the issue.
Well, I wish I had a couple of beers to drink while I am enjoying the downtime, server crashes, and timeouts on the site now. I hope the devs eat their own shit, because they are.
A lot of engineering fads go in circles.
Architecture in the 80s: Mainframe and clients.
Architecture in the 90s: Software systems connected by an ESB.
Architecture in the 2000s: Big central service and everyone connects to it for everything
Architecture in the 2010s: Decentralized microservices that communicate with queues.
Current: RabbitMQ and Kafka.
... Can't we just go back to the 90s?
I hate fads.
I hate when I have to get some data, and it's scattered on 20 different servers, and to load a fucking account page, a convoluted network of 40 apps have to be activated, some in PHP, others in JS, others on Java, that are developed by different teams, connected to different tiny ass DBs, all on huge clusters of tiny ass virtual machines that get 30% load at peak hours, 90% of which comes from serializing and parsing messages. 40 people maintaining this nightmare, that could've been just 7 people making a small monolithic system that easily handles this workload on a 4-core server with 32GB of RAM.
Triple it, put it behind a load balancer, proper DB replication (use fucking CockroachDB if you really want survivability), and you've got zero downtime at a fraction of the cost.
Just because something's cool now, doesn't mean that everybody has to blindly follow it for fucks sake!
Same rant goes for functional vs OOP and all that crap. Going blindly with any of these is just a stupid fad, and the main reason why companies need refactoring of legacy code.
First company I worked for, built around 40 websites with Drupal 7...in only a year (don't know if it's a lot for today's standards, but I was one guy doing everything). Of course I didn't have the time to keep updating everything and I continually insisted to the boss that we need more people if we are going to expand. Of course he kept telling me to keep working harder and that I "got this". Well, after a year a couple of websites got defaced, you know the usual stuff if you've been around for some time. Felt pretty bad at the time, it was a similar feeling to having your car stolen or something.
Anyways, fast forward about 2 years, started working at another company, and well...this one was on another level. They had a total of around 40 websites, with about 10 of them being Joomla 1.5 installations (Dear Lord have mercy on my soul (the security vulnerabilities from these websites alone were greater than Spiderman's responsibilities)) and the others were WordPress websites, all that ON A SINGLE VPS, I mean, come on... Websites being defaced on the daily, pharma-hacks everywhere, server exploding from malware queuing about 90k spam emails in the outbox, server downtime for maintenance happening almost weekly, hosting company mailing me on the daily about the next malware detection adventure etc. Other than that, the guy that I was replacing was not giving a single fuck. He was like, "dude it's all good here, everything works just fine and all you have to do is keep the clients happy and shit". Sometimes, I hate myself for being too caring and responsible back then.
I'm still having nightmares of that place. Both that office and that VPS.
Tired, sick, brain foggy, cold.
I’m trying to finish my last few specs and it totally isn’t going well.
My PM also promised me he would get the change requests for this ticket to me by today so I could work on them — as we’re moving this Friday. He did not. He made the same promise last week. Bloody useless.
I let him know that I wouldn’t be able to finish the feature in time if he didn’t get back to me, so... week off? :D
As if packing and moving and driving is downtime.
I do need to figure out this last spec, though. I rewrote the entire feature, and broke functionality specific to some client, and apparently it’s tricky and extremely fragile. I have no idea how it was working before, and the only person I have to ask is... grumpy and overly busy, and hasn’t looked at any of it in years. Yay!
I might just go to bed.
Two days ago...
I was happy, building out the network in a new location.
Suddenly my phone just doesn't want to stop ringing, with all the other locations calling in that they can't connect to HQ.
Then HQ calls, we don't have internet, nothing works. The one guy on location who has access to the server room enters and finds all the servers offline and a couple of breakers blown.
Turn on breakers, servers won't boot properly.
Me in a taxi and hurry to HQ, to help boot the servers.
Afterwards I find out that one of the bosses spilled a cup of coffee on his desk, shorting the circuit.
Apparently he is on the same breaker group as the servers!?! What the actual fuck!
At least now the other bosses are like; yeah, we need to do something about that
My first job was actually nontechnical - I was 18 years old and sold premium office furniture for a small store in Munich.
I did code in my free time though (PHP/JS mostly, had a little browsergame back then - those were the days), so when my boss approached me and asked me whether I'd like to take over a coding project, I agreed to the idea.
Little did I know at the time: I was supposed to work with a web agency the boss had contracted to build their online shop. Only that he had no plan or anything, he basically told them "build me an online shop like abc(a major competitor of ours at the time)"
He employed another sales lady who was supposed to manage the shop (that didn't exist yet). In the end, I think 80% of her job was to keep me from killing my boss.
As you can imagine, with this huuuuge amount of planning and these exact visions of what was supposed to be, things went south fast and far. So far that I could visit my fellow flightless birds down in the Penguin's republic of Antarctica and still need to go further.
Well... When my boss started suing the web agency, I was... ahem, asked to take over. Dumb as I was, I did - I was a PHP kid and thought that Magento, being written in PHP, would be easy to master. If you know Magento, you know that was maybe the wrongest thing I ever said.
Fast forward 3 very exhausting months, the thing was online. Not all of it worked yet, but it was online and fairly secure.
I did next to everything myself, administrating the CentOS box the shop was running on, its (own) e-mail server, the web server, all the coding required for the shop (can you spell 12 hour day for 8 hour pay?)
3 further months later, my life basically was a wreck, I dragged myself to work, the only thing I looked forward being the motorcycle ride home. The system worked though.
Mind you, I was still, at the time, working with three major customers, doing deskside support and some admin (Win Server 2008R2 at the time) - because, to quote my boss, "We could not afford a full time developer and we don't need one".
I think I stopped coding in my free time, the one hobby I used to love more than anything in the world, somewhere Decemberish 2012. I dropped out of the open source projects I was in, quit working on my browser game and let everything slide.
I didn't even care to renew the domains and servers for it, I just let it die without notice.
The little free time I had, I spent playing video games and getting drunk/high.
December 2013, 1.5 years on the job, I reached my breaking point and just left, called in sick at least a week per month because I just could not see this fucking place anymore.
I looked for another job outside of ALL of what I did before. No more Magento, no more sales, no more PHP. I didn't have to look for long, despite what I thought of my skills.
In February 2014, I told my boss that I quit. It was still seven months until my new job started, but I wanted him to know early so we could migrate and find a replacement.
The search for said replacement started in June 2014. I had considerably less work in the months before, looks like he got the hint.
In August 2014, my replacement arrived and I got him started.
I found a job, which I am still in, and still happy about after almost half a decade, at a local, medium sized ISP as a software dev and IT security guy. Got a proper training with a certificate and everything now.
My replacement lasted two months, he was external and never really did his job - the site, which until I had quit, had a total of 3 days downtime for 3 YEARS (they were the hoster's fault, not mine), was down for an entire month and he could not even tell why.
HIS followup was kicked after taking two weeks to familiarize himself with the project. Well, I think that two weeks is not even barely enough to familiarize yourself with nearly three years of work, but my boss gave him two days.
In 2016, the shop was replaced with another one. Different shop system, different OS, different CI. I don't know why and I can't say I give a damn.
Almost all the people that worked at the company back with me have left for greener pastures, taking their customers (and revenue) with them.
As for my boss' comments, instructions and lines: THAT might not be safe for work. Or kids. Or humans in general. And there wouldn't be much left if you put it through a language filter...
Moral of the story: No, it's not a bad thing to leave a place if you're mistreated there. Don't mistake loyalty with stupidity!
And, to quote one of my favourite bands: "Nothing matters when the pain is all but gone" (Tragedy + Time by Rise Against).
Yes, it is dumb that airports, stores, and hospitals run very outdated software, but imagine how hard it would be to upgrade all those machines, especially considering the programs that might not work well with newer operating systems and the fact that staff would have to be trained all over again. Not to mention, most of these businesses and services can't afford any downtime and need to make sure that everything is compatible (so, update one PC, you have to update all of them). In theory, I am still a fan of updated systems, but then again, I have a 10 year old XP installation at home, which I've been preparing to reinstall for a year or so (don't really use that PC, but still).
I was part of an on-call rotation. We had ~800 microsites with decent traffic on this one box, because that's a good idea...
One day the box was experiencing kernel panics and causing core dumps. After exhausting every possibility I decided it was time to restart the box:
sudo shutdown now
Missed the -r and the box was not accessible remotely. Had to wait for someone at the data center to terminal in.
Downtime was ~2 hours.
This was caused by a crontab that automatically ran apt-get update & apt-get upgrade... Also made by me... None of this should have worked or been allowed!
Yesterday the boss told me he wanted to extend my contract and that everybody was happy to have me around. Processes are rolling fast, I almost had no downtime, tickets are always in the right column, couldn't be happier.
Next project will be a well known UK old pirate radio website
Microsoft Azure down on a Friday at home time? Whoever tripped over that cable is probably trying awfully hard to slink out of the building unnoticed right about now
Subject of message: "Important: New feature for all 000webhost users." Thanks, 000webhost, an hour of downtime a day was exactly the feature I wanted to be implemented. P.S. if you have OCD, don't look at how many unread messages I have
I was an oilfield machinist for about 10 years. During downtime I'd read blogs and books on my phone. Eventually I wrote an app to manage parts drawings and CNC programs for my shop. Any time I came across a package or pattern I didn't understand I'd pursue it relentlessly. CodeWars and reading other people's code got me a long way. Now I've got a job in silicon valley and things are pretty sweet.
I fucking hate stupid accountants!
Yesterday we went to a customer to talk to the accountants because we want to remove one of their unused PC's in the office.
First, just the way they think (and talk as if) they are the most important and it's absolutely critical everything works 100%. I see they are important but not 100 times more important than everybody else!
They called us their EDP-guys (EDV in German, that's the translation I found). That insulted me a bit. I'd rather be called the IT guy, I don't know anything about the fucking EDP systems nor do I want to. I'm there to make sure the hardware works. But whatever, fine, call me what you want.
Then they straight up threatened us, because their work is so important, they can't afford to have downtime in their systems. They don't really care, but the bosses of us both do and if we fuck up they (the bosses) will hold us responsible. There is a fucking update for your piece of shit software (datev)! I don't do the update, I'm just responsible that the update can be deployed on the hardware. I'm not responsible if this update fucks your system and frankly I don't care!
I could tell them all of this but they won't listen. They always talk in this patronising arrogant voice, because they are so important and we'd better not fuck up the update.
I'm there to help. I don't want downtime for your systems. I want you to work with our systems the best you can.
But fuck you, I hope the server burns down!
I'm in a slack channel with our fellow devs as a side chat for downtime.
We get to talking about coding, and then it led to the tools of coding then it led to OS debate.
I said I use Windows because it's what I work the fastest with. Then out of nowhere, they start flaming me, calling me random boy and there's really nothing I can do about it, because the "elitists" keep piling on the list of why Windows is bad.
Why is it that when I go into a coding server and I link a Windows solution to said problem, I get flamed for it?
It's honestly like I can't use software without someone trying to dox me (even if that is an overstatement)
Well today we got to test our system to the extreme and I'm pleased to say it passed. Major power surge followed by a black out. UPS for all networking and servers kicked in without missing a beat and the standby generator outside about 45 seconds later. After explaining to users how to turn on their computer (😑), we were able to get everyone working again in about 5 minutes. Lasted three hours without power from the grid without any client downtime.
It seems Microsoft have taken my critique of Azure last week to heart. Last week, I insinuated that perhaps hometime on Friday was an inconvenient time of day for major downtime.
This week they've rectified their previous fault, and in their great mercy have decided to take down service at the busiest time of the day on Thursday instead.
>Client complains about a 30 minute downtime around midnight
>Client also pays only for a single VM on a HV that they don't even own themselves
>Replies with an offer of how to make the setup more resilient, going from 1 VM to 2 LBs/FE load-balanced through BGP, and distributing traffic through HAProxy onto 2 BE machines that in turn talk to a Postgres cluster with repmgr for dynamic failover.
>No reply so far
GOOGLES SERVERS WERE DOWN FOR 30 MINUTES!!!! WAOW 1TRILLION DOLLAR COMPANY CAN'T EVEN KEEP THEIR SERVERS UP SMH HOW CAN THEY BE SUCH IDIOTS GOOGLE SUCKS BETTER IF IT NEVER EXISTED IMAGINE 30 MINUTE DOWNTIME GUYS HOW COULD THAT HAPPEN WAIT GUYS DON'T GO WHERE ARE YOU GOING
DON'T do production stuff on Friday afternoon. This Friday evening we had an issue on production and just wanted to do a quick fix. The fix resulted in a DDoS attack that we accidentally started on our own servers in an IoT project. We contacted all customers' devices and asked them for a response at the same time. Funny thing is that the devices are programmed to retry if a request fails until it is successful. We ended up with 4 hours of downtime on production, servers were running again at 11pm.
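The retry storm in the rant above is the kind of thing capped exponential backoff with random jitter is usually meant to soften. A minimal sketch, assuming each device has some `send` callable that returns True on success; the names `backoff_delay` and `send_with_retry` and the default values are made up for illustration and are not from the original project:

```python
import random
import time


def backoff_delay(attempt, base=1.0, cap=300.0):
    """Return a random wait in seconds for the given retry attempt (0-based)."""
    # The ceiling doubles with every attempt but never exceeds `cap`.
    ceiling = min(cap, base * (2 ** attempt))
    # "Full jitter": pick uniformly in [0, ceiling] so devices spread out
    # instead of all retrying at the same instant.
    return random.uniform(0.0, ceiling)


def send_with_retry(send, max_attempts=8, sleep=time.sleep):
    """Call `send` until it returns True or the attempts run out."""
    for attempt in range(max_attempts):
        if send():
            return True
        # No point in waiting after the final failed attempt.
        if attempt < max_attempts - 1:
            sleep(backoff_delay(attempt))
    return False
```

Whether giving up after `max_attempts` is acceptable depends on the device; the point of the sketch is only that retries are spaced out and randomized rather than fired immediately.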
Not only are you not your job, your job is not worth taking home with you; unless it's actually your company, leave it in the office. You can love your job and still have days when you hate it, or days when you'd rather be doing anything else; that doesn't mean you don't still like what you do.
As a profession we can all be obsessive and not take the time out that we need, so make special effort to do so, even if that just means you're working on a personal project instead. Your brain, and partner, will be glad that you did. Whether you like to admit it or not, everyone needs downtime.
What a bunch of cunts.
It's sad how they keep applying restrictions to everything. Two years ago, there were no restrictions. Now:
Max one website, random account locks if you ever get actual visitors, no support unless you're premium, max 5 simultaneous clients, one hour sleeptime a day, some "random" disk full errors or internal server errors and at least two hours downtime a day.
Just now I was reading on https://pve.proxmox.com/wiki/... about high availability. Now my Proxmox VE is just a tower (which happens to have ECC memory) that's stored in my storage room (and which is mostly used for experimental and home server purposes). But my mail servers.. those have been made with high availability in mind. Most importantly, I've made their services entirely redundant (but within the same datacenter). And when they have updates, I apply updates to one, reboot, see if it didn't break something and then do the same to the other server after the first one came up again. So no downtime whatsoever.
If memory serves me right, I think that I've been able to maintain these servers for the last year without any downtime at all (I reboot them every month to apply new kernels but they haven't both been simultaneously down at any moment). Does that make them High Availability? My interventions regarding their availability have been rather trivial. Is it really that hard..?
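The update routine described above (update one server, check it, only then touch the next) can be sketched roughly like this. The `update` and `is_healthy` callables are placeholders for whatever actually applies the packages, reboots, and checks the mail service; they are assumptions for illustration, not anything from Proxmox or the poster's setup:

```python
def rolling_update(servers, update, is_healthy):
    """Update servers one at a time and stop at the first failed check.

    Returns the servers that were updated and passed their health check.
    """
    done = []
    for server in servers:
        update(server)  # e.g. apply updates and reboot this one machine
        if not is_healthy(server):
            # Leave the remaining servers untouched so the service stays up.
            break
        done.append(server)
    return done
```

Because only one machine is ever out of rotation, the service as a whole stays reachable as long as the other server can carry the load on its own.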
My superpower would be the ability to split myself into multiple copies of myself so that I could function as an entire dev company on my own whilst learning the skills to do so without downtime because a copy of you could sleep while another works.
Of course the copies would share their knowledge and can merge back when needed
A guy who had the same nationality as the enterprise we were working for was promoted from JUNIOR js developer to UX/UI coordinator for the entire department just because he was 2 years older than me (26 vs 28). Literally he was a junior dev and went to that.
One day he was accusing me of writing a piece of code which led prod to downtime. I was in the office, he was in another country with our manager and technical director next to him and we were talking over internal conference system. I showed git history + his name + his code and he was saying ‘that’s not true!!!’.
I couldn’t resist and I began to yell something like ‘You fucking fuck piece of shit cocksucker...’ for 5 minutes. Since that day I was the god on my project for UI/UX side.
Even now he is in the same place on the same position...
PS: more stories to come with this guy
Before, when you bought $3k Cisco router you got the box that will run indefinitely as long as it has power in about any environment you can go to install it; with or without fans, it had more important business than to give a shit about such things.
Now, when you buy $500k Cisco box you get an over-engineered chassis with 5 separate fan modules with their own firmwares, self tests, watchdog timers and shitload of bugs. It's a fucking fan, it should spin, not do quantum chromodynamic simulations.
Next rant could probably be how Cisco's switch from monolithic to modular Linux architecture (in order to reduce downtime) turned into having time bombs just waiting to do some crazy shit.
The worst part of moving to a new apartment, besides the heavy lifting, is the Internet downtime, before it's reestablished 😒
Being a sysadmin can be the most frustrating thing ever, but it's worth it for those moments when you feel like an absolute ninja.
Switched from single threaded gevent server to an nginx configuration, added ssl, and setup a reverse proxy to flask socketio, all with less than 10 minutes aggregate downtime. On the prod server. \o/
What is it with devs who try to bloody "cost optimise" everything to within an inch of its life when there's no reason to do so?! This ain't your personal pocket money project here. This is a real commercial app with real consequences.
Seriously, saving £100 a month might seem like a lot to you, but this is a multi million pound project we're talking about. That's bloody nothing, and no-one will care. If a Fargate spot instance restarts at the wrong time and causes downtime though, or if we need logs going back a week, and don't have them because the log retention period is a few days, then everyone will be royally pissed. All because you thought "it should be ok", or it "seemed like the right thing to do". Sod off.
I kinda feel the pain of the developers over at Bungie. They just had over 15h downtime of Destiny 1 and 2 and had to stay up all night to fix it.
You did a good job - have a nice weekend :)
Not sure if that qualifies as prank...
Had a pretty incompetent CS teacher and used to simply unplug her PC when we had enough of her shit. Usually took her about 45mins to figure out what was wrong with her PC and another 5 of ranting why we'd do that. Eventually she started to check the cable first which reduced the "downtime" to about 15mins.
However, we soon started to flip the power switch at the back of her machine instead. She never figured that out and called IT several times to fix it.
Thinking about it, it's probably worse than a prank 😅
The company I work for used to be hosted on 3dcart. One day the site went down and their support couldn't tell us why. After over 24 hours of downtime they restored service but lost 5 days of all records and customizations across the entire store, from the DB to the damn templates. Their support apologized for the outage blaming the disaster on a combination of hard disk failure and a bad update to their backup script. They were not willing to assist us in any way. We were forced to manually enter 5 days of orders (which gave them new order numbers and caused more problems), products and template changes, with order data coming from an internal email which was luckily CC'd on the order confirmation email. Thank God for whoever setup that CC, it saved our asses. In the end it cost our company thousands of dollars and 3dcart never compensated us in any way.
Docker swarm. All I want is a 'zero-downtime' system and every time I try to set it up there's three damn things missing. Load balancer, service updater, and a good distributed storage. I finally got pissed off and am working on those but fuck it's been how fucking long docker has been out why the hell somebody else hasn't done this yet.
Go to meetups and talk to people. Give presentations at meetups if you can. Get involved in community projects. Love coding. Use your downtime to study new stuff.
When talking to potential employers be positive and enthusiastic about your technology.
EDIT: Oh, a few more. Don't seem desperate for a job. Without saying anything, potential employers should feel like you have other offers and they're being evaluated by you. Ask questions about their company if you get an interview.
Try to give off an air of being in control and having a number of choices in your career (even if you're living off ramen every day).
The pressure should be on companies to hurry up and snap you up before another company does.
Be honest but a little spin won't hurt.
So glad to be starting a new job next week. Today a junior colleague asked me what the best way to test something would be as it won't work locally. Knowing this has a good chance of taking down the server, I suggest he spins up a server on his AWS account. My manager comes in: oh no, I don't want him doing it on AWS, use the production server instead. By the time stuff starts hitting the fan I will be gone.
Client: my website is down
Support: can you just google "my ip" and let me know your IP?
Client: OMG Google is down!!! Oops, router wasn't plugged in.
**Client is on a call, just in case you wonder :p
How much does your internet cost?
Bandwidth: Unlimited with no FUP
Downtime: Almost never
Note: This is for home internet plan and NOT mobile plan
I was just a college fresher who had completed his engineering degree. It was my first week in the office. A system was provided to me, and since it was a support project I was given direct access to the production database.
Fresher + Production Database + Access of Admin credentials = Worst Possible Combination
So it was my night shift, I was told to update new tariff plan for our client (which was one of the largest telecom service in India) .
If someone recharges for more than 200 Rupee, that person will get 10% or 20% extra talk time. Which was only applicable for particular circle (Like Bihar and Rajasthan).
Since I was fresher, I was told to update given query from my senior employee which he shared on the shared folder. Production downtime was in the mid night, so at that time I updated that query on the production database.
Query successfully updated. I completed my night shift, went home and slept.
When I woke up, I saw my mobile had 200+ missed calls from different locations in India. It was the circle heads of that telecom service provider who had tried to contact me. I realized something unexpected was waiting for me.
Then at that moment my team lead called me and he asked me to come office right away.
Reminding you I was a fresher, I was shivering. What have I done there?
When I reached the office, I came to know that the query I updated on production had bombed.
Every person who recharged that day (duration from midnight to morning 10 AM) got 10 times or 20 times more talktime.
A part of Query was something like this where error was made:
TalkTime = RechargeAmount + RechargeAmount * 10/100; (Bihar)
TalkTime = RechargeAmount + RechargeAmount * 20/100; (Rajasthan)
But instead of this query, I updated below one:
TalkTime = RechargeAmount + RechargeAmount * 10;
TalkTime = RechargeAmount + RechargeAmount * 20;
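To make the difference between the two versions concrete, here is a small worked example of the arithmetic. The function names are invented for illustration, and the 200 Rupee recharge is just the threshold mentioned in the story, not real billing data:

```python
def talk_time(recharge_amount, bonus_percent):
    """Intended rule: the recharge plus bonus_percent percent of it."""
    return recharge_amount + recharge_amount * bonus_percent / 100


def talk_time_buggy(recharge_amount, bonus_percent):
    """The query as executed: the division by 100 is missing."""
    return recharge_amount + recharge_amount * bonus_percent


# A 200 Rupee recharge in a 10% circle:
print(talk_time(200, 10))        # 220.0
print(talk_time_buggy(200, 10))  # 2200
```

So with the `/100` missing, the bonus is 10 or 20 times the recharge amount instead of 10% or 20% of it, which matches the "10 times or 20 times more talktime" described in the rant.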
In a span of 10 hours, that telecom service lost revenue of 6.5 crore Rupees. Thanks to the recovery team they were able to recover 6 crore, but 50 lakh Rupees were still lost.
One small query, and approx 1 million dollars was at stake.
Aftermath of this incident
I should have taken those queries on mail. Or, there should have been mail communication regarding this.
Never ever do anything over oral communication. Senior employee who did this denied and said he provided correct query, and I had no proof of communication.
I told them it was me who executed that query on production. Since I was a fresher and took responsibility for the incident, my team lead rescued me from that situation.
Always test your query and code multiple times before you execute or Go live it on production.
Always have email communication for every action you take on production.
Power comes with responsibility. If you have admin credentials of production never use it for update/delete/drop until you are sure.
Don’t take your job lightly.
I was not fired from that Job, but I have learnt my lesson very well.
Had a shoulder operation, and currently unable to move my arm. Getting pretty frustrated with being unable to move and feeling useless.
A mate just explained that I was basically patching my body. After a little downtime I'll be stronger and better.
Somehow, with this logic/analogy, it seems sensible and acceptable!
On the week I went on holiday there was a fire alarm in our data center and the Halon fire suppression system was deployed.
I always miss the excitement.
Apparently our servers rolled over perfectly and we had less than 5 min downtime though, even with 4 killed drives in our SAN. So that's a win.
Kudos to our system administration team, especially the poor bastard who was sat in the data center with his toddler at fuck-my-life-o'clock in the morning.
Well, throughout my life I've never really thought about programming. Then one day during some downtime on a backpacking trip with a friend, while I had nothing to do my friend sat there with his computer with the screen all dark, filled with funny colourful text in lines of different length, with some lines even starting more towards the middle of the page than to the left, almost following a vertical wave pattern. He said he was writing a program to control his home remotely as well as working as a security feature that could unlock his home automatically when he got home. I was amazed by the colorful text as well as the fact that he could just create this crazy program out of nothing.
Half a year later I attended my first lecture at the computer science programme. My first program was a command line tool used for baking bread. It asked you how much flour you'd use and how many eggs, then it'd tell you whether or not you'd got the correct ratio. I was blown away by the intuitive nature of programming. I could imagine the control flow as a tree or flow chart in my head. I mean the whole program was only a couple of user inputs followed by an if-statement and a print-statement, but for me it was awe inspiring. I knew then that I'd probably chosen the right path in education.
First real dev project was a calculator for a browser game, that calculates the optimal number/combination of buildings to build. I got bored constantly doing it manually, so I made this program as a fun and useful challenge. It involved basic math, and I did it in VB.
Second one was a stats tracking page for my team in another browser game, that let us easily share and keep track of stuff. It allowed us to minmax our actions and reduced the downtime between actions of different players. HTML, CSS, JS, PHP, MySQL.
Third one was a userscript for the same game that added QoL features and made the game easier to play. JS
Fourth was for the first game, also a QoL feature userscript, that added colors/names, number limit validation to inputs, and optimization calculators built in the interface. It also fixed and improved various UI things. Also had a cheating feature where I could see the line of sight of enemies in the fog of war (lol the dev kept the data on the page even if you couldnt see the enemies on the map), but I didnt use it, it was just fun to code it. JS
From there on, I just continued learning and doing more and more complex shit, and learning new languages.
Okay, I have a lot to rant about discord, but today, exceptionally, to the point.
I have my dedicated server. Its uptime over the last 3 years is better than 99.99% (it was down 15ish minutes for maintenance and a RAM upgrade, and about 10 minutes because the hoster's generators failed to kick in when there was an outage)
This year it was up 24/7/365.
Why am I saying it?
Well, my TS3 server is up 100% of the time this year. Yet still everyone moves to discord and suffers brutal audio quality and audio lags, and outages like right now. It's not the first time this year, and discord was acting up recently too. Today they scored bigger downtime than my dedi server (that's not redundant, not distributed, nor has any fancy "uptime helpers") in the last 3 years.
Why the fuck do people prefer discord to ts3, other than that it allows uploading images more conveniently? Okay, it looks nicer, and is like 10 times heavier on the machine, but other than that? It's beyond me.
E: fix typo
E2: fix typo
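For what it's worth, the "better than 99.99%" figure in the rant above can be checked with simple arithmetic. A rough sketch, assuming 3 years of 365 days and the roughly 25 minutes of downtime mentioned (15ish plus about 10); the function names are made up for illustration:

```python
def availability_percent(downtime_minutes, period_days):
    """Percentage of the period during which the server was up."""
    total_minutes = period_days * 24 * 60
    return 100.0 * (1 - downtime_minutes / total_minutes)


def downtime_budget_minutes(target_percent, period_days):
    """Minutes of downtime allowed before dropping below the target."""
    return period_days * 24 * 60 * (1 - target_percent / 100.0)


three_years = 3 * 365
print(round(availability_percent(25, three_years), 4))        # 99.9984
print(round(downtime_budget_minutes(99.99, three_years), 2))  # 157.68
```

Under those assumptions 25 minutes over three years is comfortably inside the roughly 158 minutes that 99.99% would allow.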
Not exactly vacation, but there was this nice-to-have feature in our application that I coded up in the hospital after my wife gave birth to our son. I wrote it during the downtime while they were both sleeping.
Concerning my last post on the two Commodores, (https://devrant.com/rants/963917/...) here's the great story behind the boxed one.
So at the place where I interned over the summer, I helped the tech dept. (IT herein) move to a new bldg. We had to dismantle most of the network infrastructure stuff, so we were in the server room a lot. First day on the job, Boss shows me server room, I'm amazed and all because this is my first real server room lol.
We walk around, and there's a Commodore 64 box on a table, just kinda there. I ask, "Uh, is that actually a C64?" B: "Yeah, that's E's." Me: "E?" (name obfuscated) B: "Yeah, E's a little crazy." Me: "Is it actually in there?" B: "Absolutely, check it out!" *opens box and sees my jaw drop* Me: "Well, alrighty then!" So that lingers in my mind for a while until I meet E. He is a fuckin hilarious guy, personifying the C64, making obscure and professionally inappropriate references. Everyone loves him, until he pranks them. He always did.
We’re in the server room, wiping some Cisco switches or something, and we have some downtime, so I ask him about the 64, and he's like "Yeah, I haven't had time to diagnose her issues much. If you want her, go ahead, see if you can make it work!" Me: "You're kidding, right?" E: "Nah, not at all!"
That day I walked out with a server motherboard, 2 Xeon CPUs and some RAM for the server (all from an e-waste bin, approved for me to take home from boss) and a boxed C64. Did a multimeter test on the PSU pins, one of the 9vAC pins is effectively dead (1.25v fluctuating? No thanks.) but everything else is fine except for a loose heatsink and a blown fuse in each C64. Buying the parts tonight. I wanna see this thing work!
"How to make $17k in 10 hours for a 5 minutes job"
"Live physical server migration to another building"
A nice rant :)
Some folks in my prev workplace tried to move a live SUN machine to a different hall and yet ended up with messed up HDDs (which ofc can only be replaced and rebuilt by SUN, since it's UNIX). Including the system RAID :)
Hats off to Matt!
TLDR: I need advice on reasonable salary expectations for sysadmin work in the rural United States.
I need some community advice. I’m the sysadmin at a small (35 employee) credit card processing company. I began as an intern and have now become their full time sysadmin/networking specialist. Since I was hired in January I have:
-migrated their 2007 Exchange server to Office 365
-Upgraded their ailing Windows server 2003 based architecture to 2012R2
-Licensed their unlicensed VMware ESXi servers (which they had already paid for license keys for!!!) and then upgraded them to 6.5 while preventing downtime on hosted VMs using tricky transfers and deployments (without vMotion!)
-Deployed a vCenter server to manage said ESXi servers easier
-Fixed a three month gap in their backups by implementing Veeam, and verifying its functionality
-Migrated a ‘no downtime’ fileserver to a new hypervisor host, implemented a ‘hot standby’ server as a backup kept up to date by the minute with DFS replication.
-Replaced failing hard drives in a RAID array underlying their one ‘business critical’ fileserver, which had no backups for 3 months at that time
-Reorganized Active Directory and Group Policy deployment from a nightmare spiderweb of OUs and duplicate policies
-Documented the entire old network and now the new one as I’ve been upgrading this
-Audited the developers AWS instances and removed redundant machines, optimized load balancing on front end Nginx servers, joined developer run Fedora workstations to the AD domain and implemented centralized syslog monitoring on them.
-Performed network scans and rewrote firewall exceptions to tighten security
There’s more, but you get the idea. I’ve now been tasked with taking point on an upcoming PCI audit which will be my first.
I’m being paid $16/hr US, with marginal health benefits. This is roughly $32,000 a year, before taxes.
I have two years previous work experience managing a third party Apple repair facility (SimplyMac) and every Apple certification for warranty repair and software troubleshooting. I have a two year degree in general sciences, with about 4 years of college credit (Two years of a physics education and two years of computer science after I switched focus) I’m actively pursuing a CCNA and MCSA server 2016 with exams paid for and scheduled.
I’m going into a salary negotiation in two months. What is a reasonable salary to request, from your perspective, for someone in my position?
Thanks in advance!
"It turns out we probably caused the downtime ourselves. I didn't know dropping 170 databases and deleting 80 entire projects at once could do that"
Gave me a hearty chuckle. Especially as the client adamantly refused to have SSDs installed for the DB to run on top.
Just imagining the poor read-write heads jerking back and forth in vain attempts to find all the data to delete... So yes, dropping 170 databases at once does in fact take a database server down to its knees, as deleting is all the drives will be doing for a while.
At least it wasn't my or my colleague's mistake this time.
Node: The most passive aggressive language I've had the displeasure of programming in.
Reference an undefined variable in a module? Prepare to waste your time hunting for it, because the runtime won't tell you about it until you reference a property or method on the quietly undefined module object.
Think you know how promises work? As a hiring manager, I've found that less than 5% of otherwise well-experienced devs are out of the Dunning Kruger danger zone.
Async causes edge cases and extra dev effort that add to the effort required to make a quality product.
Got a bug in one of your modules? Prepare yourself for some downtime because a single misplaced parenthesis can take out the entire Node process, killing unrelated pages and even static file hosting.
All this makes for a programming experience that demands much higher cognitive load, creates more categories of bugs, and leads to code bloat/smell much more quickly than other commonly substituted languages.
From a business perspective, the money you save on scaling (assuming your app is more compute efficient under Node) is wasted on salaries and opportunity costs stemming from longer dev time, more QA, and more frequent outages.
IMO, Node is an awesome experiment, a fun language, a great tool for specific use cases, and a terrible fucking choice for an entire website.
Have some downtime today, so since I lucked out and found some old backups (from before I used Git) of a project I was planning on revisiting, I decided to fire it up and see what I can do to get that going again.
And discovering just how much my coding style has changed since then...
[Code is in PHP, for reference]
* Virtually no documentation (whereas my current style is near-obsessive with PHPdoc blocks)
* Where comments exist, they only use // and are a full tab after the end of the line
* All assignment operators are dutifully aligned on tabs
* Have to update the entire codebase because it relies on deprecated `mysql_*` calls
* Have to flip all of the quotes throughout the codebase because I used double-quotes as my primary at the time instead of single quotes.
* Also relied on magic quotes for injecting variable content into strings
* Associative array practices varied; sometimes the names are encased in double quotes, but I just hit a block where it's all leaving it to the compiler to interpret unquoted string literals
And perhaps the most egregious so far...
* Any time we get database results back the process for cycling through them is to do `$count = mysql_num_rows($result);` (or $count2, etc.), then do a `for ($i = 0; $i < $count; $i++)` (again, or $j, $k, etc.), instead of just a simple `while ($data = $result->fetch_assoc())`
I know it's a weekend, but there are a few lifeless developers who work even on Sundays. Stackoverflow.com is down :/
Am I the only one who gets intimidated when shit hits the roof?
Yesterday, during crucial business hours, a column type in one of the major OMS DBs overflowed. That caused around 30 mins of downtime, and afterwards the whole connection pool, under high concurrent requests, got flushed downstream, which caused a thundering herd.
One by one.. all services went down; Fucking java service couldn't even start because of load..
This is the moment I fell in love with GoLang. We shard request using GoLang service, it just started and picked up the load beautifully..
In the end, it was around 6 million in business losses, but a good lesson learned :)
So it's been 22 days and we're still going strong without a single day's downtime! https://devrant.com/rants/1147150/...
Friend of mine messaged me about sites being down, of course I'm at a family dinner with no laptop or ssh keys with me so no way to fix it!!!
Online java IDE suggestions?
My (non-dev) job is boring and I tend to have a lot of downtime, any suggestions?
What's up with bitbucket? For the last 2 months it's been down numerous times, obviously when you need it the most :(
For crying out loud, no, GoDaddy, you don't just shut down an expired domain without ANY warnings. No!!! Not cool!!!
I think I'll transfer my entire work environment to my VPS so if my laptop crashes or whatever I can just ssh in from my phone. Won't really be ergonomic, but better than downtime.
"data randomly disappeared and caused us downtime. I fixed the problem by replacing the missing data"
"I don't see a problem. the data is there"
of course it's fucking there. I just put it back, but that doesn't change the fact that downtime happened.2
My employer should burn his DevOps system to the ground: esoteric configuration split on 1000 files, bugs and downtime almost daily, not communicated breaking changes which breaks pipelines, shitty documentation, few opportunities for customization and for everything you have to open a fucking ticket, I love programming but since I have to spend more time on a fucking ticketing system rather than on Vim my motivation is gradually falling to pieces.5
My job is paying a consultant to do some Node.js training for a few days. In our downtime the guy was telling us about his daughter whom he’s been teaching computer type shit at home for years. He says she’s got every cert offered by CompTIA. She’s 16 years old. That’s demoralizing af. I’ve got zero certifications. I’ve gotta get on top of this shit...4
Today I am awesome because the major Ruby upgrade went out the door to production with zero downtime. What makes you badass today?
When the guy you are relying on to do an export for an app during a MISSION CRITICAL downtime exports the wrong data and drops offline... Then you find his number in an email... then you find out he is driving somewhere and will not be back at his computer for 30 minutes...
Thanks for staying up with me @joeygreen
Should Cloudflare have taken down their servers to protect their clients? Which is worse, the leak live or the downtime?
Any suggestions to work on coding (PHP/SQL atm) during downtime while at work? I've been learning CSS and JS (front/back) for a year while unemployed. Just got an IT call centre job in a highly monitored corporate environment. Have a potential side programming job but need more practice.
I hope my boss learned his lesson: dd if=/dev/zero of=[hdd storing DB about VM cluster]
- is a very very bad idea...
Next week I'm beginning a paid internship in a sysadmin/infrastructure manager/bit of devops position. My tutor already told me he would give me things to learn alone so we could work together on stuff, and I can't wait for it to begin.
However, in the meantime I don't have a lot of things to do, so I would like to put this downtime to use and start reading stuff.
I already know I'll be doing a lot of Linux (that, I already master pretty well), and also some Active Directory, Kubernetes, and a bit of DevOps. Those are the main keywords he threw at me during the interview.
What subject would you advise me to start learning in advance? Do you have nice resources/books/videos on those matters?
I would have asked my tutor but right now he's on holiday and I don't intend to piss him off with job-related questions.
On a side note: do you have any good and complete documentation or learning resource about SELinux? I've had issues with it on my main rig for some months and can't find any good answer, so I decided to learn it as best as I can and come up with an answer on my own. Since I intend to work in the field, I should know what there is to know about it anyway.
My friends don't understand what it's like to be a dev, so when I ask for times/arrangements they think I'm just being a prick about it. Sometimes I ask for specific times because I have to do pull requests and whatnot, and I want to arrange things to make the most of what little downtime I have, but because none of these guys are devs they don't understand. How do I help these guys understand that me asking for specific times isn't about me being a prick? It just has to do with work, because when I tell them that, they don't get it.
Vultr's Block Storage in New Jersey has been down all day. My Mastodon node is hosted there. I'm jonesing for my Fediverse fix!
Client needed their site transferred to our hosting environment with "NO downtime"... easy enough. Our provider doesn't support their configuration so they need their site rebuilt on our platform... again, no problem. Their current hosting plan expired YESTERDAY and they're running off the fumes of the grace period... now we've got a problem.
Met a client. He needed some abc services. The client asked me what the SLA uptime would be. I unintentionally told him it would be 99% instead of 99.99%, and he was like, that's good. I mean, seriously....?
Was he okay with the 3 days 15 hrs of downtime a year?
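For anyone who wants to check the arithmetic behind that figure, here is a minimal sketch. The function name `allowed_downtime_hours` and the 365-day period are assumptions made for illustration; real SLAs often measure uptime per month and exclude scheduled maintenance.

```python
def allowed_downtime_hours(uptime_percent, period_days=365):
    """Hours of downtime an uptime percentage permits over the period.

    Assumes a plain calendar period with no maintenance exclusions.
    """
    return (100.0 - uptime_percent) / 100.0 * period_days * 24


def describe(uptime_percent):
    """Format the yearly downtime budget as days, hours and minutes."""
    total_minutes = allowed_downtime_hours(uptime_percent) * 60
    days, rest = divmod(total_minutes, 24 * 60)
    hours, minutes = divmod(rest, 60)
    return "%s%% uptime -> %dd %dh %dm of downtime per year" % (
        uptime_percent, days, hours, minutes)


if __name__ == "__main__":
    for sla in (99.0, 99.9, 99.99):
        print(describe(sla))
```

At 99% the yearly budget is 87.6 hours, which is the 3 days 15 hrs in the rant; at 99.99% it shrinks to roughly 52.6 minutes.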
Databases and LDAP down for 1 1/2 days now...so embarrassing...am I really working in an IT company???
luckily there are options beside work...hello amazon, spotify, devrant...:D
if we get Server/DB issues it always takes at least half a day to fix them! *facepalm*
This is an actual transcript...
It's way too long for the normal 5000 characters, hence splitting it up...
Infra Guy: mr Dev, could you please give some rational for update of jjb?
Dev: sparse checkout support is missing
Infra Guy: is this support mandatory to achive whatever you trying to do?
Infra Guy: u trying to get set of specific folder for set of specific components?
Infra Guy: bash script with cp or mv will not work for you?
Infra Guy: ?
Dev: when you have already present functionality why reinvent the wheel
Dev: jenkins has support for it
Dev: the jjb is the bottle neck
Infra Guy: getting this functionality onto our infra would have some implications
Dev: why should I write bash script if jenkins allows me to do that
Dev: what implications ??
Infra Guy: will you commit to solve all the issues caused by new jjb?
Dev: you show me the implications first
Infra Guy: like a year ago i have tried to get new jjb <commit_url>
Infra Guy: no, the implications is a grey area
Infra Guy: i cant show all of them and they may hit like in week or eve month
Dev: then why was it not tackled
Dev: and why was it kept like that
Infra Guy: few jobs got broken on something
Dev: it will crop up some time later
Dev: if jobs get broken because of syntax
Dev: then jobs can be fixed
Dev: is it not ???
Infra Guy: ofc
Infra Guy: its just a question who will fix them
Dev: follow the syntax and follow the guidelines
Dev: put up a test server and try and lets see
Dev: you have a dev server
Dev: why not try on that one and see what all jobs fails
Dev: and why they fail
Dev: rather than saying it will fail and who will fix
Dev: let them fail and then lets find why
Dev: I manually define a job
Dev: I get it done
Infra Guy: i dont think we have test server which have the same workload and same attention as our prod
Dev: unless you test how would you know ??
Dev: and just saying that it broke one with a version hence I wont do it
Infra Guy: and im not sure if thats fair for us to deal with implication of upgrading of the major components just cause bash script is not good enough for u
Dev: its pretty bad
Infra Guy: i do agree
Infra TL Guy: Dev, what Infra Guy is saying is that its not possible to upgrade without downtime
Infra Guy: no
Dev: how long a downtime are we looking at ??
Infra Guy: im saying that after this upgrade we will have deal with consequences for long time
Infra Guy-2: No this is not testing the upgrade is the huge effort as we dont have dev resources to handle each job to run
Dev: if your jjb compiles all the yaml without error
Dev: I am not sure what consequences are we talking of
Infra Guy: so you think there will be no consequences, right?
Dev: unless you take the plunge will you know ??
Dev: you have a dev server running at port 9000
Infra Guy: this servers runs nothing
Dev: that is good
Dev: there you can take the risk
Infra Guy: and the fack we have managed to put something onto api doesnt mean it works
Dev: what API ?
Infra Guy: jenkins api
Infra Guy: hmmm
Dev: what have you put on Jenkins API ??
Infra Guy: (
Dev: jjb is a CLI
Infra Guy: ((
Dev: is what I understand
Dev: not a Jenkins API
Infra Guy: (((
Infra Guy: jjb build xmls and push them onto api
Infra Guy: and its doent matter
Dev: so you mean to say upgrading a CLI is goig to upgrade your core jenkisn API
Dev: give me a break
Infra Guy: the matter is that even if have managed to build something and put it onto api
Infra Guy: doesnt mean it will work
Dev: the API consumes the xml file and creates a job
Infra Guy: right
Dev: if it confirms to the options which it understands
Dev: then everything will work
Dev: I am actually not getting your point Infra Guy
Infra Guy: i do agree mr Dev
Dev: we are beating around the bush
Infra Guy: just want to be sure that if this upgrade will break something
Infra Guy: we will have a person who will fix it
Dev: that is what CICD is supposed to let me know with valid reasons
Dev: why can't that upgrade be done
Infra Guy: it can be done
Infra Guy: i even have commit in place
Today was a painful day when I realized that I need to back up my nginx configs like I back up my actual data. 20 minutes of downtime turned into an afternoon when I accidentally deleted the nginx config backups on my server. It's been... let's say fun.
Azure, great development slots! Must have, now I can have developer, staging and production. The greatest no downtime when swapping a new server in....
Everything crashes? WTF?
OKAY, so swapping to a service that authenticates users makes the authentication part crash :/
Phew development slots ROLL BACK...
No, the entire service was broken. After rolling back, all non-authenticating controllers work, but the authentication never happens, so the server is working but the users can't use it. Fuck!
Delete everything. Recreate. The setting persists. WTF. Delete again, recreate, reinitialize, republish, it works as it should when tested, phew.
Creating a new service, I can't replicate the problem. Hmm, okay, must have been a glitch. Next, update, YEAH swap, no downtime!!!
*EXPLOSION* ..... RINSE AND REPEAT :/
Screamed Terraform is not a joke at coworker today.
Idiot corrupted the remote state while just trying to change the AMI of an EC2 instance for staging. I even said any amount of downtime is completely ok.
Sunday plans: building a project VS a long drive to a family reunion. No laptop at the moment. At the beach later I'll just watch tutorials on my phone. Oh well, 100 km of driving 😩
It's winter and it's quiet. Too quiet. My shitty job has me sitting here, waiting for work to appear. I could be at home working on something dev related and fun and meaningful to the progress of my life but no, I have to be here and I have to "look" productive for the bosses. I hate this shit, it's like prison, except I get paid, so I should be thankful. I can remote into my PC at home but I already got snapped for that, now I'm paranoid and afraid to try to use this shitty downtime in a productive way.
Well, guess I better go sweep the already swept floors again to maintain the illusion of "work" for my penny dripping masters.
QQ having nothing to do is worse than too much to do.
When you have to call the priority line of your hosting provider because the site went down and they're only available between 9.00 and 18.00.
They must be joking...
So I finally got full computer access 2 and a half weeks into working, and now there's "scheduled network downtime" with no indication of when it will be back up. I swear I'm never going to get anything done here
Please excuse me: this is my first step into Python, so consider this a beginner's question.
This forked script checks a twitter page for words and sends a mail (probably using .qmail) to the owner.
If I execute this Python script:
[$USER@$HOST uberspace-downtime-notify]$ python fetch.py
Traceback (most recent call last):
  File "fetch.py", line 11, in <module>
ImportError: No module named html
Similar errors are fixed in this GitHub commit https://github.com/datalib/... - but that's a more complex script and I don't quite get where the imported module is needed (on a code basis - on the logical basis all is clear).
Any ideas for a guy taking his first steps into Python and getting back into programming languages after some years?
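One likely cause: the wording "No module named html" (without quotes around the name) is how Python 2 reports a failed import, and the `html` package only exists in Python 3. Running the script with `python3 fetch.py` may be enough. If it has to run on both, a guarded import is a common workaround. What follows is only a sketch: the rant does not show line 11 of fetch.py, so the exact names being imported are an assumption, and `decode_entities` is a hypothetical helper used for illustration.

```python
# Assumption: fetch.py needs HTML parsing/unescaping helpers from `html`.
try:
    # Python 3 location of the helpers
    from html.parser import HTMLParser
    from html import unescape
except ImportError:
    # Python 2 fallback: the module is called HTMLParser and
    # unescape lives on a parser instance
    from HTMLParser import HTMLParser
    unescape = HTMLParser().unescape


def decode_entities(text):
    """Turn HTML entities such as &amp; back into plain characters."""
    return unescape(text)


if __name__ == "__main__":
    print(decode_entities("Planned downtime &amp; maintenance"))
```

The try/except keeps a single code path for the rest of the script: everything after the import block just calls `unescape` or subclasses `HTMLParser` without caring which interpreter is running.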
If I had a dollar for every time a mother fucker in QA distracted me and threw me out of focus..
Nothing against QA, but an unnecessary interaction costs me about an hour of downtime trying to get back up to speed.
My Instagram feed wasn't loading up, neither was the profile. I thought my app was acting up and neglected it. Apparently both FB and Instagram are down for some users around the world. Sad.
Now, even my WhatsApp is unable to send across images.
Background: We switched from just simple old PHP and JS using notepad++ to PHPStorm and its infinite configurables, Symfony 4, Twig, Composer, Doctrine, Yarn, NPM, Bootstrap, ( thank the stars we didn't try to add Docker in with all this ), any other junk I'm missing here? Then upgraded to Symfony 5.
Symfony's autowiring: madness behind the curtains. I get frustrated about when and where I can just magically inject these dependencies or use config variables, you know, like the ones you define in service.yaml. Hmmm, "service".yaml. In a controller you can say getParameter() but in a service you have to inject the parameter, FROM THE "SERVICE".yaml!!! Autowiring drives me nuts. Ok, so we can supply dependencies using the constructor, that's great! Within a controller you never have to instantiate the object you're passing to the constructor (autowiring handles that). That's cool, weird when you try to trace it the first few times, but nice I guess. Feels like half-assin' it. What bugs me here is that it only works in controllers... I guess out of the box.. I'm not even sure. To get that feature to work for services you have to make some yaml edits. Right? Maybe? Some of the Symfony tutorials have you code up some junk then trash it. Change config then wipe that out and do X instead... so I have no idea what "out of the box" for Symfony really is.
Found this cool article that describes my frustrations in better terms and seems like a good resource to learn about autowiring. I need to continue my yaml wizardry classes. https://alanstorm.com/symfony-autow...
.....And on to YAMLs, or CSS, or JS or any other friggin' change you make to a file anywhere... Make a change, reload page, nothing... nope you have to do some hidden cheat combo of yarn dostuff -> cache:clear -> cache:warmup -> cache:cache:the:cache ... I really really hate this crap. Maybe I'm too old school for all this junk. It was simple with pure PHP. Edit code, push file, reload page, and oh look it changed! Done. So happy! Ok, Ok. Occasionally the js or css might get cached by the browser and you have to ctrl/f5 or Shift/f5 .. one of those. With this framework there's just so much more that you have to remember to do to get some new feature of your site loaded.
Now, I totally get wanting to use some type of entity framework, but I feel like my entire world turned backwards. Designing tables using something like MySQL Workbench made sense. I can see all the columns and datatypes right there as I'm building them. From what I've experienced now with Symfony/Doctrine, you have to make an entity, get a shit-ton of questions lobbed at you, and if it's a relation field you have to really have a clear idea of the cardinality up front. Then we migrate that to the database. Carefully read through the SQL if you really really just want to use migrations:migrate in Prod. That alter table could cost you some downtime if your table is large.
Some days man....
Client asks if we could proceed with migration today, or on monday
We agree on today and proceed to spell out the procedure, if it's okay
Client replies that they would prefer to migrate on monday, and asks how long the downtime will be, and whether it would be possible to migrate without downtime.
Why, of course, but only if your frickin infrastructure didn't consist of a *single* machine!
Ugh, why me...
I have 2 servers running in production, one using SQL Server Developer Edition and one using SQL Server Standard Edition. This was set up by shit people before they all resigned from the company.
I need to upgrade both servers to Enterprise Edition. It's a real pain since both servers are in production now.
Is it possible to upgrade them without any errors or long downtime?
When you have to be at work because it's work, but you finish all your work in 1 day regularly, and it takes QA 2-3 days to get back to you.... Massive downtime.
Starting my bachelor's in Computer Science this fall! Going to try to continue to work full time as long as possible (I’m a team lead at a call center, got a lot of downtime, not too stressful) to pay for school. Any tips for CS in college?
Spent downtime during testing passing papers with this dude in my class working on an app. Pretty chill guy imo.
Why are most (cloud-based) websites failing? They just spew out 5xx HTTP errors all the time. I can't even register properly on a website anymore. What the funk, man.
Bloody softlayer sending notifications about expected downtime on "IMS services" (which could mean any of a great number of things), without specifying what it is, what it does or to what services or regions it is related...
Grmbl, what use is there to get a notification about unexpected maintenance if you can't even make out if you'll be affected or not!
Every 3 years or so I invest in a new iMac. I was holding on for the new M1 iMacs, which are ready to order. So I am trading in my 2017 iMac and guess what, I get £420 🍾 trade-in value. What I am saying is, they may seem pricey to most people, but when you can get 1/3rd back when you trade in a machine that has run constantly for 3 years without any issues or downtime, that’s a pretty good investment. 👌🏼PS the MacBook Pros are shite, only a fool would buy one of those😀
If you were to host a PHP website on managed hosting, able to handle 200 concurrent users and to upgrade to a better plan with little or no downtime if needed, which would be your choice?
The ability to integrate a CI/CD solution would be really helpful.
Context: We are dealing with a one-time campaign at the company and we don't plan to integrate this project into our architecture, so we're looking for alternative solutions to host and deploy it.