Join devRant
Do all the things like
++ or -- rants, post your own rants, comment on others' rants and build your customized dev avatar
Sign Up
Pipeless API
From the creators of devRant, Pipeless lets you power real-time personalized recommendations and activity feeds using a simple API
Learn More
Search - "maintenance window"
-
KISS.
Keep it simple, stupid.
At the beginning the project is nothing but an idea. If you get it off the ground, that's already a huge success. Rich features and code quality should be the last of your worries in this case.
Throw out any secondary functionality out the window from day 0. Make it work, then add flowers and shit (note to self: need to make way for flowers and shit).
Nevertheless code quality is an important factor, if you can afford it. The top important things I outline in any new non-trivial project:
1. Spend 1-2 days bootstrapping it for best fit to the task, and well designed security, mocking, testing and extensibility.
2. Choose a stack that you'll most likely find good cheap devs for, in that region where you'll look in, but also a stack that will allow you to spend most of your time writing software rather than learning to code in it.
3. Talk to peers. Listen when they tell that your idea is stupid. Listen to why it's stupid, re-assess, because it most probably is stupid in this case.
4. Give yourself a good pep talk every morning, convincing you that the choices you've made starting this project are the right ones and that they'll bring you to success. Because if you started such a project already, the most efficient way to kill it is to doubt your core decisions.
Once it's working badly and with a ton of bugs, you've already succeeded in actually making it work, and then you can tackle the bugs and improvements.
Some dev is going to hate you for creating something horrific, but that horrific thing will work, and it's what will give another developer a maintenance job. Which is FAR, far more than most would get by focusing on quality and features from day 0.9 -
So I looked at our dashboard and noticed a banner mentioning scheduled maintenance set for 7:00 AM. And I thought to myself, "I never released an update, and even if I had, the maintenance would be performed 15 minutes after the build finished, not at 7:00 AM." So I emailed my coworkers, asking if they had put up the banner, no, no. I started pulling my hair out trying to figure out what caused this banner to be created. Was there some old job that was just now running? I combed through the server logs, thousands of entries later, and I found the banner was installed by some user with the IP 172.18.0.1...which was the local machine. I went through all the users on the system, running atq to see if anyone had jobs scheduled. And there was one job scheduled, under the root user. At that moment, I legit thought to myself, "have we been hacked? How is that possible?" It's wasn't! Then I looked under /var/spool/atjobs to see what the job actually was. And then I saw it. My weekly updater cron job had installed updates and had scheduled a maintenance window to reboot the system. And I smiled, realizing that my code was now sentient.
-
Damn maintenance windows. I mean, I get that the SaaS product my boyfriend works on can't go out in the middle of the day. But he just spent weeks on a business trip, with an absolutely mad sleep schedule because he's redesigning their data center (or something. Don't ask me, it's a hardware problem...) and now he's finally coming home today and I'm super excited but all he's going to want to do this weekend is sleep. Even though I'd love to spend Sunday going to this yearly event, or finally getting another round of DnD in (everyone else in the group is available). But nope. Bloody all-hours work is taking away his (and thus my) weekend too.2
-
I used to be in an infrastructure maintenance team, and I worked with an old guy. We had a jump box we all used. This guy would work weekend maintenance windows and still be trying to get changes done at 7am, three hours after the end of the window. He was glacially slow. I remember watching him login to a prod weblogic server. He would open the Windows start menu, move his fucking mouse through two or three submenus, and finally click putty. Then, he would type out the FQDN of the jump server, and move his mouse to the connect/ok button. Then it would prompt him for his username and password, both of which took him about 90 seconds to single-finger type. Then, once loved into the jump box, he would then type ssh user@server.fqdn, rather than copying and pasting the server name.
It took him fully five minutes to get logged into the weblogic server. I could not take it. It would have taken me about ten seconds. -
!coding
I used to be a sysadmin, which meant I was in charge of quarterly server patching. My team managed about 2500 servers, running various flavors of linux and legacy unix. The vast majority(95% or more) ran Linux(SLES). Our maintenance window was always in the overnight-- 10pm to 6am --so the stroke of 10pm would be a massive cascade of patching commands sent to hundreds of servers.
Before I was brought into the process, it made use of the automation product we were tasked by mgmt to use: Bigfix. It's a real piece of shit. Though we had 2500 or so servers, this environment was dominated by windows. All our vcenter servers ran it, and more importantly, our bigfix nodes were all windows machines. That meant that while we're trying to patch, the bigfix servers would get patched by the windows team. This would cause lots of failed and timed out patching, because the windows admins never quite understood that taking down the automation infrastructure would cause problems.
As such, I got tired of depending on a bunch of button-pushing checkbox-clickers who didn't know shit about shit, so I started writing an ssh-wrapped patching system. By the time I left for my current job, patching had been reduced to a single command to initiate each group's patching and reboots, and an easy check to see when servers come back up. So usually, the way it worked out was that I would send patching orders to 750 machines or so, and within about 5 minutes, they would all be done patching, and within another 20 minutes all the ones that required rebooting but about 5 would be done rebooting.
The "all-nighter" which happened every time was waiting for oracle servers to run timed fscks against a dozen or so large filesystems per server, because they were all on ext3/4, which eats complete shit. Then, several hours later, as they finished, I would have to call the DBAs to tell them to validate their shitty servers.3 -
With a recent HAProxy update on our reverse proxy VM I decided to enable http/2, disable TLS 1.0 and drop support for non forward-secrecy ciphers.
Tested our sites in Chrome and Firefox, all was well, went to bed.
Next morning a medium-critical havock went loose. Our ERP system couldn't create tickets in our ticket system anymore, the ticket systems Outlook AddIn refused to connect, the mobile app we use to access our anti-spam appliance wouldn't connect although our internal blackboard app still connected over the same load balancer without any issues.
So i declared a 10min maintenance window and disabled HTTP/2, thinking that this was the culprit.
Nope. No dice.
Okay, i thought, enable TLS 1.0 again.
Suddenly the ticket system related stuff starts to work again.
So since both the ERP system and the AddIn run on .NET i dug through the .NET documentation and found out that for some fucking reason even in the newest .NET framework version (4.7.2) you have to explicitly enable TLS 1.1 and 1.2 or else you just get a 'socket reset' error. Why the fuck?!
Okay, now that i had the ticket system out of the way i enabled HTTP/2 and verified that everything still works.
It did, nice.
The anti-spam appliance app still did not work however, so i enabled one non-pfs cipher in the OpenSSL config and tested the app.
Behold, it worked.
I'm currently creating a ticket with them asking politely why the fuck their app has pfs-ciphers disabled.
And I thought disabling DEPRECEATED tech wouldn't be an issue... Wrong... -
"Could not complete cursor operation because the table schema changed after the cursor was declared"
So, our customer is updating their side of the database, while their own users continue to use our software. Would've been nice if we got word of the maintenance window too.1 -
The story of a normal release:
- tool gets tested "intensely" by 3 ppl quite a long time - everything works
- a major 2 days reserved as maintenance window for even more testing
- release starts
- first the admin panel of the server suddenly is not accessible anymore
- after some problems the tool is deployed
- suddenly servers are down and not pingable anymore - off on off on (provider has major problems .. good job)
- ppl start testing
- testers report lots and lots of new bugs - seems like the testing wasn't that intense after all...
- people start coming with lots of new requirements (oh we need to import those excels.. excels don't match our internal stuff.. )
- confusion over confusion
- getting pissed of a lot...
- quit caring and focus on another project
- profit
Fuck my life -
McDonald's has a Window problem... And a really slow customer...
I have a buying problem tho.... Just bought an OP6 this morning and rationalized it by saying I need a new camera and my monthly maintenance is $500 anyway so...2 -
I am the responsible for the atlassian Suite at work, as I maintain the systems, set them up, and stuff.
One day, our crowd (the authentication and authorization application) just went crazy. At like lunch time it could not connect to the AD anymore. No reasons. Throwing XSRF errors (cross site scripting), because http would connect to https. "won't do it, fuck you" it told me. Out of the blue. Noone changed anything. And yea, seriously. Noone did.
It just refused to connect (as connecting to AD is connecting yourself with you own api. And refusing yourself talking to yourself). It runs behind a proxy. Therefore http/https. Well, this worked for years. But out of sudden not anymore.
Yea. Fuck you.
It was reported some hours later, at like 3pm, as people could not login to the applications using crowd as authentication and authorization server.
Tried to debug the system, where nothing was did, to make it work. At best time to fail.
First workaround: if you are logged into one of the other applications of atlassian, just refresh the site, so your SSO token gets a refresh and you are signed on again.
Then I searched more and more. And more.
But nothing worked, nothing helped.
So I addressed an emergency maintenance, take down the whole Suite, restart crowd, to apply some changes to it's settings, not knowing what happening then, because all connections of SSO will then be released. Sent out the mail like 30 minutes beforehands.
While waiting for the window, I just typed my credentials... And redid, and redid, so to type and being bored.
Three minutes before the window...
It just worked again.
Well. Wtf. Serioudl
Just came back.
No Intrusion, no changes at all. Just came back, as nothing has happened.
Kind of best part of this story... A headhunter messaged me on my way home to offer me a job as an Atlassian Suite SysAdmin for a company, at kinda the double of my salary.
At first I was thinking to go there, and when someone then asked me sth about Atlassian just start to laugh and then leave still laughing...
But then I very nicely respond that I dont want to cry at work. And wished him best luck.
I am doing some bad upgrades now on our Suite. Very painful.
And I looked into the start scripts. Some Look like the untalented intern tells another one to write scripts. Seriously wtf.
Today I followed the guide to Update a confluence and change database to Postgres. Didnt work, Postgres error.
Try it again, jquery won't load. Next try, tomcat not starting anymore. Did same thing. Every fucking time.
Yea. Maintenance window to get a nice new export soon. Will only take an hour.
To switch database in confluence, you need to set it up very fresh. And then Import your export.
Export takes an hour at our system.
Importing maybe the same time. Hope it will work (hint: Nope).
Oh, can be nice also. Just tell the Bitbucket to migrate databases, there is a fucking setting for it. Enter new database, ready, go, finished.
At least they don't raise costs very much every kinda year.
Oh sorry, yes, they do.4 -
When does the bank decide to take their site down for maintenance? Payday. I'm not alone in this boat, millions of people get paid today (It's the last business day before the 15th!)
Not only can I not see what's in my account, but I can't pay the mortgage, the car payment, or a bunch of other bills.
Of all the fucking times to pick a maintenance window, somebody had to not be thinking about this.3 -
We’re only random people living in random places, speaking random languages, eating random food, sleeping, studying and working random hours. Traveling to random points on a sphere.
Just random range is different.
Just random stuff happens on crossroads of two random dots and the entropy speed ups or slows down.
Nothing special at all.
Just a finite state machine iteration.
I mean the amount of effort we put into explanation of infinity is outstanding.
What if there is no infinity at all ?
What if infinity is just misunderstanding of our interpretation of the world around us. It’s just pixels, resolution, gaussian splatting, quantum state, you name it.
Hey man the world is flat. Just put it to the 2d space. How many space you need from a simulation perspective where your patient eyes can only see up to certain amount of light particles per second on a shitty lens.
Propose a world optimization techniques by slowing down subject perception, tiredness introduced. Compress memory, sleep introduced. Limit neurons, cpu power assigned. Deploy on cloud - put it to life. Exit 0 body failure. Exit 1 suicide. Kill -9 killed by tty from ip EARTH.X.Y
What you can do to make the world around this planet alive? Make it blink.
We developers are lazy and I believe that nature is even more lazy than us.
You think you’re going to elevator right now ? You’re going to the preloader. Looking at the window equals playing video from playback. Never goes live, just precomputed fsm. Cars, trains, airplains ? Preloaders everywhere. Highways to split traffic to cities and communication. The road and cities planning department is a matrix maintenance department. And don’t get me started about space.
Space is empty because it’s not even finished. So they put it all behind glass called milky way. You know how glass looked 500 years ago ? It was milky so it’s milky way so we don’t see shit.
If the space would be finished I’ll be starting writing this text from mars, finished it and sent from earth but no it’s light years guys, light years is not a second for a matter. Light year is a second of the the injected thoughts exchange only. Thoughts of the global computer called generative AI that they introduced on local computing devices called cloud.
Even the preloader system is not present, they left us with the one map and overpopulated demo. What a shit hole.I bet they’re increasing temperature right now to erase this alpha build and cash out. Obviously so many bugs here that his one can’t be fixed anymore. To many viruses.
Hope for 0days to start happening so we can escape using time travel or something.
I bet they cut a budget or something, moved the team to other projects. Or even worse solar system team got layoff off because we are just neurons that ordered to do it. And now we’re stuck in some maintenance mode, no new physics no new thoughts to pursue, just slow degeneration. I would pay more for the next run and switch to other galaxy far far away where they at lest have more modern light speed technology.
What do you think about it Trinity ? Not even worth wasting your time for that. No white rabbit this time.
I do not recommend this game at this stage of early access.
- only one available map despite promises for expansions over the years no single dlc arrived,
- missing space adventures
- no galaxy travel mode only a teaser trailers of what you can do in other “universes”
- developers don’t respond to complains
- despite diversity of species and buildings at first sight world looks to generic
- instead of new features bots with mind manipulation, AB testing and data harvesting was introduced
- death anti cheat mode installed1 -
migrating servers to new hardware in vmware...
project manager: I don't want to log in to do 15 minutes worth of work on Sunday during the scheduled maintenance window. can you stay late on Friday? I'll have someone shutdown the servers at 515.
so you want me to waste my time to save your time?.... fine....