5
nerdbeere
10d

I am the one responsible for the Atlassian suite at work: I maintain the systems, set them up, and stuff.

One day, our Crowd (the authentication and authorization application) just went crazy. Around lunchtime it could not connect to the AD anymore. No reason. It threw XSRF errors (cross-site request forgery), because http would connect to https. "Won't do it, fuck you," it told me. Out of the blue. No one changed anything. And yeah, seriously, no one did.
It just refused to connect (connecting to AD means Crowd connects to its own API, so it was refusing to talk to itself). It runs behind a proxy, hence the http/https. Well, this had worked for years. But all of a sudden, not anymore.
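
A minimal sketch of how this kind of check can fail behind a proxy. This is not Crowd's actual code; `same_origin` and the host `crowd.example.com` are made up for illustration. The assumption is that the browser talks https to the proxy, the proxy talks plain http to the application, and the application compares the request's Origin header with the scheme and host it believes it is served on.

```python
from urllib.parse import urlsplit


def same_origin(origin_header: str, seen_scheme: str, seen_host: str) -> bool:
    """Return True if the Origin header matches the scheme and host
    the application believes it is being served on."""
    origin = urlsplit(origin_header)
    return (origin.scheme, origin.netloc) == (seen_scheme, seen_host)


# What the browser sends: it only ever sees the https side of the proxy.
browser_origin = "https://crowd.example.com"

# The application only knows its own plain-http connector: mismatch.
print(same_origin(browser_origin, "http", "crowd.example.com"))

# The application has been told about the https proxy in front of it: match.
print(same_origin(browser_origin, "https", "crowd.example.com"))
```

Under these assumptions the first call fails and the second passes, which would explain an "http talking to https" rejection. It does not explain why it would start failing out of the blue.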

Yea. Fuck you.

It was reported some hours later, at like 3pm, as people could not log in to the applications that use Crowd as their authentication and authorization server.

Tried to debug a system where nothing had been changed, to make it work again. The best possible time to fail.

First workaround: if you are logged into one of the other Atlassian applications, just refresh the page, so your SSO token gets refreshed and you are signed in again.

Then I searched more and more. And more.
But nothing worked, nothing helped.

So I announced an emergency maintenance: take down the whole suite and restart Crowd to apply some changes to its settings, not knowing what would happen then, because all SSO sessions would be dropped. Sent out the mail like 30 minutes beforehand.

While waiting for the window, I just typed my credentials... and again, and again, just to have something to type while being bored.
Three minutes before the window...

It just worked again.

Well. Wtf. Seriously.
Just came back.

No intrusion, no changes at all. Just came back, as if nothing had happened.

Kind of the best part of this story... A headhunter messaged me on my way home to offer me a job as an Atlassian suite sysadmin at a company, at about double my salary.

At first I thought about going there and, when someone asked me something about Atlassian, just starting to laugh and then leaving, still laughing...

But then I responded very nicely that I don't want to cry at work. And wished him the best of luck.

I am doing some bad upgrades on our suite now. Very painful.
And I looked into the start scripts. Some look like an untalented intern told another one to write the scripts. Seriously, wtf.

Today I followed the guide to update a Confluence and change the database to Postgres. Didn't work, Postgres error.
Tried it again, jQuery won't load. Next try, Tomcat not starting anymore. Did the same thing every fucking time.

Yea. Maintenance window to get a nice new export soon. Will only take an hour.

To switch databases in Confluence, you need to set it up completely fresh. And then import your export.
The export takes an hour on our system.
Importing takes maybe the same time. Hope it will work (hint: nope).

Oh, it can be nice too. Just tell Bitbucket to migrate databases, there is a fucking setting for it. Enter the new database, ready, go, finished.

At least they don't raise prices very much every year or so.
Oh sorry, yes, they do.

Comments
  • 3
    Sounds awful. That's basically a comprehensive list of software I avoid like the plague.
  • 1
    The Atlassian suite is the poster child for why "SaaS" is crap software that managers were suckered into paying for year after year because they can't add. Subscription licences benefit the software seller only. Buyers pay less up front but a huge premium over time. Users get whatever shit gets peddled by vendors that are convinced that more releases equate to quality.
  • 2
    It was probably a problem with the system clock. A reboot fixed the issue, because the clock re-synced correctly. Most transient https issues are caused by this. You can verify by running a test with curl or openssl from the affected system against the problem server.
    Atlassian is a whole issue in itself. Prices are good for small teams, but it is really bad news for big ones.
  • 1
    Maybe, but it just worked again before I got to restart anything.
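
The clock theory from the comments can be checked without a reboot. A rough sketch, assuming the problem server answers HTTP requests with a `Date` header; the helper names `skew_seconds` and `check_clock` and the 300-second tolerance are made up for illustration, not taken from any Atlassian tool.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from urllib.request import urlopen


def skew_seconds(date_header: str, local_now: datetime) -> float:
    """Local time minus server time in seconds (positive: local clock is ahead)."""
    server_time = parsedate_to_datetime(date_header)
    return (local_now - server_time).total_seconds()


def check_clock(url: str, tolerance: float = 300.0) -> bool:
    """Fetch url and compare its Date header with the local clock.

    The tolerance is an arbitrary assumption; pick whatever your setup allows.
    """
    with urlopen(url, timeout=10) as response:
        date_header = response.headers["Date"]
    skew = skew_seconds(date_header, datetime.now(timezone.utc))
    print(f"clock skew vs {url}: {skew:+.1f}s")
    return abs(skew) <= tolerance
```

Run `check_clock("https://your-server.example.com/")` from the affected system, with the URL replaced by the real one. If the clock is far enough off, the https request itself may fail on certificate validation, which is also a hint.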