Do all the things like ++ or -- rants, post your own rants, comment on others' rants and build your customized dev avatarSign Up
Get a devDuck
Rubber duck debugging has never been so cute! Get your favorite coding language devDuckBuy Now
Search - "test in prod"
I worked with a good dev at one of my previous jobs, but one of his faults was that he was a bit scattered and would sometimes forget things.
The story goes that one day we had this massive bug on our web app and we had a large portion of our dev team trying to figure it out. We thought we narrowed down the issue to a very specific part of the code, but something weird happened. No matter how often we looked at the piece of code where we all knew the problem had to be, no one could see any problem with it. And there want anything close to explaining how we could be seeing the issue we were in production.
We spent hours going through this. It was driving everyone crazy. All of a sudden, my co-worker (one referenced above) gasps “oh shit.” And we’re all like, what’s up? He proceeds to tell us that he thinks he might have been testing a line of code on one of our prod servers and left it in there by accident and never committed it into the actual codebase. Just to explain this - we had a great deploy process at this company but every so often a dev would need to test something quickly on a prod machine so we’d allow it as long as they did it and removed it quickly. It was meant for being for a select few tasks that required a prod server and was just going to be a single line to test something. Bad practice, but was fine because everyone had been extremely careful with it.
Until this guy came along. After he said he thought he might have left a line change in the code on a prod server, we had to manually go in to 12 web servers and check. Eventually, we found the one that had the change and finally, the issue at hand made sense. We never thought for a second that the committed code in the git repo that we were looking at would be inaccurate.
Needless to say, he was never allowed to touch code on a prod server ever again.9
Before an interview prepare a list of questions for them, they expect it!
My list to give inspiration:
Describe your company culture? - if the response is buzzword heavy, avoid.
What’s the oldest technology still in use? - all companies have legacy systems but some are worse than others
Describe your agile process? - a few companies I’ve interviewed with said they are agile but it’s actually kanban
Are developers involved with customers?- if they trust you to talk to customers you can infer trust to do your job ( I’m sure others will disagree)
Describe your development environment?- do they have such a thing as dev, test and prod?
These are the only ones I can remember but should give others a bit of inspiration I hope 😄9
Someone called me an incapable, arrogant bitch today because I didn't let them test in prod. We had words.
The monologue from Full Metal Jacket went through my head at one point, and I almost didn't mute my phone in time before I said it rather than just thinking it. I'm not sure I would still have a job if I did, but I kind of wish I didn't mute my phone.13
Hello! I am Johnny Knoxville and tonight on Jackass...we test in prod with no back out plan in place...and then I will glue Bam's hand to his dick while he's sleeping.6
When migrating from MySQL server to MariaDB and having a query start returning a completely different result set then what was expected purely because MariaDB corrected a bug with sub selects being sorted.
It took several days to identify all that was needed on that sub select was “limit 1” to get that thing to return the correct data, felt like an idiot for only having to do 7 character commit 😆4
That feeling when your client connection is more stable than the connection of a fucking game server... Incompetent pieces of shit!!! BEING ABLE TO PUT A COUPLE OF SPRITES DOESN'T MAKE YOU A FUCKING SYSADMIN!!!
Oh and I sent those very incompetent fucks a mail earlier, because my mailers are blocking their servers as per my mailers' security policy. A rant from the old box - their mail servers self-identify a fucking .local!!! Those incompetent shitheads didn't even properly change the values from test into those from prod!! So I sent them an email telling them exactly how they should fix it, as I am running the same MTA on my mailers (Postfix), at some point had to fix my mailers against the exact same issue as well, and clearly noticed in-game that they have deliverability problems (they explicitly mention to unblock their domain). Guess why?! Because their server's shitty configuration triggers fucking security mechanisms that are built against rogue mailers that attempt to spoof themselves as an internal mailer, with that fucking .local! And they STILL DIDN'T CHANGE IT!!!! Your fucking domain has no issues whatsoever, it's your goddamn fucking mail servers that YOU ASOBIMO FUCKERS SHOULD JUST FIX ALREADY!!! MOTHERFUCKERS!!!!!2
On the first day of Christmas, the bossman gave to me: The fact that my new computer purchase order needs to be OKed by the CEO and I need to continue working on a 2014 Mac Mini (i5-4260U, 8 Gig RAM, GPU shot by an ESD on the case long ago) for the next year.
On the second day of Christmas, my family gave to me... a good reason to get shitfaced
On the third day of Christmas, getting shitfaced gave to me: A hangover and some urgent plastic welding job that had to be done with a soldering iron. FML, I've had a headache before breathing in pure hydro-cyano-whatthefuckyougetwhenyoumeltplastics
On the fourth day of Christmas, my team gave to me: A legacy, age-old Rails 2 project that was written by an intern and never reviewed, went to prod in 2014 and can't be changed anymore, but needs to be changed after the fact that it has zero test coverage and needs 100 % now to prevent issues and costly manual testing.
On the fifth day of Christmas, devrant gave to me: The Idea that making fun of Christmas songs to get over the sheer amount of dicks that working over the twelve days of Christmas sucks.
To be continued...2
TLDR - you shouldn't expect common sense from idiots who have access to databases.
I joined a startup recently. I know startups are not known for their stable architecture, but this was next level stuff.
There is one prod mongodb server.
The db has 300 collections.
200 of those 300 collections are backups/test collections.
25 collections are used to store LOGS!! They decided to store millions of logs in a nosql db because setting up a mysql server requires effort, why do that when you've already set up mongodb. Lol 😂
Each field is indexed separately in the log.
1 collection is of 2 tb and has more than 1 billion records.
Out of the 1 billion records, 1 million records are required, the rest are obsolete. Each field has an index. Apparently the asshole DBA never knew there's something called capped collection or partial indexes.
Trying to get approval to clean up the db since 3 months, but fucking bureaucracy. Extremely high server costs plus every week the db goes down since some idiot runs a query on this mammoth collection. There's one single set of credentials for everything. Everyone from applications to interns use the same creds.
And the asshole DBA left, making me in charge of handling this shit now. I am trying to fix this but am stuck to get approval from business management. Devs like these make me feel sad that they have zero respect for their work and inability to listen to people trying to improve the system.
Going to leave this place really soon. No point in working somewhere where you are expected to show up for 8 hours, irrespective of whether you even switch on your laptop.
Wish me luck folks.4
I really, really, fucking god damn it REALLY need to move a legacy project from the grave yard server and get it in git, and then build a dev environment for it, so I can stop making incredibly volatile changes direct to PROD (backend, frontend and DB all at once and then test it while it’s live and being used, but fuck me if I can be bothered digging through a 10GB code base and attempting to make it work in a multi-environment setup when it’s going to be a long trip down the error logs until it works again 😱🔫2
Long time ago, back in a day of Microsoft Office 95 and 97, I was contracted to integrate a simple API for a payment service provider.
They've sent me the spec, I read it, it was simple enough: 1. payment OK, 2. payment FAILED. Few hours later the test environment was up and happy crediting and debiting fake accounts. Then came the push to prod.
I worked with two other guys, we shut down the servers, made a backup, connected new provider. All looked perfectly fine. First customers were paying, first shops were sending their products... Until two days later it turned out the money isn't coming through even though all we are getting from the API is "1" after "1"! I shut it off. We had 7 conference calls, 2 meetings, 3 days of trying and failing. Finally, by a mere luck, I found out what's what.
You see, Microsoft, when you invent your own file format, it's really nice to make it consistent between versions... So that the punctuation made in Microsoft Word 97 that was supposed to start from "0" didn't start from "1" when you open the file in Microsoft Word 95.
Also, if you're a moron who edits documentation in Microsoft Word, at least export it to a fucking PDF before sending out. Please.
Typos kill, kids! And deploying to production.
Instead of "for item in items" in my script, I accidentally did "for items in items". Thus, an exponential loop has been entering things into the database for the past few hours before I found the place to fix it.
By the way, this runs on cron every minute. So there are processes still running exponentially right now, possibly 180+.
Yeah, I'm setting up a a test server instead now.11
I've been staffed on a old ongoing project, first day.
0. Compatibility has to be guaranteed down till IE9... ppf.
1. Front end made in XHTML+JS(jQuery)... bah, ok.
2. XHTML+JS is actually generated by PHP5.4, not a line is actually statically served... beh, funny, ok.
3. PHP files are the output of an XSLT transform of a bunch of XMLs... meh, seriously? Oooook.
4. XMLs are the product of the serialisation of a truck of stateful JavaEE6 DTOs populated magically (undocumented) with data coming from a SQL DB... WTF mode!!!
5. Session logics lives within PHP-land at point 2, front end makes ajax calls here that propagates to another WS out of our control that triggers -somehow- (undocumented) our Java backend at point 4 to generate new XMLs and then reach front end again. Kill me now.
Boss: look... it's too slow for the client, it's too heavy on our servers: fix it. Ah, and we sold 85% test coverage by October. You're the man for the job. (I'm a Node.js fullstacker and right now there's not even a testing scaffold, ofc).
Me: prod is on Linux or Windows?
Me: rm -rf / as root. Done.
Boss: I know I know...
I think time has come...6
There is a company providing a very speciffic service. And it has a core application for that svc, supported by a core app team.
That company also has other services, which are derivations of the core one. So every svc depends on core.
Now that we're clear on that... I was working in a team of one of the subservices. We very strongly depended on core. In fact, our svc was useless if integration w/ core broke down.
The core team had an annoying habbit. They refused to version their webservices and they LOVED to push api updates w/o any warnings. Our prod, test, other envs used to fail bcz of core api changes quite often. Mgmt, IT head was aware of the problem and customers' complaints as well.
So as a result, once core api changes we're all in a panic mode: all prior priorities are lowered and revival of prod is to be our main focus. Core api is not docummented, the changes are not clear, so we have to reverse engineer the shit out of it. We manage to patch our prod up w/ hotfixes, but now we have tech debt. While working on the debt, core api changed again, in test env. Mgmt pushes debt back and reallocates us to hotfix test. Hotfix is 80% done when another core api breaks. Now mgmt asks us to drop wtv we're working on and fix that new break. By the time we're to deploy the hotfix, another api breaks in another env. The mgmt..... You get the picture :)
2 years go by, nothing has changed so far.6
I once ran a batch import job to stress test with much more data than usual in staging... so I thought.
I sshd into prod by accident. History search can be a bastard at times...
The moment I realized what I was doing was crazy. I was shivering and thought I get fired if anything went wrong.
fortunately I just duplicated the original data for the test, and the system was built to ignore unnecessary updates... so the data was correct and nothing went wrong.
Not an active stupid dev choice but still something I will remember for a while.
So after 6 months of asking for production API token we've finally received it. It got physically delivered by a courier, passed as a text file on a CD. We didn't have a CD drive. Now we do. Because security. Only it turned out to be encrypted with our old public key so they had to redo the whole process. With our current public key. That they couldn't just download, because security, and demanded it to be passed in the fucking same way first. Luckily our hardware guy anticipated this and the CD drives he got can burn as well. So another two weeks passed and finally we got a visit from the courier again. But wait! The file was signed by two people and the signatures weren't trusted, both fingerprints I had to verify by phone, because security, and one of them was on vacation... until today when they finally called back and I could overwrite that fucking token and push to staging environment before the final push to prod.
Only for some reason I couldn't commit. Because the production token was exactly the same as the fucking test token so there was *nothing to commit!*
BECAUSE FUCKING SECURITY!5
I used to work on a production management team, whose job was, among other things, safeguarding access to production. Dev teams would send us requests all the time to, "run a quick SQL script."
Invariably, the SQL would include, "SELECT * FROM db_config."
We would push the tickets back, and the devs would call us, enraged. I learned pretty quickly that they didn't have any real interest in dev, test, or staging environments, and just wanted to do everything in prod, and see if it works.
But they would give up their protests pretty fast when I offered to let them speak to a manager when they were upset I wouldn't run their SQL.2
This was some time ago. A Legendary bug appeared. It worked in the dev environment, but not in the test and production environment.
It had been a week since I was working on the issue. I couldn't pinpoint the problem. We CANNOT change the code that was already there, so we needed to override the code that was written. As I was going at it, something happened.
Manager: "Hey, it's working now. What did you do?"
Me: *Very confused because I know I was nowhere close to finding the real source of the problem* Oh, it is? Let me check.
Also me: *Goes and check on the test and prod environment and indeed, it's already working*
Also me to the power of three: *Contemplates on life, the meaning of it, of why I am here, who's going to throw out the trash later, asking myself whether my buddies and I will be drinking tonight, only to realize that I am still on the phone with my manager*
Me again: "Oh wow, it's working."
Manager: "Great job. What were the changes in the code?"
Me: "All I did was put console logs and pushed the changes to test and prod if they were producing the same log results."
Manager: "So there were no changes whatsoever, is that what you mean?"
Me: "Yep. I've no idea why it just suddenly worked."
Manager: "Well, as long as it's working! Just remove those logs and deploy them again to the test and prod environment and add 'Test and prod fix' to the commit comment."
Me: "But what if the problem comes up again? I mean technically we haven't resolved the issue. The only change I made were like 20 lines of console logs! "
Manager: "It's working, isn't it? If it becomes a problem, we'll work it out later."
I did as I was told, and Lo and Behold, the problem never occurred again.
Was the system playing a joke on me? The system probably felt sorry for me and thought, "Look at this poor fucker, having such a hard time on a problem he can't even comprehend. That idiotic programmer had so many sleepless nights and yet still couldn't find the solution. Guess I gotta do my job and fix it for him. I'm the only one doing the work around here. Pathetic Homo sapiens!"
Don't get me wrong, I'm glad that it's over but..
What the fuck happened?5
So... Heard back from a recruiter today. Lovely lass.
I’d passed over a submission for her tech demo.
The brief was basically just to create a small simple module that calculates shit, nae effort.
But, when the recruiter had me on the phone she said “I know it’s a silly small module but try and run it up like you would a production ready app”.
The job spec and recruiter were keen on me demonstrating TDD, not specific on js version, final runtime, etc. The job was a senior spec at a higher salary range. So it warranted some effort, and demonstrating more than a simple module.
“Okay, cool, nae bother, let’s crack on.”
The feedback in the response from the dev today:
“He’s over-engineered tests, build...”
SUCK MY LEFT TESTICLE YOU FUCKWIT.
Talk to your recruiters, not me.
The feedback included a phrase I never hope to hear from a developer I work with:
“Tests are good but...” 😞
It was a standard 98% test suite from an RGR cycle, no more or less than I’d expect in prod.
The rest of the feedback was misguided or plain wrong. It was useful to see because I know now when they say they have “high standards” they mean: we listen to the dude who put the factory pattern in a JS brief.
Oh shit also: “someone’s done chmod 777” was in there as a sarcastic comment in the feedback. It was his fucking unarchive tool 😞
My response was brief and polite: “cheers for the consideration, all the best, James”
It’s honestly not worth warning them. Or, asking why they’d criticise something they’d asked me to do.
If you want a shitty js module, ask for a shitty js module and no more.4
Developing a notification API, sends emails to subscribers, email API can take only 100 IDs at once, so partitioned the email list and send mails in blocks of 100.
Forgot to reset the list after every block, so each new partition got appended to the existing list and kept going on.
Ran it against a test DB, which was recently refreshed with near-prod data !!! Thousands of emails went out of the app server in one shot and everybody receiving numerous duplicate emails. Especially the ones in the very first partition.
Got an incident raised by the CEO himself reg the flurry of emails. But, things were out of our hands, quite literally. All emails are queued up in the exchange server.
Called up the exchange server team, purged the queued emails. No other emails were sent/received during this whole episode.
Thanks to Iterables.partition in the present day.3
ughh, studying about various software project management methodologies and lifecycles is so boring.
every different model looks similar, saying :
you got an idea?
- check weather its needed and if its practically / financially possible
- get investment and resources,
- design,develop, test, release
- repeat .
why name them waterfall or spiral or rad or agile or shit?
and we know how project go in reality: "fuck its 2 days to release and 5 features left? push to prod, make breaking features, leave the tests and release"7
So last week I really fucked up
I had this new implementation that was supposedly to be integrating smoothly into the rest of the service. It depended on a serialized model made by a data scientist. I test it in local, in QA environment: no problem.
So, Friday, 4pm, I decide to deploy to production. I check once from the app: the service throw an error. Panic attack, my chief is at my desk, we triy to understand what went wrong. I make calls with cUrls: no problem. Everything seems fine. I recheck from the app again: no problem.
We dedice to let it in prod, as the feature work. I go get some beers with the guys, to celebrate the deploy.
Fast-forward the next morning, 11am, my phone ring: it's a colleague of my chief. "Please check Slack, a client is trying to use the feature, it's broken"
Panic attack again. I go to the computer, check the errors: two types of errors. One I can fix, the other from a missing package on the machine that the data guy used.
Needless to say, I had a fairly good weekend.
- make sure Dev, QA and Prod are exactly the same (use Ansible or Container)
- never deploy on a Friday afternoon if you don't have a quick way to revert1
TL;DR Dear boss, firstly, you always get someone to review anything important done by a fucking intern.
Secondly, you do not give access to your fucking client's production server to an intern.
Thirdly, you don't ask your fucking intern to test the intern's work that has not been reviewed by anyone directly on your client's fucking production server.
Last week, the boss and one of the lead devs (the only guy with some serious knowledge about systems and networking) decided to give me (an intern who barely has any work experience) the task of fixing or finding an alternate solution to allowing their support team access to their client machines. Currently they used a reverse SSH tunnel and an intermediary VH but for some reason, that was very unreliable in terms of availability. I suggested using OpenVPN and explained how it would work. Seemed to be a far better idea and they accepted. After several days of working through documentations and guides and everything, I figured out how OpenVPN works and managed to deploy a TEST server and successfully test remote access using two VMs. On seeing my tests, the boss told me that he wanted to test it on the client network. I agreed. Today he comes to me and he tells me to prepare testing for tomorrow and that the client technician is going to give me access to one of their boxes. And then he adds, "It's a working prod server. We'll see if we can make it work on that" and left. I gaped at him for a while and asked another dev guy in the room if what I heard was right. He confirmed. Turns out, the lead dev and the boss's son (who also works here) had had a huge argument since morning on the same issue and finally the dev guy had washed it off his hands and declared that if anything goes wrong from testing it on production, it's entirely the boss's own fault. That's when the boss stepped in and approached me. I ran back to his office and began to explain why prod servers don't top the list of things you can fuck around with. But he simply silenced me saying, "What can go wrong?" and added, "You shouldn't stay still. You should keep moving". Okay, like firstly what the fuck and secondly, what the fuck?.
Even though OpenVPN client is not the scariest thing to install, tomorrow's going to be fun.4
Departure: 13.00 Train: test Destination: A1 Delay 18888min
Now it is 15:43 and it is still on...
Finaly poland joins the testing in the prod club!
Station is Wroclaw if anybody is interested.2
Just needing somewhere to let some steam off
Tl;dr: perfectly fine commandline system is replaced by bad ui system because it has a ui.
For a while now we have had a development k8s cluster for the dev team. Using helm as composing framework everything worked perfectly via the console. Being able to quickly test new code to existing apps, and even deploy new (and even third party apps) on a simar-to-production system was a breeze.
We are now required to commit every helm configuration change to a git repository and merge to master (master is used on dev and prod) before even being able to test the the configuration change, as the package is not created until after the merge is completed.
Rolling out new tags now also requires a VCS change as you have to point to the docker image version within a file.
As we now have this awesome new system, the ops didn't see a reason to give us access to kubectl. So the dev team is stuck with a ui, but this should give the dev team more flexibility and independence, and more people from the team can roll releases.
Back to reality: since the new system we have hogged more time from ops than we have done in a while, everyone needs to learn a new unintuitive tool, and the funny thing, only a few people can actually accept VCS changes as it impacts dev and prod. So the entire reason this was done, so it is reachable to more people, is out the window.3
Rolled out a new application I built almost entirely by myself 2 days ago... But my dev group is understaffed and has a project manager who is literally the most clueless person I have ever met, so as a result, we don't have a functional/useful dev/test/prod framework and no standards for how to deploy apps. So my past 2 days were comprised of fixing bugs in the live system that could probably have been caught if I had the time and resources to get everything thoroughly tested. It's stable now, but damn our management for being generally idiots. Our motto appears to be "Fuck it, we'll do it live"1
I hate dev politics...
PM: Hey there is a weird error happening when I upload this file on production, but it works on our test environments.
Me: After looking at this error, I don't find any issues with the code, but this variable is set when the application is first loaded, I bet it wasn't loaded correctly our last deployment and we just need to reload the application.
Senior Dev: We need to output all of the errors and figure out where this error is coming from. Dump out all the errors on everything in production!!
Me: That's dumb... the code works on test... it's not the code.. it's the application.
Senior dev: %$*^$>&÷^> $
Me: Hey I have an idea! If test works... I can go ahead and deploy last week's changes to prod and dump those errors you were talking about!!
Senior Dev: OK
Me: *runs Jenkins job the deploys the new code and restarts the application*
PM: YAY you fixed it!!
Senior Dev: Did you sump put those errors like I said.
Me: Nope didn't touch a thing... I just deployed my irrelevant changes to that error and reloaded the application.2
TL;DR: When picking vendors to outsource work to, vet them really well.
Got a large redesign project that involves rebuilding a website's main navigation (accessibility reasons).
Project is too big just for our dev team to handle with our workload so we got to bring a 3rd party vendor to help us. We do this often so no big deal.
But, this time the twist was Senior Management already had retained hours with a dev shop so they want us to use them for project. Okay...
Have our scope / discovery meeting about the changes and our expected DevOps workflow.
Devs work Local and push changes to our Github, that kicks off the build and we test on Dev, then it goes to Staging for more testing & PM review. Once ready we can push to prod, or whenever needed. All is agreed, everyone was happy.
Emailed the vendors' project manager to ask for their devs Github accounts so we can add them to the project. Got no reply for 3 days.
4th day, I get back "Who sets up the Github accounts?"
fuck me. they've never used Github before but in our scope meeting 4 days ago you said Github was fine...??
Whatever, fuck it. I'll make the accounts and add them.
Added 4 devs to the repo and setup new branch. 40min later get an email that they can't setup dev environment now, the dev doesn't know how to setup our CMS locally, "not working for some reason."
So, they ask for permission to develop on our STAGING server.. "because it's already setup"... they want to actively dev on our staging where we get PM/Senior Management approvals?
We have dev, staging, production instances and you want to dev in staging, not dev?... nay nay good sir.
This is whom senior management wants us to use, already paid for via retainer no less. They are a major dev shop and they're useless...
Cant wait for today's progress checkup meeting. 😐😐
I've now worked on both monolithic solutions and microapps/microservices. I gotta say I'm not sold on the new approach. There's so much overhead! You don't have to know your way around one solution -- no, now you need to know your way around 100 solutions. Debugging? Yeah, good luck with that. You don't have to provision one environment for dev, test, staging, and prod. No, now you need 100 environments per... environment. Now, you need a dedicated fulltime devops person. Now devs can check in breaking changes because their code compiles fine in that one tiny microapp. The extra costs go on and on and on. I get the theoretical benefits but holy crap you pay for it dearly. Going back to monolithic is so satisfying. You just address the bug or new feature head on without the ceremony and complexity. You know you're not crapping on other people's day (compilation-wise) because the entire solution compiles.
...and yeah, I'm getting old. So get off the lawn! ;)2
I'm still studying computer science/programming, I still have one year to do in order to graduate (Master). I am in a work study program so I'm working for a company half of the time, and I'm studying the other half. It is important to mention that I am the only web developer of the company
When I arrived in the company 9 months ago, I was given a Vue project which had been developed by a trainee a few weeks before my arrival and I was asked to correct a few things, it was mostly about css. Then, I was ask to add a few functionalities, nothing really hard to code, and we were supposed to test the solution in a staging environment, and if everything was ok, deploy it to prod.
However, the more I did what I was asked, the more functionalities I had to implement, until I reached a point where I had to modify the API, create new routes, etc. I'm not complaining about that, that's my job and I like it. But the solution was supposed to be ready when I arrived, it was also supposed to be tested and deployed.
The problem is, the person emitting these demands (let's call him guy X) is not from the IT service, it's a future user of the website in the admin side. The demands kept going and going and going because, according to him, the solution was not in a good enough state to be deployed, it missed too many (un)necessary features. It kept going for a few months.
The best is yet to come though : guy X was obviously a superior, and HIS superior started putting pressure on me through mails, saying the app was already supposed to be in production and he was implying that I wasn't working fast enough. Luckily, my IT supervisor was aware of what was going on and knew I obviously wasn't to blame.
In the end, the solution was eagerly deployed in production, didn't go through the staging environment and was opened to the users. Now, guy X receives complaints because none of what I did was tested (it was by me, but I wasn't going to test every single little thing because I didn't have time). Some users couldn't connect or use this or that feature and I am literally drowning in mails, all from guy X, asking me to correct things because users are blocked and it's time consuming for him to do some of the things the website was doing manually.
We are here now just because things have been done in a rush, I'm still working on it and trying to fix prod problems and it's pissing me off because we HAVE a staging environment that was supposed to prevent me from working against the clock.
On a final note, what's funny is that the code I'm modifying, the pre-existing one needs to be refactored because bits and pieces are repeated sometimes 5 times where it should have been externalized and imported from another file. But I don't know when and if I will ever be able to do that.
I could have given more context but it's 4am and I'm kinda tired, sorry if I'm not clear or anything. That's my first rant
Ever have one of those moments where you're running a service you built to update about a decade worth of police records, realize about halfway through that you fucked the loop and you're copying data from the first record onto every other record, and then just really wish that you had checked things better in test before running this on the prod server?
I'm sure the only reason I'm still here is because the audit log contained the original values and I'm good at pulling data out of it.1
Tl;Dr Im the one of the few in my area that sees sftping as the prod service account shouldn't be a deployment process. And the ONLY ONE THAT CARES THAT THIS IS GONNA BREAK A BUNCH OF SHIT AT SOME POINT.
The non tl;dr:
For a whole year I've been trying to convince my area that sshing as the production service account is not the proper way to deploy and/or develop batch code. My area (my team and 3 sister teams) have no concept of using version control for our various Unix components (shell scripts and configuration files) that our CRITICAL for our teams ongoing success. Most develop in a "prodqa like" system and the remainder straight in production. Those that develop straight in prodqa have no "test" deployment so when they ssh files straight to actual production. Our area has no concept of continuous integration and automated build checking. There is no "test cases", no "systems testing" or "regression testing". No gate checks for changing production are enforced. There is a standing "approved" deployment process by the enterprise (my company is Whyyyyyyyyyy bigger than my area ) but no one uses it. In fact idk anyone in my area who knows HOW to deploy using the official deployment method. Yes, there is privileged access management on the service account. Yes the managers gets notified everytime someone accesses the privileged production account. The managers don't see fixing this as a priority. In fact I think I've only talk to ONE other person in my area who truly understands how terrible it is that we have full production change access on a daily basis. Ive brought this up so many times and so many times nothing has been done and I've tried to get it changed yet nothing has happened and I'm just SO FUCKING SICK that no one sees how big of a deal this. I mean, overall I live the area I work in, I love the people, yet this one glaring deficiency causes me so much fucking stress cause it's so fucking simple to fix.
We even have an newer enterprise deployment. Method leveraging a product called "urban code deploy" (ucd) to deploy a git repository. JUST FUCKING GIT WITH THE PROGRAM!!!!..... IT WAS RELEASED FUCKING 12 YEARS AGO......
Please..... Please..... I just want my otherwise normally awesome team to understand the importance and benefits of version control and approved/revertable deployments2
I am forced to work with a client's notoriously slow SOAP api. Slow in this case is 1.5-2s per request.
The api is structured rather... creatively... at the same time. So we have to bombard it with thousands of requests to build our data base with historical SOAP data. Also the data sometimes is a couple of hours late, giving a flat line (all values at 0) until retroactively fixing the output for the same requests.
So to fill one dev data base with a year's worth of historical data (nice to have when testing a dashboard application) we hammer the api with ~20k requests (~1 million if we want to be thorough).
Best thing about that: There is no staging/test api and the prod api seems not to handle lots of requests at the same time very well...
Latest thought: Maybe we could put a varnish cache in front of the SOAP for testing. Better have wrong data, than nothing at all and we don't kill the prod clients every time we ramp up a new instance.
Also that would dramatically decrease the 4.2 hours of data pumping to about 7 minutes after the first run.
OMG you fucking dick! All I asked if you minded if I created a PR to push your branch to prod because test was down.
You didn’t have to bother the fucking software architects in a group chat with me saying I needed fucking help.
If you wrote the isset function correctly to begin with or even put the JSON file in the correct repo like every other file on the site this would not be a problem.
Fuck contractors! Fuck Assholes!2
Major project, multi million budget, huge business and IT coordination, board level status updates, meeting started back in March 2018 for a Go Live of Aug 2019.
Based on draft requirements (and experience) I request the test environment be built for half of the work. Turns out that no one told Server Eng and they are out of space in both dev and prod until Q2 of 2019. We went from Green to Red because a Service Request.5
Yesterday whole 12 hours we were working on deployment about a feature X that has deadline yesterday itself.
Everything damn perfectly running on Test env but not on Prod.
We made Prod into Dev/Test/Fucking garabage env. Haha.
I was laughing to myself at same time crying hard in my deep heart.
Business guys chasing PM
PM chasing us
And from morning till night we were in same room. Had lunch, and dinner only went out for toilet and to refil water bottles.
And found that feature Y is not working at same time that is related to our feature X. Fucking we have been wasted hours on it.
One of my devs got so fucked up emotionally that he messed up the code (not his fault) he didnt had his lunch and dinner. Had to console him later that its not his fault. Poor guy not sure whether he slept or not; will find out in few hours.
Anyways reported a bug.
But that bug assigned to us for fixing.
Are you fucking kidding me.
Anyways no choice. Had to do it.
Hope today everything goes good or horribly bad. FYI no deployments on Friday damn we are in stalememt till Monday.
Fuck that bug
May be fuck our stupiditiy while makiing mistakes.1
While addressing a Senior Dev's (SD) query from another team.
SD: why is this field mandatory? Can't it be just optional? Any other work around?
Me: Is your code changes already pushed in Devo? In that case, we provide a value which will work since you are not concerned about it.
SD: Yes. It's pushed till production. And, I want to test changes in Prod.
Me: (shared some codes) and explained that this feature for testing is only available in Devo.
SD: I know that. (Shared me a ticket) I want this field to be optional. That's it.
Me: (read the entire ticket. Didn't find anything related their) Told him, I will discuss with team. And meanwhile, for Devo, you can use this value.
Next morning, I accidentally came over some other ticket raised by him only which had the correct doubts regarding request to support this field in production
Now, I don't know why did he share a wrong ticket with me.
And, how will it even help him if that field was even optional.
THAT JUST WONT WORK IN PRODUCTION.
I will discuss with my team and see what can happen.
Its time to remind ourselves the wonderful time of the year, where people look back and decide to be better and learn from your mistakes...
For example never test in prod and wake me up with notifications you dumbfucks... Someone has exams to do..
Seriously, I have seen a lot of push notifications lately, its some kind of Letstestinprod-uary?1
Long post, TLDR: Given a large team building large enterprise apps with many parts (mini-projects/processes), how do you reduce the bus-factor and the # of Brent's (Phoenix Project)?
# The detailed version #
We have a lot of people making changes, building in new processes to support new flows or changes in the requirements and data.
But we also have to support these except when it gets into Production there is little information to quickly understand:
- how it works
- what it does/supposed to do
- what the inputs and dependencies are
So often times, if there's an issue, I have to reverse engineer whatever logic I can find out of a huge mess.
I guess the saying goes: the only people that know how it works is whoever wrote it and God.
I'm a senior dev but i spend a lot of time digging thru source code and PROD issues to figure out why ... is broken and how to maybe fix it.
I think in Agile there's supposed to be artifacts during development but never seen em.
Personally whenever i work on a new project, I write down notes and create design diagrams so i can confirm things and have easy to use references while working.
I don't think anyone else does that. And afterwards, I don't have anywhere to put it/share it. There is no central repo for this stuff other than our Wiki but for the most part, is like a dumping ground. You have to dig for information and hoping there's something useful.
And when people leave, information is lost forever and well... we hire a lot of monkeys... so again I feel a lot of times i m trying to recover information from a corrupted hard drive...
The only way real information is transferred is thru word of mouth, special knowledge transfer sessions.
Ideally I would like anything that goes into PROD to have design docs as well as usage instructions in order for anyone to be able to quickly pick it up as needed but I'm not sure if that's realistic.
Even unit tests don't seem to help much as they just test specific functions but don't give much detail about how a whole process is supposed to work.9
This is an actual transcript...
Since it's way too long for the normal 5000 characters, hence splitting it up...
Infra Guy: mr Dev, could you please give some rational for update of jjb?
Dev: sparse checkout support is missing
Infra Guy: is this support mandatory to achive whatever you trying to do?
Infra Guy: u trying to get set of specific folder for set of specific components?
Infra Guy: bash script with cp or mv will not work for you?
Infra Guy: ?
Dev: when you have already present functionality why reinvent the wheel
Dev: jenkins has support for it
Dev: the jjb is the bottle neck
Infra Guy: getting this functionality onto our infra would have some implications
Dev: why should I write bash script if jenkins allows me to do that
Dev: what implications ??
Infra Guy: will you commit to solve all the issues caused by new jjb?
Dev: you show me the implications first
Infra Guy: like a year ago i have tried to get new jjb <commit_url>
Infra Guy: no, the implications is a grey area
Infra Guy: i cant show all of them and they may hit like in week or eve month
Dev: then why was it not tackled
Dev: and why was it kept like that
Infra Guy: few jobs got broken on something
Dev: it will crop up some time later
Dev: if jobs get broken because of syntax
Dev: then jobs can be fixed
Dev: is it not ???
Infra Guy: ofc
Infra Guy: its just a question who will fix them
Dev: follow the syntax and follow the guidelines
Dev: put up a test server and try and lets see
Dev: you have a dev server
Dev: why not try on that one and see what all jobs fails
Dev: and why they fail
Dev: rather than saying it will fail and who will fix
Dev: let them fail and then lets find why
Dev: I manually define a job
Dev: I get it done
Infra Guy: i dont think we have test server which have the same workload and same attention as our prod
Dev: unless you test how would you know ??
Dev: and just saying that it broke one with a version hence I wont do it
Infra Guy: and im not sure if thats fair for us to deal with implication of upgrading of the major components just cause bash script is not good enough for u
Dev: its pretty bad
Infra Guy: i do agree
Infra TL Guy: Dev, what Infra Guy is saying is that its not possible to upgrade without downtime
Infra Guy: no
Dev: how long a downtime are we looking at ??
Infra Guy: im saying that after this upgrade we will have deal with consequences for long time
Infra Guy-2: No this is not testing the upgrade is the huge effort as we dont have dev resources to handle each job to run
Dev: if your jjb compiles all the yaml without error
Dev: I am not sure what consequences are we talking of
Infra Guy: so you think there will be no consequences, right?
Dev: unless you take the plunge will you know ??
Dev: you have a dev server running at port 9000
Infra Guy: this servers runs nothing
Dev: that is good
Dev: there you can take the risk
Infra Guy: and the fack we have managed to put something onto api doesnt mean it works
Dev: what API ?
Infra Guy: jenkins api
Infra Guy: hmmm
Dev: what have you put on Jenkins API ??
Infra Guy: (
Dev: jjb is a CLI
Infra Guy: ((
Dev: is what I understand
Dev: not a Jenkins API
Infra Guy: (((
Infra Guy: jjb build xmls and push them onto api
Infra Guy: and its doent matter
Dev: so you mean to say upgrading a CLI is goig to upgrade your core jenkisn API
Dev: give me a break
Infra Guy: the matter is that even if have managed to build something and put it onto api
Infra Guy: doesnt mean it will work
Dev: the API consumes the xml file and creates a job
Infra Guy: right
Dev: if it confirms to the options which it understands
Dev: then everything will work
Dev: I am actually not getting your point Infra Guy
Infra Guy: i do agree mr Dev
Dev: we are beating around the bush
Infra Guy: just want to be sure that if this upgrade will break something
Infra Guy: we will have a person who will fix it
Dev: that is what CICD is supposed to let me know with valid reasons
Dev: why can't that upgrade be done
Infra Guy: it can be done
Infra Guy: i even have commit in place3
A while back I was looking for a new job and was given an interview by one company who shall remain nameless. Before the interview, they asked me look through their current site, nothing unusual there, so I started browsing. Then I received an email with all the details I needed to access their production server. Apparently they wanted me to look through the code, unusual but I did so.
First thing all the passwords, including those belonging to members of the public were stored in plain text and many were still the default passwords which were based on the Id so were sequential.
I highlighted these issues at the interview and they then asked me to do a test, not the usual test though, they asked me to add some charts to their prod site. Needless to say that didn’t happen and I got another job elsewhere.1
We're sorry, but something went wrong. We know it works in dev and test and staging, but it just won't fucking work in prod. If you are the application owner check the logs for more information. Note that it won't help you at all in this case - you didn't really think it will be that easy?1
Tfw you find a comment saying something needs to be tested when the system is in prod.
Narrator: they did in fact not test it when the system was in production
This place I’m consulting at just had a new Directory of Change, first policy she made for IT is mandatory 3 working days wait for any release in Test and 2 weeks for Prod. That includes application config value update in Test database. We think she might have misunderstood her job title Ms. Director of No Change..2
Should I be even be testing if no one cares.
I keep asking Devs regarding the functionalities of their system for testing and then, realize it's already on Prod even before I test in beta or enable to test in beta.
Do I exist!!!
Now, for the nth time, I have started to test when things are almost there or already there in Prod.
I should keep myself in adrenaline mode always on from now on.
Need to do something worthy.
The worst way to start a new job.2
- Change this, change that too, oh and that too.
* Ok, will do ( however unlikely it's going to be finished correctly, as you didn't consult me before or listen to me about the impact this might have )
- ( just stfu and do it )
Sometime at or after important cutoff:
- Hey this doesn't show up, is that right
* Yeah, you wanted too many things changed at once + fix it and notice a stupid error like an && switched with a ||, or other models that don't know about change x yet
You know what grinds my gears?
When someone complains about an issue, then bypasses it and ruins your ability to test. Sure, you can recreate it to the best of your ability, but if the issue is very specific, you're screwed (browser, data in prod, etc).
Whats the fucking purpose of our companys dev test and prod env. Dev always only has a single instance. Sometimes clustered services run as cluster on test. Producing headaches because the clustering behaviour couldnt be seen on a single instance and Prod lacks all the nice deployment tools off dev/test. Fuck thinking you could dev then test and prod without any major reconfiguration and headaches. And all because the Storage costs is RETARDEDLY expensive because the backup EVERYTHING with ridiculess overkill. That results in headaches when requesting new servers. Took an old Workstation from the shelves and made it my vm slave so at least i could reliably deploy to test.. Fuck this process
Context: I am leaving my company to work at a data science lab in another one.
My senior dev (with PO hat): we need to gather data from prod to check test coverage. You will like it as you will be data scientist hehehe (actually not funny). You will have to analyze the features, and find relations between them to be able to compare with the existing tests
Me: oh cool, we can use ML to do that!
Him: Nope, we need to di it in the next 3 weeks so we need to do it manually.
Me:... I have quit for something....
One nightmarish project that was doomed from the beginning, had me as the sole developer. I could hardly sleep when we began testing on a separate test system, but with (nearly) all the config stored in shared memory and copied from the production system, I dreaded, half awake, that the production server data base connection was still configured in the test system and that it was shooting all it's test data repeatedly to prod.
Finally drove to company in middle of the night at 4 o'clock. Checked everything was OK, tried to sleep 3 hours before the start of the work day.
This system also had the most hideous memory corruption in some shared memory that was used across several processes and should have been thoroughly protected by a mutex, but somehow, sometimes this crucial map, that was used to speed up the access to all the customer data just contained garbage.
Still haunts me to that day. (Like xkcd's unresolved tension of a non-matching parenthesis - an unresolved bug.
In my initial days as a web developer, i was assigned a task, to implement a cart share functionality in an e commerce company.
I made the functionality and tested on my system.
Result: working good.
Pushed it to beta testing environment.
Resilt: working good.
Pushed to pre production environment.
Result: working good.
Pushed to live site.
Result: 😀 Error in live site..
So a call comes to me from my team lead..
Asks what was the issue...
Me: i dont know either.
After 3-4 hrs:
I found the reason.
My system, beta test env, pre prod env are all having latest php version (5.6 i guess)
But the live server had old version of php.
Me: laughed like anything.
I didn't know that these things would matter in such a great level.
Moral of the story:
Be one with the force (server in this case)2