IntrusionCM · 3765 · 223d
When everything is confusing, it might be lupus...
Aka: use differential diagnosis, Dr. House style.
Start with one failing test, examine it, write down every fucking thing it does, then try to figure out what changes and why, based on the data you have.
From experience, it sounds like something small with a lot of influence...
If it were me, I'd start by examining the environment... Depending on what the test does, maybe a sysctl / rlimit setting is different.
Then climb up the ladder - Timezone / Encoding / Software Versions....
Everything the test suite is influenced by. Usually it's a Linux bash environment like Jenkins, so I go through everything from exports to locales to whatever else I find.
Then the services and their configuration, everything the test touches.
Last but not least, the test itself.
Finally I cross-reference everything and double-check that I haven't missed anything.
It's tedious but amazing what you learn on the way.
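For the environment part, I'd capture a snapshot on the machine where the suite passes and on the one where it fails, then diff the two files. Rough Python sketch (what you capture and the file name are up to you; the resource module is Unix-only):

```python
import json
import locale
import os
import platform
import resource
import sys


def snapshot(path="env_snapshot.json"):
    """Dump the bits of the environment that commonly differ between
    a dev machine and a CI runner, so two snapshots can be diffed."""
    data = {
        "python": sys.version,
        "platform": platform.platform(),
        "timezone": os.environ.get("TZ"),
        "lang": os.environ.get("LANG"),
        "lc_all": os.environ.get("LC_ALL"),
        "locale": locale.getlocale(),
        "nofile_rlimit": resource.getrlimit(resource.RLIMIT_NOFILE),
        "env": dict(os.environ),  # full export list, diff with care
    }
    with open(path, "w") as f:
        json.dump(data, f, indent=2, sort_keys=True, default=str)


if __name__ == "__main__":
    snapshot()
```

Run it once locally, once on the runner, and diff the JSON; the odd one out is usually visible pretty quickly.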
NeatNerdPrime · 2442 · 223d
@IntrusionCM what this guy says.
I've encountered such issues when dependencies get mashed up. A development codebase written in, let's say, Ruby 2.5 with its gems then goes through the pipeline on Ruby 2.4 with some older gems, and so on.
In a Java project, a dependency for a library somehow got hardcoded into a META-INF file while the Maven project declared a different one. With parallel builds you got a race condition where whichever lib loaded first got used, and that caused varying results in code coverage and consequently failed builds.
On another occasion I had to check for system settings influencing the database engine, which can fuck up a lot.
There are a lot of areas, but in my experience such randomness is often due to a combination of mismatched deps and the nature of multithreaded building and testing.
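One cheap sanity check for the deps part (a rough Python sketch; the same idea applies to gems or Maven artifacts): dump exactly which package versions each environment actually resolved, then diff the two dumps.

```python
import json
from importlib import metadata


def freeze(path="deps.json"):
    """Record every installed package and its version so a dev box
    and a CI runner can be compared with a plain diff."""
    versions = {
        dist.metadata["Name"]: dist.version
        for dist in metadata.distributions()
    }
    with open(path, "w") as f:
        json.dump(versions, f, indent=2, sort_keys=True)


if __name__ == "__main__":
    freeze()
```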
Voxera · 8402 · 223d
If you consistently get the same number of failing tests across the different ways you run them, you have some environment issue.
Either that's a good indication of code that fails in some environments, or it's tests that should be better isolated from the environment.
If the number of failing tests differs, you have an unstable test, whether from the environment, race conditions, or random data.
For example, if you hit the database, things can be affected there; if anything is time- or date-based, make sure you either use a mocked date provider or generate test cases based on "now".
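A minimal sketch of what I mean by a mocked date provider (Python here, and the names are made up): business code takes a clock as a dependency instead of calling now() itself, and tests pin the clock to a fixed instant.

```python
from datetime import datetime, timezone


class Clock:
    """Production clock: always returns the real current time."""
    def now(self) -> datetime:
        return datetime.now(timezone.utc)


class FixedClock(Clock):
    """Test clock: pinned to a known instant so date-based logic is stable."""
    def __init__(self, fixed: datetime):
        self._fixed = fixed

    def now(self) -> datetime:
        return self._fixed


def is_expired(expires_at: datetime, clock: Clock) -> bool:
    # The code under test never reaches for datetime.now() directly.
    return clock.now() >= expires_at


# In a test: the result no longer depends on when the suite runs.
clock = FixedClock(datetime(2020, 1, 1, tzinfo=timezone.utc))
assert not is_expired(datetime(2020, 6, 1, tzinfo=timezone.utc), clock)
```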
We have a similar problem with some manual tests that depend on data in a third-party system, which can fail the test if the third-party data is out of bounds for the test :/
Ideally you have full control over all environment-related info for a test.