Join devRant
Do all the things like
++ or -- rants, post your own rants, comment on others' rants and build your customized dev avatar
Sign Up
Pipeless API
From the creators of devRant, Pipeless lets you power real-time personalized recommendations and activity feeds using a simple API
Learn More
Search - "hbase"
-
Me: Brooo, We did it last night!
Friend : Which base did you get to?
Me : Hbase
Friend: Ha, doope!4 -
So on my new position I get to work on Spark jobs. Never had to work with the infamous big data technologies. I never thought this would get SO frustrating for all the wrong reasons.
I'm currently trying to introduce integration tests for some Spark job I wrote. This isn't trivial though, as the data comes from several HBase tables. Mocking everything simply isn't feasible. So why not use the integrated HBaseTestingUtility? With it you can start a mini cluster that runs all nessecary services in the scope of your test.
Sounds great, eh? WRONG. Firstly the used mapr dependencies get in the way. The baked in configuration tries to automatically authenticate with your local cluster through Kerberos. Of course this doesn't work. And of course there is no way to reconfigure this as it happens IN A FUCKING STATIC BLOCK. AHHHH.
Ok. So after calming down I "simply" had to exclude all mapr dependencies and replace them with vanilla ones. After two days of dependency hell it FINALLY works!
...or does it? Well now we need test data. For that we got a map reduce algorithm that can import dumps. Sounds again, great, eh? WROOOONNNG.
The fucking map reduce mini cluster can't start, as it tries to write a symlink. Now take a wild guess what the sys admin here blocked. Yepp. TWO DAYS OF WORK RENDERED USELESS, BECAUSE OF SOME FUCKING SECURITY SETTING.
This is fine.