Join devRant
Do all the things like
++ or -- rants, post your own rants, comment on others' rants and build your customized dev avatar
Sign Up
Pipeless API
From the creators of devRant, Pipeless lets you power real-time personalized recommendations and activity feeds using a simple API
Learn More
Search - "images can go fuck themselves"
-
Ok friends let's try to compile Flownet2 with Torch. It's made by NVIDIA themselves so there won't be any problem at all with dependencies right?????? /s
Let's use Deep Learning AMI with a K80 on AWS, totally updated and ready to go super great always works with everything else.
> CUDA error
> CuDNN version mismatch
> CUDA versions overwrite
> Library paths not updated ever
> Torch 0.4.1 doesn't work so have to go back to Torch 0.4
> Flownet doesn't compile, get bunch of CUDA errors piece of shit code
> online forums have lots of questions and 0 answers
> Decide to skip straight to vid2vid
> More cuda errors
> Can't compile the fucking 2d kernel
> Through some act of God reinstalling cuda and CuDNN, manage to finally compile Flownet2
> Try running
> "Kernel image" error
> excusemewhatthefuck.jpg
> Try without a label map because fuck it the instructions and flags they gave are basically guaranteed not to work, it's fucking Nvidia amirite
> Enormous fucking CUDA error and Torch error, makes no sense, online no one agrees and 0 answers again
> Try again but this time on a clean machine
> Still no go
> Last resort, use the docker image they themselves provided of flownet
> Same fucking error
> While in the process of debugging, realize my training image set is also bound to have bad results because "directly concatenating" images together as they claim in the paper actually has horrible results, and the network doesn't accept 6 channel input no matter what, so the only way to get around this is to make 2 images (3 * 2 = 6 quick maths)
> Fix my training data, fuck Nvidia dude who gave me wrong info
> Try again
> Same fucking errors
> Doesn't give nay helpful information, just spits out a bunch of fucking memory addresses and long function names from the CUDA core
> Try reinstalling and then making a basic torch network, works perfectly fine
> FINALLY.png
> Setup vid2vid and flownet again
> SAME FUCKING ERROR
> Try to build the entire network in tensorflow
> CUDA error
> CuDNN version mismatch
> Doesn't work with TF
> HAVE TO FUCKING DOWNGEADE DRIVERS TOO
> TF doesn't support latest cuda because no one in the ML community can be bothered to support anything other than their own machine
> After setting up everything again, realize have no space left on 75gb machine
> Try torch again, hoping that the entire change will fix things
At this point I'll leave a space so you can try to guess what happened next before seeing the result.
Ready?
3
2
1
> SAME FUCKING ERROR
In conclusion, NVIDIA is a fucking piece of shit that can't make their own libraries compatible with themselves, and can't be fucked to write instructions that actually work.
If anyone has vid2vid working or has gotten around the kernel image error for AWS K80s please throw me a lifeline, in exchange you can have my soul or what little is left of it5 -
Stuck at dealing with a huge amount of images again. 🤦
No idea how quickly I can get this object classification nn up and running, as it seems I have forgotten how to do shit. 🙄😒8