Comments

Grumpycat: You are not going to do it with bash. New problem, new stack. Good luck!
Yes, you are a 🧙‍♂️ wizard! Make it happen, captain.

Grumpycat: Does the order matter in the 1.6 million records? If not, split the file up and run in parallel.

b2plane: @Grumpycat Why would splitting the file be faster whatsoever? The file contains over 1.6 million rows of data.

Grumpycat: If you have 10 processes processing 160,000 rows each, they will finish faster than one process processing 1.6 million rows. But the order in the file must not matter.

b2plane: @Grumpycat So I should write a bash script that first splits the main 1.6M-row file into 10 files, copying 1/10 of the data into each, and then processes all 10 files at the same time?

Grumpycat: @b2plane Yeah. It has worked before. Now you are going to hammer the machine, though, and you will find out where the bottlenecks in your code are. What does each row represent?
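
A minimal sketch of that split-and-run-in-parallel idea, assuming a line-oriented input file (called records.txt here purely for illustration) and a hypothetical process_chunk.sh standing in for whatever per-row work the existing script already does:

#!/usr/bin/env bash
# Sketch: cut the big file into 10 pieces and process them concurrently.
set -euo pipefail

INPUT=records.txt   # placeholder name for the 1.6M-row file
CHUNKS=10

# GNU split: --number=l/10 makes 10 chunks without cutting any line in half.
split --number=l/"$CHUNKS" "$INPUT" chunk_

# Start one background job per chunk...
for f in chunk_*; do
    ./process_chunk.sh "$f" > "$f.out" &
done

# ...and wait for all of them to finish.
wait

# Reassemble the results; split's chunk_aa, chunk_ab, ... naming keeps the
# outputs in the original order.
cat chunk_*.out > processed.txt

Because the chunk names sort in the same order as the original file, the final cat puts the rows back in their source order.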

jestdotty: Yeah, parallelizing is what came to mind as well.
Can bash do that? Well, it can't do 2D arrays, so!
I think there's already some terminal tool for running commands in parallel, but I don't remember it.
It's faster because CPUs have multiple cores and an application normally uses just one of them. But you have something like 8-32 cores, so you could just multiply your processing speed by that.
I wondered if there was a way to inspect performance in a bash script, though. Seemed more fun, and not like cheating!
If the order in the file matters you can also just sort it afterwards, maybe with some cleverness involved. I'd assume it takes so long to process because it's doing a bunch of other stuff.
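
The terminal tool jestdotty is thinking of is most likely GNU parallel, or plain xargs with -P. Both run one command per input line across several processes at once; the sketch below reuses the hypothetical chunk files and process_chunk.sh from the earlier sketch:

# xargs: run at most 8 jobs at a time (roughly one per core).
printf '%s\n' chunk_* | xargs -P 8 -I {} ./process_chunk.sh {}

# GNU parallel, if it is installed: one job per core by default;
# --keep-order collects the output in input order, which helps when
# the row order matters.
printf '%s\n' chunk_* | parallel --keep-order ./process_chunk.sh {} > processed.txt

If the row order matters and neither option fits, sorting the combined output afterwards (as suggested above) is the fallback.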

Grumpycat: Bash has join and fork commands, from what I remember (it's been years since I was that involved with bash). I would use another language if you can. Bash will bash your head in.
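
Strictly speaking, bash has no commands named fork or join; the closest built-ins are & (run a command as a background job) and wait (block until it finishes). A small fork/join-style sketch, again with the hypothetical process_chunk.sh:

# "Fork": start each job in the background and remember its PID.
pids=()
for f in chunk_*; do
    ./process_chunk.sh "$f" &
    pids+=("$!")
done

# "Join": wait for each PID and surface any failures.
for pid in "${pids[@]}"; do
    if ! wait "$pid"; then
        echo "job $pid failed" >&2
    fi
done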

b2plane: @Grumpycat I suggested replacing the shit bash scripts with Python automation scripts, but their system, built in 1998, is so old they can't do it. There are also restrictions on installing anything: any third-party Python library gets blocked instantly, and any external download inside their private VM is blocked. I have to request approval for everything. I don't even have VS Code here; I have to use Notepad. Fucking shitty 90-year-old grandpa bankers don't care about their bank having modern technology. All they care about is money. Which is ironic. And even more ironic that they generate trillions of dollars with a system as shitty as this one!

Grumpycat: I sympathize with your plight. I worked for a bank for a couple of years. Like warfare. 99% boredom. 1% terror.

b2plane (the original rant): They increased a single file to have over 1.6 million records of data, and now the processing takes 12 hours to complete. They want me to improve the current bash scripts to decrease this processing time down to a max of 5 hours. Are you serious rn? Do I look like a magic fcking wizard 🧙🏿‍♂️🪄

rant