Are there some good tools to analyze a big dataset of JSON files? I mean, I could normalize the dataset into a SQL database, but are there any secret weapons to make life simpler?

question
Comments
IntrusionCM: Your statement is so vague...
What does "big" mean? Mega, giga, tera, exa?
What does "analyze" mean?
If it's just a one-time operation like an aggregation, Python with multithreading should be fine.
https://pypi.org/project/ijson/
Iterative JSON parsing is the way to go if the data doesn't fit in memory.
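Something like this, for example (a minimal sketch; events.json and its numeric "value" field are made-up placeholders):

```python
# Stream one element of a huge top-level JSON array at a time with ijson,
# so the whole file never has to fit in memory.
import ijson

total = 0.0
count = 0
with open("events.json", "rb") as f:
    # The "item" prefix yields each element of the top-level array.
    for obj in ijson.items(f, "item"):
        total += float(obj["value"])
        count += 1

mean = total / count if count else 0.0
print(f"count={count} mean={mean:.2f}")
```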
If you'll parse it more than once, think about creating an intermediate representation. Depending on the analysis, this might be e.g. a per-day representation of the values, which then lets you easily sum them up over various units of time (weekly / monthly / yearly, etc.).
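As a sketch of that idea (the record layout, an ISO "timestamp" plus a numeric "value", is an assumption):

```python
# Collapse raw records into small per-day totals once; re-aggregating the
# daily totals into monthly (or weekly/yearly) views is then cheap.
from collections import defaultdict
from datetime import datetime

def to_daily(records):
    daily = defaultdict(float)
    for rec in records:
        day = datetime.fromisoformat(rec["timestamp"]).date()
        daily[day] += rec["value"]
    return daily

def to_monthly(daily):
    monthly = defaultdict(float)
    for day, value in daily.items():
        monthly[(day.year, day.month)] += value
    return monthly

daily = to_daily([
    {"timestamp": "2021-03-01T10:00:00", "value": 2.5},
    {"timestamp": "2021-03-01T14:30:00", "value": 1.5},
    {"timestamp": "2021-04-02T09:00:00", "value": 4.0},
])
print(dict(to_monthly(daily)))  # {(2021, 3): 4.0, (2021, 4): 4.0}
```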
Newline-delimited JSON can simplify things greatly; a database might not be needed.
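With one JSON object per line, plain line-by-line iteration plus json.loads is enough to filter the whole dataset (data.ndjson and the "status" field are invented for the example):

```python
# Scan newline-delimited JSON without ever parsing the whole file at once.
import json

errors = 0
with open("data.ndjson", "r", encoding="utf-8") as f:
    for line in f:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if json.loads(line).get("status") == "error":
            errors += 1

print(errors, "error records")
```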
j0n4s5434: @IntrusionCM It is not *that* big. I think I'll learn jq a bit more (I've already wanted to do that a few times; maybe now is the time). When that doesn't work out for me, I'll look at ijson (thanks), as it seems better than anything else I've tried, and when that still doesn't satisfy me, I'll probably do some SQL stuff.
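If it ever comes to the SQL fallback, SQLite's JSON1 functions can query raw JSON without designing a schema first. A rough sketch, assuming newline-delimited input and that the bundled SQLite was built with JSON1 (true for most modern Python builds); docs.ndjson and the $.user.name path are invented for the example:

```python
# Load raw JSON lines into SQLite and query fields with json_extract().
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE docs (body TEXT)")

with open("docs.ndjson", "r", encoding="utf-8") as f:
    con.executemany(
        "INSERT INTO docs (body) VALUES (?)",
        ((line.strip(),) for line in f if line.strip()),
    )

# json_extract() pulls fields out of the stored JSON at query time.
query = (
    "SELECT json_extract(body, '$.user.name') AS name, COUNT(*) AS n "
    "FROM docs GROUP BY name ORDER BY n DESC LIMIT 5"
)
for name, n in con.execute(query):
    print(name, n)
```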