Join devRant
Do all the things like
++ or -- rants, post your own rants, comment on others' rants and build your customized dev avatar
Sign Up
Pipeless API
From the creators of devRant, Pipeless lets you power real-time personalized recommendations and activity feeds using a simple API
Learn More
Search - "data extraction"
-
Spent multiple weeks preparing data for rendering. Part of this process is reprojecting coordinates of huge ass GeoJSON files (some files are 100gb+).
Compared one normal and one reprojected file with the head command and noticed no difference, then found out that this data already comes in the right projection....
Full-speed extraction it is!3 -
you motherfucking cocksucking ass wipes.
How fucking hard is it for you JS cockheads to have STABLE fucking code?
So hear I am, thinking through a side project for data extraction and loading to automate some shitty part of my job, that could be used by the broader team... and decide to use electron.... I know it's a clusterfuck, but this wouldn't be a big application, so against my better judgement I run:
npm install electron
npm start
...
Error: unknown spawn
🤷♂️ you had 1 fucking job... 1 fucking lousy shit stain of a job, and you can't even have something run out of the god foresaken box without someone debugging your shit.
Now who has a WORKING alternative to electron?10 -
Data wrangling is messy
I'm doing the vegetation maps for the game today, maybe rivers if it all goes smoothly.
I could probably do it by hand, but theres something like 60-70 ecoregions to chart,
each with their own species, both fauna and flora. And each has an elevation range its
found at in real life, so I want to use the heightmap to dictate that. Who has time for that? It's a lot of manual work.
And the night prior I'm thinking "oh this will be easy."
yeah, no.
(Also why does Devrant have to mangle my line breaks? -_-)
Laid out the requirements, how I could go about it, and the more I look the more involved
it gets.
So what I think I'll do is automate it. I already automated some of the map extraction, so
I don't see why I shouldn't just go the distance.
Also it means, later on, when I have access to better, higher resolution geographic data, updating it will be a smoother process. And even though I'm only interested in flora at the moment, theres no reason I can't reuse the same system to extract fauna information.
Of course in-game design there are some things you'll want to fudge. When the players are exploring outside the rockies in a mountainous area, maybe I still want to spawn the occasional mountain lion as a mid-tier enemy, even though our survivor might be outside the cats natural habitat. This could even be the prelude to a task you have to do, go take care of a dangerous
creature outside its normal hunting range. And who knows why it is there? Wild fire? Hunted by something *more* dangerous? Poaching? Maybe a nuke plant exploded and drove all the wildlife from an adjoining region?
who knows.
Having the extraction mostly automated goes a long way to updating those lists down the road.
But for now, flora.
For deciding plants and other features of the terrain what I can do is:
* rewrite pixeltile to take file names as input,
* along with a series of colors as a key (which are put into a SET to check each pixel against)
* input each region, one at a time, as the key, and the heightmap as the source image
* output only the region in the heightmap that corresponds to the ecoregion in the key.
* write a function to extract the palette from the outputted heightmap. (is this really needed?)
* arrange colors on the bottom or side of the image by hand, along with (in text) the elevation in feet for reference.
For automating this entire process I can go one step further:
* Do this entire process with the key colors I already snagged by hand, outputting region IDs as the file names.
* setup selenium
* selenium opens a link related to each elevation-map of a specific biome, and saves the text links
(so I dont have to hand-open them)
* I'll save the species and text by hand (assuming elevation data isn't listed)
* once I have a list of species and other details, to save them to csv, or json, or another format
* I save the list of species as csv or json or another format.
* then selenium opens this list, opens wikipedia for each, one at a time, and searches the text for elevation
* selenium saves out the species name (or an "unknown") for the species, and elevation, to a text file, along with the biome ID, and maybe the elevation code (from the heightmap) as a number or a color (probably a number, simplifies changing the heightmap later on)
Having done all this, I can start to assign species types, specific world tiles. The outputs for each region act as reference.
The only problem with the existing biome map (you can see it below, its ugly) is that it has a lot of "inbetween" colors. Theres a few things I can do here. I can treat those as a "mixing" between regions, dictating the chance of one biome's plants or the other's spawning. This seems a little complicated and dependent on a scraped together standard rather than actual data. So I'm thinking instead what I'll do is I'll implement biome transitions in code, which makes more sense, and decouples it from relying on the underlaying data. also prevents species and terrain from generating in say, towns on the borders of region, where certain plants or terrain features would be unnatural. Part of what makes an ecoregion unique is that geography has lead to relative isolation and evolutionary development of each region (usually thanks to mountains, rivers, and large impassible expanses like deserts).
Maybe I'll stuff it all into a giant bson file or maybe sqlite. Don't know yet.
As an entry level programmer I may not know what I'm doing, and I may be supposed to be looking for a job, but that won't stop me from procrastinating.
Data wrangling is fun.1 -
For any webDev in here, is there any open source tools for web data extraction? i've seen imacros and UiPath, maybe you use other tools.5
-
Obviously ai and autodocument recognition and data extraction is not usable yet
Excepting when it's a pdf not a scanned document or image
Ocr may be but shift the whole.image or bend it or remove a border from some white out
And then handwritten -
Need help with selecting a proper backend and website frameworks. After trying out a couple identity verification service providers we were dissapointed with their lack of support (takes weeks to do minimal changes).
So now we are having discussions about building in-house id verification system. We already have libraries for ios/android apps (ZOOM lib for face recognition and another lib for data extraction via OCR from document picture). So what we need is a proper backend and then a decent web framework with proper ux/ui design for our web/ios/android apps.
Currently thinking what kind of backend framework should we choose? Backend's main responsibility is for each client registered from website to assign an api key and to create a database/storage where his users would authenticate via clients app and upload a picture and a video.
Also wondering what kind of framework for website apps (main web app, dashboard app where we display pending verifications, and of course verification app) to choose. Should be go for angular?