Search: "consume the data already"
Had to consume a SOAP webservice which spits out an XML of 5000 lines with ambiguous node names and a shitload of data that needs to be parsed.
Built an ORM model to hold all the data, and I'd already built an XML parser which works like a boss... until now...
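For the curious, the setup was roughly this shape (a minimal sketch, assuming Python with the stdlib's ElementTree and SQLAlchemy; node and table names here are hypothetical stand-ins, the real ones were far more ambiguous):

```python
import xml.etree.ElementTree as ET

from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class Node(Base):
    # Hypothetical table; the real schema mirrors the feed's ambiguous names,
    # which is exactly how you end up debugging the wrong table for hours
    __tablename__ = "node"
    id = Column(Integer, primary_key=True)
    code = Column(String)
    value = Column(String)

def parse(xml_text: str) -> list[Node]:
    root = ET.fromstring(xml_text)
    # Ambiguous node names: several unrelated things are all called <node>
    return [Node(code=n.findtext("code"), value=n.findtext("value"))
            for n in root.iter("node")]

engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)
with Session(engine) as session:
    session.add_all(parse("<root><node><code>A</code><value>1</value></node></root>"))
    session.commit()  # moral of the story: verify WHICH table you're checking
```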
I've been debugging for 3 hours, cursing every God man ever made up, swearing at my screen like a madman... but this particular set of nodes just didn't get saved properly to the DB...
Alright, so my ORM definition is fucked... nope... Alright, so my XmlParser is fucked... nope...
Whaaaaat the fuuuuck...
Oh wait, I've been checking the wrong table for hours....
Hooray for ambiguous tables because I followed the ambiguous structure.
I am going to get drunk now.
EoS1: This is the continuation of my previous rant, "The Ballad of The Six Witchers and The Undocumented Java Tool". Catch the first part here: https://devrant.com/rants/5009817/...
The Undocumented Java Tool, created by Those Who Came Before to fight the great battles of the past, is a swift beast. It reaches systems unknown and impacts many processes, unbeknownst even to said processes' masters. All from within its lair, a foggy Windows Server swamp of moldy data streams and boggy flows.
One of The Six Witchers, the Wild One, scouted ahead to map the input and output data streams of the Unmapped Data Swamp. Accompanied only by his animal familiars, NetCat and WireShark.
Two others, bold and adventurous, raised their decompiling blades against the Undocumented Java Tool beast itself, to uncover its data processing secrets.
Another of the witchers, of dark complexion and smooth speak, followed the data upstream to find where the fuck the limited excel sheets that feed The Beast come from, since its handlers only know that "every other day a new one appears on this shared active directory location". Why the fuck do people so often have NPC-levels of unawareness about their own fucking jobs?!?!
The other witchers left to tend to the Burn-Rate Bonfire, for The Sprint is dark and full of terrors, and some bigwigs always manage to shoehorn their whims/unrelated stories into an otherwise lean sprint.
At the dawn of the new year, the witchers reconvened. "The Beast breathes a currency conversion API" - said The Wild One - "And it's claws and fangs strike mostly at two independent JIRA clusters, sometimes upserting issues. It uses a company-deprecated API to send emails. We're in deep shit."
"I've found The Source of Fucking Excel Sheets" - said the smooth witcher - "It is The Temple of Cash-Flow, where the priests weave the Tapestry of Transactions. Our Fucking Excel Sheets are but a snapshot of the latest updates on the balance of some billing accounts. I spoke with one of the priestesses, and she told me that The Oracle (DB) would be able to provide us with The Data directly, if we were to learn the way of the ODBC and the Query"
"We stroke at the beast" - said the bold and adventurous witchers, now deserving of the bragging rights to be called The Butchers of Jarfile - "It is actually fewer than twenty classes and modules. Most are API-drivers. And less than 40% of the code is ever even fucking used! We found fucking JIRA API tokens and URIs hard-coded. And it is all synchronous and monolithic - no wonder it takes almost 20 hours to run a single fucking excel sheet".
Together, the witchers figured out that each new billing account was morphed by The Beast into a new JIRA issue, if none was open for it yet. Transactions were used to update the outstanding balance on the issues for their billing accounts. The currency conversion API was called far too often, and its only purpose was to give a rough estimate of the total balance of each JIRA issue in USD, since each issue could have transactions in several currencies. The Beast would consume the Excel sheet, do some cryptic transformations on it, and for each resulting line hit the currency API and upsert a JIRA issue. The secrets of those transformations were still hidden from the witchers. When and why The Beast would send emails was still a mystery.
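Pieced together from the Butchers' findings, the Beast's hot loop is roughly this shape (sketched in Python rather than the decompiled Java, with hypothetical URLs and sheet layout), and it shows exactly why one sheet takes almost 20 hours: two blocking HTTP round-trips per line, in sequence:

```python
import openpyxl  # hypothetical; we don't actually know how The Beast reads the sheet
import requests

wb = openpyxl.load_workbook("latest_fucking_excel_sheet.xlsx")
for account, currency, amount in wb.active.iter_rows(min_row=2, values_only=True):
    # The "cryptic transformations" are elided here; nobody knows them yet.
    # One blocking round-trip per line to estimate the balance in USD...
    rate = requests.get(
        "https://fx.example.com/rate",  # hypothetical currency conversion API
        params={"from": currency, "to": "USD"},
    ).json()["rate"]
    # ...and another blocking round-trip per line to upsert the JIRA issue
    # (the search-then-create upsert dance is elided too)
    requests.post(
        "https://jira.example.com/rest/api/2/issue",
        json={"fields": {"summary": f"Balance for {account}",
                         "description": f"{amount * rate:.2f} USD"}},
        headers={"Authorization": "Bearer hard-coded-token-apparently"},
    )
```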
As the Witchers Council approached an end and all were armed with knowledge and information, they decided on the next steps.
The Wild Witcher, known in every tavern in the land and by the sea, would create a connector to The Red Port of Redis, where every currency conversion is already updated by other processes and can be quickly retrieved inside the VPC. The Greenhorn Witcher is to follow him and build an offline process to update balances in JIRA issues.
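The Red Port connector, as I imagine it (a minimal sketch assuming redis-py; the host and the fx:&lt;CCY&gt;:USD key scheme are my guesses):

```python
import redis

# Hypothetical host and key scheme; rates are kept fresh by other processes in the VPC
r = redis.Redis(host="red-port.internal", port=6379, decode_responses=True)

def usd_rate(currency: str) -> float:
    rate = r.get(f"fx:{currency}:USD")
    if rate is None:
        raise KeyError(f"no cached rate for {currency}")
    return float(rate)

print(usd_rate("EUR"))  # no more hammering the external currency API once per line
```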
The Butchers of Jarfile were to build The Juggler, an automation that should be able to receive a parquet file with an insertion plan and asynchronously update the JIRA API with scores of concurrent requests.
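My mental sketch of The Juggler's core (assuming pandas and aiohttp, with a made-up plan layout): read the insertion plan from parquet, then fan out bounded-concurrency upserts:

```python
import asyncio

import aiohttp
import pandas as pd

JIRA_URL = "https://jira.example.com/rest/api/2/issue"  # hypothetical

async def upsert(session: aiohttp.ClientSession, sem: asyncio.Semaphore, row: dict) -> None:
    async with sem:  # bounded, so we juggle the JIRA clusters instead of flattening them
        async with session.post(JIRA_URL, json={"fields": row}) as resp:
            resp.raise_for_status()

async def juggle(plan_path: str, concurrency: int = 50) -> None:
    plan = pd.read_parquet(plan_path)  # one upsert per row of the insertion plan
    sem = asyncio.Semaphore(concurrency)
    async with aiohttp.ClientSession(
        headers={"Authorization": "Bearer token-from-config-not-hard-coded"}
    ) as session:
        await asyncio.gather(*(upsert(session, sem, row)
                               for row in plan.to_dict("records")))

asyncio.run(juggle("insertion_plan.parquet"))
```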
The Smooth Witcher, proud of his new lead, was to build The Oracle Watch, an order that would guard the Oracle (DB) at the Temple of Cash-Flow and report every qualifying transaction to parquet files in AWS S3. The Data would then be pushed to cross The Event Bridge into The Cluster of Sparks and Storms.
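The Oracle Watch, as I picture the plan (assuming python-oracledb, pandas, and boto3; every name is hypothetical): poll for qualifying transactions, dump them to parquet in S3, and let EventBridge carry them onward:

```python
from datetime import datetime, timedelta

import boto3
import oracledb  # python-oracledb; plain ODBC would serve just as well
import pandas as pd

# Hypothetical credentials, schema, and bucket
conn = oracledb.connect(user="witcher", password="***", dsn="temple-of-cashflow")
since = datetime.now() - timedelta(hours=1)

df = pd.read_sql(
    "SELECT * FROM transactions WHERE txn_time > :since",
    conn, params={"since": since},
)
df.to_parquet("/tmp/transactions.parquet")

# The object lands in S3; an EventBridge rule ferries it to The Cluster of Sparks and Storms
boto3.client("s3").upload_file(
    "/tmp/transactions.parquet", "cashflow-raw", "transactions/latest.parquet"
)
```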
This Witcher Who Writes is to ride the Elephant of Hadoop into The Cluster of Sparks and Storms, to weave the signs of Map and Reduce and with speed and precision transform The Data into The Insertion Plan.
However, how exactly The Data is to be transformed is not yet known.
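Still, the weaving might look something like this in PySpark, with hypothetical columns and an invented aggregation standing in for the unknown transformation:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("weave-the-insertion-plan").getOrCreate()

# Hypothetical paths and columns; the real transformation is still unknown
txns = spark.read.parquet("s3://cashflow-raw/transactions/")
rates = spark.read.parquet("s3://cashflow-raw/fx-rates/")  # periodic dump from the Red Port

plan = (txns
        .join(rates, "currency")      # Map: enrich each transaction with a USD rate
        .withColumn("amount_usd", F.col("amount") * F.col("usd_rate"))
        .groupBy("billing_account")   # Reduce: one row per account, i.e. per JIRA issue
        .agg(F.sum("amount_usd").alias("balance_usd")))

plan.write.mode("overwrite").parquet("s3://cashflow-plans/insertion_plan/")
```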
Will the Witchers be able to build The Data's New Path? Will they figure out the mysterious transformation? Will they discover the Undocumented Java Tool's secrets on notifying customers and aggregating data?
This story is still afoot. Only the future will tell, and I will keep you posted.
DEAR NON TECHNICAL 'IT' PERSON, JUST CONSUME THE FUCKING DATA!!!!
Continuation of this:
https://devrant.com/rants/3319553/...
So essentially my theory was correct: their concern about the data not being up to date is almost certainly because... the spreadsheet is old, not the data. But I'm up against this wall of a god damn "IT PERSON" who has no technical or logic skills, and for some reason this person doesn't think "man, I'm confused, I should talk to my other IT people"; instead they just eat my time with vague and weird requests that they express with NO PRECISION WHATSOEVER, plus arbitrary hold-ups and so on.
Like it's pretty damn obvious your spreadsheet was likely created before you got the latest update; it's not a mystery how this might happen. But god damn, I tell them to tell me or go find out when the spreadsheet was generated, and nothing happens.
Meanwhile their other IT people 'cleaned the database' and now a bunch of records are missing and they want me to just rando update a list of records. Like wtf is 'clean the database' all about!?!?!?
I'm all "hey how about I send you all records between these dates and now we're sure you've got all the records you need up to date and I'll send you my usual updates a couple times a day using the usual parameters".
But this customer is all "oh man that's a lot of records", what even is that?
It's like maybe 10k fucking records at most. Are you loading this in MS Access or something (I really don't know MS Access limits, just picking an old weird system) and it's choking??!?! Just fucking take the data and stick it in the damn database, how much trouble can it be?!!?!?
Side theory: I kinda wonder if, after they put it in the DB, every time someone wants the data they have some API on their end that just says "HERE'S ALL THE FUCKING DATA" and their client application chokes, and that's why there's a concern about database size with these guys.
I also wonder if their whole 'it's out of date' shit is actually them not updating records properly and they're sort of grooming the DB size to manage all these bad choices....
Having said all that, it makes a lot more sense to me how we get our customers. Like, we do a lot of "customer sends us their data and we feed it back to them after doing surprisingly basic stuff to it"... like guys, your own tools do th---- wait, never mind....
We ended up finding ourselves with a bunch of tables that have mostly the same columns but differ by a few. Every time we consume a REST API, we store the `access_token`s, expiration dates, and the other OAuth data. However, each provider has slightly different requirements: for email APIs we store email addresses, other providers require us to store some additional information, etc. I'm tempted by the flexibility and lack of schema brought by document databases, but not enough to use one, since they're generally slower and we already have everything in SQL. So I got the idea of using JSON columns to alleviate this issue: have a single table for all REST integrations (be it Outlook or Facebook), and store the unique integration data inside of this JSON column for "additional data". This data is mostly just read, not filtered by (but occasionally so). Has anyone had experience with this? How's the performance of JSON fields? Is this a good practice, or will it get harder with more integrations?
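To make the question concrete, this is the shape I'm considering (a minimal sketch assuming PostgreSQL through SQLAlchemy; all names hypothetical): the shared OAuth columns stay relational, and the per-provider extras go into a single JSONB column with a GIN index for the occasional filter:

```python
from sqlalchemy import Column, DateTime, Index, Integer, String
from sqlalchemy.dialects.postgresql import JSONB
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Integration(Base):
    __tablename__ = "rest_integration"
    id = Column(Integer, primary_key=True)
    provider = Column(String, nullable=False)      # "outlook", "facebook", ...
    access_token = Column(String, nullable=False)  # shared OAuth columns stay relational
    refresh_token = Column(String)
    expires_at = Column(DateTime, nullable=False)
    extra = Column(JSONB, nullable=False, default=dict)  # provider-specific leftovers

    # GIN index so the occasional filter on a JSON key doesn't table-scan
    __table_args__ = (Index("ix_rest_integration_extra", "extra", postgresql_using="gin"),)

# The occasional filter, when it happens:
# session.query(Integration).filter(Integration.extra["email"].astext == "dev@example.com")
```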