0. Plan before you code. Document everything. You won't remember either your idea or those clever implementations next week (or next month, or next year...).
1. Don't hack your way through, unless that's what you intend to do. Name your variables, functions etc. neatly: autocomplete exists!
Protip: Sometimes you want to check a quick language feature or a piece of code from one of your modules. Resist the urge to quickly hack the test into your actual project. Maintain a separate file where you can quickly type in and check what you're looking for without hacking on your project (for example, in Python, you can open a new terminal or IDLE window for those quick tests).
2. Keep a quiet environment where you can focus. I recommend listening to something while coding (my latest fad is asoftmurmur.com). Don't let anything distract you and throw your contextual awareness out of whack.
3. Rubber ducks work. Really. Talking out a complex piece of logic, or that regex or SQL query, greatly helps your mind grasp the concept and clear up the idea. Bounce code and ideas off a friend or colleague to catch errors and oversights faster. Read more here: https://en.wikipedia.org/wiki/...
4. Since everyone else is saying this (and because it merits saying), USE VERSION CONTROL. It's the single most important thing in software development aside from planning and documenting.
5. Remember to flout all of the above once in a while and just make a mess of a project where you have fun throwing everything around all over the place. You'll make mistakes you never thought possible for someone of your caliber :) That's how you learn.
Have fun, keep learning!
-
Today, for fun, I wrote prime number generation up to 1000 using a single pure MySQL query.
No pre-created tables, no procedures, no variables. Just pure SQL using derived tables.
So does this mean that pure SQL statements do not have the halting problem?
Putting an EXPLAIN over the query, I could see how MySQL estimated that the total number of calculations would be 1000*1000 before even executing the query itself, and this is amazing ♥️
I have attached a screenshot of the query, and if you are curious, I have also left the plain text below.
PS: this was an SQL problem on HackerRank.
MySQL query:
select group_concat(primeNumber SEPARATOR '&')
from (
  select numberTable.number as primeNumber
  from (
    -- all numbers 1..1000, built by gluing together three digit tables
    select cast((concat(tens, units, hundreds) + 1) as UNSIGNED) as number
    from (select 0 as units union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) unitsTable,
         (select 0 as tens union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) tensTable,
         (select 0 as hundreds union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) hundredsTable
    order by number) numberTable
  inner join (
    -- the same trick again, to build the candidate divisors 1..1000
    select cast((concat(tens, units, hundreds) + 1) as UNSIGNED) as divisor
    from (select 0 as units union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) unitsTable,
         (select 0 as tens union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) tensTable,
         (select 0 as hundreds union select 1 union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9) hundredsTable
    order by divisor) divisorTable
  on (divisorTable.divisor <= numberTable.number and divisorTable.divisor != 1)
  where numberTable.number % divisorTable.divisor = 0
  group by numberTable.number
  having count(*) <= 1  -- a prime's only divisor other than 1 is itself
  order by numberTable.number) resultTable;
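For comparison: on MySQL 8+ the same idea fits in a recursive CTE. This is only a sketch of an alternative, not the HackerRank submission - and it partly answers the halting question above: recursion is capped by cte_max_recursion_depth (default 1000), so even "recursive" SQL is forced to terminate.

WITH RECURSIVE nums AS (
  SELECT 2 AS n
  UNION ALL
  SELECT n + 1 FROM nums WHERE n < 1000  -- 999 steps, under the default cap
)
SELECT GROUP_CONCAT(p.n ORDER BY p.n SEPARATOR '&') AS primes
FROM nums p
WHERE NOT EXISTS (
  -- p is prime if no smaller candidate (all candidates are >= 2) divides it
  SELECT 1 FROM nums d WHERE d.n < p.n AND p.n % d.n = 0
);
-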
EoS1: This is the continuation of my previous rant, "The Ballad of The Six Witchers and The Undocumented Java Tool". Catch the first part here: https://devrant.com/rants/5009817/...
The Undocumented Java Tool, created by Those Who Came Before to fight the great battles of the past, is a swift beast. It reaches systems unknown and impacts many processes, unbeknownst even to said processes' masters. All from within its lair, a foggy Windows Server swamp of moldy data streams and boggy flows.
One of The Six Witchers, the Wild One, scouted ahead to map the input and output data streams of the Unmapped Data Swamp. Accompanied only by his animal familiars, NetCat and WireShark.
Two others, bold and adventurous, raised their decompiling blades against the Undocumented Java Tool beast itself, to uncover its data processing secrets.
Another of the witchers, of dark complexion and smooth speak, followed the data upstream to find where the fuck the limited excel sheets that feed The Beast come from, since its handlers only know that "every other day a new one appears on this shared active directory location". WTF, why do people so often have NPC-levels of unawareness about their own fucking jobs?!?!
The other witchers left to tend to the Burn-Rate Bonfire, for The Sprint is dark and full of terrors, and some bigwigs always manage to shoehorn their whims/unrelated stories into an otherwise lean sprint.
At the dawn of the new year, the witchers reconvened. "The Beast breathes a currency conversion API" - said The Wild One - "And its claws and fangs strike mostly at two independent JIRA clusters, sometimes upserting issues. It uses a company-deprecated API to send emails. We're in deep shit."
"I've found The Source of Fucking Excel Sheets" - said the smooth witcher - "It is The Temple of Cash-Flow, where the priests weave the Tapestry of Transactions. Our Fucking Excel Sheets are but a snapshot of the latest updates on the balance of some billing accounts. I spoke with one of the priestesses, and she told me that The Oracle (DB) would be able to provide us with The Data directly, if we were to learn the way of the ODBC and the Query"
"We stroke at the beast" - said the bold and adventurous witchers, now deserving of the bragging rights to be called The Butchers of Jarfile - "It is actually fewer than twenty classes and modules. Most are API-drivers. And less than 40% of the code is ever even fucking used! We found fucking JIRA API tokens and URIs hard-coded. And it is all synchronous and monolithic - no wonder it takes almost 20 hours to run a single fucking excel sheet".
Together, the witchers figured out that each new billing account were morphed by The Beast into a new JIRA issue, if none was open yet for it. Transactions were used to update the outstanding balance on the issues regarding the billing accounts. The currency conversion API was used too often, and it's purpose was only to give a rough estimate of the total balance in each Jira issue in USD, since each issue could have transactions in several currencies. The Beast would consume the Excel sheet, do some cryptic transformations on it, and for each resulting line access the currency API and upsert a JIRA issue. The secrets of those transformations were still hidden from the witchers. When and why would The Beast send emails, was still a mistery.
As the Witchers Council approached an end and all were armed with knowledge and information, they decided on the next steps.
The Wild Witcher, known in every tavern in the land and by the sea, would create a connector to The Red Port of Redis, where every currency conversion is already updated by other processes and can be quickly retrieved inside the VPC. The Greenhorn Witcher is to follow him and build an offline process to update balances in JIRA issues.
The Butchers of Jarfile were to build The Juggler, an automation that should be able to receive a parquet file with an insertion plan and asynchronously update the JIRA API with scores of concurrent requests.
The Smooth Witcher, proud of his new lead, was to build The Oracle Watch, an order that would guard the Oracle (DB) at the Temple of Cash-Flow and report every qualifying transaction to parquet files in AWS S3. The Data would then be pushed to cross The Event Bridge into The Cluster of Sparks and Storms.
This Witcher Who Writes is to ride the Elephant of Hadoop into The Cluster of Sparks and Storms, to weave the signs of Map and Reduce and with speed and precision transform The Data into The Insertion Plan.
However, how exactly The Data is to be transformed is not yet known.
Will the Witchers be able to build The Data's New Path? Will they figure out the mysterious transformation? Will they discover the Undocumented Java Tool's secrets on notifying customers and aggregating data?
This story is still afoot. Only the future will tell, and I will keep you posted.
-
The more I work with performance, the less I like generated queries (incl. ORM-driven generators).
Like this other team that came to me complaining that some query takes >3 minutes to execute (an OLTP query!) while the HTTP timeout is 60 seconds, so... there's a problem.
Sure enough, a simple EXPLAIN ANALYZE suggests that some UID-PK index is scanned repeatedly, ~1M times (the query plan was generated for 300k expected invocations), each Index Scan lasting 0.15ms. So there you go. Ofc I'd really like to see more decimal zeroes rather than just 0.15, but still...
Rewriting the query with a CTE cut the execution time down to a pathetic 0.04sec (40ms), w/o any loops in the plan.
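To illustrate with made-up tables (the real schema isn't mine to share): the generated query effectively had the first shape below - a correlated lookup the planner turned into one index scan per row - versus the CTE rewrite, which computes the filtered set once and joins it in bulk (PostgreSQL 12+ syntax, where MATERIALIZED pins that behavior):

-- Generated shape: ~1 index scan on the customers UID PK per order row.
SELECT o.id,
       (SELECT c.name FROM customers c WHERE c.uid = o.customer_uid) AS customer_name
FROM orders o
WHERE o.created_at > now() - interval '1 day';

-- CTE rewrite: materialize the filtered set once, then a single bulk join.
WITH recent AS MATERIALIZED (
    SELECT id, customer_uid
    FROM orders
    WHERE created_at > now() - interval '1 day'
)
SELECT r.id, c.name AS customer_name
FROM recent r
JOIN customers c ON c.uid = r.customer_uid;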
I suggest that change to the team and I'm met with a big fat NO - they cannot make any query changes, since they don't have any control over their queries.
....
*sigh*
....
*sigh*
but down to 0.04sec from 3+ minutes....
*sigh*
alright, let's try VACUUM ANALYZE, although I doubt it will be of any help. IDK what I'll do if that doesn't change the execution plan :/ Prolly suggest finding a DBA (which they won't, as the client has no € for a DBA).
All this because developers, the very people who should have COMPLETE control over the product's code, have no control over the SQL.
This sucks!
-
I was just a college fresher who had completed his engineering degree. It was my first week in the office. A system was provided to me, and since it was a support project, I was given direct access to the production database.
Fresher + Production Database + Access to Admin Credentials = Worst Possible Combination
So it was my night shift, and I was told to update a new tariff plan for our client (which was one of the largest telecom services in India).
If someone recharged for more than 200 Rupees, that person would get 10% or 20% extra talk time, applicable only to a particular circle (like Bihar or Rajasthan).
Since I was a fresher, I was told to run a query given by a senior employee, which he shared on a shared folder. Production downtime was in the middle of the night, so that's when I ran the query on the production database.
Query successfully executed. I completed my night shift, went home and slept.
When I woke up, I saw my mobile had 200+ missed calls from different locations across India. They were circle heads of that telecom service provider trying to contact me. I realized something unexpected was waiting for me.
Then at that moment my team lead called and asked me to come to the office right away.
Reminding you, I was a fresher. I was shivering. What had I done?
When I reached the office, I came to know that the query I had run on production had bombed.
Every person who recharged that day (from midnight to 10 AM) got 10 or 20 times more talktime.
The part of the query where the error was made was something like this:
TalkTime = RechargeAmount + RechargeAmount * 10/100; (Bihar)
or
TalkTime = RechargeAmount + RechargeAmount * 20/100; (Rajasthan)
But instead, the query I ran was:
TalkTime = RechargeAmount + RechargeAmount * 10;
or
TalkTime = RechargeAmount + RechargeAmount * 20;
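For scale: dropping the /100 turns a 10% bonus into a 10x multiplier. Even a dry-run SELECT over a couple of sample amounts (hypothetical column names - just the sanity check I should have done) would have caught it instantly:

-- Evaluate both formulas side by side before touching any rows.
SELECT RechargeAmount,
       RechargeAmount + RechargeAmount * 10 / 100 AS intended_talktime, -- 200 -> 220
       RechargeAmount + RechargeAmount * 10       AS what_actually_ran  -- 200 -> 2200
FROM (SELECT 200 AS RechargeAmount UNION ALL SELECT 500) AS sample_amounts;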
In a span of 10 hours, that telecom service lost revenue of 6.5 crore Rupees. Thanks to the recovery team, 6 crore was recovered, but 50 lakh Rupees were still lost.
One small query, and approximately 1 million dollars was at stake.
Aftermath of this incident
My Mistake:
I should have received those queries over email, or there should at least have been email communication regarding this.
Never act on oral communication alone. The senior employee who did this denied it and said he had provided the correct query, and I had no proof of the communication.
I told them it was me who executed that query on production. Since I was a fresher and took responsibility for the incident, my team lead rescued me from that situation.
Lesson Learned:
Always test your query and code multiple times before you execute it or go live on production.
Always have email communication for every action you take on production.
Power comes with responsibility. If you have admin credentials for production, never use them for update/delete/drop until you are sure.
Don't take your job lightly.
I was not fired from that job, but I have learnt my lesson very well.
-
We are researching enhancing our current alerting system (we use Splunk) to be 'smarter' about who is emailed/texted/whatever when there are problems in our applications.
Currently, if there are over 50 errors logged within a 15-minute period, an email/phone/text blast goes out to nearly 100 individuals, ranging from developers and network admins to DBAs and vice presidents.
Our plan is to group errors by team and let each team manage their own applications. Alert on 1 error, 5, 500... we don't care; let the team work out the particulars.
The trick was interfacing with Splunk's API (that's a long rant by itself).
In about a day or so, I was able to use Splunk's webhook feature to notify a WebAPI service I threw together, which sends me an email with details about the underlying data (simulating the kind of alert we would send to a team).
I thought... cool... it worked. Showed it off to the team; most thought it was a good start, except one:
Dev: "The errors are not grouped by team."
Me: "No, I threw the webapi service together to demonstrate how we can extract the splunk bits to get access to the teams"
Dev: "Well...this won't work at all."
Me: "Um..what?"
Dev: "The specification c l e a r l y states the email will be team based. This email was only sent to you and has all the teams and their applications"
Me: "Um...uh...the service can, if we want to go using a service route. Grouping by team name is easy using a LINQ query. I just through this service together yesterday."
Dev: "I don't know. Sounds like I need to schedule a meeting to discuss what you are proposing. I don't think emailing all that to everyone is a good idea."
WTF! Did you not listen to what I said?!!!
Oh well... the dev's proposal is to use Splunk's email notifications and custom Exchange rules with callbacks into Splunk that resend... oh good lord... a fracking Rube Goldberg of a config nightmare...
I suspect we'll go the service route once I finish the service before the meeting.
-
Wow... a talented developer created a "3" via a SQL execution plan. The query runs like crap, but the execution plan art is pretty.
-
I hate the elasticsearch backup api.
From beginning to end it's a painful experience.
I'll try to explain it, but I don't think I'll be able to cover it all.
The core concept is:
- repository (storage for snapshots)
- snapshots (actual backup)
The first design flaw is that every backup in a repository is incremental. ES creates an incremental filesystem tree.
Some reasons why this is a bad idea:
- deletion of (older) backups is slow, as newer backups need to be checked for integrity
- you simply have to trust ES that it does the right thing (given the bugs it has... It seems like a very bad idea TM)
- you have no way of verifying snapshots
Workaround... create many repositories, as each new repository forces a full backup.........
The second thing: ES scales. Many nodes / es instances form a cluster.
Usually backup APIs incorporate these in their design. ES does not.
If an index spans 12 nodes and you use network storage, yes: a maximum of 12 nodes will each open an e.g. NFS connection and start backing up.
It might not sound so bad with 12 nodes and one index...
But it gets pretty bad with 100s of indexes and several dozen nodes...
And there is no real limiting in ES. You can plug a few holes, but all in all, if you don't plan your backups carefully, you'll get pretty f*cked up network congestion.
So traffic shaping must be manually added. Yay...
The last thing is the API itself.
It's a... very fragile thing.
Especially in older ES releases, the documentation is like handing you a flex instead of toilet paper for a wipe.
Documentation != API != Reality.
The fault handling especially left me speechless more than once...
Eg:
/_snapshot/storage/backup
gives you a state PARTIAL
/_snapshot/storage/backup/_status
gives you a state SUCCESS
Why? The first one is blocking and refers to the backup status itself. The second one shouldn't be blocking and refers to the backup operation.
And yes. The backup operation state is SUCCESS, while the backup state might be PARTIAL (i.e., no full backup was made; there were errors).
So we now have an additional API that we query, which then wraps the API of elasticsearch. With all these shiny scary workarounds like polling, since some APIs are blocking, which might lead to a gateway timeout...
Gateway timeout? Yes. Some operations can run a LONG time (multiple hours), and you don't want a ton of open connections hogging resources... so you let the loadbalancer kill them. Most operations simply keep running in ES in the background after the connection was killed.
So much joy and fun, isn't it?
Now add the latest SMR scandal and a few faulty (as in SMR instead of CMR) HDDs in a hundred-terabyte ZFS pool, and you'll get my frustration level.
PS: The cluster has several dozen terabytes and a lot of nodes. If you have good advice, you're welcome - but please think carefully about that fact.
I might have accidentally vaporized people sending me links with solutions that don't work at large scale TM.
-
This is more of an essay than a rant. TLDR at the end. I simply can't choose from all the shitty lecturers I've had, so I'm going to have to go through them one by one. But first, some background: I'm currently in my 7th year of college. I did a multimedia degree in 2 years, then an intro course to Software Dev, and I'm currently in the final year of my Software Dev degree. So let's start.
Intro Software Course
- We had a database module, taught by, I shit you not, the head of the psychology course in the college. She attempted to teach us databases using Access - and not even using SQL, but Access's GUI components and its query builder. Need I say more?
1st Year Software Dev
- We had a networking module. The guy that taught the labs literally didn't say more than 12 words the entire 12-week semester; his answer to any question you asked him was a grunt and "research it".
- We had a psychology module. I have no fucking idea why, but instead of learning something useful we were told to read this or that and get in touch with our feelings...
- Database module. Yes, we actually did SQL here: 12 weeks of select statements and normal forms, delivered by a guy in a monotone voice who sounded like he was contemplating bringing in an assault rifle some day. Also, instead of using MySQL, he decided to use Ingres. Why, I will never know.
2nd Year Software Dev
- We had a module called Algorithms and Data Structures. The lecturer gave us problems she couldn't solve herself. Simple problems. She was also crazy. Absolutely nuts.
- Object Oriented Programming. I had this lecturer for 3 semesters up until 3rd year. This guy did COBOL in college, graduated in the 70s or something and went straight into teaching; he taught us Java for nearly 2 years. He literally copied and pasted text from PDFs and read through it in class. He told myself and another guy at one stage that he really didn't care and was just counting down the days to his retirement.
- Databases again, with a different lecturer from 1st year, who taught us for 2 semesters (24 weeks) and somehow managed to teach us nothing.
3rd Year Software Dev
- Software engineering. This is where the biggest cunt I've ever met was introduced. He arrives to class 15 minutes late every time without fail, talks shit about stuff that has no relevance to the topic at all, tries to turn everything into a rugby metaphor, and every time you ask a question he somehow dodges it and swiftly changes topic. This dickhead's past profession? A Project Manager. Fucking typical. He has also taught me 2 other modules.
4th Year Software Dev
- El cunto mentioned above, for 2 more modules. Need I say more?
- Real time systems. This module took the piss; it was written by the lecturer himself, which is what earns him his space here. We were given assignments that required more time than we had in labs, so we had to work at home. The problem with that is we were using an obscure RTOS called OS9, which would only work on the college computers. When this was brought to the lecturer's attention, he just said "figure it out".
- Internet of Things. There were 2 lecturers, each seemingly working off a different plan: one week you'd have one lecturer, the next week the other would be going on about something completely different and unrelated to anything else we'd done.
Some lecturers didn't even make this list, as I couldn't be bothered thinking back over how shit the other ones were. These were the ones that always stood out in my mind.
My main takeaway from this is that you go to college for the paper that says you have a degree. Learning things that will benefit you in a career is up to yourself.
TLDR; 90% of my college lectures were shit. You need to learn the useful stuff yourself.
-
!rant more advice needed for weather API
I am making a spur-of-the-moment travel app built around a weather API. The problem is that I need to feed in a temperature and humidity for a date and get back a list of countries, but all the APIs I'm looking at work on location-based requests.
So, I plan to solve this using a JavaScript web worker with setInterval to build my own database that I can query.
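For what it's worth, here is roughly what that home-grown store could look like once the worker has collected enough per-location samples - table, columns and thresholds are all made up for illustration:

-- Hypothetical cache filled by the periodic per-location API calls:
CREATE TABLE weather_samples (
    country     VARCHAR(64)  NOT NULL,
    sample_date DATE         NOT NULL,
    temperature DECIMAL(4,1) NOT NULL,  -- degrees Celsius
    humidity    TINYINT      NOT NULL   -- percent
);

-- The app's "reverse" lookup then becomes a plain range query:
SELECT DISTINCT country
FROM weather_samples
WHERE sample_date = '2023-06-01'
  AND temperature BETWEEN 24 AND 28
  AND humidity BETWEEN 40 AND 60;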
Would it be better to use a script external to my API or keep the business logic internal?
Best answer receives my warm thanks.
P.S. @dfox there should be a non-rant section?