12

Turns out we doing SQL in CS now.
Not gonna lie it's...surprisingly easy(?).

Comments
  • 6
    it can get pretty complex to write proper SQL and to properly administer databases tho.

    A good dba makes bank, i know that our does. So if this interests you it can be lucrative as shit man.
  • 1
    Hehehehehe
  • 6
    It's easy, until it's unwieldy not.

    Small crud projects, it's a breeze.
    Once you start dipping into many tables being joined and dealing with 1m+ records, it gets out of hand quite quickly. You're not only having to find all the data you need for a query, but dealing with query optimisation to extract as much data as need.

    But,

    select x,y,z from some_table where b in (1,2,3,4,5);

    Is simple 😅
  • 1
    Yeah you can say it’s easy. I think it gets a little bit trickier when you integrate it with a programming language. Also, you can google best practices and look at the things that you’ve been doing that might be wrong or inefficient. Look at how to make your apps/sql queries more secure
  • 4
    tl;dr: It’s delightfully simple until it starts getting frightfully complex and is slow as beans for absolutely no apparent reason.

    The difficulties are subtle, and while they are still present even when dealing with simple queries, they don’t rear their heads unless you’re at scale — and they scale exponentially.

    Joins are easy. Multiple joins get increasingly difficult both to write and in execution time, especially when nested. (And there’s problems there too)

    Selects are simple, but can get difficult when working with multiple tables, sub selects, temp tables, etc.

    Groups, orders, etc. can also be troublesome in complex queries.

    But the real problem is always the sheer scope of what you’re asking the poor database to do. Especially with joins and orders, as these can exponentially increase the complexity of your query, and therefore execution time. For example, it is trivial to perform a full table scan when you don’t need to. Or accidentally create a temporary table when you don’t need to. Or forget to index a column. Or do an expensive comparison (and/or expensive format) instead of a simple one. Or ... you get the idea.

    But the thing is, you don’t really see these issues on small tables/datasets. Databases are so ridiculously optimized that you won’t be able to tell until things become a problem. And by then, your problems will likely be pretty widespread, requiring query rewrites and maybe even data fixes. Like anything, there’s a lot of skill you can acquire here, and best practices to follow.

    Also, and somewhat unrelated: altering tables locks them for the duration unless you do some special magic. And even (especially) if you do, it can take a very very long time on large tables. At work for example, adding a column takes about twenty minutes on some of our smaller tables, and multiple hours on larger ones. (Look into gh-ost if you’re experiencing this; takes much longer, but no locking!)
  • 4
    SQL is very powerful, It is almost always possible that multiple queries can be reduced to a single query, one who understands it's power will write SQL queries which look like a piece of art.
  • 1
    I hate writing sql. It’s not fun
  • 4
    SQL is easy until you have 18 joins with custom conditions for each. Haven forbid you want to have it grouped nicely. Then your gonna have a nice long talk with the sql interpreter that won't let you run your query under 10 min with indexes on all used fields.
  • 1
    @C0D4 Not when

    select x,y,z from some_table where b in (1,2,3,4,5);

    Becomes

    select x,y,z from some_table where b in (<more than 1000 values>).

    Discovered it the hard way. new client on the palteform (5-6 years ago, we were just starting) with way more data than I anticipated.

    I'm still looking for a good solution to that BTW.

    My currenmt solution is to take these 1000 values, BulkInsert them into a tanle (which is cleared after execution) and do a join.

    I'll take a betetr way any day.
  • 2
    @NoToJavaScript Nope.

    That _is_ the right way.

    You could use temporary tables, but these are connection bound.

    Alternative would be server or client side prepared statements and iteration, selecting in chunks - which is clunky and depending on what you do, not the right thing.

    I think the most fucked up thing I've seen was someone reading data from MySQL, creating a SQLITE database, and then used this as a primary solution for temporary tables.

    (Insane... Yeeeeessss!)
  • 0
    @IntrusionCM
    Hmmm this way is slower (at least with current setup on Azure SQL) than my wrong way.

    I’ll do some more testing.

    Note : the query is prepared by EF.

    Side note :

    For prod we use Azure SQL
    For dev we use a MSSQL server on a linux VM hosted in Azure.

    There are so many things working differently between two, I’m astonished every day.

    Some querry will execute (on the SAME db, dev server gives a copy of prod every night) in 25 ms on dev server. And take 15 SECONDS on azure

    And the querry which timeout at 30 seconds on dev, will be around 150 ms on Azure sql.
  • 2
    @NoToJavaScript

    The thing is: A DB has more knobs to configure then a teenage boy has pimples... MySQL has I think over 500 variables

    Usually every database can create temporary tables of any size as temporary means in RAM.

    Combined with a multithreaded database execution engine this spells desaster.

    As such, most DBs limit the size a temporary table can have in RAM and write it to HD as soon as it exceeds the size.

    That's the most common error case I'm aware of.

    Usually it's available in the detailed analysis of the query optimizer.

    But I've never dealt much with MSSQL.... Was always the ugly thing I never wanted to touch even If being paid to do so.
Add Comment