rutee07 (42d): There's no convincing, it has a different use.
NoSQL... I hate that term.
Key Value Store. Document Store.
A term that actually defines what you do - that's okay.
Whenever people use the term NoSQL, they get a free death stare with an explanation that intelligent people shouldn't use the hyped brainfart buzzwords.
It's a completely different use, you shouldn't think of it as replacing one with another, but instead using each, where it makes sense, in a lot of cases both.
Consider a very common use case, the CQRS principle.
You keep a relational database for consistency on the backend, but you also maintain a denormalized document store like MongoDB to retrieve views *fast*... No joins, no code iteration, no mapping, you just maintain entire views. That can get you from 500+ms to below 15ms response times for thousands of users....
A well designed nosql database can significantly support your service...
Another example is full-text search over *a lot* of *large* data... While SQL databases can do it, something like Elasticsearch with Lucene can perform several times faster if you need to, say, find all documents containing some keywords. Ofc you can bolt Lucene onto something like Postgres too though... Again, it comes down to your priorities.. Fast? Available? Consistent?
Hard to get everything covered in a devrant comment section...
The short of it is: it's a tool. Learn to use it and you'll be able to design products that better match their requirements
(Or you pick wrong and complicate everything xD)
It has its own downsides...
you can boil it down to this (very oversimplified)
SQL: Fast Writes, Slow read
NoSQL: Slow Writes, Fast Read
in CQRS you join the two to achieve fast writes and fast reads, but at the cost of eventual consistency
It's definitely an overhead for the programmer to manage these databases...
In return you get massive throughput... say you have a company system, and you need to return the issued devices, for all employees, that fall under some department
so you join employees with departments and devices, return the fields you need, your ORM then maps it into some sort of class and you end up returning a json... that takes time to build up! sure it's in milliseconds when you start out, but it grows fast
with nosql, you look at the employees collection, filter employee.dep == searchDep (and if dep is indexed this is pretty much instant) and then join their employee.devices into a list and you're done. the output is already a json doc
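To make that lookup a bit more concrete, here's a minimal sketch in plain Python. It is not how any real document store works internally; the `employees` list, the `dep_index` dict and the `issued_devices` function are all made up for illustration, and the "index" is just a dict from department to document positions.

```python
# Hypothetical "employees" collection: each document already embeds its devices.
employees = [
    {"name": "Ann", "dep": "R&D", "devices": ["laptop-17", "phone-03"]},
    {"name": "Bob", "dep": "Sales", "devices": ["laptop-21"]},
    {"name": "Cid", "dep": "R&D", "devices": ["tablet-08"]},
]

# Toy index on "dep": department -> positions of the matching documents.
dep_index = {}
for pos, doc in enumerate(employees):
    dep_index.setdefault(doc["dep"], []).append(pos)


def issued_devices(search_dep):
    """Return all devices issued to employees of one department."""
    devices = []
    for pos in dep_index.get(search_dep, []):
        devices.extend(employees[pos]["devices"])
    return devices


print(issued_devices("R&D"))
```

The point is only that there is no join and no mapping step: the devices already live inside each employee document, so the query is an index lookup plus a list concatenation.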
all of that has little effect in a small-medium system...
but the closer you get to an enterprise system, let alone something like Google, Facebook or Netflix, the more you rely on shaving off these times... suddenly with a nosql query service you can serve billions instead of millions because each of your threads only spends <5ms serving a person...
another simplified way to think of it is that you can use a nosql db as a "cache" of a sort (and that's what redis actually does!, it's all in-memory there)
instead of building responses, you have them prepared and then you just shoot them at the person you're serving... ofc you need to maintain them to be up to date!
but that's just one specific use, the one I have most experience with at least
@Hazarth I didn't insult you.
But I think it's wrong for many reasons to oversimplify - especially when oversimplifying and delivering a whole lot of conclusions without any explanation.
And most of your conclusions are wrong.
CQRS is a good example. You should have stopped right at that point.
And as I said above... Don't use the term NoSQL. It's completely misleading.
well just saying "you're wrong and ignorant" without any value added is kinda implying insult, but fine, I have no issues there...
but you're also implying you know a lot more about this topic, you just won't share?
This is the type of attitude you'd get at SO and it closes all further conversation...
I'll love it if you correct me, I don't want to be ignorant... this is just the level of understanding I so far got from projects I worked on (under other people) in the past... it was enough to get me enthusiastic about the topic though!
so enlighten us, please, or do you want everyone to get stuck with my misguided views? you'll have to deal with even more people who think the same then :)
@Hazarth Ok. Will be dry, but I'll try.
The thing that is very striking is that most people try to fit everything into one box.
I can take a SQL database like MySQL, disable ACID completely (eg. reconfigure syncing, disable foreign keys etc.).
Is this still a RDBMS? Yes.
Do I still use SQL? Yes.
Does it do something completely different from what most people expect? Yes.
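A small sketch of the "still SQL, but behaves differently" point, using SQLite instead of MySQL because it ships with Python. The helper name `orphan_insert_allowed` and the two tables are invented for the example; `PRAGMA foreign_keys` is a real SQLite setting. With enforcement off, the same SQL happily accepts a row that points at nothing.

```python
import sqlite3


def orphan_insert_allowed(enforce_foreign_keys):
    """Try to insert a device that references a non-existent employee."""
    db = sqlite3.connect(":memory:")
    # Same engine, same SQL; only this one setting changes.
    db.execute(f"PRAGMA foreign_keys = {'ON' if enforce_foreign_keys else 'OFF'}")
    db.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY)")
    db.execute(
        "CREATE TABLE devices (employee_id INTEGER REFERENCES employees(id))"
    )
    try:
        db.execute("INSERT INTO devices VALUES (42)")  # employee 42 does not exist
        return True
    except sqlite3.IntegrityError:
        return False
    finally:
        db.close()


print(orphan_insert_allowed(True))
print(orphan_insert_allowed(False))
```

Sync behaviour can be loosened in a similar way (SQLite has `PRAGMA synchronous`), which is the other knob mentioned above.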
You did try to describe what you do and what you don't do. Just add an "I'd expect" or "Usually" or "generally" - cause what the system does is completely up to you.
You can put a terabyte of RAM into a system and let a relational database run on it.
It's an in memory database.... Isn't it?
The term NoSQL does the same. I think it was invented by someone who was fed up with SQL and thought he would be insanely clever.
Except that NoSQL is SQL: SQL is based upon the foundation of set theory. A key value store is based upon the foundation of set theory - too. It's just more specific. A document store is the same.
The thing about... let's use the general term "databases" is the question of what to store and how.
If you have a key value structure, use a key value store.
If you have a document, use a document store.
If you can model relations and do it in a relational way, use relational databases.
The terms "SQL" and "NoSQL" are pretty wishy-washy nowadays and don't describe well what the systems actually do.
I can store JSON in an RDBMS - this doesn't make it a document store, although I could use it like one.
I can do an associative table in an RDBMS - this doesn't make it a key value store, although I could use it like one (hint: there is a _large_ difference behind the curtain ;) )
REDIS eg. started as a key value store.
It then added more complex data types cause people started using it for several other things. ;)
Non-relational stores exist in several forms, eg. Scylla (wide-column), MongoDB (document), RocksDB (embedded key value) ... Each with a specific background and internals. Don't put everything in a blender, please.
you're prolly right, NoSql was probably named like that out of spite, I wouldn't be surprised, it sounds like it but I see what you mean, technically an apple is also "NoSQL" because it literally is not SQL, I get that, though when the word is already used as a standard, it becomes useful in itself because language keeps evolving, often through misuse and misunderstanding but what can you do? you know how many managers call containers "docker" still?
though there's one thing I'm unsure about here: are document store databases really based on set theory? there are no unions or intersections, it doesn't allow joins or relations... sets are usually normalized I believe and document stores violate that by design... but I might be wrong? I didn't read anything on set theory used in sql, I just have surface level knowledge from uni and such...
(so IntrusionCM doesn't tear my head off, I'll word this carefully)
what *we* did, and I'm *not sure* if it was the right approach but that's how it was designed by our engineers and it worked pretty well
we had two backend services, one was for receiving commands and other was for serving queries
the command service was backed by Postgres and the query service was backed by MongoDB
when a user did an action, it went into the command service
if the user requested data, it went to the query service
the communication was done using a messaging queue, we used Kafka for this. every time a command was received, the data was saved in the relational database and also sent as a message into Kafka with the minimal data needed to construct the view on the query side, where the query service listened for messages, took the data, and built/updated the documents as necessary.
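Roughly, that flow can be sketched like this. It's a toy, single-process stand-in and not the actual system: in-memory SQLite plays the Postgres role, `queue.Queue` plays Kafka, a plain dict plays MongoDB, and every function and table name here is made up for the example.

```python
import json
import queue
import sqlite3

# Write side: relational store (in-memory SQLite stands in for Postgres).
write_db = sqlite3.connect(":memory:")
write_db.execute("CREATE TABLE devices (employee TEXT, device TEXT)")

# Message queue (queue.Queue stands in for Kafka).
events = queue.Queue()

# Read side: prebuilt views keyed by employee (a dict stands in for MongoDB).
views = {}


def handle_command(employee, device):
    """Command service: persist the change, then publish a minimal event."""
    with write_db:
        write_db.execute("INSERT INTO devices VALUES (?, ?)", (employee, device))
    events.put({"employee": employee, "device": device})


def project_pending_events():
    """Query-side consumer: apply queued events to the denormalized views."""
    while not events.empty():
        event = events.get()
        view = views.setdefault(
            event["employee"], {"employee": event["employee"], "devices": []}
        )
        view["devices"].append(event["device"])


def handle_query(employee):
    """Query service: return the ready-made view, no joins or mapping."""
    return json.dumps(views.get(employee))


handle_command("ann", "laptop-17")
print(handle_query("ann"))  # the view has not caught up yet
project_pending_events()
print(handle_query("ann"))  # now the prepared document is there
```

The gap between the two `handle_query` calls is the eventual consistency part: the write is already safe in the relational store, but the read side only sees it once the event has been consumed.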
I do suggest you read more on CQRS, it's fun
It's a challenge, but in my experience most people (devs, managers, even DBAs) don't have a fcking clue what they're talking about...
For me it became a global red flag. Someone talking about NoSQL without any specifics usually means having fun in several meetings to try and figure out what they really want vs what they actually cry for "coz it's cool man!".
(and we all know how dreadful this is -.-)
Document storage per se has less to do with set theory.
Buuuut.... this is kind of a play on words.
Cause the way the document is stored and indexed on disk has a lot to do with set theory.
In many document stores and search engines (the Lucene-based ones, for example), an inverted index is used... Which can be expressed by set theory and which has a lot of mathematical background I don't understand ;)
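For anyone who hasn't seen one, a tiny inverted index looks roughly like the sketch below. It's a deliberately naive version (whitespace tokenizing, no ranking), and `docs`, `index`, `search_all` and `search_any` are names invented for the example. It does show why set operations come up: an AND query is an intersection and an OR query is a union.

```python
docs = {
    1: "postgres handles joins well",
    2: "mongodb stores json documents",
    3: "postgres can store json too",
}

# Inverted index: term -> set of ids of the documents containing that term.
index = {}
for doc_id, text in docs.items():
    for term in text.split():
        index.setdefault(term, set()).add(doc_id)


def search_all(*terms):
    """Documents containing every term: an intersection of sets."""
    sets = [index.get(term, set()) for term in terms]
    return set.intersection(*sets) if sets else set()


def search_any(*terms):
    """Documents containing at least one term: a union of sets."""
    return set().union(*(index.get(term, set()) for term in terms))


print(search_all("postgres", "json"))
print(search_any("joins", "documents"))
```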
@Hazarth Sounds like a typical CQRS system.
It makes sense when you need a high-throughput, low-latency setup, although it is a very complicated and delicate thing.
The thing that ticked me off btw was not the CQRS.
It was this: "you can boil it down to this (very oversimplified)" ... and the following comment.
SQL has no fast writes. SQL - if _fully_ ACID compliant (most DBAs ignore the necessity of certain hardware like a RAID BBU) - has a whole lot of complicated stuff to make writes appear fast, but when high throughput happens it sooner or later all goes down the drain.
(Double Write Buffer, FSYNC behaviour, certain filesystem specific behaviour and so on. Syncing is a pain in the butt).
Slow read isn't true either. As long as u keep it readonly, it's pretty fast. And when it's read only, you have a myriad of options to optimize for this.
JOINs are not slow per se - quite the opposite. With a growing result set, they can become slow, but this is true for anything.
... and the following comment...
regarding ORMs, JSON documents and so on.
It's not entirely wrong.
But a lot of the statements bark up the wrong tree.
As an example: You don't need an ORM. With the right query you can return stuff directly in JSON form in most RDBMSs. You don't need REDIS for an in-memory database. Most RDBMSs have supported in-memory tables for a long time.
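A quick sketch of both points, using SQLite because it comes with Python. It assumes the SQLite build has the JSON functions available (`json_object` and `json_group_array`, which recent builds include by default); the tables and the `devices_json` helper are made up for the example. Other RDBMSs have their own JSON functions with different names.

```python
import json
import sqlite3

# ":memory:" keeps the whole database in RAM, no Redis involved.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, dep TEXT);
    CREATE TABLE devices (employee_id INTEGER, label TEXT);
    INSERT INTO employees VALUES (1, 'Ann', 'R&D'), (2, 'Bob', 'Sales');
    INSERT INTO devices VALUES (1, 'laptop-17'), (1, 'phone-03'), (2, 'laptop-21');
""")


def devices_json(dep):
    """Let the database build the JSON document; there is no ORM mapping step."""
    row = db.execute(
        """
        SELECT json_group_array(json_object('name', e.name, 'device', d.label))
        FROM employees AS e
        JOIN devices AS d ON d.employee_id = e.id
        WHERE e.dep = ?
        """,
        (dep,),
    ).fetchone()
    return row[0]


print(devices_json("R&D"))
```

The string that comes back is already JSON, so the application can hand it straight to the client. Whether that is a good idea for a given workload is a separate question.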
This is the thing with boxes....
An RDBMS can do all kinds of things. Maybe not well, but it's possible.
Choose the right tool for the right job.
Look behind the curtain, understand what you'll try to do and then decide on what tool to use. ;)
huh, I didn't think about the inverted index, but I suppose you're right, the store would be set-like in the end, funny how that works
yeah, I tried to boil it down to how it might "feel" to use it rather than what it really does for the sake of inspiring OP to give it a try in the first place. I guess my approach is "I'd rather you used it wrong and slowly learned than not used it at all" so I usually oversimplify into "human" terms?
I admit, the approach is not great, especially since I myself have little understanding on the topic, but I'd prefer if people felt excited to learn new things before they dive into it out of necessity.
I understand that it ticked you off though, It's not my intention to spread misinformation, but it happens and I am aware of that, I just think the end result can outweigh that (hopefully temporary) negative!
I let myself get a bit tilted too, sorry bout that o/
@Hazarth No problem, I'm not mad by the way.
I don't claim to have all the answers or to be right all the time, that's impossible for any human being.
The thing with misinformation is just that it piles up. I had some pretty heated discussions, which is why I get angry nowadays.
I try to tone myself down in face to face discussions, hardly possible in a chat ;)
It's just that the mass of FUD / misinformation / buzz and hype in IT turns a job in management into fucking Minesweeper.
k0pernikus (42d): There are certain data structures that are just so damn convoluted to translate into relational databases.
Relational databases are great if you have a well-defined set of homogeneous data and with proper normalization they can be quite powerful and even beautiful.
Problem is the real messy world, with all its edgecases and exceptions and ever-demanding changes. In my experience, document-driven databases just handle heterogeneous data much better and it's simpler to reason about.
That, and a lot of people just downright abuse text fields.
(You know guys, if you want to fake nosql with MySQL-like databases, there is no reason to store serialized PHP arrays / objects: Just use the JSON type. Yes, it exists.)
Well... I don't like "NoSQL", but I don't dislike it either... different uses...
NoSQLs are great for caching data.
That's the thing... A lot of factors are involved.
CQRS is very delicate and hard.
Using one as a cache, not so much, unless you completely bork up the programming - eg. by not abstracting.
Although caching _can_ be a sign for severe implementation / hardware / configuration issues - some people just build Frankensteins to cover up the mess of "Don't know. Don't care."
@IntrusionCM I use them as caching because I am smart :p
I mean, why bother executing a SQL query each time somebody wants to get all releases of a project when I can just store those in the cache until they change? :p
Caching can (like you said) be a sign of other issues, though it's not too uncommon that people have put actual thought into it :p
junon (42d): Document DBs are great.
Don't do the whole "data denormalization" thing, I don't know where people got the idea that it was a good one.
They scale very well, and can do very interesting queries on data rather performantly.
Relational databases tend to have a lot of issues scaling because data cannot be as sparse as documents due to the higher degree of coupling. Horizontally scaling MySQL, for example, is nearly impossible without some layer on top of it (e.g. Uber's Schemaless monstrosity they had).
Yes, there are exceptions to everything I said, but that's the general takeaway.
Also, you'll find the following to be true quite a lot: for example in MongoDB, the database technology itself is rather decent. There are a few conceptual/scientific flaws with it but they can mostly be ignored or at least worked around.
However, the client software (libraries, etc.) is generally TERRIBLE and only kind of works on a good day. The clients are maintained by different groups, which is why.
junon (42d): It always comes down to use-case.
I have been personally very interested in graph databases lately. They have excellent scaling properties, a very intuitive query language, and can represent much more nuanced relationships without introducing complexity to the fundamental design.
dUcKtYpEd (42d): I know everyone here's going to tell you "It depends on the use case", but maintaining an easily pluggable domain, and thus keeping to the Liskov substitution principle, means having an interchangeable data layer. If the business model changes, or if the cost analysis doesn't make sense given the way Mongo is being used by your org, do you really want to be rewriting every use case implementation to match how and where you're storing data? If you're storing all of a user's data in one document with as many internal relations as needed on it, you're fucked when trying to go to SQL. If your use case looks the same across other data storage mechanisms, then the transition (which will happen; we all pretend it never will, but it does somewhere down the line) will be doable.
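One way to read "interchangeable data layer" is sketched below: the use case only talks to a small interface, and the document-style and relational backends both satisfy it. Everything here (`UserStore`, both store classes, `rename_user`) is hypothetical and heavily simplified; a dict stands in for a document DB and in-memory SQLite for a relational one.

```python
import sqlite3
from typing import Optional, Protocol


class UserStore(Protocol):
    """All the use cases are allowed to know about storage."""

    def put(self, user_id: str, name: str) -> None: ...
    def get(self, user_id: str) -> Optional[str]: ...


class DocumentUserStore:
    """Document-style backend; a dict stands in for a document database."""

    def __init__(self) -> None:
        self._docs = {}

    def put(self, user_id, name):
        self._docs[user_id] = {"name": name}

    def get(self, user_id):
        doc = self._docs.get(user_id)
        return doc["name"] if doc else None


class SqlUserStore:
    """Relational backend; in-memory SQLite stands in for a real server."""

    def __init__(self) -> None:
        self._db = sqlite3.connect(":memory:")
        self._db.execute("CREATE TABLE users (id TEXT PRIMARY KEY, name TEXT)")

    def put(self, user_id, name):
        with self._db:
            self._db.execute(
                "INSERT OR REPLACE INTO users (id, name) VALUES (?, ?)",
                (user_id, name),
            )

    def get(self, user_id):
        row = self._db.execute(
            "SELECT name FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        return row[0] if row else None


def rename_user(store: UserStore, user_id: str, new_name: str) -> str:
    """A use case written once, against the interface only."""
    store.put(user_id, new_name)
    return store.get(user_id)


for store in (DocumentUserStore(), SqlUserStore()):
    print(type(store).__name__, rename_user(store, "u1", "Ann"))
```

Swapping the backend then means writing one new class, not touching every use case. The hard part in practice is keeping the interface from leaking document-shaped or table-shaped assumptions.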
For me I only like NoSQL for caching purposes like Redis. I'll look into that CQRS sync using MySQL and NoSQL.
I've read somewhere that MySQL is better than Postgres (I'll find the link and post it here).
IntrusionCM (41d): @Devnergy "Better" is the wrong word; they're different.
MySQL has fragmented.
MariaDB tries to be sane and with 10.5 they split off, but they're lagging behind feature-wise. Architecture-wise they're going in a better direction than Oracle MySQL imho, because they're trying to simplify instead of adding complexity (Oracle is currently redesigning quite a lot of internals; the idea is good, but the design reeks in some places of architecture by theory instead of practice).
Oracle MySQL is effectively developed behind closed doors: they cut themselves off from the community, eg. Planet MySQL got filters, Oracle will break stuff in minor versions nowadays (feature additions are ok in minor versions) and the QA is a nightmare.
Percona tries to be the missing QA, adding support in general for all databases, but having their own business oriented database solutions based on MySQL. They are good. Sadly it increases the fragmentation, cause their QA work is really needed.
Postgres. Postgres and I have a love-hate relationship.
It has a single storage system, although they seem to be working on a kind of multi storage engine based on their plugins.
Great optimizer, feature bonanza regarding indexes and most datastructures, lot of possibilities if you like to drown yourself in research and get to know it.
That's love. What I absolutely hate about it is that it's really... Archaic academic.
Its configuration, plugin management and user management are a PITA imho.
If it weren't for that, I'd be in love I guess. But setting up Postgres and fine-tuning it always alienates me.
rooter (40d): Before I go into attack mode, what do you like about NoSQL?
Edit: My bad. You told. Well, afaik Mongo has a timing thingy. If the replication didn't happen in x time, it will never happen. So, about the performance, ditch it. It's cheating. But we're years further now. Don't know about the current situation.
But SQL is great