Comments
-
Voxera113883yGraph databases offer other ways to look up things.
In many cases it will be way slower than a relational db, but for cases where you need to dig deep into the relations, but not so wide, a graph db can be orders of magnitude faster than relational.
And for updates it's the same.
Relational indexes take a lot of performance to update, and if you have many large ones it gets very heavy.
It's also harder to spread the load across multiple machines, which a good graph db can be better at, depending on your case.
What this means is that as a general db, relational has few peers, but for the specific use cases where graph shines, it really shines :)
And for those cases a relational db, while still possible, would be many times slower and require monstrously big servers.
Like most nosql, graph dbs are often used together with other types of nosql or even relational dbs, depending on the specific data and its uses. -
I'm moving to Neptune and let me just say that rewriting all my SQL queries into graph traversals (gremlin) makes them much more natural to read and understand. That's one benefit.
Yes, you could model that as an sql schema but think of all the foreign keys that need to be created to maintain relationships. On a graph db, the edge _is_ the relationship. You can compare traversal vs join performance to see how much of a difference it makes. -
@bananaerror I've looked at gremlin just today and it looks like over complex horseshit LOL
-
@Voxera so which use cases ? Have examples or benchmarks ? That's what I'm presently looking for. My gut is telling me for example that nodejs offers the same kind of lazy lure, as say neo4j which lets you just slap whatever old thing onto an object whenever you want with no structuring !
got this object that was created to manage a file load ? want to for some reason tag it with a series of other unrelated objects in one place in the code and hope it doesn't get missed ?
GO RIGHT AHEAD !
now with data this is a nice feature but could be accomplished in the manner I'm saying it could.
but I really would like to see some actual data, since I have not actually generated any myself and have been looking sideways at neo4j thinking '.... why is this taking 900 mb of ram just to start, and how could it possibly be storing things or indexing them so that retrieval would be faster at all than, say, foreign key relationships' -
bioDan56223y@AvatarOfKaine it's simple.
Tl;dr - different data visualizations are better in different use cases.
Different types of visualizations, like
tables, graphs, line charts, bar charts, etc.
have the same goals:
1. Explore the data
2. Understand the results
3. Communicate the insights
Some forms of data are easier to understand in tabular form, e.g. fewer than ~100 rows, where you can sort by columns.
For much more aggregated data, bar charts, pie charts, line charts, scatterplots, etc. make it easier to understand general trendlines or patterns in the data than a tabular format does.
When you have a lot of relationships in the data, and you need to dive in through those relationships, graphs make it easier to understand, see the patterns, and get insights.
For example, use cases such as social networks, telecommunication logs, monetary transactions, etc. are better suited for consumption in a graph.
Example of tabular data vs graph data:
Let's say you have a table with the following columns:
person_id, name, email, father_id, mother_id
When you query it in SQL you get a table with 200 rows.
Questions like
"Who are the youngest generations? Who are the eldest persons?"
are very hard to understand and consume when you get the tabular data, but are instantly recognized and understood when presented as a graph. -
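To make the family-table example concrete, here is a minimal Python sketch of how those two questions could be answered by walking the parent links. The rows, names, and the `generations` helper are made up for illustration (the email column is dropped for brevity), and it assumes every parent id refers to a row in the table and that the data has no cycles.

```python
# Hypothetical rows: (person_id, name, father_id, mother_id); None = parent unknown.
rows = [
    (1, "Ada", None, None),
    (2, "Ben", None, None),
    (3, "Cy", 2, 1),
    (4, "Di", None, None),
    (5, "Eve", 3, 4),
]


def generations(rows):
    """Return {person_id: generation}; people with no known parents are generation 0."""
    parents = {
        pid: [p for p in (father, mother) if p is not None]
        for pid, _name, father, mother in rows
    }
    depth = {}

    def depth_of(pid):
        # A person's generation is one more than their deepest known parent.
        if pid not in depth:
            depth[pid] = 1 + max((depth_of(p) for p in parents[pid]), default=-1)
        return depth[pid]

    for pid in parents:
        depth_of(pid)
    return depth


gen = generations(rows)
eldest = sorted(pid for pid, g in gen.items() if g == 0)
youngest = sorted(pid for pid, g in gen.items() if g == max(gen.values()))
```

With these sample rows, `eldest` holds the people with no recorded parents and `youngest` holds the deepest generation, which is the same information a drawn family graph would show at a glance.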
I think that there are a lot of things intertwined here.
A representation of a graph via a diagram is just that.
It's a representation, a frontend, a visual concept - but not necessarily how the data is stored.
You could - of course - store the mentioned graph in a graph database.
But - and that's why I would keep the presentation of data separate from the storage - it depends on what you want to achieve.
@Voxera explained in great detail the pros and cons of relational vs graph databases, though the elephant in the room is kinda missing.
A relational database stores tuples (rows) following a definition schema.
A graph database usually stores a relationship between two nodes with a label - afaik some graph databases can additionally store key-value pairs as properties on a relationship.
The quantity of information and its organization is completely different.
It's obvious I know :)
Yet it's the most common mistake that happens in dev discussions regarding different database types - the assumption that one "can do" the same as the other. -
killames5703y@Voxera i postulate it’s applying similar logic and just representing queries with steps missing to make it seem neato
-
killames5703y@bioDan I think you’re missing the point
Data storage is not the same as visualization
To some degree if I store a data item with a relationship to another data item with another relationship
That is the same as a table with a foreign key
I think you
Missed the question -
bioDan56223y@killames I don't think I did. The only question in the post is:
"What is the use of graph and whats it really offering?"
No mention of data storage anywhere -
@killames I suggest you look for Mr. Karwin's SQL Antipatterns.
A tree representation is possible in relational databases, but it's hard.
In a graph database, on the other hand, a tree representation is quite easy, as a tree can be represented as a graph. -
bioDan56223y@IntrusionCM absolutely.
Also, aggregations on graph databases are much more expensive.
@killames but it's not the same as foreign keys because:
- foreign keys are constraints to other tables through columns, whereas in a graph you can have a relationship without constraints.
- relationships carry information about the relationship itself, as an entity, while foreign keys are attributes of a property (column). For example, the relationship between a student and a university can be "graduated_from"
i.e.
person -> graduated_from -> university
And graduated_from can have its own properties like: started_at, graduated_at, degree_type
Those properties are not of the person, and not of the university, but of the relationship between them -
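A minimal Python sketch of that idea, where the edge itself carries the properties. The `Edge` class, the `outgoing` helper, and all the sample values are hypothetical and only illustrate the shape of the data, not how any particular graph database stores it. (In a relational schema the closest equivalent would be a link table with those extra columns.)

```python
from dataclasses import dataclass, field


@dataclass
class Edge:
    """A labelled relationship between two nodes, with its own properties."""
    source: str
    label: str
    target: str
    properties: dict = field(default_factory=dict)


# Hypothetical sample data: the properties belong to the relationship,
# not to the person and not to the university.
edges = [
    Edge("alice", "graduated_from", "example_university",
         {"started_at": 2010, "graduated_at": 2014, "degree_type": "BSc"}),
    Edge("alice", "works_at", "example_company"),
]


def outgoing(edges, source, label):
    """Return all edges leaving `source` that carry the given label."""
    return [e for e in edges if e.source == source and e.label == label]


graduation = outgoing(edges, "alice", "graduated_from")[0]
degree = graduation.properties["degree_type"]
```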
killames5703y@bioDan well yes, but a foreign key is an enforced relationship, is my point
It also ends up containing the physical record # of the target table from the engine’s viewpoint
Still doesn’t answer my question about how the storage mechanism would end up differing much in the long run unless it’s more inefficient -
bioDan56223y@killames if you want to talk about data storage then i think you can make an analogy of relational dbs vs object stores/graph dbs to arrays vs linked lists.
In some use cases linked lists are better, in others it's better to use an array. -
killames5703y@bioDan no because that’s the same difference
A table is not an array unless it’s missing reference constraints
And relying on the user to query the structures separately
My point is I don’t see the underlying storage mechanism being arranged in a way that is any better
If you’re traversing a graph you’d need the nodes and relationships bare minimum and then the search through relationships for example would still best be arranged to a sort by a certain value or by a property of the relationship
In either case that would likely be the only difference except it wouldn’t because a foreign key constraint can be indexed lol -
bioDan56223y@killames we are talking about data storage.
Tables and arrays are similar in that they require a full scan when you get values out of them (not in all cases, but I'm oversimplifying for the example)
Whereas graphs and linked lists do not do full scans when you read values out of them, you traverse through the items explicitly. -
killames5703y@bioDan yes but you’re not getting something and honestly we already had this conversation
-
killames5703y@bioDan and a linked list is pretty much an array
It contains a previous and a next field, in the way I was taught to see them, of pointers of the same type -
Voxera113883yFor an example of graph data think of facebook friends.
Sure you can use a simple friends table.
But if you want to find your friends' friends' friends' relatives and all their latest posts and updates, using a relational db you would have to use some pretty obscure sql or make multiple queries to dig deeper, and it will hit the friends table quite hard.
In a graph db you can first create the query much more easily, and the graph indexes are also much better designed for that type of query. -
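As a rough illustration of the "friends of friends of friends" lookup, here is a small Python sketch of a hop-limited breadth-first traversal over an in-memory adjacency list. The `friends` data and the `within_hops` helper are invented for the example; it says nothing about how a real graph db indexes its data, it only shows that each step touches just the neighbours of people already reached.

```python
from collections import deque

# Hypothetical friendship graph as an adjacency list.
friends = {
    "ann": ["bob", "cat"],
    "bob": ["ann", "dan"],
    "cat": ["ann"],
    "dan": ["bob", "eve"],
    "eve": ["dan", "fay"],
    "fay": ["eve"],
}


def within_hops(graph, start, max_hops):
    """Return {person: hops} for everyone reachable from start in at most max_hops steps."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if hops[person] == max_hops:
            continue  # do not expand beyond the hop limit
        for friend in graph.get(person, []):
            if friend not in hops:  # first visit is the shortest path in BFS
                hops[friend] = hops[person] + 1
                queue.append(friend)
    del hops[start]  # the starting person is not their own friend
    return hops


three_hops = within_hops(friends, "ann", 3)
```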
@Voxera now that I'm not quite able to visualize.
so...
tb1: person
tb2: friends
tb2.user = id of person
tb2.friendid = id of another person.
so yes, you'd have to run multiple queries, composing a temporary table of values until its size didn't change.
however whether i write that code in one statement or several, kind of seems like it would have to do the same thing in the background...
so it's just syntax you're saying that makes it so much better suited from your perspective ?
i mean each object would already have to have an id.
Person = node type 1
and then relationship edges would just be between two people.
but if you scan through even the edges you'd have to have some recursive logic in there if you said "GIVE ME ALL THE FRIENDS OF ALL THE FRIENDS OF ALL THE PEOPLES !!!!! ALL OF THEM ! I DON'T CARE IF ITS A MILLION OUT OF TWO MILLION ! MUAHAHAHAHAHHAAH!"
Thing is, in both cases the data is boiling down to the same storage format in essence. Except some unparsed text. -
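For reference, the "keep adding rows to a temporary table until its size stops changing" approach can be written as a single recursive query. Below is a sketch using Python's built-in `sqlite3` module; the table layout follows the person/friends tables described above, except that the `user` column is renamed `user_id` here (an assumption, to avoid keyword clashes in some engines), and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE friends (
        user_id  INTEGER NOT NULL REFERENCES person(id),
        friendid INTEGER NOT NULL REFERENCES person(id),
        PRIMARY KEY (user_id, friendid)
    );
""")
conn.executemany(
    "INSERT INTO person VALUES (?, ?)",
    [(1, "ann"), (2, "bob"), (3, "cat"), (4, "dan"), (5, "eve")],
)
# Hypothetical links: ann -> bob -> cat -> dan -> ann; eve has no links.
conn.executemany("INSERT INTO friends VALUES (?, ?)", [(1, 2), (2, 3), (3, 4), (4, 1)])

# UNION (not UNION ALL) drops rows that were already found, so the recursion
# stops once a pass adds nothing new, i.e. when the working set stops growing.
rows = conn.execute(
    """
    WITH RECURSIVE reach(id) AS (
        SELECT friendid FROM friends WHERE user_id = ?
        UNION
        SELECT f.friendid FROM friends AS f JOIN reach AS r ON f.user_id = r.id
    )
    SELECT p.name FROM reach JOIN person AS p ON p.id = reach.id ORDER BY p.name
    """,
    (1,),
).fetchall()
names = [row[0] for row in rows]
```

Because the sample links form a cycle, the starting person shows up in her own result set; whether that is wanted depends on the question being asked. This only shows that the query can be expressed in one statement, not how either kind of engine executes it.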
Voxera113883y@AvatarOfKaine The performance comes from how the graph is organized under the hood, which offers benefits that relational indexes and tables have a harder time handling.
And importantly, for small graphs like a few million records this might not be really measurable, but once you have link tables of billions of rows, making multiple consecutive lookups to that table will start to cost, especially if you get locking with updates.
Graph dbs skip some promises of the relational model but offer the benefit that all those lookups might operate on thousands of items at a time instead of billions, which will make a huge impact.
And updates will also avoid locking larger swaths of items.
To be honest, I have not really used graph since we never reached the scale where it motivated the addition of a second database platform ;
But I did evaluate it and we did see clear opportunities if the data grew. -
bioDan56223y@killames if you don't understand the conceptual and practical differences between how arrays and linked lists are stored and read from memory/disk, it's no wonder you don't get my point. This discussion is futile.
-
killames5703y@bioDan there really aren’t any except for how they can be reordered in some languages and dynamically sized and how they can’t be indexed
That is the only difference, that and storage
I haven’t used a linked list since I used pascal in the late 90s
They’re not a useful data structure but more a way of compensating for a lack of dynamic arrays and collections
Unlike binary trees which are hella useful still -
killames5703y@bioDan I mean the other is taking a disembodied element and having an idea what elements were before and after in the list
But how often is that useful ? -
bioDan56223y@killames many things, the DOM in your browser is a linked list. You can search for linked list practical applications online if the topic interests you
-
killames5703y@bioDan as to the topic at hand
Look man
If you stored all the links between one object and another group of objects that’s a tree not a linked list
So sure you could arrange all the data like that in a graph storage mechanism but it would really just boil down to storing a pointer to a list
And then each item in the list would have a pointer to a list
So you’re still stuck traversing all the items
Same as if you did an indexed scan for a many to many table -
killames5703y@bioDan given the behind the scenes way data structures and objects store data I don’t even see linked lists as being great for sorting
It’s effectively going to do the same thing, and referencing the messiest object the world has ever seen doesn’t help much
That and the dom being a single field linked list doesn’t make sense either
A tree would -
bioDan56223yAny tree, even a binary tree, is a type of linked list. I have no idea what you're arguing about
I just look at a layout like this and see a relational database. Because minus random markers, there is a defined set of relationships some of which can be inferred or taken from OTHER data like.
"Joe travels at 8 am +/- 1 hr 99% of the time, every day of the work week for the last 52 weeks, likely joe is commuting to this location"
or you could just add a schedule table and one item could be marked commute vs a log table of data that is actually happening.
With everything else I see the same things.
I also see a possibility for graph edges and the like to get out of control really quickly when you start adding event data into it.
so what is the use of graph and what's it really offering ?
any data worthwhile is likely going to have some kind of structure, even if you add ad hoc fields that don't exist, after enough additions those fields should be standardized !
question