What is the right term for the DynamoDB and Cassandra data model?

Guy · December 27, 2022, 1:17pm

The DynamoDB Wikipedia article states that DynamoDB is a “key-value” database. However, the term “key-value” database completely overlooks a very fundamental feature of DynamoDB, that of the sort key: keys consist of two parts (partition key and sort key), and items with the same partition key can be efficiently retrieved together by the sort key.

Cassandra has the same ability to sort items within a partition (by what it calls a “clustering key”), and Cassandra’s Wikipedia article uses the term “wide column store” to describe it. While “wide column” is better than “key-value,” it is still somewhat misleading, because it describes the more general situation where an item may have a large number of unrelated columns, not necessarily an ordered list of separate items.

So my question is: is there a more appropriate term for the data model of databases like DynamoDB and Cassandra? Like a key-value store, these databases can efficiently retrieve an item by its individual key, but they can also retrieve items sorted by the key or by just a part of it (the DynamoDB sort key or the Cassandra clustering key).
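To make the model in question concrete, here is a minimal in-memory sketch in Python. It is only an illustration of the access pattern, not how DynamoDB or Cassandra are implemented; the class name `PartitionedSortedStore` and its methods are hypothetical.

```python
import bisect
from collections import defaultdict


class PartitionedSortedStore:
    """Toy model: each partition key maps to items kept ordered by sort key."""

    def __init__(self):
        # Per partition: a sorted list of sort keys plus a dict of values.
        self._sort_keys = defaultdict(list)
        self._values = defaultdict(dict)

    def put(self, partition_key, sort_key, value):
        values = self._values[partition_key]
        if sort_key not in values:
            # Keep the sort keys of this partition in order on insert.
            bisect.insort(self._sort_keys[partition_key], sort_key)
        values[sort_key] = value

    def get(self, partition_key, sort_key):
        """Point lookup by the full key, as in a plain key-value store."""
        return self._values.get(partition_key, {}).get(sort_key)

    def query(self, partition_key, start=None, end=None):
        """Return (sort_key, value) pairs with start <= sort_key < end, in order."""
        keys = self._sort_keys.get(partition_key, [])
        lo = 0 if start is None else bisect.bisect_left(keys, start)
        hi = len(keys) if end is None else bisect.bisect_left(keys, end)
        values = self._values.get(partition_key, {})
        return [(k, values[k]) for k in keys[lo:hi]]


store = PartitionedSortedStore()
store.put("sensor-1", "2022-12-27T13:19", 21.5)
store.put("sensor-1", "2022-12-27T13:17", 21.0)
store.put("sensor-2", "2022-12-27T13:18", 19.0)
```

Here `store.get("sensor-1", "2022-12-27T13:17")` is the key-value style lookup, while `store.query("sensor-1")` returns all items of one partition ordered by sort key, which is the part that the term “key-value” leaves out.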

*Based on a question originally asked on Stack Overflow by Nadav Har’El

Guy · December 27, 2022, 1:19pm

Before the introduction of CQL, Cassandra adhered more strictly to the wide column store data model, in which there were only rows identified by a row key and containing sorted key/value columns. With the advent of CQL, rows became known as partitions, and columns could optionally be grouped into logical rows via clustering keys.

Even up to Cassandra 3.0, CQL was simply an abstraction over the original Thrift data model, and the storage engine had no concept of CQL rows. CQL rows were simply a sorted set of columns with compound keys consisting of the concatenated values of the clustering keys. See this article for more details. There is now native support for CQL in the storage engine, allowing CQL data models to be stored more efficiently.
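As a rough illustration of that older layout, the following Python sketch flattens the logical CQL rows of a single partition into one sorted set of cells and then regroups them. It is a simplification under stated assumptions: tuples stand in for the concatenated byte encoding of the compound key, the CQL column name is appended as the last component, and the function names `flatten_cql_rows` and `regroup_cells` are hypothetical.

```python
def flatten_cql_rows(rows, clustering_columns):
    """Flatten logical CQL rows of one partition into a sorted list of cells.

    Each cell name is a tuple: the clustering values followed by the CQL
    column name. Tuples stand in for the real concatenated encoding.
    """
    cells = {}
    for row in rows:
        prefix = tuple(row[c] for c in clustering_columns)
        for column, value in row.items():
            if column not in clustering_columns:
                cells[prefix + (column,)] = value
    # Sorting by cell name keeps the cells of one CQL row adjacent.
    return sorted(cells.items())


def regroup_cells(cells, n_clustering):
    """Rebuild logical rows by grouping cells that share a clustering prefix."""
    rows = {}
    for name, value in cells:
        prefix, column = name[:n_clustering], name[n_clustering]
        rows.setdefault(prefix, {})[column] = value
    return rows


# Two logical rows in one partition, clustered by "day" (illustrative data).
rows = [
    {"day": 2, "temp": 20},
    {"day": 1, "temp": 18, "humidity": 40},
]
cells = flatten_cql_rows(rows, ["day"])
logical_rows = regroup_cells(cells, 1)
```

In this sketch the storage layer sees only `cells`, a single sorted sequence of name/value pairs per partition; the notion of a row exists only in the grouping step.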

However, if you think of a CQL row as a logical grouping of columns within the same partition, Cassandra could still be viewed as a wide column store. Anyway, to my knowledge, there is no other established term to describe this kind of database.

*Based on an answer on Stack Overflow by J.B. Langston
