Data Modeling question with a high consistency requirement, using LWT, MV, SI or frozen maps?

Guy · June 2, 2024, 3:54am

Originally from the User Slack

@XH_L: Hello everyone! I am considering using ScyllaDB as the only database, but I have encountered some consistency issues.
Could anyone point me in the right direction or provide a better solution? Thanks!

For instance, I want to make email and phone be unique, similar to the UNIQUE in SQL.
CREATE TABLE user (
  id UUID PRIMARY KEY,
  email TEXT,
  phone TEXT,
  password TEXT
)
I noticed that ScyllaDB has a LightWeight Transaction (LWT) IF NOT EXISTS.

Based on this, I have made some attempts:

• Adding email and phone to the PRIMARY KEY, then using INSERT ... IF NOT EXISTS
◦ It checks the combination of the entire primary key
◦ This conflicts with my desired query to find the password using email or phone
• Creating two new tables which use email or phone as the PRIMARY KEY, then performing BATCH INSERT ... IF NOT EXISTS on both tables atomicity
◦ Queries need access multi tables
◦ The primary key could lead to unbalanced distribution
◦ BATCH cannot perform transactional insertions on different tables, and separating the insertions could lead to new consistency issues
By the way, I noticed that cassandra plan to support ACID, do we have similar plans?
https://thenewstack.io/acid-transactions-change-the-game-for-cassandra-developers/

The New Stack: ACID Transactions Change the Game for Cassandra Developers

@Felipe_Cardeneti_Mendes: Based on your description it seems like you want to support retrieving a password for either email or phone inputs

Why don’t you use a view or an index then? If the input is an email you read from the main table. If it’s a phone you read from the view . Both should return the same password
From your proposed data model your partition is ID, but if you already know the id early in that stage then this question is irrelevant. Perhaps you should find the email/phone first and discover the ID to support other functions.

@XH_L: Thanks for your reply! Yes, mv or si can implement my queries, but my main concern is how to ensure email and phone are globally unique and keep my queries work.

This is my final attempt, It work, but probably not a good practice：
CREATE TABLE IF NOT EXISTS user.infos (
    id UUID,
    name TEXT,
    phone TEXT, -- add phone column to allow get phone by id
    email TEXT, -- add email column to allow get email by id
    password TEXT,
    PRIMARY KEY (id)
)

-- use phone as partition key for keeping unique
CREATE TABLE IF NOT EXISTS user.phones (
    phone TEXT,
    id UUID,
    PRIMARY KEY (phone)
)

-- use email as partition key for keeping unique
CREATE TABLE IF NOT EXISTS user.emails (
    email TEXT,
    id UUID,
    PRIMARY KEY (email)
)

-- create indexes for get user info by name, phone, email
CREATE INDEX ON user.infos (name)
CREATE INDEX ON user.infos (phone)
CREATE INDEX ON user.infos (email)

-- Atomic insert in application (if any insert return false, stop subsequent insert)
INSERT INTO user.phones (phone, id) VALUES (?, ?) IF NOT EXISTS
INSERT INTO user.emails (email, id) VALUES (?, ?) IF NOT EXISTS
INSERT INTO user.infos (id, name, phone, email, password) VALUES (?, ?, ?, ?, ?) IF NOT EXISTS
There are still some problems with applcation level atomic insert. It is possible that the phone insertion is successful but the email insertion fails. At this time, it needs to be deleted the phone. I think it is too complicated and it is estimated that it will not have good distribution and performance.

@Felipe_Cardeneti_Mendes: Maybe a frozen map then? Email and phone are authentication methods, and you perhaps may want to support others in the future.

A method maps to an id. It is unique and all under a single table.
What Id do would be to require gating of these methods. One must confirm their identity to each one. At the same time, they can register with either one— or even both. But we require just one method initially, and they are free to add others (and confirm they own it) later on

@XH_L: I’ll look into forzen map, thanks for the suggestion!

Topic		Replies	Views
Data modeling issues ScyllaDB data-model , open-source	4	263	March 25, 2024
Implement transaction in client side ScyllaDB lwt , data-model , ttl , secondary-index , consistency	1	43	October 28, 2024
Need help with "transaction" modeling ScyllaDB	3	457	October 31, 2023
Modeling "transactional" behavior, modifying two tables ScyllaDB lwt , data-model , migration , acid	1	142	July 11, 2024
Datamodel for Scylla/Cassandra for table partition key is not known beforehand -> static field? ScyllaDB	1	299	February 9, 2023

Data Modeling question with a high consistency requirement, using LWT, MV, SI or frozen maps?

Related topics