Originally from the User Slack
@Ritesh: Hello ScyllaDB users!
I am currently working with Spark and the Cassandra connector to store and modify map collections in a ScyllaDB table. So far, I have successfully implemented and run the following code:
// Write the result DataFrame to a new table in ScyllaDB
final_df.rdd.saveToCassandra("mykeyspace", "mytable", SomeColumns("id", "map1" append, "map2" append))
However, I would like to specify a TTL (time-to-live) on a per-element basis within the map collection type. I am aware of the example code in the spark-cassandra-connector repository, which applies TTL at the row level:
import com.datastax.spark.connector.writer._
…
rdd.saveToCassandra("test", "tab", writeConf = WriteConf(ttl = TTLOption.constant(100)))
rdd.saveToCassandra("test", "tab", writeConf = WriteConf(timestamp = TimestampOption.constant(ts)))
Is there a way to specify TTL for each element in a map collection rather than at the row level?
Any help would be greatly appreciated!
Thanks!
@avi: You could write each element in a separate saveToCassandra call perhaps
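A minimal sketch of that suggestion, assuming the data can be reshaped so that each record carries a single map entry plus the TTL it should receive. The `MapEntry` case class and the `saveWithPerElementTtl` helper are hypothetical names introduced for illustration; the keyspace, table, and column names are taken from the question. It relies on the fact that a TTL given on a write applies to the cells written by that statement, so appending one entry to a non-frozen map with its own `WriteConf` should give that entry its own TTL. `"map1".append` is the dotted form of the `"map1" append` used above.

```scala
import org.apache.spark.rdd.RDD
import com.datastax.spark.connector._
import com.datastax.spark.connector.writer._

// Hypothetical record: one map entry per row, tagged with the TTL (seconds) it should get.
case class MapEntry(id: Int, key: String, value: String, ttlSeconds: Int)

def saveWithPerElementTtl(entries: RDD[MapEntry]): Unit = {
  // Assumption: only a handful of distinct TTL values, so collecting them to the driver is cheap.
  val ttls = entries.map(_.ttlSeconds).distinct().collect()

  ttls.foreach { ttl =>
    entries
      .filter(_.ttlSeconds == ttl)
      // Each row appends a single-entry map, so the TTL covers only that entry.
      .map(e => (e.id, Map(e.key -> e.value)))
      .saveToCassandra(
        "mykeyspace", "mytable",
        SomeColumns("id", "map1".append),
        writeConf = WriteConf(ttl = TTLOption.constant(ttl)))
  }
}
```

This issues one `saveToCassandra` call per distinct TTL rather than one per element, which keeps the number of Spark jobs small when many entries share the same TTL.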
@Ritesh: @avi Thanks for the suggestion!
I used an older version of Spark and was able to set a TTL on the map column in ScyllaDB. However, despite setting the TTL to 200 seconds, my data does not expire as expected. I can still retrieve the data with cqlsh queries after the TTL has passed.
@avi: You can try to debug it by using the ttl() function to see what the database thinks your ttl is
@Ritesh: I tried using the TTL() function on the map column (non-frozen), but I got the error below:
"TTL expects an atomic column, but it is a non-frozen collection"
@avi: What about TTL(my_map['key'])?
@Ritesh: No such luck!
@avi: I see. Please file an enhancement request.
@Ritesh: Okay
Hi @avi,
I have a question regarding our previous discussion. Since I am applying a TTL of 2 months to each element in the map, which compaction strategy would be most suitable for this scenario? Additionally, the existing map may receive monthly updates.
@avi: Size Tiered
@Ritesh: Can’t we use TWCS for this?
@avi: TWCS is suitable for insert-only, no updates, and with the clustering key corresponding to time
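For reference, a hedged sketch of setting size-tiered compaction on the table from the same Spark application, assuming the connector's `CassandraConnector` is used to run the CQL statement. The `useSizeTiered` helper is a hypothetical name; the statement itself is ordinary CQL and could equally be run from cqlsh.

```scala
import org.apache.spark.SparkConf
import com.datastax.spark.connector.cql.CassandraConnector

// Hypothetical helper: switch an existing table to size-tiered compaction.
def useSizeTiered(conf: SparkConf, keyspace: String, table: String): Unit = {
  CassandraConnector(conf).withSessionDo { session =>
    session.execute(
      s"ALTER TABLE $keyspace.$table " +
        "WITH compaction = {'class': 'SizeTieredCompactionStrategy'}")
  }
}
```

For example, `useSizeTiered(sc.getConf, "mykeyspace", "mytable")` would apply it to the table from the original question, where `sc` is the application's SparkContext.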
@Ritesh: Okay, got it
Thanks!
Hi @avi!
Could you please help me with the following question:
Is there a way to temporarily pause TTL in ScyllaDB without re-inserting the same data with a new TTL?
@avi: No (you can try moving your clock backwards, but likely many things will break)
@Ritesh: Okay