The cassandra.yaml comment explains perfectly: "When executing a scan, within or across a
partition, we need to keep the tombstones seen in memory, so we can return them to the coordinator, which
will use them to make sure other replicas also know about the deleted rows. With workloads that generate a
lot of tombstones, this can cause performance problems and even exhaust the server heap."
How do we get tombstones in Cassandra
- Inserting null values
- Inserting values into collection columns
- Expiring Data with TTL
- Frequent update on at Table with TTL
- Explicit delete statement
- Materialized view
Inserting Null Value
Upsert operation on a table can generate a tombstone as well. How? Because Cassandra doesn’t check a
condition before writing a value if it exist or not (that would slow write down). Cassandra returns a
Table Definition used for this example
null
value when there is no value for a field. Therefore when a field is
set to null
Cassandra needs to delete (marks the column with a tombstone for deletion) the
existing data. Table Definition used for this example
CREATE TABLE example ( KEY INT PRIMARY KEY, Column1 text, Column2 text );
Statement with Problem
Statement will result in a tombstone for Column2, even if it is the first insertion
INSERT INTO example (KEY, Column1 , Column2) VALUES (1, 'someValue', null);
Inserting Values Into Collection Columns
Cassandra collections
list
, SET
, map
results in tombstones even if
we never delete a value. Cassandra optimizes for writes and does not check, if the list has
changed (or even existed), instead, it immediately deletes (creates tombstone on a column), before inserting
the new value.
Table Definition used for this example
CREATE TABLE collection_example( KEY INT PRIMARY KEY, Column1 list, Column2 SET, Column3 map );
Statement with Problem
statement creates three tombstone; one for each collection type even for first insert, because Cassandra does not check for existence of a collection.
INSERT INTO collection_example(KEY, Column1 , Column2 ,Column3 ) VALUES (1, ['a', 'b'], {'c', 'd'}, {1 :
'a', 2 : 'b'});
Solution
There is no direct solution to this, we need to be careful while designing table, Collection column is required or not.Expiring Data with TTL
Expiring data by setting a TTL (Time To Live) is one an alternative to deleting data explicitly, but technically results in the same tombstones recorded by Cassandra and requiring the same level of attention as other types of tombstones.Cassandra sets all the TTLs at column level irrespective of definition being Global (Table) or Row level.
Post a Comment
Post a Comment