nodetool garbagecollect¶
Removes deleted data (tombstones) from SSTables without full compaction.
Synopsis¶
nodetool [connection_options] garbagecollect [options] [--] [keyspace [table ...]]
Description¶
nodetool garbagecollect removes tombstones and expired data from SSTables without triggering a full compaction. It is lighter weight than compact because it only targets deleted data rather than consolidating all SSTables.
Arguments¶
| Argument | Description |
|---|---|
| keyspace | Keyspace to garbage collect. If omitted, processes all keyspaces |
| table | Specific table(s) to garbage collect |
Options¶
| Option | Description |
|---|---|
| -g, --granularity | ROW or CELL level granularity |
| -j, --jobs <jobs> | Number of concurrent jobs (default: 2) |
Granularity Levels¶
ROW Granularity (Default)¶
nodetool garbagecollect -g ROW my_keyspace my_table
- Removes entire rows that are fully deleted
- More efficient for row-level deletes
- Less I/O intensive
CELL Granularity¶
nodetool garbagecollect -g CELL my_keyspace my_table
- Also removes individual deleted or overwritten cells within rows
- More thorough cleanup
- Higher I/O cost
When to Use¶
After Bulk Deletes¶
# After deleting many rows
DELETE FROM my_keyspace.my_table WHERE partition_key IN (...);
# Clean up tombstones
nodetool garbagecollect my_keyspace my_table
TTL-Heavy Tables¶
For tables with many expiring rows:
nodetool garbagecollect my_keyspace ttl_table
Instead of Full Compaction¶
# Lighter alternative to:
# nodetool compact my_keyspace my_table
# Use garbage collect for tombstone cleanup only:
nodetool garbagecollect my_keyspace my_table
Examples¶
Garbage Collect Specific Table¶
nodetool garbagecollect my_keyspace my_table
Garbage Collect All Tables in Keyspace¶
nodetool garbagecollect my_keyspace
Cell-Level Garbage Collection¶
nodetool garbagecollect -g CELL my_keyspace my_table
Parallel Garbage Collection¶
nodetool garbagecollect -j 4 my_keyspace
Tombstone Eligibility¶
Tombstones are only removed when:
- Age > gc_grace_seconds - Default 10 days
- All replicas have seen the tombstone - Via repair or normal writes
-- Check table's gc_grace_seconds
SELECT gc_grace_seconds FROM system_schema.tables
WHERE keyspace_name = 'my_keyspace' AND table_name = 'my_table';
gc_grace_seconds
Tombstones younger than gc_grace_seconds cannot be removed. Running garbage collect on fresh deletes has no effect.
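The age rule above is simple arithmetic, sketched below as a small shell helper. The function name gc_grace_remaining and its arguments are hypothetical and not part of nodetool. It only covers the age condition and says nothing about whether all replicas have received the tombstone.

```shell
# Hypothetical helper (not a nodetool command): seconds left until a tombstone
# written at <deletion_epoch> is older than <gc_grace_seconds>.
# Prints 0 once the grace window has elapsed.
# Usage: gc_grace_remaining <deletion_epoch> <gc_grace_seconds> [now_epoch]
gc_grace_remaining() {
    deleted_at=$1
    gc_grace=$2
    now=${3:-$(date +%s)}   # defaults to the current time
    remaining=$(( deleted_at + gc_grace - now ))
    if [ "$remaining" -lt 0 ]; then
        remaining=0
    fi
    echo "$remaining"
}

# Example: a delete issued at epoch 1700000000 with the default 864000 s
# (10 day) window, checked five days later, still has 432000 s to wait.
gc_grace_remaining 1700000000 864000 1700432000
```

A result of 0 means the age condition is met. Any other value is the number of seconds to wait before garbage collect can have an effect on that delete.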
Comparison with Compact¶
| Aspect | garbagecollect | compact |
|---|---|---|
| Purpose | Remove tombstones | Merge all SSTables |
| Scope | Targeted cleanup | Full rewrite |
| I/O Impact | Lower | Higher |
| Duration | Shorter | Longer |
| Space needed | Less | More |
Use garbagecollect when you only need tombstone cleanup.
Use compact when you need full SSTable consolidation.
Monitoring¶
Check Progress¶
nodetool compactionstats
Garbage collection appears as a compaction operation.
Check Tombstone Counts Before/After¶
# Before
nodetool tablestats my_keyspace.my_table | grep "tombstones"
# Run garbage collect
nodetool garbagecollect my_keyspace my_table
# After
nodetool tablestats my_keyspace.my_table | grep "tombstones"
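To compare the two readings as numbers rather than by eye, the value can be extracted from the tablestats output. The sketch below assumes the usual plain-text line "Maximum tombstones per slice (last five minutes): N"; the helper name max_tombstones_per_slice is hypothetical. Note that this figure is a read-path metric over the last five minutes, so it reflects recent reads rather than a count of tombstones on disk.

```shell
# Hypothetical helper: read `nodetool tablestats` output on stdin and print
# the "Maximum tombstones per slice" value (first match only).
max_tombstones_per_slice() {
    awk -F': *' '/Maximum tombstones per slice/ { print $2; exit }'
}

# Usage against a live node:
#   before=$(nodetool tablestats my_keyspace.my_table | max_tombstones_per_slice)
#   nodetool garbagecollect my_keyspace my_table
#   after=$(nodetool tablestats my_keyspace.my_table | max_tombstones_per_slice)
#   echo "max tombstones per slice: $before -> $after"
```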
Common Scenarios¶
High Tombstone Warnings¶
If logs show tombstone warnings:
WARN Read 5000 live rows and 100000 tombstone cells
# Clean up tombstones (after gc_grace_seconds has passed)
nodetool garbagecollect my_keyspace problematic_table
Prepare for Major Compaction¶
# Clean tombstones first (faster)
nodetool garbagecollect my_keyspace
# Then compact if needed
nodetool compact my_keyspace
After Range Deletes¶
Range deletes write range tombstones that can cover many rows:
# Wait for gc_grace_seconds
# Then clean up
nodetool garbagecollect my_keyspace my_table
Best Practices¶
Garbage Collection Guidelines
- Wait for gc_grace_seconds - Tombstones must be old enough
- Run repair first - Ensure tombstones propagated to all replicas
- Monitor disk space - Some temporary space needed
- Start with ROW granularity - Less I/O, often sufficient
- Use CELL for column deletes - When individual columns are deleted or overwritten
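The first guidelines can be combined into one step. The wrapper below is a minimal sketch: repair_then_gc is a hypothetical name, and it assumes that a repair of the table on this node is sufficient for your topology and that gc_grace_seconds has already passed for the deletes you care about.

```shell
# Hypothetical wrapper: repair a table, then garbage collect it with ROW
# granularity. Stops early if the repair step fails.
# Usage: repair_then_gc <keyspace> <table>
repair_then_gc() {
    ks=$1
    table=$2
    if ! nodetool repair "$ks" "$table"; then
        echo "repair failed for $ks.$table; skipping garbagecollect" >&2
        return 1
    fi
    nodetool garbagecollect -g ROW "$ks" "$table"
}

# Example:
#   repair_then_gc my_keyspace my_table
```

Skipping garbage collect after a failed repair is a deliberate choice here: it avoids purging tombstones that some replicas may not have received yet.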
Scripting Example¶
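#!/bin/bash
# garbage_collect_old_tables.sh
# Garbage collect tables whose "Maximum tombstones per slice" exceeds a threshold.
# Assumes the default plain-text tablestats layout ("Keyspace :" and "Table:" lines).
THRESHOLD=10000
# Track the current keyspace and table, then emit one line per table:
# <keyspace> <table> <max tombstones per slice>
nodetool tablestats 2>/dev/null | awk '
    /^Keyspace *:/                 { ks = $NF }
    /Table:/                       { tbl = $NF }
    /Maximum tombstones per slice/ { print ks, tbl, int($NF) }
' | while read -r ks table tombstones; do
    if [ "$tombstones" -gt "$THRESHOLD" ]; then
        echo "Garbage collecting $ks.$table (max tombstones per slice: $tombstones)"
        # </dev/null keeps nodetool from consuming the loop's input
        nodetool garbagecollect "$ks" "$table" </dev/null
    fi
done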
Related Commands¶
| Command | Relationship |
|---|---|
| compact | Full compaction (heavier) |
| tablestats | Check tombstone counts |
| scrub | Rebuild SSTables |
| cleanup | Remove non-local data |