sstablerepairedset¶
Marks SSTables as repaired or unrepaired, controlling their participation in incremental repair.
Synopsis¶
sstablerepairedset --really-set [--is-repaired | --is-unrepaired] <sstable_files>
Description¶
sstablerepairedset modifies the repair status metadata of SSTable files. This status determines how SSTables participate in Cassandra's incremental repair mechanism:
- Repaired - SSTable has been verified consistent across replicas
- Unrepaired - SSTable needs to be included in future repair operations
This tool is essential for:
- Migrating to incremental repair - Mark existing SSTables appropriately
- Fixing repair metadata - Correct incorrect repair status
- Recovery scenarios - Reset repair state after data issues
- Anti-entropy management - Control what gets repaired
Cassandra Must Be Stopped
Cassandra must be completely stopped before running sstablerepairedset. Modifying repair status while Cassandra is active can cause data inconsistency.
How Repair Status Works¶
Repair Status Values¶
| Status | repaired_at Value | Meaning |
|---|---|---|
| Unrepaired | 0 | Needs repair verification |
| Repaired | Timestamp | Verified at that time |
| Pending | 0, plus a pending-repair session UUID | Part of an ongoing repair |
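The numeric cases in the table can be told apart with a simple check on the value. The sketch below is a hypothetical helper (`classify_repaired_at` is not part of Cassandra) and covers only the two numeric cases, assuming the value is a millisecond epoch timestamp as in the sample output on this page.

```bash
# Hypothetical helper: map a repaired_at value to a status name.
# Usage: classify_repaired_at <repaired_at_millis>
classify_repaired_at() {
    value="$1"
    case "$value" in
        ''|*[!0-9]*)
            echo "unknown"        # not a non-negative integer
            return 1
            ;;
    esac
    if [ "$value" -eq 0 ]; then
        echo "unrepaired"
    else
        # Assumption: the value is in milliseconds; report whole seconds
        echo "repaired (epoch seconds: $((value / 1000)))"
    fi
}
```

For example, `classify_repaired_at 0` prints `unrepaired`.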
Arguments¶
| Argument | Description |
|---|---|
| sstable_files | One or more paths to SSTable Data.db files |
Options¶
| Option | Description |
|---|---|
| --really-set | Required safety flag to confirm modification |
| --is-repaired | Mark SSTables as repaired |
| --is-unrepaired | Mark SSTables as unrepaired |
The --really-set flag is mandatory to prevent accidental modification.
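The flag rules above can be checked before the tool is ever invoked. The following is a minimal sketch of a hypothetical pre-flight function, `check_repairedset_args`, which is not part of Cassandra. It models only the options listed on this page and rejects anything else, so treat it as an illustration rather than a complete parser.

```bash
# Hypothetical pre-flight check: validates an sstablerepairedset argument
# list without running the tool. Returns 0 if the arguments look complete.
check_repairedset_args() {
    really_set=0
    status_flags=0
    files=0
    for arg in "$@"; do
        case "$arg" in
            --really-set)                  really_set=1 ;;
            --is-repaired|--is-unrepaired) status_flags=$((status_flags + 1)) ;;
            *-Data.db)                     files=$((files + 1)) ;;
            *) echo "unexpected argument: $arg" >&2; return 1 ;;
        esac
    done
    [ "$really_set" -eq 1 ]   || { echo "missing --really-set" >&2; return 1; }
    [ "$status_flags" -eq 1 ] || { echo "need exactly one of --is-repaired / --is-unrepaired" >&2; return 1; }
    [ "$files" -ge 1 ]        || { echo "no *-Data.db files given" >&2; return 1; }
}
```

A wrapper script could call `check_repairedset_args "$@"` first and only then pass the same arguments to sstablerepairedset.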
Examples¶
Mark as Repaired¶
# Stop Cassandra first
sudo systemctl stop cassandra
# Mark specific SSTable as repaired
sstablerepairedset --really-set --is-repaired \
/var/lib/cassandra/data/my_keyspace/my_table-abc123/nb-1-big-Data.db
# Start Cassandra
sudo systemctl start cassandra
Mark as Unrepaired¶
# Mark SSTable as unrepaired
sstablerepairedset --really-set --is-unrepaired \
/var/lib/cassandra/data/my_keyspace/my_table-abc123/nb-1-big-Data.db
Mark Multiple SSTables¶
# Mark all SSTables for a table as unrepaired
sstablerepairedset --really-set --is-unrepaired \
/var/lib/cassandra/data/my_keyspace/my_table-*/*-Data.db
Using with find¶
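# Mark all SSTables in keyspace as unrepaired
# (snapshot and backup copies are skipped so their metadata stays untouched)
find /var/lib/cassandra/data/my_keyspace/ -name "*-Data.db" \
    ! -path "*/snapshots/*" ! -path "*/backups/*" -print0 | \
    xargs -0 sstablerepairedset --really-set --is-unrepaired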
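Because find descends into every subdirectory, including the snapshots and backups directories that Cassandra keeps under each table directory, it can help to preview the file list before piping anything to xargs. The sketch below uses a hypothetical helper name, `list_live_data_files`, and only prints paths; it changes nothing.

```bash
# Hypothetical helper: list the Data.db files under a directory that a bulk
# run would touch, skipping snapshot and incremental-backup copies.
# Usage: list_live_data_files <directory>
list_live_data_files() {
    find "$1" -type f -name '*-Data.db' \
        ! -path '*/snapshots/*' ! -path '*/backups/*' -print
}
```

Running `list_live_data_files /var/lib/cassandra/data/my_keyspace/` and reviewing the output is a cheap dry run before the real command.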
Mark All SSTables on Node¶
#!/bin/bash
# mark_all_unrepaired.sh - Reset repair status on every user SSTable
# Run only while Cassandra is stopped.
DATA_DIR="/var/lib/cassandra/data"
# Walk the keyspace directories, skipping system keyspaces
for ks_dir in "${DATA_DIR}"/*/; do
    ks_name=$(basename "$ks_dir")
    # Skip system keyspaces (system, system_auth, system_schema, ...)
    if [[ "$ks_name" == system* ]]; then
        continue
    fi
    echo "Processing keyspace: $ks_name"
    # Skip snapshot and backup copies; -r avoids running the tool with no files
    find "$ks_dir" -name "*-Data.db" \
        ! -path "*/snapshots/*" ! -path "*/backups/*" -print0 | \
        xargs -0 -r sstablerepairedset --really-set --is-unrepaired
done
echo "All user SSTables marked as unrepaired"
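The keyspace filter in the script can be tried on its own before any SSTable is modified. This sketch wraps the same prefix rule in a hypothetical function, `list_user_keyspaces`. Note that the rule is purely name-based, so a user keyspace whose name happens to start with "system" would be skipped as well.

```bash
# Hypothetical helper mirroring the script's filter: print keyspace directory
# names under a data directory, skipping any whose name starts with "system".
# Usage: list_user_keyspaces <data_dir>
list_user_keyspaces() {
    for ks_dir in "$1"/*/; do
        [ -d "$ks_dir" ] || continue      # glob matched nothing
        ks_name=$(basename "$ks_dir")
        case "$ks_name" in
            system*) continue ;;
        esac
        echo "$ks_name"
    done
}
```

Calling `list_user_keyspaces /var/lib/cassandra/data` shows which keyspaces a node-wide run would process.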
When to Use sstablerepairedset¶
Scenario 1: Migrating to Incremental Repair¶
#!/bin/bash
# migrate_to_incremental.sh <keyspace>
KEYSPACE="${1:?usage: $0 <keyspace>}"
# 1. Stop Cassandra
sudo systemctl stop cassandra
# 2. Mark all existing SSTables as unrepaired
# This ensures they will be included in the first incremental repair
find "/var/lib/cassandra/data/${KEYSPACE}/" -name "*-Data.db" \
    ! -path "*/snapshots/*" ! -path "*/backups/*" -print0 | \
    xargs -0 -r sstablerepairedset --really-set --is-unrepaired
# 3. Start Cassandra
sudo systemctl start cassandra
# 4. Run incremental repair once the node is up and accepting connections
nodetool repair -pr "$KEYSPACE"
Scenario 2: Fixing Incorrect Repair Status¶
# If SSTables were incorrectly marked as repaired
sudo systemctl stop cassandra
# Reset to unrepaired
sstablerepairedset --really-set --is-unrepaired \
/var/lib/cassandra/data/my_keyspace/my_table-*/*-Data.db
sudo systemctl start cassandra
# Re-run repair
nodetool repair my_keyspace my_table
Scenario 3: After sstablescrub¶
# Scrub marks SSTables as unrepaired, verify status
# Check current status
sstablemetadata /var/lib/cassandra/data/my_keyspace/my_table-*/*-Data.db | \
grep "Repaired"
# If needed, ensure consistent state
sudo systemctl stop cassandra
sstablerepairedset --really-set --is-unrepaired \
/var/lib/cassandra/data/my_keyspace/my_table-*/*-Data.db
sudo systemctl start cassandra
# Repair to restore consistency
nodetool repair my_keyspace my_table
Scenario 4: After Data Recovery¶
# After restoring from backup, data needs repair verification
sudo systemctl stop cassandra
# Mark restored SSTables as unrepaired
sstablerepairedset --really-set --is-unrepaired \
/var/lib/cassandra/data/restored_keyspace/*/*-Data.db
sudo systemctl start cassandra
# Full repair to sync with cluster (-full is needed because repair is incremental by default)
nodetool repair -full restored_keyspace
Scenario 5: Switching from Full to Incremental Repair¶
#!/bin/bash
# setup_incremental_repair.sh
# 1. Ensure all repairs are complete
nodetool repair -full
# 2. Stop Cassandra
sudo systemctl stop cassandra
# 3. Mark all as repaired (since full repair just completed)
# Skip system keyspaces and snapshot/backup copies
find /var/lib/cassandra/data/ -name "*-Data.db" \
    ! -path "*/system*" ! -path "*/snapshots/*" ! -path "*/backups/*" -print0 | \
    xargs -0 sstablerepairedset --really-set --is-repaired
# 4. Start Cassandra
sudo systemctl start cassandra
# 5. Future repairs will be incremental
# Only new unrepaired SSTables will be included
Verification¶
Check Current Repair Status¶
# View repair status for all SSTables
sstablemetadata /var/lib/cassandra/data/my_keyspace/my_table-*/*-Data.db | \
grep -E "SSTable:|Repaired"
Sample Output¶
SSTable: nb-1-big-Data.db
Repaired At: 1705401600000 (2024-01-16T10:40:00.000Z)
SSTable: nb-2-big-Data.db
Repaired At: 0 (unrepaired)
SSTable: nb-3-big-Data.db
Repaired At: 1705488000000 (2024-01-17T10:40:00.000Z)
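Output in this shape is easy to tally with awk. The sketch below defines a hypothetical function, `summarize_repair_status`, that reads such lines from standard input. It assumes the value is the third whitespace-separated field, as in the sample above; the exact label and layout may differ between Cassandra versions, so the match is case-insensitive.

```bash
# Hypothetical helper: read "Repaired At:" lines from stdin and print how
# many SSTables are repaired vs unrepaired.
summarize_repair_status() {
    awk '
        tolower($0) ~ /^[ \t]*repaired at:/ {
            # Field 3 is the repaired_at value; 0 means unrepaired
            if ($3 == 0) unrepaired++; else repaired++
        }
        END { printf "repaired=%d unrepaired=%d\n", repaired, unrepaired }
    '
}
```

It can be fed directly from the verification command, for instance by piping the sstablemetadata output into `summarize_repair_status`.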
Audit Script¶
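#!/bin/bash
# repair_status_audit.sh <keyspace>
KEYSPACE="${1:?usage: $0 <keyspace>}"
echo "Repair Status Audit for ${KEYSPACE}"
echo "===================================="
repaired=0
unrepaired=0
unknown=0
for sstable in "/var/lib/cassandra/data/${KEYSPACE}"/*/*-Data.db; do
    [ -e "$sstable" ] || continue  # glob matched nothing
    # Case-insensitive match in case the label's capitalization differs from the sample
    status=$(sstablemetadata "$sstable" 2>/dev/null | grep -i "Repaired at:" | awk '{print $3}')
    if [ -z "$status" ]; then
        # Metadata could not be read; do not count the SSTable as repaired
        unknown=$((unknown + 1))
        echo "UNKNOWN: $(basename "$sstable")"
    elif [ "$status" = "0" ]; then
        unrepaired=$((unrepaired + 1))
        echo "UNREPAIRED: $(basename "$sstable")"
    else
        repaired=$((repaired + 1))
    fi
done
echo ""
echo "Summary:"
echo " Repaired: $repaired"
echo " Unrepaired: $unrepaired"
echo " Unknown: $unknown"
echo " Total: $((repaired + unrepaired + unknown))"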
Impact of Repair Status¶
Marking as Repaired¶
| Aspect | Effect |
|---|---|
| Incremental repair | Excluded from future repairs |
| Compaction | Compacted only with other repaired SSTables |
| Anti-entropy | Considered consistent |
| Risk | If data actually inconsistent, won't be fixed |
Marking as Unrepaired¶
| Aspect | Effect |
|---|---|
| Incremental repair | Included in next repair |
| Compaction | Compacted only with other unrepaired SSTables |
| Anti-entropy | Will be verified against replicas |
| Risk | Increased repair work, but ensures consistency |
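Compaction Segregation¶
Cassandra compacts repaired and unrepaired SSTables separately, so an SSTable is only merged with others of the same repair status. Changing the status of an SSTable therefore also changes which SSTables it can be compacted with.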
Troubleshooting¶
Permission Denied¶
# Run as cassandra user
sudo -u cassandra sstablerepairedset --really-set --is-unrepaired \
/var/lib/cassandra/data/.../*-Data.db
# Or fix ownership after
sudo chown -R cassandra:cassandra /var/lib/cassandra/data/
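Before resorting to a recursive chown, it may be useful to see which files actually have the wrong owner, for example after the tool was run as root. This is a small sketch with a hypothetical helper name, `list_foreign_owned`; it only lists files and changes nothing.

```bash
# Hypothetical helper: list files under a directory NOT owned by the given
# user (name or numeric id). Empty output means ownership is consistent.
# Usage: list_foreign_owned <directory> <user>
list_foreign_owned() {
    find "$1" ! -user "$2" -print
}
```

For instance, `list_foreign_owned /var/lib/cassandra/data cassandra` prints nothing when every file already belongs to the cassandra user.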
Cassandra Still Running¶
# Must stop Cassandra first!
nodetool drain
sudo systemctl stop cassandra
# Verify stopped
pgrep -f CassandraDaemon # Should return nothing
# Now safe to run
sstablerepairedset --really-set --is-unrepaired /path/to/sstable-Data.db
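The pgrep check above can be turned into a guard that scripts call before touching any SSTable. The function name `require_cassandra_stopped` is hypothetical, and the sketch assumes the Cassandra JVM is identifiable by `CassandraDaemon` on its command line, as in the check above.

```bash
# Hypothetical guard: succeed only when no CassandraDaemon process is found.
require_cassandra_stopped() {
    # Without pgrep the check cannot be trusted, so refuse to continue
    command -v pgrep >/dev/null 2>&1 || {
        echo "pgrep not found; cannot verify that Cassandra is stopped" >&2
        return 2
    }
    if pgrep -f CassandraDaemon >/dev/null 2>&1; then
        echo "Cassandra appears to be running; stop it first" >&2
        return 1
    fi
    return 0
}
```

A script could then start with `require_cassandra_stopped || exit 1` so that it aborts instead of modifying metadata under a live node.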
Missing --really-set Flag¶
# Error: Must provide --really-set flag
# The flag is required as a safety measure
sstablerepairedset --really-set --is-unrepaired /path/to/sstable-Data.db
Forgot --is-repaired or --is-unrepaired¶
# Error: Must specify either --is-repaired or --is-unrepaired
# Choose one:
sstablerepairedset --really-set --is-repaired /path/to/sstable-Data.db
# or
sstablerepairedset --really-set --is-unrepaired /path/to/sstable-Data.db
Best Practices¶
sstablerepairedset Guidelines
- Understand implications - A wrong status can leave inconsistent data unrepaired
- Prefer unrepaired when uncertain - Safer to re-repair
- Verify after setting - Check with sstablemetadata
- Document changes - Track what was modified and why
- Run repair after - Especially after marking unrepaired
- Backup first - Snapshot before bulk changes
- Consistent approach - Apply to all nodes in cluster
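The "verify after setting" guideline can be automated. The sketch below defines a hypothetical function, `verify_repair_status`, that compares each SSTable against an expected status. It assumes sstablemetadata prints a "Repaired At: <value>" line as in the sample output on this page; the exact label may vary by version, hence the case-insensitive match.

```bash
# Hypothetical helper: check that every listed SSTable reports the expected
# status ("repaired" or "unrepaired") according to sstablemetadata.
# Usage: verify_repair_status <expected> <sstable>...
verify_repair_status() {
    expected="$1"; shift
    rc=0
    for sstable in "$@"; do
        value=$(sstablemetadata "$sstable" 2>/dev/null |
                grep -i 'repaired at:' | awk '{print $3}')
        case "$value" in
            '') actual="unknown" ;;      # metadata could not be read
            0)  actual="unrepaired" ;;
            *)  actual="repaired" ;;
        esac
        if [ "$actual" != "$expected" ]; then
            echo "MISMATCH: $sstable is $actual, expected $expected" >&2
            rc=1
        fi
    done
    return "$rc"
}
```

After a bulk change, `verify_repair_status unrepaired /var/lib/cassandra/data/my_keyspace/my_table-*/*-Data.db` returns non-zero if any file was missed.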
Critical Warnings
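- Never mark as repaired without actual repair - Inconsistencies are then skipped by incremental repair and go unnoticed
- Repair status affects compaction - Repaired and unrepaired SSTables are compacted separately
- Cluster-wide impact - Apply the same change on every replica so repair status stays consistent across nodes
- Cannot undo easily - A wrong status is corrected only by marking unrepaired and re-running repair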
Related Commands¶
| Command | Relationship |
|---|---|
| sstablemetadata | Verify repair status |
| nodetool repair | Run repair operations |
| nodetool repair_admin | Check repair progress |
| sstablescrub | Also affects repair status |
| sstableupgrade | Also affects repair status |