sstablerepairedset¶

Marks SSTables as repaired or unrepaired, controlling their participation in incremental repair.

Synopsis¶

sstablerepairedset --really-set [--is-repaired | --is-unrepaired] <sstable_files>

Description¶

sstablerepairedset modifies the repair status metadata of SSTable files. This status determines how SSTables participate in Cassandra's incremental repair mechanism:

Repaired - SSTable has been verified consistent across replicas
Unrepaired - SSTable needs to be included in future repair operations

This tool is essential for:

Migrating to incremental repair - Mark existing SSTables appropriately
Fixing repair metadata - Correct incorrect repair status
Recovery scenarios - Reset repair state after data issues
Anti-entropy management - Control what gets repaired

Cassandra Must Be Stopped

Cassandra must be completely stopped before running sstablerepairedset. Modifying repair status while Cassandra is active can cause data inconsistency.

How Repair Status Works¶

Repair Status Values¶

Status	repaired_at Value	Meaning
Unrepaired	0	Needs repair verification
Repaired	Timestamp	Verified at that time
Pending	UUID	Part of ongoing repair

Arguments¶

Argument	Description
`sstable_files`	One or more paths to SSTable Data.db files

Options¶

Option	Description
`--really-set`	Required safety flag to confirm modification
`--is-repaired`	Mark SSTables as repaired
`--is-unrepaired`	Mark SSTables as unrepaired

The --really-set flag is mandatory to prevent accidental modification.

Examples¶

Mark as Repaired¶

# Stop Cassandra first
sudo systemctl stop cassandra

# Mark specific SSTable as repaired
sstablerepairedset --really-set --is-repaired \
    /var/lib/cassandra/data/my_keyspace/my_table-abc123/nb-1-big-Data.db

# Start Cassandra
sudo systemctl start cassandra

Mark as Unrepaired¶

# Mark SSTable as unrepaired
sstablerepairedset --really-set --is-unrepaired \
    /var/lib/cassandra/data/my_keyspace/my_table-abc123/nb-1-big-Data.db

Mark Multiple SSTables¶

# Mark all SSTables for a table as unrepaired
sstablerepairedset --really-set --is-unrepaired \
    /var/lib/cassandra/data/my_keyspace/my_table-*/*-Data.db

Using with find¶

# Mark all SSTables in keyspace as unrepaired
find /var/lib/cassandra/data/my_keyspace/ -name "*-Data.db" -print0 | \
    xargs -0 sstablerepairedset --really-set --is-unrepaired

Mark All SSTables on Node¶

#!/bin/bash
# mark_all_unrepaired.sh - Reset all repair status

DATA_DIR="/var/lib/cassandra/data"

# Find all user keyspaces (exclude system keyspaces)
for ks_dir in ${DATA_DIR}/*/; do
    ks_name=$(basename "$ks_dir")

    # Skip system keyspaces
    if [[ "$ks_name" == system* ]]; then
        continue
    fi

    echo "Processing keyspace: $ks_name"

    find "$ks_dir" -name "*-Data.db" -print0 | \
        xargs -0 -r sstablerepairedset --really-set --is-unrepaired
done

echo "All user SSTables marked as unrepaired"

When to Use sstablerepairedset¶

Scenario 1: Migrating to Incremental Repair¶

#!/bin/bash
# migrate_to_incremental.sh

KEYSPACE="$1"

# 1. Stop Cassandra
sudo systemctl stop cassandra

# 2. Mark all existing SSTables as unrepaired
# This ensures they will be included in the first incremental repair
find /var/lib/cassandra/data/${KEYSPACE}/ -name "*-Data.db" -print0 | \
    xargs -0 sstablerepairedset --really-set --is-unrepaired

# 3. Start Cassandra
sudo systemctl start cassandra

# 4. Run incremental repair
nodetool repair -pr "$KEYSPACE"

Scenario 2: Fixing Incorrect Repair Status¶

# If SSTables were incorrectly marked as repaired

sudo systemctl stop cassandra

# Reset to unrepaired
sstablerepairedset --really-set --is-unrepaired \
    /var/lib/cassandra/data/my_keyspace/my_table-*/*-Data.db

sudo systemctl start cassandra

# Re-run repair
nodetool repair my_keyspace my_table

Scenario 3: After sstablescrub¶

# Scrub marks SSTables as unrepaired, verify status

# Check current status
sstablemetadata /var/lib/cassandra/data/my_keyspace/my_table-*/*-Data.db | \
    grep "Repaired"

# If needed, ensure consistent state
sudo systemctl stop cassandra
sstablerepairedset --really-set --is-unrepaired \
    /var/lib/cassandra/data/my_keyspace/my_table-*/*-Data.db
sudo systemctl start cassandra

# Repair to restore consistency
nodetool repair my_keyspace my_table

Scenario 4: After Data Recovery¶

# After restoring from backup, data needs repair verification

sudo systemctl stop cassandra

# Mark restored SSTables as unrepaired
sstablerepairedset --really-set --is-unrepaired \
    /var/lib/cassandra/data/restored_keyspace/*/*.db

sudo systemctl start cassandra

# Full repair to sync with cluster
nodetool repair restored_keyspace

Scenario 5: Switching from Full to Incremental Repair¶

#!/bin/bash
# setup_incremental_repair.sh

# 1. Ensure all repairs are complete
nodetool repair -full

# 2. Stop Cassandra
sudo systemctl stop cassandra

# 3. Mark all as repaired (since full repair just completed)
find /var/lib/cassandra/data/ -name "*-Data.db" \
    ! -path "*/system*" -print0 | \
    xargs -0 sstablerepairedset --really-set --is-repaired

# 4. Start Cassandra
sudo systemctl start cassandra

# 5. Future repairs will be incremental
# Only new unrepaired SSTables will be included

Verification¶

Check Current Repair Status¶

# View repair status for all SSTables
sstablemetadata /var/lib/cassandra/data/my_keyspace/my_table-*/*-Data.db | \
    grep -E "SSTable:|Repaired"

Sample Output¶

SSTable: nb-1-big-Data.db
Repaired At: 1705401600000 (2024-01-16T12:00:00.000Z)

SSTable: nb-2-big-Data.db
Repaired At: 0 (unrepaired)

SSTable: nb-3-big-Data.db
Repaired At: 1705488000000 (2024-01-17T12:00:00.000Z)

Audit Script¶

#!/bin/bash
# repair_status_audit.sh

KEYSPACE="$1"

echo "Repair Status Audit for ${KEYSPACE}"
echo "===================================="

repaired=0
unrepaired=0

for sstable in /var/lib/cassandra/data/${KEYSPACE}/*/*-Data.db; do
    status=$(sstablemetadata "$sstable" 2>/dev/null | grep "Repaired At:" | awk '{print $3}')

    if [ "$status" = "0" ]; then
        unrepaired=$((unrepaired + 1))
        echo "UNREPAIRED: $(basename $sstable)"
    else
        repaired=$((repaired + 1))
    fi
done

echo ""
echo "Summary:"
echo "  Repaired: $repaired"
echo "  Unrepaired: $unrepaired"
echo "  Total: $((repaired + unrepaired))"

Impact of Repair Status¶

Marking as Repaired¶

Aspect	Effect
Incremental repair	Excluded from future repairs
Compaction	Compacted only with other repaired SSTables
Anti-entropy	Considered consistent
Risk	If data actually inconsistent, won't be fixed

Marking as Unrepaired¶

Aspect	Effect
Incremental repair	Included in next repair
Compaction	Compacted only with other unrepaired SSTables
Anti-entropy	Will be verified against replicas
Risk	Increased repair work, but ensures consistency

Compaction Segregation¶

Troubleshooting¶

Permission Denied¶

# Run as cassandra user
sudo -u cassandra sstablerepairedset --really-set --is-unrepaired \
    /var/lib/cassandra/data/.../*-Data.db

# Or fix ownership after
sudo chown -R cassandra:cassandra /var/lib/cassandra/data/

Cassandra Still Running¶

# Must stop Cassandra first!
nodetool drain
sudo systemctl stop cassandra

# Verify stopped
pgrep -f CassandraDaemon  # Should return nothing

# Now safe to run
sstablerepairedset --really-set --is-unrepaired /path/to/sstable-Data.db

Missing --really-set Flag¶

# Error: Must provide --really-set flag

# The flag is required as a safety measure
sstablerepairedset --really-set --is-unrepaired /path/to/sstable-Data.db

Forgot --is-repaired or --is-unrepaired¶

# Error: Must specify either --is-repaired or --is-unrepaired

# Choose one:
sstablerepairedset --really-set --is-repaired /path/to/sstable-Data.db
# or
sstablerepairedset --really-set --is-unrepaired /path/to/sstable-Data.db

Best Practices¶

sstablerepairedset Guidelines

Understand implications - Wrong status causes data issues
Prefer unrepaired when uncertain - Safer to re-repair
Verify after setting - Check with sstablemetadata
Document changes - Track what was modified and why
Run repair after - Especially after marking unrepaired
Backup first - Snapshot before bulk changes
Consistent approach - Apply to all nodes in cluster

Critical Warnings

Never mark as repaired without actual repair - Causes silent data loss
Repair status affects compaction - Wrong status causes issues
Cluster-wide impact - Inconsistent status causes problems
Cannot undo easily - Wrong status requires re-repair

Command	Relationship
sstablemetadata	Verify repair status
nodetool repair	Run repair operations
nodetool repair_admin	Check repair progress
sstablescrub	Also affects repair status
sstableupgrade	Also affects repair status