nodetool getcolumnindexsize¶

Displays the current column index size threshold.

Synopsis¶

nodetool [connection_options] getcolumnindexsize

Description¶

nodetool getcolumnindexsize displays the current column index size threshold in kilobytes. This threshold controls how frequently Cassandra creates index entries within partitions when writing SSTables. The column index (more accurately called the partition index) enables efficient row lookups within large partitions.

Understanding the Column Index

Despite the name "column index," this setting controls partition index granularity—how Cassandra navigates within partitions to find specific rows. See setcolumnindexsize for detailed explanation of what this index does and why it matters.

What This Value Means¶

The column index size threshold determines the maximum amount of partition data written before Cassandra creates an index entry:

Displayed Value	Meaning
16 KB	Index entry created every 16 KB of partition data
64 KB (default)	Index entry created every 64 KB of partition data
128 KB	Index entry created every 128 KB of partition data

Impact on Queries¶

Partition Data (256 KB total)

With 64 KB threshold (default):
┌────────────┬────────────┬────────────┬────────────┐
│  Block 1   │  Block 2   │  Block 3   │  Block 4   │
└────────────┴────────────┴────────────┴────────────┘
      ▲            ▲            ▲            ▲
   Index 1      Index 2      Index 3      Index 4

→ 4 index entries = 4 possible seek points

With 128 KB threshold:
┌──────────────────────┬──────────────────────┐
│       Block 1        │       Block 2        │
└──────────────────────┴──────────────────────┘
           ▲                      ▲
        Index 1                Index 2

→ 2 index entries = 2 possible seek points (less precise)

Examples¶

Basic Usage¶

nodetool getcolumnindexsize

Sample output:

Current column index size: 64 KB

Check Value on Remote Node¶

ssh 192.168.1.100 "nodetool getcolumnindexsize"

Check Value Across Cluster¶

#!/bin/bash
# check_column_index_cluster.sh

echo "=== Column Index Size Across Cluster ==="# Get list of node IPs from local nodetool status


nodes=$(nodetool status | grep "^UN" | awk '{print $2}')

for node in $nodes; do
    echo -n "$node: "
    ssh "$node" "nodetool getcolumnindexsize 2>/dev/null | grep -oP '\d+ KB' || echo "FAILED""
done

Sample output:

=== Column Index Size Across Cluster ===
192.168.1.101: 64 KB
192.168.1.102: 64 KB
192.168.1.103: 32 KB   <-- Inconsistent!

Interpreting the Value¶

Default Value (64 KB)¶

The default value of 64 KB is appropriate for most workloads. This provides:

Reasonable index granularity for partitions up to ~10 MB
Balanced memory usage for index storage
Good read performance for typical access patterns

Values Below Default (16-32 KB)¶

Indicates the cluster has been tuned for:

Large partitions (> 10 MB average)
Frequent point queries on wide partitions
Latency-sensitive read workloads

Trade-off: Higher memory usage for index storage.

Values Above Default (128+ KB)¶

Indicates the cluster has been tuned for:

Small partitions (< 100 KB average)
Memory-constrained nodes
Workloads where read latency is less critical

Trade-off: Potentially slower row lookups within partitions.

When to Check This Value¶

Performance Investigation¶

When read latency is higher than expected:

# Check current setting
nodetool getcolumnindexsize

# Compare with partition sizes
nodetool tablestats my_keyspace.my_table | grep -E "partition size"

# If large partitions + large column index size = potential issue

Configuration Audit¶

Ensure consistent configuration across cluster:

#!/bin/bash
# audit_column_index.sh

echo "Checking column index size consistency..."

values=()
for node in $(nodetool status | grep "^UN" | awk '{print $2}'); do
    value=$(ssh "$node" "nodetool getcolumnindexsize 2>/dev/null | grep -oP '\d+')"
    values+=("$node:$value")
done

# Check for inconsistency
unique_values=$(printf '%s\n' "${values[@]}" | cut -d: -f2 | sort -u | wc -l)

if [ "$unique_values" -gt 1 ]; then
    echo "WARNING: Inconsistent column index sizes detected!"
    printf '%s\n' "${values[@]}"
else
    echo "OK: All nodes have consistent column index size"
fi

Before Tuning¶

Always check current value before making changes:

# Document current state
echo "Current column index size: $(nodetool getcolumnindexsize)"
echo "Partition sizes:"
nodetool tablestats my_keyspace.my_table | grep -i "partition"

# Then make informed decision about changes

Relationship to cassandra.yaml¶

The displayed value reflects the runtime setting, which may differ from cassandra.yaml:

Source	Precedence	Persistence
`nodetool setcolumnindexsize`	Active at runtime	Lost on restart
`cassandra.yaml`	Loaded at startup	Permanent

Checking Configuration vs Runtime¶

# Runtime value (what's actually in use)
nodetool getcolumnindexsize

# Configuration file value (what will be used after restart)
grep "column_index_size" /etc/cassandra/cassandra.yaml

If these differ, the runtime value was changed via nodetool setcolumnindexsize and will revert to the configuration file value on restart.

Decision Guide¶

Based on the value returned:

If Value Is	And Partition Sizes Are	Consider
64 KB	Small (< 100 KB)	Increasing to 128 KB to save memory
64 KB	Large (> 10 MB)	Decreasing to 32 KB for better read latency
16-32 KB	Small	Increasing to 64 KB (default may be better)
128+ KB	Large	Decreasing for better read performance

Common Issues¶

Value Differs from cassandra.yaml¶

# Check both
nodetool getcolumnindexsize
grep column_index_size /etc/cassandra/cassandra.yaml

# If different, someone used setcolumnindexsize
# Either update cassandra.yaml or wait for restart

Inconsistent Values Across Cluster¶

# Standardize across cluster
TARGET_SIZE=64

for node in $(nodetool status | grep "^UN" | awk '{print $2}'); do
    echo "Setting column index size on $node..."
    ssh "$node" "nodetool setcolumnindexsize $TARGET_SIZE"
done

# Update cassandra.yaml on all nodes for persistence

Not Sure If Current Value Is Optimal¶

See the comprehensive guide in setcolumnindexsize for:

How to analyze partition sizes
Trade-offs of different values
Tuning workflow and best practices

Command	Relationship
setcolumnindexsize	Modify the threshold (includes detailed explanation)
tablestats	Check partition sizes to inform tuning decisions
info	View memory usage including index structures