nodetool getcolumnindexsize

Displays the current column index size threshold.


Synopsis

nodetool [connection_options] getcolumnindexsize

Description

nodetool getcolumnindexsize displays the current column index size threshold in kilobytes. This threshold controls how frequently Cassandra creates index entries within partitions when writing SSTables. The column index (more accurately, an index of the rows within each partition) enables efficient row lookups within large partitions.

Understanding the Column Index

Despite the name "column index," this setting controls the granularity of the index inside each partition, that is, how Cassandra navigates within a partition to find specific rows. See setcolumnindexsize for a detailed explanation of what this index does and why it matters.


What This Value Means

The column index size threshold determines the maximum amount of partition data written before Cassandra creates an index entry:

| Displayed Value | Meaning |
|-----------------|---------|
| 16 KB | Index entry created every 16 KB of partition data |
| 64 KB (default) | Index entry created every 64 KB of partition data |
| 128 KB | Index entry created every 128 KB of partition data |

Impact on Queries

Partition Data (256 KB total)

With 64 KB threshold (default):
┌────────────┬────────────┬────────────┬────────────┐
│  Block 1   │  Block 2   │  Block 3   │  Block 4   │
└────────────┴────────────┴────────────┴────────────┘
      ▲            ▲            ▲            ▲
   Index 1      Index 2      Index 3      Index 4

→ 4 index entries = 4 possible seek points

With 128 KB threshold:
┌──────────────────────┬──────────────────────┐
│       Block 1        │       Block 2        │
└──────────────────────┴──────────────────────┘
           ▲                      ▲
        Index 1                Index 2

→ 2 index entries = 2 possible seek points (less precise)
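
The entry counts in the diagrams follow from dividing the partition size by the threshold and rounding up. A minimal sketch of that arithmetic is shown here; `index_entries` is a hypothetical helper name, not a nodetool command, and the result is only an estimate because real index entries fall on row boundaries.

```shell
#!/bin/bash
# index_entries PARTITION_KB THRESHOLD_KB
# Hypothetical helper: estimates the number of index entries as
# ceil(partition size / threshold), using integer arithmetic.
index_entries() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

index_entries 256 64    # 256 KB partition, 64 KB threshold  -> prints 4
index_entries 256 128   # same partition, 128 KB threshold   -> prints 2
```

Halving the threshold roughly doubles the number of seek points for the same partition, which is the trade-off the diagrams illustrate.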

Examples

Basic Usage

nodetool getcolumnindexsize

Sample output:

Current column index size: 64 KB
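
For scripting, it can help to reduce that line to a bare number. The sketch below assumes the output contains a single integer, as in the sample above; the exact wording of the line may vary between Cassandra versions, and `parse_index_size_kb` is a hypothetical helper name.

```shell
#!/bin/bash
# parse_index_size_kb
# Hypothetical helper: reads getcolumnindexsize output on stdin and
# prints the first integer it finds (the threshold in KB).
parse_index_size_kb() {
    grep -oE '[0-9]+' | head -n 1
}

# Demo with the sample line instead of a live node; prints 64
echo "Current column index size: 64 KB" | parse_index_size_kb
```

Against a live node, the same filter would be used as: nodetool getcolumnindexsize | parse_index_size_kb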

Check Value on Remote Node

ssh 192.168.1.100 "nodetool getcolumnindexsize"

Check Value Across Cluster

#!/bin/bash
# check_column_index_cluster.sh

echo "=== Column Index Size Across Cluster ==="

# Get list of node IPs from local nodetool status
nodes=$(nodetool status | grep "^UN" | awk '{print $2}')

for node in $nodes; do
    echo -n "$node: "
    # Run nodetool remotely and filter locally, so an ssh failure also prints FAILED
    ssh "$node" "nodetool getcolumnindexsize 2>/dev/null" | grep -oP '\d+ KB' || echo "FAILED"
done

Sample output:

=== Column Index Size Across Cluster ===
192.168.1.101: 64 KB
192.168.1.102: 64 KB
192.168.1.103: 32 KB   <-- Inconsistent!

Interpreting the Value

Default Value (64 KB)

The default value of 64 KB is appropriate for most workloads. This provides:

  • Reasonable index granularity for partitions up to ~10 MB
  • Balanced memory usage for index storage
  • Good read performance for typical access patterns

Values Below Default (16-32 KB)

Indicates the cluster has been tuned for:

  • Large partitions (> 10 MB average)
  • Frequent point queries on wide partitions
  • Latency-sensitive read workloads

Trade-off: Higher memory usage for index storage.

Values Above Default (128+ KB)

Indicates the cluster has been tuned for:

  • Small partitions (< 100 KB average)
  • Memory-constrained nodes
  • Workloads where read latency is less critical

Trade-off: Potentially slower row lookups within partitions.
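
The three cases above can be folded into a small check for use in monitoring scripts. This is a sketch only: `classify_index_size` is a hypothetical helper, and it assumes the 64 KB default described in this section.

```shell
#!/bin/bash
# classify_index_size SIZE_KB
# Hypothetical helper: labels a column index size relative to the
# 64 KB default assumed in this section.
classify_index_size() {
    if [ "$1" -lt 64 ]; then
        echo "below default: finer index, higher memory usage"
    elif [ "$1" -gt 64 ]; then
        echo "above default: coarser index, potentially slower row lookups"
    else
        echo "default"
    fi
}

classify_index_size 32
classify_index_size 64
classify_index_size 128
```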


When to Check This Value

Performance Investigation

When read latency is higher than expected:

# Check current setting
nodetool getcolumnindexsize

# Compare with partition sizes
nodetool tablestats my_keyspace.my_table | grep -i "compacted partition"

# If large partitions + large column index size = potential issue
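
To compare partition sizes with the threshold in the same unit, the byte counts reported by tablestats can be converted to KB. The sketch below assumes tablestats prints lines of the form "Compacted partition maximum bytes: N"; `partition_stat_kb` is a hypothetical helper name.

```shell
#!/bin/bash
# partition_stat_kb STAT
# Hypothetical helper: reads `nodetool tablestats` output on stdin and
# prints the "Compacted partition STAT bytes" figure in whole KB.
# STAT is one of: minimum, maximum, mean.
partition_stat_kb() {
    awk -F': *' -v stat="$1" '
        $0 ~ ("Compacted partition " stat " bytes") {
            printf "%d\n", $2 / 1024
            exit
        }'
}

# Demo with a captured line instead of a live node; prints 20480
printf '\t\tCompacted partition maximum bytes: 20971520\n' | partition_stat_kb maximum
```

Against a live node this would read: nodetool tablestats my_keyspace.my_table | partition_stat_kb maximum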

Configuration Audit

Ensure consistent configuration across cluster:

#!/bin/bash
# audit_column_index.sh

echo "Checking column index size consistency..."

values=()
for node in $(nodetool status | grep "^UN" | awk '{print $2}'); do
    value=$(ssh "$node" "nodetool getcolumnindexsize 2>/dev/null | grep -oP '\d+'")
    values+=("$node:$value")
done

# Check for inconsistency
unique_values=$(printf '%s\n' "${values[@]}" | cut -d: -f2 | sort -u | wc -l)

if [ "$unique_values" -gt 1 ]; then
    echo "WARNING: Inconsistent column index sizes detected!"
    printf '%s\n' "${values[@]}"
else
    echo "OK: All nodes have consistent column index size"
fi

Before Tuning

Always check current value before making changes:

# Document current state (the command already labels its own output)
nodetool getcolumnindexsize
echo "Partition sizes:"
nodetool tablestats my_keyspace.my_table | grep -i "partition"

# Then make informed decision about changes

Relationship to cassandra.yaml

The displayed value reflects the runtime setting, which may differ from cassandra.yaml:

| Source | Precedence | Persistence |
|--------|------------|-------------|
| nodetool setcolumnindexsize | Active at runtime | Lost on restart |
| cassandra.yaml | Loaded at startup | Permanent |

Checking Configuration vs Runtime

# Runtime value (what's actually in use)
nodetool getcolumnindexsize

# Configuration file value (what will be used after restart)
grep "column_index_size" /etc/cassandra/cassandra.yaml

If these differ, the runtime value was most likely changed via nodetool setcolumnindexsize (or cassandra.yaml was edited after the node started). In either case, the node uses the configuration file value after its next restart.
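
The comparison can be scripted. The sketch below makes two assumptions that may not hold for every installation: the key is named either column_index_size_in_kb or column_index_size (the name differs between Cassandra versions), and any unit suffix on the value means KiB. Both function names are hypothetical.

```shell
#!/bin/bash
# yaml_index_size_kb
# Hypothetical helper: reads cassandra.yaml on stdin and prints the number
# from the first uncommented column_index_size(_in_kb) line.
# Assumption: a unit suffix, if present, is KiB.
yaml_index_size_kb() {
    sed -nE 's/^[[:space:]]*column_index_size(_in_kb)?:[[:space:]]*([0-9]+).*/\2/p' | head -n 1
}

# compare_index_size RUNTIME_KB CONFIGURED_KB
# Reports whether the runtime and configured values agree.
compare_index_size() {
    if [ "$1" = "$2" ]; then
        echo "match"
    else
        echo "differs: runtime=$1 KB, configured=$2 KB"
    fi
}

# Demo with inline values instead of a live node and a real file
configured=$(printf 'column_index_size_in_kb: 64\n' | yaml_index_size_kb)
compare_index_size 32 "$configured"
```

On a node, the configured value would come from: yaml_index_size_kb < /etc/cassandra/cassandra.yaml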


Decision Guide

Based on the value returned:

| If Value Is | And Partition Sizes Are | Consider |
|-------------|-------------------------|----------|
| 64 KB | Small (< 100 KB) | Increasing to 128 KB to save memory |
| 64 KB | Large (> 10 MB) | Decreasing to 32 KB for better read latency |
| 16-32 KB | Small | Increasing to 64 KB (default may be better) |
| 128+ KB | Large | Decreasing for better read performance |
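
The table can be expressed as a lookup for use in audit scripts. This is a sketch under stated assumptions: `suggest_index_size` is a hypothetical helper, "small" and "large" use the 100 KB and 10 MB cut-offs from the table, and the partition size passed in is taken to be the mean partition size in KB. Combinations the table does not list produce no suggestion.

```shell
#!/bin/bash
# suggest_index_size SIZE_KB PARTITION_KB
# Hypothetical helper: maps the current column index size and the mean
# partition size (both in KB) to the suggestion from the decision table.
suggest_index_size() {
    local size_kb=$1 partition_kb=$2 class
    if [ "$partition_kb" -lt 100 ]; then
        class=small            # < 100 KB
    elif [ "$partition_kb" -gt 10240 ]; then
        class=large            # > 10 MB
    else
        class=other            # not covered by the table
    fi

    if [ "$size_kb" -eq 64 ] && [ "$class" = small ]; then
        echo "consider increasing to 128 KB"
    elif [ "$size_kb" -eq 64 ] && [ "$class" = large ]; then
        echo "consider decreasing to 32 KB"
    elif [ "$size_kb" -ge 16 ] && [ "$size_kb" -le 32 ] && [ "$class" = small ]; then
        echo "consider increasing to 64 KB"
    elif [ "$size_kb" -ge 128 ] && [ "$class" = large ]; then
        echo "consider decreasing"
    else
        echo "no suggestion"
    fi
}

suggest_index_size 64 50       # default threshold, small partitions
suggest_index_size 128 20480   # large threshold, 20 MB partitions
```

The output is a prompt for further analysis, not a recommendation to apply blindly; see setcolumnindexsize for the full tuning workflow.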

Common Issues

Value Differs from cassandra.yaml

# Check both
nodetool getcolumnindexsize
grep column_index_size /etc/cassandra/cassandra.yaml

# If they differ, the value was probably changed with setcolumnindexsize.
# To keep the runtime value, update cassandra.yaml; otherwise it reverts on restart.

Inconsistent Values Across Cluster

# Standardize across cluster
TARGET_SIZE=64

for node in $(nodetool status | grep "^UN" | awk '{print $2}'); do
    echo "Setting column index size on $node..."
    ssh "$node" "nodetool setcolumnindexsize $TARGET_SIZE"
done

# Update cassandra.yaml on all nodes for persistence

Not Sure If Current Value Is Optimal

See the comprehensive guide in setcolumnindexsize for:

  • How to analyze partition sizes
  • Trade-offs of different values
  • Tuning workflow and best practices

| Command | Relationship |
|---------|--------------|
| setcolumnindexsize | Modify the threshold (includes detailed explanation) |
| tablestats | Check partition sizes to inform tuning decisions |
| info | View memory usage including index structures |