Secondary Index Query Execution¶

Secondary index queries in Cassandra employ distinct execution strategies compared to primary key queries. The distributed coordination model varies significantly between index implementations, with Storage Attached Index (SAI) introducing adaptive query routing mechanisms that differ fundamentally from the scatter-gather pattern used by legacy secondary indexes.

Query Execution Models¶

Primary Key Query Execution¶

Queries containing the complete partition key enable deterministic replica identification through token computation:

Execution characteristics:

Replica identification via token(partition_key) computation
Query scope limited to replication factor (RF) nodes
Node selection determined by consistency level requirements
Latency bounded by replica response times

Secondary Index Query Execution¶

Queries lacking partition key constraints cannot leverage token-based routing. Secondary indexes maintain local data structures—each node indexes exclusively the partitions it owns, necessitating distributed query coordination.

Legacy Index Query Model: Scatter-Gather¶

Secondary Index (2i) and SASI implementations employ a scatter-gather execution pattern where the coordinator broadcasts queries to all relevant nodes and aggregates responses.

Scatter Phase¶

Coordinator receives query containing indexed column predicate
Coordinator identifies target nodes:
Global query: One replica from each token range
Partition-restricted query: Replicas for specified partition only
Query dispatched to all identified nodes concurrently

Gather Phase¶

Each replica node:
Executes local index lookup
Retrieves matching partition keys from index structure
Performs base table reads for matched keys
Serializes and returns results
Coordinator:
Collects responses from all contacted nodes
Merges and deduplicates result sets
Applies post-filtering for non-indexed predicates
Returns final result to client

Latency Characteristics¶

Query latency in scatter-gather execution is dominated by the slowest responding node:

T_total = T_network + max(T_node_1, T_node_2, ..., T_node_n) + T_merge

Where T_node_i = T_index_lookup + T_base_read + T_serialization

This tail latency sensitivity presents challenges in heterogeneous clusters where node performance varies.

SAI Query Model: Adaptive Range Reading¶

Storage Attached Index introduces a fundamentally different distributed query architecture. Rather than employing static scatter-gather, SAI implements adaptive coordinator-based range reading with dynamic concurrency adjustment.

Coordinator-Level Query Planning¶

The SAI coordinator performs query analysis to optimize distributed execution:

Index selectivity analysis: Identifies the most selective index among available SAI indexes to narrow the filtering space
Read command construction: Embeds selected index metadata into the distributed range read command
Concurrency factor estimation: Calculates initial parallel replica query count using local data statistics and query limit parameters

Multi-Round Execution¶

SAI executes queries iteratively rather than broadcasting to all nodes simultaneously:

Round initialization: Coordinator dispatches requests to a subset of token ranges based on computed concurrency factor
Response collection: Await responses from requested replicas
Concurrency adjustment: Recompute concurrency factor based on returned row count relative to query limit
Iteration: Repeat until query limit is satisfied or all ranges exhausted

This adaptive approach yields several advantages:

Reduced network overhead for queries satisfied by early rounds
Dynamic adjustment to cluster conditions
Bounded coordinator memory consumption

Index Status Propagation¶

SAI nodes propagate local index status to peers via the gossip protocol. This mechanism enables coordinators to:

Filter replicas with non-queryable indexes from request routing
Detect index build completion status
Identify failed index states requiring rebuild

Replica-Level Query Processing¶

Upon receiving a token range request, replicas execute local index queries through the Query Plan interface:

Query Plan Components¶

Expression analysis: Determines which available SAI column indexes satisfy query predicates
Query Controller acquisition: Obtains references to active SSTable indexes intersecting the requested token range
Reference retention: Maintains SSTable references until query completion, preventing compaction interference

Token Flow Framework¶

SAI employs the Token Flow framework for asynchronous iteration over partition matches across Memtable and SSTable indexes. The framework supports three core operations:

Operation	Description
Iteration	Sequential disk access via chunk cache to row IDs for partition key lookups
Token skipping	Advances past unmatched tokens at paging boundaries or during token intersection
Stream merging	Combines results from multiple indexes using Boolean logic for multi-predicate queries

Post-Filtering Pipeline¶

Following index-based partition key retrieval, SAI applies a filter tree for secondary filtering:

Filter Stage	Purpose
Partition granularity	Wide partitions require row-level filtering beyond partition key matches
Tombstone elimination	Removes rows deleted after indexing but before compaction
Non-indexed predicates	Evaluates query conditions lacking index support

Query Scope Classification¶

Global Queries¶

Queries without partition key restriction require cluster-wide coordination:

-- Global query: all token ranges involved
SELECT * FROM users WHERE city = 'NYC';
SELECT * FROM events WHERE level = 'ERROR' LIMIT 1000;

Index Type	Execution Model
2i / SASI	Scatter to all nodes, gather all results
SAI	Adaptive range reading with concurrency factor

Partition-Restricted Queries¶

Queries including partition key constraints limit scope to replica nodes:

-- Partition-restricted: RF nodes only
SELECT * FROM users WHERE user_id = ? AND city = 'NYC';
SELECT * FROM events WHERE device_id = ? AND level = 'ERROR';

Performance comparison:

Query Type	Nodes Contacted	Latency Characteristics	Cluster Scalability
Global (2i/SASI)	All token ranges	Bounded by slowest node	Degrades linearly
Global (SAI)	Adaptive subset	Bounded by concurrency factor	Improved via adaptation
Partition-restricted	RF nodes	Predictable, bounded	Constant

Index Implementation Comparison¶

Secondary Index (2i)¶

Legacy secondary indexes utilize hidden tables for index storage with independent lifecycle management:

Characteristic	Implementation
Storage structure	Separate table per indexed column
Replication	Inherits base table replication strategy
Compaction	Independent from base table
Query coordination	Static scatter-gather
Consistency	Index and base table may diverge temporarily

SASI (SSTable Attached Secondary Index)¶

SASI attaches index structures directly to SSTables, ensuring synchronized lifecycle:

Characteristic	Implementation
Storage structure	Per-SSTable index components
Compaction	Synchronized with base table
Query coordination	Static scatter-gather
Multi-predicate	Single-pass intersection within SSTables
Memory model	Unbounded during query execution

SAI (Storage Attached Index)¶

SAI introduces architectural improvements addressing limitations of prior implementations:

Characteristic	Implementation
Storage structure	Per-SSTable index with segment architecture
Memtable indexing	Enables read-your-writes semantics
Query coordination	Adaptive range reading with concurrency factor
Multi-predicate	Token Flow framework with Boolean merging
Memory model	Bounded via segment flushing
Cluster awareness	Index status propagation via gossip

Performance Considerations¶

Latency Model¶

Index Type	Latency Formula
2i / SASI	`T = T_network + max(T_node_i) + T_merge`
SAI	`T = Σ(T_round_j)` where rounds ≤ ceil(LIMIT / concurrency_factor)

SAI's multi-round execution can terminate early when the query limit is satisfied, potentially reducing total latency for bounded queries.

Throughput Considerations¶

Factor	2i / SASI	SAI
Coordinator load	High (all results buffered)	Reduced (streaming, bounded)
Network utilization	Proportional to cluster size	Adaptive to result density
Memory pressure	Unbounded result accumulation	Bounded via concurrency factor

Operational Implications¶

Aspect	2i / SASI	SAI
Index build visibility	Limited	Gossip-based status propagation
Failed index handling	Manual detection required	Automatic coordinator filtering
Zero-copy streaming	Not supported	Fully compatible

Query Design Guidelines¶

Recommended Patterns¶

Pattern	Rationale
Include partition key when possible	Restricts query to RF nodes
Apply LIMIT clause	Enables SAI early termination
Prefer high-selectivity predicates	Reduces post-filtering overhead
Combine multiple indexed predicates	Leverages Token Flow intersection

Query Examples¶

Optimal: Partition-restricted with index

SELECT * FROM orders
WHERE customer_id = ?      -- partition key constraint
  AND status = 'pending';  -- SAI index predicate

Acceptable: Global query with limit

SELECT * FROM orders
WHERE status = 'pending'   -- SAI index predicate
  AND created_at > ?       -- SAI range predicate
LIMIT 100;                 -- enables early termination

Suboptimal: Unbounded global query

SELECT * FROM orders
WHERE status = 'pending';  -- unbounded result set

Monitoring¶

Key Metrics¶

# SAI-specific metrics
org.apache.cassandra.metrics:type=StorageAttachedIndex,name=*

# Per-table read latency
org.apache.cassandra.metrics:type=Table,keyspace=*,scope=*,name=ReadLatency

# Coordinator request metrics
org.apache.cassandra.metrics:type=ClientRequest,scope=Read,name=Latency

Diagnostic Indicators¶

Symptom	Probable Cause	Remediation
Elevated P99 latency	Tail latency in scatter-gather	Migrate to SAI, add partition key
Query timeouts	Unbounded result sets	Apply LIMIT clause
Coordinator GC pressure	Result set accumulation	Pagination, query redesign
Index query failures	Non-queryable index state	Check gossip status, rebuild index

Index Overview - Index type selection criteria
SAI - Storage Attached Index architecture
Consistency - Consistency level impact on index queries
Partitioning - Token distribution and routing

Secondary Index Query Execution¶

Query Execution Models¶

Primary Key Query Execution¶

Secondary Index Query Execution¶

Legacy Index Query Model: Scatter-Gather¶

Scatter Phase¶

Gather Phase¶

Latency Characteristics¶

SAI Query Model: Adaptive Range Reading¶

Coordinator-Level Query Planning¶

Multi-Round Execution¶

Index Status Propagation¶

Replica-Level Query Processing¶

Query Plan Components¶

Token Flow Framework¶

Post-Filtering Pipeline¶

Query Scope Classification¶

Global Queries¶

Partition-Restricted Queries¶

Index Implementation Comparison¶

Secondary Index (2i)¶

SASI (SSTable Attached Secondary Index)¶

SAI (Storage Attached Index)¶

Performance Considerations¶

Latency Model¶

Throughput Considerations¶

Operational Implications¶

Query Design Guidelines¶

Recommended Patterns¶

Query Examples¶

Monitoring¶

Key Metrics¶

Diagnostic Indicators¶

Related Documentation¶