nodetool gettraceprobability

Displays the current probability of tracing CQL requests on the node.

Synopsis

nodetool [connection_options] gettraceprobability

Description

nodetool gettraceprobability shows the current probability (0.0 to 1.0) that any given CQL request will be traced. This setting controls Cassandra's probabilistic tracing feature, which samples a percentage of requests for detailed performance analysis.

What is Request Tracing?

Request tracing in Cassandra records detailed timing information about how a CQL query is processed across the cluster. When a request is traced, Cassandra captures:

Coordinator activity - Time spent parsing, planning, and coordinating the query
Replica communication - Time to send requests to and receive responses from replica nodes
Per-replica processing - What each replica did (memtable reads, SSTable reads, bloom filter checks)
Latency breakdown - Microsecond-level timing for each operation phase

Trace data is written to the system_traces keyspace, which contains two tables:

Table	Contents
`system_traces.sessions`	One row per traced request with summary information
`system_traces.events`	Detailed events for each traced request

Why Probabilistic Tracing?

Tracing has significant overhead: each traced request generates multiple writes to system_traces. Enabling tracing for all requests (probability 1.0) would:

Increase write amplification substantially
Consume significant disk space
Impact cluster performance

Probabilistic tracing samples a small percentage of requests to gather representative performance data without overwhelming the cluster. For example, with a probability of 0.001 (0.1%) set on every node of a cluster handling 100,000 requests/second, approximately 100 requests/second would be traced, which is enough for analysis without significant overhead.
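The arithmetic behind that example can be sketched as a small shell helper. The function name `expected_traced_per_sec` and its two arguments are illustrative assumptions; nothing here is read from Cassandra.

```shell
#!/bin/bash
# Estimate how many requests per second would be traced.
# Both arguments are illustrative inputs, not values read from a node.
expected_traced_per_sec() {
    local requests_per_sec="$1"   # total request rate being sampled
    local probability="$2"        # trace probability between 0.0 and 1.0
    awk -v r="$requests_per_sec" -v p="$probability" \
        'BEGIN { printf "%.1f\n", r * p }'
}

expected_traced_per_sec 100000 0.001    # the example from the text: prints 100.0
expected_traced_per_sec 100000 0.0001   # ten times lower sampling: prints 10.0
```

Because sampling is random, the observed rate will fluctuate around this expected value.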

Examples

Basic Usage

nodetool gettraceprobability

Sample Output

Current trace probability: 0.0

A value of 0.0 means no requests are being traced (the default).
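For scripting, the value can be pulled out of that line. The following is a minimal sketch that assumes the output keeps the `Current trace probability: <value>` shape shown above; the helper names `parse_trace_probability` and `is_tracing_enabled` are hypothetical. It takes everything after the colon rather than matching a fixed decimal pattern, because very small values may be printed in scientific notation (for example `1.0E-4`).

```shell
#!/bin/bash
# Read the command output on stdin and print the value after the colon.
parse_trace_probability() {
    awk -F': *' '/trace probability/ { print $2; exit }'
}

# Succeed (exit status 0) when the given probability is greater than zero.
# Adding 0 forces a numeric comparison, so forms like 1.0E-4 are handled.
is_tracing_enabled() {
    awk -v p="$1" 'BEGIN { exit !(p + 0 > 0) }'
}

# Usage with the sample output shown above; on a live node, pipe
# `nodetool gettraceprobability` into the function instead.
value=$(echo "Current trace probability: 0.0" | parse_trace_probability)
if is_tracing_enabled "$value"; then
    echo "tracing is active ($value)"
else
    echo "tracing is disabled"
fi
```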

Understanding Probability Values

Value	Percentage	Meaning	Use Case
0.0	0%	No tracing (default)	Normal production operation
0.0001	0.01%	1 in 10,000 requests	High-traffic production sampling
0.001	0.1%	1 in 1,000 requests	Production performance monitoring
0.01	1%	1 in 100 requests	Active troubleshooting
0.1	10%	1 in 10 requests	Development/testing
1.0	100%	All requests	Brief debugging only

Performance Impact

Values above 0.01 (1%) can noticeably impact performance on busy clusters. Values of 0.1 or higher should only be used briefly during active debugging sessions or in non-production environments.
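The guidance above can be turned into a rough check for monitoring scripts. This is a sketch only: the thresholds come from the table and the note above, while the function name `trace_probability_band` and the band labels are made up for illustration.

```shell
#!/bin/bash
# Map a trace probability to a coarse band based on the guidance above.
trace_probability_band() {
    awk -v p="$1" 'BEGIN {
        p += 0                                      # force numeric interpretation
        if (p < 0 || p > 1)  print "invalid"        # outside the 0.0 to 1.0 range
        else if (p == 0)     print "disabled"       # default, no tracing
        else if (p <= 0.01)  print "sampling"       # up to 1 in 100 requests
        else if (p < 0.1)    print "elevated"       # may noticeably affect busy clusters
        else                 print "debug-only"     # brief debugging or non-production
    }'
}

trace_probability_band 0.001   # prints sampling
trace_probability_band 0.1     # prints debug-only
```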

Viewing Trace Data

Once tracing is enabled and requests are sampled, trace data can be queried from system_traces:

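View Recent Trace Sessions

-- started_at is not part of the primary key, so this filter needs ALLOW FILTERING.
-- Timestamp arithmetic (- 1h) requires a Cassandra version that supports it (4.0 or later).
SELECT * FROM system_traces.sessions
WHERE started_at > toTimestamp(now()) - 1h
LIMIT 10
ALLOW FILTERING;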

View Events for a Specific Trace

-- First, get a session_id from sessions table
SELECT session_id, coordinator, request, started_at, duration
FROM system_traces.sessions LIMIT 5;

-- Then query events for that session
SELECT activity, source, source_elapsed, thread
FROM system_traces.events
WHERE session_id = <session_id_from_above>;

Example Trace Output

 activity                                          | source        | source_elapsed
---------------------------------------------------+---------------+----------------
 Parsing SELECT * FROM users WHERE id = ?          | 192.168.1.101 |             52
 Preparing statement                               | 192.168.1.101 |            118
 Determining replicas for query                    | 192.168.1.101 |            156
 Sending READ message to /192.168.1.102            | 192.168.1.101 |            203
 READ message received from /192.168.1.101         | 192.168.1.102 |             45
 Executing single-partition query on users         | 192.168.1.102 |            112
 Acquiring sstable references                      | 192.168.1.102 |            158
 Bloom filter allows skipping sstable 1            | 192.168.1.102 |            201
 Partition index with 1 entries found              | 192.168.1.102 |            289
 Seeking to partition indexed section              | 192.168.1.102 |            334
 Merging memtable contents                         | 192.168.1.102 |            412
 Read 1 live rows and 0 tombstone cells            | 192.168.1.102 |            498
 Enqueuing response to /192.168.1.101              | 192.168.1.102 |            534
 Processing response from /192.168.1.102           | 192.168.1.101 |           2341
 Request complete                                  | 192.168.1.101 |           2456
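When reading output like this, a quick first step is to see how far each node's clock got. The sketch below assumes the pipe-separated layout shown above and treats source_elapsed as microseconds elapsed on the reporting node; `summarize_trace` is a hypothetical helper name, and it reports the largest source_elapsed seen per source.

```shell
#!/bin/bash
# Print "<source> <max source_elapsed>" for each node found in trace output on stdin.
summarize_trace() {
    awk -F'|' '
        # Data rows have three fields and a purely numeric last column;
        # the header and the +---+ separator line are skipped by this test.
        NF == 3 && $3 ~ /^ *[0-9]+ *$/ {
            src = $2; gsub(/ /, "", src)
            us = $3 + 0
            if (!(src in max) || us > max[src]) max[src] = us
        }
        END { for (s in max) printf "%s %d\n", s, max[s] }
    ' | sort
}

# Usage with a few rows copied from the example above.
summarize_trace <<'EOF'
 Sending READ message to /192.168.1.102            | 192.168.1.101 |            203
 Enqueuing response to /192.168.1.101              | 192.168.1.102 |            534
 Request complete                                  | 192.168.1.101 |           2456
EOF
```

For the full example, the coordinator (192.168.1.101) ends at 2456 and the replica (192.168.1.102) at 534, which suggests most of the time was spent outside the replica's local read.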

Use Cases

Verify Tracing is Disabled

Before performance testing, ensure tracing isn't adding overhead:

nodetool gettraceprobability
# Expected output: Current trace probability: 0.0

Check if Debugging Session is Active

Verify if someone enabled tracing for troubleshooting:

nodetool gettraceprobability
# If > 0.0, tracing is active

Audit Cluster Configuration

Include in cluster health checks:

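#!/bin/bash
# Check trace probability on all nodes reported as Up/Normal (UN)

for node in $(nodetool status | awk '/^UN/ {print $2}'); do
    # Take the last field of the output so values such as 1.0E-4 are kept whole
    prob=$(ssh "$node" "nodetool gettraceprobability 2>/dev/null" | awk '{print $NF}')
    if [ -z "$prob" ]; then
        echo "WARNING: could not read trace probability from $node"
    elif [ "$prob" != "0.0" ]; then
        echo "WARNING: $node has trace probability $prob"
    fi
done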

Trace Probability and Performance

The relationship between trace probability and overhead:

Probability	Overhead	`system_traces` Growth	Recommended Duration
0.0	None	None	Indefinite (default)
0.0001-0.001	Minimal	Slow	Days to weeks
0.001-0.01	Low	Moderate	Hours to days
0.01-0.1	Moderate	Fast	Minutes to hours
0.1-1.0	High	Very fast	Minutes only
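One way to respect these durations is to enable tracing for a bounded window and then restore the previous value. The sketch below is an assumption-laden example rather than a supported tool: `trace_window` and the `NODETOOL` override variable are hypothetical names, and it relies only on `nodetool gettraceprobability` and `nodetool settraceprobability <value>`.

```shell
#!/bin/bash
# NODETOOL can be overridden (for example in tests); it defaults to the real CLI.
NODETOOL="${NODETOOL:-nodetool}"

# Enable sampling at the given probability for a fixed number of seconds,
# then restore the probability that was in effect before.
trace_window() {
    local probability="$1" seconds="$2" previous
    # Remember the current value (last field of the gettraceprobability output).
    previous=$("$NODETOOL" gettraceprobability | awk '{ print $NF }')
    "$NODETOOL" settraceprobability "$probability" || return 1
    sleep "$seconds"
    # Fall back to 0 if the previous value could not be read.
    "$NODETOOL" settraceprobability "${previous:-0}"
}

# Example (requires a running node): trace 1% of requests for five minutes.
# trace_window 0.01 300
```

This version does not restore the value if the script is interrupted during the sleep, so a follow-up `nodetool gettraceprobability` check is still worthwhile.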

Cleaning Up Trace Data

Trace data accumulates in system_traces with a default TTL of 24 hours. For extended tracing sessions, consider:

Lowering the TTL: ALTER TABLE system_traces.sessions WITH default_time_to_live = 3600;
Manually truncating: TRUNCATE system_traces.sessions; TRUNCATE system_traces.events;

Comparing with CQL TRACING

Cassandra offers two tracing mechanisms:

Feature	Probabilistic Tracing	CQL `TRACING ON`
Scope	Requests coordinated by each node where it is set	Single cqlsh session
Control	`nodetool settraceprobability`	`TRACING ON/OFF` in cqlsh
Sampling	Percentage-based	All queries in session
Use case	Production monitoring	Interactive debugging
Persistence	`system_traces` tables	`system_traces` tables

-- CQL session-level tracing (alternative to probabilistic)
TRACING ON;
SELECT * FROM my_keyspace.my_table WHERE id = 123;
TRACING OFF;

Related Commands

Command	Relationship
settraceprobability	Set the trace probability
proxyhistograms	View latency histograms
tablehistograms	View per-table latency histograms