AxonOps Coordinator Dashboard Metrics Mapping¶
This document maps the metrics used in the AxonOps Coordinator dashboard.
Dashboard Overview¶
The Coordinator dashboard monitors coordinator-level request handling in Cassandra. When a client sends a request, the coordinator node handles the request and coordinates with replica nodes. This dashboard tracks latency and throughput broken down by consistency level, providing insights into how different consistency levels impact performance.
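As a brief illustration of where these consistency-level breakdowns come from, the sketch below uses the Python cassandra-driver to issue a read and a write with explicit consistency levels; the contact point, keyspace, and table names are placeholders, not part of AxonOps. The coordinator that serves each request records its latency under the matching scope (for example `Read-LOCAL_QUORUM` or `Write-QUORUM`).

```python
# Sketch: how client-side consistency choices map to the dashboard's scope labels.
# Contact point, keyspace, and table names are placeholders.
from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

cluster = Cluster(["10.0.0.1"])           # any node may act as the coordinator
session = cluster.connect("my_keyspace")  # hypothetical keyspace

# Counted under scope Read-LOCAL_QUORUM on the coordinator that serves it
read = SimpleStatement(
    "SELECT name FROM users WHERE user_id = %s",
    consistency_level=ConsistencyLevel.LOCAL_QUORUM,
)
session.execute(read, ("alice",))

# Counted under scope Write-QUORUM
write = SimpleStatement(
    "INSERT INTO users (user_id, name) VALUES (%s, %s)",
    consistency_level=ConsistencyLevel.QUORUM,
)
session.execute(write, ("bob", "Bob"))

cluster.shutdown()
```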
Metrics Mapping¶
Client Request Metrics¶
| Dashboard Metric | Description | Attributes |
|---|---|---|
| `cas_ClientRequest_Latency` | Request latency at coordinator level | scope (Read/Write/RangeSlice), function (percentiles/count), dc, rack, host_id |
| `cas_Client_connectedNativeClients` | Number of connected native protocol clients | dc, rack, host_id |
| `cas_CommitLog_WaitingOnSegmentAllocation` | Time waiting for commit log segment allocation | function (percentile), dc, rack, host_id |
Scope Patterns¶
Read Operations¶
- `Read` - Simple read (no consistency level)
- `Read-ALL` - Read with ALL consistency
- `Read-ONE` - Read with ONE consistency
- `Read-TWO` - Read with TWO consistency
- `Read-THREE` - Read with THREE consistency
- `Read-QUORUM` - Read with QUORUM consistency
- `Read-LOCAL_QUORUM` - Read with LOCAL_QUORUM consistency
- `Read-EACH_QUORUM` - Read with EACH_QUORUM consistency
- `Read-SERIAL` - Read with SERIAL consistency
- `Read-LOCAL_SERIAL` - Read with LOCAL_SERIAL consistency
- `Read-LOCAL_ONE` - Read with LOCAL_ONE consistency
Write Operations¶
- `Write` - Simple write (no consistency level)
- `Write-ALL` - Write with ALL consistency
- `Write-ANY` - Write with ANY consistency
- `Write-ONE` - Write with ONE consistency
- `Write-TWO` - Write with TWO consistency
- `Write-THREE` - Write with THREE consistency
- `Write-QUORUM` - Write with QUORUM consistency
- `Write-LOCAL_QUORUM` - Write with LOCAL_QUORUM consistency
- `Write-EACH_QUORUM` - Write with EACH_QUORUM consistency
- `Write-LOCAL_ONE` - Write with LOCAL_ONE consistency
Range Operations¶
- `RangeSlice` - Range query operations (SELECT with ranges)
Query Examples¶
Coordinator Read Distribution (Pie Chart)¶
sum(cas_ClientRequest_Latency{axonfunction='rate',scope='Read*',scope!='Read',function='Count',function!='Min|Max',dc=~'$dc',rack=~'$rack',host_id=~'$host_id'}) by (scope)
Coordinator Write Distribution (Pie Chart)¶
sum(cas_ClientRequest_Latency{axonfunction='rate',scope='Write*',scope!='Write',function='Count',function!='Min|Max',dc=~'$dc',rack=~'$rack',host_id=~'$host_id'}) by (scope)
Read Latency by Consistency Level¶
cas_ClientRequest_Latency{scope='Read.*$consistency',function='$percentile',function!='Min|Max',dc=~'$dc',rack=~'$rack',host_id=~'$host_id'}
Range Read Latency¶
cas_ClientRequest_Latency{scope='RangeSlice',function='$percentile',function!='Min|Max',dc=~'$dc',rack=~'$rack',host_id=~'$host_id'}
Write Latency by Consistency Level¶
cas_ClientRequest_Latency{scope='Write.*$consistency',function='$percentile',function!='Min|Max',dc=~'$dc',rack=~'$rack',host_id=~'$host_id'}
Read Throughput by Consistency¶
sum(cas_ClientRequest_Latency{axonfunction='rate',scope='Read.*$consistency',function='Count',function!='Min|Max',dc=~'$dc',rack=~'$rack', host_id=~'$host_id'}) by ($groupBy)
Commit Log Waiting Time¶
cas_CommitLog_WaitingOnSegmentAllocation{dc=~'$dc',rack=~'$rack',host_id=~'$host_id',function='$percentile'}
Panel Organization¶
Consistency Distribution Section¶
- Coordinator Reads distribution - Pie chart showing read request distribution by consistency level
- Coordinator Writes distribution - Pie chart showing write request distribution by consistency level
Latency Statistics By Node Section¶
- Coordinator Read $consistency Latency - $percentile - Line chart for read latency at the selected consistency level
- Coordinator Range Read Request Latency - $percentile - Line chart for range query latency
- Coordinator Write $consistency Latency - $percentile - Line chart for write latency at the selected consistency level
Throughput Statistics Section¶
- Coordinator Read Throughput Per $groupBy ($consistency) - Count Per Second - Read operations per second
- Coordinator Range Read Request Throughput - Count Per Second - Range queries per second
- Coordinator Write Throughput Per $groupBy ($consistency) - Count Per Second - Write operations per second
Connections Section¶
- Number of Native Connections per host - Line chart showing client connections
Commitlog Statistics Section¶
- Waiting on Segment Allocation - Time spent waiting for commit log segments
Filters¶
- data center (`dc`) - Filter by data center
- rack - Filter by rack
- node (`host_id`) - Filter by specific node
- groupBy - Dynamic grouping (dc, rack, host_id)
- percentile - Select latency percentile (50th, 75th, 95th, 98th, 99th, 999th)
- consistency - Filter by consistency level (ALL, ANY, ONE, TWO, THREE, SERIAL, QUORUM, etc.)
Consistency Levels¶
Strong Consistency¶
- ALL - All replicas must respond
- QUORUM - Majority of replicas must respond
- LOCAL_QUORUM - Majority in local datacenter
- EACH_QUORUM - Quorum in each datacenter
Weak Consistency¶
- ONE - Only one replica must respond
- TWO - Two replicas must respond
- THREE - Three replicas must respond
- ANY - Any node can accept the write (including hints)
- LOCAL_ONE - One replica in local datacenter
Serial Consistency¶
- SERIAL - Linearizable consistency, used by lightweight transactions (see the sketch below)
- LOCAL_SERIAL - Linearizable in local datacenter
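The serial levels apply to lightweight transactions (conditional writes and their Paxos round). A minimal sketch with the Python cassandra-driver, using placeholder keyspace and table names, shows how a client sets the serial consistency level alongside the regular one.

```python
# Sketch: serial consistency applies to lightweight transactions (conditional
# writes and their Paxos phase). Keyspace, table, and contact point are placeholders.
from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

cluster = Cluster(["10.0.0.1"])
session = cluster.connect("my_keyspace")  # hypothetical keyspace

stmt = SimpleStatement(
    "INSERT INTO users (user_id, name) VALUES (%s, %s) IF NOT EXISTS",
    consistency_level=ConsistencyLevel.QUORUM,               # commit phase
    serial_consistency_level=ConsistencyLevel.LOCAL_SERIAL,  # Paxos phase
)
session.execute(stmt, ("carol", "Carol"))

cluster.shutdown()
```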
Important Considerations¶
Latency vs Consistency Trade-off:
- Higher consistency levels increase latency because the coordinator must wait for more replica acknowledgements (see the sketch below)
- Monitor percentiles to understand impact
- Consider LOCAL variants for multi-DC
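To see why higher levels cost more latency, the hypothetical Python sketch below computes how many replica acknowledgements the coordinator waits for at each level, using Cassandra's documented quorum formula (floor(RF/2) + 1) and example replication factors of 3 per datacenter.

```python
# Illustration only: replica acknowledgements the coordinator waits for at each
# consistency level, using Cassandra's quorum formula (RF // 2 + 1).
# The replication factors are example values, not taken from this dashboard.
rf_per_dc = {"dc1": 3, "dc2": 3}   # hypothetical NetworkTopologyStrategy settings
total_rf = sum(rf_per_dc.values())

required_acks = {
    "ONE": 1,
    "TWO": 2,
    "THREE": 3,
    "LOCAL_QUORUM": rf_per_dc["dc1"] // 2 + 1,  # majority in the local DC only
    "QUORUM": total_rf // 2 + 1,                # majority across all DCs
    "EACH_QUORUM": {dc: rf // 2 + 1 for dc, rf in rf_per_dc.items()},
    "ALL": total_rf,
}

for level, acks in required_acks.items():
    print(f"{level:<13} waits for {acks} acknowledgement(s)")
```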
Throughput Patterns:
- Distribution shows application consistency preferences
- Imbalanced distribution may indicate issues
- Monitor for consistency level changes
Coordinator Load:
- Each node can be a coordinator
- High coordinator load impacts performance
- Balance coordinator load using client-side load balancing (see the sketch below)
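One common way to balance coordinator load is a token-aware, DC-aware client-side policy; the sketch below uses the Python cassandra-driver, with a placeholder contact point and datacenter name.

```python
# Sketch: spreading coordinator load with a token-aware, DC-aware client-side
# load-balancing policy. Contact point and datacenter name are placeholders.
from cassandra.cluster import Cluster, ExecutionProfile, EXEC_PROFILE_DEFAULT
from cassandra.policies import DCAwareRoundRobinPolicy, TokenAwarePolicy

profile = ExecutionProfile(
    # Route each request to a replica of its partition in the local DC, so no
    # single node coordinates a disproportionate share of traffic.
    load_balancing_policy=TokenAwarePolicy(DCAwareRoundRobinPolicy(local_dc="dc1"))
)

cluster = Cluster(
    contact_points=["10.0.0.1"],
    execution_profiles={EXEC_PROFILE_DEFAULT: profile},
)
session = cluster.connect()
```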
Range Queries:
- Typically more expensive than point reads
- Monitor separately from regular reads
- Consider pagination for large ranges (see the example below)
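For pagination, a minimal sketch with the Python cassandra-driver: setting `fetch_size` makes the driver page through a range query rather than pulling the whole result set in one round trip (table name and page size are illustrative).

```python
# Sketch: paginating a range query (RangeSlice scope) instead of fetching the
# whole result in one round trip. Table name and fetch_size are illustrative.
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

cluster = Cluster(["10.0.0.1"])
session = cluster.connect("my_keyspace")  # hypothetical keyspace

stmt = SimpleStatement(
    "SELECT user_id, name FROM users",    # full-table range scan
    fetch_size=500,                       # rows per page
)

for row in session.execute(stmt):
    pass  # process each row; further pages are fetched as iteration continues

cluster.shutdown()
```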
Units and Display¶
- Latency: microseconds
- Throughput: reads/writes per second (rps/wps)
- Connections: count (short)
- Legend Format: `$dc - $host_id` or `$groupBy` for aggregated views