AxonOps Reporting Dashboard Metrics Mapping¶
This document maps the metrics used in the AxonOps Reporting dashboard.
Dashboard Overview¶
The Reporting dashboard provides high-level system resource utilization and coordinator performance metrics for reporting and capacity planning. It focuses on aggregated views of resource usage and request distribution by consistency level.
Metrics Mapping¶
System Resource Metrics¶
| Dashboard Metric | Description | Attributes |
|---|---|---|
| `host_Disk_Used` | Used disk space in bytes | mountpoint, dc, rack, host_id |
| `host_Disk_UsedPercent` | Disk usage percentage | mountpoint, dc, rack, host_id |
| `host_Disk_SectorsWrite` | Disk sectors written | partition, axonfunction (rate), dc, rack, host_id |
| `host_Disk_SectorsRead` | Disk sectors read | partition, axonfunction (rate), dc, rack, host_id |
| `host_CPU_Percent_Merge` | CPU usage percentage | time (real), dc, rack, host_id |
Coordinator Metrics¶
| Dashboard Metric | Description | Attributes |
|---|---|---|
| `cas_ClientRequest_Latency` | Request latency at coordinator | scope (Read/Write), function (percentiles/Count), axonfunction (rate), dc, rack, host_id |
Query Examples¶
System Resource Utilization¶
// Used Disk Space Per Node
host_Disk_Used{mountpoint=~'$mountpoint',dc=~'$dc',rack=~'$rack',host_id=~'$host_id'}
// Average Disk % Usage (Pie Chart)
// Used
avg(host_Disk_UsedPercent{mountpoint=~'$mountpoint',dc=~'$dc',rack=~'$rack',host_id=~'$host_id'})
// Free
avg(100-host_Disk_UsedPercent{mountpoint=~'$mountpoint',dc=~'$dc',rack=~'$rack',host_id=~'$host_id'})
// Max Disk Write Per Second
max(host_Disk_SectorsWrite{axonfunction='rate',dc=~'$dc',rack=~'$rack',host_id=~'$host_id',partition=~'$partition'})
// Max Disk Read Per Second
max(host_Disk_SectorsRead{axonfunction='rate',dc=~'$dc',rack=~'$rack',host_id=~'$host_id',partition=~'$partition'})
// Average CPU Usage per DC
avg(host_CPU_Percent_Merge{time='real',dc=~'$dc',rack=~'$rack',host_id=~'$host_id'}) by (dc)
Coordinator Distribution¶
// Coordinator Reads Distribution (Pie Chart)
sum(cas_ClientRequest_Latency{axonfunction='rate',scope='Read*',scope!='Read',function='Count',function!='Min|Max',dc=~'$dc',rack=~'$rack',host_id=~'$host_id'}) by (scope)
// Coordinator Writes Distribution (Pie Chart)
sum(cas_ClientRequest_Latency{axonfunction='rate',scope='Write*',scope!='Write',function='Count',function!='Min|Max',dc=~'$dc',rack=~'$rack',host_id=~'$host_id'}) by (scope)
Coordinator Performance by Consistency¶
// Read Throughput by Consistency
sum(cas_ClientRequest_Latency{axonfunction='rate',scope='Read.*$consistency',function='Count',function!='Min|Max',dc=~'$dc',rack=~'$rack',host_id=~'$host_id'}) by ($groupBy)
// Write Throughput by Consistency
sum(cas_ClientRequest_Latency{axonfunction='rate',scope='Write.*$consistency',function='Count',function!='Min|Max',dc=~'$dc',rack=~'$rack',host_id=~'$host_id'}) by ($groupBy)
// Max Read Latency by Consistency
max(cas_ClientRequest_Latency{scope='Read.*$consistency',function='$percentile',function!='Min|Max',dc=~'$dc',rack=~'$rack',host_id=~'$host_id'})
// Max Write Latency by Consistency
max(cas_ClientRequest_Latency{scope='Write.*$consistency',function='$percentile',function!='Min|Max',dc=~'$dc',rack=~'$rack',host_id=~'$host_id'})
Panel Organization¶
System Resource Utilization¶
- Average Disk % Usage - Pie chart showing used vs free disk space
- Used Disk Space Per Node - Line chart of disk usage trends
- Max Disk Read Per Second - Peak disk read throughput
- Max Disk Write Per Second - Peak disk write throughput
- Average CPU Usage per DC - CPU utilization by datacenter
Coordinator¶
- Coordinator Reads distribution - Pie chart of read consistency level distribution
- Coordinator Writes distribution - Pie chart of write consistency level distribution
- Coordinator Read Throughput Per $groupBy ($consistency) - Read ops/sec by consistency
- Total Coordinator Write Throughput Per $groupBy ($consistency) - Write ops/sec by consistency
- Max Coordinator Read $consistency Latency - Maximum read latency for selected consistency
- Max Coordinator Write $consistency Latency - Maximum write latency for selected consistency
Filters¶
- data center (`dc`) - Filter by data center
- rack - Filter by rack
- node (`host_id`) - Filter by specific node
- percentile - Select latency percentile (50th, 75th, 95th, 98th, 99th, 99.9th)
- groupBy - Dynamic grouping (dc, rack, host_id)
- mount point (`mountpoint`) - Filter by disk mount point
- partition - Filter by disk partition
- consistency - Filter by consistency level (ALL, ANY, ONE, TWO, THREE, SERIAL, QUORUM, etc.)
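The filters above appear in the queries as `$variable` placeholders. As a rough illustration only, the sketch below expands one of the documented queries with Python's `string.Template`. It assumes the dashboard performs plain textual substitution, which this document does not confirm, and the filter values are hypothetical.

```python
from string import Template

# Hypothetical filter selections; these values are made up for illustration.
filters = {
    "mountpoint": "/var/lib/cassandra",
    "dc": "dc1",
    "rack": ".*",
    "host_id": ".*",
}

# The "Used Disk Space Per Node" query from the Query Examples section.
QUERY = Template(
    "host_Disk_Used{mountpoint=~'$mountpoint',dc=~'$dc',"
    "rack=~'$rack',host_id=~'$host_id'}"
)


def expand(query: Template, selections: dict) -> str:
    """Replace each $variable in the query with the selected filter value.

    Assumption: substitution is purely textual. The real dashboard may
    escape or combine multi-select values differently.
    """
    return query.substitute(selections)


expanded = expand(QUERY, filters)
print(expanded)
```

Selecting `.*` for a filter in this sketch stands in for "all values", since the queries match these labels with the regex operator `=~`.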
Understanding the Reporting Dashboard¶
Purpose¶
- High-level cluster health overview
- Resource capacity planning
- Performance trend analysis
- Consistency level impact assessment
Key Metrics for Reporting¶
Resource Utilization:
- Disk usage trends for capacity planning
- I/O throughput for performance baseline
- CPU usage for load distribution
Consistency Patterns:
- Distribution shows application behavior
- Performance impact of consistency choices
- Helps optimize consistency settings
Aggregated Views:
- DC-level CPU averages
- Cluster-wide consistency distribution
- Peak performance metrics
Report Generation Use Cases¶
Capacity Planning Reports¶
Metrics to Include:
- Average Disk % Usage - Current utilization
- Used Disk Space Per Node - Growth trends
- Average CPU Usage per DC - Processing capacity
Performance Baseline Reports¶
Metrics to Include:
- Max Disk Read/Write Per Second - I/O capacity
- Max Coordinator Latencies - SLA compliance
- Throughput by Consistency - Workload patterns
Consistency Analysis Reports¶
Metrics to Include:
- Coordinator Read/Write distribution - Usage patterns
- Latency by Consistency - Performance impact
- Throughput by Consistency - Load distribution
Best Practices¶
Resource Monitoring¶
Disk Space:
- Monitor usage trends
- Set alerts at 80% utilization
- Plan expansion at 70%
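The two thresholds above can be expressed as a small classifier. This is a minimal sketch, not an AxonOps feature: the function name, the returned labels, and the choice to treat both thresholds as inclusive are assumptions made for illustration.

```python
PLAN_THRESHOLD = 70.0   # percent used: start planning expansion
ALERT_THRESHOLD = 80.0  # percent used: raise an alert


def classify_disk_usage(used_percent: float) -> str:
    """Map a host_Disk_UsedPercent value to a reporting status.

    Assumption: thresholds are inclusive, so exactly 80% already alerts.
    """
    if not 0.0 <= used_percent <= 100.0:
        raise ValueError(f"used_percent out of range: {used_percent}")
    if used_percent >= ALERT_THRESHOLD:
        return "alert"
    if used_percent >= PLAN_THRESHOLD:
        return "plan-expansion"
    return "ok"


# Hypothetical readings for three nodes.
for reading in (65.0, 72.5, 81.0):
    print(reading, classify_disk_usage(reading))
```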
I/O Performance:
- Track peak read/write rates
- Identify I/O bottlenecks
- Correlate with application load
CPU Usage:
- Monitor DC-level averages
- Identify hot spots
- Balance load distribution
Consistency Reporting¶
Distribution Analysis:
- Understand application patterns
- Identify consistency misuse
- Optimize for performance
Performance Impact:
- Compare latency by consistency
- Measure throughput differences
- Make data-driven decisions
Data Aggregation Notes¶
Disk Metrics¶
- Sectors are converted to bytes for display
- Rates calculated using `axonfunction='rate'`
- Mount points exclude system paths (`/etc*`)
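The sector-to-byte conversion is a single multiplication. The sketch below assumes a 512-byte sector, the unit Linux uses for its block-device sector counters. The sector size the dashboard actually applies is not stated in this document, so treat the constant as an assumption.

```python
SECTOR_SIZE_BYTES = 512  # assumption: 512-byte sectors, not confirmed by this document


def sectors_to_bytes_per_second(sectors_per_second: float,
                                sector_size: int = SECTOR_SIZE_BYTES) -> float:
    """Convert a sector rate (e.g. host_Disk_SectorsWrite with rate) to bytes/second."""
    return sectors_per_second * sector_size


# Hypothetical rate of 2048 sectors per second.
rate = sectors_to_bytes_per_second(2048)
print(rate)
```

With these assumed values, 2048 sectors per second corresponds to 1,048,576 bytes per second (1 MiB/s).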
CPU Metrics¶
- Uses `time='real'` for actual CPU usage
- Averaged by datacenter for overview
- Percentage scale 0-100
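To make the per-datacenter averaging concrete, here is a small sketch of what `avg(...) by (dc)` computes over a set of per-node samples. The sample data and the dictionary layout are hypothetical. The dashboard's query engine does this internally.

```python
from collections import defaultdict

# Hypothetical per-node CPU samples (percent, 0-100).
samples = [
    {"dc": "dc1", "host_id": "a", "value": 40.0},
    {"dc": "dc1", "host_id": "b", "value": 60.0},
    {"dc": "dc2", "host_id": "c", "value": 25.0},
]


def average_by_dc(samples):
    """Group samples by their dc label and return the mean value per dc."""
    totals = defaultdict(lambda: [0.0, 0])  # dc -> [running sum, sample count]
    for sample in samples:
        acc = totals[sample["dc"]]
        acc[0] += sample["value"]
        acc[1] += 1
    return {dc: total / count for dc, (total, count) in totals.items()}


per_dc = average_by_dc(samples)
print(per_dc)
```

Because each node contributes one sample, a node that stops reporting changes the average for its datacenter.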
Coordinator Metrics¶
- Excludes base scopes (`scope!='Read'`, `scope!='Write'`)
- Filters out Min/Max functions for cleaner data
- Groups by configurable dimensions
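The effect of excluding the base scopes can be sketched with an ordinary regular expression. The scope names below (such as `Read-QUORUM`) are assumed examples of per-consistency scopes. The pattern `Read.*` with a full match is an interpretation of the dashboard's scope filter, not a statement of how the query engine evaluates it.

```python
import re

# Assumed example scope values; real scope names depend on the cluster.
scopes = ["Read", "Read-ONE", "Read-QUORUM", "Write", "Write-QUORUM", "Write-ALL"]


def per_consistency_scopes(scopes, operation):
    """Keep scopes that start with the operation name but are not the base scope.

    Mirrors the idea of scope='Read*' combined with scope!='Read'.
    """
    pattern = re.compile(rf"{re.escape(operation)}.*")
    return [s for s in scopes if pattern.fullmatch(s) and s != operation]


read_scopes = per_consistency_scopes(scopes, "Read")
write_scopes = per_consistency_scopes(scopes, "Write")
print(read_scopes)
print(write_scopes)
```

Dropping the base scope avoids counting the same requests twice when the per-consistency scopes are summed for the pie charts.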
Units and Display¶
- Disk Space: bytes (binary units)
- Disk I/O: bytes/second
- CPU: percent (0-100)
- Latency: microseconds
- Throughput: rps/wps (reads/writes per second)
Legend Format:
- Resources: `$dc - $host_id - $mountpoint/$partition`
- Coordinator: `$groupBy` or `$scope`
Troubleshooting¶
Missing Disk Metrics¶
- Verify mount point filter
- Check partition naming
- Confirm agent disk collection
Inconsistent CPU Averages¶
- Check node availability
- Verify DC assignment
- Review time range
No Consistency Distribution¶
- Ensure client requests exist
- Check consistency filter
- Verify scope patterns
Notes¶
- Pie charts show current distribution, not historical
- Max functions used for worst-case reporting
- Some queries use special filtering like excluding empty consistency (`.*$consistency`)