Skip to content

AxonOps Kafka Requests Dashboard Metrics Mapping

Overview

The Kafka Requests Dashboard provides comprehensive monitoring of request rates, processing times, and message conversions across your Kafka cluster. It helps identify performance bottlenecks, track request patterns, and monitor client compatibility issues.

Metrics Mapping

Dashboard Metric Description Attributes
Request Rate Metrics
kaf_RequestMetrics_RequestsPerSec Rate of requests per second by type request={type}
kaf_BrokerTopicMetrics_TotalProduceRequestsPerSec Produce requests per second per topic topic={topic}
kaf_BrokerTopicMetrics_TotalFetchRequestsPerSec Fetch requests per second per topic topic={topic}
Request Timing Metrics
kaf_RequestMetrics_TotalTimeMs Total time to process requests request={type}
kaf_RequestMetrics_RequestQueueTimeMs Time requests spend in queue request={type}
Message Conversion Metrics
kaf_BrokerTopicMetrics_FetchMessageConversionsPerSec Rate of message conversions during fetch -
kaf_BrokerTopicMetrics_ProduceMessageConversionsPerSec Rate of message conversions during produce -
Client Metrics
kaf_socket_server_metrics_ (function='connections') Client connections by version listener={listener}, clientSoftwareName={name}, clientSoftwareVersion={version}

Query Examples

Request Rates

// Total requests per second per broker
sum(kaf_RequestMetrics_RequestsPerSec{axonfunction='rate',function='Count',rack=~'$rack',host_id=~'$host_id'}) by (host_id)

// Produce requests per second
sum(kaf_RequestMetrics_RequestsPerSec{axonfunction='rate',function='Count',request='Produce',rack=~'$rack',host_id=~'$host_id'}) by (host_id)

// Fetch consumer requests per second
sum(kaf_RequestMetrics_RequestsPerSec{axonfunction='rate',function='Count',request='FetchConsumer',rack=~'$rack',host_id=~'$host_id'}) by (host_id)

// Metadata requests per second
sum(kaf_RequestMetrics_RequestsPerSec{axonfunction='rate',function='Count',request='Metadata',rack=~'$rack',host_id=~'$host_id'}) by (host_id)

Topic-Level Request Rates

// Produce requests per topic
sum(kaf_BrokerTopicMetrics_TotalProduceRequestsPerSec{axonfunction='rate',function='Count',rack=~'$rack',host_id=~'$host_id', topic!=''}) by (topic)

// Fetch requests per topic
sum(kaf_BrokerTopicMetrics_TotalFetchRequestsPerSec{axonfunction='rate',function='Count',rack=~'$rack',host_id=~'$host_id',topic=~'$topic', topic!=''}) by (topic)

Request Processing Times

// Produce request total time
kaf_RequestMetrics_TotalTimeMs{request='Produce',function=~'$percentile',rack=~'$rack',host_id=~'$host_id'}

// Fetch request total time
kaf_RequestMetrics_TotalTimeMs{request='Fetch',function=~'$percentile',rack=~'$rack',host_id=~'$host_id'}

// Fetch follower request total time
kaf_RequestMetrics_TotalTimeMs{request='FetchFollower',function=~'$percentile',rack=~'$rack',host_id=~'$host_id'}

Request Queue Times

// Fetch request queue time
kaf_RequestMetrics_RequestQueueTimeMs{request='Fetch',function=~'$percentile',rack=~'$rack',host_id=~'$host_id'}

// Fetch follower request queue time
kaf_RequestMetrics_RequestQueueTimeMs{request='FetchFollower',function=~'$percentile',rack=~'$rack',host_id=~'$host_id'}

Message Conversions

// Fetch message conversions per second
sum(kaf_BrokerTopicMetrics_FetchMessageConversionsPerSec{axonfunction='rate',rack=~'$rack',host_id=~'$host_id'})

// Produce message conversions per second
kaf_BrokerTopicMetrics_ProduceMessageConversionsPerSec{function='MeanRate',rack=~'$rack',host_id=~'$host_id'}

// Client version distribution
sum(kaf_socket_server_metrics_{function='connections',rack=~'$rack',host_id=~'$host_id'}) by (clientSoftwareVersion, clientSoftwareName)

Panel Organization

Overview Section

  • Empty row for spacing/organization

Requests

  • Total Request Per Sec
  • Metadata Request Per Sec
  • Produce Request Per Sec
  • Fetch Request Per Sec
  • Produce request per sec per topic
  • Fetch request per sec per topic

Request Times

  • Produce Time
  • Fetch Time
  • FetchFollower Time

Request Queues

  • Request Queue Fetch Follower Requests Time
  • Request Queue Fetch Requests Time

Message Conversion

  • Number of produced message conversion
  • Number of consumed message conversion
  • Client version repartition

Filters

  • rack: Filter by rack location

  • host_id: Filter by specific host/broker

  • topic: Filter by specific topic(s)

  • percentile: Select percentile for latency metrics (50th, 95th, 99th, etc.)

Best Practices

Request Rate Monitoring

  • Monitor total request rates for capacity planning
  • High metadata request rates may indicate client issues
  • Balance request rates across brokers

Request Timing Analysis

  • Monitor 99th percentile for worst-case scenarios
  • High total time indicates processing bottlenecks
  • Compare request types to identify slow operations

Queue Time Monitoring

  • High queue times indicate thread pool saturation
  • Consider increasing request handler threads
  • Queue time should be minimal compared to total time

Message Conversion Impact

  • Message conversions impact performance significantly
  • High conversion rates suggest client version mismatches
  • Update clients to match broker message format version

Client Version Management

  • Monitor client version distribution
  • Identify and upgrade outdated clients
  • Ensure compatibility with broker version

Performance Tuning

  • Adjust num.network.threads for high request rates
  • Tune num.io.threads for I/O operations
  • Monitor and adjust queued.max.requests

Troubleshooting

  • High produce times: Check replication settings
  • High fetch times: Review consumer configurations
  • Message conversions: Align client/broker versions