Virtual Tables

Virtual tables expose Cassandra's internal state through standard CQL queries. They provide real-time access to metrics, configuration, cluster status, and operational information without requiring JMX or external tools.


Overview

What Are Virtual Tables?

Virtual tables are read-only tables that do not store data on disk. Instead, they generate results dynamically by querying Cassandra's internal state. Each query executes against the local node only—virtual tables do not coordinate across the cluster.

Characteristic | Regular Tables | Virtual Tables
Data storage | SSTables on disk | None (generated on query)
Scope | Distributed across cluster | Local to queried node
Mutability | Read/write | Read-only
Consistency | Configurable CL | Always local
ALLOW FILTERING | Required for non-key filters | Not required
Schema | User-defined | System-defined

Historical Context

Version | Virtual Tables Feature
4.0 | Virtual tables introduced (CASSANDRA-7622). Initial tables for settings, thread pools, and clients.
4.1 | Additional metrics tables, improved repair tracking
5.0 | SAI index introspection tables, expanded repair state, streaming visibility (CEP-14)

Before virtual tables, accessing internal Cassandra state required JMX connections, which presented challenges:

  • JMX requires separate tooling and authentication
  • Firewall rules often block JMX ports
  • No standard query language for JMX
  • Difficult to integrate with existing CQL-based monitoring

Virtual tables solve these problems by exposing the same information through CQL.

Architecture

[Figure: UML diagram of the virtual table architecture]

Key characteristics:

  • Queries execute entirely on the coordinator node
  • Results reflect the state of that specific node only
  • No replication, no consistency levels, no tombstones
  • Each query generates fresh data from internal state

Behavioral Guarantees

What Virtual Tables Guarantee

  • Results reflect the current state of the local node at query time
  • Queries that filter on non-key columns do not need ALLOW FILTERING (safe to run without a partition key restriction)
  • Schema is stable within a major version
  • Virtual tables are always available when the node is running
  • Read operations have minimal overhead

What Virtual Tables Do NOT Guarantee

Undefined Behavior

The following properties are not guaranteed and must not be relied upon:

  • Cross-node consistency: Querying the same virtual table on different nodes generally returns different results, because each node reports only its own local state
  • Point-in-time snapshots: Virtual table contents may change between rows being returned if internal state changes during query
  • Schema stability across versions: Virtual table schemas may change between Cassandra versions
  • Complete metric accuracy: Some metrics are approximations or samples
  • Historical data: Virtual tables show current state only; historical data requires external collection
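Because virtual tables expose only the current state, any history has to be kept by the caller. The following is a minimal sketch of such external collection, not a production collector. The function name `record_sample` and the in-memory `history` dictionary are hypothetical, and rows are assumed to be dict-like (for example, obtained with the Python driver's `dict_factory` row factory).

```python
import time


def record_sample(history, node, rows, now=None, max_samples=1000):
    """Append one timestamped snapshot of virtual-table rows for a node.

    history: dict mapping node name -> list of (timestamp, rows) tuples.
    rows: iterable of dict-like rows (an assumption of this sketch).
    max_samples: number of snapshots to keep per node (must be >= 1).
    """
    timestamp = time.time() if now is None else now
    samples = history.setdefault(node, [])
    # Copy each row so later changes by the caller do not alter the history.
    samples.append((timestamp, [dict(row) for row in rows]))
    # Drop the oldest snapshots so memory use stays bounded.
    del samples[:-max_samples]
    return history
```

A real deployment would more likely write each snapshot to a time-series store; the bounded list only illustrates that retention is the collector's responsibility.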

Query Behavior

Behavior | Description
Consistency Level | Ignored; always reads the local node
Tracing | Supported but shows only local execution
Paging | Supported for large result sets
Prepared Statements | Supported and recommended
ALLOW FILTERING | Not required; virtual tables are local-only
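As an illustration of the prepared-statement recommendation, the sketch below prepares a lookup against `system_views.thread_pools` once and reuses it for every call. It assumes a session object with the `prepare` and `execute` methods of the DataStax Python driver; the helper name `make_thread_pool_reader` is hypothetical.

```python
def make_thread_pool_reader(session):
    """Prepare the thread pool lookup once and return a function that reuses it."""
    prepared = session.prepare(
        "SELECT name, active_tasks, pending_tasks, blocked_tasks "
        "FROM system_views.thread_pools WHERE name = ?"
    )

    def read_pool(pool_name):
        # Bind the pool name to the single ? marker; at most one row is expected.
        rows = list(session.execute(prepared, (pool_name,)))
        return rows[0] if rows else None

    return read_pool
```

Calling `read_pool = make_thread_pool_reader(session)` once at startup and then `read_pool("ReadStage")` on each poll avoids re-parsing the statement every time.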

Virtual Keyspaces

Cassandra 5.0 provides two virtual keyspaces:

Keyspace | Purpose | Documentation
system_virtual_schema | Metadata about virtual tables | Schema Reference
system_views | Operational tables | See categories below
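The set of virtual tables available on a node can be discovered from `system_virtual_schema.tables`. The sketch below assumes a connected DataStax Python driver session whose rows allow attribute access (the driver default); the helper name `list_virtual_tables` is hypothetical.

```python
def list_virtual_tables(session):
    """Return (keyspace, table, comment) tuples for the node's virtual tables."""
    rows = session.execute(
        "SELECT keyspace_name, table_name, comment "
        "FROM system_virtual_schema.tables"
    )
    tables = [(row.keyspace_name, row.table_name, row.comment) for row in rows]
    # Sort by keyspace and table name only, since the comment may be None.
    return sorted(tables, key=lambda entry: (entry[0], entry[1]))
```

Running this against nodes on different Cassandra versions is one way to see which tables a given release actually provides before building dashboards on them.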

system_views Categories

Category | Description | Documentation
Metrics | Latency, read statistics, batch/CQL metrics | Metrics Tables
Thread Pools | Thread pool utilization and status | Thread Pools
Caches | Cache hit rates and sizes | Caches
Clients | Connected clients and active queries | Clients
Cluster State | Gossip, hints, internode communication | Cluster State
Storage | Disk usage, partitions, snapshots, tasks | Storage
Repair | Repair operation tracking | Repair
SAI Indexes | Storage-Attached Index introspection | SAI
Streaming | Data streaming operations | Streaming
Configuration | Runtime settings and properties | Configuration

Quick Reference

Common Monitoring Queries

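-- Thread pool health (CQL has no OR operator, so each condition is a separate query)
SELECT name, active_tasks, pending_tasks, blocked_tasks
FROM system_views.thread_pools
WHERE pending_tasks > 0;

SELECT name, active_tasks, pending_tasks, blocked_tasks
FROM system_views.thread_pools
WHERE blocked_tasks > 0;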

-- Cache efficiency
SELECT name, hit_ratio, size_bytes, capacity_bytes
FROM system_views.caches;

-- Connected clients
SELECT address, username, driver_name, request_count
FROM system_views.clients;

-- Table latencies
SELECT keyspace_name, table_name, p99th_ms
FROM system_views.coordinator_read_latency;

-- Tombstone problems
SELECT keyspace_name, table_name, p99th, max
FROM system_views.tombstones_per_read
WHERE p99th > 100;

Alerting Queries

-- Alert: Blocked thread pools
SELECT name, blocked_tasks
FROM system_views.thread_pools
WHERE blocked_tasks > 0;

-- Alert: Low key cache hit ratio (cache names are 'chunks', 'counters', 'keys', 'rows')
SELECT name, hit_ratio
FROM system_views.caches
WHERE name = 'keys' AND hit_ratio < 0.80;

-- Alert: Large pending hints
SELECT host_id, address, files
FROM system_views.pending_hints
WHERE files > 100;

-- Alert: Active repairs taking too long
SELECT id, keyspace_name, duration_millis
FROM system_views.repairs
WHERE completed = false AND duration_millis > 3600000;
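Threshold checks like those above can also be evaluated on the client, which keeps the CQL simple and allows conditions to be combined with OR, something a CQL WHERE clause cannot express. The sketch below is one possible shape for that. The function name `thread_pool_alerts` and its limit parameters are hypothetical, and rows are assumed to be dict-like results of `SELECT name, pending_tasks, blocked_tasks FROM system_views.thread_pools`.

```python
def thread_pool_alerts(rows, pending_limit=0, blocked_limit=0):
    """Return one message per pool whose pending or blocked count exceeds its limit."""
    alerts = []
    for row in rows:
        pending = row["pending_tasks"]
        blocked = row["blocked_tasks"]
        # Either condition alone is enough to raise an alert.
        if pending > pending_limit or blocked > blocked_limit:
            alerts.append(f"{row['name']}: pending={pending} blocked={blocked}")
    return alerts
```

Keeping the thresholds as parameters lets the same rows feed both a strict alerting rule and a more lenient dashboard rule.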

Best Practices

Monitoring Integration

Use virtual tables in monitoring systems instead of JMX where possible:

nodetool Command | Virtual Table Equivalent
nodetool tpstats | system_views.thread_pools
nodetool gossipinfo | system_views.gossip_info
nodetool compactionstats | system_views.sstable_tasks
nodetool clientstats | system_views.clients
nodetool listsnapshots | system_views.snapshots
nodetool listpendinghints | system_views.pending_hints

Polling Intervals

Virtual tables are designed for monitoring queries, not high-frequency polling:

Use Case | Recommended Interval
Dashboard refresh | 30-60 seconds
Alerting checks | 60 seconds
Capacity planning | 5-15 minutes
Ad-hoc debugging | On-demand
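One way to respect these intervals is to let a single polling loop decide which checks are due rather than running every query on every tick. The sketch below is illustrative only. The check names, the `POLL_INTERVALS_SECONDS` mapping, and `due_checks` are hypothetical, and the values are the lower bounds from the table above.

```python
# Minimum seconds between runs of each check (lower bounds of the ranges above).
POLL_INTERVALS_SECONDS = {"dashboard": 30, "alerting": 60, "capacity": 300}


def due_checks(last_run, now, intervals=POLL_INTERVALS_SECONDS):
    """Return the names of checks whose interval has elapsed since their last run.

    last_run: dict mapping check name -> timestamp of the previous run.
    A check that has never run is always due.
    """
    return sorted(
        name
        for name, interval in intervals.items()
        if now - last_run.get(name, float("-inf")) >= interval
    )
```

The caller would run the returned checks, record `now` for each of them in `last_run`, and sleep until the next tick.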

Multi-Node Queries

Virtual tables return local data only. To aggregate across the cluster, query each node:

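# Example: query every node in the cluster
from cassandra.cluster import Cluster

cluster = Cluster(contact_points=['node1', 'node2', 'node3'])
# Connect once and reuse the session; the full host list is discovered on connect
session = cluster.connect()

for host in cluster.metadata.all_hosts():
    # Pin the query to one node so the result reflects that node's local state
    result = session.execute(
        "SELECT * FROM system_views.thread_pools",
        host=host
    )
    print(f"Node {host.address}: {list(result)}")

# Release connections when finished
cluster.shutdown()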
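Once per-node results have been collected, they still need to be combined into a cluster-wide figure. A minimal sketch of that aggregation step follows. It assumes the per-node rows were gathered into a dictionary keyed by node address and that each row is dict-like; the function name `cluster_pending_tasks` is hypothetical.

```python
def cluster_pending_tasks(per_node_rows):
    """Sum pending_tasks per thread pool across all nodes.

    per_node_rows: dict mapping node address -> list of dict-like rows
    from system_views.thread_pools on that node.
    """
    totals = {}
    for rows in per_node_rows.values():
        for row in rows:
            # Pools missing on a node simply contribute nothing to the total.
            totals[row["name"]] = totals.get(row["name"], 0) + row["pending_tasks"]
    return totals
```

Because each node is sampled at a slightly different moment, the totals are an approximation of cluster state, not a consistent snapshot.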

Restrictions

Virtual Table Restrictions

Cannot modify: Virtual tables are read-only. INSERT, UPDATE, DELETE operations fail.

Local scope only: Results reflect only the queried node. Cluster-wide views require querying each node.

No indexes: Cannot create secondary indexes on virtual tables.

No materialized views: Cannot create materialized views based on virtual tables.

Schema changes: Cannot ALTER virtual tables. Schema is fixed by Cassandra version.

No TTL/Timestamps: WRITETIME() and TTL() functions return null.