CQL Quickstart Guide¶

Learn Cassandra Query Language (CQL) fundamentals in this hands-on tutorial. CQL is similar to SQL but designed for Cassandra's distributed architecture.

Prerequisites¶

Cassandra running (see Installation Guide)
Access to cqlsh or CQLAI

Connecting to Cassandra¶

Using cqlsh¶

# Connect to local Cassandra
cqlsh

# Connect to remote host
cqlsh 192.168.1.10 9042

# Connect with authentication
cqlsh -u cassandra -p cassandra

# Connect with SSL
cqlsh --ssl

Using CQLAI (Recommended)¶

# Connect with CQLAI (modern, AI-powered shell)
cqlai

# Connect with specific host
cqlai --host 192.168.1.10

Creating a Keyspace¶

A keyspace is the top-level container for data (similar to a database in SQL).

Create a Keyspace¶

-- For development (single node)
CREATE KEYSPACE IF NOT EXISTS my_app
WITH replication = {
    'class': 'SimpleStrategy',
    'replication_factor': 1
};

-- For production (multi-datacenter)
CREATE KEYSPACE IF NOT EXISTS my_app
WITH replication = {
    'class': 'NetworkTopologyStrategy',
    'dc1': 3,
    'dc2': 3
};

Use the Keyspace¶

USE my_app;

View Keyspaces¶

-- List all keyspaces
DESCRIBE KEYSPACES;

-- Show keyspace details
DESCRIBE KEYSPACE my_app;

Creating Tables¶

Basic Table¶

CREATE TABLE users (
    user_id UUID PRIMARY KEY,
    username TEXT,
    email TEXT,
    created_at TIMESTAMP
);

Compound Primary Key¶

-- Partition key: user_id
-- Clustering column: message_time (sorted DESC)
CREATE TABLE user_messages (
    user_id UUID,
    message_time TIMESTAMP,
    message_id UUID,
    content TEXT,
    PRIMARY KEY ((user_id), message_time, message_id)
) WITH CLUSTERING ORDER BY (message_time DESC);

Composite Partition Key¶

-- Composite partition key: (tenant_id, year_month)
CREATE TABLE events (
    tenant_id TEXT,
    year_month TEXT,
    event_time TIMESTAMP,
    event_id UUID,
    event_type TEXT,
    data TEXT,
    PRIMARY KEY ((tenant_id, year_month), event_time, event_id)
) WITH CLUSTERING ORDER BY (event_time DESC);

View Table Structure¶

-- Show table schema
DESCRIBE TABLE users;

-- List all tables in keyspace
DESCRIBE TABLES;

Inserting Data¶

Basic Insert¶

INSERT INTO users (user_id, username, email, created_at)
VALUES (uuid(), 'john_doe', '[email protected]', toTimestamp(now()));

Insert with Specific UUID¶

INSERT INTO users (user_id, username, email, created_at)
VALUES (
    550e8400-e29b-41d4-a716-446655440000,
    'jane_smith',
    '[email protected]',
    '2024-01-15 10:30:00'
);

Insert with TTL (Time To Live)¶

-- Data expires after 86400 seconds (24 hours)
INSERT INTO users (user_id, username, email, created_at)
VALUES (uuid(), 'temp_user', '[email protected]', toTimestamp(now()))
USING TTL 86400;

Insert If Not Exists (Lightweight Transaction)¶

INSERT INTO users (user_id, username, email, created_at)
VALUES (uuid(), 'unique_user', '[email protected]', toTimestamp(now()))
IF NOT EXISTS;

Batch Insert¶

BEGIN BATCH
    INSERT INTO users (user_id, username, email, created_at)
    VALUES (uuid(), 'user1', '[email protected]', toTimestamp(now()));

    INSERT INTO users (user_id, username, email, created_at)
    VALUES (uuid(), 'user2', '[email protected]', toTimestamp(now()));

    INSERT INTO users (user_id, username, email, created_at)
    VALUES (uuid(), 'user3', '[email protected]', toTimestamp(now()));
APPLY BATCH;

Querying Data¶

Select All¶

SELECT * FROM users;

Select Specific Columns¶

SELECT username, email FROM users;

Query by Primary Key¶

-- By partition key (efficient)
SELECT * FROM users WHERE user_id = 550e8400-e29b-41d4-a716-446655440000;

Query with Clustering Columns¶

-- Get messages for a user, newest first
SELECT * FROM user_messages
WHERE user_id = 550e8400-e29b-41d4-a716-446655440000;

-- Get messages in time range
SELECT * FROM user_messages
WHERE user_id = 550e8400-e29b-41d4-a716-446655440000
  AND message_time >= '2024-01-01'
  AND message_time < '2024-02-01';

-- Limit results
SELECT * FROM user_messages
WHERE user_id = 550e8400-e29b-41d4-a716-446655440000
LIMIT 10;

Query with ALLOW FILTERING (Use Sparingly!)¶

-- WARNING: Scans entire table - avoid in production
SELECT * FROM users WHERE email = '[email protected]' ALLOW FILTERING;

Warning: ALLOW FILTERING performs a full table scan. Only use for development or very small tables. For production, use proper data modeling or secondary indexes.

Using IN Clause¶

-- Query multiple partition keys
SELECT * FROM users
WHERE user_id IN (
    550e8400-e29b-41d4-a716-446655440000,
    660e8400-e29b-41d4-a716-446655440001
);

Updating Data¶

Basic Update¶

UPDATE users
SET email = '[email protected]'
WHERE user_id = 550e8400-e29b-41d4-a716-446655440000;

Update Multiple Columns¶

UPDATE users
SET
    email = '[email protected]',
    username = 'updated_username'
WHERE user_id = 550e8400-e29b-41d4-a716-446655440000;

Update with TTL¶

UPDATE users USING TTL 3600
SET email = '[email protected]'
WHERE user_id = 550e8400-e29b-41d4-a716-446655440000;

Conditional Update (Lightweight Transaction)¶

UPDATE users
SET email = '[email protected]'
WHERE user_id = 550e8400-e29b-41d4-a716-446655440000
IF email = '[email protected]';

Deleting Data¶

Delete a Row¶

DELETE FROM users
WHERE user_id = 550e8400-e29b-41d4-a716-446655440000;

Delete Specific Columns¶

DELETE email FROM users
WHERE user_id = 550e8400-e29b-41d4-a716-446655440000;

Delete with Condition¶

DELETE FROM users
WHERE user_id = 550e8400-e29b-41d4-a716-446655440000
IF EXISTS;

Truncate Table¶

-- Delete all data from table
TRUNCATE users;

Working with Collections¶

List Type¶

CREATE TABLE user_hobbies (
    user_id UUID PRIMARY KEY,
    hobbies LIST<TEXT>
);

-- Insert list
INSERT INTO user_hobbies (user_id, hobbies)
VALUES (uuid(), ['reading', 'gaming', 'hiking']);

-- Append to list
UPDATE user_hobbies
SET hobbies = hobbies + ['cooking']
WHERE user_id = 550e8400-e29b-41d4-a716-446655440000;

-- Prepend to list
UPDATE user_hobbies
SET hobbies = ['swimming'] + hobbies
WHERE user_id = 550e8400-e29b-41d4-a716-446655440000;

-- Remove from list
UPDATE user_hobbies
SET hobbies = hobbies - ['gaming']
WHERE user_id = 550e8400-e29b-41d4-a716-446655440000;

Set Type¶

CREATE TABLE user_tags (
    user_id UUID PRIMARY KEY,
    tags SET<TEXT>
);

-- Insert set
INSERT INTO user_tags (user_id, tags)
VALUES (uuid(), {'premium', 'verified', 'active'});

-- Add to set
UPDATE user_tags
SET tags = tags + {'vip'}
WHERE user_id = 550e8400-e29b-41d4-a716-446655440000;

-- Remove from set
UPDATE user_tags
SET tags = tags - {'active'}
WHERE user_id = 550e8400-e29b-41d4-a716-446655440000;

Map Type¶

CREATE TABLE user_preferences (
    user_id UUID PRIMARY KEY,
    preferences MAP<TEXT, TEXT>
);

-- Insert map
INSERT INTO user_preferences (user_id, preferences)
VALUES (uuid(), {'theme': 'dark', 'language': 'en', 'timezone': 'UTC'});

-- Update map entries
UPDATE user_preferences
SET preferences['theme'] = 'light'
WHERE user_id = 550e8400-e29b-41d4-a716-446655440000;

-- Add new map entry
UPDATE user_preferences
SET preferences = preferences + {'notifications': 'enabled'}
WHERE user_id = 550e8400-e29b-41d4-a716-446655440000;

-- Remove map entry
DELETE preferences['timezone'] FROM user_preferences
WHERE user_id = 550e8400-e29b-41d4-a716-446655440000;

User-Defined Types (UDT)¶

-- Create a UDT
CREATE TYPE address (
    street TEXT,
    city TEXT,
    state TEXT,
    zip_code TEXT,
    country TEXT
);

-- Use UDT in table
CREATE TABLE customers (
    customer_id UUID PRIMARY KEY,
    name TEXT,
    billing_address FROZEN<address>,
    shipping_address FROZEN<address>
);

-- Insert with UDT
INSERT INTO customers (customer_id, name, billing_address, shipping_address)
VALUES (
    uuid(),
    'John Doe',
    {street: '123 Main St', city: 'New York', state: 'NY', zip_code: '10001', country: 'USA'},
    {street: '456 Oak Ave', city: 'Boston', state: 'MA', zip_code: '02101', country: 'USA'}
);

-- Query UDT fields
SELECT name, billing_address.city, shipping_address.city FROM customers;

Secondary Indexes¶

Create Secondary Index¶

-- Index on regular column
CREATE INDEX ON users (email);

-- Named index
CREATE INDEX users_email_idx ON users (email);

-- Query using index
SELECT * FROM users WHERE email = '[email protected]';

Storage-Attached Index (SAI) - Cassandra 5.0+¶

-- Create SAI index (more efficient)
CREATE INDEX ON users (email) USING 'sai';

-- SAI with options
CREATE INDEX ON users (username) USING 'sai'
WITH OPTIONS = {'case_sensitive': 'false'};

Aggregate Functions¶

-- Count rows
SELECT COUNT(*) FROM users;

-- Min/Max (requires primary key or index)
SELECT MIN(created_at), MAX(created_at) FROM user_messages
WHERE user_id = 550e8400-e29b-41d4-a716-446655440000;

-- Sum and Average
CREATE TABLE sales (
    product_id UUID,
    sale_date DATE,
    amount DECIMAL,
    PRIMARY KEY ((product_id), sale_date)
);

SELECT SUM(amount), AVG(amount) FROM sales
WHERE product_id = 550e8400-e29b-41d4-a716-446655440000;

Useful Functions¶

UUID Functions¶

-- Generate random UUID
SELECT uuid();

-- Generate time-based UUID
SELECT now();  -- Returns timeuuid

-- Convert timeuuid to timestamp
SELECT toTimestamp(now());

-- Extract date from timeuuid
SELECT toDate(now());

Timestamp Functions¶

-- Current timestamp
SELECT toTimestamp(now());

-- Date from timestamp
SELECT toDate(toTimestamp(now()));

-- Specific timestamp
INSERT INTO users (user_id, username, email, created_at)
VALUES (uuid(), 'user', '[email protected]', '2024-01-15 14:30:00+0000');

Token Function¶

-- Get token value for partition key
SELECT token(user_id), * FROM users;

-- Query token range (useful for debugging)
SELECT * FROM users WHERE token(user_id) > -9223372036854775808;

TTL and WriteTime¶

-- Check TTL remaining
SELECT TTL(email) FROM users WHERE user_id = 550e8400-e29b-41d4-a716-446655440000;

-- Check write timestamp
SELECT WRITETIME(email) FROM users WHERE user_id = 550e8400-e29b-41d4-a716-446655440000;

Consistency Levels¶

-- Set consistency for session
CONSISTENCY QUORUM;

-- Check current consistency
CONSISTENCY;

-- Common consistency levels:
-- ONE        - Fastest, least consistent
-- QUORUM     - Balanced (majority of replicas)
-- LOCAL_QUORUM - Majority in local datacenter
-- ALL        - Slowest, most consistent

Consistency Level Reference¶

Level	Description	Use Case
`ONE`	One replica responds	High throughput reads
`TWO`	Two replicas respond	Improved consistency
`THREE`	Three replicas respond	High consistency
`QUORUM`	Majority responds	Default for most apps
`LOCAL_QUORUM`	Majority in local DC	Multi-DC deployments
`EACH_QUORUM`	Quorum in each DC	Strong multi-DC consistency
`ALL`	All replicas respond	Highest consistency
`ANY`	Any node (including hints)	Highest availability writes
`LOCAL_ONE`	One replica in local DC	Low-latency local reads

Tracing Queries¶

-- Enable tracing
TRACING ON;

-- Run query (trace shown automatically)
SELECT * FROM users WHERE user_id = 550e8400-e29b-41d4-a716-446655440000;

-- Disable tracing
TRACING OFF;

Practical Examples¶

Example 1: User Profile System¶

-- Create keyspace
CREATE KEYSPACE social_app WITH replication = {
    'class': 'NetworkTopologyStrategy',
    'dc1': 3
};

USE social_app;

-- User profiles table
CREATE TABLE user_profiles (
    user_id UUID,
    username TEXT,
    display_name TEXT,
    bio TEXT,
    avatar_url TEXT,
    follower_count COUNTER,
    created_at TIMESTAMP,
    PRIMARY KEY (user_id)
);

-- Separate counter table (counters need separate table)
CREATE TABLE user_counters (
    user_id UUID PRIMARY KEY,
    followers COUNTER,
    following COUNTER,
    posts COUNTER
);

-- Update counters
UPDATE user_counters SET followers = followers + 1
WHERE user_id = 550e8400-e29b-41d4-a716-446655440000;

Example 2: Time-Series IoT Data¶

-- IoT sensor readings
CREATE TABLE sensor_readings (
    sensor_id TEXT,
    date DATE,
    reading_time TIMESTAMP,
    temperature DOUBLE,
    humidity DOUBLE,
    pressure DOUBLE,
    PRIMARY KEY ((sensor_id, date), reading_time)
) WITH CLUSTERING ORDER BY (reading_time DESC);

-- Insert reading
INSERT INTO sensor_readings (sensor_id, date, reading_time, temperature, humidity, pressure)
VALUES ('sensor-001', '2024-01-15', toTimestamp(now()), 23.5, 65.2, 1013.25);

-- Get today's readings for a sensor
SELECT * FROM sensor_readings
WHERE sensor_id = 'sensor-001'
  AND date = '2024-01-15'
LIMIT 100;

-- Get readings in time range
SELECT * FROM sensor_readings
WHERE sensor_id = 'sensor-001'
  AND date = '2024-01-15'
  AND reading_time >= '2024-01-15 08:00:00'
  AND reading_time < '2024-01-15 17:00:00';

Example 3: E-commerce Orders¶

-- Orders by customer
CREATE TABLE orders_by_customer (
    customer_id UUID,
    order_date DATE,
    order_id UUID,
    status TEXT,
    total DECIMAL,
    items LIST<FROZEN<MAP<TEXT, TEXT>>>,
    PRIMARY KEY ((customer_id), order_date, order_id)
) WITH CLUSTERING ORDER BY (order_date DESC, order_id ASC);

-- Insert order
INSERT INTO orders_by_customer (customer_id, order_date, order_id, status, total, items)
VALUES (
    550e8400-e29b-41d4-a716-446655440000,
    '2024-01-15',
    uuid(),
    'pending',
    99.99,
    [
        {'product_id': 'prod-001', 'name': 'Widget', 'quantity': '2', 'price': '49.99'},
        {'product_id': 'prod-002', 'name': 'Gadget', 'quantity': '1', 'price': '0.01'}
    ]
);

-- Get customer's recent orders
SELECT * FROM orders_by_customer
WHERE customer_id = 550e8400-e29b-41d4-a716-446655440000
LIMIT 10;

Common Mistakes to Avoid¶

1. Using ALLOW FILTERING in Production¶

-- BAD: Full table scan
SELECT * FROM users WHERE email = '[email protected]' ALLOW FILTERING;

-- GOOD: Create an index or denormalize
CREATE INDEX ON users (email);
SELECT * FROM users WHERE email = '[email protected]';

2. Large Partitions¶

-- BAD: All orders in one partition (unbounded growth)
CREATE TABLE orders (
    order_id UUID PRIMARY KEY,
    customer_id UUID,
    ...
);

-- GOOD: Partition by customer with time bucketing
CREATE TABLE orders_by_customer (
    customer_id UUID,
    year_month TEXT,  -- e.g., '2024-01'
    order_id UUID,
    ...
    PRIMARY KEY ((customer_id, year_month), order_id)
);

3. Using Collections for Large Data¶

-- BAD: Collections have 64KB limit per element
CREATE TABLE user_posts (
    user_id UUID PRIMARY KEY,
    posts LIST<TEXT>  -- Will fail with many/large posts
);

-- GOOD: Use a separate table
CREATE TABLE posts_by_user (
    user_id UUID,
    post_time TIMESTAMP,
    post_id UUID,
    content TEXT,
    PRIMARY KEY ((user_id), post_time, post_id)
);

Next Steps¶

After learning CQL basics:

Data Modeling Guide - Design effective schemas
First Cluster Setup - Multi-node deployment
CQL Reference - Complete language reference
Install CQLAI - Modern CQL shell with AI