Skip to content

Cassandra on AWS

This guide covers deploying Apache Cassandra on Amazon Web Services.

Instance Types

Use Case Instance vCPUs RAM Storage Notes
Development m5.large 2 8GB EBS Testing only
Small Prod i3.xlarge 4 30.5GB 950GB NVMe Good starting point
Standard Prod i3.2xlarge 8 61GB 1.9TB NVMe Recommended
High Perf i3en.3xlarge 12 96GB 7.5TB NVMe Write-heavy
Memory-Optimized r5.4xlarge 16 128GB EBS Large heaps

Storage Options

# EBS configurations
gp3:
  iops: 3000-16000
  throughput: 125-1000 MB/s
  use_case: "General purpose, cost-effective"

io2:
  iops: up to 64000
  throughput: up to 1000 MB/s
  use_case: "High performance, consistent latency"

# Instance storage (NVMe)
i3_series:
  type: "NVMe SSD"
  iops: "High (instance dependent)"
  note: "Ephemeral - data lost on stop/terminate"

Network Configuration

VPC Setup

# Terraform example
resource "aws_vpc" "cassandra" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true
  enable_dns_support   = true

  tags = {
    Name = "cassandra-vpc"
  }
}

# Subnets across AZs
resource "aws_subnet" "cassandra" {
  count             = 3
  vpc_id            = aws_vpc.cassandra.id
  cidr_block        = "10.0.${count.index}.0/24"
  availability_zone = data.aws_availability_zones.available.names[count.index]
}

Security Groups

resource "aws_security_group" "cassandra" {
  name        = "cassandra-sg"
  vpc_id      = aws_vpc.cassandra.id

  # CQL client access
  ingress {
    from_port   = 9042
    to_port     = 9042
    protocol    = "tcp"
    cidr_blocks = ["10.0.0.0/16"]
  }

  # Inter-node communication
  ingress {
    from_port   = 7000
    to_port     = 7001
    protocol    = "tcp"
    self        = true
  }

  # JMX
  ingress {
    from_port   = 7199
    to_port     = 7199
    protocol    = "tcp"
    cidr_blocks = ["10.0.0.0/16"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

Snitch Configuration

Ec2Snitch (Single Region)

# cassandra.yaml
endpoint_snitch: Ec2Snitch

# Automatically detects:
# - DC = region (e.g., us-east-1)
# - Rack = availability zone (e.g., us-east-1a)

Ec2MultiRegionSnitch (Multi-Region)

# cassandra.yaml
endpoint_snitch: Ec2MultiRegionSnitch

# For multi-region:
# - Uses public IP for cross-region communication
# - Private IP for intra-region
listen_address: <private_ip>
broadcast_address: <public_ip>

Deployment Example

CloudFormation Template

AWSTemplateFormatVersion: '2010-09-09'
Description: Cassandra cluster on AWS

Parameters:
  InstanceType:
    Type: String
    Default: i3.2xlarge
  ClusterSize:
    Type: Number
    Default: 3

Resources:
  CassandraASG:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
      LaunchTemplate:
        LaunchTemplateId: !Ref CassandraLaunchTemplate
        Version: !GetAtt CassandraLaunchTemplate.LatestVersionNumber
      MinSize: !Ref ClusterSize
      MaxSize: !Ref ClusterSize
      VPCZoneIdentifier:
        - !Ref SubnetA
        - !Ref SubnetB
        - !Ref SubnetC

  CassandraLaunchTemplate:
    Type: AWS::EC2::LaunchTemplate
    Properties:
      LaunchTemplateData:
        InstanceType: !Ref InstanceType
        ImageId: !Ref AmiId
        SecurityGroupIds:
          - !Ref CassandraSecurityGroup
        UserData:
          Fn::Base64: |
            #!/bin/bash
            # Install Cassandra
            # Configure based on instance metadata

Multi-AZ Best Practices

uml diagram

Monitoring with CloudWatch

# CloudWatch agent configuration
{
  "metrics": {
    "metrics_collected": {
      "disk": {
        "measurement": ["used_percent"],
        "resources": ["/var/lib/cassandra"]
      },
      "mem": {
        "measurement": ["mem_used_percent"]
      }
    }
  }
}

Cost Optimization

Strategy Savings Trade-off
Reserved Instances 30-60% 1-3 year commitment
Savings Plans 20-40% Flexible commitment
Spot for non-prod 60-90% Can be interrupted
Right-sizing Variable Requires monitoring

Next Steps