CQL Functions Reference
CQL includes built-in functions for common operations: generating UUIDs, working with timestamps, computing aggregates. The most useful ones now() for time-based UUIDs, token() for understanding data distribution, toTimestamp() for converting between time formats show up constantly in production schemas.
Aggregates deserve a warning: SELECT COUNT(*) FROM users looks innocent, but without a partition key it scans the entire cluster. That is fine for a thousand rows; it will timeout with a billion. Aggregates work efficiently when scoped to a partition.
This reference covers scalar functions, aggregates, and user-defined functions (available since Cassandra 4.0).
Behavioral Guarantees
What Functions Guarantee
- Scalar functions are evaluated on the coordinator for each result row
uuid() generates a unique random UUID (version 4) per invocation
now() generates a unique time-based UUID (version 1) per invocation within a statement
token() returns the partition token for the given partition key columns
- Aggregate functions are evaluated on the coordinator after collecting all matching rows
- Type conversion functions fail predictably on incompatible input
What Functions Do NOT Guarantee
Undefined Behavior
The following behaviors are undefined and must not be relied upon:
now() monotonicity: Multiple now() calls in a statement return the same value; across statements, clock skew may cause non-monotonic values
- Aggregate performance: Aggregates without partition key constraints scan the entire cluster with unbounded latency
uuid() ordering: Random UUIDs have no ordering relationship
- Cross-node time consistency: Time functions reflect coordinator node's clock; clock skew between nodes affects results
- Null propagation: Some functions return null on null input; others may fail
Function Evaluation Contract
| Function Type |
Evaluation Location |
Per-Row |
Notes |
| Scalar |
Coordinator |
✅ Yes |
Evaluated for each result row |
| Aggregate |
Coordinator |
❌ No |
Evaluated once after collecting rows |
now() |
Coordinator |
❌ No |
Same value within statement |
uuid() |
Coordinator |
✅ Yes |
Different value per invocation |
token() |
Coordinator |
✅ Yes |
Computed from partition key |
Time Function Contract
| Function |
Return Type |
Uniqueness |
Monotonicity |
now() |
timeuuid |
Unique per statement |
Within node only |
currentTimestamp() |
timestamp |
Not unique |
Within node only |
currentTimeUUID() |
timeuuid |
Unique per invocation |
Within node only |
currentDate() |
date |
Not unique |
Within node only |
currentTime() |
time |
Not unique |
Within node only |
| Query Pattern |
Performance |
Notes |
SELECT COUNT(*) FROM t WHERE pk = ? |
Fast |
Single partition |
SELECT COUNT(*) FROM t |
Slow/Timeout |
Full cluster scan |
SELECT SUM(x) FROM t WHERE pk = ? |
Fast |
Single partition |
SELECT AVG(x) FROM t |
Slow/Timeout |
Full cluster scan |
Version-Specific Behavior
| Version |
Behavior |
| 2.2+ |
User-defined functions (UDFs) |
| 3.0+ |
toDate(), toTimestamp() conversion functions |
| 4.0+ |
Improved function execution, currentTimestamp() |
| 5.0+ |
Vector similarity functions, masking functions |
Function Categories
| Category |
Purpose |
Examples |
| Scalar |
Transform single values |
toDate(), token(), cast() |
| Aggregate |
Compute across rows |
COUNT(), SUM(), AVG() |
| Time/UUID |
Generate temporal values |
now(), uuid(), toTimestamp() |
| Cell Metadata |
Query column metadata |
ttl(), writetime() |
| Collection |
Operate on collections |
map_keys(), collection_count() |
| Data Masking |
Protect sensitive data |
mask_default(), mask_inner() |
| Vector |
Vector similarity search |
similarity_cosine() |
Scalar Functions
Type Conversion
-- Cast between types
SELECT CAST(age AS text) FROM users;
SELECT CAST('123' AS int) FROM users;
-- Convert blob to type
SELECT blobAsInt(blob_column) FROM data;
SELECT intAsBlob(123);
-- Type conversion functions
SELECT toDate(timestamp_column) FROM events;
SELECT toTimestamp(date_column) FROM events;
SELECT toUnixTimestamp(timestamp_column) FROM events;
Token Function
-- Get token value for partition key
SELECT token(user_id), username FROM users;
-- Query by token range (full table scan)
SELECT * FROM users WHERE token(user_id) > -9223372036854775808;
-- Useful for parallel scanning
SELECT * FROM users
WHERE token(user_id) >= -9223372036854775808
AND token(user_id) < 0;
UUID Functions
-- Generate random UUID
INSERT INTO users (user_id, username) VALUES (uuid(), 'john');
-- Generate time-based UUID (timeuuid)
INSERT INTO events (event_id, event_time) VALUES (now(), 'click');
-- Extract timestamp from timeuuid
SELECT dateOf(event_id) FROM events;
SELECT unixTimestampOf(event_id) FROM events;
SELECT toTimestamp(event_id) FROM events;
-- Min/max timeuuid for time range queries
SELECT * FROM events
WHERE event_id > minTimeuuid('2024-01-01 00:00:00+0000')
AND event_id < maxTimeuuid('2024-01-31 23:59:59+0000');
Time Functions
-- Current timestamp
SELECT toTimestamp(now()) AS current_time FROM system.local;
-- Current date
SELECT currentDate() FROM system.local;
-- Current time
SELECT currentTime() FROM system.local;
-- Current timestamp in various formats
SELECT currentTimestamp() FROM system.local;
SELECT currentTimeUUID() FROM system.local;
Blob Functions
-- Convert types to blob
SELECT bigintAsBlob(9223372036854775807);
SELECT booleanAsBlob(true);
SELECT doubleAsBlob(3.14159);
SELECT floatAsBlob(3.14);
SELECT intAsBlob(42);
SELECT textAsBlob('hello');
SELECT timestampAsBlob(toTimestamp(now()));
SELECT uuidAsBlob(uuid());
SELECT varintAsBlob(12345);
-- Convert blob to types
SELECT blobAsBigint(blob_column) FROM data;
SELECT blobAsBoolean(blob_column) FROM data;
SELECT blobAsDouble(blob_column) FROM data;
SELECT blobAsFloat(blob_column) FROM data;
SELECT blobAsInt(blob_column) FROM data;
SELECT blobAsText(blob_column) FROM data;
SELECT blobAsTimestamp(blob_column) FROM data;
SELECT blobAsUuid(blob_column) FROM data;
SELECT blobAsVarint(blob_column) FROM data;
Text Functions
-- Convert blob to text
SELECT blobastext(blob_column) FROM data;
-- Convert text to blob
SELECT textasblob('hello');
-- Note: CQL has no built-in string manipulation functions
-- (no length, substring, concat, etc.) - use application layer
Math Functions (Cassandra 4.0+)
-- Absolute value
SELECT abs(numeric_column) FROM data;
-- Exponential (e raised to power)
SELECT exp(numeric_column) FROM data;
-- Logarithm (natural log)
SELECT log(numeric_column) FROM data;
SELECT log10(numeric_column) FROM data;
-- Round to nearest integer (HALF_UP mode)
SELECT round(numeric_column) FROM data;
Aggregate Functions
COUNT
-- Count all rows
SELECT COUNT(*) FROM users;
-- Count non-null values in column
SELECT COUNT(email) FROM users;
-- Count with grouping
SELECT department, COUNT(*) as employee_count
FROM employees
GROUP BY department;
SUM
-- Sum numeric column
SELECT SUM(amount) FROM orders WHERE user_id = ?;
-- Sum with grouping
SELECT user_id, SUM(amount) as total_spent
FROM orders
GROUP BY user_id;
-- Sum with time grouping
SELECT user_id, toDate(order_time) as day, SUM(amount)
FROM orders
WHERE user_id = ?
GROUP BY user_id, toDate(order_time);
AVG
-- Average of column
SELECT AVG(price) FROM products;
-- Average with grouping
SELECT category, AVG(price) as avg_price
FROM products
GROUP BY category;
MIN / MAX
-- Minimum value
SELECT MIN(price) FROM products;
-- Maximum value
SELECT MAX(created_at) FROM users;
-- Min/Max with grouping
SELECT user_id, MIN(order_time), MAX(order_time)
FROM orders
GROUP BY user_id;
GROUP BY Usage
-- Group by partition key
SELECT user_id, COUNT(*) as message_count
FROM messages
GROUP BY user_id;
-- Group by clustering columns
SELECT user_id, toDate(sent_at) as day, COUNT(*)
FROM messages
WHERE user_id = ?
GROUP BY user_id, toDate(sent_at);
-- Multiple aggregates
SELECT user_id,
COUNT(*) as orders,
SUM(total) as revenue,
AVG(total) as avg_order,
MIN(order_time) as first_order,
MAX(order_time) as last_order
FROM orders
GROUP BY user_id;
Time and UUID Functions
now() and uuid()
-- Generate timeuuid (time-based UUID v1)
INSERT INTO events (event_id, data) VALUES (now(), 'event');
-- Generate random UUID (v4)
INSERT INTO users (user_id, name) VALUES (uuid(), 'John');
-- From timeuuid to timestamp
SELECT toTimestamp(event_id) FROM events;
-- From timeuuid to date
SELECT toDate(event_id) FROM events;
-- From timeuuid to Unix timestamp (milliseconds)
SELECT toUnixTimestamp(event_id) FROM events;
-- dateOf (deprecated, use toTimestamp)
SELECT dateOf(event_id) FROM events;
-- unixTimestampOf (deprecated, use toUnixTimestamp)
SELECT unixTimestampOf(event_id) FROM events;
Time Range Queries
-- minTimeuuid / maxTimeuuid for time-based queries
SELECT * FROM events
WHERE user_id = ?
AND event_id > minTimeuuid('2024-01-01 00:00:00+0000')
AND event_id < maxTimeuuid('2024-01-31 23:59:59+0000');
-- Query events up to current time
SELECT * FROM events
WHERE user_id = ?
AND event_id <= now();
-- Note: CQL has no timestamp arithmetic; calculate time bounds
-- in application code and pass as prepared statement parameters
Current Time Functions
-- Current timestamp (microseconds)
SELECT currentTimestamp() FROM system.local;
-- Current date
SELECT currentDate() FROM system.local;
-- Current time (nanoseconds since midnight)
SELECT currentTime() FROM system.local;
-- Current timeuuid
SELECT currentTimeUUID() FROM system.local;
TTL Function
-- Check remaining TTL (seconds)
SELECT TTL(email) FROM users WHERE user_id = ?;
-- TTL returns null for columns without TTL
SELECT username, TTL(username), email, TTL(email) FROM users;
WRITETIME Function
-- Check write timestamp (microseconds since epoch)
SELECT WRITETIME(email) FROM users WHERE user_id = ?;
-- Compare write times
SELECT username, WRITETIME(username), email, WRITETIME(email) FROM users;
Collection Functions
Map Functions
-- Extract keys from map as a set
SELECT map_keys(preferences) FROM users WHERE user_id = ?;
-- Extract values from map as a list
SELECT map_values(preferences) FROM users WHERE user_id = ?;
Collection Aggregates
-- Count elements in collection
SELECT collection_count(tags) FROM users WHERE user_id = ?;
-- Min/max element in set or list
SELECT collection_min(scores) FROM users WHERE user_id = ?;
SELECT collection_max(scores) FROM users WHERE user_id = ?;
-- Sum/average of numeric collection
SELECT collection_sum(scores) FROM users WHERE user_id = ?;
SELECT collection_avg(scores) FROM users WHERE user_id = ?;
JSON Functions
toJson
-- Convert column to JSON
SELECT user_id, toJson(preferences) FROM users;
-- Convert entire row to JSON
SELECT JSON * FROM users WHERE user_id = ?;
-- Convert specific columns
SELECT JSON user_id, username, email FROM users;
fromJson
-- Insert from JSON
INSERT INTO users (user_id, username, preferences)
VALUES (fromJson('"550e8400-e29b-41d4-a716-446655440000"'),
fromJson('"john"'),
fromJson('{"theme": "dark", "lang": "en"}'));
-- Update with JSON
UPDATE users
SET preferences = fromJson('{"theme": "light"}')
WHERE user_id = ?;
JSON Insert
-- Full row as JSON
INSERT INTO users JSON '{
"user_id": "550e8400-e29b-41d4-a716-446655440000",
"username": "john",
"email": "[email protected]",
"preferences": {"theme": "dark"}
}';
-- With DEFAULT UNSET for missing columns
INSERT INTO users JSON '{"user_id": "...", "username": "john"}' DEFAULT UNSET;
-- With DEFAULT NULL for missing columns
INSERT INTO users JSON '{"user_id": "...", "username": "john"}' DEFAULT NULL;
User-Defined Functions (UDF)
Creating UDFs
-- Enable UDFs in cassandra.yaml first:
-- user_defined_functions_enabled: true
-- Create Java UDF
CREATE FUNCTION my_keyspace.double_value(input int)
CALLED ON NULL INPUT
RETURNS int
LANGUAGE java
AS 'return input * 2;';
-- Create UDF with null handling
CREATE FUNCTION my_keyspace.safe_double(input int)
RETURNS NULL ON NULL INPUT
RETURNS int
LANGUAGE java
AS 'return input * 2;';
Using UDFs
-- Use in SELECT
SELECT user_id, double_value(score) FROM scores;
-- Use in INSERT/UPDATE
INSERT INTO scores (user_id, adjusted_score)
VALUES (?, double_value(?));
Managing UDFs
-- List functions
SELECT * FROM system_schema.functions;
-- Drop function
DROP FUNCTION my_keyspace.double_value;
-- Replace function
CREATE OR REPLACE FUNCTION my_keyspace.double_value(input int)
CALLED ON NULL INPUT
RETURNS int
LANGUAGE java
AS 'return input * 3;'; -- Changed to triple
User-Defined Aggregates (UDA)
Creating UDAs
-- First create state function
CREATE FUNCTION my_keyspace.avgState(state tuple<int, bigint>, val int)
CALLED ON NULL INPUT
RETURNS tuple<int, bigint>
LANGUAGE java
AS 'if (val != null) {
state.setInt(0, state.getInt(0) + 1);
state.setLong(1, state.getLong(1) + val);
}
return state;';
-- Create final function
CREATE FUNCTION my_keyspace.avgFinal(state tuple<int, bigint>)
CALLED ON NULL INPUT
RETURNS double
LANGUAGE java
AS 'if (state.getInt(0) == 0) return null;
return (double) state.getLong(1) / state.getInt(0);';
-- Create aggregate
CREATE AGGREGATE my_keyspace.custom_avg(int)
SFUNC avgState
STYPE tuple<int, bigint>
FINALFUNC avgFinal
INITCOND (0, 0);
Using UDAs
-- Use like built-in aggregates
SELECT user_id, custom_avg(score) FROM scores GROUP BY user_id;
Data Masking Functions (Cassandra 5.0+)
Data masking functions protect sensitive data by replacing values with masked versions.
-- Return null for any value
SELECT mask_null(email) FROM users;
-- Return type-appropriate default (asterisks for text, zero for numbers)
SELECT mask_default(email) FROM users;
-- Replace with specific value
SELECT mask_replace(email, '***@***.***') FROM users;
-- Mask inner characters, expose prefix/suffix
-- mask_inner(value, begin_unmasked, end_unmasked, [padding_char])
SELECT mask_inner(phone, 3, 4) FROM users; -- '+1-***-**-5678'
-- Mask outer characters, expose middle
-- mask_outer(value, begin_masked, end_masked, [padding_char])
SELECT mask_outer(ssn, 3, 4) FROM users; -- '***-45-****'
-- Return SHA-256 hash as blob
SELECT mask_hash(email) FROM users;
Vector Similarity Functions (Cassandra 5.0+)
Vector similarity functions compare float vectors for similarity search operations.
-- Cosine similarity between vectors (range: -1 to 1)
SELECT similarity_cosine(embedding, ?) FROM documents;
-- Euclidean distance between vectors
SELECT similarity_euclidean(embedding, ?) FROM documents;
-- Dot product of vectors
SELECT similarity_dot_product(embedding, ?) FROM documents;
Function Reference Table
Scalar Functions
| Function |
Parameters |
Returns |
Description |
token() |
partition key |
bigint |
Murmur3 token value |
uuid() |
none |
uuid |
Random UUID v4 |
now() |
none |
timeuuid |
Current time UUID |
to_date() |
timestamp/timeuuid |
date |
Extract date |
to_timestamp() |
date/timeuuid |
timestamp |
Convert to timestamp |
to_unix_timestamp() |
timestamp/timeuuid |
bigint |
Unix time (ms) |
min_timeuuid() |
timestamp |
timeuuid |
Min UUID for time |
max_timeuuid() |
timestamp |
timeuuid |
Max UUID for time |
ttl() |
column |
int |
Remaining TTL (s) |
writetime() |
column |
bigint |
Write timestamp (μs) |
cast() |
value AS type |
varies |
Type conversion |
to_json() |
value |
text |
Convert to JSON |
from_json() |
json text |
varies |
Parse JSON |
Math Functions (4.0+)
| Function |
Parameters |
Returns |
Description |
abs() |
numeric |
same |
Absolute value |
exp() |
numeric |
same |
e raised to power |
log() |
numeric |
same |
Natural logarithm |
log10() |
numeric |
same |
Base-10 logarithm |
round() |
numeric |
same |
Round (HALF_UP) |
Aggregate Functions
| Function |
Parameters |
Returns |
Description |
count() |
* or column |
bigint |
Row/value count |
sum() |
numeric column |
varies |
Sum of values |
avg() |
numeric column |
varies |
Average value |
min() |
column |
varies |
Minimum value |
max() |
column |
varies |
Maximum value |
Collection Functions
| Function |
Parameters |
Returns |
Description |
map_keys() |
map |
set |
Extract map keys |
map_values() |
map |
list |
Extract map values |
collection_count() |
collection |
int |
Element count |
collection_min() |
set/list |
element type |
Minimum element |
collection_max() |
set/list |
element type |
Maximum element |
collection_sum() |
numeric set/list |
numeric |
Sum of elements |
collection_avg() |
numeric set/list |
numeric |
Average of elements |
Data Masking Functions (5.0+)
| Function |
Parameters |
Returns |
Description |
mask_null() |
value |
null |
Returns null |
mask_default() |
value |
same type |
Type-appropriate mask |
mask_replace() |
value, replacement |
same type |
Replace with value |
mask_inner() |
value, begin, end, [pad] |
same type |
Mask inner chars |
mask_outer() |
value, begin, end, [pad] |
same type |
Mask outer chars |
mask_hash() |
value, [algorithm] |
blob |
SHA-256 hash |
Vector Functions (5.0+)
| Function |
Parameters |
Returns |
Description |
similarity_cosine() |
vector, vector |
float |
Cosine similarity |
similarity_euclidean() |
vector, vector |
float |
Euclidean distance |
similarity_dot_product() |
vector, vector |
float |
Dot product |
Next Steps