nodetool removenode¶
Removes a dead or unreachable node from the cluster and re-replicates the data it owned from the remaining replicas.
Synopsis¶
nodetool [connection_options] removenode <status | force | <host-id>>
Description¶
nodetool removenode removes a node that cannot be decommissioned because it is dead or unreachable. Unlike decommission (which runs on the node being removed), removenode runs from any live node and reconstructs the dead node's data from other replicas.
What Removenode Does¶
When removenode executes, Cassandra performs the following operations:
- Removes token ownership - The dead node's tokens are removed from the cluster's token ring. These tokens defined which partition ranges the node was responsible for.
- Updates the ring topology - The cluster recalculates token range assignments. The removed node's token ranges are redistributed to the remaining nodes based on the token allocation strategy (vnode or single-token).
- Streams data from replicas - For each partition range the dead node owned, data is streamed from surviving replicas to the nodes now responsible for those ranges. This ensures the cluster maintains the configured replication factor.
- Updates system tables - The node's entry is removed from system.peers and other system tables across all nodes in the cluster.
- Propagates via gossip - The removal is disseminated to all nodes via the gossip protocol, ensuring every node updates its local view of the cluster topology.
Data Reconstruction Requirement
Since the dead node's data is unavailable, removenode relies entirely on the surviving replicas to reconstruct it. If the replication factor is 1 (no other replicas exist), or if all replicas for a partition range are unavailable, that data cannot be recovered and will be lost.
Arguments¶
| Argument | Description |
|---|---|
| status | Show status of current removenode operation |
| force | Force completion of a stuck removenode |
| <host-id> | UUID of the node to remove |
Finding the Host ID¶
# From nodetool status - shows host ID for all nodes
nodetool status
# Output includes Host ID column
# Datacenter: dc1
# Status=Up/Down
# |/ State=Normal/Leaving/Joining/Moving
# -- Address Load Tokens Owns Host ID Rack
# DN 192.168.1.102 248.87 GiB 256 33.3% b2c3d4e5-f6a7-8901-bcde-f12345678901 rack1
The dead node shows DN (Down/Normal). Copy its Host ID.
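Copying the Host ID by hand is error-prone on large clusters. The following is a minimal sketch of one way to pull the Host IDs of all DN nodes out of the status output. The function name down_host_ids is hypothetical, and the sketch assumes each node row starts with the two-letter status/state code and contains the Host ID as a standard UUID, as in the sample output above.

```shell
# Print the Host ID of every node that `nodetool status` reports as DN.
# Reads the status output on stdin, so it also works on saved output.
down_host_ids() {
  # Keep only node rows whose status/state code is DN,
  # then extract the UUID-shaped Host ID from each row.
  grep -E '^DN[[:space:]]' |
    grep -Eo '[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}'
}
```

Typical use would be nodetool status | down_host_ids. Review the printed IDs before passing any of them to removenode.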
When to Use¶
Node Hardware Failure¶
When a node's hardware fails and cannot be recovered:
nodetool removenode b2c3d4e5-f6a7-8901-bcde-f12345678901
Node Cannot Start¶
When Cassandra cannot start on a node due to corruption or configuration issues that cannot be resolved:
nodetool removenode <host-id>
Unplanned Node Loss¶
When a node is permanently lost (datacenter issue, etc.):
nodetool removenode <host-id>
When NOT to Use¶
Node is Still Alive¶
Don't Removenode Live Nodes
If the node is running (even if unhealthy):
- Try to repair the issue
- If removal needed, use decommission instead
- Never removenode a live node - causes data inconsistency
Multiple Nodes Down¶
Data Loss Risk
If multiple replica nodes are down, removenode cannot reconstruct all data:
RF=3, 2 nodes down → Data on only 1 replica
RF=3, 3 nodes down → Complete data loss for some ranges
Bring nodes back online if possible before removing any.
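The arithmetic behind the two cases above can be sketched as a small helper. This is an illustration only: range_outcome is a hypothetical name, and DOWN is assumed to count the replicas of a single token range that are down, including the node being removed.

```shell
# range_outcome RF DOWN
#   RF:   replication factor of the keyspace
#   DOWN: replicas of one token range that are down
#         (including the node being removed)
range_outcome() {
  rf=$1
  down=$2
  live=$((rf - down))
  if [ "$live" -le 0 ]; then
    echo "lost: no live replica remains for this range"
  else
    echo "recoverable: $live of $rf replicas remain"
  fi
}
```

For example, range_outcome 3 2 reports one remaining replica, while range_outcome 3 3 reports the range as lost.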
Node Was Not Fully Down¶
If the node was intermittently available:
- Ensure it's completely stopped
- Remove from network
- Then run removenode
Removenode Process¶
- Identify dead node by Host ID
- Verify node is actually down
- Calculate token ranges owned by dead node
- Find remaining replicas for each range
- Stream data from replicas to new owners (monitor with nodetool netstats and nodetool removenode status)
- Update ring topology
- Complete removal
Examples¶
Remove Dead Node¶
# Get host ID
nodetool status | grep DN
# Remove the node
nodetool removenode b2c3d4e5-f6a7-8901-bcde-f12345678901
Check Removenode Status¶
nodetool removenode status
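Output (in Cassandra 3.x and 4.x it resembles the following; exact wording varies by version):
RemovalStatus: Removing token (<token>). Waiting for replication confirmation from [<node-ip>,<node-ip>].
When no removal is running:
RemovalStatus: No token removals in process.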
Force Completion¶
nodetool removenode force
Force Removal
Only use force if removenode is stuck and you accept potential data loss. This completes the removal without waiting for all streams.
Before Removenode¶
Verify Node is Dead¶
# Should show DN (Down/Normal)
nodetool status
# Try to reach the node
ping <node-ip>
ssh <node-ip> 'nodetool info'
Check Replication Factor¶
DESCRIBE KEYSPACE my_keyspace;
Ensure RF > 1 so that surviving replicas hold a copy of the dead node's data.
Verify Other Nodes Healthy¶
nodetool status
All other nodes should be UN (Up/Normal).
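The two status checks in this section (the target is DN, every other node is UN) could be combined into one guard, sketched below. The function name preflight_check is hypothetical, and the sketch assumes node rows begin with a two-letter status/state code as in the nodetool status sample earlier on this page.

```shell
# preflight_check HOST_ID
# Reads `nodetool status` output on stdin.
# Succeeds only if HOST_ID is listed as DN and every other node is UN.
preflight_check() {
  awk -v target="$1" '
    # Node rows start with a status/state code such as UN, DN or UJ.
    $1 ~ /^[UD][NLJM]$/ {
      is_target = 0
      for (i = 2; i <= NF; i++) if ($i == target) is_target = 1
      if (is_target) { seen = 1; if ($1 != "DN") bad = 1 }
      else if ($1 != "UN") bad = 1
    }
    # Fail if the target was never seen or any row violated the rule.
    END { if (seen && !bad) exit 0; exit 1 }
  '
}
```

A possible use is nodetool status | preflight_check <host-id> && nodetool removenode <host-id>, so the removal only starts when both conditions hold.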
Consider Consequences¶
| RF | Nodes | After Remove 1 | Risk |
|---|---|---|---|
| 3 | 6 | 5 nodes, RF 3 | Safe |
| 3 | 4 | 3 nodes, RF 3 | Minimum |
| 3 | 3 | 2 nodes, RF 3 | Cannot maintain RF |
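The table follows a simple rule: compare the node count after removal with the replication factor. A minimal sketch of that rule, using the table's own risk labels, is shown here. The name removal_risk is hypothetical, and the sketch assumes a single datacenter with one RF value.

```shell
# removal_risk RF NODES
#   RF:    replication factor
#   NODES: node count before the removal
removal_risk() {
  rf=$1
  remaining=$(($2 - 1))
  if [ "$remaining" -lt "$rf" ]; then
    echo "Cannot maintain RF"
  elif [ "$remaining" -eq "$rf" ]; then
    echo "Minimum"
  else
    echo "Safe"
  fi
}
```

removal_risk 3 4 prints Minimum, matching the second row of the table.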
During Removenode¶
Monitor Progress¶
# Check removenode status
nodetool removenode status
# Watch streaming
nodetool netstats
# Check system logs
tail -f /var/log/cassandra/system.log | grep -i remove
Do NOT¶
During Removenode
- Do NOT start the dead node
- Do NOT run other topology changes
- Do NOT start repairs
- Do NOT restart live nodes
After Removenode¶
Verify Completion¶
nodetool status
The removed node should no longer appear.
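For scripted verification, the same check can be expressed as a test on the status output. This is a sketch only, and host_id_gone is a hypothetical helper name.

```shell
# host_id_gone HOST_ID
# Reads `nodetool status` output on stdin.
# Succeeds if HOST_ID no longer appears anywhere in the output.
host_id_gone() {
  ! grep -qF -- "$1"
}
```

For example, nodetool status | host_id_gone <host-id> && echo "removed" prints a confirmation once the node has left the ring.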
Run Repair¶
After removenode, run repair to ensure consistency:
# On each remaining node
nodetool repair -pr my_keyspace
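Because the repair has to be run on each remaining node, it can help to generate the command list first and review it before executing anything. The sketch below only prints commands and runs nothing. It assumes the remaining nodes are reachable over ssh, and both repair_plan and the host addresses in the example are hypothetical.

```shell
# repair_plan KEYSPACE HOST...
# Prints one primary-range repair command per host; nothing is executed.
repair_plan() {
  keyspace=$1
  shift
  for host in "$@"; do
    printf 'ssh %s nodetool repair -pr %s\n' "$host" "$keyspace"
  done
}
```

repair_plan my_keyspace 192.168.1.101 192.168.1.103 prints two ssh commands, which can then be run one node at a time.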
Common Issues¶
"This host ID is not part of the ring"¶
The node was already removed or the Host ID is incorrect:
# Double-check the Host ID against the Host ID column
nodetool status | grep <host-id>
"Cannot remove node - not enough replicas"¶
Not enough live replicas to reconstruct data:
# Check how many nodes are up
nodetool status | grep UN
# May need to bring another node back online first
Removenode Stuck¶
# Check status
nodetool removenode status
# Check for issues
nodetool netstats
tail /var/log/cassandra/system.log
If truly stuck:
# Force complete (data loss risk)
nodetool removenode force
"Cannot removenode while bootstrapping"¶
Another node is joining. Wait for bootstrap to complete:
# Check for joining nodes
nodetool status | grep UJ
# Wait for UN status
Removenode vs. Decommission vs. Assassinate¶
| Operation | Runs On | Use When |
|---|---|---|
| decommission | Node being removed | Node is alive |
| removenode | Any live node | Node is dead, data can be reconstructed |
| assassinate | Any live node | Removenode fails, node gossip state stuck |
Decision Flow¶
| Question | Yes | No |
|---|---|---|
| Is node responding? | Use decommission | Continue to next question |
| RF > 1 and other replicas alive? | Use removenode | Continue to next question |
| Accept data loss? | Use removenode force or assassinate | Try to recover the node |
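The decision flow reads top to bottom, and the first question answered "yes" determines the operation. A minimal sketch of that logic follows. The function name choose_operation and its yes/no arguments are hypothetical; the answers themselves still have to come from the checks described earlier on this page.

```shell
# choose_operation RESPONDING REPLICAS_OK ACCEPT_LOSS
# Each argument is "yes" or "no", answering the three questions in order.
choose_operation() {
  if [ "$1" = yes ]; then
    echo "decommission"
  elif [ "$2" = yes ]; then
    echo "removenode"
  elif [ "$3" = yes ]; then
    echo "removenode force or assassinate"
  else
    echo "try to recover the node"
  fi
}
```

choose_operation no yes no prints removenode, the expected path for a dead node whose data can still be reconstructed.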
Best Practices¶
Removenode Guidelines
- Verify node is truly dead - Don't remove a node that might rejoin
- Check replication first - Ensure data can be reconstructed
- One at a time - Never remove multiple nodes simultaneously
- Monitor progress - Watch streaming and logs
- Repair after - Run repair on remaining nodes
- Document - Record which nodes were removed and when
Related Commands¶
| Command | Relationship |
|---|---|
| decommission | Remove live node |
| assassinate | Force remove stuck node |
| status | Check cluster state |
| netstats | Monitor streaming |
| repair | Run after removal |