Token balancing in a brand-new Cassandra cluster
As described in https://thelastpickle.com/blog/2019/02/21/set-up-a-cluster-with-even-token-distribution.html, this seems to be the solution, at least for the distribution of tokens and data for a keyspace. I take the following steps to get a balanced system:
- Set up cassandra.yaml for the seed node (for my test case, num_tokens: 8) and leave the other parameters at their defaults.
- Start the seed node and wait until it is ready.
- Connect via cqlsh (or programmatically) and create the keyspace (for my test case, with replication factor 1).
- Shut down the seed node.
- Edit the cassandra.yaml of the seed node and uncomment/add the parameter
allocate_tokens_for_keyspace: [your_keyspace_name_from_step_3]
- Start the seed node again and wait until it is ready.
- Edit the cassandra.yaml of the second node: set the same allocate_tokens_for_keyspace parameter as in step 5, and set num_tokens equal to the seed node's num_tokens.
- Start the second node and wait until it is ready.
- Repeat the two previous steps (edit cassandra.yaml, then start the node) for every other node in your cluster.
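The two cassandra.yaml edits from the steps above can be sketched as a small shell script. The sample file contents, paths, keyspace name, and the replication strategy in the comment are assumptions for illustration; adapt them to your installation.

```shell
#!/bin/sh
# Sketch of the per-node cassandra.yaml edits, applied to a sample file.
# CONF path, keyspace name, and file contents are assumptions.
set -e
CONF=$(mktemp)
KEYSPACE=my_keyspace   # the keyspace created in step 3, e.g. via:
# cqlsh -e "CREATE KEYSPACE my_keyspace WITH replication =
#           {'class': 'NetworkTopologyStrategy', 'tc1': 1};"

# Relevant lines as they ship in a default cassandra.yaml
cat > "$CONF" <<'EOF'
num_tokens: 256
# allocate_tokens_for_keyspace: KEYSPACE_NAME
EOF

# Step 1: small, fixed token count (the same value on every node)
sed -i -E 's/^num_tokens:.*/num_tokens: 8/' "$CONF"

# Step 5: uncomment allocate_tokens_for_keyspace and point it at the keyspace
sed -i -E "s/^#? *allocate_tokens_for_keyspace:.*/allocate_tokens_for_keyspace: $KEYSPACE/" "$CONF"

cat "$CONF"
```

After running, the sample file contains `num_tokens: 8` and `allocate_tokens_for_keyspace: my_keyspace`, which is exactly the state each node's config should be in before it joins the cluster.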
With that, after a test run inserting 2,000,000 rows into a test table in the keyspace, I see the following result:
docker exec -ti docker_cassandra-seed_1 nodetool status
Datacenter: tc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address      Load       Tokens  Owns (effective)  Host ID                               Rack
UN  172.30.10.4  36.03 MiB  8       33.3%             1e0d781f-d71f-4704-bcd1-efb5d4caff0e  rack1
UN  172.30.10.2  36.75 MiB  8       33.3%             56287b3c-b0f1-489f-930e-c7b00df896f3  rack1
UN  172.30.10.3  36.03 MiB  8       33.3%             943acc5f-7257-414a-b36c-c06dcb53e67d  rack1
Even the token distribution is better than before:
172.30.10.2  6.148.914.691.236.510.000
172.30.10.3  6.148.914.691.236.520.000
172.30.10.4  5.981.980.531.853.070.000
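As a sanity check, these per-node totals can be related to the full Murmur3 token space of 2^64: each node owns roughly a third of it, matching the 33.3% shown by nodetool status. A small shell/awk sketch (the numbers are copied from the run above):

```shell
#!/bin/sh
# Sanity check: each node's owned token-range total, divided by the full
# 2^64 Murmur3 token space, should come out near 1/3 for three nodes.
result=$(awk 'BEGIN {
  total = 2 ^ 64
  owned["172.30.10.2"] = 6148914691236510000
  owned["172.30.10.3"] = 6148914691236520000
  owned["172.30.10.4"] = 5981980531853070000
  for (n in owned)
    printf "%s %.1f%%\n", n, 100 * owned[n] / total
}' | sort)
echo "$result"
# 172.30.10.2 33.3%
# 172.30.10.3 33.3%
# 172.30.10.4 32.4%
```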
This clarifies the problem of the uneven distribution, so thanks again to Chris Lohfink for the link to the solution.
I have tested a bit more around the above scenario. My test cluster consists of 5 nodes (1 seed, 4 normal nodes).
The first 5 steps from above remain valid:
- Set up cassandra.yaml for the seed node (for my test case, num_tokens: 8) and leave the other parameters at their defaults.
- Start the seed node and wait until it is ready.
- Connect via cqlsh (or programmatically) and create the keyspace (for my test case, with replication factor 1).
- Shut down the seed node, edit its cassandra.yaml, and uncomment/add the parameter
allocate_tokens_for_keyspace: [your_keyspace_name_from_step_3]
- Start the seed node again and wait until it is ready.
Then you can start all the other nodes (in my case 4) at the same time (or with a 1-minute delay between each node's startup), in an automated way. It is important that all nodes have allocate_tokens_for_keyspace: [your_keyspace....] set.
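The automated startup could be sketched like this. The container names are assumptions based on the docker naming above, and the real start command is left as a comment so the script runs as a dry run:

```shell
#!/bin/sh
# Dry-run sketch: start the four non-seed nodes one after another with a
# delay in between. Container names are assumptions; replace the echo with
# your real start command, e.g. docker start "$node".
DELAY=${DELAY:-1}   # use ~60 seconds in practice; kept short here
started=""
for node in docker_cassandra-node_1 docker_cassandra-node_2 \
            docker_cassandra-node_3 docker_cassandra-node_4; do
  echo "starting $node"      # docker start "$node"
  started="$started $node"
  sleep "$DELAY"
done
echo "started:$started"
```

Each node picks up allocate_tokens_for_keyspace from its own cassandra.yaml when it bootstraps, so the only orchestration needed is the staggered start.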
After all nodes are up and the keyspace is filled with 1,000,000 rows, there is an even balance of 20% per node.
That scenario makes life easier if you start a cluster with a lot of nodes.