Kafka rebalances during the rolling restart #1642

Open
filimonov opened this issue Feb 19, 2025 · 0 comments

If you have many nodes that each run many Kafka tables, a rolling restart can lead to a terrible cascade of rebalances.

node1 goes down
every Kafka table on that node shuts down
that triggers rebalances in the consumer groups
a rebalance is a 'stop the world' event in ClickHouse
so all other replicas pause consumption and start the rebalance protocol to redistribute the topics / partitions
it usually takes from a few seconds to dozens of seconds until they get the new assignment
in the meantime node1 comes back online and triggers one more rebalance

then the situation repeats for other nodes.
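
To make the cost of the sequence above concrete, here is a minimal sketch that just enumerates the rebalance triggers during a rolling restart. It assumes, as described above, that each node causes one rebalance when its Kafka tables shut down and one more when it comes back; the function name and node names are made up for illustration.

```python
def rolling_restart_events(nodes):
    """List the rebalance-triggering events of a rolling restart, in order."""
    events = []
    for node in nodes:
        # Kafka tables on the node shut down, its consumers leave the group.
        events.append((node, "leave"))
        # The node is back online, its consumers rejoin the group.
        events.append((node, "join"))
    return events


if __name__ == "__main__":
    for node, kind in rolling_restart_events(["node1", "node2", "node3"]):
        print(f"{node}: {kind} -> rebalance")
```

Under that assumption every consumer group sees two rebalances per restarted node, and each of them pauses consumption on all the other replicas.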

Possible solution:

Let's introduce a setting like stopStreamingTablesDuringRestarts.
When it is enabled, clickhouse-operator should, before restarting the first node, run

DETACH TABLE db.table ON CLUSTER '{cluster}' PERMANENTLY

for every table with engine Kafka (maybe also RabbitMQ and others?),
and store in the state which tables were detached (wouldn't that be too much?).
After that, do the normal reconcile / restarts.

On both success and failure, do ATTACH TABLE for every table stored in the state.
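
A rough sketch of that flow, only to illustrate the ordering (it is not clickhouse-operator code, and the operator itself is not written in Python). Everything here is hypothetical: the function names, the `execute` and `restart` callbacks, and keeping the "state" in a plain in-memory list, which a real implementation would have to persist. The rows are assumed to come from something like `SELECT database, name, engine FROM system.tables`; identifiers are not quoted or escaped.

```python
STREAMING_ENGINES = frozenset({"Kafka"})  # assumption: could also include "RabbitMQ" etc.


def streaming_tables(rows, engines=STREAMING_ENGINES):
    """Pick (database, table) pairs whose engine is a streaming one."""
    return [(db, name) for db, name, engine in rows if engine in engines]


def detach_sql(db, table, cluster="{cluster}"):
    # Same statement as in the proposal; '{cluster}' is left as a macro.
    return f"DETACH TABLE {db}.{table} ON CLUSTER '{cluster}' PERMANENTLY"


def attach_sql(db, table, cluster="{cluster}"):
    return f"ATTACH TABLE {db}.{table} ON CLUSTER '{cluster}'"


def restart_with_detached_tables(rows, execute, restart):
    """Detach streaming tables, run the restart, then re-attach them.

    execute: hypothetical callback that runs one SQL statement.
    restart: hypothetical callback that performs the normal reconcile / restarts.
    """
    detached = []  # the "state": tables we detached and must attach again
    try:
        for db, table in streaming_tables(rows):
            execute(detach_sql(db, table))
            detached.append((db, table))
        restart()
    finally:
        # Runs on success and on failure, so tables are not left detached.
        for db, table in detached:
            execute(attach_sql(db, table))


if __name__ == "__main__":
    rows = [("db", "events_queue", "Kafka"), ("db", "events", "MergeTree")]
    restart_with_detached_tables(rows, print, lambda: print("-- rolling restart --"))
```

The `try` / `finally` is what gives the "success / failure" behaviour: only tables that were actually detached are recorded, and exactly those are attached again afterwards.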
