Expanding a Kafka cluster
Editor: Admin | 22 views | 2022.07.25
Adding servers to a Kafka cluster is easy: assign each one a unique broker id and start Kafka on it. However, new servers are not automatically assigned any data partitions, so unless partitions are moved to them they will do no work until new topics are created. When you add machines to a cluster, you will therefore usually want to migrate some existing data to them.
The process of migrating data is manually initiated but fully automated. Under the covers, Kafka adds the new server as a follower of the partition it is migrating and lets it fully replicate the existing data in that partition. Once the new server has fully replicated the partition's contents and joined the in-sync replica set, one of the existing replicas deletes its copy of the partition's data.
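To make the three steps concrete, here is a toy model in Python. It is only a sketch of the sequence described above, not Kafka's implementation; the function name `migrate_replica` and the list-based representation of replicas and the in-sync set are assumptions made for illustration.

```python
def migrate_replica(replicas, isr, old_broker, new_broker):
    """Toy model of moving one replica of a partition to a new broker.

    replicas: broker ids that hold the partition.
    isr: the subset of those brokers that is currently in sync.
    Returns a list of (replicas, isr) snapshots, one per step.
    """
    if old_broker not in replicas:
        raise ValueError("old_broker does not hold this partition")
    if new_broker in replicas:
        raise ValueError("new_broker already holds this partition")

    steps = []

    # Step 1: the new broker is added as a follower. It is not in sync yet.
    replicas = replicas + [new_broker]
    steps.append((list(replicas), list(isr)))

    # Step 2: after it has fully replicated the data, it joins the in-sync set.
    isr = isr + [new_broker]
    steps.append((list(replicas), list(isr)))

    # Step 3: the old replica is dropped and its copy of the data is deleted.
    replicas = [b for b in replicas if b != old_broker]
    isr = [b for b in isr if b != old_broker]
    steps.append((list(replicas), list(isr)))

    return steps
```

Calling `migrate_replica([1, 2], [1, 2], old_broker=1, new_broker=5)` walks a partition from brokers 1,2 to brokers 2,5 and shows that the replica count only drops back to two after the new broker is in sync.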
The partition reassignment tool can be used to move partitions across brokers. An ideal partition distribution would ensure an even data load and even partition sizes across all brokers. In 0.8.1, the tool cannot automatically study the data distribution in a Kafka cluster and move partitions around to even out the load, so the admin has to figure out which topics or partitions should be moved.
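Since the admin has to work out the imbalance by hand, a small script can help. The sketch below counts how many replicas each broker holds in an assignment written in the same JSON shape the tool uses. The helper name `replicas_per_broker` is made up for this example, and a replica count is only a rough proxy for load because it ignores partition sizes and traffic.

```python
import json
from collections import Counter


def replicas_per_broker(assignment_json):
    """Count replicas per broker id in a {"version", "partitions"} JSON string."""
    plan = json.loads(assignment_json)
    counts = Counter()
    for entry in plan["partitions"]:
        # Each entry lists the broker ids that hold one partition.
        counts.update(entry["replicas"])
    return dict(counts)


sample = """
{"version":1,
 "partitions":[{"topic":"foo1","partition":0,"replicas":[3,4]},
               {"topic":"foo1","partition":1,"replicas":[2,3]},
               {"topic":"foo1","partition":2,"replicas":[1,2]}]}
"""
print(replicas_per_broker(sample))
```

A broker id that is missing from the result holds nothing from that assignment, which is exactly the situation of a freshly added broker.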
The partition reassignment tool can run in 3 mutually exclusive modes:
- --generate: In this mode, given a list of topics and a list of brokers, the tool generates a candidate reassignment that moves all partitions of the specified topics to the new brokers. This option merely provides a convenient way to generate a partition reassignment plan from a list of topics and target brokers.
- --execute: In this mode, the tool kicks off the reassignment of partitions based on the user-provided reassignment plan (passed with the --reassignment-json-file option). The plan can either be hand-crafted by the admin or produced with the --generate option.
- --verify: In this mode, the tool verifies the status of the reassignment for all partitions listed during the last --execute. The status can be one of: successfully completed, failed, or in progress.
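The three modes differ only in which input file they take and which mode flag closes the command. A sketch of that in Python, using only the script path and flags that appear in this article; the function name `reassign_command` and its parameters are assumptions for illustration.

```python
SCRIPT = "bin/kafka-reassign-partitions.sh"  # path as used in this article


def reassign_command(mode, zookeeper, topics_file=None, broker_list=None,
                     plan_file=None):
    """Build the argument list for one of the three mutually exclusive modes."""
    args = [SCRIPT, "--zookeeper", zookeeper]
    if mode == "generate":
        # --generate needs the topics file and the target brokers.
        if topics_file is None or broker_list is None:
            raise ValueError("generate needs topics_file and broker_list")
        args += ["--topics-to-move-json-file", topics_file,
                 "--broker-list", ",".join(str(b) for b in broker_list)]
    elif mode in ("execute", "verify"):
        # --execute and --verify both take the reassignment plan file.
        if plan_file is None:
            raise ValueError(mode + " needs plan_file")
        args += ["--reassignment-json-file", plan_file]
    else:
        raise ValueError("mode must be generate, execute or verify")
    args.append("--" + mode)
    return args
```

The returned list can be handed to `subprocess.run(args, check=True)` on a machine where the Kafka scripts are installed.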
Automatically migrating data to new machines
The partition reassignment tool can be used to move some topics off the current set of brokers onto newly added brokers. This is typically useful when expanding an existing cluster, since moving entire topics to the new set of brokers is easier than moving one partition at a time. To do this, provide a list of topics to move and a target list of new brokers. The tool then evenly distributes all partitions of those topics across the new brokers. The replication factor of each topic stays constant during the move. Effectively, the replicas for all partitions of the listed topics are moved from the old set of brokers to the newly added ones.
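To illustrate what "evenly distributes while keeping the replication factor constant" can mean, here is a simple round-robin sketch. It is not the algorithm the tool uses, so the replica ordering it produces may differ from real --generate output; `spread_partitions` is a hypothetical helper.

```python
def spread_partitions(current_partitions, new_brokers):
    """Propose replicas on new_brokers, keeping each partition's replica count.

    current_partitions: list of {"topic", "partition", "replicas"} dicts.
    Returns a plan in the {"version", "partitions"} shape used by the tool.
    """
    n = len(new_brokers)
    proposed = []
    for i, entry in enumerate(current_partitions):
        factor = len(entry["replicas"])  # replication factor stays constant
        if factor > n:
            raise ValueError("not enough new brokers for the replication factor")
        # Start each partition at a different broker so load rotates evenly.
        replicas = [new_brokers[(i + k) % n] for k in range(factor)]
        proposed.append({"topic": entry["topic"],
                         "partition": entry["partition"],
                         "replicas": replicas})
    return {"version": 1, "partitions": proposed}
```

With two new brokers and a replication factor of two, every partition ends up on both brokers, only the order of the replica list rotates.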
For instance, the following example moves all partitions of topics foo1 and foo2 to the new set of brokers 5,6. At the end of this move, all partitions of topics foo1 and foo2 will exist only on brokers 5,6.
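Note from the site admin for Kafka learners: every json file below must be created by hand; none is created automatically. Copy the generated plan into a new json file yourself, then run the command.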
Since the tool accepts the input list of topics as a json file, you first need to identify the topics you want to move and create the json file as follows:
> cat topics-to-move.json
{"topics": [{"topic": "foo1"},
            {"topic": "foo2"}],
 "version":1
}
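If you prefer not to type the file by hand, a few lines of Python can write it. This is a minimal sketch; the helper names `topics_to_move` and `write_json` are invented for this example, and only the JSON shape comes from the file shown above.

```python
import json


def topics_to_move(topics):
    """Build the document expected in topics-to-move.json."""
    return {"topics": [{"topic": name} for name in topics], "version": 1}


def write_json(path, document):
    """Write a JSON document to path so the tool can read it."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(document, handle)


if __name__ == "__main__":
    write_json("topics-to-move.json", topics_to_move(["foo1", "foo2"]))
```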
Once the json file is ready, use the partition reassignment tool to generate a candidate assignment:
> bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 --topics-to-move-json-file topics-to-move.json --broker-list "5,6" --generate
Current partition replica assignment
{"version":1,
 "partitions":[{"topic":"foo1","partition":2,"replicas":[1,2]},
               {"topic":"foo1","partition":0,"replicas":[3,4]},
               {"topic":"foo2","partition":2,"replicas":[1,2]},
               {"topic":"foo2","partition":0,"replicas":[3,4]},
               {"topic":"foo1","partition":1,"replicas":[2,3]},
               {"topic":"foo2","partition":1,"replicas":[2,3]}]
}
Proposed partition reassignment configuration
{"version":1,
 "partitions":[{"topic":"foo1","partition":2,"replicas":[5,6]},
               {"topic":"foo1","partition":0,"replicas":[5,6]},
               {"topic":"foo2","partition":2,"replicas":[5,6]},
               {"topic":"foo2","partition":0,"replicas":[5,6]},
               {"topic":"foo1","partition":1,"replicas":[5,6]},
               {"topic":"foo2","partition":1,"replicas":[5,6]}]
}
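Copying the two JSON documents out of the terminal is easy to get wrong, so here is a sketch that pulls them out of captured --generate output. It assumes the two header lines are worded exactly as in the output above; other Kafka versions may print different text. The function names are hypothetical.

```python
import json

CURRENT_HEADER = "Current partition replica assignment"
PROPOSED_HEADER = "Proposed partition reassignment configuration"


def _json_after(text, header):
    """Decode the first JSON object that follows header in text."""
    start = text.index(header) + len(header)  # ValueError if header is missing
    brace = text.index("{", start)
    document, _end = json.JSONDecoder().raw_decode(text, brace)
    return document


def parse_generate_output(text):
    """Return (current_assignment, proposed_assignment) as dicts."""
    return _json_after(text, CURRENT_HEADER), _json_after(text, PROPOSED_HEADER)
```

The first dict is what you would save for a rollback, and the second is what you would save as the plan for --execute.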
The tool generates a candidate assignment that will move all partitions of topics foo1 and foo2 to brokers 5,6. Note, however, that at this point the partition movement has not started; the tool merely tells you the current assignment and the proposed new assignment. Save the current assignment in case you want to roll back to it. Save the new assignment in a json file (e.g. expand-cluster-reassignment.json) and pass it to the tool with the --execute option as follows:
> bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file expand-cluster-reassignment.json --execute
Current partition replica assignment
{"version":1,
 "partitions":[{"topic":"foo1","partition":2,"replicas":[1,2]},
               {"topic":"foo1","partition":0,"replicas":[3,4]},
               {"topic":"foo2","partition":2,"replicas":[1,2]},
               {"topic":"foo2","partition":0,"replicas":[3,4]},
               {"topic":"foo1","partition":1,"replicas":[2,3]},
               {"topic":"foo2","partition":1,"replicas":[2,3]}]
}
Save this to use as the --reassignment-json-file option during rollback
Successfully started reassignment of partitions
{"version":1,
 "partitions":[{"topic":"foo1","partition":2,"replicas":[5,6]},
               {"topic":"foo1","partition":0,"replicas":[5,6]},
               {"topic":"foo2","partition":2,"replicas":[5,6]},
               {"topic":"foo2","partition":0,"replicas":[5,6]},
               {"topic":"foo1","partition":1,"replicas":[5,6]},
               {"topic":"foo2","partition":1,"replicas":[5,6]}]
}
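Before or after running --execute, it can be useful to see exactly which partitions a plan moves, for example to judge how much data a rollback would have to move back. The sketch below compares two assignments in the JSON shape printed above; `changed_partitions` is a made-up helper, not part of Kafka.

```python
def changed_partitions(current, proposed):
    """List (topic, partition, old_replicas, new_replicas) for every change.

    Both arguments are dicts in the {"version", "partitions"} shape.
    old_replicas is None when the partition is absent from current.
    """
    before = {(p["topic"], p["partition"]): p["replicas"]
              for p in current["partitions"]}
    changes = []
    for entry in proposed["partitions"]:
        key = (entry["topic"], entry["partition"])
        old = before.get(key)
        if old != entry["replicas"]:
            changes.append((key[0], key[1], old, entry["replicas"]))
    # Sort by topic and partition only, so a None old value is never compared.
    return sorted(changes, key=lambda change: (change[0], change[1]))
```

Swapping the two arguments gives the list of moves a rollback would perform.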
Finally, use the --verify option to check the status of the partition reassignment. Note that the same expand-cluster-reassignment.json that was passed to --execute must also be passed to --verify:
> bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file expand-cluster-reassignment.json --verify
Status of partition reassignment:
Reassignment of partition [foo1,0] completed successfully
Reassignment of partition [foo1,1] is in progress
Reassignment of partition [foo1,2] is in progress
Reassignment of partition [foo2,0] completed successfully
Reassignment of partition [foo2,1] completed successfully
Reassignment of partition [foo2,2] completed successfully
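When many partitions are moving, you may want a script to tell you which ones are still pending. The sketch below parses status lines in the format shown above, one per line. The exact wording is taken from this article's output and may differ in other versions; the helper names are hypothetical.

```python
import re

# Matches lines such as: Reassignment of partition [foo1,0] completed successfully
STATUS_LINE = re.compile(
    r"^Reassignment of partition \[([^,\]]+),(\d+)\] (.+)$", re.MULTILINE)


def parse_verify_output(text):
    """Map (topic, partition) to the status text printed for it."""
    return {(m.group(1), int(m.group(2))): m.group(3).strip()
            for m in STATUS_LINE.finditer(text)}


def pending(statuses):
    """Return the partitions whose status is not 'completed successfully'."""
    return sorted(key for key, status in statuses.items()
                  if status != "completed successfully")
```

Re-running --verify until `pending` returns an empty list is one simple way to wait for the move to finish.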
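The partition reassignment tool can also be used to selectively move replicas of a partition to a specific set of brokers. Used this way, it is assumed that you already know the reassignment plan and do not need the tool to generate one, so you can skip the --generate step and go straight to --execute.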
For instance, the following example moves partition 0 of topic foo1 to brokers 5,6 and partition 1 of topic foo2 to brokers 2,3.
The first step is to hand-craft the custom reassignment plan in a json file:
> cat custom-reassignment.json
{"version":1,"partitions":[{"topic":"foo1","partition":0,"replicas":[5,6]},{"topic":"foo2","partition":1,"replicas":[2,3]}]}
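A hand-written plan is easy to get subtly wrong, so a quick sanity check before --execute can save a failed run. The checks below are choices made for this sketch (version is 1, no partition listed twice, replicas non-empty and without repeated broker ids); they are not a description of what the tool itself validates, and `check_plan` is a hypothetical helper.

```python
def check_plan(plan):
    """Return a list of human-readable problems found in a reassignment plan."""
    problems = []
    if plan.get("version") != 1:
        problems.append("version should be 1")
    seen = set()
    for entry in plan.get("partitions", []):
        key = (entry.get("topic"), entry.get("partition"))
        replicas = entry.get("replicas", [])
        if key in seen:
            problems.append("%s-%s is listed more than once" % key)
        seen.add(key)
        if not replicas:
            problems.append("%s-%s has no replicas" % key)
        elif len(set(replicas)) != len(replicas):
            problems.append("%s-%s repeats a broker id" % key)
    return problems
```

An empty list means the plan passed these checks, not that the brokers it names actually exist in your cluster.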
Then, use the json file with the --execute option to start the reassignment process:
> bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file custom-reassignment.json --execute
Current partition replica assignment
{"version":1,
 "partitions":[{"topic":"foo1","partition":0,"replicas":[1,2]},
               {"topic":"foo2","partition":1,"replicas":[3,4]}]
}
Save this to use as the --reassignment-json-file option during rollback
Successfully started reassignment of partitions
{"version":1,
 "partitions":[{"topic":"foo1","partition":0,"replicas":[5,6]},
               {"topic":"foo2","partition":1,"replicas":[2,3]}]
}
Finally, use the --verify option to check the status of the partition reassignment. Note that the same custom-reassignment.json that was passed to --execute must also be passed to --verify:
> bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file custom-reassignment.json --verify
Status of partition reassignment:
Reassignment of partition [foo1,0] completed successfully
Reassignment of partition [foo2,1] completed successfully