Activity

Prerequisites

In this section, we will:

- Examine the `docker-compose.yml` file that defines our Kafka cluster environment
- Start the 3 combined controller/broker cluster nodes that are defined in the `docker-compose.yml` file
- Create 2 topics, each with 12 partitions
- In the Gitpod explorer hierarchy, expand `kraft` and click `docker-compose.yml`.

  Notice the following settings that are being established:

  - Line 20 sets `metadata.log.segment.ms` to `15000`, or 15 seconds. This is the maximum time (in milliseconds) before a new metadata log segment file is rolled:

    ```
    20 KAFKA_METADATA_LOG_SEGMENT_MS: 15000
    ```

  - Line 21 sets `metadata.max.retention.ms` to `1200000`, or 20 minutes. When a snapshot (`.checkpoint`) file reaches this age, it is deleted unless it is the only available snapshot:

    ```
    21 KAFKA_METADATA_MAX_RETENTION_MS: 1200000
    ```

    When a snapshot is deleted, all `.log` files other than the one created immediately prior to the next snapshot are also deleted.

  - Line 22 sets `metadata.log.max.record.bytes.between.snapshots` to `2800`. This configures the node to create a snapshot whenever the `__cluster_metadata` log grows by 2800 bytes since the last snapshot:

    ```
    22 KAFKA_METADATA_LOG_MAX_RECORD_BYTES_BETWEEN_SNAPSHOTS: 2800
    ```

  All nodes in the cluster are configured with these settings. The values are artificially small to force log segments and snapshot files to be created so that we can observe how the log cleaner operates on the `__cluster_metadata` topic.
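Taken together, the three settings above sit in the `environment` block of each node's service definition. A rough sketch of that block (the service name and surrounding keys are assumptions; only the three `KAFKA_METADATA_*` values come from the exercise):

```yaml
services:
  server-1:            # one of the combined controller/broker nodes (name assumed)
    environment:
      KAFKA_METADATA_LOG_SEGMENT_MS: 15000                         # roll a new segment every 15 s
      KAFKA_METADATA_MAX_RETENTION_MS: 1200000                     # delete snapshots older than 20 min
      KAFKA_METADATA_LOG_MAX_RECORD_BYTES_BETWEEN_SNAPSHOTS: 2800  # snapshot every 2800 bytes of log growth
```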
- Close the `docker-compose.yml` tab.

- Maximize the terminal window.

- In the terminal window, navigate to the exercise directory:

  ```shell
  cd kraft
  ```

- Create the server containers:

  ```shell
  docker-compose create server-1 server-2 server-3 server-4
  ```

- Start the three combined controller/broker nodes:

  ```shell
  docker-compose start server-1 server-2 server-3
  ```

- Verify they are up and running:

  ```shell
  docker-compose ps
  ```

- Create two 12-partition topics:

  ```shell
  docker-compose exec server-1 \
    kafka-topics --bootstrap-server server-1:39094 \
    --create --topic topic-1 \
    --replication-factor 3 \
    --partitions 12 && \
  docker-compose exec server-1 \
    kafka-topics --bootstrap-server server-1:39094 \
    --create --topic topic-2 \
    --replication-factor 3 \
    --partitions 12
  ```
Examine the Logs Directory

In this section, we will:

- Examine the contents of the server logs directory
- Observe the creation of a cluster metadata snapshot
- List the contents of the logs directory:

  ```shell
  docker-compose exec server-1 \
    ls /tmp/kraft-combined-logs
  ```

- List the contents of the `__cluster_metadata-0` directory:

  ```shell
  docker-compose exec server-1 \
    ls -al /tmp/kraft-combined-logs/__cluster_metadata-0
  ```

  Notice that a metadata snapshot (`.checkpoint` file) hasn't been created yet. The `.log` file has not yet reached the 2800 byte size that triggers snapshot creation. We will now reassign the `topic-1` partitions, which will push the log size over the 2800 byte threshold.

  NOTE: The `.snapshot` file is related to producer idempotency, and while it exists for the `__cluster_metadata` topic, it serves no real purpose at this time. Since this file name extension was already in use by Kafka, `.checkpoint` was chosen as the file name extension for the metadata snapshot.

- Reassign the preferred leader for each `topic-1` partition:

  ```shell
  docker-compose exec server-1 \
    kafka-reassign-partitions \
    --bootstrap-server server-1:39094 \
    --reassignment-json-file /tmp/reassign-topic-1-a.json \
    --execute
  ```
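The reassignment file `/tmp/reassign-topic-1-a.json` is provided by the exercise environment, and its exact contents are not shown here. For reference, a `kafka-reassign-partitions` input file generally has the following shape (the partition and replica values below are hypothetical):

```json
{
  "version": 1,
  "partitions": [
    { "topic": "topic-1", "partition": 0, "replicas": [2, 3, 1] },
    { "topic": "topic-1", "partition": 1, "replicas": [3, 1, 2] }
  ]
}
```

Listing a different broker first in `replicas` changes that partition's preferred leader, which is what generates the metadata events we want to observe.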
- List the contents of the `__cluster_metadata-0` directory:

  ```shell
  docker-compose exec server-1 \
    ls -al /tmp/kraft-combined-logs/__cluster_metadata-0
  ```

  Notice that a snapshot `.checkpoint` file has now been created.

- Save the `.checkpoint` file name to an environment variable for use in upcoming commands:

  ```shell
  export SNAPSHOT_1=<file name>.checkpoint
  ```

  For example:

  ```shell
  export SNAPSHOT_1=00000000000000000042-0000000001.checkpoint
  ```
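The snapshot file name encodes the snapshot's end offset and epoch as zero-padded fields, in the form `<EndOffset>-<Epoch>.checkpoint`. A small sketch of pulling those fields apart (the file name below is the hypothetical example above, not one from your cluster):

```shell
# Split a snapshot file name into its end-offset and epoch components.
name="00000000000000000042-0000000001.checkpoint"
base="${name%.checkpoint}"   # strip the extension
offset="${base%-*}"          # field before the dash
epoch="${base#*-}"           # field after the dash
# Force base-10 arithmetic so the leading zeros are not read as octal.
echo "end offset: $((10#$offset)), epoch: $((10#$epoch))"
```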
Generate Additional Cluster Metadata Events and a Second Snapshot

In this section, we will:

- Start `server-4`
- Reassign `topic-1` and `topic-2` partitions multiple times to cause a second snapshot to be created

- Start `server-4`:

  ```shell
  docker-compose start server-4
  ```

- Run the partition reassignment script:

  ```shell
  scripts/reassign_partitions.sh
  ```

- List the contents of the `__cluster_metadata-0` directory:

  ```shell
  docker-compose exec server-1 \
    ls -al /tmp/kraft-combined-logs/__cluster_metadata-0
  ```

  Notice that a second snapshot `.checkpoint` file has now been created.

- Save the `.checkpoint` file name to an environment variable for use in upcoming commands:

  ```shell
  export SNAPSHOT_2=<file name>.checkpoint
  ```
Compare the Snapshot File Contents

In this section, we will verify:

- The first snapshot's metadata structure does not include `server-4`, since it was created before `server-4` was started
- The second snapshot's metadata structure does include `server-4`, since it was created after `server-4` was started

- List the cluster brokers in the first snapshot:

  ```shell
  docker-compose exec server-1 \
    kafka-metadata-shell \
    --snapshot /tmp/kraft-combined-logs/__cluster_metadata-0/$SNAPSHOT_1 \
    ls brokers
  ```

  Notice the broker list does not include `server-4`.

- List the cluster brokers in the second snapshot:

  ```shell
  docker-compose exec server-1 \
    kafka-metadata-shell \
    --snapshot /tmp/kraft-combined-logs/__cluster_metadata-0/$SNAPSHOT_2 \
    ls brokers
  ```

  Notice the broker list does include `server-4`.
Observe Partition Leader Election Related Events in the Cluster Metadata Log

In this section, we will:

- Add a new single-partition topic with `server-4` as its leader
- Kill `server-4`
- Examine the partition leader election related events written to the cluster metadata log by the active controller

- Create a new topic with `server-4` as the leader:

  ```shell
  docker-compose exec server-1 \
    kafka-topics --bootstrap-server server-1:39094 \
    --create \
    --topic topic-3 \
    --replica-assignment 4:1:2
  ```

- Kill `server-4`:

  ```shell
  docker-compose kill server-4
  ```

- List the contents of the `__cluster_metadata-0` directory:

  ```shell
  docker-compose exec server-1 \
    ls -al /tmp/kraft-combined-logs/__cluster_metadata-0
  ```

  Note the file name of the active log segment.
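The active segment is the `.log` file whose name (its zero-padded base offset) sorts highest in the listing. A small sketch of that selection, simulating a directory listing with hypothetical file names:

```shell
# Simulate segment file names like those in __cluster_metadata-0;
# zero-padded base offsets sort lexically, so the last one is the
# active segment.
printf '%s\n' \
  00000000000000000000.log \
  00000000000000000154.log \
  00000000000000000307.log \
  | sort | tail -n 1
```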
- Review the `__cluster_metadata` log for events related to the failed broker and the new leader election for `topic-3`:

  ```shell
  docker-compose exec server-1 \
    kafka-dump-log --cluster-metadata-decoder \
    --files /tmp/kraft-combined-logs/__cluster_metadata-0/<active log segment filename>
  ```

- Locate the `FENCE_BROKER_RECORD` and `PARTITION_CHANGE_RECORD` at the end of the log.

  These events were written to the log by the active controller. When other controllers and brokers replicate these events to their logs, they update their in-memory metadata structures with these changes.
Observe the Cleanup Process in Action

In this section, we will:

- Wait 20 minutes for the snapshot retention period set by `metadata.max.retention.ms` to expire
- Observe that the initial snapshot `.checkpoint` file is deleted, since its age is greater than the retention period and a more recent snapshot exists. All `__cluster_metadata` log segments whose `EndOffset` is less than the next snapshot's `LogStartOffset` are also deleted.
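The deletion rule above can be sketched as a comparison of each segment's end offset against the surviving snapshot's log start offset (all offset values below are hypothetical):

```shell
# Hypothetical offsets: three segments ending at 40, 80, and 150, and a
# surviving snapshot whose LogStartOffset is 120. Segments entirely
# covered by the snapshot (EndOffset < LogStartOffset) can be deleted.
log_start_offset=120
for end_offset in 40 80 150; do
  if [ "$end_offset" -lt "$log_start_offset" ]; then
    echo "segment ending at $end_offset: eligible for deletion"
  else
    echo "segment ending at $end_offset: retained"
  fi
done
```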
- List the contents of the `__cluster_metadata-0` directory:

  ```shell
  docker-compose exec server-1 \
    ls -al /tmp/kraft-combined-logs/__cluster_metadata-0
  ```

- Wait until 20 minutes have elapsed since the first snapshot file was created.

- List the contents of the `__cluster_metadata-0` directory:

  ```shell
  docker-compose exec server-1 \
    ls -al /tmp/kraft-combined-logs/__cluster_metadata-0
  ```

  Observe that the directory has been cleaned.

  The cleaner runs every 5 minutes, so you may have to wait a little longer than 20 minutes for this directory cleanup to happen.