# Activity
In this section, we will:

- Examine the `docker-compose.yml` file that defines our Kafka cluster environment
- Start the 3 `controller,broker` cluster nodes that are defined in the `docker-compose.yml` file
- Create 2 topics, each with 12 partitions
- In the Gitpod explorer hierarchy, expand `kraft` and click `docker-compose.yml`.
- Notice the following settings that are being established:

  - Line 20 sets `metadata.log.segment.ms` to `15000`, or 15 seconds. This is the maximum time (in milliseconds) before a new metadata log segment file is rolled:

    ```
    20    KAFKA_METADATA_LOG_SEGMENT_MS: 15000
    ```

  - Line 21 sets `metadata.max.retention.ms` to `1200000`, or 20 minutes. When a snapshot (`.checkpoint`) file reaches this age, it is deleted unless it is the only available snapshot:

    ```
    21    KAFKA_METADATA_MAX_RETENTION_MS: 1200000
    ```

    When a snapshot is deleted, all `.log` files other than the one created immediately prior to the next snapshot are also deleted.

  - Line 22 sets `metadata.log.max.record.bytes.between.snapshots` to `2800`. This configures the node to create a snapshot whenever the `__cluster_metadata` log grows by 2800 bytes since the last snapshot:

    ```
    22    KAFKA_METADATA_LOG_MAX_RECORD_BYTES_BETWEEN_SNAPSHOTS: 2800
    ```

  All nodes in the cluster are configured with these settings. The values are artificially small to force log segments and snapshot files to be created quickly, so that we can observe how the log cleaner operates on the `__cluster_metadata` topic.
- Close the `docker-compose.yml` tab.
- Maximize the terminal window.
- In the terminal window, navigate to the exercise directory:

  ```shell
  cd kraft
  ```

- Create the server containers:

  ```shell
  docker-compose create server-1 server-2 server-3 server-4
  ```

- Start the three `controller/broker` nodes:

  ```shell
  docker-compose start server-1 server-2 server-3
  ```

- Verify they are up and running:

  ```shell
  docker-compose ps
  ```

- Create two 12-partition topics:

  ```shell
  docker-compose exec server-1 \
    kafka-topics --bootstrap-server server-1:39094 \
    --create --topic topic-1 \
    --replication-factor 3 \
    --partitions 12 && \
  docker-compose exec server-1 \
    kafka-topics --bootstrap-server server-1:39094 \
    --create --topic topic-2 \
    --replication-factor 3 \
    --partitions 12
  ```
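For reference, the three overrides examined above live in each server's `environment` block of the compose file. A fragment of that shape (the exact service layout in the lab's file may differ):

```yaml
environment:
  # Roll a new metadata log segment at most every 15 seconds
  KAFKA_METADATA_LOG_SEGMENT_MS: 15000
  # Delete a snapshot once it is 20 minutes old (unless it is the only one)
  KAFKA_METADATA_MAX_RETENTION_MS: 1200000
  # Take a new snapshot after ~2800 bytes of new metadata records
  KAFKA_METADATA_LOG_MAX_RECORD_BYTES_BETWEEN_SNAPSHOTS: 2800
```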
## Examine the Logs Directory
In this section, we will:

- Examine the contents of the server logs directory
- Observe the creation of a cluster metadata snapshot
- List the contents of the logs directory:

  ```shell
  docker-compose exec server-1 \
    ls /tmp/kraft-combined-logs
  ```

- List the contents of the `__cluster_metadata-0` directory:

  ```shell
  docker-compose exec server-1 \
    ls -al /tmp/kraft-combined-logs/__cluster_metadata-0
  ```

  Notice that a metadata snapshot (`.checkpoint` file) hasn't been created yet. The `.log` file has not yet reached the 2800-byte size that triggers its creation. We will now reassign the `topic-1` partitions, which will push the log size over the 2800-byte threshold.

  NOTE: The `.snapshot` file is related to producer idempotency, and while it exists for the `__cluster_metadata` topic, it serves no real purpose at this time. Since this file name extension was already in use by Kafka, `.checkpoint` was chosen as the extension for metadata snapshots.

- Reassign the preferred leader for each `topic-1` partition:

  ```shell
  docker-compose exec server-1 \
    kafka-reassign-partitions \
    --bootstrap-server server-1:39094 \
    --reassignment-json-file /tmp/reassign-topic-1-a.json \
    --execute
  ```

- List the contents of the `__cluster_metadata-0` directory:

  ```shell
  docker-compose exec server-1 \
    ls -al /tmp/kraft-combined-logs/__cluster_metadata-0
  ```

  Notice that a snapshot (`.checkpoint`) file has now been created.

- Save the `.checkpoint` file name to an environment variable for use in upcoming commands:

  ```shell
  export SNAPSHOT_1=<file name>.checkpoint
  ```

  For example:

  ```shell
  export SNAPSHOT_1=00000000000000000042-0000000001.checkpoint
  ```
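The reassignment file referenced above ships inside the container, and its exact contents aren't shown in this lab. A file of the following shape is what `kafka-reassign-partitions` expects; the replica lists here are invented for illustration, and only two of the twelve partitions are shown:

```shell
# Hypothetical sketch of the JSON shape behind /tmp/reassign-topic-1-a.json.
# The lab's real file and replica assignments may differ.
cat > /tmp/reassign-example.json <<'EOF'
{
  "version": 1,
  "partitions": [
    { "topic": "topic-1", "partition": 0, "replicas": [2, 1, 3] },
    { "topic": "topic-1", "partition": 1, "replicas": [3, 2, 1] }
  ]
}
EOF
# Each "replicas" list names broker ids; the first entry is the preferred
# leader, which is why reassignment changes partition leadership.
grep -c '"topic": "topic-1"' /tmp/reassign-example.json
```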
## Generate Additional Cluster Metadata Events and a Second Snapshot
In this section, we will:

- Start `server-4`
- Reassign `topic-1` and `topic-2` partitions multiple times to cause a second snapshot to be created
- Start `server-4`:

  ```shell
  docker-compose start server-4
  ```

- Run the partition reassignment script:

  ```shell
  scripts/reassign_partitions.sh
  ```

- List the contents of the `__cluster_metadata-0` directory:

  ```shell
  docker-compose exec server-1 \
    ls -al /tmp/kraft-combined-logs/__cluster_metadata-0
  ```

  Notice that a second snapshot (`.checkpoint`) file has now been created.

- Save the `.checkpoint` file name to an environment variable for use in upcoming commands:

  ```shell
  export SNAPSHOT_2=<file name>.checkpoint
  ```
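The snapshot file name itself carries information: it is assumed here to encode the snapshot's end offset and epoch as `<endOffset>-<epoch>.checkpoint`. A shell sketch that pulls the two fields out of the example name used for `SNAPSHOT_1` earlier:

```shell
# Parse a .checkpoint file name, assumed to be of the form
# <endOffset>-<epoch>.checkpoint (the sample matches the SNAPSHOT_1
# example shown earlier).
name="00000000000000000042-0000000001.checkpoint"
base="${name%.checkpoint}"   # drop the extension
end_offset="${base%-*}"      # digits before the dash
epoch="${base#*-}"           # digits after the dash
# Strip the zero padding for readability:
echo "end offset: $(echo "$end_offset" | sed 's/^0*//'), epoch: $(echo "$epoch" | sed 's/^0*//')"
```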
## Compare the Snapshot File Contents
In this section, we will verify:

- The first snapshot's metadata structure does not include `server-4`, since it was created prior to `server-4` being started
- The second snapshot's metadata structure does include `server-4`, since it was created after `server-4` was started
- List the cluster brokers in the first snapshot:

  ```shell
  docker-compose exec server-1 \
    kafka-metadata-shell \
    --snapshot /tmp/kraft-combined-logs/__cluster_metadata-0/$SNAPSHOT_1 \
    ls brokers
  ```

  Notice the broker list does not include `server-4`.

- List the cluster brokers in the second snapshot:

  ```shell
  docker-compose exec server-1 \
    kafka-metadata-shell \
    --snapshot /tmp/kraft-combined-logs/__cluster_metadata-0/$SNAPSHOT_2 \
    ls brokers
  ```

  Notice the broker list does include `server-4`.
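Rather than eyeballing the two listings, you can capture each `ls brokers` output to a file and compare them with `comm(1)`. A sketch against stand-in files (the broker ids below are illustrative, not captured from the lab):

```shell
# The two files stand in for the captured output of the two
# `ls brokers` commands above.
printf '1\n2\n3\n'    > /tmp/brokers_snap1.txt
printf '1\n2\n3\n4\n' > /tmp/brokers_snap2.txt
# comm -13 prints lines unique to the second file: brokers that
# registered between the two snapshots.
comm -13 /tmp/brokers_snap1.txt /tmp/brokers_snap2.txt
```

Note that `comm` requires its inputs to be sorted, which `ls` output already is.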
## Observe Partition Leader Election Related Events in the Cluster Metadata Log
In this section, we will:

- Add a new single-partition topic with `server-4` as its leader
- Kill `server-4`
- Examine the partition leader election events written to the cluster metadata log by the active controller
- Create a new topic with `server-4` as the leader:

  ```shell
  docker-compose exec server-1 \
    kafka-topics --bootstrap-server server-1:39094 \
    --create \
    --topic topic-3 \
    --replica-assignment 4:1:2
  ```

- Kill `server-4`:

  ```shell
  docker-compose kill server-4
  ```

- List the contents of the `__cluster_metadata-0` directory:

  ```shell
  docker-compose exec server-1 \
    ls -al /tmp/kraft-combined-logs/__cluster_metadata-0
  ```

  Note the file name of the active log segment.

- Review the `__cluster_metadata` log for events related to the failed broker and the new leader election for `topic-3`:

  ```shell
  docker-compose exec server-1 \
    kafka-dump-log --cluster-metadata-decoder \
    --files /tmp/kraft-combined-logs/__cluster_metadata-0/<active log segment filename>
  ```

- Locate the `FENCE_BROKER_RECORD` and `PARTITION_CHANGE_RECORD` entries at the end of the log.

  These events were written to the log by the active controller. When the other controllers and brokers replicate these events to their logs, they update their in-memory metadata structures with these changes.
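The dump output can be long, so it helps to filter for just the two record-type names. A sketch against stand-in lines (the payload fields below are illustrative; only the record-type names are taken from the text above):

```shell
# Sample lines standing in for real kafka-dump-log --cluster-metadata-decoder
# output; the data fields are invented for illustration.
cat > /tmp/dump-sample.txt <<'EOF'
payload: {"type":"FENCE_BROKER_RECORD","version":0,"data":{"id":4}}
payload: {"type":"TOPIC_RECORD","version":0,"data":{"name":"topic-3"}}
payload: {"type":"PARTITION_CHANGE_RECORD","version":0,"data":{"partitionId":0}}
EOF
# Keep only the fencing and partition-change events:
grep -E 'FENCE_BROKER_RECORD|PARTITION_CHANGE_RECORD' /tmp/dump-sample.txt
```

In the lab, you would pipe the `kafka-dump-log` command from the previous step through the same `grep`.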
## Observe the Cleanup Process in Action
In this section, we will:

- Wait 20 minutes for the snapshot retention period set by `metadata.max.retention.ms` to expire
- Observe that the initial snapshot (`.checkpoint`) file is deleted, since its age exceeds the retention period and a more recent snapshot exists. All `__cluster_metadata` log segments whose `EndOffset` is less than the next snapshot's `LogStartOffset` are also deleted.
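The segment-deletion rule above reduces to an offset comparison: a segment holds no records the surviving snapshot doesn't already cover once its end offset falls below the snapshot's log-start offset. A sketch with invented offsets:

```shell
# Invented offsets, for illustration only.
snapshot_log_start_offset=120
deletable=0
for segment_end_offset in 80 115 140; do
  if [ "$segment_end_offset" -lt "$snapshot_log_start_offset" ]; then
    # Every record in this segment is already covered by the snapshot.
    echo "segment ending at $segment_end_offset: deletable"
    deletable=$((deletable + 1))
  else
    echo "segment ending at $segment_end_offset: retained"
  fi
done
```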
- List the contents of the `__cluster_metadata-0` directory:

  ```shell
  docker-compose exec server-1 \
    ls -al /tmp/kraft-combined-logs/__cluster_metadata-0
  ```

- Wait until 20 minutes have elapsed since the first snapshot file was created.

- List the contents of the `__cluster_metadata-0` directory again:

  ```shell
  docker-compose exec server-1 \
    ls -al /tmp/kraft-combined-logs/__cluster_metadata-0
  ```

  Observe that the directory has been cleaned. The cleaner runs every 5 minutes, so you may have to wait a little longer than 20 minutes for this cleanup to happen.