Activity

Prerequisites

In this section, we will:

  • Examine the docker-compose.yml file that defines our Kafka cluster environment

  • Start the 3 combined controller/broker cluster nodes defined in the docker-compose.yml file

  • Create 2 topics, each with 12 partitions

  1. In the gitpod explorer hierarchy, expand kraft and click docker-compose.yml.

  2. Notice the following settings that are being established:

    1. Line 20 sets metadata.log.segment.ms to 15000, or 15 seconds. This is the maximum time (in milliseconds) before a new metadata log segment file is rolled.

      20      KAFKA_METADATA_LOG_SEGMENT_MS: 15000
    2. Line 21 sets metadata.max.retention.ms to 1200000, or 20 minutes. When a snapshot (.checkpoint) file reaches this age, it is deleted unless it is the only available snapshot:

      21      KAFKA_METADATA_MAX_RETENTION_MS: 1200000

      When a snapshot is deleted, all .log files other than the one created immediately prior to the next snapshot are deleted along with it.

    3. Line 22 sets metadata.log.max.record.bytes.between.snapshots to 2800. This configures the node to create a snapshot whenever the __cluster_metadata log grows by 2800 bytes since the last snapshot:

      22      KAFKA_METADATA_LOG_MAX_RECORD_BYTES_BETWEEN_SNAPSHOTS: 2800

      All nodes in the cluster are configured with these settings. They are artificially small to force log segments and snapshot files to be created so that we can observe how the log cleaner operates on the __cluster_metadata topic.

  3. Close the docker-compose.yml tab.

  4. Maximize the terminal window.

  5. In the terminal window, navigate to the exercise directory:

    cd kraft
  6. Create the server containers:

    docker-compose create server-1 server-2 server-3 server-4
  7. Start the three 'controller/broker' nodes:

    docker-compose start server-1 server-2 server-3
  8. Verify they are up and running:

    docker-compose ps
  9. Create two 12-partition topics:

    docker-compose exec server-1 \
        kafka-topics --bootstrap-server server-1:39094 \
            --create --topic topic-1 \
            --replication-factor 3 \
            --partitions 12 && \
    docker-compose exec server-1 \
        kafka-topics --bootstrap-server server-1:39094 \
            --create --topic topic-2 \
            --replication-factor 3 \
            --partitions 12
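Before moving on, you can optionally confirm both topics were created as intended. Describing one of them should show 12 partitions, each with a replication factor of 3 spread across the three running brokers:

```shell
docker-compose exec server-1 \
    kafka-topics --bootstrap-server server-1:39094 \
        --describe --topic topic-1
```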

Examine the Logs Directory

In this section, we will:

  • Examine the contents of the server logs directory

  • Observe the creation of a cluster metadata snapshot

  1. List the contents of the logs directory:

    docker-compose exec server-1 \
        ls /tmp/kraft-combined-logs
  2. List the contents of __cluster_metadata-0 directory:

    docker-compose exec server-1 \
        ls -al /tmp/kraft-combined-logs/__cluster_metadata-0

    Notice that a metadata snapshot (.checkpoint file) hasn’t been created yet. The .log file has not yet reached the 2800-byte threshold that triggers snapshot creation. We will now reassign the topic-1 partitions, which will push the log size past that threshold.

    NOTE: The .snapshot file is related to producer idempotency and while it exists for the __cluster_metadata topic, it serves no real purpose at this time. Since this file name extension was already in use by Kafka, .checkpoint was chosen as the file name extension for the metadata snapshot.

  3. Reassign the preferred leader for each topic-1 partition:

    docker-compose exec server-1 \
        kafka-reassign-partitions \
            --bootstrap-server server-1:39094 \
            --reassignment-json-file /tmp/reassign-topic-1-a.json \
            --execute
  4. List the contents of __cluster_metadata-0 directory:

    docker-compose exec server-1 \
        ls -al /tmp/kraft-combined-logs/__cluster_metadata-0

    Notice a snapshot .checkpoint file has now been created.

  5. Save the .checkpoint file name to an environment variable for use in upcoming commands.

    export SNAPSHOT_1=<file name>.checkpoint

    For example:

    export SNAPSHOT_1=00000000000000000042-0000000001.checkpoint
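Step 3 above references /tmp/reassign-topic-1-a.json, which is provided inside the container; its exact contents are not shown in this exercise. For reference, a kafka-reassign-partitions JSON file generally takes this shape (the topic, partition, and replica values below are illustrative only, not the actual file contents):

```json
{
  "version": 1,
  "partitions": [
    {"topic": "topic-1", "partition": 0, "replicas": [2, 3, 1]},
    {"topic": "topic-1", "partition": 1, "replicas": [3, 1, 2]}
  ]
}
```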

Generate Additional Cluster Metadata Events and a Second Snapshot

In this section, we will:

  • Start server-4

  • Reassign topic-1 and topic-2 partitions multiple times to cause a second snapshot to be created

  1. Start server-4:

    docker-compose start server-4
  2. Run partition reassignment script:

    scripts/reassign_partitions.sh
  3. List the contents of __cluster_metadata-0 directory:

    docker-compose exec server-1 \
        ls -al /tmp/kraft-combined-logs/__cluster_metadata-0

    Notice a second snapshot .checkpoint file has now been created.

  4. Save the .checkpoint file name to an environment variable for use in upcoming commands.

    export SNAPSHOT_2=<file name>.checkpoint
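The .checkpoint file names follow an end-offset/epoch pattern: the digits before the dash are the metadata log offset the snapshot covers up to, and the digits after the dash are the leader epoch. A small shell sketch, using made-up example names rather than real cluster output, of extracting and comparing the offsets (the second snapshot should always cover a higher offset than the first):

```shell
# Made-up example names; substitute your actual $SNAPSHOT_1 / $SNAPSHOT_2.
SNAPSHOT_1=00000000000000000042-0000000001.checkpoint
SNAPSHOT_2=00000000000000000137-0000000001.checkpoint

# Strip everything from the first dash onward, then force base-10
# arithmetic so the leading zeros are not interpreted as octal.
o1=$((10#${SNAPSHOT_1%%-*}))
o2=$((10#${SNAPSHOT_2%%-*}))

echo "snapshot 1 end offset: $o1"   # → 42
echo "snapshot 2 end offset: $o2"   # → 137
```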

Compare the Snapshot File Contents

In this section, we will verify:

  • The first snapshot metadata structure does not include server-4 since it was created prior to server-4 being started

  • The second snapshot metadata structure does include server-4 since it was created after server-4 was started

  1. List cluster brokers in the first snapshot:

    docker-compose exec server-1 \
         kafka-metadata-shell \
            --snapshot /tmp/kraft-combined-logs/__cluster_metadata-0/$SNAPSHOT_1 \
            ls brokers

    Notice the broker list does not include server-4.

  2. List cluster brokers in the second snapshot:

    docker-compose exec server-1 \
        kafka-metadata-shell \
            --snapshot /tmp/kraft-combined-logs/__cluster_metadata-0/$SNAPSHOT_2 \
            ls brokers

    Notice the broker list does include server-4.

Observe Leader Election Events in the Cluster Metadata Log

In this section, we will:

  • Add a new single partition topic with server-4 as its leader

  • Kill server-4

  • Examine the partition leader election related events written to the cluster metadata log by the active controller

  1. Create a new topic with server-4 as the leader:

    docker-compose exec server-1 \
        kafka-topics --bootstrap-server server-1:39094 \
            --create \
            --topic topic-3 \
            --replica-assignment 4:1:2
  2. Kill server-4:

    docker-compose kill server-4
  3. List the contents of __cluster_metadata-0 directory:

    docker-compose exec server-1 \
        ls -al /tmp/kraft-combined-logs/__cluster_metadata-0

    Note the filename of the active log segment.

  4. Review the __cluster_metadata log for events related to the failed broker and new leader election for topic-3:

    docker-compose exec server-1 \
        kafka-dump-log --cluster-metadata-decoder \
            --files /tmp/kraft-combined-logs/__cluster_metadata-0/<active log segment filename>
  5. Locate the FENCE_BROKER_RECORD and PARTITION_CHANGE_RECORD events at the end of the log.

    These events were written to the log by the active controller. When other controllers and brokers replicate these events to their log, they update their in-memory metadata structure with these changes.
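The kafka-dump-log output in step 4 can be lengthy. One way to jump straight to the relevant records is to filter the dump through grep:

```shell
docker-compose exec server-1 \
    kafka-dump-log --cluster-metadata-decoder \
        --files /tmp/kraft-combined-logs/__cluster_metadata-0/<active log segment filename> \
    | grep -E 'FENCE_BROKER_RECORD|PARTITION_CHANGE_RECORD'
```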

Observe the Cleanup Process in Action

In this section, we will:

  • Wait 20 minutes for the snapshot retention period set by metadata.max.retention.ms to expire

  • Observe the initial snapshot .checkpoint file is deleted since its age is greater than the retention period and a more recent snapshot exists. All __cluster_metadata log segments whose EndOffset is less than the next snapshot LogStartOffset are also deleted.
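The segment-deletion rule above can be sketched numerically. Using made-up offsets (not taken from a real cluster), each log segment whose end offset falls below the retained snapshot's LogStartOffset becomes eligible for deletion:

```shell
snapshot_start_offset=137   # illustrative LogStartOffset of the retained snapshot

deleted=""
retained=""
for seg_end in 42 118 205; do   # illustrative segment end offsets
  if [ "$seg_end" -lt "$snapshot_start_offset" ]; then
    deleted="$deleted$seg_end "     # fully covered by the snapshot
  else
    retained="$retained$seg_end "   # still needed after the snapshot
  fi
done

echo "deleted:  $deleted"
echo "retained: $retained"
```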

  1. List the contents of __cluster_metadata-0 directory:

    docker-compose exec server-1 \
        ls -al /tmp/kraft-combined-logs/__cluster_metadata-0
  2. Wait until 20 minutes have elapsed since the first snapshot file was created.

  3. List the contents of __cluster_metadata-0 directory:

    docker-compose exec server-1 \
        ls -al /tmp/kraft-combined-logs/__cluster_metadata-0

    Observe the directory has been cleaned.

    The cleaner runs every 5 minutes, so you may have to wait a little longer than 20 minutes for this directory cleanup to happen.