#atom

Subtitle:

Fault-tolerance mechanism ensuring data durability and high availability in Apache Kafka


Core Idea:

Kafka replication creates and maintains copies of topic partitions across multiple broker servers, ensuring that data remains available and durable even when servers fail or require maintenance.


Key Principles:

  1. Leader-Follower Model:
    • Each partition has one leader broker and zero or more follower brokers; the leader handles all reads and writes by default while followers replicate its log
  2. Replication Factor:
    • Configurable number of copies maintained for each partition (typically 3)
  3. In-Sync Replicas (ISR):
    • The set of replicas, including the leader, that are fully caught up with the leader's log (see the producer sketch after this list)
  4. Leader Election:
    • Process of selecting a new leader, normally from the ISR, when the current one fails
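
As a sketch of how the ISR ties into durability guarantees (topic name, property file, and broker address are placeholders): with acks=all the leader acknowledges a write only after every replica in the ISR has received it, and with min.insync.replicas=2 the write is rejected outright if fewer than two replicas are in sync.

# producer.properties - require acknowledgement from all in-sync replicas
acks=all

# Produce with these settings (assumes the topic created in the example below)
bin/kafka-console-producer.sh --topic financial-transactions \
	--producer.config producer.properties \
	--bootstrap-server localhost:9092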

Why It Matters:

Without replication, a single broker failure makes its partitions unavailable and can permanently lose their data. With a replication factor of 3 and appropriate acks and min.insync.replicas settings, a cluster can tolerate broker failures and rolling maintenance without losing acknowledged writes.

How to Implement:

  1. Set Replication Factor:
    • Configure topic-level replication when creating topics
  2. Configure min.insync.replicas:
    • Set the minimum number of in-sync replicas that must acknowledge a write before it succeeds when producers use acks=all (see the configuration sketch after this list)
  3. Rack Awareness:
    • Distribute replicas across different physical racks or availability zones by setting broker.rack on each broker
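
A minimal configuration sketch for steps 2 and 3 (topic name, rack identifier, and broker address are placeholders): min.insync.replicas can be set on an existing topic with kafka-configs.sh, and rack-aware replica placement is enabled by giving each broker a broker.rack value in its server.properties.

# Set min.insync.replicas on an existing topic
bin/kafka-configs.sh --alter --entity-type topics --entity-name financial-transactions \
	--add-config min.insync.replicas=2 \
	--bootstrap-server localhost:9092

# server.properties on each broker - used when assigning replicas to brokers
broker.rack=us-east-1a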

Example:

# Create a topic with replication factor 3
bin/kafka-topics.sh --create --topic financial-transactions \
	--partitions 12 \
	--replication-factor 3 \
	--config min.insync.replicas=2 \
	--bootstrap-server localhost:9092

Topic description showing replication:

bin/kafka-topics.sh --describe --topic financial-transactions --bootstrap-server localhost:9092

Topic: financial-transactions   PartitionCount: 12   ReplicationFactor: 3
	Topic: financial-transactions   Partition: 0    Leader: 1   Replicas: 1,2,3   Isr: 1,2,3
	Topic: financial-transactions   Partition: 1    Leader: 2   Replicas: 2,3,1   Isr: 2,3,1
	...
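
To check replication health, the same tool can filter for partitions whose ISR has shrunk (the second flag is only available in newer Kafka releases):

# Partitions with fewer in-sync replicas than their replication factor
bin/kafka-topics.sh --describe --under-replicated-partitions --bootstrap-server localhost:9092

# Partitions below min.insync.replicas (acks=all writes to them will fail)
bin/kafka-topics.sh --describe --under-min-isr-partitions --bootstrap-server localhost:9092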

Connections:


References:

  1. Primary Source:
    • Apache Kafka documentation on replication
  2. Additional Resources:
    • "Kafka: The Definitive Guide" (Chapter on Reliable Data Delivery)

Tags:

#kafka #replication #fault-tolerance #high-availability #distributed-systems

