Notes: Kafka Producers
source: Kafka: The Definitive Guide by Neha Narkhede, Gwen Shapira, Todd Palino (Chapter 3)
Producers write messages to Kafka
- you can use the built-in client API or a third-party client
ProducerRecord - the record to be sent to Kafka
- requires a topic name and a value; the key and the partition number are optional.
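The producer serializes the key and value objects into byte arrays, then passes the record to a partitioner
- if the ProducerRecord already specifies a partition, the partitioner just returns it; otherwise it picks one, usually based on the key
- the producer then adds the record to a batch of records that will all be sent to the same topic and partition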
- a separate thread actually sends the batches to the right Kafka brokers
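The broker sends back a response when it receives the messages
- on success it returns a RecordMetadata object with the topic, partition, and offset of the record
- on failure it returns an error, and the producer may retry a few times before giving up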
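The serialize, partition, and batch steps above can be sketched in plain Python. This is a toy model, not the Kafka client: Record, serialize, choose_partition, and accumulate are hypothetical names, crc32 stands in for the producer's real key hash, and the background sender thread and broker response are left out.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional
import zlib


@dataclass
class Record:
    """Hypothetical stand-in for ProducerRecord: topic and value are required."""
    topic: str
    value: str
    key: Optional[str] = None
    partition: Optional[int] = None


def serialize(obj: Optional[str]) -> Optional[bytes]:
    # Stand-in serializer: UTF-8 encode strings, pass None through.
    return None if obj is None else obj.encode("utf-8")


def choose_partition(record: Record, key_bytes: Optional[bytes], num_partitions: int) -> int:
    # An explicit partition on the record wins; otherwise hash the key.
    if record.partition is not None:
        return record.partition
    if key_bytes is None:
        return 0  # simplification: a real producer spreads keyless records around
    return zlib.crc32(key_bytes) % num_partitions  # crc32 is a stand-in hash


def accumulate(records, num_partitions):
    """Group serialized records into one batch per (topic, partition)."""
    batches = defaultdict(list)
    for r in records:
        key_bytes, value_bytes = serialize(r.key), serialize(r.value)
        p = choose_partition(r, key_bytes, num_partitions)
        batches[(r.topic, p)].append((key_bytes, value_bytes))
    return dict(batches)


records = [
    Record("orders", "created", key="customer-1"),
    Record("orders", "paid", key="customer-1"),
    Record("orders", "audit", partition=2),
]
batches = accumulate(records, num_partitions=3)
for (topic, partition), batch in sorted(batches.items()):
    print(topic, partition, len(batch))
```

In the real producer, each of these batches would then be handed to the sender thread, which ships it to the broker that leads that partition.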
Serializers
Kafka includes serializers for strings, integers, and byte arrays by default, but those don’t cover most other types. It’s recommended that people use a generic serialization library (such as Avro, Thrift, or Protobuf) rather than writing custom serializers
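To see why hand-written serializers are discouraged, here is a minimal sketch of one in Python. The Customer type and its byte layout (4-byte big-endian id, 4-byte name length, UTF-8 name) are assumptions made up for this example, not something Kafka defines.

```python
import struct
from dataclasses import dataclass


@dataclass
class Customer:  # hypothetical record type, for illustration only
    customer_id: int
    name: str


def serialize_customer(c: Customer) -> bytes:
    # Layout (an assumption of this sketch): 4-byte big-endian id,
    # 4-byte big-endian name length, then the UTF-8 name bytes.
    name_bytes = c.name.encode("utf-8")
    return struct.pack(">ii", c.customer_id, len(name_bytes)) + name_bytes


def deserialize_customer(data: bytes) -> Customer:
    # The consumer must know the exact same layout to read the record back.
    customer_id, name_len = struct.unpack_from(">ii", data, 0)
    name = data[8:8 + name_len].decode("utf-8")
    return Customer(customer_id, name)


payload = serialize_customer(Customer(7, "Ada"))
print(payload)
print(deserialize_customer(payload))
```

Any change to the layout (say, widening the id to 64 bits or adding a field) has to be rolled out to every producer and consumer at once. Schema-based libraries exist largely to manage that kind of change.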
Partitions
Kafka uses a message’s key to determine which partition of the topic the message goes to
- if you don’t include a key, the default partitioner picks one of the available partitions, balancing messages across them round robin
- if you do include a key, the producer hashes it and maps the hash to a partition, so the same key always goes to the same partition
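A rough sketch of the two cases above, in Python. The function name pick_partition and the partition count are assumptions for the example, and zlib.crc32 is only a stand-in for the hash the real producer uses.

```python
import itertools
import zlib
from typing import Optional

NUM_PARTITIONS = 4  # assumed partition count for the example topic
_round_robin = itertools.count()  # counter used for records without a key


def pick_partition(key: Optional[bytes], num_partitions: int = NUM_PARTITIONS) -> int:
    if key is None:
        # No key: rotate through the partitions to spread the load.
        return next(_round_robin) % num_partitions
    # Key present: hash it and map the hash onto a partition.
    # zlib.crc32 is a stand-in; it is not the hash Kafka itself uses.
    return zlib.crc32(key) % num_partitions


keyless = [pick_partition(None) for _ in range(6)]
keyed = [pick_partition(b"customer-1") for _ in range(3)]
print(keyless)  # [0, 1, 2, 3, 0, 1]
print(keyed)    # the same partition three times
```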
Because the partition depends on the key:
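- all messages with the same key land in the same partition, so a consumer that needs every message for a given key only has to read that one partition
- the default partitioner maps keys over all partitions of the topic, not just the available ones, so if the chosen partition is unavailable, you’ll get an error
- the key-to-partition mapping is only stable while the number of partitions stays the same; once you add more partitions, new messages for an existing key may go to a different partition than the old ones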
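The last point is easy to check with a small experiment. The sketch below assumes a hash-then-modulo mapping with crc32 as a stand-in hash (not Kafka's own) and made-up keys, and counts how many keys change partition when a topic grows from 4 to 6 partitions.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    # Stand-in for the producer's hash-then-modulo step (crc32 is not Kafka's hash).
    return zlib.crc32(key.encode("utf-8")) % num_partitions


keys = [f"customer-{i}" for i in range(1000)]
before = {k: partition_for(k, 4) for k in keys}
after = {k: partition_for(k, 6) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys map to a different partition")
```

Records already written stay where they are, so after the change a key's history can be split across two partitions. If key-based placement matters, it is safer to pick a partition count up front and not change it.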