This blog explains the most important Kafka producer configurations. With the parameters described below, you can configure your producer to suit your requirements.

Kafka Producer Configs

Although there are a lot of producer configurations, the following are the most widely used. This blog explains what they are and when to use them.

1) bootstrap.servers

2) key.serializer & value.serializer

3) partitioner.class

4) acks

5) retries

1) bootstrap.servers

bootstrap.servers is a comma-separated list of Kafka brokers, each given as a host name or IP address followed by a port number (host:port). Keep in mind that this is a mandatory property for the Kafka producer: if you don't specify it, the producer cannot reach the Kafka cluster. It is generally advised to give at least two broker addresses, so that if one broker is down the producer can still connect through the other.
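As a rough sketch, the property could be set like this in Java. The host names broker1 and broker2 and the port 9092 are placeholders chosen for illustration, not values from a real cluster. The resulting Properties object is what you would later pass to the KafkaProducer constructor from the kafka-clients library, which is left out here so that the snippet runs on its own.

```java
import java.util.Properties;

class BootstrapServersExample {

    // Builds producer properties that list two brokers for the initial connection.
    // "broker1" and "broker2" are hypothetical host names used only for illustration.
    static Properties producerProps() {
        Properties props = new Properties();
        // Comma-separated host:port pairs; two entries so one broker can be down.
        props.put("bootstrap.servers", "broker1:9092,broker2:9092");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(producerProps().getProperty("bootstrap.servers"));
    }
}
```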

2) key.serializer & value.serializer

As you know, the Kafka server accepts messages as key-value pairs in the form of byte arrays. The key.serializer property takes the fully qualified name of the class (the class name along with its package name) used to serialize the key of your message, and value.serializer takes the fully qualified name of the class used to serialize the value. Note that you can use the same class for both the key and the value.
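For example, if both your keys and values are plain strings, a minimal sketch could point both properties at the StringSerializer class that ships with kafka-clients. The class name is passed as a string here, so the snippet itself needs nothing beyond the JDK; choosing strings for both key and value is an assumption made for this illustration.

```java
import java.util.Properties;

class SerializerConfigExample {

    // Fully qualified name of the String serializer bundled with kafka-clients.
    static final String STRING_SERIALIZER =
            "org.apache.kafka.common.serialization.StringSerializer";

    // Uses the same serializer class for both the key and the value.
    static Properties producerProps() {
        Properties props = new Properties();
        props.put("key.serializer", STRING_SERIALIZER);
        props.put("value.serializer", STRING_SERIALIZER);
        return props;
    }

    public static void main(String[] args) {
        Properties props = producerProps();
        System.out.println("key.serializer=" + props.getProperty("key.serializer"));
        System.out.println("value.serializer=" + props.getProperty("value.serializer"));
    }
}
```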

3) partitioner.class

The partitioner.class property takes the name of the class that acts as a custom partitioner in your Kafka producer. You may be wondering what a custom partitioner is, how it works, and when to use it. Don't worry, I will cover it in the next blog, after which you will easily understand what a custom partitioner is.

4) acks

The acks property controls the acknowledgment sent by the Kafka broker to the producer. We have already seen that whenever you send data to Kafka, you get a RecordMetadata object in response on success and an exception on failure. The acks parameter can take three values:

i) acks = 0

This is the fire-and-forget approach: the producer sends the message to Kafka and does not wait for a response. Setting acks = 0 has three consequences. First, since the producer gets no response, you may lose some messages without knowing it. Second, it helps you achieve high throughput, because the producer sends messages as fast as it can without waiting for any response. Third, retries cannot happen, because the producer never learns that a send failed. So use acks = 0 when losing some messages is not a big problem and you want the highest possible throughput.

ii) acks = 1

In this case the producer waits for a response, which is sent by the leader (the broker that leads the partition being written to). The leader sends the response after writing the message to its local log. If the leader is down and the send fails, the producer can retry after a few milliseconds. Keep in mind that there is still no guarantee that your message will not be lost: if the leader crashes before the replicas have copied your message, the message is lost. So keep this situation in mind when setting acks to 1.

iii) acks = all

If you want the strongest reliability, i.e. all in-sync replicas have made a successful copy of your message before the leader sends the acknowledgement, use acks = all. In this case, the leader sends the acknowledgement back to the producer only after every in-sync replica has confirmed that its copy of the message was created. This setting makes the process a little slower, because the leader waits until it gets a successful response from all of those replicas.
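To make the three options concrete, here is a small sketch that maps each one to the string value the acks property expects. The enum and its constant names (Durability, FIRE_AND_FORGET, LEADER_ONLY, ALL_REPLICAS) are hypothetical labels invented for this example; only the values "0", "1" and "all" come from Kafka.

```java
import java.util.Properties;

class AcksConfigExample {

    // Hypothetical labels for the three acks levels; only the string values are Kafka's.
    enum Durability {
        FIRE_AND_FORGET("0"),  // no response awaited, highest throughput
        LEADER_ONLY("1"),      // leader acknowledges after its own write
        ALL_REPLICAS("all");   // leader waits for the replicas before acknowledging

        final String acksValue;

        Durability(String acksValue) {
            this.acksValue = acksValue;
        }
    }

    // Returns producer properties with acks set for the chosen level.
    static Properties withAcks(Durability level) {
        Properties props = new Properties();
        props.put("acks", level.acksValue);
        return props;
    }

    public static void main(String[] args) {
        for (Durability level : Durability.values()) {
            System.out.println(level + " -> acks=" + withAcks(level).getProperty("acks"));
        }
    }
}
```

Picking the level in one place like this keeps the trade-off between throughput and reliability visible wherever the producer is configured.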

5) retries

The retries parameter is simple: it defines how many times the producer can retry sending a message to Kafka when a send fails with a transient error. An associated parameter, retry.backoff.ms, controls the time the producer waits between two retries.
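A minimal sketch of the two settings together might look like this. The values 3 and 100 are illustrative assumptions, not recommendations; the right numbers depend on how long your brokers typically take to recover.

```java
import java.util.Properties;

class RetriesConfigExample {

    // Builds producer properties with a retry count and a pause between retries.
    // The parameter values are supplied by the caller; nothing here is a Kafka default.
    static Properties producerProps(int retries, long backoffMs) {
        Properties props = new Properties();
        props.put("retries", Integer.toString(retries));
        props.put("retry.backoff.ms", Long.toString(backoffMs));
        return props;
    }

    public static void main(String[] args) {
        // Illustrative values: retry up to 3 times, waiting 100 ms between attempts.
        Properties props = producerProps(3, 100L);
        System.out.println("retries=" + props.getProperty("retries"));
        System.out.println("retry.backoff.ms=" + props.getProperty("retry.backoff.ms"));
    }
}
```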

These were some basic configs related to the Kafka producer. I will keep you updated on the remaining configs whenever necessary. Till then, keep reading about Kafka.

About the author

Dixit Khurana