The following blog will make you understand about the Kafka Consumer. It mainly focuses on java code to create your sample consumer.
Implementation Using Java
1. Import Necessary java Packages;
2. Create two variables which will define the topicName and groupId. The topicName will define the topic from which the kafka consumer is reading out the data or messages. The groupId is a string which binds all the consumer in a single group. Earlier versions of kafka (0.8) groupId was not necessary in case of single consumer but after that even if you start single or multiple consumer just like in this example; you have to specify the groupId.
3. Create Properties file Object and put some required configurations into it.
a) bootstrap.servers is the list of kafka brokers addresses.
b) StringDeserializer is the class which is used to deserialize your key and value. It is known to you that kafka accepts the data in the form of array of bytes. So the process of converting your message into bytes is known as serialization and again from bytes to your readable string it is known as desearilization.
c) group.id will be that id which will bind the all consumers in a single group.
4) Create Kafka Consumer Object
The constructor of KafkaConsumer takes the properties class object as an argument.
5) Now its time to subscribe to the topics. The kafka consumer can subscribe to more than one topic that’s why the subscribe method takes an list of topics as an argument.
6)The while loop is responsible for fetching your messages. The poll method will return some messages and you process them and again call to poll will return some messages. The parameter to poll method is timeout. If there is no data or messages to poll then you don’t want poll to be hang there. So the value inside poll method specifies that how quickly you want poll method will return with or without data.
7) The poll method is very powerful and takes care of lot of things like all the coordination, partition re balances and heartbeat for group coordinator. Internally the first call to poll finds the group coordinator, joins the group, receives partition assignment and fetches some messages. Every call to poll method sends heartbeat to group coordinator. So it is mandatory that whatever you do with consumer is fast and efficient because if don’t call the poll method for a while then group coordinator will assume the consumer is dead and trigger a partition re-balance activity. The while is said to be infinite loop because the consumers are long running processes as they keep on receiving the data or messages from kafka.
7) Lastly, run your java program and in order to test your program just run the kafka console producer on your terminal and you will see that the messages sent by you through console will be fetch by your sample consumer.
If you have any confusion then pls refer the whole java program as shown below.
Hope Now you get an idea to program a kafka consumer. In next parts, i will be covering the various aspects of kafka consumer. Till then keep reading Kafka.