Assume you have multiple data files. We want to create an application that takes a
list of data files as arguments and does the following.
- Instantiate a Kafka Producer
- Create one independent thread to process each data file. For example, if you are
supplying three data files to the application, it must create three threads, one
for each data file.
- Start independent threads with one data file for each thread and share the same
Kafka Producer Instance among all threads.
- Each thread is responsible for reading records from its own file and sending them
to the Kafka producer, so the files are processed in parallel.
- Wait for all threads to complete processing and, finally, close the application.
This example is an excerpt from the book Kafka Streams – Real-time Stream Processing.
For a detailed explanation of the example and much more, you can get access to
the book using the link below.
Creating a Multithreaded Kafka Producer
The solution to the given problem can be implemented in two parts.
- Create a reusable message dispatcher that can send messages to Kafka.
- Create a main application that starts multiple threads using the dispatcher.
Creating a Kafka Message Dispatcher
The code listing below shows the code for the Dispatcher.
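As a minimal sketch of such a dispatcher (not the book's listing), the class below reads one file line by line and hands each line to a shared sender. To keep the block compilable with the JDK alone, the sender is modelled as a `BiConsumer<String, String>` taking a topic and a value. That stand-in, the field names, and the choice of one record per line are all assumptions. With the kafka-clients library on the classpath, the sender could be a lambda around `KafkaProducer.send(...)`, which is documented as safe to share across threads.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.BiConsumer;
import java.util.stream.Stream;

/**
 * Sketch of a reusable dispatcher: one instance handles one data file.
 * The BiConsumer (topic, value) is a hypothetical stand-in for the shared
 * Kafka producer. With kafka-clients available it could be created as
 *   (topic, value) -> producer.send(new ProducerRecord<>(topic, value))
 */
class Dispatcher implements Runnable {
    private final BiConsumer<String, String> sender; // shared by all threads
    private final String topicName;
    private final Path file;

    Dispatcher(BiConsumer<String, String> sender, String topicName, Path file) {
        this.sender = sender;
        this.topicName = topicName;
        this.file = file;
    }

    @Override
    public void run() {
        // Assumption: each line of the file is one record.
        try (Stream<String> lines = Files.lines(file)) {
            lines.forEach(line -> sender.accept(topicName, line));
        } catch (IOException e) {
            // Runnable.run() cannot throw checked exceptions, so wrap it.
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the dispatcher only depends on the sender it is given, the same class can be exercised with an in-memory collector in a test and with a real producer in the application.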
Creating Kafka Producer
The next part of the puzzle is to implement a main() method that creates the threads
and starts them. The main() method is implemented in the DispatcherDemo class, as
shown in the code listing below.
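The steps for the main application can be sketched as follows: one thread per file, a single shared sender, and a join on every thread before exiting. This is an illustrative sketch, not the book's DispatcherDemo. The per-file task is inlined so the block compiles on its own, the topic name `demo-topic` is hypothetical, and the sender is a stand-in that prints instead of writing to Kafka. A real application would create one `KafkaProducer` before starting the threads and call its `close()` method after the joins.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;
import java.util.stream.Stream;

class DispatcherDemo {
    static final String TOPIC = "demo-topic"; // hypothetical topic name

    // Per-thread task: read one file and hand every line to the shared sender.
    static Runnable fileTask(BiConsumer<String, String> sender, String topic, Path file) {
        return () -> {
            try (Stream<String> lines = Files.lines(file)) {
                lines.forEach(line -> sender.accept(topic, line));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        };
    }

    // Starts one thread per file, all sharing the same sender, then waits for all of them.
    static void dispatchAll(BiConsumer<String, String> sender, String topic, List<Path> files)
            throws InterruptedException {
        List<Thread> threads = new ArrayList<>();
        for (Path file : files) {
            Thread t = new Thread(fileTask(sender, topic, file), "dispatcher-" + file.getFileName());
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            t.join(); // block until this file has been fully processed
        }
    }

    public static void main(String[] args) throws InterruptedException {
        if (args.length == 0) {
            System.err.println("usage: DispatcherDemo <data-file>...");
            return;
        }
        List<Path> files = new ArrayList<>();
        for (String arg : args) {
            files.add(Paths.get(arg));
        }
        // Stand-in sender; a shared KafkaProducer would be created here instead.
        BiConsumer<String, String> sender =
                (topic, value) -> System.out.println(topic + " <- " + value);
        dispatchAll(sender, TOPIC, files);
        // With a real producer, producer.close() would go here, after all joins.
    }
}
```

Joining every thread before returning matters with a real producer: closing it while dispatcher threads are still sending would cut those sends short.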
You can access the fully functional project in our GitHub folder.