Saturday, September 16, 2023

Pentaho PDI: working with Mongo & Kafka while processing hierarchical JSON!

Need: when working with JSON objects, you sometimes need to build hierarchical JSON and store it in a MongoDB collection. You may then read the collection back and stream the documents to a Kafka topic.

Here, flight data arrives in several CSV files. Once the files are collected, a date dimension is used to capture a few more columns as part of the JSON; that is the source JSON. In preview mode the data looks like this.

[Screenshot: preview of the source flight data]
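To make the shape concrete, here is a minimal Java sketch using the MongoDB driver's org.bson.Document class to nest date-dimension attributes under a flight record. The field names are assumptions standing in for the columns in the preview above.

```java
import org.bson.Document;

public class FlightJsonSketch {
    public static void main(String[] args) {
        // Hypothetical flight row, as it might arrive from the CSV files.
        Document flight = new Document("flightNumber", "AA1234")
                .append("origin", "JFK")
                .append("destination", "LAX");

        // Hypothetical attributes captured from the date dimension.
        Document dateDim = new Document("flightDate", "2023-09-16")
                .append("dayOfWeek", "Saturday")
                .append("quarter", "Q3")
                .append("isHoliday", false);

        // Nest the date-dimension attributes under the flight record
        // to form the hierarchical source JSON.
        flight.append("dateInfo", dateDim);

        System.out.println(flight.toJson());
    }
}
```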
The data is then processed through the hierarchical plugin for PDI, which needs to be installed before launching Spoon.

[Screenshot: the hierarchical step configured in Spoon]
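Conceptually, storing the hierarchical output amounts to inserting nested documents into a collection. A minimal sketch of that with the MongoDB Java driver, assuming a local server and hypothetical flightdb/flights names, looks like this:

```java
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import org.bson.Document;

public class MongoInsertSketch {
    public static void main(String[] args) {
        // Hypothetical connection string, database, and collection names.
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            MongoCollection<Document> flights =
                    client.getDatabase("flightdb").getCollection("flights");

            // A nested document like the one the hierarchical step emits.
            Document doc = new Document("flightNumber", "AA1234")
                    .append("dateInfo", new Document("flightDate", "2023-09-16")
                            .append("dayOfWeek", "Saturday"));

            flights.insertOne(doc);
        }
    }
}
```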
Once the hierarchical documents are stored, read the data back from the Mongo collection and produce it to a Kafka topic. Supply the necessary parameters for the secured Kafka broker as shown below.

[Screenshots: MongoDB input and Kafka producer steps, including the secured broker parameters]
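For reference, the same kind of secured-producer settings can be expressed directly against the Kafka client API. This is only a sketch of the configuration the PDI step collects, not the step itself; the broker address, credentials, truststore path, and topic name are all placeholders:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class SecuredKafkaProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Hypothetical broker address.
        props.put("bootstrap.servers", "broker.example.com:9093");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());
        // Standard settings for a SASL_SSL-secured broker; the credentials
        // and truststore path are placeholders.
        props.put("security.protocol", "SASL_SSL");
        props.put("sasl.mechanism", "PLAIN");
        props.put("sasl.jaas.config",
                "org.apache.kafka.common.security.plain.PlainLoginModule required "
                        + "username=\"kafkauser\" password=\"kafkapass\";");
        props.put("ssl.truststore.location", "/path/to/truststore.jks");
        props.put("ssl.truststore.password", "changeit");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // In the transformation this JSON comes from the Mongo input step;
            // a literal stands in for one document here.
            String json = "{\"flightNumber\":\"AA1234\","
                    + "\"dateInfo\":{\"flightDate\":\"2023-09-16\"}}";
            producer.send(new ProducerRecord<>("flights-topic", json));
        }
    }
}
```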
A Kafka listener (consumer) is already running, and the secured Kafka parameters are supplied on the consumer side as well, as shown below. Leave this transformation running: as long as messages arrive on the topic in the Kafka broker, the consumer keeps processing them. The consumer transformation refers to a stream sub-transformation and then writes the data to a text file output, or any other output format you prefer.

[Screenshot: the Kafka consumer step with the secured parameters]
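As a sketch of what the consumer side configures, here is a plain Kafka consumer with matching SASL_SSL properties that appends each message to a text file. The group id, topic, file name, and credentials are assumptions:

```java
import java.io.PrintWriter;
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class SecuredKafkaConsumerSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Hypothetical broker address and group id, mirroring the producer side.
        props.put("bootstrap.servers", "broker.example.com:9093");
        props.put("group.id", "flights-consumer-group");
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());
        props.put("security.protocol", "SASL_SSL");
        props.put("sasl.mechanism", "PLAIN");
        props.put("sasl.jaas.config",
                "org.apache.kafka.common.security.plain.PlainLoginModule required "
                        + "username=\"kafkauser\" password=\"kafkapass\";");
        props.put("ssl.truststore.location", "/path/to/truststore.jks");
        props.put("ssl.truststore.password", "changeit");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
             PrintWriter out = new PrintWriter("flights-output.txt")) {
            consumer.subscribe(Collections.singletonList("flights-topic"));
            // Keep polling while the transformation runs; each message
            // is written to the text file output.
            while (true) {
                ConsumerRecords<String, String> records =
                        consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    out.println(record.value());
                }
                out.flush();
            }
        }
    }
}
```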