Structured Streaming with mapGroupsWithState

Srinivas V
Is anyone using this combination in production? I am planning to use it for a use case with 15,000 events per second from a few Kafka topics. Though the events are big, I would only have to extract the businessId, frequency, and first and last event timestamps, and save these with mapGroupsWithState. I need to keep them for a window of, say, 20 minutes and then push them to an output Kafka topic. The total state should not take more than about 50 MB, as I have a limited number of businessIds, say 1 million.
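
The per-key state described above can be sketched in plain Python as follows. This is only a model of the update logic, not Spark code: the names BusinessState, update_state, is_expired and WINDOW_MS are hypothetical, and measuring the 20-minute window from the first event is an assumption (it could equally be measured from the last event or driven by a state timeout).

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# 20-minute retention window from the post, in milliseconds.
WINDOW_MS = 20 * 60 * 1000


@dataclass
class BusinessState:
    """Compact per-businessId state: event count plus first and last event time."""
    business_id: str
    frequency: int
    first_ts_ms: int
    last_ts_ms: int


def update_state(
    business_id: str,
    event_ts_ms: Iterable[int],
    state: Optional[BusinessState],
) -> Optional[BusinessState]:
    """Fold one batch of event timestamps for a key into its existing state."""
    timestamps = list(event_ts_ms)
    if not timestamps:
        # Nothing new for this key: keep whatever state there was.
        return state
    if state is None:
        # First time this key is seen: initialize from the batch alone.
        return BusinessState(business_id, len(timestamps), min(timestamps), max(timestamps))
    return BusinessState(
        business_id,
        state.frequency + len(timestamps),
        min(state.first_ts_ms, min(timestamps)),
        max(state.last_ts_ms, max(timestamps)),
    )


def is_expired(state: BusinessState, now_ms: int) -> bool:
    """Assumption: the window is measured from the first event seen for the key."""
    return now_ms - state.first_ts_ms >= WINDOW_MS
```

In a real job the same fold would run inside the function passed to mapGroupsWithState, with the expired state emitted to the output topic and then removed.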
Questions:
1. Could you share any issues you have faced, or that I might face?
2. How do I debug if I am unable to keep up with the inflow of events and the lag keeps increasing?
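
For the second question, one possible starting point is to compare the input and processing rates that Structured Streaming reports for each micro-batch. The sketch below assumes the progress record is available as a plain dict with the inputRowsPerSecond, processedRowsPerSecond and durationMs fields, as in the JSON form of a streaming query progress report. The helper names and the 0.95 tolerance are hypothetical, and the phase names inside durationMs may differ between Spark versions.

```python
from typing import Optional, Tuple


def is_falling_behind(progress: dict, tolerance: float = 0.95) -> bool:
    """True when rows are processed noticeably slower than they arrive."""
    input_rate = progress.get("inputRowsPerSecond") or 0.0
    processed_rate = progress.get("processedRowsPerSecond") or 0.0
    if input_rate == 0.0:
        # No input in this batch, so there is nothing to fall behind on.
        return False
    return processed_rate < tolerance * input_rate


def slowest_phase(progress: dict) -> Optional[Tuple[str, int]]:
    """Return the (name, milliseconds) of the longest phase in durationMs.

    triggerExecution is skipped because it is the total for the whole batch.
    """
    durations = {
        name: ms
        for name, ms in progress.get("durationMs", {}).items()
        if name != "triggerExecution"
    }
    if not durations:
        return None
    name = max(durations, key=durations.get)
    return name, durations[name]
```

If the processed rate stays below the input rate across many batches, the lag will keep growing, and the slowest phase gives a hint about where the time goes.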

Regards
Sri