Memory consumption and checkpointed data seem to grow incrementally when reduceByKeyAndWindow with an inverse function is used with mapWithState in stateful streaming

SRK

This post has NOT been accepted by the mailing list yet.
Hi,

Memory consumption and checkpointed data seem to grow incrementally when reduceByKeyAndWindow with an inverse function is used together with mapWithState.

My application uses stateful streaming with mapWithState. The keys emitted by mapWithState are then passed to reduceByKeyAndWindow to compute rolling counts over 24 hours. The MapWithStateRDD seems to stay persisted forever, even though I have checkpointing enabled every 10 minutes, and the ShuffledRDD generated by reduceByKeyAndWindow seems to grow linearly in memory. Any idea why this happens?

Is it possible that the ShuffledRDD is caching some data from mapWithState, since it depends on it for keys?
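
To make the rolling-count setup concrete, here is a small plain-Python model (not Spark code) of how I understand an incremental windowed reduce with an inverse function to behave: each new batch is added to a running per-key total, and the batch that slides out of the window is subtracted again. The class name, the `keep` predicate, and the batch layout are all hypothetical and only for illustration. The `keep` predicate is loosely modeled on the optional `filterFunc` argument of reduceByKeyAndWindow, and whether the absence of such a filter explains the growth I am seeing is only an assumption on my part.

```python
from collections import deque


class IncrementalWindowCounter:
    """Toy model of a windowed count maintained with add/subtract updates."""

    def __init__(self, window_batches, keep=None):
        self.window_batches = window_batches  # window length, in batches
        self.keep = keep                      # optional predicate on (key, value)
        self.batches = deque()                # batches currently inside the window
        self.state = {}                       # running per-key totals

    def push(self, batch):
        # "Reduce" step: fold the newly arrived batch into the running totals.
        for key, value in batch.items():
            self.state[key] = self.state.get(key, 0) + value
        self.batches.append(batch)

        # "Inverse reduce" step: subtract the batch that left the window.
        if len(self.batches) > self.window_batches:
            expired = self.batches.popleft()
            for key, value in expired.items():
                self.state[key] = self.state.get(key, 0) - value

        # Without a filter, keys whose count dropped to zero stay in the state.
        if self.keep is not None:
            self.state = {k: v for k, v in self.state.items() if self.keep(k, v)}
        return dict(self.state)


def run(keep=None, num_batches=6, window_batches=2):
    """Feed one fresh key per batch and record the state size after each batch."""
    counter = IncrementalWindowCounter(window_batches, keep)
    sizes = []
    for i in range(num_batches):
        counter.push({f"key-{i}": 1})
        sizes.append(len(counter.state))
    return sizes


unfiltered = run()
filtered = run(keep=lambda key, value: value > 0)
print("state size without filter:", unfiltered)
print("state size with filter:   ", filtered)
```

In this toy model the unfiltered state gains one entry per batch even though the window only ever covers two batches, while the filtered state stays bounded. If the real job sees a steady stream of new keys from mapWithState, something similar might be happening in the windowed state.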



Thanks,
Swetha