Spark streaming does not seem to clear MapPartitionsRDD and ShuffledRDD that are persisted after the use of updateStateByKey and reduceByKeyAndWindow with inverse functions even after checkpointing the data

SRK
This post has NOT been accepted by the mailing list yet.
Hi,

Spark Streaming does not seem to clear the MapPartitionsRDD and ShuffledRDD instances that are persisted when using updateStateByKey and reduceByKeyAndWindow with inverse functions, even after the data has been checkpointed. Any idea why this happens? Is there a way to set a timeout so that the persisted data is cleared after a while? The cached MapPartitionsRDD and ShuffledRDD are not cleared even when I explicitly call unpersist and also checkpoint the data.
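
A minimal sketch of the kind of pipeline described above, assuming PySpark and a DStream of (key, int) pairs. The function names, the checkpoint path, and the window and slide durations are all hypothetical and are not taken from the actual job; it only shows where updateStateByKey and reduceByKeyAndWindow with an inverse function enter the picture.

```python
# Illustrative only: names, checkpoint path, and durations are assumptions.

def update_count(new_values, running_count):
    """State update for updateStateByKey: add this batch's values to the running total."""
    return sum(new_values) + (running_count or 0)


def add(a, b):
    """Reduce function: folds in values that enter the window."""
    return a + b


def subtract(a, b):
    """Inverse reduce function: removes values that leave the window."""
    return a - b


def build_pipeline(ssc, pairs, checkpoint_dir="/tmp/checkpoint"):
    """Wire the stateful and windowed streams onto a DStream of (key, int) pairs.

    ssc is expected to be a pyspark.streaming.StreamingContext and
    pairs a DStream; neither is created here.
    """
    # Both operations below require a checkpoint directory to be set.
    ssc.checkpoint(checkpoint_dir)

    # Running total per key across all batches.
    totals = pairs.updateStateByKey(update_count)

    # 30-second window sliding every 10 seconds (assumed values), computed
    # incrementally with the inverse function.
    windowed = pairs.reduceByKeyAndWindow(add, subtract, 30, 10)

    return totals, windowed
```

With the inverse function supplied, each new window is derived from the previous window's result rather than recomputed from scratch, so the previous window's data has to stay available between batches.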

Thanks!